This is a repository of my old source code, which isn't updated any more. Go to git.rot13.org for current projects!

Log of /trunk/all2xml.pl


Revision 234 - (view) (annotate) - [select for diffs]
Modified Sun Mar 7 22:51:14 2004 UTC (20 years ago) by dpavlin
File length: 25755 byte(s)
Diff to previous 233 , to selected 101
eval{...} now works for type="swish" also...


Revision 233 - (view) (annotate) - [select for diffs]
Modified Fri Mar 5 23:33:19 2004 UTC (20 years ago) by dpavlin
File length: 25796 byte(s)
Diff to previous 231 , to selected 101
lookup_key and lookup_val types now support filters


Revision 231 - (view) (annotate) - [select for diffs]
Modified Fri Mar 5 22:53:30 2004 UTC (20 years ago) by dpavlin
File length: 25556 byte(s)
Diff to previous 224 , to selected 101
clear memory cache when opening new file lookup


Revision 224 - (view) (annotate) - [select for diffs]
Modified Sun Feb 8 20:16:54 2004 UTC (20 years, 1 month ago) by dpavlin
File length: 25490 byte(s)
Diff to previous 218 , to selected 101
important fix for a bug introduced in 1.57: it might eat your data
if you are not using a filter. This one was hard to find...


Revision 218 - (view) (annotate) - [select for diffs]
Modified Thu Feb 5 10:55:58 2004 UTC (20 years, 1 month ago) by dpavlin
File length: 25489 byte(s)
Diff to previous 215 , to selected 101
Changed the never-used format configuration option for import_xml to
marc_format to prevent a clash with format for output. If you don't
specify it (as I never do), it will default to 'usmarc', which is probably
the right thing (tm).


Revision 215 - (view) (annotate) - [select for diffs]
Modified Sun Feb 1 22:06:00 2004 UTC (20 years, 1 month ago) by dpavlin
File length: 25462 byte(s)
Diff to previous 207 , to selected 101
brown-bag bug: I was using MARC.pm wrong. Now the whole file is loaded
at the start of indexing, which makes memory usage much more step-like but
enables a real progress indicator and a gain of a few seconds in indexing
speed.


Revision 207 - (view) (annotate) - [select for diffs]
Modified Sat Jan 31 21:03:06 2004 UTC (20 years, 1 month ago) by dpavlin
File length: 25315 byte(s)
Diff to previous 199 , to selected 101
thesaurus is finally working... It contains recursive entries to the parent
term, and we actually needed to display narrower terms, so mem_lookup was
created. Important changes:
- you can write eval{"901a" eq "Mikrotezaurus"} within an <isis>
  tag, and if the expression evaluates to false, no content will be output
  (it's used to hide microthesaurus terms from lower-level descriptors)
- mem_lookup.pm now supports formats: you can write something like
  [a:5614];;[d:[a:5614]] and it will correctly embed values


Revision 199 - (view) (annotate) - [select for diffs]
Modified Wed Jan 7 12:29:11 2004 UTC (20 years, 2 months ago) by dpavlin
File length: 25019 byte(s)
Diff to previous 197 , to selected 101
fixed filter delimiter bug


Revision 197 - (view) (annotate) - [select for diffs]
Modified Sun Dec 21 03:27:02 2003 UTC (20 years, 3 months ago) by dpavlin
File length: 25019 byte(s)
Diff to previous 196 , to selected 101
Changed behaviour of creating data for swish_exact when using type="index".
Now every line is separate entry in swish_exact. That will create additional
clutter in index (fields which wouldn't be used because we are not inserting
them in index), but you will have to bear with this for now.


Revision 196 - (view) (annotate) - [select for diffs]
Modified Mon Dec 15 00:12:16 2003 UTC (20 years, 3 months ago) by dpavlin
File length: 25247 byte(s)
Diff to previous 195 , to selected 101
correct support for swish_exact when there are repeatable fields


Revision 195 - (view) (annotate) - [select for diffs]
Modified Sun Dec 14 20:50:03 2003 UTC (20 years, 3 months ago) by dpavlin
File length: 24995 byte(s)
Diff to previous 188 , to selected 101
don't repeat field name if same as last; support format_name and
format_delimiter on field level if using iterate_by_page (without this, it's
really hard to get useful formatting when using iterate_by_page); don't warn
on rare occasion (which is a faulty import_xml definition, but anyway...) when
using append="1"


Revision 188 - (view) (annotate) - [select for diffs]
Modified Sat Nov 29 19:07:00 2003 UTC (20 years, 3 months ago) by dpavlin
File length: 24159 byte(s)
Diff to previous 182 , to selected 101
implemented index_delimiter, which enables you to format index entries as
(values to be inserted in index);;(values to be displayed) if
index_delimiter=";;" is defined. This will allow you to index (and
search) through values from the original database and still be able to
display lookup fields.
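
The split described in this message can be sketched roughly as follows. This is an illustration in Python, not the Perl code from all2xml.pl; the function name split_index_entry and the fallback for entries without a delimiter are assumptions made for the example.

```python
# Illustrative sketch only; all2xml.pl is Perl and may differ in detail.
INDEX_DELIMITER = ";;"  # the index_delimiter value quoted in the log message


def split_index_entry(entry, delimiter=INDEX_DELIMITER):
    """Return (value to insert in index, value to display).

    Assumption: an entry without the delimiter uses the same text
    for both the indexed and the displayed value.
    """
    if delimiter and delimiter in entry:
        # Split only on the first delimiter so the display part
        # may itself contain the delimiter characters.
        index_value, display_value = entry.split(delimiter, 1)
        return index_value, display_value
    return entry, entry


print(split_index_entry("5614;;Thesaurus term"))
print(split_index_entry("plain value"))
```

Splitting on the first delimiter only keeps the indexed part predictable even when the display text is free-form.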


Revision 182 - (view) (annotate) - [select for diffs]
Modified Sat Nov 29 15:59:19 2003 UTC (20 years, 3 months ago) by dpavlin
File length: 23692 byte(s)
Diff to previous 181 , to selected 101
make index with lookup field working with iterate on page


Revision 181 - (view) (annotate) - [select for diffs]
Modified Tue Nov 25 20:19:03 2003 UTC (20 years, 3 months ago) by dpavlin
File length: 23088 byte(s)
Diff to previous 180 , to selected 101
fix swish_exact fields so that they don't show up in display


Revision 180 - (view) (annotate) - [select for diffs]
Modified Tue Nov 25 20:04:24 2003 UTC (20 years, 3 months ago) by dpavlin
File length: 23064 byte(s)
Diff to previous 178 , to selected 101
invalidate memory cache when needed


Revision 178 - (view) (annotate) - [select for diffs]
Modified Mon Nov 24 21:54:19 2003 UTC (20 years, 3 months ago) by dpavlin
File length: 23001 byte(s)
Diff to previous 177 , to selected 101
major improvements: you can select the order of scanning in each topic tag
to be either by line (the default; repeatable fields in one line will
be unrolled) or page-by-page (using the new iterate_by_page="1" attribute).
The new page-by-page mode is really useful with lookups (because you can
append fields with lookups in the same line, but using two tags), but it will
create multiple rows in HTML output.


Revision 177 - (view) (annotate) - [select for diffs]
Modified Mon Nov 24 01:19:15 2003 UTC (20 years, 3 months ago) by dpavlin
File length: 19332 byte(s)
Diff to previous 170 , to selected 101
support for lookup fields. Implemented using GDBM or TDB (which I recommend
because it's fastest implementation)


Revision 170 - (view) (annotate) - [select for diffs]
Modified Sun Nov 23 15:42:16 2003 UTC (20 years, 4 months ago) by dpavlin
File length: 17207 byte(s)
Diff to previous 164 , to selected 101
Re-wrote parsing for ISO-type data (isis, marc) to use in-memory cache of
format... 10% speed improvement and cleaner code. Include filter functions
just once.


Revision 164 - (view) (annotate) - [select for diffs]
Modified Sat Nov 22 22:04:05 2003 UTC (20 years, 4 months ago) by dpavlin
File length: 16888 byte(s)
Diff to previous 163 , to selected 101
implemented filter which can replace (or be used together with) unac_string
from Text::Unaccent


Revision 163 - (view) (annotate) - [select for diffs]
Modified Thu Nov 20 21:23:40 2003 UTC (20 years, 4 months ago) by dpavlin
File length: 16781 byte(s)
Diff to previous 153 , to selected 101
Added type="swish_exact" to save data into swish index with boundaries
xxbxx data xxexxx. This helps implement exact match from the beginning
of the query and exact match to the full query, which are selected using the
e[nr] field in the web user interface (with the same [nr] as the f[nr] and
v[nr] fields). It must have value 1 (from beginning), 2 (from end, not that
useful...) or 3 (1+2 - exact match)
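
One way to picture the boundary trick is the Python sketch below. It is not the script's Perl code: the helper names wrap_exact and exact_query are hypothetical, the markers are copied exactly as written in this message, and treating the e[nr] value as two bit flags is an assumption based on the "1+2" note.

```python
# Illustrative sketch only; names and query shape are assumptions.
BEGIN_MARK = "xxbxx"   # start boundary, as written in the log message
END_MARK = "xxexxx"    # end boundary, as written in the log message


def wrap_exact(value):
    """Surround indexed data with the boundary markers."""
    return f"{BEGIN_MARK} {value} {END_MARK}"


def exact_query(term, mode):
    """Build a phrase query for mode 1 (begin), 2 (end) or 3 (both)."""
    if mode not in (1, 2, 3):
        raise ValueError("mode must be 1, 2 or 3")
    words = []
    if mode & 1:               # anchor the phrase at the start of the field
        words.append(BEGIN_MARK)
    words.append(term)
    if mode & 2:               # anchor the phrase at the end of the field
        words.append(END_MARK)
    # Assumption: the search engine treats a double-quoted string as a phrase.
    return '"' + " ".join(words) + '"'


print(wrap_exact("war and peace"))
print(exact_query("war and peace", 3))
```

Because the markers are indexed as ordinary words, a phrase search that includes them can only match at the edges of the stored value.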


Revision 153 - (view) (annotate) - [select for diffs]
Modified Sun Nov 16 22:42:41 2003 UTC (20 years, 4 months ago) by dpavlin
File length: 16190 byte(s)
Diff to previous 144 , to selected 101
implemented formats which can be used to produce links between records
in WebPac (documented in README.links)


Revision 144 - (view) (annotate) - [select for diffs]
Modified Sun Nov 16 11:55:18 2003 UTC (20 years, 4 months ago) by dpavlin
File length: 15286 byte(s)
Diff to previous 138 , to selected 101
fixed filters (again)


Revision 138 - (view) (annotate) - [select for diffs]
Modified Wed Oct 29 23:10:51 2003 UTC (20 years, 4 months ago) by dpavlin
File length: 15252 byte(s)
Diff to previous 137 , to selected 101
Aargh! I should really go to sleep or make PostgreSQL replication or something...


Revision 137 - (view) (annotate) - [select for diffs]
Modified Wed Oct 29 22:57:43 2003 UTC (20 years, 4 months ago) by dpavlin
File length: 15112 byte(s)
Diff to previous 136 , to selected 101
I removed too much: this always added delimiter before first element


Revision 136 - (view) (annotate) - [select for diffs]
Modified Wed Oct 29 22:46:49 2003 UTC (20 years, 4 months ago) by dpavlin
File length: 15074 byte(s)
Diff to previous 135 , to selected 101
another fix for repeatable fields


Revision 135 - (view) (annotate) - [select for diffs]
Modified Wed Oct 29 21:27:00 2003 UTC (20 years, 4 months ago) by dpavlin
File length: 15165 byte(s)
Diff to previous 109 , to selected 101
fix repeatable fields in index data


Revision 109 - (view) (annotate) - [select for diffs]
Modified Mon Jul 14 18:50:39 2003 UTC (20 years, 8 months ago) by dpavlin
File length: 15165 byte(s)
Diff to previous 108 , to selected 101
erase also *.PTR files


Revision 108 - (view) (annotate) - [select for diffs]
Modified Mon Jul 14 18:20:27 2003 UTC (20 years, 8 months ago) by dpavlin
File length: 14925 byte(s)
Diff to previous 106 , to selected 101
Overcome limit of 32 open databases. Unfortunately, OpenIsis in its current
version (0.9.0) doesn't support a close call, so you need the patch from:
http://www.rot13.org/~dpavlin/projects/openisis-0.9.0-perl_close.diff


Revision 106 - (view) (annotate) - [select for diffs]
Modified Mon Jul 14 17:09:36 2003 UTC (20 years, 8 months ago) by dpavlin
File length: 14721 byte(s)
Diff to previous 104 , to selected 101
check for bogus *.TXT databases (with zero length or 0 records) and
erase them to force OpenIsis to use binary files


Revision 104 - (view) (annotate) - [select for diffs]
Modified Mon Jul 14 10:55:35 2003 UTC (20 years, 8 months ago) by dpavlin
File length: 13949 byte(s)
Diff to previous 102 , to selected 101
remove fake progress bar also


Revision 102 - (view) (annotate) - [select for diffs]
Modified Mon Jul 14 10:54:34 2003 UTC (20 years, 8 months ago) by dpavlin
File length: 13917 byte(s)
Diff to previous 101
removed debugging


Revision 101 - (view) (annotate) - [selected]
Modified Mon Jul 14 10:52:13 2003 UTC (20 years, 8 months ago) by dpavlin
File length: 13977 byte(s)
Diff to previous 98
- better error reporting from OpenIsis
- added show_progress in global.conf to turn off progress bar


Revision 98 - (view) (annotate) - [select for diffs]
Modified Sun Jul 13 22:29:14 2003 UTC (20 years, 8 months ago) by dpavlin
File length: 13585 byte(s)
Diff to previous 97 , to selected 101
fixed ordering


Revision 97 - (view) (annotate) - [select for diffs]
Modified Sun Jul 13 21:57:12 2003 UTC (20 years, 8 months ago) by dpavlin
File length: 13602 byte(s)
Diff to previous 90 , to selected 101
ability to join repeatable fields before inserting into index


Revision 90 - (view) (annotate) - [select for diffs]
Modified Sun Jul 13 13:22:50 2003 UTC (20 years, 8 months ago) by dpavlin
File length: 13300 byte(s)
Diff to previous 81 , to selected 101
repeatable fields (broken when other input formats were introduced) work
again


Revision 81 - (view) (annotate) - [select for diffs]
Modified Tue Jul 8 22:13:56 2003 UTC (20 years, 8 months ago) by dpavlin
File length: 13117 byte(s)
Diff to previous 74 , to selected 101
the great rename: isis2xml.* -> all2xml.*


Revision 74 - (view) (annotate) - [select for diffs]
Modified Sat Jul 5 22:37:30 2003 UTC (20 years, 8 months ago) by dpavlin
File length: 12774 byte(s)
Diff to previous 67 , to selected 101
support for new feed format which has a decimal field number, a colon
and a space at the beginning of each line (like: 0: data)
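
Going by the "0: data" example, a line in this feed format could be parsed along these lines. This is a Python sketch, not the Perl in all2xml.pl; the name parse_feed_line and the choice to return None for non-matching lines are assumptions.

```python
# Illustrative sketch only; the real feed handling may differ.
import re

# Decimal field number, then a colon and one space, then the data.
LINE_RE = re.compile(r"^(\d+): (.*)$")


def parse_feed_line(line):
    """Return (field_number, data), or None if the line does not match."""
    match = LINE_RE.match(line.rstrip("\r\n"))
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


print(parse_feed_line("0: data\n"))
print(parse_feed_line("245: Title: subtitle"))
```

Only the first colon-and-space pair is treated as the separator, so data that itself contains a colon is kept intact.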


Revision 67 - (view) (annotate) - [select for diffs]
Modified Fri Jul 4 23:29:27 2003 UTC (20 years, 8 months ago) by dpavlin
File length: 12739 byte(s)
Diff to previous 62 , to selected 101
implemented feed method which calls external program that returns
data line-by-line


Revision 62 - (view) (annotate) - [select for diffs]
Modified Fri Jul 4 20:11:48 2003 UTC (20 years, 8 months ago) by dpavlin
File length: 11691 byte(s)
Diff to previous 59 , to selected 101
added MARC file import


Revision 59 - (view) (annotate) - [select for diffs]
Modified Fri Jul 4 17:57:11 2003 UTC (20 years, 8 months ago) by dpavlin
File length: 10549 byte(s)
Diff to previous 58 , to selected 101
added config tag which can read any variable from isis2xml.conf file for
that library


Revision 58 - (view) (annotate) - [select for diffs]
Modified Fri Jul 4 16:56:40 2003 UTC (20 years, 8 months ago) by dpavlin
File length: 9895 byte(s)
Diff to previous 57 , to selected 101
support type and sub-types (in form type_subtype)


Revision 57 - (view) (annotate) - [select for diffs]
Modified Fri Jul 4 15:05:23 2003 UTC (20 years, 8 months ago) by dpavlin
File length: 9738 byte(s)
Diff to previous 56 , to selected 101
don't choke on input which iconv can't convert


Revision 56 - (view) (annotate) - [select for diffs]
Modified Wed Jun 25 12:09:27 2003 UTC (20 years, 8 months ago) by dpavlin
File length: 9727 byte(s)
Diff to previous 54 , to selected 101
use start_row from excel.xml


Revision 54 - (view) (annotate) - [select for diffs]
Modified Mon Jun 23 20:20:32 2003 UTC (20 years, 9 months ago) by dpavlin
File length: 9734 byte(s)
Diff to previous 50 , to selected 101
added Microsoft Excel file import


Revision 50 - (view) (annotate) - [select for diffs]
Modified Sun Jun 1 13:46:42 2003 UTC (20 years, 9 months ago) by dpavlin
File length: 7470 byte(s)
Diff to previous 44 , to selected 101
move database arguments to .conf file


Revision 44 - (view) (annotate) - [select for diffs]
Modified Sat Mar 22 22:51:48 2003 UTC (21 years ago) by dpavlin
File length: 7223 byte(s)
Diff to previous 43 , to selected 101
fix


Revision 43 - (view) (annotate) - [select for diffs]
Modified Sat Mar 22 22:43:05 2003 UTC (21 years ago) by dpavlin
File length: 7232 byte(s)
Diff to previous 42 , to selected 101
fixed alphabet soup -- characters encoding should really work now!


Revision 42 - (view) (annotate) - [select for diffs]
Modified Sat Mar 15 21:48:48 2003 UTC (21 years ago) by dpavlin
File length: 7188 byte(s)
Diff to previous 40 , to selected 101
filter fix && optimisation


Revision 40 - (view) (annotate) - [select for diffs]
Modified Sat Mar 15 21:33:36 2003 UTC (21 years ago) by dpavlin
File length: 7153 byte(s)
Diff to previous 35 , to selected 101
major de-mungling of different codepages: use same codepage inside perl
(as opposed to UTF-8) and in files on disk


Revision 35 - (view) (annotate) - [select for diffs]
Modified Sun Feb 23 15:47:40 2003 UTC (21 years ago) by dpavlin
File length: 8276 byte(s)
Diff to previous 34 , to selected 101
last changes; completely broken charsets


Revision 34 - (view) (annotate) - [select for diffs]
Modified Sun Feb 23 08:06:07 2003 UTC (21 years ago) by dpavlin
File length: 8189 byte(s)
Diff to previous 32 , to selected 101
append="1" fix


Revision 32 - (view) (annotate) - [select for diffs]
Modified Sun Feb 23 07:53:01 2003 UTC (21 years ago) by dpavlin
File length: 8151 byte(s)
Diff to previous 29 , to selected 101
display fields using order="" attribute


Revision 29 - (view) (annotate) - [select for diffs]
Modified Sun Feb 23 07:08:54 2003 UTC (21 years ago) by dpavlin
File length: 7880 byte(s)
Diff to previous 21 , to selected 101
repeatable field support, filter functions added, broken charset (again!)


Revision 21 - (view) (annotate) - [select for diffs]
Modified Sun Feb 23 00:00:51 2003 UTC (21 years ago) by dpavlin
File length: 7225 byte(s)
Diff to previous 20 , to selected 101
fix


Revision 20 - (view) (annotate) - [select for diffs]
Modified Sat Feb 22 23:49:22 2003 UTC (21 years ago) by dpavlin
File length: 7245 byte(s)
Diff to previous 17 , to selected 101
add filter="name" for fields (to correct strange input data or make variations
for indexing)


Revision 17 - (view) (annotate) - [select for diffs]
Modified Sat Feb 22 21:33:04 2003 UTC (21 years, 1 month ago) by dpavlin
File length: 6454 byte(s)
Diff to previous 13 , to selected 101
fix index insertion


Revision 13 - (view) (annotate) - [select for diffs]
Modified Sun Feb 16 22:41:37 2003 UTC (21 years, 1 month ago) by dpavlin
File length: 6528 byte(s)
Diff to previous 10 , to selected 101
added configuration file with database descriptions,
moved isis.xml definition file to a separate directory (in preparation for MARC),
support for different encodings in different files,
various fixes, improvements and badly written parts which will change ;-)


Revision 10 - (view) (annotate) - [select for diffs]
Modified Thu Jan 16 17:35:54 2003 UTC (21 years, 2 months ago) by dpavlin
File length: 5683 byte(s)
Diff to previous 9 , to selected 101
bunch of changes: make design more modular, implement index (partial
implementation) and other small and big changes


Revision 9 - (view) (annotate) - [select for diffs]
Modified Sat Jan 11 19:55:30 2003 UTC (21 years, 2 months ago) by dpavlin
File length: 6713 byte(s)
Diff to previous 7 , to selected 101
renamed "old" index to swish, and introduced index which is -- index;
implemented using PostgreSQL for now.


Revision 7 - (view) (annotate) - [select for diffs]
Modified Sat Jan 11 16:44:03 2003 UTC (21 years, 2 months ago) by dpavlin
File length: 5543 byte(s)
Diff to previous 5 , to selected 101
major modifications to produce first (non-working) version of Web CGI
interface.


Revision 5 - (view) (annotate) - [select for diffs]
Modified Sat Jan 11 06:14:48 2003 UTC (21 years, 2 months ago) by dpavlin
File length: 5107 byte(s)
Diff to previous 4 , to selected 101
require 1.02 version of Text::Unaccent (1.01 can't pass 'make test' here!)


Revision 4 - (view) (annotate) - [select for diffs]
Modified Sun Dec 1 22:51:29 2002 UTC (21 years, 3 months ago) by dpavlin
File length: 5065 byte(s)
Diff to previous 3 , to selected 101
remove subfield definition from values which are displayed and indexed


Revision 3 - (view) (annotate) - [select for diffs]
Modified Sat Nov 30 00:36:34 2002 UTC (21 years, 3 months ago) by dpavlin
File length: 4996 byte(s)
Diff to previous 1 , to selected 101
first really working version -- creates xml file for swish + swish config


Revision 1 - (view) (annotate) - [select for diffs]
Added Sun Nov 24 20:52:11 2002 UTC (21 years, 3 months ago) by dpavlin
File length: 1483 byte(s)
Diff to selected 101
Initial revision

