This is a repository of my old source code, which isn't updated any more. Go to git.rot13.org for current projects!
Log of /branches/humanistika/all2xml.pl
Revision 286
Modified Sun Mar 14 19:44:57 2004 UTC (19 years, 9 months ago) by dpavlin
File length: 26152 byte(s)
Diff to previous 277
merge changes from trunk to branches, converted all import_xml
Revision 263
Modified Fri Mar 12 15:06:58 2004 UTC (19 years, 9 months ago) by dpavlin
Original Path: trunk/all2xml.pl
File length: 25532 byte(s)
Diff to previous 259
ported r260 from hidra branch: moved eval to parse_format.pm where it
belongs. Also changed eval format to: eval{v901^a eq "Mikrotezaurus"}
(please note: same format as in the ISIS formatting language)
Revision 259
Modified Thu Mar 11 18:23:59 2004 UTC (19 years, 9 months ago) by dpavlin
Original Path: trunk/all2xml.pl
File length: 25885 byte(s)
Diff to previous 256
ported 257:258 from hidra branch
all2xml.pl - fix for swish without filter
openisis/perl/OpenIsis.pm - removed warning
Revision 255
Modified Tue Mar 9 12:17:05 2004 UTC (19 years, 9 months ago) by dpavlin
Original Path: trunk/all2xml.pl
File length: 26092 byte(s)
Diff to previous 234
ported r248:252 from hidra branch:
r248: much improved installation instructions, especially for Debian
GNU/Linux distributions
r249: changed use of Spreadsheet::ParseExcel and MARC to require/import so
that dependency on those modules can be resolved in runtime.
r250: finished installation documentation
r251: removing dependency on HTML::Parser would ease installation
r252: smaller eval{} fixes. eval{} logic should really move to
parse_format.pm
Revision 224
Modified Sun Feb 8 20:16:54 2004 UTC (19 years, 10 months ago) by dpavlin
Original Path: trunk/all2xml.pl
File length: 25490 byte(s)
Diff to previous 218
important bug fix for bug introduced in 1.57: it might eat your data
if you are not using a filter. This one was hard to find...
Revision 218
Modified Thu Feb 5 10:55:58 2004 UTC (19 years, 10 months ago) by dpavlin
Original Path: trunk/all2xml.pl
File length: 25489 byte(s)
Diff to previous 215
Changed the never-used format configuration option for import_xml to
marc_format to prevent clash with format for output. If you don't
specify it (as I never do) it will default to 'usmarc' which is probably
the right thing (tm).
Revision 215
Modified Sun Feb 1 22:06:00 2004 UTC (19 years, 10 months ago) by dpavlin
Original Path: trunk/all2xml.pl
File length: 25462 byte(s)
Diff to previous 207
brown-bag bug: I was using MARC.pm wrong. Now the whole file is loaded
at the start of indexing, which makes memory usage much more step-like, but
enables a real progress indicator and a gain of a few seconds in indexing
speed.
Revision 207
Modified Sat Jan 31 21:03:06 2004 UTC (19 years, 10 months ago) by dpavlin
Original Path: trunk/all2xml.pl
File length: 25315 byte(s)
Diff to previous 199
thesaurus is finally working... It contains recursive entries to the parent
term, and we actually needed to display narrower terms, so mem_lookup was
created. Important changes:
- you can write eval{"901a" eq "Mikrotezaurus"} within the <isis>
tag, and if the expression evaluates to false, no content will be output
(it's used to hide microthesaurus terms from lower-level descriptors)
- mem_lookup.pm now supports formats: you can write something like
[a:5614];;[d:[a:5614]] and it will correctly embed values
Revision 197
Modified Sun Dec 21 03:27:02 2003 UTC (19 years, 11 months ago) by dpavlin
Original Path: trunk/all2xml.pl
File length: 25019 byte(s)
Diff to previous 196
Changed behaviour of creating data for swish_exact when using type="index".
Now every line is separate entry in swish_exact. That will create additional
clutter in the index (fields which won't be used because we are not inserting
them in the index), but you will have to bear with this for now.
Revision 195
Modified Sun Dec 14 20:50:03 2003 UTC (20 years ago) by dpavlin
Original Path: trunk/all2xml.pl
File length: 24995 byte(s)
Diff to previous 188
don't repeat field name if same as last, support format_name and
format_delimiter on field level if using iterate_by_page (without this, it's
really hard to get useful formatting when using iterate_by_page), don't warn
on rare occasion (which is faulty import_xml definition, but anyway...) when
using append="1"
Revision 188
Modified Sat Nov 29 19:07:00 2003 UTC (20 years ago) by dpavlin
Original Path: trunk/all2xml.pl
File length: 24159 byte(s)
Diff to previous 182
implemented index_delimiter, which enables you to format index entries as
(values to be inserted in index);;(values to be displayed) if
index_delimiter=";;" is defined. This will allow you to index (and
search) through values from the original database and still be able to
display lookup fields.
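As a rough illustration of the index_delimiter convention described in this log message, the sketch below splits one entry into the value to index and the value to display. It is written in Python rather than the script's Perl, and the function name split_index_entry, the sample values, and the fallback for entries without a delimiter are assumptions, not taken from all2xml.pl.

```python
def split_index_entry(entry, index_delimiter=";;"):
    """Split 'indexed;;displayed' into (index_value, display_value).

    Assumption: an entry without the delimiter is both indexed and
    displayed as-is; the log message does not say what happens then.
    """
    if index_delimiter and index_delimiter in entry:
        # Split only on the first delimiter so the display part may
        # itself contain the delimiter string.
        index_value, display_value = entry.split(index_delimiter, 1)
        return index_value, display_value
    return entry, entry


# Hypothetical sample entries, for illustration only.
print(split_index_entry("5614;;Mikrotezaurus"))
print(split_index_entry("plain value"))
```

Splitting on the first delimiter only is a design choice of this sketch; the original implementation may behave differently.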
Revision 178
Modified Mon Nov 24 21:54:19 2003 UTC (20 years ago) by dpavlin
Original Path: trunk/all2xml.pl
File length: 23001 byte(s)
Diff to previous 177
major improvements: you can select the order of scanning in each topic tag
to be either by line (which is the default; repeatable fields in one line will
be unrolled) or page-by-page (using the new iterate_by_page="1" attribute).
The new page-by-page mode is really useful with lookups (because you can
append fields with lookups in the same line, but using two tags), but it will
create multiple rows in the html output.
Revision 177
Modified Mon Nov 24 01:19:15 2003 UTC (20 years ago) by dpavlin
Original Path: trunk/all2xml.pl
File length: 19332 byte(s)
Diff to previous 170
support for lookup fields. Implemented using GDBM or TDB (which I recommend
because it's the fastest implementation)
Revision 170
Modified Sun Nov 23 15:42:16 2003 UTC (20 years ago) by dpavlin
Original Path: trunk/all2xml.pl
File length: 17207 byte(s)
Diff to previous 164
Re-wrote parsing for ISO-type data (isis, marc) to use in-memory cache of
format... 10% speed improvement and cleaner code. Include filter functions
just once.
Revision 163
Modified Thu Nov 20 21:23:40 2003 UTC (20 years ago) by dpavlin
Original Path: trunk/all2xml.pl
File length: 16781 byte(s)
Diff to previous 153
Added type="swish_exact" to save data into swish index with boundaries
xxbxx data xxexxx. This is helpful to implement exact match from beginning
of query and exact match to full query which are defined using e[nr] field
in web user interface (with same [nr] as f[nr] and v[nr] fields) which
have to have value 1 (from beginning) 2 (from end, not that useful...) or
3 (1+2 - exact match)
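The boundary idea in this log message can be sketched roughly as follows: values are wrapped in marker words at indexing time, and a query adds the same markers depending on the e[nr] value. This is a Python sketch, not the Perl code from all2xml.pl; the function names wrap_exact and exact_phrase are hypothetical, the marker strings are copied verbatim from the message above, and how the resulting phrase is actually passed to swish is not stated in the log.

```python
# Boundary markers copied verbatim from the log message above.
BEGIN_MARK = "xxbxx"
END_MARK = "xxexxx"


def wrap_exact(value):
    """Surround a field value with boundary markers before indexing."""
    return f"{BEGIN_MARK} {value} {END_MARK}"


def exact_phrase(query, e):
    """Build a search phrase for an e[nr] value.

    1 = match from beginning, 2 = match from end, 3 = both (exact match).
    Any other value leaves the query unchanged (an assumption).
    """
    words = [query]
    if e in (1, 3):
        words.insert(0, BEGIN_MARK)
    if e in (2, 3):
        words.append(END_MARK)
    return " ".join(words)


# The indexed form of a value and the phrase for an exact match line up.
print(wrap_exact("perl programming"))
print(exact_phrase("perl programming", 3))
```

With this arrangement an exact match (value 3) is simply a phrase search that includes both markers, so it only matches entries whose whole indexed value equals the query.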
Revision 153
Modified Sun Nov 16 22:42:41 2003 UTC (20 years ago) by dpavlin
Original Path: trunk/all2xml.pl
File length: 16190 byte(s)
Diff to previous 144
implemented formats which can be used to produce links between records
in WebPac (documented in README.links)
Revision 106
Modified Mon Jul 14 17:09:36 2003 UTC (20 years, 5 months ago) by dpavlin
Original Path: trunk/all2xml.pl
File length: 14721 byte(s)
Diff to previous 104
check for bogus *.TXT databases (with zero length or 0 records) and
erase them to force OpenIsis to use binary files
Revision 101
Modified Mon Jul 14 10:52:13 2003 UTC (20 years, 5 months ago) by dpavlin
Original Path: trunk/all2xml.pl
File length: 13977 byte(s)
Diff to previous 98
- better error reporting from OpenIsis
- added show_progress in global.conf to turn off progress bar
Revision 74
Modified Sat Jul 5 22:37:30 2003 UTC (20 years, 5 months ago) by dpavlin
Original Path: trunk/all2xml.pl
File length: 12774 byte(s)
Diff to previous 67
support for the new feed format, which has a decimal field number, a colon
and a space at the beginning of each line (like: 0: data)
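A line in the shape of the "0: data" example from this log message could be parsed along these lines. This is a minimal Python sketch, not the script's Perl code; the name parse_feed_line and the choice to return None for lines that do not match are assumptions.

```python
import re

# Decimal field number, then ": ", then the data (e.g. "0: data").
FEED_LINE = re.compile(r"^(\d+): (.*)$")


def parse_feed_line(line):
    """Return (field_number, data) for a feed line, or None if it
    does not follow the 'number: data' shape."""
    match = FEED_LINE.match(line.rstrip("\n"))
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


print(parse_feed_line("0: data"))
```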
Revision 40
Modified Sat Mar 15 21:33:36 2003 UTC (20 years, 9 months ago) by dpavlin
Original Path: trunk/all2xml.pl
File length: 7153 byte(s)
Diff to previous 35
major de-mungling of different codepages: use same codepage inside perl
(as opposed to UTF-8) and in files on disk
Revision 20
Modified Sat Feb 22 23:49:22 2003 UTC (20 years, 9 months ago) by dpavlin
Original Path: trunk/all2xml.pl
File length: 7245 byte(s)
Diff to previous 17
add filter="name" for fields (to correct strange input data or make variations
for indexing)
Revision 13
Modified Sun Feb 16 22:41:37 2003 UTC (20 years, 9 months ago) by dpavlin
Original Path: trunk/all2xml.pl
File length: 6528 byte(s)
Diff to previous 10
added configuration file with database descriptions,
moved isis.xml definition file into a separate directory (in preparation for MARC),
support for different encodings in different files,
various fixes, improvements and badly written parts which will change ;-)
Revision 10
Modified Thu Jan 16 17:35:54 2003 UTC (20 years, 10 months ago) by dpavlin
Original Path: trunk/all2xml.pl
File length: 5683 byte(s)
Diff to previous 9
bunch of changes: make design more modular, implement index (partial
implementation) and other small and big changes
Revision 9
Modified Sat Jan 11 19:55:30 2003 UTC (20 years, 11 months ago) by dpavlin
Original Path: trunk/all2xml.pl
File length: 6713 byte(s)
Diff to previous 7
renamed "old" index to swish, and introduced index which is -- index;
implemented using PostgreSQL for now.