/[webpac]/trunk/parse_format.pm
This is a repository of my old source code, which is no longer updated. Go to git.rot13.org for current projects!
Log of /trunk/parse_format.pm

Revision 678
Modified Sun Feb 27 23:07:35 2005 UTC (19 years, 1 month ago) by dpavlin
File length: 7461 byte(s)
Experimental support for dBase .dbf files. Usage like this in all2xml.conf:

[hda]
       dbf_file=/data/drustvene/hda/ISO.DBF
       type=dbf
       dbf_codepage=cp852
       dbf_mapping=<<_END_OF_MAP_
ID_BROJ        001
ISBN_BROJ      010
SKUPINA1       200
SKUPINA2       205
SKUPINA4       210
SKUPINA5       215
SKUPINA6       225
SKUPINA7       300
ANOTACIJA      330
PREDMET1       610
PREDMET2       610
PREDMET3       510
UDK            675
REDALICA       700
SIGNATURA      990
_END_OF_MAP_

The dbf type uses the <isis> tag in import_xml, and dbf_codepage
overrides the codepage specified in the import_xml file.

Small code refactoring.
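
As an illustration of how a dbf_mapping block like the one above could be read, here is a minimal sketch. The helper name parse_dbf_mapping and the choice of a list of pairs are assumptions made for this example, not taken from the actual parse_format.pm code.

```perl
use strict;
use warnings;

# Hypothetical helper: turn dbf_mapping text into [column, field] pairs.
# A list of pairs is used instead of a hash keyed by field number so that
# several columns can feed the same field (PREDMET1 and PREDMET2 both
# map to 610 in the example configuration).
sub parse_dbf_mapping {
    my ($text) = @_;
    my @pairs;
    foreach my $line (split /\n/, $text) {
        # column name, whitespace, numeric field; skip anything else
        next unless $line =~ /^\s*(\S+)\s+(\d+)\s*$/;
        push @pairs, [ $1, $2 ];
    }
    return @pairs;
}

my @map = parse_dbf_mapping(<<'_END_OF_MAP_');
ID_BROJ    001
PREDMET1   610
PREDMET2   610
_END_OF_MAP_

printf "%s -> %s\n", @$_ for @map;
```

Field numbers are kept as strings so that leading zeros such as 001 survive.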



Revision 488
Modified Wed Sep 29 17:22:24 2004 UTC (19 years, 6 months ago) by dpavlin
File length: 7318 byte(s)
Changes to support UTF-8 encoding from
Spreadsheet::ParseExcel::FmtDefault.

You will have to modify line 69 of that module from
	return pack('C*', unpack('n*', $sTxt));
to the following, which returns UTF-8:
	return pack('U*', unpack('n*', $sTxt));
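
A small standalone sketch of what that change does, assuming the input is big-endian UTF-16 text without surrogate pairs. The function name utf16be_to_chars is made up for this example. With 'C*' every value is packed as a single byte, so code units above 255 cannot be represented. With 'U*' each code unit becomes one Unicode character.

```perl
use strict;
use warnings;

# unpack('n*') splits the byte string into 16-bit big-endian code units;
# pack('U*') turns each code unit into one character of a Perl string.
# Assumption: characters outside the BMP (surrogate pairs) do not occur.
sub utf16be_to_chars {
    my ($sTxt) = @_;
    return pack('U*', unpack('n*', $sTxt));
}

# "č" (U+010D) followed by "a" (U+0061), encoded as UTF-16BE
my $chars = utf16be_to_chars("\x01\x0D\x00\x61");
printf "U+%04X U+%04X\n", map { ord } split //, $chars;
```

The result is a Perl character string. To write it out as UTF-8 bytes it still has to be encoded, for example with the core Encode module.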



Revision 384
Modified Wed Jul 7 20:58:58 2004 UTC (19 years, 8 months ago) by dpavlin
File length: 7193 byte(s)
Don't be greedy when looking for the end of eval{...}.
This allows { } to appear in the field after the eval.
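
The effect of a non-greedy match can be sketched as follows. The helper split_eval, the regular expression, and the sample format string are illustrative assumptions, not the code from this revision.

```perl
use strict;
use warnings;

# Hypothetical helper: separate a leading eval{...} from the rest of a
# format string. Returns (expression, rest), or (undef, format) when the
# string does not start with eval{...}.
sub split_eval {
    my ($format) = @_;
    # .+? stops at the first closing brace, so braces that appear later
    # in the format string stay in the second part
    if ($format =~ m/^eval\{(.+?)\}(.*)$/s) {
        return ($1, $2);
    }
    return (undef, $format);
}

my ($expr, $rest) = split_eval('eval{v901^a eq "Mikrotezaurus"}v200^a {v210^c}');
print "expr: $expr\n";
print "rest: $rest\n";
```

With a greedy .+ the expression would run up to the last } in the string and swallow the braces that belong to the field. The trade-off in this sketch is that the expression itself cannot contain a closing brace.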


Revision 381
Modified Wed Jul 7 17:34:42 2004 UTC (19 years, 8 months ago) by dpavlin
File length: 7192 byte(s)
If a field in eval isn't repeatable, use its first value;
also return eval errors.


Revision 297
Modified Fri Apr 2 23:30:44 2004 UTC (19 years, 11 months ago) by dpavlin
File length: 7054 byte(s)
removed unneeded warning


Revision 293
Modified Mon Mar 29 19:41:12 2004 UTC (20 years ago) by dpavlin
File length: 7053 byte(s)
bug fix: eval now honours codepage settings


Revision 263
Modified Fri Mar 12 15:06:58 2004 UTC (20 years ago) by dpavlin
File length: 7033 byte(s)
ported r260 from hidra branch: moved eval to parse_format.pm where it
belongs. Also changed eval format to: eval{v901^a eq "Mikrotezaurus"}
(please note: this is the same format as in the ISIS formatting language)
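
One way such a condition could be evaluated is sketched below. The record layout (field number, then subfield letter), the helper names, and the substitution approach are all assumptions made for illustration. The real parse_format.pm may work quite differently.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a value into a single-quoted Perl literal.
sub quote_value {
    my ($val) = @_;
    $val = '' unless defined $val;
    $val =~ s/([\\'])/\\$1/g;    # escape backslashes and single quotes
    return "'$val'";
}

# Assumed record layout: { field_number => { subfield_letter => value } }
sub eval_condition {
    my ($expr, $rec) = @_;
    # replace every vNNN^x reference with the quoted subfield value
    $expr =~ s/v(\d+)\^(\w)/quote_value($rec->{$1}{$2})/ge;
    my $result = eval $expr;
    die "eval failed: $@" if $@;
    return $result ? 1 : 0;
}

my $rec = { 901 => { a => 'Mikrotezaurus' } };
print eval_condition('v901^a eq "Mikrotezaurus"', $rec) ? "match\n" : "no match\n";
```

Because the expression is passed to Perl's string eval, this only makes sense for trusted configuration files. Errors are raised with die in this sketch. A caller could just as well collect and return them.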


Revision 187
Modified Sat Nov 29 18:58:34 2003 UTC (20 years, 4 months ago) by dpavlin
File length: 6502 byte(s)
support for subfields in fields 10/11


Revision 176
Modified Mon Nov 24 01:16:04 2003 UTC (20 years, 4 months ago) by dpavlin
File length: 6446 byte(s)
fix for weird prefixes (consisting of chars and numbers)


Revision 170
Modified Sun Nov 23 15:42:16 2003 UTC (20 years, 4 months ago) by dpavlin
File length: 6179 byte(s)
Re-wrote parsing for ISO-type data (isis, marc) to use in-memory cache of
format... 10% speed improvement and cleaner code. Include filter functions
just once.


Revision 105
Modified Mon Jul 14 17:08:37 2003 UTC (20 years, 8 months ago) by dpavlin
File length: 4931 byte(s)
renamed get_sf to (isis|marc)_sf to avoid warnings about redefining the
function


Revision 92
Modified Sun Jul 13 13:42:17 2003 UTC (20 years, 8 months ago) by dpavlin
File length: 4931 byte(s)
repeatable fields in feeds are currently not supported


Revision 90
Modified Sun Jul 13 13:22:50 2003 UTC (20 years, 8 months ago) by dpavlin
File length: 4800 byte(s)
repeatable fields (broken when other input formats were introduced) work
again


Revision 78
Modified Sat Jul 5 23:39:04 2003 UTC (20 years, 8 months ago) by dpavlin
File length: 4736 byte(s)
fix character conversion


Revision 67
Modified Fri Jul 4 23:29:27 2003 UTC (20 years, 8 months ago) by dpavlin
File length: 4682 byte(s)
implemented feed method, which calls an external program that returns
data line by line


Revision 62
Modified Fri Jul 4 20:11:48 2003 UTC (20 years, 8 months ago) by dpavlin
File length: 3471 byte(s)
added MARC file import


Revision 57
Modified Fri Jul 4 15:05:23 2003 UTC (20 years, 8 months ago) by dpavlin
File length: 3289 byte(s)
don't choke on input which iconv can't convert


Revision 54
Modified Mon Jun 23 20:20:32 2003 UTC (20 years, 9 months ago) by dpavlin
File length: 3158 byte(s)
added Microsoft Excel file import


Revision 46
Modified Sun Mar 23 01:14:59 2003 UTC (21 years ago) by dpavlin
File length: 1808 byte(s)
speed-up


Revision 43
Modified Sat Mar 22 22:43:05 2003 UTC (21 years ago) by dpavlin
File length: 2106 byte(s)
fixed alphabet soup -- character encoding should really work now!


Revision 40
Modified Sat Mar 15 21:33:36 2003 UTC (21 years ago) by dpavlin
File length: 1666 byte(s)
major de-mungling of different codepages: use same codepage inside perl
(as opposed to UTF-8) and in files on disk


Revision 23
Modified Sun Feb 23 06:50:55 2003 UTC (21 years, 1 month ago) by dpavlin
File length: 1512 byte(s)
fixed parser, added support for 'mfn' field


Revision 22
Modified Sun Feb 23 00:01:08 2003 UTC (21 years, 1 month ago) by dpavlin
File length: 1301 byte(s)
*** empty log message ***


Revision 10
Added Thu Jan 16 17:35:54 2003 UTC (21 years, 2 months ago) by dpavlin
File length: 987 byte(s)
bunch of changes: make design more modular, implement index (partial
implementation) and other small and big changes

