
Contents of /openisis/0.9.9e/doc/IIF.txt



Revision 604
Mon Dec 27 21:49:01 2004 UTC (19 years, 4 months ago) by dpavlin
File MIME type: text/plain
File size: 7714 byte(s)
import of new openisis release, 0.9.9e

1 IIF, MARC and Z39.50
2
3
4 IIF is the "Information Interchange Format", a record serialization format
5 specified in ISO standard 2709, also published as ANSI/NISO
6 > http://www.niso.org/standards/resources/Z39-2.pdf Z39.2.
7 IIF is mostly a plaintext format: almost all information is encoded
8 using ASCII characters (no binary numbers), and the only control characters
9 used are byte values 29 (record terminator, RT), 30 (field terminator, FT)
10 and 31 (subfield delimiter).
11
12
13 > http://www.loc.gov/marc/ MARC
14 ("MAchine Readable Catalogue") is actually a family of largely incompatible
15 standards (
16 > http://www.loc.gov/marc/marcdocz.html USMARC
17 ,
18 > http://www.ifla.org/VI/3/p1996-1/sec-uni.htm UNIMARC
19 , UKMARC, ...) that evolved from MARC I (1965).
20 While the main concern of the MARC standards is to specify actual data models
21 (assigning tags and subfield codes, which can be used perfectly well in
22 Malete, CDS/ISIS or other databases), they also specify a variant of IIF as
23 the suggested common format for data exchange, which we refer to here as "MARC".
24 (This file syntax seems to be mostly the same for all MARC standards.)
25
26
27 > ftp://ftp.loc.gov/pub/z3950/official/part1.txt Z39.50
28 is a network protocol to search and retrieve records.
29 It supports various query "languages", the most commonly used of which
30 is the Type-1 query. Type-1 queries are similar to those supported
31 by Malete and CDS/ISIS, but much more general and complex.
32 Terms can be searched for in any indexed field, or restricted
33 to one or more "attributes".
34
35 Attributes are basically the tags used in the index, which are almost always
36 different from those used in records. While it is common for records to use
37 any of the various MARCs or even completely different formats, the attributes
38 used in bibliographic systems are typically those specified by the Bib-1
39 attribute set (e.g. use attribute 4 denotes the title).
40
41
42 Z39.50 allows a client to select a record format from various conversions
43 supported by a server. When a MARC format is selected,
44 the data is actually transmitted serialized according to IIF.
45
46
47 * IIF and MARC serialized records
48
49 IIF specifies a serialization for records. Like the Malete record data file,
50 an IIF file is simply a stream of such records; there is no additional
51 file header.
52
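Since every record states its own total length in the first five leader
bytes, a reader can split such a stream without scanning for terminators.
A minimal Python sketch, assuming a binary stream of well-formed records;
the name iter_records is made up for this illustration and is not part of
Malete:

```python
import io

RT = b"\x1d"  # record terminator, byte value 29


def iter_records(stream):
    """Yield each raw record from a binary stream of concatenated IIF records."""
    while True:
        head = stream.read(5)  # first five leader bytes: total record length
        if not head:
            return  # clean end of stream, there is no file trailer
        if len(head) < 5 or not head.isdigit():
            raise ValueError("malformed record length: %r" % head)
        length = int(head)
        if length < 24:
            raise ValueError("record shorter than a leader: %d" % length)
        record = head + stream.read(length - 5)
        if len(record) != length or not record.endswith(RT):
            raise ValueError("truncated or unterminated record")
        yield record
```

Usage would be along the lines of
"for raw in iter_records(open(name, 'rb')): ...", with name being whatever
IIF file is at hand.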
53 A record has
54 - a 24 byte leader, containing 16 bytes of structural data
55 and 8 bytes of application data (x, imported as "MARC leader").
56 The format for MARC is LLLLLxxxxx22BBBBBxxx4500.
57 The Ls are the total record length (including leader and terminating RT),
58 the Bs the start of data (field values, after the FT terminating the dictionary).
59 The first '2' denotes that every field starts with two indicator bytes,
60 the second is the subfield identifier length including the delimiter char.
61 - a "dictionary" (the "directory" in ISO 2709 terms) with one entry per field,
62 containing a 3 byte tag, n bytes for the field length (counting its FT)
63 and m bytes for its offset from the start of data. n and m are the digits
64 at leader offsets 20 and 21; MARC uses 4 and 5. In general IIF, leader
65 byte 22 may specify a number of implementation defined entry bytes.
66 - the actual field values, each terminated by the FT character.
67
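The layout described above can be walked through in a few lines of Python.
This is only a sketch, assuming (as in MARC) that a dictionary length counts
the field's FT and that offsets are relative to the start of data;
parse_record is a hypothetical helper, not Malete's importer:

```python
FT = b"\x1e"  # field terminator, byte value 30
RT = b"\x1d"  # record terminator, byte value 29


def parse_record(raw):
    """Split one raw IIF record into (leader, [(tag, value), ...])."""
    leader = raw[:24]
    total = int(leader[0:5])    # LLLLL: total record length
    base = int(leader[12:17])   # BBBBB: start of data
    n = int(leader[20:21])      # width of a dictionary length (MARC: 4)
    m = int(leader[21:22])      # width of a dictionary offset (MARC: 5)
    # Leader byte 22: implementation defined entry bytes, 0 if not a digit.
    extra = int(leader[22:23]) if leader[22:23].isdigit() else 0
    if total != len(raw) or raw[-1:] != RT:
        raise ValueError("record length or terminator mismatch")
    if raw[base - 1:base] != FT:
        raise ValueError("dictionary is not terminated by FT")
    entry = 3 + n + m + extra
    directory = raw[24:base - 1]
    if len(directory) % entry:
        raise ValueError("dictionary size is not a multiple of the entry size")
    fields = []
    for i in range(0, len(directory), entry):
        e = directory[i:i + entry]
        tag = e[:3].decode("ascii")
        length = int(e[3:3 + n])
        offset = int(e[3 + n:3 + n + m])
        value = raw[base + offset:base + offset + length]
        if value[-1:] != FT:
            raise ValueError("field %s is not terminated by FT" % tag)
        fields.append((tag, value[:-1]))  # drop the FT
    return leader, fields
```

Values are returned as raw bytes, still carrying indicators and subfield
delimiters, since decoding them depends on the character set of the data.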
68 Contrary to folklore, MARC does NOT use a '$' as subfield delimiter,
69 nor a '#' for unused indicators. Rather, the examples in the specs
70 use a '$' to REPRESENT the subfield delimiter control character 31 (^_),
71 and a '#' to REPRESENT a blank. The RT (29, ^]) is sometimes represented as '\'
72 and the FT (30, ^^) as '^' or '@'.
73
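To look at raw records on a terminal, the control characters can be mapped
to the stand-ins mentioned above. A small Python sketch; the function name
and the choice of latin-1 for decoding are assumptions made for this
example. Blanks are left alone, because only blanks in indicator positions
are conventionally shown as '#':

```python
# Printable stand-ins for the three control characters.
DISPLAY = {
    0x1F: "$",   # subfield delimiter (31)
    0x1E: "^",   # field terminator FT (30)
    0x1D: "\\",  # record terminator RT (29)
}


def printable(raw, encoding="latin-1"):
    """Return raw record bytes as text with control characters made visible."""
    return raw.decode(encoding).translate(DISPLAY)
```

The result is for display only; it is not valid IIF and should not be
written back to a data file.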
74
75 * Malete IIF import and export
76
77 The malete tool provides two rather simplistic
78 > CmdLine commands
79 iifimp and iifexp.
80
81 The command specific options are:
82 - Ffile
83 specify the full filename of the IIF file.
84 Default is the basename of the Malete database with extension .iif.
85 On UNIX, a filename '-' selects stdin/out.
86 - Nomarc (literally)
87 do not assume the MARC structure 22/450 on import. Requires proper IIF data.
88 - P[iic]
89 on export, prepend indicators ii and, where needed, subfield c.
90 A single -P uses two blanks as indicators and subfield '0'.
91 Recommended, in order to produce at least syntactically correct MARC.
92 - Rid (literally)
93 on import, use a numeric control number (1st field, if it has tag 1)
94 as record id. Note that on export, the record id is always used as
95 control number unless the record already has one,
96 since a control number is required not only by MARC, but by IIF itself.
97
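What an export with indicators amounts to can be sketched in Python as
follows. This is not Malete's code. It assumes that "where needed" means
"where the value does not already start with a subfield delimiter", and that
tags below 010 are control fields without indicators, as is the convention
in MARC. build_record and its parameters are names invented for this
example:

```python
FT, RT, SF = b"\x1e", b"\x1d", b"\x1f"


def build_record(record_id, fields, indicators=b"  ", subfield=b"0"):
    """Serialize (tag, value) pairs as one MARC style IIF record."""
    fields = list(fields)
    # The control number must come first; fall back to the record id.
    if not fields or fields[0][0] != "001":
        fields.insert(0, ("001", str(record_id).encode("ascii")))
    directory, data = b"", b""
    for tag, value in fields:
        if tag >= "010":  # assumption: only data fields carry indicators
            if not value.startswith(SF):
                value = SF + subfield + value
            value = indicators + value
        value += FT
        if len(value) > 9999:
            raise ValueError("field %s too long for a 4 digit length" % tag)
        directory += b"%s%04d%05d" % (tag.encode("ascii"), len(value), len(data))
        data += value
    base = 24 + len(directory) + 1        # leader, dictionary and its FT
    total = base + len(data) + 1          # plus data and the RT
    if total > 99999:
        raise ValueError("record too long for a 5 digit length")
    leader = b"%05d%s22%05d%s4500" % (total, b" " * 5, base, b" " * 3)
    return leader + directory + FT + data + RT
```

The defaults mirror a single -P: two blanks as indicators and subfield '0'.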
98 * creating proper IIF from WinIsis
99
100 In Database-Export, set the subfield separator to \031 and
101 output line length to 0.
102
103 If the fields do not contain valid MARC data, use a reformatting FST like
104 $
105 001 0 MFN
106 044 0 |00^a|,v44
107 024 0 |00^a|,v24
108 026 0 |00|,v26
109 070 0 (|00^a|,v70/)
110 $
111 Make sure that
112 - the first output field is tag 1 containing some unique id
113 - every field starts with two indicator characters
114 (these really should be blanks, but blanks would be stripped during export)
115 - the indicators are followed by a delimiter and subfield identifier
116 Still, the output is not 100% correct, since WinIsis sets the
117 number of indicators and the identifier length to 0, where MARC specifies 2.
118 However, many MARC processors, including zebraidx, ignore these settings.
119
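For a processor that does check these two leader bytes, they can be patched
after the export. A minimal Python sketch, assuming the fields really do
carry two indicators and a delimiter plus identifier as arranged by the FST
above; fix_indicator_counts is a hypothetical helper:

```python
def fix_indicator_counts(raw):
    """Return the record with leader bytes 10 and 11 both set to '2'."""
    if len(raw) < 24:
        raise ValueError("record is shorter than a leader")
    # Offsets 0-4 hold the length, 5-9 application data, 10 the number of
    # indicators, 11 the subfield identifier length; the rest is unchanged.
    return raw[:10] + b"22" + raw[12:]
```

The record length does not change, so neither the leader's length fields
nor the dictionary need to be recomputed.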
120
121 * making MARC data available via Z39.50
122
123 MARC records can easily be made available using Index Data's
124 > http://www.indexdata.dk/zebra/ zebra.
125
126 If records in your IIF file use tags and subfields conforming to, say, USmarc,
127 simply check out the test/usmarc example in the zebra distribution.
128 Put your data in the records subdir and run "zebraidx update records; zebrasrv".
129
130 If your data was exported from WinIsis, you may want to put a line
131 "encoding Cp850" in the .abs file.
132
133
134 You must use recordType: grs.marc.something, meaning that it's general
135 structured data in some marc file format.
136 The sample usmarc.abs uses the "marc usmarc.mar" statement,
137 and usmarc.mar (in the zebra/tab directory) contains "reference USmarc",
138 stating that the marc input actually IS in USmarc.
139 This need not be true; it just means that the records will be served
140 as is if a client asks for USmarc.
141 However, only the tags listed in "elm" statements in the .abs files
142 will be indexed.
143
144
145 Note that zebra's indexing support is not as flexible as that of CDS/ISIS:
146 you can only select fields or subfields to be indexed in one of a couple
147 of modes (like word or phrase). To take full advantage of sophisticated
148 CDS/ISIS FSTs, include them in your export reformatting FST.
149 Use some otherwise unused field tags to hold the index terms and "elm"
150 statements to map them to bib-1 attributes.
151 Omit those fields from the display mapping.
152
153
154 To keep the data in its native format (say CDS), change the elm
155 statements to map the fields to be indexed to the corresponding bib-1
156 attributes for searching, e.g. "elm 024 Conference-name !",
157 and, instead of using the "marc usmarc.mar" statement,
158 create one or more maptabs to map the full record to
159 USmarc and/or other presentation formats as applicable.
160 Check out the gils-usmarc.map example in the zebra/tab directory.
161
162
163 Consult the
164 > http://www.indexdata.dk/zebra/doc/ zebra documentation
165 for details.
166
167
168 * links
169 - ISO2709 "Information Interchange Format", a.k.a. ANSI/NISO
170 > http://www.niso.org/standards/resources/Z39-2.pdf Z39.2
171 - Machine Readable Catalogues
172 > http://www.loc.gov/marc/specifications/specrecstruc.html (US) MARC 21
173 ,
174 > http://lcweb.loc.gov/marc/ overview
175 , references at the
176 > http://www.oasis-open.org/cover/marc.html Cover Pages
177 - Z39.50
178 > ftp://ftp.loc.gov/pub/z3950/official/ official spec
179 , overview at
180 > http://www.oclcpica.org/content/45/pdf/z3950_handbook_paper.pdf OCLC|Pica
181 , links at
182 > http://www.indexdata.dk/technologies/z3950/ indexdata
183 , makers of excellent free Z39.50 software.
184 - Uncle Aung's
185 > http://uncleaung.com/zisis/ Zisis
186
187 ---
188 $Id: IIF.txt,v 1.5 2004/09/23 11:44:04 kripke Exp $
