* what is isis

Isis is a simple yet powerful database system that has had a large
installed base since the 1980s. Because it is well suited to
bibliographic data, it is commonly used in libraries, and because it
costs very little, especially in those running on a low budget.

* introduction to the isis db

An isis DB is a list of rows of unspecified structure, each identified
by a unique number, the rowid (a.k.a. mfn). Each row is a list of
fields, and each field has a number (tag) and a string value. Within a
row there may be zero, one or more fields with a given tag. While a
field's value is usually a textual representation of data in some
character encoding (commonly one of the IBM/DOS code pages), it may
actually contain arbitrary bytes. This is closely modelled on the
ISO 2709 "Information Interchange Format" (IIF, a.k.a. ANSI/NISO
> http://www.niso.org/standards/resources/Z39-2.pdf Z39.2
).

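As a rough illustration of this data model (not of the actual file
format), a row can be pictured in Python as an ordered list of
(tag, value) pairs stored under its mfn. The names `Row`, `db` and
`fields_with_tag`, the tag numbers 24 and 70, and the sample values
are all made up for this sketch.

```python
from typing import Dict, List, Tuple

# A field is a numeric tag plus a value; values are bytes because a
# field may hold arbitrary data, not only text in some encoding.
Field = Tuple[int, bytes]
# A row is an ordered list of fields; a tag may occur any number of times.
Row = List[Field]

# Hypothetical database: rows keyed by their unique rowid (mfn).
db: Dict[int, Row] = {
    1: [(24, b"A sample title"), (70, b"First, Author"), (70, b"Second, Author")],
    2: [(24, b"Another title")],
}

def fields_with_tag(row: Row, tag: int) -> List[bytes]:
    """Return the values of all fields in `row` carrying `tag`, in order."""
    return [value for t, value in row if t == tag]

print(fields_with_tag(db[1], 70))  # two repetitions of tag 70
print(fields_with_tag(db[2], 70))  # no field with tag 70: empty list
```

Keeping the fields in a list rather than a dictionary preserves both
their order and the repetitions of a tag.
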
* subfields

There is a convention to encode multiple subfields in one field by
separating them with a '^' followed by one character tagging the
subfield. So the field value '^afoo^bbar^bbaz' represents a field
having one 'a' subfield with value 'foo' and two 'b' subfields, 'bar'
and 'baz'. Another separator character may be used; e.g. ASCII
character 31 ("Unit Separator") is used in the
> http://www.loc.gov/marc/specifications/specrecstruc.html MARC standard.

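The convention can be sketched as a small Python function. The name
`split_subfields` is made up for this example, and returning text
found before the first separator under an empty subfield tag is an
assumption of the sketch, not something the convention prescribes.

```python
from typing import List, Tuple

def split_subfields(value: str, sep: str = "^") -> List[Tuple[str, str]]:
    """Split a field value into (subfield tag, subfield value) pairs.

    Text before the first separator has no subfield tag and is returned
    under the empty tag "" (an assumption of this sketch).
    """
    head, *parts = value.split(sep)
    pairs = [("", head)] if head else []
    for part in parts:
        if part:  # skip an empty chunk, e.g. from a trailing separator
            pairs.append((part[0], part[1:]))
    return pairs

print(split_subfields("^afoo^bbar^bbaz"))
# [('a', 'foo'), ('b', 'bar'), ('b', 'baz')]

# The same data with ASCII 31 ("Unit Separator") as the separator.
print(split_subfields("\x1fafoo\x1fbbar", sep="\x1f"))
# [('a', 'foo'), ('b', 'bar')]
```
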
* formatting

Roughly speaking, there is a formatting language with literal text,
field and subfield variables, if-else branches (on field existence)
and for loops (over field repetitions).

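The following Python sketch only mimics these kinds of construct; it
does not reproduce the syntax of the isis formatting language. The
function name `format_row` and the tags 24 (title) and 70 (author) are
made up for the example.

```python
from typing import List, Tuple

Row = List[Tuple[int, str]]

def format_row(row: Row) -> str:
    """Render a row with literal text, field variables, a branch and a loop."""
    titles = [v for t, v in row if t == 24]    # hypothetical tag 24: title
    authors = [v for t, v in row if t == 70]   # hypothetical tag 70: author
    out = []
    if titles:                                 # branch on field existence
        out.append("Title: " + titles[0])      # literal text + field variable
    else:
        out.append("(untitled)")
    for author in authors:                     # loop over field repetitions
        out.append("Author: " + author)
    return "\n".join(out)

print(format_row([(24, "A sample title"), (70, "First"), (70, "Second")]))
```
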
* indexing

An index is built by converting a row into a list of words (optionally
applying formats) and storing every word, qualified by the position
of its occurrence in the row, in a B+-tree (which is actually spread
over six files). Searching for a word or word prefix is possible with
or without qualifying the position (field). Since all fields can be
combined into one index, it is usually not necessary (but possible) to
set up multiple indexes.

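To make the idea concrete, here is a minimal in-memory sketch in which
a sorted Python list stands in for the B+-tree. It is not the isis
implementation: the names `build_index` and `search`, the layout of an
entry, splitting on `\w+`, and upper-casing the words are all
assumptions made for illustration.

```python
import bisect
import re
from typing import Dict, List, Optional, Set, Tuple

Row = List[Tuple[int, str]]
# One index entry: (word, mfn, tag, word position within the field).
Entry = Tuple[str, int, int, int]

def build_index(db: Dict[int, Row]) -> List[Entry]:
    """Turn every row into words and return the sorted list of entries."""
    entries: List[Entry] = []
    for mfn, row in db.items():
        for tag, value in row:
            for pos, word in enumerate(re.findall(r"\w+", value.upper()), 1):
                entries.append((word, mfn, tag, pos))
    entries.sort()
    return entries

def search(index: List[Entry], term: str, tag: Optional[int] = None,
           prefix: bool = False) -> Set[int]:
    """Return the mfns whose entries match `term`, optionally as a prefix
    and optionally restricted to fields with the given tag."""
    term = term.upper()
    hits: Set[int] = set()
    # Every entry for `term` (or starting with it) sorts at or after (term,).
    start = bisect.bisect_left(index, (term,))
    for word, mfn, entry_tag, _pos in index[start:]:
        matches = word.startswith(term) if prefix else word == term
        if not matches:
            break  # sorted order: no later entry can match
        if tag is None or entry_tag == tag:
            hits.add(mfn)
    return hits

# Hypothetical sample data: tag 24 is a title, tag 70 an author.
db = {
    1: [(24, "Database systems"), (70, "Smith, Data")],
    2: [(24, "Library data")],
}
index = build_index(db)
print(search(index, "data"))               # word in any field
print(search(index, "data", tag=24))       # word qualified by field
print(search(index, "lib", prefix=True))   # word prefix
```

Because all entries live in one ordered structure, a single index
serves both unqualified and field-qualified lookups; a real B+-tree
offers the same ordered access on disk.
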
* queries

A query language allows word lookups to be combined using the and, or
and not (without) operators. This is very similar to the "Type-1"
query of
> ftp://ftp.loc.gov/pub/z3950/official/part1.txt Z39.50.

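In terms of results, the three operators correspond to set operations
on the rows found by each word lookup. The sketch below shows only
that correspondence, not the syntax of the query language; the
`postings` table and the `lookup` helper are made up for the example.

```python
from typing import Dict, Set

# Hypothetical lookup results: word -> set of mfns containing it.
postings: Dict[str, Set[int]] = {
    "LIBRARY": {1, 2, 5},
    "DATABASE": {2, 3, 5},
    "XML": {5, 7},
}

def lookup(word: str) -> Set[int]:
    """Return the mfns for a word, or an empty set if it is not indexed."""
    return postings.get(word.upper(), set())

both = lookup("library") & lookup("database")   # and: rows with both words
either = lookup("library") | lookup("xml")      # or: rows with either word
without = lookup("library") - lookup("xml")     # not (without): first only

print(sorted(both))     # [2, 5]
print(sorted(either))   # [1, 2, 5, 7]
print(sorted(without))  # [1, 2]
```
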
* usage

While isis lacks most features of an RDBMS, such as complex relations
between different entities, its flexibility comes in handy for many
catalogues and directories with highly varying records and a single
level of substructure, which today are usually modelled as XML
documents rather than table rows. In other words, isis is an ideal
storage for many XML applications. The flexible indexing mechanism
combines the best of full-text searching and structured retrieval.
