About cases and trunks: La Maleta and the Malete Object Model.

* DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT

This document describes two data structures:
- la maleta (suitcase, or Malete Array, MA)
is a flexible and lightweight two-dimensional array,
which can be represented (stored, exchanged) as,
and provides an interface to, a Malete Record
- el maletero (car boot, trunk, or Malete Object, MO)
is an extended maleta, supporting a DOM-style tree of contained "objects".
The term "object" here, as in the somewhat mislabeled DOM,
relates to structure, not to behaviour (methods).
|
|
* la maleta - the Malete Array

While the actual implementation of a maleta (e.g. by means of an actual
two dimensional array) is not part of this specification,
the concepts are probably easiest to understand by thinking in terms
of a Malete Record as described in
> RecStruct
.
|
In the first dimension, there is a list of fields (pairs of tags and values).
Every field's value is typically structured into subfields.
The value of the first field (index 0), the header, is considered special:
it typically contains the record's id and/or control information.
The other fields (the body) constitute the record's contents.

A maleta is considered to be a field (its field 0) augmented by a body.
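
As a rough illustration of the header/body split described above, a record can be pictured as a flat list of (tag, value) pairs. This is only a sketch: the names `record`, `header`, `body` and `values_of`, the sample tags, and the '^' subfield delimiter are assumptions made for this example, not part of the specification.

```python
from typing import List, Tuple

Field = Tuple[int, str]  # (tag, value)

# Hypothetical record: index 0 is the header, the rest is the body.
# The header's tag (0 here) and all values are made up for illustration.
record: List[Field] = [
    (0, "42^tcontrol"),
    (24, "10^aTitle^bSubtitle"),
    (70, "^aSmith, J."),
    (24, "00^aOther title"),
]


def header(rec: List[Field]) -> Field:
    """Return field 0, which carries the id and/or control information."""
    return rec[0]


def body(rec: List[Field]) -> List[Field]:
    """Return all fields after the header."""
    return rec[1:]


def values_of(rec: List[Field], tag: int) -> List[str]:
    """Return the values of all body fields carrying the given tag."""
    return [value for t, value in body(rec) if t == tag]
```

Tags may repeat, which is why `values_of` returns a list rather than a single value.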
|
|
Like any array-like data structure, the maleta uses an index expression
to address its parts for either reading them or assigning to them.
Only values can be assigned to; tags can only be inserted or deleted.

Like the PHP array, it has a built-in cursor (in the first dimension)
for the concept of a current field.

Like the Perl array, it supports slices (addressing multiple parts at once)
in both dimensions.
|
|
Here we describe the textual representation of an index,
which implementations will typically parse into an internal representation.

The parts of an index are all optional, but must appear in this order:
- a field spec selecting one or more fields by tag or position
- a subfield spec selecting one or more subfields by id or position
- a range spec selecting an offset and length
- a key spec selecting a key to match
Every part has an operator and a value (id).
An index may address multiple fields or multiple subfields.
Selecting multiple fields and multiple subfields at once
depends on the implementation supporting nested lists.
Implementations may ignore whitespace in index expressions.
|
|
The field spec uses a numerical value (id) as tag or position.
Addressing a single field sets the cursor to that field:
- '-' first:
reset to head and move to next (having tag=id, if given).
- '+' next:
move to the next element (having tag=id, if given) without resetting.
- (none) current:
no change if no id is given or if the cursor is on a field with tag=id,
else same as first.
- '@' index:
selects the ith element, using id as index (0 is head).

Addressing multiple fields:
- '--' loop:
loop over the elements having tag=id, returning a list of the individual results.
Without an id, the list contains alternating tags and values.
- '++' end:
loop at end; useful to append fields.
- '@@' values:
loops, returning a list of values.
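
The cursor rules for the field spec can be sketched roughly as follows. The class `Maleta` and its method names are hypothetical, and so is the choice to return None and leave the cursor in place when a move finds nothing; the text above does not specify that case.

```python
class Maleta:
    """Toy model of the field-spec cursor; not the actual implementation."""

    def __init__(self, fields):
        self.fields = list(fields)  # list of (tag, value); index 0 is the head
        self.pos = 0                # cursor position

    def next(self, tag=None):
        """'+': move to the next field (with this tag, if given)."""
        for i in range(self.pos + 1, len(self.fields)):
            if tag is None or self.fields[i][0] == tag:
                self.pos = i
                return self.fields[i]
        return None  # assumption: nothing found, cursor stays where it was

    def first(self, tag=None):
        """'-': reset to head, then move to next."""
        self.pos = 0
        return self.next(tag)

    def current(self, tag=None):
        """(none): keep the cursor if it already fits, else behave like first."""
        if tag is None or self.fields[self.pos][0] == tag:
            return self.fields[self.pos]
        return self.first(tag)

    def at(self, i):
        """'@': select the ith field, 0 being the head."""
        if 0 <= i < len(self.fields):
            self.pos = i
            return self.fields[i]
        return None

    def loop(self, tag=None):
        """'--': values for a tag, or alternating tags and values without one."""
        if tag is not None:
            return [v for t, v in self.fields[1:] if t == tag]
        out = []
        for t, v in self.fields[1:]:
            out.extend([t, v])
        return out
```

In this sketch, `first(24)` followed by repeated `next(24)` walks through all 24 fields one at a time, while `loop(24)` collects them without touching the cursor.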
|
The subfield spec defaults to none, selecting the entire field value.
- '^' subfield:
selects the value (with id cut) of the subfield with this id.
Id may also be the pseudo subfield '&', selecting the tag,
or '@', selecting the cursor position.
- '?' test:
returns a boolean 0/1 indicating whether the field and/or subfield (with id) exists.
- '!' break:
returns the field or (with id) subfield value if it exists, else breaks processing.
- '#' position:
with a number, selects the ith subfield value, including any id.

Addressing multiple subfields:
- '^^' subfields:
returns a list of subfield values (with id cut) for the given id or all.
Without an id, the list contains alternating ids and values.
- '##' position:
returns a list of unmodified values (i.e. a split on the subfield delimiter).
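
The difference between the id-based and the position-based subfield operators can be shown with a small sketch. The function names are hypothetical. The sketch assumes '^' as the delimiter character, a single-character subfield id right after each delimiter, and that position 0 is the text before the first delimiter; none of these details is fixed by the text above.

```python
DELIM = "^"  # assumed delimiter, chosen for readability


def split_raw(value, delim=DELIM):
    """'##': unmodified parts, a plain split on the delimiter."""
    return value.split(delim)


def position(value, i, delim=DELIM):
    """'#': the ith part including any id; empty string if out of range."""
    parts = value.split(delim)
    return parts[i] if 0 <= i < len(parts) else ""


def subfield_list(value, sf_id=None, delim=DELIM):
    """'^^': values with the id cut; without an id, alternating ids and values."""
    # Skip the leading part (it has no id) and any empty parts.
    parts = [p for p in value.split(delim)[1:] if p]
    if sf_id is not None:
        return [p[1:] for p in parts if p[0] == sf_id]
    out = []
    for p in parts:
        out.extend([p[0], p[1:]])
    return out


def subfield(value, sf_id, delim=DELIM):
    """'^': value (id cut) of the first subfield with this id, else empty."""
    found = subfield_list(value, sf_id, delim)
    return found[0] if found else ""
```

For a value such as "10^aTitle^bSub", `position(value, 1)` keeps the id ("aTitle") while `subfield(value, "a")` cuts it ("Title").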
|
A range spec can have one or both, in that order, of the following:
- '*' offset:
cuts off the first offset bytes (not characters)
- '.' length:
cuts to the first length bytes
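
Since offset and length count bytes rather than characters, a range behaves like a slice over the encoded value. The following is a minimal sketch; the function name `apply_range` and the choice of UTF-8 as the encoding are assumptions made for the example.

```python
def apply_range(value, offset=0, length=None, encoding="utf-8"):
    """Apply '*' offset and '.' length to the bytes of a value."""
    data = value.encode(encoding)
    data = data[offset:]          # '*': drop the first offset bytes
    if length is not None:
        data = data[:length]      # '.': keep at most length bytes
    return data
```

Note that a byte range can cut a multi-byte character in half, so the result is returned as bytes here rather than decoded back to text.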
|
A key spec is part of setting the cursor, doing a next while the selected
data does not match the specified key. When used with a test or break,
it applies to the data (not the boolean result), and, with an empty field spec,
returns false or breaks instead of moving to next.
- '==' exact:
checks for exact match
- '=%' prefix:
checks for prefix match
- '=:' contains:
checks for substring
- '=~' expr:
evaluates the key as a regular expression (optional extension)
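
The four key operators and the "next while not matching" behaviour might be sketched like this. The names `key_matches` and `find_next` are hypothetical, as is the `select` callback standing in for whatever subfield or range spec picked the data to compare. Python's `re` module is used for '=~', although the text does not say which regular expression dialect an implementation would use.

```python
import re


def key_matches(op, data, key):
    """Compare the selected data against the key using one of the operators."""
    if op == "==":
        return data == key
    if op == "=%":
        return data.startswith(key)
    if op == "=:":
        return key in data
    if op == "=~":
        return re.search(key, data) is not None
    raise ValueError("unknown key operator: " + op)


def find_next(fields, pos, tag, op, key, select=lambda value: value):
    """Return the index of the next field after pos that matches, else None."""
    for i in range(pos + 1, len(fields)):
        t, value = fields[i]
        if (tag is None or t == tag) and key_matches(op, select(value), key):
            return i
    return None
```

Starting with `pos` at the head (0) corresponds to a '-' (first) field spec; starting at the current cursor position corresponds to '+' (next).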
|
Index expressions are independent of any metadata.
In particular, they know nothing about fixed subfields,
but only check for the delimiter character.
Fixed subfields may be accessed using ranges.

However, a helper procedure can be set to rewrite expressions that are
not valid as written, e.g.
turning field and subfield names into tags, subfield identifiers and ranges.
|
|
Minimal implementation requirements:
- tags may be limited to the range 0 to 65534, inclusive
- position ('#'), offset ('*') and length ('.') may be limited
to the range 0 to 255, inclusive
|
* array operators

Basic operations on maletas are:
- getting a single index:
returns the value or list (empty value or list if not found)
- getting multiple indexes:
A failing test or break stops processing.
A positive test does not produce an output value.
Returns a list of the values returned by each index
(unless there is only one non-test index).
|
In Tcl, get is the default operation. Examples, assuming a maleta called v:
$
v 24 ;# select the current (or first) field 24
v 24^a ^b ;# list of the a and b subfields of the current field 24
foreach {i v} [v ^^] { puts "$i=$v" } ;# list all subfields of the current field
v --24 ;# list of all 24 values
v td^width ;# helper should rewrite this to 100^w
v -24?a:foo .2 ;# the MARC indicators of the first 24 field where ^a contains foo
$v->get("-24?a:foo", ".2"); # more verbose in PHP, Perl
v.get("-24?a:foo .2"); // no varargs in Java, split at blanks
$
|
Assignment (set), like retrieval, takes any number of string parameters.
Implementations should also support passing multiple values in one
parameter as a list, maleta or serialized record.
Depending on the environment, this may require a different or overloaded
method.

An index addressing a single value (i.e. not a test) takes the next parameter
as its new value. If the addressed item does not exist, it is created.
Assigning no value (there is no next parameter) deletes the item.

If multiple items are addressed, all following parameters (or the elements
of a single list parameter) are applied in turn.
Excess parameters create new items; missing parameters delete items.
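
The rule for assigning to multiple items (apply values in turn, create for excess values, delete where values run out) can be sketched as below. The function name `set_all` is hypothetical, and appending newly created fields at the end of the record is an assumption; the text above does not say where new items are placed.

```python
def set_all(fields, tag, values):
    """Assign values in turn to all body fields with the given tag."""
    remaining = list(values)
    out = [fields[0]]                      # the header is left untouched
    for t, v in fields[1:]:
        if t != tag:
            out.append((t, v))             # other fields are kept as they are
        elif remaining:
            out.append((t, remaining.pop(0)))
        # else: no value left for this field, so it is deleted
    # Excess values create new fields (assumption: appended at the end).
    out.extend((tag, v) for v in remaining)
    return out
```

With two existing 24 fields and three values, the result has exactly three 24 fields; with an empty value list, all 24 fields are removed.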
|
|
Tcl uses an '=' parameter as the assignment operator and '=@' to assign from a list.
Examples:
$
v ^a = $a ^b = $b ;# set current subfields a and b to the variables
v --24 = foo bar baz ;# the record now has exactly three 24 fields
v --24 =@ {foo bar baz} ;# same
$v->set("--24", "foo", "bar", "baz");
$

Insertion is a variant of assignment addressing newly created items.
|
|
* el maletero - the Malete Object

A maletero (or MO) is a maleta where every field is itself a maletero,
i.e. can have a body. Its body fields are called children.

A maletero corresponds to a region (contiguous sequence of fields)
in a plain Malete record by means of counted or delimited structures.
|
|
Maleteros come in three flavours:
- list (plain vanilla):
The maletero behaves exactly like a maleta.
All children are treated as simple fields, regardless of their tag.
The MO maps one-to-one to its record.
This is the most efficient mode where no complex children are needed.
- struct (+ strawberry, chocolate):
Children with non-negative tags are treated as simple.
A child with a negative tag -n corresponds to a region spanning n fields.
This includes one field for the child's tag and header
and any fields its children correspond to in turn.
When looping or setting the cursor,
counted subrecords are recognized and their body is skipped over.
- mom (with fruit and liquor):
In this DOM-style mode, only fields with tag 0 are simple (text nodes).
Every child with a positive tag corresponds to a delimited structure.
An implementation may or may not support counted structures in mom mode.
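
For the struct flavour, skipping over counted subrecords amounts to advancing by n fields whenever a tag -n is met. The sketch below covers only that counting rule; the function name `top_level_children` is hypothetical, and how a counted child's own tag is stored is left out because the text above does not spell it out.

```python
def top_level_children(fields):
    """Return (index, span) for each direct child in a flat list of fields.

    A field with a negative tag -n starts a region of n fields (itself
    included); any other field is simple and spans one field.
    """
    out = []
    i = 0
    while i < len(fields):
        tag = fields[i][0]
        span = -tag if tag < 0 else 1
        out.append((i, span))
        i += span  # skip the body of a counted subrecord
    return out
```

Applying the same function to the fields inside a region (everything after its first field) yields that child's own children, so nested counted structures fall out of the same rule.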
|
A maletero provides object handles to its parent and children,
either by modifying the current handle or by creating new handles.
New handles can be based on a copy of the corresponding record
or use a region in the same record. The latter may not be supported
by all implementations, or may make the objects immutable
to avoid conflicting concurrent modifications.
|
|
* implementations

A basic implementation may provide only a maleta,
which is sufficient for traditional CDS/ISIS style record access.

A complete implementation may provide only a maletero, which can be used
as a maleta (just as the English word trunk means both car boot and suitcase).

A particularly efficient implementation may provide both separately.
|
|
The initial implementation is a Tcl extension (written in C),
optionally augmented by a Tcl module (written in Tcl).
The abstract model, however, can be similarly implemented in other languages.
|
---
$Id: MOM.txt,v 1.3 2004/05/03 13:04:36 kripke Exp $