Contents of /openisis/current/doc/xmlisis.txt

Revision 237 - (show annotations)
Mon Mar 8 17:43:12 2004 UTC (16 years, 7 months ago) by dpavlin
File MIME type: text/plain
File size: 3538 byte(s)
initial import of openisis 0.9.0 vendor drop

* some notes on the relation of XML and ISIS


XML is in widespread use as a lingua franca
for gluing software components together.
Several tools for this can be found at xml.apache.org.

What is missing here is an efficient, easy-to-use
way of storing XML data. Only the most trivial cases
are easily mapped onto the relational data model,
which uses flat records consisting of a fixed number
of fields. The data structures modelled in XML
typically have a variable number of children.
Hierarchical databases like ADABAS C are well suited
and are actually used by Software AG in their Tamino XML DB,
but aren't widely and freely available.


* ISIS to XML

ISIS records can be easily and canonically converted to XML.
Anything up to the first subfield delimiter is the body (a text node),
subfields are attributes
(strictly speaking, XML allows this only for non-repeated subfields,
since an attribute name may occur only once per element).
Other special subdivisions of field content, like the typical
<key word>, may be split into real child nodes.

The result (as generated by make pdemo) may look like:
$
<isisrec id="148">
  <v69>
    <key>Educational Psychology</key>
    <key>universities</key>
    <key>Kenya</key>
  </v69>
  <v70>Okatcha, F.M.M.O.</v70>
  <v30 a="1 p."/>
  <v24>Personal statement</v24>
  <v26 c="1976"/>
  <v12 p="Tbilisi, USSR" d="1976">Symposium on the Psychological Bases of Programmed Learning </v12>
</isisrec>
$

Instead of tag numbers and subfield characters,
symbolic names from the FDT may be used.
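
The mapping described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code behind make pdemo: a record is assumed to be a list of (tag, value) pairs, '^' is assumed to be the subfield delimiter, and the names field_to_xml and record_to_xml are hypothetical. Repeated subfields and subfield codes that are not valid XML attribute names (digits, for instance) are not handled, and body text outside <...> keywords is dropped when keywords are present.

```python
import re
from xml.sax.saxutils import escape, quoteattr


def field_to_xml(tag, value):
    """Render one ISIS field as an element named v<tag> (hypothetical helper)."""
    # Everything before the first '^' is the body, the rest are subfields.
    body, *subfields = value.split("^")
    # A subfield is a one-character code followed by its content.
    # Assumption: codes are letters and do not repeat within a field.
    attrs = "".join(f" {sf[0]}={quoteattr(sf[1:])}" for sf in subfields if sf)
    # Keywords written as <...> in the body become <key> children.
    keys = re.findall(r"<([^<>]*)>", body)
    if keys:
        inner = "".join(f"<key>{escape(k)}</key>" for k in keys)
    else:
        inner = escape(body)
    name = f"v{tag}"
    if not inner:
        return f"<{name}{attrs}/>"
    return f"<{name}{attrs}>{inner}</{name}>"


def record_to_xml(mfn, fields):
    """Wrap all fields of a record in an <isisrec> element."""
    lines = [f'<isisrec id="{mfn}">']
    lines += [field_to_xml(tag, value) for tag, value in fields]
    lines.append("</isisrec>")
    return "\n".join(lines)
```

For example, field_to_xml(26, "^c1976") yields an empty element carrying only the c attribute, while a field without subfields becomes a plain text element.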


* XML to ISIS

XML data structures can be
easily and efficiently mapped to the data model of ISO 2709.

The general conversion (based on a SAX parser) works as follows:
- When encountering an opening tag, look up its name in the FDT.
  If there is no FDT provided, create one on the fly.
  If the FDT does not contain the tag name,
  create a new entry using tag number max(100,1+highest tag in FDT).
  Create a field using the tag number found and field value '+'.
- When encountering an attribute, look up its name in the
> Meta metadata
  Create a new subfield entry if needed using code 'a'
  or 1+highest code used (for this tag).
  Append a subfield using the code found.
- When encountering an empty tag (the current field ends with />),
  change the starting '+' to '-'.
- When encountering a text node, add a field using tag number 0
  with the node's body as value.
- When encountering a closing tag, look up its name as for opening tags,
  add a field with an empty value.
- As an additional optimization, most text nodes can be eliminated
  by using the initial value of a node to represent an immediately
  following text node.
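
The steps above (without the final optimization) can be sketched with Python's standard xml.sax module. This is a minimal illustration, not the OpenIsis implementation. The class IsisHandler and the function xml_to_fields are hypothetical names, the FDT is always created on the fly, surrounding whitespace of text nodes is stripped, and an element with nothing between its start and end is treated as an empty tag that gets no closing field (SAX cannot tell <a/> from <a></a>). Namespace processing is left off, so prefixed names are used as they appear.

```python
import xml.sax


class IsisHandler(xml.sax.ContentHandler):
    """Collects [tag, value] fields while a SAX parser walks the document."""

    def __init__(self):
        super().__init__()
        self.fdt = {}      # element name -> tag number, created on the fly
        self.codes = {}    # tag number -> {attribute name -> subfield code}
        self.fields = []   # output: [tag, value] pairs in document order
        self.text = []     # buffered character data (SAX may deliver chunks)
        self.just_opened = False  # True while nothing followed a start tag

    def tag_for(self, name):
        if name not in self.fdt:
            self.fdt[name] = max(100, 1 + max(self.fdt.values(), default=0))
        return self.fdt[name]

    def code_for(self, tag, attr):
        table = self.codes.setdefault(tag, {})
        if attr not in table:
            # 'a' for the first attribute, then b, c, ... (no overflow check)
            table[attr] = chr(ord("a") + len(table))
        return table[attr]

    def flush_text(self):
        text = "".join(self.text).strip()
        self.text = []
        if text:
            self.fields.append([0, text])  # text nodes use tag number 0
            self.just_opened = False

    def startElement(self, name, attrs):
        self.flush_text()
        tag = self.tag_for(name)
        subfields = "".join(
            f"^{self.code_for(tag, key)}{value}" for key, value in attrs.items()
        )
        self.fields.append([tag, "+" + subfields])
        self.just_opened = True

    def characters(self, content):
        self.text.append(content)

    def endElement(self, name):
        self.flush_text()
        if self.just_opened:
            # Empty element: flip the leading '+' to '-', no closing field.
            self.fields[-1][1] = "-" + self.fields[-1][1][1:]
        else:
            self.fields.append([self.tag_for(name), ""])
        self.just_opened = False


def xml_to_fields(xml_text):
    """Parse an XML string and return the list of (tag, value) fields."""
    handler = IsisHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return [tuple(field) for field in handler.fields]
```

Character data is buffered because a SAX parser may report a single text node in several pieces; the buffer is flushed whenever the next tag event arrives.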

For example, look at RDF (
> http://www.w3.org/RDF
,
> http://archive.dstc.edu.au/RDU/reports/RDF-Idiot
).
A structure like
$
<DC:Creator parseType="Resource">
  <vCard:FN> Dr Jacky J Crystal </vCard:FN>
  <vCard:TITLE> Director </vCard:TITLE>
  <vCard:EMAIL> jacky@dstc.com.au </vCard:EMAIL>
  <vCard:ROLE> Researcher </vCard:ROLE>
</DC:Creator>
$
canonically maps to
$
100 +^aResource
101 +
0 Dr Jacky J Crystal
101
102 +
...
$
or, with text-node elimination, to
$
100 +^aResource
101 -Dr Jacky J Crystal
102 -Director
...
100
$
using about half the bytes it takes to store the original.
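
Text-node elimination can be sketched as a pass over the canonical field list. The function name eliminate_text_nodes is hypothetical, and two points are interpretations of the example above rather than documented behaviour: the text is placed before any subfields of the opening field, and when the closing field follows directly, the '+' becomes '-' and the closing field is dropped. Text containing '^' is not handled.

```python
def eliminate_text_nodes(fields):
    """Fold a tag-0 text field into the opening field that precedes it.

    `fields` is a list of (tag, value) tuples in canonical form.
    """
    out = []
    i = 0
    while i < len(fields):
        tag, value = fields[i]
        nxt = fields[i + 1] if i + 1 < len(fields) else None
        if tag != 0 and value.startswith("+") and nxt is not None and nxt[0] == 0:
            # The text becomes the body, ahead of any subfields.
            value = "+" + nxt[1] + value[1:]
            i += 1  # the text field is consumed
            after = fields[i + 1] if i + 1 < len(fields) else None
            if after is not None and after[0] == tag and after[1] == "":
                # Nothing else inside the element: mark it '-' and
                # drop the closing field as well.
                value = "-" + value[1:]
                i += 1
        out.append((tag, value))
        i += 1
    return out
```

Applied to the canonical fields for the first two children of the example, the pass leaves the opening and closing fields of tag 100 in place and reduces each child to a single field such as 101 -Dr Jacky J Crystal.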

If everything that can be an attribute
(not substructured, not repeatable) had been made an attribute instead of a child,
it would read (with explicitly assigned subfield codes)
much more efficiently like
$
100 ^pResource^fDr Jacky J Crystal^tDirector^ejacky@dstc.com.au^rResearcher
$
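
As a small illustration of explicitly assigned subfield codes, the following sketch packs named values into a single field value. The function pack_field and the code table are assumptions made for this example; the codes p, f, t, e and r are the ones used in the field shown above.

```python
def pack_field(values, codes):
    """Join named values into one field value using explicit subfield codes.

    `values` maps attribute names to text, `codes` maps the same names to
    one-character subfield codes. Insertion order of `values` is kept.
    """
    return "".join(f"^{codes[name]}{text}" for name, text in values.items())


# Hypothetical code table for the DC:Creator example.
codes = {"parseType": "p", "FN": "f", "TITLE": "t", "EMAIL": "e", "ROLE": "r"}
creator = {
    "parseType": "Resource",
    "FN": "Dr Jacky J Crystal",
    "TITLE": "Director",
    "EMAIL": "jacky@dstc.com.au",
    "ROLE": "Researcher",
}
field_value = pack_field(creator, codes)
```

Here field_value holds the value of field 100 as shown above, one subfield per attribute and no opening or closing marker fields.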

Also see
> unirec
and
> Struct

---
$Id: xmlisis.txt,v 1.7 2003/06/23 14:43:42 kripke Exp $
