Annotation of /openisis/current/doc/xmlisis.txt

Revision 237
Mon Mar 8 17:43:12 2004 UTC by dpavlin
File MIME type: text/plain
File size: 3538 byte(s)
initial import of openisis 0.9.0 vendor drop

* some notes on the relation of XML and ISIS


XML is in widespread use as a lingua franca
for glueing software components together.
Several tools for this can be found at xml.apache.org.

What is missing here is an efficient, easy-to-use
way of storing XML data. Only the most trivial cases
are easily mapped onto the relational data model,
which uses flat records consisting of a fixed number
of fields. The data structures modelled in XML
typically have a variable number of children.
Hierarchical databases like ADABAS C are well suited
and actually used by Software AG in their Tamino XML DB,
but they aren't widely and freely available.


* ISIS to XML

ISIS records can be easily and canonically converted to XML.
Anything up to the first subfield delimiter is the body (a text node),
subfields are attributes
(strictly XML-ish this is ok only for non-repeated subfields).
Other special subdivisions of field content like the typical
<key word> may be split into real child nodes.

The result (as generated by make pdemo) may look like:
$
<isisrec id="148">
  <v69>
    <key>Educational Psychology</key>
    <key>universities</key>
    <key>Kenya</key>
  </v69>
  <v70>Okatcha, F.M.M.O.</v70>
  <v30 a="1 p."/>
  <v24>Personal statement</v24>
  <v26 c="1976"/>
  <v12 p="Tbilisi, USSR" d="1976">Symposium on the Psychological Bases of Programmed Learning </v12>
</isisrec>
$

Instead of tag numbers and subfield characters,
symbolic names from the FDT may be used.
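
The conversion described above can be sketched in Python as follows.
This is only an illustration under stated assumptions: a record is
given as a list of (tag, value) pairs, '^' followed by one character
introduces a subfield, and a field written as a run of <key word>
groups becomes <key> children. The names field_to_xml and
record_to_xml are hypothetical and not part of OpenIsis. Repeated
subfields and subfield codes that are not valid XML attribute names
(such as digits) are not handled.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Hypothetical helpers for illustration; not part of OpenIsis.
KEYWORD = re.compile(r"<([^<>]*)>")


def field_to_xml(tag, value):
    """Render one ISIS field as an XML element named v<tag>."""
    # Everything before the first '^' is the body, the rest are subfields.
    body, *subfields = value.split("^")
    # Each subfield is one code character followed by its content.
    attrs = "".join(
        " %s=%s" % (sf[0], quoteattr(sf[1:])) for sf in subfields if sf
    )
    keys = KEYWORD.findall(body)
    if keys:
        # Assumption: text outside the <...> groups is dropped in this sketch.
        inner = "".join("<key>%s</key>" % escape(k) for k in keys)
    else:
        inner = escape(body)
    name = "v%d" % tag
    if not inner:
        return "<%s%s/>" % (name, attrs)
    return "<%s%s>%s</%s>" % (name, attrs, inner, name)


def record_to_xml(mfn, fields):
    """Wrap all fields of one record in an <isisrec> element."""
    lines = ['<isisrec id="%d">' % mfn]
    lines += ["  " + field_to_xml(tag, value) for tag, value in fields]
    lines.append("</isisrec>")
    return "\n".join(lines)


print(record_to_xml(148, [(70, "Okatcha, F.M.M.O."), (30, "^a1 p."), (26, "^c1976")]))
```

Using symbolic names from an FDT would only change how the element
and attribute names are chosen; the structure stays the same.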


* XML to ISIS

XML data structures can be
easily and efficiently mapped to the data model of ISO2709.

The general conversion (based on a SAX parser) works as follows:
- When encountering an opening tag, look up its name in the FDT.
If there is no FDT provided, create one on the fly.
If the FDT does not contain the tag name,
create a new entry using tag number max(100,1+highest tag in FDT).
Create a field using the tag number found and field value '+'.
- When encountering an attribute, look up its name in the
> Meta metadata
Create a new subfield entry if needed using code 'a'
or 1+highest code used (for this tag).
Append a subfield using the code found.
- When encountering an empty tag (the tag ends with />),
change the starting '+' to '-'.
- When encountering a text node, add a field using tag number 0
with the node's body as value.
- When encountering a closing tag, look up its name as for opening tags,
add a field with an empty value.
- As an additional optimization, most text nodes can be eliminated
by using the initial value of a node to represent an immediately
following text node.
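
The steps above can be sketched with Python's xml.sax module. This is
a minimal illustration, not the OpenIsis implementation, and it rests
on several assumptions: the FDT is a plain dict from element name to
tag number, the result is a list of (tag, value) pairs, text is
trimmed and whitespace-only text nodes are dropped, and an element
with nothing between its opening and closing tag is treated as an
empty tag, because SAX does not report the difference. The names
IsisHandler and xml_to_isis are hypothetical. The text-node
optimization from the last step is left out.

```python
import xml.sax


class IsisHandler(xml.sax.ContentHandler):
    """Hypothetical sketch of the canonical mapping; not OpenIsis code."""

    def __init__(self, fdt=None):
        super().__init__()
        self.fdt = dict(fdt or {})  # element name -> tag number
        self.codes = {}             # tag number -> {attribute name: code}
        self.fields = []            # resulting (tag, value) pairs
        self.text = []              # buffered character data
        self.fresh = False          # True right after an opening tag

    def tag_for(self, name):
        if name not in self.fdt:
            # New entry: max(100, 1 + highest tag in FDT).
            self.fdt[name] = max([100] + [1 + t for t in self.fdt.values()])
        return self.fdt[name]

    def code_for(self, tag, attr):
        codes = self.codes.setdefault(tag, {})
        if attr not in codes:
            # New subfield: 'a', or 1 + highest code used for this tag.
            codes[attr] = chr(1 + max(map(ord, codes.values()))) if codes else "a"
        return codes[attr]

    def flush_text(self):
        body = "".join(self.text).strip()
        self.text = []
        if body:
            # Text nodes are stored as fields with tag number 0.
            self.fields.append((0, body))
            self.fresh = False

    def startElement(self, name, attrs):
        self.flush_text()
        tag = self.tag_for(name)
        value = "+" + "".join(
            "^" + self.code_for(tag, key) + val for key, val in attrs.items()
        )
        self.fields.append((tag, value))
        self.fresh = True

    def characters(self, content):
        # SAX may deliver one text node in several chunks, so buffer them.
        self.text.append(content)

    def endElement(self, name):
        self.flush_text()
        if self.fresh:
            # Nothing followed the opening tag: change the starting '+' to '-'.
            tag, value = self.fields[-1]
            self.fields[-1] = (tag, "-" + value[1:])
        else:
            self.fields.append((self.tag_for(name), ""))
        self.fresh = False


def xml_to_isis(xml_text, fdt=None):
    handler = IsisHandler(fdt)
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.fields


for tag, value in xml_to_isis('<doc lang="en"><title>Notes</title><hr/></doc>'):
    print(tag, value)
```

Namespace processing is left at the xml.sax default (off), so a name
such as DC:Creator is looked up in the FDT as one plain string.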

For example look at RDF (
> http://www.w3.org/RDF
,
> http://archive.dstc.edu.au/RDU/reports/RDF-Idiot
).
A structure like
$
<DC:Creator parseType="Resource">
  <vCard:FN> Dr Jacky J Crystal </vCard:FN>
  <vCard:TITLE> Director </vCard:TITLE>
  <vCard:EMAIL> jacky@dstc.com.au </vCard:EMAIL>
  <vCard:ROLE> Researcher </vCard:ROLE>
</DC:Creator>
$
canonically maps to
$
100 +^aResource
101 +
0 Dr Jacky J Crystal
101
102 +
...
$
or, with text-node elimination, to
$
100 +^aResource
101 -Dr Jacky J Crystal
102 -Director
...
100
$
using about half the bytes it takes to store the original.
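
Text-node elimination can be sketched as a pass over the canonical
field list. The following Python is one possible reading of the rule,
not the OpenIsis code: a tag-0 field directly after an opening field
is folded into that field's initial value, and if the closing field
follows immediately, the '+' becomes '-' and the closing field is
dropped. The function name eliminate_text_nodes and the
(tag, value) list layout are assumptions. Text containing '^' is
left alone, since it would be misread as a subfield.

```python
def eliminate_text_nodes(fields):
    """Fold a text node (tag 0) into the opening field it directly follows."""
    out = []
    i = 0
    while i < len(fields):
        tag, value = fields[i]
        nxt = fields[i + 1] if i + 1 < len(fields) else None
        if (tag != 0 and value.startswith("+")
                and nxt is not None and nxt[0] == 0 and "^" not in nxt[1]):
            # Put the text between the marker and the first subfield.
            value = "+" + nxt[1] + value[1:]
            i += 1
            after = fields[i + 1] if i + 1 < len(fields) else None
            if after == (tag, ""):
                # The text was the only content: close the node in place.
                value = "-" + value[1:]
                i += 1
        out.append((tag, value))
        i += 1
    return out


canonical = [
    (100, "+^aResource"),
    (101, "+"), (0, "Dr Jacky J Crystal"), (101, ""),
    (102, "+"), (0, "Director"), (102, ""),
    (100, ""),
]
for tag, value in eliminate_text_nodes(canonical):
    print(tag, value)
```

A text node that follows a child element rather than an opening tag
is kept as a tag-0 field, which is why only "most" text nodes go away.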

If everything that can be an attribute
(not substructured, not repeatable) had been made an attribute
instead of a child, the record would read
(with explicitly assigned subfield codes)
much more efficiently like
$
100 ^pResource^fDr Jacky J Crystal^tDirector^ejacky@dstc.com.au^rResearcher
$
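
As a small illustration of explicitly assigned subfield codes, the
Python sketch below packs a set of named values into one field value
and unpacks it again. The code table CODES and the functions pack and
unpack are hypothetical; the letters simply follow the field shown
above. Values containing '^' are not handled.

```python
# Hypothetical code table; the letters follow the example field above.
CODES = {"p": "parseType", "f": "FN", "t": "TITLE", "e": "EMAIL", "r": "ROLE"}


def pack(values, codes=CODES):
    """Join named values into one field value using their subfield codes."""
    by_name = {name: code for code, name in codes.items()}
    return "".join("^" + by_name[name] + text for name, text in values.items())


def unpack(field, codes=CODES):
    """Split a field value back into a dict keyed by symbolic name."""
    # The part before the first '^' is the field body and is skipped here.
    return {codes[sf[0]]: sf[1:] for sf in field.split("^")[1:]}


creator = {
    "parseType": "Resource",
    "FN": "Dr Jacky J Crystal",
    "TITLE": "Director",
    "EMAIL": "jacky@dstc.com.au",
    "ROLE": "Researcher",
}
print(100, pack(creator))
```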

Also see
> unirec
and
> Struct

---
$Id: xmlisis.txt,v 1.7 2003/06/23 14:43:42 kripke Exp $
