Contents of /openisis/current/doc/formatting.txt

Revision 237 - (show annotations)
Mon Mar 8 17:43:12 2004 UTC by dpavlin
initial import of openisis 0.9.0 vendor drop

1 * supported features
2
3 As of version 0.8.6, openisis supports only a *very* limited
4 subset of the formatting language.
5 It should however be sufficient for the most important purposes:
6 building indexes and basic preprocessing.
7
8 - all kinds of literals (',",|)
9 - all modes (P,D,H)
10 - Vn[]^c[] field selector including repeated subfields
11 - implicit and explicit loops
12
13 The rest of this text describes a sort of merger of formatting
14 features from both WinISIS and CISIS/wwwisis,
15 which openisis may attempt to support one day.
16
17
18 * formatting basics
19
20 Formatting in openisis is separated into two tasks,
21 that used to be mixed up in traditional ISIS software.
22 - record processing
23 In openisis, execution of a ("print"-)format actually transforms
24 one or more records into one new ("output") record.
25 It loops and selects fields, applies Mxx modes, REFs other records,
26 but screen formatting directives are simply added as special fields.
27 In terms of relational databases, a format defines a view.
28 - screen rendering
29 It is then the task of separate rendering engines to turn those
30 printformat fields into well-indented plaintext or HTML or Postscript
31 or TeX or Windows GDI commands or you name it.
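
A rough sketch of this split, in Python rather than the C of openisis itself.
The output record is modelled as a list of (tag, value) pairs, and two
independent renderers consume the same record. The tag numbers NEWLINE and
SPACES and the function names are invented for illustration; this text does
not give the actual printformat numbering.

```python
import html

# Hypothetical printformat tags; the real openisis numbering is not given here.
NEWLINE = -1  # stands in for a line-break directive
SPACES = -2   # stands in for a spacing directive, value = number of blanks


def render_plain(record):
    """Render an output record, a list of (tag, value), as plain text."""
    parts = []
    for tag, value in record:
        if tag == NEWLINE:
            parts.append("\n")
        elif tag == SPACES:
            parts.append(" " * int(value))
        elif tag >= 0:
            parts.append(value)  # literal or data field
        # any other directive is silently ignored by this renderer
    return "".join(parts)


def render_html(record):
    """Render the same record as an HTML fragment."""
    parts = []
    for tag, value in record:
        if tag == NEWLINE:
            parts.append("<br>\n")
        elif tag == SPACES:
            parts.append("&nbsp;" * int(value))
        elif tag >= 0:
            parts.append(html.escape(value))
    return "".join(parts)


record = [(0, "Title: "), (24, "Fish & chips"), (NEWLINE, ""),
          (SPACES, "2"), (70, "Smith, J.")]
```

The record is built once by record processing; each renderer then decides on
its own what a directive means for its target.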
32
33
34 * elements of formatting
35
36 A formatting expression is a series of literals and functions.
37 Functions use zero, one or more data items as parameters.
38 Functions that expect one of the parameters to the left are called operators.
39
40 * input types
41
42 The expected type of a function parameter can be one of:
43 - s string (auto-concatenated auto-stringified any)
44 - n numeric expression (including string, boolean)
45 - o output (auto-stringified, but not concatenated)
46 - v variable or iterator
47 - r row iterator (list of rowids)
48
49 Immediately after the function name, the following can be used:
50 - c a single character
51 - a alphanumeric bareword (identifier in the C language)
52 - i integer literal
53 - x anything
54
55
56 * output types
57
58 The output of an expression is zero, one or more fields,
59 which can be string, numeric or value.
60 The type of the last field determines which operators are recognized.
61
62 Field tags are positive for values (i.e. the output of value iterators),
63 zero for literals and negative in the range -1 .. -999 for printformats.
64 Other negative tags are only temporary:
65 value iterators are finally evaluated to emit values,
66 conditional literals and numbers are later eliminated or
67 changed to zero, resulting in a literal.
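
The tag ranges above can be summed up in a small classifier. This is only a
sketch in Python; the function name and the returned labels are made up for
illustration.

```python
def classify_tag(tag):
    """Classify a field tag according to the ranges described above."""
    if tag > 0:
        return "value"        # output of a value iterator
    if tag == 0:
        return "literal"
    if -999 <= tag <= -1:
        return "printformat"
    return "temporary"        # other negative tags exist only during processing
```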
68
69
70 * context
71
72 During processing, we have the following context:
73 o output record (not changed)
74 r input record/s, db and loop; changed by REF, LR and loops
75 f format, mode and variables; changed by @, Mcc
76 x frame: function, signature, parameter type and position
77 new frames are opened for functions, operators and blocks
78
79
80 * show stoppers: ( ) , .. ]
81
82 During processing the format string is scanned left to right
83 and fields are appended to the output record as encountered.
84 Function "calls" can be explicit (parameter list is enclosed in parentheses),
85 or implicit, taking only one parameter (operators and literals like V24, X3).
86
87 An opening parenthesis following an operator or literal function makes the
88 frame explicit. A parameter type of i or a is changed to n or s, resp.
89 Otherwise a ( starts
90 - in numeric context: an anonymous (n) arithmetic parenthesis
91 - in string context: an S(s_) function
92 - in output context inside a loop: an indentation function
93 - otherwise starts an explicit loop
94
95 A closing parenthesis closes whatever was opened by the last opening one.
96 A comma ',' or range '..' closes a parameter,
97 which closes an operator or saturated implicit loop.
98 If there was no parameter to close, an empty string or 0 number is emitted,
99 thus [..] is equivalent to [0..0].
100
101
102 * functions and type coercion
103
104 Every function has a signature which denotes the types of the
105 expected parameters.
106 Functions and switches are put on the stack, the type expected
107 from the following expressions is set according to their signature.
108 Whenever a comma or the closing parenthesis is encountered,
109 the fields added for this parameter are converted to the expected type.
110 This is not much work for numeric or boolean types,
111 since expressions are evaluated while parsing.
112 Where a string is expected, we might have a record (i.e. multiple strings),
113 which is then collapsed as with the S function:
114 all printformats are discarded and other strings are concatenated.
115 When a function sees its closing ')', it replaces the parameters
116 pushed on the output record by the function value calculated from them.
117 The return value is tagged with the first data tag encountered,
118 even from records evaluating to empty.
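
The collapse to a string might look like the following Python sketch. Fields
are (tag, value) pairs. Two points are assumptions rather than statements
from this text: "data tag" is read as a positive tag, and a result without
any data tag is tagged 0 like a literal.

```python
def collapse(fields):
    """Collapse the fields of one parameter into a single (tag, string)."""
    result_tag = 0          # assumption: fall back to the literal tag
    pieces = []
    for tag, value in fields:
        if -999 <= tag <= -1:
            continue        # printformats are discarded
        if result_tag == 0 and tag > 0:
            result_tag = tag  # first data tag encountered wins
        pieces.append(str(value))
    return (result_tag, "".join(pieces))
```

For example, collapsing the literal "Author: ", a printformat, field 70
"Smith", the literal ", " and field 71 "J." yields one field tagged 70 with
the text "Author: Smith, J.".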
119
120 The signature is denoted by a string, where
121 - the 1st char gives the left hand operand ('_' for none)
122 - the 2nd char is a digit of required params or '_' for specials
123 - the following chars give types of each parameter in turn
124 - a trailing '_' denotes that the last type may be repeated
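
A signature string of this shape can be taken apart as sketched below
(Python). The class and function names are invented, and the sample
signatures in the usage note are hypothetical, since this text does not list
the actual signature strings used by openisis.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Signature:
    left: Optional[str]      # type of the left-hand operand, None if absent
    required: Optional[int]  # number of required params, None for specials
    params: Tuple[str, ...]  # type letter of each parameter in turn
    variadic: bool           # True if the last type may be repeated


def parse_signature(sig: str) -> Signature:
    """Split a signature string into its parts."""
    if len(sig) < 2:
        raise ValueError("signature needs at least two characters")
    left = None if sig[0] == "_" else sig[0]
    required = None if sig[1] == "_" else int(sig[1])
    rest = sig[2:]
    variadic = rest.endswith("_")
    if variadic:
        rest = rest[:-1]
        if not rest:
            raise ValueError("trailing '_' needs a type to repeat")
    return Signature(left, required, tuple(rest), variadic)
```

Under this reading, a function like S(s_) could be written "_1s_" (no left
operand, one required string, more strings allowed), and a string operator
like *n could be written "s1n".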
125
126
127 * record loops: "x" |x| + (s_) WHILE n (s_) CONTINUE BREAK OCC IOCC
128
129 A loop is either started explicitly by a '(' in record context,
130 or implicitly by a conditional literal or a V value iterator.
131 An explicit loop is closed by the matching ')',
132 while an implicit one ends when saturated (after first iterator),
133 with a comma ',' or any other V iterator.
134 The loop content is then repeatedly executed while incrementing
135 the OCC counter from 1 (and possibly subject to a preceding WHILE).
136 On each turn there is a "last" flag (initially true), which is cleared by any
137 iterator that expects to have more fields (this may even be true if
138 there was no OCCth field, e.g. if the OCCth field didn't have some subfield).
139 Consequently, if there are no iterators, the first turn is the last.
140 Iterators also set or clear a "had" state (initially unknown).
141 During each execution of the loop, we're then in string type context:
142 - a "" conditional emits its contents only on OCC=1
143 if before the first iterator, else on last=true
144 - a || conditional emits its contents on had!=false
145 - a + undoes an immediately preceding conditional on OCC==1
146 and sets had to false if last is true
147 - an iterator adds the OCCth of (the selected) occurrences (see below)
148 - if an iterator has no OCCth occurrence, an immediately preceding
149 || is undone and had is set to false
150 - other tokens are processed normally
151
152
153 * field iterator and operators: Vi Di Ni [n_] ^c
154 A field clause is always evaluated in a loop over OCC.
155 The iterator can be modified by the range and subfield operators.
156 Which occurrences are in question depends on the selected options:
157 field occ range, subfield selector and subfield range.
158 The integer list n_ may be given as x..y, where a missing x is 1 and
159 missing y or the keyword LAST means up to last occurrence.
160 If no range is specified, all occurrences are meant, as with [..].
161 If a subfield is specified, that subfield is selected.
162 If a subfield range is given, those occurrences of the subfield are used:
163 if no range is specified for the field, subfield occs are relative
164 to the record, else relative to each individual field occ,
165 so use V71[..]^a[1] to get the first occ of a in each occ of v71.
166 With 71=^afoo^ax, 71=^abar^ay and 71=^abaz^az,
167 V71^a[1..3] will give the first three total occurrences of subfield a,
168 i.e. foo, x and bar, whereas in CISIS it would give foo, bar and baz.
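
The two ways of counting subfield occurrences can be tried out with a short
Python sketch. All function names are invented; ranges are 1-based and
inclusive, with None standing for a missing upper bound. Field values are
plain strings using '^' plus a one-character code, as in the example above.

```python
def subfields(value, code):
    """Return all occurrences of subfield `code` in one field value."""
    # Text before the first '^' is not part of any subfield and is skipped.
    return [part[1:] for part in value.split("^")[1:] if part[:1] == code]


def take(items, first=1, last=None):
    """1-based inclusive range; last=None means up to the last item."""
    return items[first - 1:last]


def select(occs, code, field_range=None, sub_range=(1, None)):
    """Select subfield occurrences from the occurrences of one field."""
    if field_range is None:
        # No field range: subfield occs are counted across the whole record.
        pooled = [s for value in occs for s in subfields(value, code)]
        return take(pooled, *sub_range)
    # Field range given: the subfield range applies within each field occ.
    result = []
    for value in take(occs, *field_range):
        result.extend(take(subfields(value, code), *sub_range))
    return result


v71 = ["^afoo^ax", "^abar^ay", "^abaz^az"]
first_three = select(v71, "a", sub_range=(1, 3))                    # V71^a[1..3]
first_each = select(v71, "a", field_range=(1, None), sub_range=(1, 1))  # V71[..]^a[1]
```

Here first_three is foo, x, bar and first_each is foo, bar, baz, matching
the example.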
169
170 * other values and operators on values: Ei Si :=
171
172 * operators on string: *n .n (n) (n,n)
173 The * and . string operators modify the top (the last pushed field)
174 by removing the first n chars or all but the first n chars, resp.
175 A () indentation operator is a somewhat late indentation printformat,
176 which exchanges itself with the previous field.
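
The * and . operators amount to simple slicing of the top field, as in this
Python sketch. The record is a list of (tag, value) pairs; the function name
is invented, and n is assumed to be a non-negative integer.

```python
def apply_string_op(record, op, n):
    """Apply '*' or '.' with count n to the last pushed field, in place."""
    tag, value = record[-1]
    if op == "*":
        value = value[n:]   # remove the first n chars
    elif op == ".":
        value = value[:n]   # remove all but the first n chars
    else:
        raise ValueError("unknown string operator: %r" % op)
    record[-1] = (tag, value)


record = [(0, "Title: "), (24, "Methodology")]
apply_string_op(record, "*", 2)   # as in V24*2
apply_string_op(record, ".", 5)   # followed by .5
```

Only the top field changes: the literal "Title: " is untouched and field 24
ends up as "thodo".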
177
178 * arithmetic operators on numbers: * / + -
179 * relational operators on numbers: = <> < <= > >=
180 * relational operators on strings: = <> < <= > >= :
181 * boolean operators on numbers: AND NOT OR
182
183 * literals: 123 123.45 'x' "x" |x| /*x*/ !cxc
184 * switches: IF n THEN o ELSE o FI SELECT s CASE s: o ELSECASE o ENDSEL
185 * format: @a MPL MPU MHL MHU MDL MDU
186
187 * db: DB MSTNAME MFN[i] L(s) NPST(s) NPOST(s) LR(s) LR(s[,n,n]) REF(r,o)
188 * xternal db: L->a(s) LR->a(s) NPOST->a(s) REF->a(r,o)
189
190 * printformats
191 # / % { } !cxc () QC QJ B I UL Ci Xi Fi FSi CLi NEWLINE(s) LW(n) PICT(s)
192 M(n[,n]) TAB[i] BOX[i] NP[i] NC[i] BPICT(s[,n])
193 FONTS(a_) COLS(s_) LINK(s_)
194
195 * other functions
196 &a(s_) CAT(s) GETENV(s) PUTENV(s) SYSTEM(s) DATE[i] PROC(s)
197 DATETIME DATEONLY VAL(s) RMAX(n_) RMIN(n_) RSUM(n_) RAVR(n_)
198 LEFT(s,n) RIGHT(s,n) SS(n,n,s) MID(s,n,n) REPLACE(s,s,s) INSTR(s,s) SIZE(s)
199 F(n) F(n,n) F(n,n,n) TYPE(s) TYPE(s,s) S(s_) LAST
200 NOCC(s_) P(s_) A(s_)
201
202 * unsupported syntax
203 L([s]s) NPOST([s]s) REF([s]j,...) -- hopeless, use winisis notation
204
205
206
207 * list of tokens by syntax
208
209 - empty tokens
210 stopper: , .. ) ] THEN ELSE FI CASE ELSECASE ENDSEL CONTINUE BREAK
211 state: MPL MPU MHL MHU MDL MDU
212 values: DB MSTNAME OCC IOCC # / % { } QC QJ B I UL DATETIME DATEONLY LAST
213 - immediate literal
214 i MFN[i] DATE[i] TAB[i] BOX[i] NP[i] NC[i] Ci Xi Fi FSi CLi Vi Di Ni Ei Si
215 @a L->a(s) LR->a(s) NPOST->a(s) REF->a(r,o)
216 ^c 'x' "x" |x| /*x*/ !cxc
217 - syntax blocks (jump & run)
218 o ( o ) WHILE n ( o ) REF( r, o )
219 IF n THEN o ELSE o FI SELECT s CASE s: o ELSECASE o ENDSEL
220 - operators
221 [n..n] ^c (n[,n]) := . * / + - = <> < <= > >= : AND OR NOT
222 - all others are properly braced functions
223 - ambiguous tokens
224 S F might be Si, Fi or S(), F()
225 / + * might be arithmetic or string operators or a newline
226 = <> < <= > >= might compare strings or numbers
227 := assigns number to Ei, string else
228 ( opens one or the other frame ...
229
230
231 * tokenizing & processing
232
233 * read a token and literal
234 - get (longest matching) token
235 - if token accepts a literal, get the literal
236 - if token accepts an opening (, get it
237 - resolve S/F ambiguity syntactically depending on presence of i literal
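
The reading step could be sketched as follows in Python. The token table is
a small illustrative subset, and the names TOKENS, TAKES_INT and read_token
are invented. As a simplification, only integer literals are handled and an
opening ( is accepted after any token.

```python
import re

# Illustrative subset of tokens; TAKES_INT lists those followed by an integer.
TOKENS = {"S", "F", "FS", "V", "X", "C", "CL", "SIZE", "MPL", "MFN"}
TAKES_INT = {"S", "F", "FS", "V", "X", "C", "CL"}
INTEGER = re.compile(r"\d+")


def read_token(text, pos=0):
    """Read one token at pos; return (kind, literal, explicit, new_pos)."""
    matches = [t for t in TOKENS if text.startswith(t, pos)]
    if not matches:
        raise ValueError("no token at position %d" % pos)
    token = max(matches, key=len)          # longest matching token wins
    pos += len(token)
    literal = None
    if token in TAKES_INT:
        m = INTEGER.match(text, pos)
        if m:
            literal = int(m.group())
            pos = m.end()
    explicit = text.startswith("(", pos)   # an opening ( makes the frame explicit
    if explicit:
        pos += 1
    kind = token
    if token in ("S", "F"):                # Si/Fi versus S()/F()
        kind = token + ("i" if literal is not None else "()")
    return kind, literal, explicit, pos
```

With this table, "FS12" reads as FS with literal 12 rather than F followed
by S12, "S3" reads as Si, and "S(V24)" reads as the S() function.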
238
239 * process
240 - if token is possibly an operator of higher precedence,
241 check operator ambiguities,
242 coerce field according to operator's wishes
243 and go opening the operator frame
244 - else
245 coerce field according to frame context
246 - if we had a field in o-context or token is a stopper,
247 close the parameter
248 - if the frame is implicit and saturated or token is a frame closer,
249 close the frame and start over processing
250 - add the token or literal
