HTLD - the hypertext linker

DRAFT


* overview

HTLD does for hypertext files what ld.so does for dynamically linked binaries.
It resolves references to parts stored in independent files or variables
and inserts them at the specified locations in the output stream.

It is in no way confined to "hypertext", or even to text.
This is the typical usage, however, and there is special support for
> http://www.ietf.org/rfc/rfc2396.txt URL
and HTML encoding.


There is a standalone htld binary, which is typically the interpreter
of executable htld documents (just as ld.so and friends are the
interpreters that run dynamically linked binaries).

Engines for dynamic web content may also contain functions to resolve
htld files.


* variables

Since htld is typically run as an NPH CGI, it uses a "standard"
> http://CGI-Spec.Golux.Com/draft-coar-cgi-v11-03-clean.html CGI 1.1
environment to provide the variables to be linked in.

- 0 is the directory path (a/b/c) of the request (/a/b/c/d.html)
- 1,2,... are the directory path segments
- variables starting with an ASCII letter are fetched from
  (their first occurrence in) QUERY_STRING

The path should usually be determined from REQUEST_URI,
which unfortunately is not in the standard (but is set by fnord).

Other htld linkers may use different sources for variables.
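
The variable rules above can be sketched as follows. This is only an
illustration, not the htld implementation: the function name cgi_variables
and the dict-based variable table are assumptions, and only "&" is treated
as a pair separator in QUERY_STRING.

```python
from urllib.parse import urlsplit


def cgi_variables(env):
    """Build a variable table from a CGI-style environment mapping (hypothetical helper)."""
    path = urlsplit(env.get("REQUEST_URI", "")).path
    # Variable 0: the directory part of the request path, without the
    # leading slash and without the final component (d.html).
    directory = path.lstrip("/").rpartition("/")[0]
    variables = {"0": directory}
    # Variables 1, 2, ...: the individual directory path segments.
    for i, segment in enumerate(filter(None, directory.split("/")), start=1):
        variables[str(i)] = segment
    # Named variables: the first occurrence in QUERY_STRING wins. Values are
    # kept URL encoded; decoding is left to the instruction that uses them.
    for pair in env.get("QUERY_STRING", "").split("&"):
        name, _, value = pair.partition("=")
        if name[:1].isascii() and name[:1].isalpha():
            variables.setdefault(name, value)
    return variables
```

For the request /a/b/c/d.html?x=1 this yields 0 = a/b/c, 1 = a, 2 = b,
3 = c and x = 1.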


* the HTLD file format

An htld file consists of a textual header, a blank line and a body
(which typically contains html or similar).
Header lines and the closing blank line must be terminated by linefeed
characters (byte value 10); a carriage return, just like any other character,
is considered part of the line.
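
A minimal sketch of splitting a file into header and body along these rules.
The name split_htld is made up for this example, and raising an error on a
missing blank line is an assumption; the draft does not say what a linker
should do in that case.

```python
def split_htld(data: bytes):
    """Split raw htld bytes into (header_lines, body). Hypothetical helper."""
    # An empty header: the file starts with the closing blank line.
    if data.startswith(b"\n"):
        return [], data[1:]
    # The header ends at the first empty line, i.e. two consecutive LF bytes.
    # "\r\n\r\n" does not qualify: the CR counts as part of the line, so a
    # line holding only a CR is not blank.
    end = data.find(b"\n\n")
    if end < 0:
        raise ValueError("no blank line terminating the header")
    header_lines = data[:end].split(b"\n")
    body = data[end + 2:]
    return header_lines, body
```

The body is returned as is, so all offsets in the header count from its
first byte.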


The first line typically specifies the htld interpreter, like '#!/bin/htld'.
Other header lines contain offsets (relative to the body) in ascending order
plus instructions for content to insert at these offsets.

Unlike server side includes as featured by Apache and other webservers,
htld will in no way parse the body nor replace anything in the body,
but rather requires all locations for linking to be precomputed.
Some ht compiler may be used to turn SSI-style comments in an html file
into an htld linkable object.


Linking instructions are stored in a Malete record:
- 1 _offset_ _path_
  include the contents of file _path_ at _offset_.
  If the file appears to be an htld file, it is linked recursively.
  See below for details.
- 2 _offset_ _var_
  include the URL encoded value of _var_ at _offset_
  (as found in QUERY_STRING or the REQUEST_URI).
- 3 _offset_ _var_
  include the plain value of _var_ at _offset_
  (after applying URL decoding)
- 4 _offset_ _var_
  include the value of _var_ at _offset_
  (with URL decoding and minimal HTML encoding of lt, gt, amp and quot)
- 5 _var_ _default_
  declares a default value to be used for _var_ if it is not found in QUERY_STRING.
  _default_ is treated as URL encoded (and used unmodified for 2).
  Only effective for instructions following this one.
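
How the variable instructions (2 to 5) act on a body can be sketched as
below. Several things here are assumptions made for the example: the
instructions are given as already parsed Python tuples rather than as a
Malete record, the body is a str rather than raw bytes, an unknown variable
without default yields the empty string, and "+" is not decoded to a space.
The names link_vars and html_min are hypothetical. File includes (1) are
left out.

```python
from urllib.parse import unquote


def html_min(text):
    """Minimal HTML encoding of amp, lt, gt and quot only."""
    return (text.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace('"', "&quot;"))


def link_vars(body, instructions, variables):
    """Insert variable values into body at precomputed, ascending offsets."""
    defaults = {}
    out, pos = [], 0
    for instr in instructions:
        tag = instr[0]
        if tag == 5:                        # (5, var, default)
            defaults[instr[1]] = instr[2]   # affects later instructions only
            continue
        if tag not in (2, 3, 4):
            raise ValueError("instruction %r not handled in this sketch" % (tag,))
        _, offset, var = instr              # (tag, offset, var)
        out.append(body[pos:offset])        # copy the body up to the offset
        pos = offset
        raw = variables.get(var, defaults.get(var, ""))
        if tag == 2:
            out.append(raw)                 # URL encoded, unmodified
        elif tag == 3:
            out.append(unquote(raw))        # plain value
        else:
            out.append(html_min(unquote(raw)))
    out.append(body[pos:])                  # rest of the body
    return "".join(out)
```

The body itself is only sliced, never searched or parsed, which matches the
requirement that all link locations are precomputed.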


An include _path_ is handled by first replacing
any $varname in _path_ by varname's URL encoded value
(applying the same precautions for valid path elements as in fnord).
Then the question mark and anything following it are stripped from _path_,
and the stripped part is prepended to QUERY_STRING during the include
(effectively overriding vars;
the original path, however, stays untouched during recursive includes).
The remaining path is used to stat the file to include,
so it should usually be relative to the webroot.
Implementations may support _path_ being a local CGI (or UCGI socket).
Support for remote includes is not planned.
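
The path handling may be sketched like this. The name resolve_include is
hypothetical, the set of characters allowed in a $varname (ASCII letters,
digits, underscore) is an assumption, and the fnord-style precautions for
path elements are not implemented here.

```python
import re


def resolve_include(path, variables, query_string):
    """Return (file_path, query_string_for_the_include). Hypothetical helper."""
    # Replace each $varname by the variable's URL encoded value
    # (unknown variables become empty in this sketch).
    def substitute(match):
        return variables.get(match.group(1), "")

    path = re.sub(r"\$([A-Za-z0-9_]+)", substitute, path)
    # Split off everything after the first question mark ...
    file_path, _, extra = path.partition("?")
    # ... and put it in front of QUERY_STRING for the include.
    if extra:
        query_string = extra + "&" + query_string if query_string else extra
    return file_path, query_string
```

Prepending is enough to override a variable, because variables are taken
from their first occurrence in QUERY_STRING.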


Offsets may be omitted in the first instruction (after declarations)
and in the last instruction, for header and footer, respectively.
Header and footer includes are ignored on recursive includes.
The header file is typically used to include anything up to the
opening body tag, so the htld doc's body is a proper HTML fragment
suitable to be included elsewhere.
If _path_ is omitted, literal html and body tags are used.


* links

> http://www.ietf.org/rfc/rfc1945.txt HTTP/1.0
> http://www.ietf.org/rfc/rfc2616.txt HTTP/1.1
> http://www.ietf.org/rfc/rfc2617.txt HTTP Authentication