This is a repository of my old source code, which is no longer updated. Go to git.rot13.org for current projects!

Diff of /trunk/README

-cvs-head/README
revision 20 by cvs2svn,
Tue May  9 11:29:45 2000 UTC
+trunk/README
revision 107 by dpavlin,
Tue Jul 13 12:45:55 2004 UTC
 Line 102 
 DESCRIPTION
      addition a query and a display module must be choosen.
    Access
      The access module defines which documents are members of a database.
      Usually an access module is a tied hash, whose keys are the Ids of the
      documents (did = document id) and whose values are the documents
-     themselves. The indexing process loops over the keys using `FIRSTKEY'
+     themselves. The indexing process loops over the keys using "FIRSTKEY"
-     and `NEXTKEY'. Documents are retrieved with `FETCH'.
+     and "NEXTKEY". Documents are retrieved with "FETCH".
-     By convention access modules should be members of the `WAIT::Document'
+     By convention access modules should be members of the "WAIT::Document"
-     hierarchy. Have a look at the `WAIT::Document::Split' module to get the
+     hierarchy. Have a look at the "WAIT::Document::Split" module to get the
      idea.
    Parse
      The task of the parse module is to split the documents into logical
-     parts via the `split' method. E.g. the `WAIT::Parse::Nroff' splits
+     parts via the "split" method. E.g. the "WAIT::Parse::Nroff" splits
      manuals piped through nroff(1) into the sections *name*, *synopsis*,
      *options*, *description*, *author*, *example*, *bugs*, *text*, *see*,
-     and *environment*. Here is the implementation of `WAIT::Parse::Base'
+     and *environment*. Here is the implementation of "WAIT::Parse::Base"
      which handles documents with a pretty simple tagged format:
        AU: Pfeifer, U.; Fuhr, N.; Huynh, T.
-Line 150 
 DESCRIPTION
+Line 148 
 DESCRIPTION
      we need a second method (*tag*) which marks the regions of the document
      with tags for the different attributes. This tagged form is used by the
      display module to hilight search terms in the documents. Besides the
-     tags for the attributes, the method might assign the special tags `_b'
+     tags for the attributes, the method might assign the special tags "_b"
-     and `_i' for indicating bold and italic regions.
+     and "_i" for indicating bold and italic regions.
        sub tag {
          my @result;
-Line 172 
 DESCRIPTION
+Line 170 
 DESCRIPTION
          return @result;               # we don't go for speed
        }
-     Obviously one could implement `split' via `tag'. The reason for having
+     Obviously one could implement "split" via "tag". The reason for having
-     two functions is speed. We need to call `split' for each document when
+     two functions is speed. We need to call "split" for each document when
      indexing a collection. Therefore speed is essential. On the other hand,
-     `tag' is called in order to display a single document and may be a
+     "tag" is called in order to display a single document and may be a
      little slower. It may care about tagging bold and italic regions. See
-     `WAIT::Parse::Nroff' how this might decrease performance.
+     "WAIT::Parse::Nroff" how this might decrease performance.
    Filter definition
      From the Information Retrieval perspective, the hardest part of the
      system is the filter module. The database administrator defines for each
      attribute, how the contents should be processed before it is stored in
-Line 196 
 DESCRIPTION
+Line 193 
 DESCRIPTION
              [ 'isotr', 'isolc', 'split2', 'stop', 'Stem']
-     The function `isotr' replaces unknown characters by blanks. `isolc'
+     The function "isotr" replaces unknown characters by blanks. "isolc"
-     transforms to lower case. `split2' splits into words and removes words
+     transforms to lower case. "split2" splits into words and removes words
-     shorter than two characters. `stop' removes the freeWAIS-sf stopwords
+     shorter than two characters. "stop" removes the freeWAIS-sf stopwords
-     and `Stem' applies the Porter algorithm for computing the stem of the
+     and "Stem" applies the Porter algorithm for computing the stem of the
      words.
      The filter definition for a collection defines a set of pipelines for

Legend:
-  Removed from v.20
   changed lines
+  Added in v.107