<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">

<head>
<meta http-equiv="Content-Language" content="en" />
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta http-equiv="Content-Style-Type" content="text/css" />
<meta name="author" content="Mikio Hirabayashi" />
<meta name="keywords" content="Hyper Estraier, Estraier, full-text search, API" />
<meta name="description" content="Introduction of Hyper Estraier" />
<link rel="contents" href="./" />
<link rel="alternate" href="intro-en.html" hreflang="en" title="the English version" />
<link rel="stylesheet" href="common.css" />
<link rel="icon" href="icon16.png" />
<link rev="made" href="mailto:mikio@users.sourceforge.net" />
<title>Introduction of Hyper Estraier Version 1</title>
</head>

<body>

<h1>Introduction</h1>

<div class="note">Copyright (C) 2004-2005 Mikio Hirabayashi</div>
<div class="note">Last Update: Mon, 01 Aug 2005 00:50:38 +0900</div>
<div class="navi">[<a href="intro-ja.html" hreflang="ja">Japanese</a>] [<a href="index.html">HOME</a>]</div>

<hr />

<h2 id="tableofcontents">Table of Contents</h2>

<ol>
<li><a href="#introduction">Introduction</a></li>
<li><a href="#installation">Installation</a></li>
<li><a href="#deployment">Deployment</a></li>
<li><a href="#complement">Complement</a></li>
</ol>

<hr />

<h2 id="introduction">Introduction</h2>

<p>Hyper Estraier is a full-text search system. It searches a large collection of documents for those that contain specified words. If you run a web site, it can serve as your own search engine for the pages of the site. It is also useful as a search utility for mailboxes and file servers.</p>

<p>Hyper Estraier has the following characteristics.</p>

<ul>
<li>High search performance</li>
<li>High scalability with respect to the number of target documents</li>
<li>Perfect recall ratio by the N-gram method</li>
<li>Phrase search, attribute search, and similarity search</li>
<li>Multilingual support with Unicode</li>
<li>Independence from file format and repository</li>
<li>Simple and powerful API</li>
<li>Support for P2P architecture</li>
</ul>

<p>To begin with, please try <a href="http://rbbs.sourceforge.jp/cgi-bin/estdemo-en/estseek.cgi">the demonstration site of Hyper Estraier</a> to get an overview of what it does.</p>

<p>Hyper Estraier has two aspects. One is a library for building a full-text search system: an API (application programming interface) is provided for programmers, which enables you to embed advanced full-text search functions into your own applications.</p>

<p>The other is an application built on that API. A command and a CGI script are provided. Using them, you can construct a typical full-text search system without any programming.</p>

<p>This document describes how to construct a full-text search system with the command and the CGI script, using a search system for a web site as the example. Start by learning the command, and then move on to the API.</p>

<hr />

<h2 id="installation">Installation</h2>

<p>This section describes how to install Hyper Estraier from the source package. For a binary package, see its installation manual.</p>

<h3>Preparation</h3>

<p>Hyper Estraier runs on UNIX-like systems and the Windows NT series. At least the following environments are supported.</p>

<ul>
<li>Linux 2.2 and later (IA32/IA64/AMD64/SPARC/Alpha)</li>
<li>FreeBSD 4.9 and later (IA32/Alpha)</li>
<li>Solaris 8 and later (IA32/SPARC)</li>
<li>Mac OS X 10.2 and later (PowerPC)</li>
<li>HP-UX 11.11 and later (IA64/PA-RISC)</li>
<li>Windows 2000 and later (IA32)</li>
</ul>

<p>`gcc' 2.95 or later and `make' are required to install Hyper Estraier from the source package. They are installed by default on Linux, FreeBSD, and so on.</p>

<p>As Hyper Estraier depends on the following libraries, install them beforehand.</p>

<ul>
<li><a href="http://www.gnu.org/software/libiconv/">libiconv</a> : for conversion of character encodings. 1.9.1 or later is suggested.</li>
<li><a href="http://www.gzip.org/zlib/">zlib</a> : for lossless data compression. 1.2.1 or later is suggested.</li>
<li><a href="http://qdbm.sourceforge.net">QDBM</a> : for the embedded database. 1.8.27 or later is required.</li>
</ul>

<p>It is also suggested to build QDBM with zlib enabled (`--enable-zlib') so that the index of Hyper Estraier becomes smaller. Note that QDBM 1.8.26 or earlier is not supported.</p>

<h3>Installation</h3>

<p>After extracting the archive file of Hyper Estraier, change the current working directory to the generated directory and perform the installation.</p>

<p>Run the configuration script.</p>

<pre>./configure
</pre>

<p>Build the programs.</p>

<pre>make
</pre>

<p>Run the self-diagnostic test.</p>

<pre>make check
</pre>

<p>Install the programs. This operation must be carried out as the root user.</p>

<pre>make install
</pre>

<h3>Result</h3>

<p>When the steps above have finished, the following files are installed.</p>

<pre>/usr/local/include/estraier.h
/usr/local/include/estmtdb.h
/usr/local/lib/libestraier.a
/usr/local/lib/libestraier.so.2.0.0
/usr/local/lib/libestraier.so.2
/usr/local/lib/libestraier.so
/usr/local/lib/pkgconfig/hyperestraier.pc
/usr/local/bin/estcmd
/usr/local/bin/estmttest
/usr/local/bin/estmaster
/usr/local/bin/estcall
/usr/local/bin/estload
/usr/local/bin/estconfig
/usr/local/bin/estwolefind
/usr/local/libexec/estseek.cgi
/usr/local/share/hyperestraier/estseek.conf
/usr/local/share/hyperestraier/estseek.tmpl
/usr/local/share/hyperestraier/estseek.top
/usr/local/share/hyperestraier/estresult.dtd
/usr/local/share/hyperestraier/estraier.idl
/usr/local/share/hyperestraier/locale/...
/usr/local/share/hyperestraier/filter/...
/usr/local/share/hyperestraier/increm/...
/usr/local/share/hyperestraier/doc/...
</pre>
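
<p>To confirm the result, a small shell function can check that some of the files listed above exist. This is only a sketch under stated assumptions: `check_install' is a hypothetical helper name and is not part of Hyper Estraier, the file list is a subset chosen for illustration, and the prefix is assumed to be `/usr/local' unless another one is given.</p>

```shell
# check_install: hypothetical helper, not shipped with Hyper Estraier.
# Reports which of a few expected files are missing under a prefix.
check_install() {
  prefix=${1:-/usr/local}
  missing=0
  for f in \
    include/estraier.h \
    lib/libestraier.a \
    bin/estcmd \
    libexec/estseek.cgi \
    share/hyperestraier/estseek.conf
  do
    if [ ! -e "$prefix/$f" ]; then
      echo "missing: $prefix/$f"
      missing=1
    fi
  done
  # Exit status 0 means every checked file was found.
  return "$missing"
}
```

<p>Calling `check_install' with no argument checks `/usr/local'. If you installed under another prefix, pass it as the first argument.</p>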

<h3>Mac OS X, HP-UX, and Windows</h3>

<p>On Mac OS X, run `make mac' instead of `make', `make check-mac' instead of `make check', and `make install-mac' instead of `make install'. In addition, `libestraier.dylib' and so on are created instead of `libestraier.so' and so on.</p>

<p>On HP-UX, run `make hpux' instead of `make', `make check-hpux' instead of `make check', and `make install-hpux' instead of `make install'. In addition, `libestraier.sl' is created instead of `libestraier.so' and so on.</p>

<p>On Windows, the Cygwin environment is required for building. Moreover, MinGW versions of zlib, libiconv, QDBM, and Pthreads are required. With those in place, run `make win'. No installation command is provided for Windows.</p>

<hr />

<h2 id="deployment">Deployment</h2>

<p>This section describes how to create an index and deploy the CGI script.</p>

<h3>Administration Command</h3>

<p>A database called an inverted index is used to search for documents quickly. That is, you must build an index containing the target documents before you can search them.</p>

<p>estcmd is provided to administer indexes. estcmd treats each file on the file system of the local host as a document. It can register documents in the index and remove them from the index. Moreover, it can gather the documents under a directory and register them in one batch. The supported file formats are plain text, HTML, and e-mail (MIME).</p>

<p>Other formats are also supported by using filters; that method is mentioned later.</p>

<h3>Indexing</h3>

<p>Suppose that you run a web site and its contents are under `/home/www/public_html'. Let's register them into an index at `/home/www/casket'.</p>

<pre>cd /home/www
estcmd gather -sd casket /home/www/public_html
</pre>

<p>The files under `/home/www/public_html' are gathered and registered into a new index named `casket'. That's all for indexing.</p>

<h3>Deployment of the CGI Script</h3>

<p>Suppose that the URL of the directory for CGI scripts is `http://www.estraier.ad.jp/cgi-bin/' and its local path is `/home/www/cgi-bin'. Let's deploy the required files there.</p>

<pre>cd /home/www/cgi-bin/
cp /usr/local/libexec/estseek.cgi .
cp /usr/local/share/hyperestraier/estseek.* .
</pre>

<p>`/usr/local/libexec/estseek.cgi' and `estseek.(conf|tmpl|top)' in `/usr/local/share/hyperestraier/' are copied into `/home/www/cgi-bin/'. estseek.cgi is the CGI script. estseek.conf is the configuration file. estseek.tmpl is the template file. estseek.top holds the message of the top page.</p>

<p>Open estseek.conf with a text editor and modify it. Most items do not need to be modified, except for `indexname', `lprefix', and `gprefix'. Set them as follows.</p>

<pre>indexname: /home/www/casket
...
lprefix: file:///home/www/public_html/
gprefix: http://www.estraier.ad.jp/
...
</pre>

<p>`indexname' specifies the path of the index. `lprefix' specifies the local path of the document root directory. `gprefix' specifies the URL of the document root directory.</p>
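
<p>If you prefer to apply these three settings from a script rather than a text editor, something like the following may work. It is a minimal sketch: `set_conf' is a hypothetical helper and not part of Hyper Estraier, and it assumes that each setting sits on its own line in the `key: value' form shown above.</p>

```shell
# set_conf: hypothetical helper, not shipped with Hyper Estraier.
# Replaces the value of a "key: value" line in a configuration file.
set_conf() {
  key=$1 value=$2 file=$3
  tmp="$file.tmp.$$"
  # Rewrite only lines that begin with "key:"; copy all others unchanged.
  awk -v k="$key" -v v="$value" '
    index($0, k ":") == 1 { print k ": " v; next }
    { print }
  ' "$file" > "$tmp" && mv "$tmp" "$file"
}

# Example calls, using the values from this section:
#   set_conf indexname /home/www/casket estseek.conf
#   set_conf lprefix file:///home/www/public_html/ estseek.conf
#   set_conf gprefix http://www.estraier.ad.jp/ estseek.conf
```

<p>The function writes to a temporary file and then renames it, so a failed run does not leave a half-written estseek.conf. Note that the renamed file gets default permissions, so check them afterwards.</p>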

<h3>Let's Try It</h3>

<p>All set? Access the URL `http://www.estraier.ad.jp/cgi-bin/estseek.cgi' with your favorite web browser. Usage is described on the page.</p>

<h3>Updating the Index</h3>

<p>When documents in your site are modified or new documents are added, update the index at regular intervals. Although it is okay to delete the index and rebuild it, incremental registration is useful.</p>

<p>The `-sd' option given when indexing records the modification time of each document, which is what makes incremental registration possible. Run the following command.</p>

<pre>cd /home/www
estcmd gather -cl -sd -cm casket /home/www/public_html
</pre>

<p>The `-cm' option ignores files that have not been modified. The `-cl' option cleans up the data of overwritten documents.</p>

<h3>Reflection of Deleted Documents</h3>

<p>If documents in your site are deleted, reflect the deletions in the index. Run the following command.</p>

<pre>cd /home/www
estcmd purge -cl casket
</pre>

<p>All records in the index are scanned and the records of deleted documents are removed. The `-cl' option cleans up the data of overwritten documents.</p>

<h3>Optimization</h3>

<p>Repeated `gather' and `purge' operations gradually make the index larger. Optimization eliminates the dispensable regions and keeps the index small.</p>

<pre>cd /home/www
estcmd optimize casket
</pre>

<p>If `gather' or `purge' is performed without the `-cl' option, the records of deleted documents are not actually deleted, although deletion marks are applied to them. `optimize' is useful for removing such void regions.</p>

<h3>Automated Administration</h3>

<p>`cron' enables you to automate administrative operations. Register a script like the following in `crontab'.</p>

<pre>/usr/local/bin/estcmd gather -cl -sd -cm /home/www/casket /home/www/public_html
/usr/local/bin/estcmd purge -cl /home/www/casket
</pre>
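
<p>The two commands can also be wrapped in a shell function so that `purge' is skipped when `gather' fails. The following is a sketch, not a script shipped with Hyper Estraier: the function name `update_index', the variables `ESTCMD', `INDEX', and `DOCROOT', and the script path in the crontab comment are all assumptions made for illustration. The paths default to the ones used in this document.</p>

```shell
# Hypothetical maintenance wrapper; the names below are illustrative only.
ESTCMD="${ESTCMD:-/usr/local/bin/estcmd}"
INDEX="${INDEX:-/home/www/casket}"
DOCROOT="${DOCROOT:-/home/www/public_html}"

# Run the maintenance steps in order and stop at the first failure.
update_index() {
  # Register new and modified documents, ignoring unmodified files.
  "$ESTCMD" gather -cl -sd -cm "$INDEX" "$DOCROOT" || return 1
  # Remove the records of documents that were deleted from the site.
  "$ESTCMD" purge -cl "$INDEX" || return 1
}

# To use this from cron, save the file (for example as the hypothetical
# /home/www/bin/update-index.sh), add a final line that calls update_index,
# and register an entry such as:
#   0 3 * * * /bin/sh /home/www/bin/update-index.sh
```

<p>The crontab entry in the comment would run the script every day at 03:00. Adjust the schedule to how often your site changes.</p>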

<h3>For More Detail</h3>

<p>Detailed information about the command and the CGI script is given in <a href="uguide-en.html">the user's guide</a>. For information about the API, see <a href="pguide-en.html">the programming guide</a>.</p>

<hr />

<h2 id="complement">Complement</h2>

<p>This section describes how to contact the author and states the license of Hyper Estraier.</p>

<h3>Contact</h3>

<p>Hyper Estraier was written and is maintained by <a href="http://qdbm.sourceforge.net/mikio/">Mikio Hirabayashi</a>. You can contact the author by e-mail at `mikio@users.sourceforge.net'. However, for topics that can be shared with other users, please send them to the mailing list instead. To join the mailing list, refer to `<a href="http://lists.sourceforge.net/lists/listinfo/hyperestraier-users">http://lists.sourceforge.net/lists/listinfo/hyperestraier-users</a>'.</p>

<h3>License</h3>

<p>Hyper Estraier is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License or any later version.</p>

<p>Hyper Estraier is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.</p>

<p>You should have received a copy of the GNU Lesser General Public License along with Hyper Estraier (See the file `COPYING'); if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.</p>

<hr />

</body>

</html>

<!-- END OF FILE -->
