/[swish]/trunk
This is a repository of my old source code, which is no longer updated. Go to git.rot13.org for current projects!
Log of /trunk

Revision 107 - Directory Listing
Modified Sat Jul 9 13:14:25 2005 UTC (13 years, 8 months ago) by dpavlin
highlite up to the last word characters to catch suffixes


Revision 106 - Directory Listing
Modified Sat Jul 9 13:10:22 2005 UTC (13 years, 8 months ago) by dpavlin
experiment with HyperEstraier perl module


Revision 105 - Directory Listing
Modified Sat Jul 9 13:09:57 2005 UTC (13 years, 8 months ago) by dpavlin
fixed


Revision 104 - Directory Listing
Modified Sat Apr 30 23:35:27 2005 UTC (13 years, 10 months ago) by dpavlin
merge index slices into single index


Revision 103 - Directory Listing
Modified Sat Apr 30 23:29:27 2005 UTC (13 years, 10 months ago) by dpavlin
fixed warning


Revision 102 - Directory Listing
Modified Sat Apr 30 23:29:14 2005 UTC (13 years, 10 months ago) by dpavlin
check indexes, re-index if needed


Revision 101 - Directory Listing
Modified Sat Apr 30 20:21:10 2005 UTC (13 years, 10 months ago) by dpavlin
small fixes


Revision 100 - Directory Listing
Modified Sat Apr 30 20:21:02 2005 UTC (13 years, 10 months ago) by dpavlin
extract title from beginning of document if no other data is found


Revision 99 - Directory Listing
Modified Sat Apr 30 20:20:42 2005 UTC (13 years, 10 months ago) by dpavlin
support for multiple directories


Revision 98 - Directory Listing
Modified Sun Apr 24 18:09:01 2005 UTC (13 years, 10 months ago) by dpavlin
added --skipoutput (for testing)


Revision 97 - Directory Listing
Modified Sun Apr 24 16:44:13 2005 UTC (13 years, 10 months ago) by dpavlin
small changes


Revision 96 - Directory Listing
Modified Sun Apr 24 16:34:21 2005 UTC (13 years, 10 months ago) by dpavlin
added merge splitting in slices


Revision 95 - Directory Listing
Modified Sun Apr 24 16:33:53 2005 UTC (13 years, 10 months ago) by dpavlin
added --exclude path


Revision 94 - Directory Listing
Modified Sun Apr 24 16:33:21 2005 UTC (13 years, 10 months ago) by dpavlin
added experimental slicing before merge (not used)


Revision 93 - Directory Listing
Modified Mon Nov 22 17:09:44 2004 UTC (14 years, 3 months ago) by dpavlin
better cleanup


Revision 92 - Directory Listing
Modified Mon Nov 22 17:09:23 2004 UTC (14 years, 3 months ago) by dpavlin
skip symlinks


Revision 91 - Directory Listing
Modified Tue Sep 14 19:29:50 2004 UTC (14 years, 6 months ago) by dpavlin
fixed warning


Revision 90 - Directory Listing
Modified Wed Sep 1 14:12:57 2004 UTC (14 years, 6 months ago) by dpavlin
support for array of arrays in highlite; this way you can
fill in alternative spellings from e.g. Lingua::Spelling::Alternative
and get correct highlighting


Revision 89 - Directory Listing
Modified Tue Aug 31 09:04:15 2004 UTC (14 years, 6 months ago) by dpavlin
extract snippet and highlite into separate module


Revision 88 - Directory Listing
Modified Tue Aug 31 07:47:05 2004 UTC (14 years, 6 months ago) by dpavlin
ignore negated words (-computer) in queries when highlighting


Revision 87 - Directory Listing
Modified Mon Aug 30 16:59:17 2004 UTC (14 years, 6 months ago) by dpavlin
much, much better snippets


Revision 86 - Directory Listing
Modified Mon Aug 30 11:16:39 2004 UTC (14 years, 6 months ago) by dpavlin
better snippets


Revision 85 - Directory Listing
Modified Mon Aug 30 11:14:24 2004 UTC (14 years, 6 months ago) by dpavlin
extract metadata for LJ


Revision 84 - Directory Listing
Modified Sun Aug 29 21:19:13 2004 UTC (14 years, 6 months ago) by dpavlin
if pdf file doesn't have a title, display filename and page number


Revision 83 - Directory Listing
Modified Sun Aug 29 18:26:58 2004 UTC (14 years, 6 months ago) by dpavlin
produce valid html, escape characters in snippet


Revision 82 - Directory Listing
Modified Sun Aug 29 18:17:15 2004 UTC (14 years, 6 months ago) by dpavlin
added maximum size of content to extract snippet from (16k), other smaller
improvements


Revision 81 - Directory Listing
Modified Sat Aug 28 22:15:59 2004 UTC (14 years, 6 months ago) by dpavlin
implement snippets of content and highlighting of words


Revision 80 - Directory Listing
Modified Sat May 22 18:33:33 2004 UTC (14 years, 10 months ago) by dpavlin
major improvement: added <path2title> to configuration so that you can specify
part of path to add prefix (collection title) to results,
code cleanup (removed unused parts of code), specified but non-existent
affix and findaffix files will be skipped


Revision 79 - Directory Listing
Modified Sun Apr 18 08:36:35 2004 UTC (14 years, 11 months ago) by dpavlin
new search URL


Revision 78 - Directory Listing
Modified Sun Apr 18 08:11:22 2004 UTC (14 years, 11 months ago) by dpavlin
modified configuration to include frameset which will have search on
top and normal mailman or search results on bottom


Revision 77 - Directory Listing
Modified Sun Apr 18 06:31:38 2004 UTC (14 years, 11 months ago) by dpavlin
index MailMan archives


Revision 76 - Directory Listing
Modified Sat Apr 17 18:41:21 2004 UTC (14 years, 11 months ago) by dpavlin
Pages: translation


Revision 75 - Directory Listing
Modified Sat Apr 17 18:34:45 2004 UTC (14 years, 11 months ago) by dpavlin
new navigation: previous page (<<), previous set (..), pages (1..x),
next set (..), next page (>>)


Revision 74 - Directory Listing
Modified Wed Apr 7 12:54:21 2004 UTC (14 years, 11 months ago) by dpavlin
fix title extraction (again)


Revision 73 - Directory Listing
Modified Tue Apr 6 19:21:07 2004 UTC (14 years, 11 months ago) by dpavlin
print collection name before link - collection name
is part of document title up to first " :: " delimiter


Revision 72 - Directory Listing
Modified Tue Apr 6 15:06:58 2004 UTC (14 years, 11 months ago) by dpavlin
pdf pagination now works correctly


Revision 71 - Directory Listing
Modified Sat Apr 3 15:15:36 2004 UTC (14 years, 11 months ago) by dpavlin
remove empty lines before <html> so that swish parser will catch <title>
correctly


Revision 70 - Directory Listing
Modified Fri Mar 19 09:46:33 2004 UTC (15 years ago) by dpavlin
update SourceForge repository


Revision 69 - Directory Listing
Modified Thu Mar 18 23:07:21 2004 UTC (15 years ago) by dpavlin
more verbose adding of titles


Revision 68 - Directory Listing
Modified Thu Mar 18 11:14:49 2004 UTC (15 years ago) by dpavlin
don't save empty pages in index


Revision 67 - Directory Listing
Modified Wed Mar 17 12:22:26 2004 UTC (15 years ago) by dpavlin
if path is specified use progspider


Revision 66 - Directory Listing
Modified Wed Mar 17 12:19:42 2004 UTC (15 years ago) by dpavlin
index pdf files page-by-page


Revision 65 - Directory Listing
Modified Wed Mar 17 12:19:14 2004 UTC (15 years ago) by dpavlin
fixed back-references in regexps


Revision 63 - Directory Listing
Modified Fri Feb 6 13:29:39 2004 UTC (15 years, 1 month ago) by dpavlin
convert pdf files when indexing with progspider


Revision 62 - Directory Listing
Modified Fri Feb 6 13:27:51 2004 UTC (15 years, 1 month ago) by dpavlin
small improvements


Revision 61 - Directory Listing
Modified Thu Jan 29 18:26:19 2004 UTC (15 years, 1 month ago) by dpavlin
better extracting of titles


Revision 60 - Directory Listing
Modified Thu Jan 29 18:25:55 2004 UTC (15 years, 1 month ago) by dpavlin
fix for pages_in_set when there are no results (I should really report this
as a bug!)


Revision 59 - Directory Listing
Modified Mon Jan 26 08:08:41 2004 UTC (15 years, 1 month ago) by dpavlin
implemented usage of SWISH::API instead of SWISH::Fork, new pages
using Data::Pageset


Revision 58 - Directory Listing
Modified Mon Jan 26 08:05:39 2004 UTC (15 years, 1 month ago) by dpavlin
use HTML or HTML2 parser


Revision 57 - Directory Listing
Modified Sun Jan 25 16:49:50 2004 UTC (15 years, 1 month ago) by dpavlin
various fixes


Revision 56 - Directory Listing
Modified Fri Jan 23 13:10:40 2004 UTC (15 years, 1 month ago) by dpavlin
better support for DocBook generated files


Revision 55 - Directory Listing
Modified Tue Jan 20 20:36:32 2004 UTC (15 years, 2 months ago) by dpavlin
moved rot13.config to config/ dir


Revision 54 - Directory Listing
Modified Tue Jan 20 18:42:05 2004 UTC (15 years, 2 months ago) by dpavlin
make script less chatty


Revision 53 - Directory Listing
Modified Tue Jan 20 18:41:38 2004 UTC (15 years, 2 months ago) by dpavlin
configuration moved to config/ directory


Revision 52 - Directory Listing
Modified Tue Jan 20 18:40:52 2004 UTC (15 years, 2 months ago) by dpavlin
common configuration for file-system indexing


Revision 51 - Directory Listing
Modified Tue Jan 20 18:40:06 2004 UTC (15 years, 2 months ago) by dpavlin
better removal of JavaScript


Revision 50 - Directory Listing
Modified Tue Jan 20 18:13:32 2004 UTC (15 years, 2 months ago) by dpavlin
support for 0-size files


Revision 49 - Directory Listing
Modified Tue Jan 20 16:02:27 2004 UTC (15 years, 2 months ago) by dpavlin
example configuration which crawls www.rot13.org


Revision 48 - Directory Listing
Modified Tue Jan 20 16:01:13 2004 UTC (15 years, 2 months ago) by dpavlin
removed debugging output


Revision 47 - Directory Listing
Modified Tue Jan 20 15:58:15 2004 UTC (15 years, 2 months ago) by dpavlin
Start parallel swish-e to index multiple sets of documents.
More info at: http://blog.rot13.org/index.cgi/id_14


Revision 46 - Directory Listing
Modified Sat Jan 17 23:57:55 2004 UTC (15 years, 2 months ago) by dpavlin
- moved text/html content filtering to filter.pm to facilitate code re-use
- added progspider which can be used with -S prog to crawl files and
  use filtering subroutines


Revision 45 - Directory Listing
Modified Wed Nov 19 12:07:07 2003 UTC (15 years, 4 months ago) by dpavlin
fixes and improvements


Revision 44 - Directory Listing
Modified Mon Aug 4 16:41:14 2003 UTC (15 years, 7 months ago) by dpavlin
added some html and URI of indexed content


Revision 43 - Directory Listing
Modified Sun Aug 3 21:36:16 2003 UTC (15 years, 7 months ago) by dpavlin
added template make and shell script which merges all indexes


Revision 42 - Directory Listing
Modified Tue Jul 29 10:40:58 2003 UTC (15 years, 7 months ago) by dpavlin
better handling of chars in URL, support for
<!-- noindex -->, <!-- index --> which is supported natively in swish 2.4


Revision 41 - Directory Listing
Modified Sun Jun 1 12:13:36 2003 UTC (15 years, 9 months ago) by dpavlin
- support for more than one affix or findaffix file at same time


Revision 40 - Directory Listing
Modified Sun Jun 1 11:45:19 2003 UTC (15 years, 9 months ago) by dpavlin
- support for listing of files in .tar.gz; decompressing of .gz and .bz2
  content
- changed order of arguments for swishspider: now baseurl,url (but it's
  backwards compatible, so your old configurations will work)
- do html fixup just on html files (to prevent binary archive corruption)
- crawl sites that have frames


Revision 39 - Directory Listing
Modified Sun Jun 1 11:41:39 2003 UTC (15 years, 9 months ago) by dpavlin
support for affix and findaffix in same configuration file


Revision 38 - Directory Listing
Modified Tue May 20 21:01:11 2003 UTC (15 years, 10 months ago) by dpavlin
more logical place for translation


Revision 37 - Directory Listing
Modified Tue May 20 20:57:31 2003 UTC (15 years, 10 months ago) by dpavlin
updated Croatian translation


Revision 36 - Directory Listing
Modified Tue May 20 20:41:09 2003 UTC (15 years, 10 months ago) by dpavlin
additional properties example


Revision 35 - Directory Listing
Modified Tue May 20 20:10:16 2003 UTC (15 years, 10 months ago) by dpavlin
fix "Use of uninitialized value" in apache error.log


Revision 34 - Directory Listing
Modified Sun May 4 12:08:29 2003 UTC (15 years, 10 months ago) by dpavlin
added optional title, and fixed strip url


Revision 33 - Directory Listing
Modified Sun May 4 01:31:31 2003 UTC (15 years, 10 months ago) by dpavlin
usage for "strip url" option, fix for indexing of whole host (without
path in URL argument)


Revision 32 - Directory Listing
Modified Wed Apr 30 12:40:09 2003 UTC (15 years, 10 months ago) by dpavlin
added make_config.pl which creates swish config file
added checkbox to hide document properties (like content, size etc)
remove comments between <html> and <head> which confuse swish


Revision 31 - Directory Listing
Modified Mon Mar 24 16:14:44 2003 UTC (15 years, 11 months ago) by dpavlin
save document title too


Revision 30 - Directory Listing
Modified Mon Mar 24 09:57:44 2003 UTC (16 years ago) by dpavlin
added instructions about formatting of html before indexing it (and added
ability to unroll wrongly split tags into a form which is acceptable to swish)


Revision 29 - Directory Listing
Modified Mon Mar 24 09:04:57 2003 UTC (16 years ago) by dpavlin
escape special characters in title


Revision 28 - Directory Listing
Modified Fri Mar 21 22:23:06 2003 UTC (16 years ago) by dpavlin
don't store index dir under CVS


Revision 27 - Directory Listing
Modified Fri Mar 21 22:01:59 2003 UTC (16 years ago) by dpavlin
search for explicit path, added examples


Revision 26 - Directory Listing
Modified Fri Mar 21 21:28:21 2003 UTC (16 years ago) by dpavlin
added limit to path (and save swishdocpath to database to enable that)


Revision 25 - Directory Listing
Modified Fri Mar 21 21:27:51 2003 UTC (16 years ago) by dpavlin
ignore some files in CVS


Revision 24 - Directory Listing
Modified Fri Mar 21 21:16:55 2003 UTC (16 years ago) by dpavlin
better design


Revision 23 - Directory Listing
Modified Fri Mar 21 21:10:51 2003 UTC (16 years ago) by dpavlin
added limit to path


Revision 22 - Directory Listing
Modified Tue Mar 18 20:24:57 2003 UTC (16 years ago) by dpavlin
properties are optional


Revision 21 - Directory Listing
Modified Tue Mar 18 20:20:11 2003 UTC (16 years ago) by dpavlin
support for different properties in output (aside from standard ones) and
formatting of output for each hit


Revision 20 - Directory Listing
Modified Tue Mar 18 19:08:56 2003 UTC (16 years ago) by dpavlin
better explanation


Revision 19 - Directory Listing
Modified Sun Mar 16 22:08:17 2003 UTC (16 years ago) by dpavlin
I use findaffix output and not affix :-)


Revision 18 - Directory Listing
Modified Sun Mar 16 21:59:10 2003 UTC (16 years ago) by dpavlin
decode all strings before output to charset defined in xml file


Revision 17 - Directory Listing
Modified Sun Mar 16 21:45:23 2003 UTC (16 years ago) by dpavlin
all.xml is the English template, while rot13.xml is the Croatian one


Revision 16 - Directory Listing
Modified Sun Mar 16 21:44:42 2003 UTC (16 years ago) by dpavlin
moved all text into configuration file


Revision 15 - Directory Listing
Modified Sun Mar 16 21:31:55 2003 UTC (16 years ago) by dpavlin
support for image map and skip pictures (speedup)


Revision 12 - Directory Listing
Modified Sun Mar 16 21:20:22 2003 UTC (16 years ago) by dpavlin
Initial revision


Revision 11 - Directory Listing
Modified Sun Mar 16 21:16:41 2003 UTC (16 years ago) by dpavlin
Initial revision


Revision 8 - Directory Listing
Modified Sun Mar 16 21:06:43 2003 UTC (16 years ago) by dpavlin
Initial revision


Revision 7 - Directory Listing
Modified Sun Mar 16 21:02:29 2003 UTC (16 years ago) by dpavlin
Initial revision


Revision 4 - Directory Listing
Modified Tue Jun 4 07:04:34 2002 UTC (16 years, 9 months ago) by dpavlin
Initial revision


Revision 1 - Directory Listing
Added Tue Jun 4 06:39:53 2002 UTC (16 years, 9 months ago) by dpavlin
Initial revision

