This is a repository of my old source code, which is no longer updated. Go to git.rot13.org for current projects!

Log of /trunk/spider/swishspider


Revision 46
Modified Sat Jan 17 23:57:55 2004 UTC (16 years, 9 months ago) by dpavlin
File length: 3462 byte(s)
- moved text/html content filtering to filter.pm to facilitate code re-use
- added progspider which can be used with -S prog to crawl files and
  use filtering subroutines


Revision 45
Modified Wed Nov 19 12:07:07 2003 UTC (16 years, 11 months ago) by dpavlin
File length: 4871 byte(s)
fixes and improvements


Revision 42
Modified Tue Jul 29 10:40:58 2003 UTC (17 years, 2 months ago) by dpavlin
File length: 4033 byte(s)
better handling of characters in URLs; support for
<!-- noindex --> and <!-- index --> comments, which are supported natively in swish 2.4


Revision 40
Modified Sun Jun 1 11:45:19 2003 UTC (17 years, 4 months ago) by dpavlin
File length: 3897 byte(s)
- support for listing of files in .tar.gz; decompressing of .gz and .bz2
  content
- changed order of arguments for swishspider: now baseurl,url (but it's
  backwards compatible, so your old configurations will work)
- do html fixup just on html files (to prevent binary archive corruption)
- crawl sites that have frames


Revision 32
Modified Wed Apr 30 12:40:09 2003 UTC (17 years, 5 months ago) by dpavlin
File length: 3668 byte(s)
added make_config.pl which creates swish config file
added checkbox to hide document properties (like content, size etc)
remove comments between <html> and <head> which confuse swish


Revision 30
Modified Mon Mar 24 09:57:44 2003 UTC (17 years, 7 months ago) by dpavlin
File length: 3449 byte(s)
added instructions about formatting html before indexing it (and added the
ability to unroll wrongly split tags into a form which is acceptable to swish)


Revision 15
Modified Sun Mar 16 21:31:55 2003 UTC (17 years, 7 months ago) by dpavlin
File length: 2986 byte(s)
support for image maps; skip pictures (speedup)


Revision 1
Added Tue Jun 4 06:39:53 2002 UTC (18 years, 4 months ago) by dpavlin
File length: 2899 byte(s)
Initial revision

