/[swish]/trunk/spider/swishspider
This is a repository of my old source code, which is no longer updated. Go to git.rot13.org for current projects!

Log of /trunk/spider/swishspider

Revision 46
Modified Sat Jan 17 23:57:55 2004 UTC (20 years, 2 months ago) by dpavlin
File length: 3462 byte(s)
Diff to previous 45
- moved text/html content filtering to filter.pm to facilitate code re-use
- added progspider which can be used with -S prog to crawl files and
  use filtering subroutines


Revision 45
Modified Wed Nov 19 12:07:07 2003 UTC (20 years, 4 months ago) by dpavlin
File length: 4871 byte(s)
Diff to previous 42
fixes and improvements


Revision 42
Modified Tue Jul 29 10:40:58 2003 UTC (20 years, 7 months ago) by dpavlin
File length: 4033 byte(s)
Diff to previous 40
better handling of characters in URLs; support for
<!-- noindex --> and <!-- index --> comments, which are supported natively in swish 2.4


Revision 40
Modified Sun Jun 1 11:45:19 2003 UTC (20 years, 9 months ago) by dpavlin
File length: 3897 byte(s)
Diff to previous 32
- support for listing of files in .tar.gz; decompressing of .gz and .bz2
  content
- changed order of arguments for swishspider: now baseurl,url (but it's
  backwards compatible, so your old configurations will work)
- do html fixup just on html files (to prevent binary archive corruption)
- crawl sites that have frames


Revision 32
Modified Wed Apr 30 12:40:09 2003 UTC (20 years, 10 months ago) by dpavlin
File length: 3668 byte(s)
Diff to previous 30
added make_config.pl which creates swish config file
added checkbox to hide document properties (like content, size etc)
remove comments between <html> and <head> which confuse swish


Revision 30
Modified Mon Mar 24 09:57:44 2003 UTC (21 years ago) by dpavlin
File length: 3449 byte(s)
Diff to previous 15
added instructions about formatting HTML before indexing it (and added
the ability to unroll wrongly split tags into a form that is acceptable to swish)


Revision 15
Modified Sun Mar 16 21:31:55 2003 UTC (21 years ago) by dpavlin
File length: 2986 byte(s)
Diff to previous 1
support for image maps; skip pictures (speedup)


Revision 1
Added Tue Jun 4 06:39:53 2002 UTC (21 years, 9 months ago) by dpavlin
File length: 2899 byte(s)
Initial revision

