Michael Peter Christen
3de784c8dd
replaced more split and replaceAll calls that lacked pattern
pre-compilation with pre-compiled patterns
12 years ago
orbiter
5aa5202adf
fixes for filesystem indexing
12 years ago
orbiter
354ef8000d
- added 'deleteold' option to crawler which causes documents that are
selected by a crawl filter (host or subpath) to be deleted
- site crawl uses this option by default now
- added a concurrency option to deleteDomain()
12 years ago
Michael Peter Christen
0716a24737
added more / all new crawl profile fields into crawl profile editor
12 years ago
Michael Peter Christen
4a14122ba7
if a crawl profile has a collection assigned, use the
collection to show a name in the web interface. This should prevent
overly long names from making the interface unusable.
12 years ago
Michael Peter Christen
ac9540dfb6
removed options for stopwords which are not used
12 years ago
Michael Peter Christen
c25d7bcb80
- added concurrency for robots.txt loading
- changed data model for domain counter
12 years ago
Michael Peter Christen
85ca07b90e
when a new crawl is started, an equal crawl, if still running, is
terminated and the corresponding crawl profile is deleted (this also
clears the crawl queue entries for that crawl profile)
12 years ago
Michael Peter Christen
5f0ab25382
removed the option to prevent removal of & parts inside of the
MultiProtocolURI during normalform computation because that should
always be done and also be done during initialization of the
MultiProtocolURI Object. The new normalform method takes only one
argument which should be 'true' unless you know exactly what you are
doing.
12 years ago
Michael Peter Christen
53789555b9
fix for crawl start filter
12 years ago
Michael Peter Christen
76d218fbef
fixes to crawl profiles
12 years ago
Michael Peter Christen
1533bfd63b
refactoring
13 years ago
Michael Peter Christen
8219a445f3
refactoring
13 years ago
Michael Peter Christen
f879a344e7
fix for no depth limit default value
13 years ago
Michael Peter Christen
00c1c777fa
refactoring
13 years ago