Michael Peter Christen
af465cdca5
fix for wrong robots.txt loading for https protocol
see also: http://forum.yacy-websuche.de/viewtopic.php?f=5&t=4579
12 years ago
Michael Peter Christen
8f3bd0c387
fix for smb crawl situation (lost too many urls)
12 years ago
orbiter
5aa5202adf
fixes for filesystem indexing
12 years ago
Michael Peter Christen
71ed8e5e07
bugfixes for crawler
12 years ago
Michael Peter Christen
0fe8be7981
enhanced data structures for balancer and latency computation which
should produce a bit better prognosis about forced waiting times.
12 years ago
Michael Peter Christen
c25d7bcb80
- added concurrency for robots.txt loading
- changed data model for domain counter
12 years ago
Michael Peter Christen
2d9e577ad0
replaced the custom robots.txt loader by the standard http loader
12 years ago
Michael Peter Christen
a33e2742cb
- removed unnecessary synchronized and deadlock in crawler
- removed problem with monitoring object on Balancer.wait
- added missing user agent settings
12 years ago
Michael Peter Christen
5f0ab25382
removed the option to prevent removal of & parts inside of the
MultiProtocolURI during normalform computation because that should
always be done and also be done during initialization of the
MultiProtocolURI Object. The new normalform method takes only one
argument which should be 'true' unless you know exactly what you are
doing.
12 years ago
Michael Peter Christen
00c1c777fa
refactoring
13 years ago