32 Commits (c88c30a5c52dafb46c6d3eb401d23aa5feed63f1)

Author SHA1 Message Date
Michael Peter Christen 910a496c9f replaced http links with https
4 months ago
Michael Peter Christen 23f1dc3741 addressing/fixing some concurrency issues from
2 years ago
Michael Peter Christen d19872fd26 making sure that crawl queues are closed correctly to prevent data loss
3 years ago
Michael Peter Christen e6a87e0426 enhanced crawler
3 years ago
Lina Ceballos a96752f5ab adding SPDX license and copyright headers
4 years ago
Michael Peter Christen 63f58e4785 enhanced strategy in host browser
4 years ago
luccioman 4ee14ff3c5 Fixed NullPointerException case on malformed crawl queue folder name
6 years ago
luccioman 46b5249c20 Removed time condition on HostBalancer initialization in JUnit test.
7 years ago
luccioman 39e081ef38 Fixed display of crawler pending URLs counts in HostBrowser.html page.
8 years ago
luccioman f0639d810c Customized name for Threads still using the default "Thread-n" pattern.
8 years ago
reger 708bcbb042 one more replacement to use cached hosthash vs. calculated
8 years ago
reger 22db449f2a to prevent crawler to concurrently access and alter same crawl queue
8 years ago
reger 7789c32c82 delete crawl queue on init exception
9 years ago
reger 379e9b330d use supplied url port to get robots.txt in crawlers hostqueue
9 years ago
reger b5371ea8c1 read/init crawl queue in a thread
9 years ago
reger 3e742d1e34 Init remote crawler on demand
10 years ago
Michael Peter Christen 5bb52f79be reduce number of calls to queue.size() because that may be a bottleneck
10 years ago
Michael Peter Christen a34f837592 better delete all files in path when removing host crawl stack
10 years ago
orbiter 4ae7aead28 addon to latest fix
10 years ago
Michael Peter Christen 49d91b94c3 npe fix in crawler
10 years ago
orbiter e9163e7e10 fix for malformed hostpath names in crawl balancer
10 years ago
Michael Peter Christen 06ab72d1af enhanced crawler host round-robin strategy
10 years ago
Michael Peter Christen 49886fab08 enhanced debugging
11 years ago
orbiter d7d38f9135 made number of open files in crawler configurable and increased default
11 years ago
orbiter 97983ba89f fixed generics warnings for generic array instantiation that appeared
11 years ago
reger 1600414450 fix NPE on continuing crawls after YaCy restart
11 years ago
Michael Peter Christen c1c1be8f02 fix for slow crawling and better logging in balancer
11 years ago
orbiter 2f63bd0261 enhanced Host Balancer strategy: fair round robin
11 years ago
Michael Peter Christen 8b32dd5f9e special strategy for balancer: do not remove targets with zero wait time
11 years ago
Michael Peter Christen 9c6228d948 fix for deadlocks in crawler
11 years ago
Michael Peter Christen 06afb568e2 new Strategies in Balancer:
11 years ago
Michael Peter Christen da86f150ab - added a new Crawler Balancer: HostBalancer and HostQueues:
11 years ago