Commit Graph

55 Commits (ce0c8247fcf7e5dc0a13aa21040f025201229367)

Author SHA1 Message Date
orbiter 4588b5a291 - fixed document number limitation for crawls that restrict the number of documents per domain
14 years ago
low012 6f4f957e50 *) cleaning up the code a little bit
14 years ago
orbiter 2c549ae341 fixed a number of small bugs:
14 years ago
orbiter 37baa8bae3 - fixes for concurrency exceptions and failed database integrity verification
14 years ago
orbiter 3197ca42ed preparations to move the HTCache into cora:
14 years ago
orbiter 3f93a0cc8f redesign of remote proxy settings
15 years ago
orbiter 06ff0c5b06 fixes for metadata retrieval and presentation
15 years ago
orbiter fc5efcc05a enhanced and fixed OAI-PMH import
15 years ago
orbiter 1a8a134e0c continuing String-hash - to - byte[]-hash redesign that was started in SVN 6775 and continued in SVN 6790
15 years ago
orbiter 25aef069a6 continuing String-hash - to - byte[]-hash redesign that was started in SVN 6775
15 years ago
orbiter 1e8e79b9ef redesign of reference hash (URL-hash) parameter hand-over:
15 years ago
orbiter 564927ce72 redesign of CrawlResult data structures because of OOM occurrences during URL deletion processes.
15 years ago
orbiter 4431b9767e added about 450 replacements for printStackTrace() methods to pipe such traces into the log at DATA/LOG/
15 years ago
orbiter 5e8038ac4d - refactoring of blacklists
15 years ago
orbiter 5841ee83d3 refactoring
15 years ago
orbiter ce8dc575ca refactoring
15 years ago
orbiter f677d534b1 start of a really extensive refactoring which will produce a hierarchical package structure with the domain yacy.net as package root
15 years ago
orbiter 735e2737e3 * added index segments
15 years ago
low012 5e4f267a36 *) added subversion properties and edited a few comments
15 years ago
orbiter 1d8d51075c refactoring:
16 years ago
orbiter 5bb8074150 removed the indexing queue. This queue was superfluous since the introduction of the blocking queues last year, where documents are parsed, analysed and stored in the index with concurrency.
16 years ago
f1ori f814e0fa81 enable warnings and fix most of it
16 years ago
orbiter ce1adf9955 serialized all logging using concurrency:
16 years ago
orbiter 88426912ad more refactoring to make the segment object easier to use and to be prepared to integrate author navigation
16 years ago
orbiter 99bf0b8e41 refactoring of plasmaWordIndex:
16 years ago
orbiter c2359f20dd refactoring: better abstraction of reference and metadata prototypes.
16 years ago
orbiter 14a1c33823 refactoring of wordIndex class
16 years ago
orbiter aa44d9bad9 more refactoring of kelondro.text / deleted de.anomic.index
16 years ago
orbiter 76ef5f0f14 refactoring of index package: better names for the classes (to be continued)
16 years ago
orbiter 024da2916b refactoring of logging
16 years ago
orbiter 47f0c3b002 replaced the cacheAdmin with the ViewFile servlet, because the cacheAdmin was an interface to the old HTCACHE data structure which does not exist any more. Changed links to point to the ViewFile servlets.
16 years ago
orbiter 826ca79735 refactoring and new architecture to store the files of the web cache:
16 years ago
orbiter d09ddabd09 corrected a design mistake (5-byte hashes not necessary)
16 years ago
orbiter 77ee0765a4 - added domain statistic generation to IndexControlURLs_p.html servlet
16 years ago
orbiter 80a7bc93d6 - added statistical evaluation about domains that appear during crawling
16 years ago
orbiter 536e77e8b7 modifications towards a single database operation to read/write http header and cached file at once:
16 years ago
danielr 17b7845eb5 * refactoring
17 years ago
danielr 3bb870bfcd added final where possible
17 years ago
danielr b63a519456 fixed build problem
17 years ago
danielr e12438142e - warning instead of NPE if urlHash is not in index of CrawResults
17 years ago
danielr 7feae906aa - organize imports
17 years ago
orbiter 6f1a3fce05 BF Bugfix
17 years ago
orbiter cfe6790498 - added option to switch between yacy networks, especially between the two default networks (freeworld and intranet),
17 years ago
orbiter d2ba1fd2ab major step forward to network switching (target is easy switch to intranet or other networks .. and back)
17 years ago
orbiter 7f9f639d20 - refactoring and abstraction of index reference (urls) handling: blacklisting is part of reference filtering
17 years ago
orbiter d6050b9ffb - separated the LURL data storage and Crawl result stack for process supervision.
17 years ago
orbiter 541b817502 refactoring of switchboard queueing
17 years ago
orbiter a8a5df4a51 - more dublin core naming of page metadata
17 years ago
orbiter c527969185 - enhanced monitoring of ranking parameters
17 years ago
fuchsi 0e1738899f * Complete number localization and provide a more reasonable interface to serverObjects:
17 years ago