59 Commits (0229029dcf91f587b87e9fe9f98d2a51229836c4)

Author SHA1 Message Date
orbiter 4bea3f9714 hack to reduce resource contention caused by massive UTF8 decodings which use java.nio resources: 14 years ago
orbiter 5b579e21a3 code cleanup 14 years ago
low012 2861d0888a *) simplified code; *) fixed potential NumberFormatExceptions 14 years ago
orbiter cb1f49d0f2 replaced all 'new String' with default encoding (missing) or UTF-8 encoding with a String generation method that uses a pre-defined Charset constant for UTF-8. This avoids a cache-lookup for the Charset object using String hashing of the String 'UTF-8'. 14 years ago
orbiter 4588b5a291 - fixed document number limitation for crawls that restrict the number of documents per domain 14 years ago
low012 6f4f957e50 *) cleaning up the code a little bit 14 years ago
orbiter 2c549ae341 fixed a number of small bugs: 15 years ago
orbiter 37baa8bae3 - fixes for concurrency exceptions and failed database integrity verification 15 years ago
orbiter 3197ca42ed preparations to move the HTCache into cora: 15 years ago
orbiter 3f93a0cc8f redesign of remote proxy settings 15 years ago
orbiter 06ff0c5b06 fixes for metadata retrieval and presentation 15 years ago
orbiter fc5efcc05a enhanced and fixed OAI-PMH import 15 years ago
orbiter 1a8a134e0c continuing String-hash - to - byte[]-hash redesign that was started in SVN 6775 and continued in SVN 6790 15 years ago
orbiter 25aef069a6 continuing String-hash - to - byte[]-hash redesign that was started in SVN 6775 15 years ago
orbiter 1e8e79b9ef redesign of reference hash (URL-hash) parameter hand-over: 15 years ago
orbiter 564927ce72 redesign of CrawlResult data structures because of OOM occurrences during URL deletion processes. 15 years ago
orbiter 4431b9767e added about 450 replacements for printStackTrace() methods to pipe such traces into the log at DATA/LOG/ 16 years ago
orbiter 5e8038ac4d - refactoring of blacklists 16 years ago
orbiter 5841ee83d3 refactoring 16 years ago
orbiter ce8dc575ca refactoring 16 years ago
orbiter f677d534b1 start of a really extensive refactoring which will produce a hierarchical package structure with the domain yacy.net as package root 16 years ago
orbiter 735e2737e3 * added index segments 16 years ago
low012 5e4f267a36 *) added subversion properties and edited a few comments 16 years ago
orbiter 1d8d51075c refactoring: 16 years ago
orbiter 5bb8074150 removed the indexing queue. This queue was superfluous since the introduction of the blocking queues last year, where documents are parsed, analysed and stored in the index with concurrency. 16 years ago
f1ori f814e0fa81 enable warnings and fix most of it 16 years ago
orbiter ce1adf9955 serialized all logging using concurrency: 16 years ago
orbiter 88426912ad more refactoring to make the segment object easier to use and to be prepared to integrate author navigation 16 years ago
orbiter 99bf0b8e41 refactoring of plasmaWordIndex: 16 years ago
orbiter c2359f20dd refactoring: better abstraction of reference and metadata prototypes. 16 years ago
orbiter 14a1c33823 refactoring of wordIndex class 16 years ago
orbiter aa44d9bad9 more refactoring of kelondro.text / deleted de.anomic.index 16 years ago
orbiter 76ef5f0f14 refactoring of index package: better names for the classes (to be continued) 16 years ago
orbiter 024da2916b refactoring of logging 16 years ago
orbiter 47f0c3b002 replaced the cacheAdmin with the ViewFile servlet, because the cacheAdmin was an interface to the old HTCACHE data structure which does not exist any more. Changed links to point to the ViewFile servlets. 17 years ago
orbiter 826ca79735 refactoring and new architecture to store the files of the web cache: 17 years ago
orbiter d09ddabd09 corrected a design mistake (5-byte hashes not necessary) 17 years ago
orbiter 77ee0765a4 - added domain statistic generation to IndexControlURLs_p.html servlet 17 years ago
orbiter 80a7bc93d6 - added statistical evaluation about domains that appear during crawling 17 years ago
orbiter 536e77e8b7 modifications towards a single database operation to read/write http header and cached file at once: 17 years ago
danielr 17b7845eb5 * refactoring 17 years ago
danielr 3bb870bfcd added final where possible 17 years ago
danielr b63a519456 fixed build problem 17 years ago
danielr e12438142e - warning instead of NPE if urlHash is not in index of CrawlResults 17 years ago
danielr 7feae906aa - organize imports 17 years ago
orbiter 6f1a3fce05 BF Bugfix 17 years ago
orbiter cfe6790498 - added option to switch between yacy networks, especially between the two default networks (freeworld and intranet), 17 years ago
orbiter d2ba1fd2ab major step forward to network switching (target is easy switch to intranet or other networks .. and back) 17 years ago
orbiter 7f9f639d20 - refactoring and abstraction of index reference (urls) handling: blacklisting is part of reference filtering 17 years ago
orbiter d6050b9ffb - separated the LURL data storage and Crawl result stack for process supervision. 17 years ago