91 Commits (958ff4778e9aab7b6910b71a67342a8b50a81b6b)

Author SHA1 Message Date
low012 2861d0888a *) simplified code *) fixed potential NumberFormatExceptions
14 years ago
orbiter 30aed9824a moved getBytes() to UTF8.getBytes() to use a default String encoding
14 years ago
orbiter cb1f49d0f2 replaced all 'new String' calls that used the default encoding (encoding argument missing) or UTF-8 encoding with a String generation method that uses a pre-defined Charset constant for UTF-8. This avoids a cache lookup for the Charset object using String hashing of the String 'UTF-8'.
14 years ago
orbiter 4588b5a291 - fixed document number limitation for crawls that restrict the number of documents per domain
14 years ago
f1ori fafab7a8fe * provide option to delete cached snippet fetching failures
14 years ago
orbiter 10ae8d961b - cora package now has no dependencies on other yacy packages and becomes a 'base' package (refactoring)
14 years ago
low012 38fdf43587 *) renamed classes according to standard Java coding conventions
14 years ago
orbiter 790e0b1894 - enhanced index deletion in IndexControlRWIs_p: delete also robots.txt database and cache if demanded
14 years ago
orbiter 863065abc4 added user agent logging to access tracker
14 years ago
orbiter aacf572a26 - enhancements for search speed
15 years ago
orbiter d5dc88a351 show cleanup button only if servlet was called without post/put arguments.
15 years ago
orbiter 97ee278931 enhanced search speed:
15 years ago
orbiter 3197ca42ed preparations to move the HTCache into cora:
15 years ago
orbiter 7bcfa033c9 more abstraction of the htcache when using the LoaderDispatcher:
15 years ago
orbiter b18a7606a0 some performance hacks and fixed after reading dump in
15 years ago
orbiter 1a8a134e0c continuing String-hash - to - byte[]-hash redesign that was started in SVN 6775 and continued in SVN 6790
15 years ago
orbiter 25aef069a6 continuing String-hash - to - byte[]-hash redesign that was started in SVN 6775
15 years ago
orbiter 1e8e79b9ef redesign of reference hash (URL-hash) parameter hand-over:
15 years ago
orbiter 727dd9b193 - fixed a bug in robots.txt parser
15 years ago
orbiter 564927ce72 redesign of CrawlResult data structures because of OOM occurrences during URL deletion processes.
15 years ago
orbiter 362b7a929b added extensive memory protection logic to avoid out of memory errors that may be caused by the RowCollection memory allocation function
15 years ago
orbiter 4782d2c438 fix for search bug that appeared when looking at page 3 of results or further
15 years ago
orbiter 4a5100789f replaced _all_ size() == 0 with isEmpty() and all size() > 0 with !isEmpty(). The isEmpty() method is much faster in some cases, especially when used to access badly balanced hashtables where a size() operation becomes a large iteration.
15 years ago
orbiter 491ba6a1ba - some refactoring in workflow
15 years ago
orbiter 5399d1e2bc refactoring (reason: get more abstraction to use the blacklist class; for integration in other servlets)
15 years ago
orbiter 4431b9767e added about 450 replacements for printStackTrace() methods to pipe such traces into the log at DATA/LOG/
15 years ago
orbiter 52470d0de4 - fix for xls parser
15 years ago
orbiter 5e8038ac4d - refactoring of blacklists
15 years ago
orbiter b79f4f062f refactoring of yacy documents and parsers: they depend now only on the kelondro classes
16 years ago
orbiter 5841ee83d3 refactoring
16 years ago
orbiter ce8dc575ca refactoring
16 years ago
orbiter bea3b99aff moved table and util classes
16 years ago
orbiter 1e4f8b56ed accumulated classes from different packages into the new rwi package
16 years ago
orbiter 4446acc8cd moved kelondro order
16 years ago
orbiter 735e2737e3 * added index segments
16 years ago
low012 5e4f267a36 *) added subversion properties and edited a few comments
16 years ago
orbiter af3a696fc4 added a fast-fail concept in search processes. The search now has better control if all the remote searches may bring any result. If all processes are finished, then all search tasks fail fast.
16 years ago
orbiter 61748285c3 more refactoring of search
16 years ago
orbiter 72ac5bd80f refactoring of search process.
16 years ago
orbiter 1d8d51075c refactoring:
16 years ago
orbiter 5bb8074150 removed the indexing queue. This queue was superfluous since the introduction of the blocking queues last year, where documents are parsed, analysed and stored in the index with concurrency.
16 years ago
orbiter 0e8647d62f refactoring of search classes
16 years ago
orbiter dafffd0153 refactoring of parsers and document processing
16 years ago
orbiter bc6dd8194b refactoring: moved search query class to new search package
16 years ago
orbiter 945777aa80 replaced rwi term counting method by one that computes the maximum of the blobs that contribute to the RWI. An addition of the blob sizes is incorrect and does not reflect the real size. Truncating the size computation to the maximum of all blobs is also incorrect, but not as wrong as the sum of all blob sizes, which double-counts many rwi entries.
16 years ago
orbiter 88426912ad more refactoring to make the segment object easier to use and to be prepared to integrate author navigation
16 years ago
orbiter 99bf0b8e41 refactoring of plasmaWordIndex:
16 years ago
orbiter 09987e93fd fixed some more bad handling of byte[]
16 years ago
orbiter c8624903c6 full redesign of index access data model:
16 years ago
orbiter 89ec3acb3e - full abstraction of index content type: the kelondro full text index may now also contain indexes about other content than text, e.g. navigation indexes or reverse linking indexes.
16 years ago