Commit Graph

184 Commits (3c71e1c8728c567abecfed8a89f43b0df5dbcf60)

Author SHA1 Message Date
Michael Peter Christen 06a78eecb7 code simplification
12 years ago
Michael Peter Christen 18f989dfb1 - refactoring (load -> getMetadata)
12 years ago
orbiter e816b88b55 changed behaviour of metadata storage: in case that any solr is
12 years ago
Michael Peter Christen 1687737771 Abstraction of HandleMap and HandleSet
12 years ago
Michael Peter Christen 76202f068e extended abstraction of local and remote solr index using one front-end
12 years ago
orbiter 69e743d9e3 - more abstraction for the RWI index as preparation for solr integration
12 years ago
orbiter 05a3ffd03a patches to ensure that solr connectors are active ony if they have a
13 years ago
orbiter 0cbda0b2b8 - replaced all length() == 0 and size() == 0 with isEmpty()
13 years ago
orbiter c7afa8bc48 using SwitchboardConstants for solr attributes
13 years ago
sixcooler 2c5b68d932 more abstraction of error message
13 years ago
Michael Peter Christen 9758c521ab abstraction of error message
13 years ago
sixcooler 9b6e4e46ca fix for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=4430
13 years ago
Michael Peter Christen d3964253ae - added @SuppressWarnings to unused servlet method parameters
13 years ago
Michael Peter Christen 03280fb161 removed segments-concept and the Segments class:
13 years ago
Michael Peter Christen b9dfca4b0a - fixed IndexFederated Servlet / a embedded Solr can now be selected
13 years ago
Michael Peter Christen 2bbb6c52cf added option to clean the triplestore when deleting the index
13 years ago
Roland 'Quix0r' Haeder edaa09b9b1 Rewrote all String blacklist types to enum 'BlacklistType', closes bug
13 years ago
Michael Peter Christen ab7107b34b fixed RWIProcess queue limits: now discovering hidden results for mass
13 years ago
Michael Peter Christen a1fe65b115 performance hacks
13 years ago
reger 6696cb1313 bugfix: lookup of peernames no result for active peer in page IndexControlRWIs_p.html -> Transfer RWI to other Peer
13 years ago
Michael Peter Christen 4298f00d2d fixed bad usage of given words
13 years ago
Michael Peter Christen 33d1062c79 refactoring: the cache belongs to the crawler
13 years ago
Michael Christen eff966f396 fix for search process (it was aborted too early during remote search)
13 years ago
Michael Christen 85bd4cc8bc better lookup for peer names
13 years ago
Michael Christen 9e5894c784 Removed handling of components objects for URIMetadataRows.
13 years ago
Michael Christen e9dc99fe15 added rules to set specific RWIs as private RWIs which are not
13 years ago
Michael Christen f14faf503b better ranking because we wait a very little time during the search
13 years ago
orbiter a7df70221e refactoring
13 years ago
orbiter 2c3161b4ac refactoring:
13 years ago
orbiter d2ea250d99 refactoring:
13 years ago
orbiter 85a5487d6d YaCy can now use the solr index to compute text snippets. This makes search result preparation MUCH faster because no document fetching and parsing is necessary any more.
13 years ago
orbiter cec3836e73 added reference limitation to IndexControlRWIs_p.html servlet
13 years ago
orbiter 2c595a6a47 added new methods to count the number of objects in RWIs. lots of refactoring was necessary to introduce new Rating class and to unify naming of methods
13 years ago
orbiter 115abc8917 - more attributes for search progress bar
14 years ago
orbiter 535b6b953c more hacks to omit superfluous string object allocation
14 years ago
orbiter 87082f407e less String object creation during search
14 years ago
orbiter 4bea3f9714 hack to reduce resource contention caused by massive UTF8 decodings which use java.nio resources:
14 years ago
orbiter 10e2f588f8 - enhanced ybr ranking computation
14 years ago
orbiter b45701d20f this is a re-implementation of the YaCy Block Rank feature
14 years ago
orbiter 123375bfba added a new yacy protocol servlet 'idx'. This returns an index to one of the data entities that is stored in YaCy.
14 years ago
orbiter 5b579e21a3 code cleanup
14 years ago
orbiter 039126cfaf better handling of on/off switched solr indexing
14 years ago
orbiter 3d5104d357 - fixed a bug in crawl start with file name (npe in new url)
14 years ago
low012 2861d0888a *) simplified code\n*) fixed potential NumberFormatExceptions
14 years ago
orbiter 30aed9824a moved getBytes() to UTF8.getBytes() to use a default String encoding
14 years ago
orbiter cb1f49d0f2 replaced all 'new String' with default encoding (missing) or UTF-8 encoding with a String generation method that uses a pre-defined Charset constant for UTF-8. This avoids a cache-lookup for the Charset object using String hashing of the String 'UTF-8'.
14 years ago
orbiter 4588b5a291 - fixed document number limitation for crawls that restrict the number of documents per domain
14 years ago
f1ori fafab7a8fe * provide option to delete cached snippet fetching failures
14 years ago
orbiter 10ae8d961b - cora package has now no dependencies to other yacy packages and becomes a 'base' package (refactoring)
14 years ago
low012 38fdf43587 *) renamed classes according to standard Java coding conventions
14 years ago
orbiter 790e0b1894 - enhanced index deletion in IndexControlRWIs_p: delete also robots.txt database and cache if demanded
14 years ago
orbiter 863065abc4 added user agent logging to access tracker
14 years ago
orbiter aacf572a26 - enhancements for search speed
14 years ago
orbiter d5dc88a351 shop cleanup button only if servlet was called without post/put arguments.
14 years ago
orbiter 97ee278931 enhanced search speed:
14 years ago
orbiter 3197ca42ed preparations to move the HTCache into cora:
14 years ago
orbiter 7bcfa033c9 more abstraction of the htcache when using the LoaderDispatcher:
15 years ago
orbiter b18a7606a0 some performance hacks and fixed after reading dump in
15 years ago
orbiter 1a8a134e0c continuing String-hash - to - byte[]-hash redesign that was started in SVN 6775 and continued in SVN 6790
15 years ago
orbiter 25aef069a6 continuing String-hash - to - byte[]-hash redesign that was started in SVN 6775
15 years ago
orbiter 1e8e79b9ef redesign of reference hash (URL-hash) parameter hand-over:
15 years ago
orbiter 727dd9b193 - fixed a bug in robots.txt parser
15 years ago
orbiter 564927ce72 redesign of CrawlResult data structures because of OOM occurrences during URL deletion processes.
15 years ago
orbiter 362b7a929b added extensive memory protection logic to avoid out of memory errors that may be caused by the RowCollection memory allocation function
15 years ago
orbiter 4782d2c438 fix for search bug that appeared when looking at page 3 of results or further
15 years ago
orbiter 4a5100789f replaced _all_ size() == 0 with isEmpty() and all size() > 0 with !isEmpty(). The isEmpty() method is much faster in some cases, especially when used to access badly balanced hashtables where an size() operation becomes a large iteration.
15 years ago
orbiter 491ba6a1ba - some refactoring in workflow
15 years ago
orbiter 5399d1e2bc refactoring (reason: get more abstraction to use the blacklist class; for integration in other servlets)
15 years ago
orbiter 4431b9767e added about 450 replacements for printStackTrace() methods to pipe such traces into the log at DATA/LOG/
15 years ago
orbiter 52470d0de4 - fix for xls parser
15 years ago
orbiter 5e8038ac4d - refactoring of blacklists
15 years ago
orbiter b79f4f062f refactoring of yacy documents and parsers: they depend now only on the kelondro classes
15 years ago
orbiter 5841ee83d3 refactoring
15 years ago
orbiter ce8dc575ca refactoring
15 years ago
orbiter bea3b99aff moved table and util classes
15 years ago
orbiter 1e4f8b56ed accumulated classes from different packages into the new rwi package
15 years ago
orbiter 4446acc8cd moved kelondro order
15 years ago
orbiter 735e2737e3 * added index segments
15 years ago
low012 5e4f267a36 *) added subversion properties and edited a few comments
15 years ago
orbiter af3a696fc4 added a fast-fail concept in search processes. The search now has better control if all the remote searches may bring any result. If all processes are finished, then all search tasks fail fast.
15 years ago
orbiter 61748285c3 more refactoring of search
15 years ago
orbiter 72ac5bd80f refactoring of search process.
15 years ago
orbiter 1d8d51075c refactoring:
16 years ago
orbiter 5bb8074150 removed the indexing queue. This queue was superfluous since the introduction of the blocking queues last year, where documents are parsed, analysed and stored in the index with concurrency.
16 years ago
orbiter 0e8647d62f refactoring of search classes
16 years ago
orbiter dafffd0153 refactoring of parsers and document processing
16 years ago
orbiter bc6dd8194b refactoring: moved search query class to new search package
16 years ago
orbiter 945777aa80 replaced rwi term counting method by one that computes the maximum of the blobs that contibute to the RWI. An addition of the blob sizes is wrong/incorrect and does not reflect the real size. Truncation the size operation to the maximum of all blobs is also incorrect, but not as wrong as the sum of all blob sizes wich double-counts many rwi entries.
16 years ago
orbiter 88426912ad more refactoring to make the segment object easier to use and to be prepared to integrate author navigation
16 years ago
orbiter 99bf0b8e41 refactoring of plasmaWordIndex:
16 years ago
orbiter 09987e93fd fixed some more bad handling of byte[]
16 years ago
orbiter c8624903c6 full redesign of index access data model:
16 years ago
orbiter 89ec3acb3e - full abstraction of index content type: the kelondro full text index may now also contain indexes about other content than text, i.e. navigation indexes or reverse linking indexes.
16 years ago
orbiter 44e01afa5b - refactoring
16 years ago
orbiter c2359f20dd refactoring: better abstraction of reference and metadata prototypes.
16 years ago
orbiter 83792d9233 more refactoring
16 years ago
orbiter 209f25f5f5 refactoring to integrate indexCell data structures
16 years ago
orbiter 7f67238f8b refactoring of plasmaWordIndex: less methods in the class, separated the index to CachedIndexCollection
16 years ago
orbiter 14a1c33823 refactoring of wordIndex class
16 years ago
orbiter aa44d9bad9 more refactoring of kelondro.text / deleted de.anomic.index
16 years ago