58 Commits (70ba74b23a8e470635c5f6fc1324e38898e696a3)

Author SHA1 Message Date
Michael Peter Christen 61c5e40687 - replaced the properties object in AnchorURL with distinct variables
11 years ago
Michael Peter Christen 5e31bad711 - the webgraph shall store all links which appear on a web page and not
11 years ago
Michael Peter Christen cf12835f20 replaced the single-text description solr field with a multi-value
11 years ago
Michael Peter Christen 5878c1d599 - refactoring of log to ConcurrentLog:
12 years ago
Michael Peter Christen 7ab5093321 added new solr title_exact_signature_l and
12 years ago
Michael Peter Christen addba047e2 changes in ranking computation
12 years ago
Michael Peter Christen 788288eb9e added the generation of 50 (!!) new solr fields in the core 'webgraph'.
12 years ago
reger 3897bb4409 added (manual) urldb migration (link on: Index Administration -> Federated Solr Index)
12 years ago
Michael Peter Christen 34f8786508 removed dependency of vocabulary navigation from Jena and its
12 years ago
Michael Peter Christen 72f165d58b added a Boost class which stores solr query boost values. The class can
12 years ago
Michael Peter Christen d6b82840f8 added a feature to find similarities in documents.
12 years ago
Michael Peter Christen 5f0ab25382 removed the option to prevent removal of & parts inside of the
12 years ago
Michael Peter Christen 3d33a5bdf6 turned the synonyms_t Text field into a multi-valued String field
12 years ago
orbiter 3190347814 added a synonyms_t field to solr and a process to read synonym files.
12 years ago
Michael Peter Christen 8219a445f3 refactoring
12 years ago
orbiter 63762d8f89 removed kelondro dependencies from cora
12 years ago
orbiter d9173ba7ed added more solr fields to integrate values from URIMetadataRow. All
12 years ago
orbiter 0cbda0b2b8 - replaced all length() == 0 and size() == 0 with isEmpty()
13 years ago
Michael Peter Christen 801972fe6f fix for url camel case parser and sentence reader
13 years ago
orbiter 78fc3cf8f8 refactoring and new usage of SentenceReader: this class appeared as one
13 years ago
Michael Peter Christen 94d54e2d91 added recognition of multi-word terms in vocabulary matching
13 years ago
Michael Peter Christen 8b53771db2 changed behavior of navigation processing:
13 years ago
Michael Peter Christen 5fc6524ca8 - moved triple store to net.yacy.cora.lod (should be generalized there
13 years ago
Michael Peter Christen e0d8643226 - performance hacks
13 years ago
Michael Peter Christen f8cd57c92f new indexing strategy: ALL links that appear anywhere are indexed, not
13 years ago
Michael Peter Christen a58dc4a91f added autotagging to document condenser:
13 years ago
Michael Peter Christen 254adea51c small fixes
13 years ago
Al Sutton 8993cac4d8 Initial performance improvements
13 years ago
orbiter 0d858d48ec replaced String with StringBuilder in suggestion process
13 years ago
orbiter 4bea3f9714 hack to reduce resource contention caused by massive UTF8 decodings which use java.nio resources:
14 years ago
orbiter deda54d684 - relaxed matching of string-search (this is now case-insensitive)
14 years ago
orbiter 15e3a57b4e removed unused functions in condenser
14 years ago
orbiter c17d102bd8 enhanced speed for OrderedScoreMap inc method and size comparison in concurrent environments
14 years ago
orbiter 156cf02703 - added an index constraint 'has location' to the condenser
14 years ago
orbiter f3baaca920 - enhancements to DNS IP caching and crawler speed
14 years ago
orbiter 694fa3a2a5 - replaced more direct string-based UTF-8 conversions by predefined UTF-8 conversion
14 years ago
orbiter cb1f49d0f2 replaced all 'new String' calls that used the default encoding (charset missing) or UTF-8 encoding with a String generation method that uses a predefined Charset constant for UTF-8. This avoids a cache lookup for the Charset object using String hashing of the String 'UTF-8'.
14 years ago
orbiter 5892fff51f introduction of dht-burst modes: this can expand the number of target peers in some cases where a better heuristic is needed. The problematic cases are either when a multi-word search is made (still a hard case for our term-oriented DHT) or when a network operator wants all robinson peers to be asked. We therefore introduced two new network steering values that switch on more peers during the peer selection. Because the number of peers can now be very large, the maximum number of httpc connections was also increased.
14 years ago
orbiter 4588b5a291 - fixed document number limitation for crawls that restrict the number of documents per domain
14 years ago
orbiter 3ca06d6290 patch for http://forum.yacy-websuche.de/viewtopic.php?p=21460#p21460
14 years ago
orbiter 4e2c14efbb fixed bugs in parser and ftp client
14 years ago
orbiter f0651e5f2f added image search to yacyinteractive.html
14 years ago
orbiter b769cce433 - added a catch-all parser for all documents that cannot be parsed: they will contribute only their document url to the search index
14 years ago
low012 9b3fae9496 *) cleaning up the code a little bit
14 years ago
orbiter 58e74282af added a word counter statistic in condenser which is used by the did-you-mean feature to calculate best matches for given search words.
14 years ago
orbiter 0d363a94d7 more performance hacks
14 years ago
orbiter 570ca577c6 performance hacks
14 years ago
orbiter b6fb239e74 redesign of parser interface:
15 years ago
orbiter 5a4684f21f allow words with length >= 2 (you can't search for 'wm' with 3-letter words...)
15 years ago
orbiter 11639aef35 - added new protocol loader for 'file'-type URLs
15 years ago