Commit Graph

331 Commits (78e7aadb26ad38c30daa1a845b2d9cee3843c853)

Author SHA1 Message Date
Michael Peter Christen 5e31bad711 - the webgraph shall store all links which appear on a web page and not
11 years ago
Michael Peter Christen 765943a4b7 Redesign of crawler identification and robots steering. A non-p2p user
11 years ago
Michael Peter Christen 76afcccaaf fix for default boolean post values: the default value MUST NOT be TRUE,
11 years ago
orbiter 252c525709 fixed feed api servlet and and enhanced RSSReader class
11 years ago
Michael Peter Christen 58fe986cca Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
11 years ago
Michael Peter Christen cf12835f20 replaced the single-text description solr field with a multi-value
11 years ago
sixcooler 7d53ac86a3 fix for Blacklist (-Administration)
11 years ago
Roland Haeder e2ee412160 Use SwitchboardConstants.LISTS_PATH_DEFAULT instead of 'DATA/LISTS'
11 years ago
Roland Haeder 59225487ea Fix for blacklist export, also applied the filename filter here
11 years ago
Michael Peter Christen 4c242f9af9 always use a default value for boolean options to have transparency for
11 years ago
orbiter 86b514cf46 added load info to status_p.xml
11 years ago
orbiter 056b42f5aa - added information about segment count to status_p.xml
11 years ago
orbiter 232100301c removed double-ocurring value assignments
11 years ago
Roland Haeder 841a28ae76 Added 'final' for all exception blocks as this helps the Java compiler
11 years ago
Roland Haeder ebbb3bc5c1 Fixed CHMOD on many files + added missing loggers (e.g. jena) and made some noisy loggers quiet
12 years ago
Michael Peter Christen bcc623a843 refactoring of load_delay: this is a matter of client identification
12 years ago
orbiter 2be456e7fb added a postprocessing field into api/status_p.xml to show if the
12 years ago
orbiter c4efb612e2 added list of crawls to status_p.xml
12 years ago
orbiter dac88561ae minimum access time has a tight connection to ClientIdentification,
12 years ago
Michael Peter Christen 5878c1d599 - refactoring of log to ConcurrentLog:
12 years ago
orbiter c8e94ad7c7 fix for citation search in case that the citation is very fresh
12 years ago
Michael Peter Christen fd1776a3b0 added a new 'Citations' function: each search result item can now be
12 years ago
Michael Peter Christen 8f2d3ce2f9 reduced locking situation in crawler: shifted synchronized location and
12 years ago
Michael Peter Christen 038f956821 fix for sitemap detection: the sitemap url was not visible if it
12 years ago
Michael Peter Christen 008288719c fix for schema export to consider also automatically generated
12 years ago
Michael Peter Christen 58e1e6fa2b fixes to schema
12 years ago
Michael Peter Christen 788288eb9e added the generation of 50 (!!) new solr field in the core 'webgraph'.
12 years ago
Michael Peter Christen 91a0401d59 introduced a second core named 'webgraph'. This core will hold the link
12 years ago
Michael Peter Christen b6de1f42dc Full redesign of solr connection architecture. This was done to support
12 years ago
Michael Peter Christen dee8b24d3c better error handling for bookmarks
12 years ago
Michael Peter Christen 3834829b37 bugfixes and more logging for solr connector
12 years ago
Michael Peter Christen 99185d7048 one more fix for author_sxt
12 years ago
Michael Peter Christen b6ae6262f6 - add the copyField author_sxt only if author exists
12 years ago
Michael Peter Christen e23a596c1d added a copyField for author_sxt for automated schema generation
12 years ago
Michael Peter Christen 244b157299 fix for external solr schema definition
12 years ago
reger f301336adf fix: no results with configuration citation reference index switched off
12 years ago
Michael Peter Christen cb5cbec14d distinguishing modified query string and original query string
12 years ago
Michael Peter Christen 3de784c8dd replaced more split and replaceAll missing pattern pre-compilation with
12 years ago
Michael Peter Christen 8fc3679c66 using more pre-compile pattern for split methods
12 years ago
Michael Peter Christen 4eab3aae60 removed overhead by preventing generation of full search results when
12 years ago
Michael Peter Christen 952e143580 FINALLY YaCy can now search for full strings using double- or
12 years ago
orbiter 5dfd6359cb redesign of the QueryParams class: introduced QueryGoal which holds the
12 years ago
Michael Peter Christen 5fd3b93661 added deletion of hosts during crawl start if deleteold option was given
12 years ago
Michael Peter Christen d64445c3cb because we have the inurl:<term> - searchmodifier, we don't actually
12 years ago
Michael Peter Christen 2d9e577ad0 replaced the custom robots.txt loader by the standard http loader
12 years ago
Michael Peter Christen ccc3760a47 Refactoring and redesign of data architecture to make URIMetadataRow
12 years ago
Michael Peter Christen 43f3345c90 - removed dependencies from URIMetadataRow and made direct access to
12 years ago
Michael Peter Christen 21fe8339b4 - enhanced generation of url objects
12 years ago
Michael Peter Christen 5f0ab25382 removed the option to prevent removal of &amp; parts inside of the
12 years ago
Michael Peter Christen abab291162 made the index schema retrieval public and allow cross-domain retrieval
12 years ago