351 Commits (d1091e79f83591502fdc08444aca84b733300a71)

Author SHA1 Message Date
orbiter f8f88d4e81 replaced pdblue-homebrew buttons with bootstrap standard buttons
11 years ago
Michael Peter Christen 85a427ec54 support for multiple sitemaps in robots.txt
11 years ago
Michael Peter Christen bcd9dd9e1d enhanced concurrent loading by using a fixed set of concurrent loader
11 years ago
Michael Peter Christen fdaeac374a - enhanced postprocessing speed and memory footprint (by using HashMaps
11 years ago
Michael Peter Christen 1bbc0fe6d2 added a properties file format for the status_p api to support reading
11 years ago
Michael Peter Christen e40511f307 extended the status_p api with disk space information
11 years ago
Michael Peter Christen 0f6b72f24b do not use luke requests for remote solr servers if the result is
11 years ago
orbiter f6e441dd77 refactoring
11 years ago
Michael Peter Christen 6e59ca4ebf removed jena library and all code that depended on jena. When jena was
11 years ago
reger 193b8235c2 remove double jquery-1.3.1.js and adjust header links to jquery-1.3.2
11 years ago
Michael Peter Christen 77531850b5 reverted crawling strategy from latest commit.
11 years ago
Michael Peter Christen c0da966dfa enhanced crawler speed
11 years ago
reger 97e84439fb adjusted ConfigHeuristic and changed QueryGoal.getOriginalQueryString to .getQueryString
11 years ago
reger e05320b776 upd: to open more external links in new browser-tab
11 years ago
Michael Peter Christen 74466d731a use pre-compiled patterns in ymark
11 years ago
Michael Peter Christen 0db8e34625 enhanced webgraph processing
11 years ago
orbiter 19a051bec8 more monitoring for postprocessing and enhanced layout in Crawler
11 years ago
Michael Peter Christen fceac8cffd more monitoring for postprocessing
11 years ago
Michael Peter Christen 9d5895f643 enhanced and fixed postprocessing
11 years ago
Michael Peter Christen 1a4a69c226 set more logger to 'final static'
11 years ago
Michael Peter Christen 5e31bad711 - the webgraph shall store all links which appear on a web page and not
11 years ago
Michael Peter Christen 765943a4b7 Redesign of crawler identification and robots steering. A non-p2p user
11 years ago
Michael Peter Christen 76afcccaaf fix for default boolean post values: the default value MUST NOT be TRUE,
11 years ago
orbiter 252c525709 fixed feed api servlet and enhanced RSSReader class
11 years ago
Michael Peter Christen 58fe986cca Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
11 years ago
Michael Peter Christen cf12835f20 replaced the single-text description solr field with a multi-value
11 years ago
sixcooler 7d53ac86a3 fix for Blacklist (-Administration)
11 years ago
Roland Haeder e2ee412160 Use SwitchboardConstants.LISTS_PATH_DEFAULT instead of 'DATA/LISTS'
11 years ago
Roland Haeder 59225487ea Fix for blacklist export, also applied the filename filter here
11 years ago
Michael Peter Christen 4c242f9af9 always use a default value for boolean options to have transparency for
11 years ago
orbiter 86b514cf46 added load info to status_p.xml
11 years ago
orbiter 056b42f5aa - added information about segment count to status_p.xml
11 years ago
orbiter 232100301c removed double-occurring value assignments
11 years ago
Roland Haeder 841a28ae76 Added 'final' for all exception blocks as this helps the Java compiler
11 years ago
Roland Haeder ebbb3bc5c1 Fixed CHMOD on many files + added missing loggers (e.g. jena) and made some noisy loggers quiet
12 years ago
Michael Peter Christen bcc623a843 refactoring of load_delay: this is a matter of client identification
12 years ago
orbiter 2be456e7fb added a postprocessing field into api/status_p.xml to show if the
12 years ago
orbiter c4efb612e2 added list of crawls to status_p.xml
12 years ago
orbiter dac88561ae minimum access time has a tight connection to ClientIdentification,
12 years ago
Michael Peter Christen 5878c1d599 - refactoring of log to ConcurrentLog:
12 years ago
orbiter c8e94ad7c7 fix for citation search in case that the citation is very fresh
12 years ago
Michael Peter Christen fd1776a3b0 added a new 'Citations' function: each search result item can now be
12 years ago
Michael Peter Christen 8f2d3ce2f9 reduced locking situation in crawler: shifted synchronized location and
12 years ago
Michael Peter Christen 038f956821 fix for sitemap detection: the sitemap url was not visible if it
12 years ago
Michael Peter Christen 008288719c fix for schema export to consider also automatically generated
12 years ago
Michael Peter Christen 58e1e6fa2b fixes to schema
12 years ago
Michael Peter Christen 788288eb9e added the generation of 50 (!!) new solr fields in the core 'webgraph'.
12 years ago
Michael Peter Christen 91a0401d59 introduced a second core named 'webgraph'. This core will hold the link
12 years ago
Michael Peter Christen b6de1f42dc Full redesign of solr connection architecture. This was done to support
12 years ago
Michael Peter Christen dee8b24d3c better error handling for bookmarks
12 years ago