Commit Graph

445 Commits (b7417ac329d1327770c41bb3c9e46266f9dd536a)

Author SHA1 Message Date
Michael Peter Christen 0550b54d56 added fix to postprocessing: avoid caching of postprocessing collection 11 years ago
Michael Peter Christen 1db476c67e fix for bad table iteration 11 years ago
orbiter 3ffe19b85c replaced old /api/table_p.xml servlet with /Tables_p.xml to avoid double 11 years ago
Michael Peter Christen 07c5b57953 removed warnings 11 years ago
reger f5967dfedf add filter to citation page and an on/off button 11 years ago
Michael Peter Christen 0bfc69b29b more ipv6 bugfixes 11 years ago
Marc Nause 1e6e69bc40 Finished implementation of UPNP: 11 years ago
orbiter 3ac31614a3 added option to reverse-sort YaCy tables (internal API change only) 11 years ago
Michael Peter Christen 2a52c6f0f1 using htroot/api/blacklists as source folder: removed package 11 years ago
reger 6654d314f1 add rss version to api/feed.rss 11 years ago
orbiter 2371d6b8db target linktexts must be string to enable search facets on these fields 11 years ago
orbiter 22ce4fb4dd better error handling for remote solr queries and exists-checks 11 years ago
Michael Peter Christen 2de159719b added an option to set 'obey nofollow' for links with rel="nofollow" 11 years ago
Michael Peter Christen 8514bffc22 enhanced postprocessing status report 11 years ago
orbiter 59160984cc timeline performance update 11 years ago
orbiter 2073e69034 fix for long periods in timeline 11 years ago
Michael Peter Christen 8c52f0651b refactoring of AccessTracker events & timeline fix 11 years ago
Michael Peter Christen 74206a10c7 refactoring 11 years ago
Michael Peter Christen 36e623d8bf enhanced metadata enrichment for media file type search: 11 years ago
Michael Peter Christen 8fd72b5e8b Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 11 years ago
Michael Peter Christen 81d0f01a6f added 'synchronous' and 'commit' flags in push api 11 years ago
Marc Nause f443cfa32d Improvements and bugfixes for recording actions of blacklist API. 11 years ago
orbiter 4177c9cf05 fix for crawl start check 11 years ago
Michael Peter Christen 74c249288a added a push api to make it possible to upload files directly without 11 years ago
Michael Peter Christen b3b174e2b8 fixed webgraph postprocessing and status display in Crawler_p servlet 11 years ago
Michael Peter Christen 2520590b45 migrated from pdfbox 1.8.4 to 1.8.5. They have a very long bugfix list 11 years ago
Marc Nause 4723329e29 Improved blacklist XML/JSON API. 11 years ago
orbiter 0d8072aa99 removed warnings 11 years ago
Marc Nause f98ccf952f Improved Blacklist API: 11 years ago
Marc Nause 0d88f292dc Key for parameter "blacklist name" is "list" in all servlets now. 11 years ago
Marc Nause c97da1a0d8 First draft of a blacklist API. 11 years ago
reger 727dfb5875 refactor URIMetadataNode to further unify interaction with index 11 years ago
Michael Peter Christen dd12dd392f introduction of a data structure for HyperlinkEdges which should use 11 years ago
Michael Peter Christen a37d067692 refactoring 11 years ago
orbiter c250fac9f4 linkstructure refactoring to get more options for clickdepth analysis 11 years ago
Michael Peter Christen bd886054cb new structure and enhancements for link graph computation: 11 years ago
Michael Peter Christen e8ddd415a8 enhanced the new link structure graph 11 years ago
Michael Peter Christen 7f5733638b fix for linkstructure computation: now also detecting dead links 11 years ago
orbiter 18f9c40302 moved Edge class out of linkstructure servlet as this does not work on 11 years ago
Michael Peter Christen a6bb9be97e - added d3.js for visualizations using embedded svg 11 years ago
Michael Peter Christen 48fbfa60c1 bugfix to inbound/outbound identification 11 years ago
Michael Peter Christen a3b7366aee Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 11 years ago
reger 92811d7850 fix: 3 more links pointing to old /xml path 11 years ago
Michael Peter Christen 656e2ce62a replacing direct html table cellspacing with css set-up for cellspacing 11 years ago
orbiter f8f88d4e81 replaced pdblue-homebrew buttons with bootstrap standard buttons 11 years ago
Michael Peter Christen 85a427ec54 support for multiple sitemaps in robots.txt 11 years ago
Michael Peter Christen bcd9dd9e1d enhanced concurrent loading by using a fixed set of concurrent loader 11 years ago
Michael Peter Christen fdaeac374a - enhanced postprocessing speed and memory footprint (by using HashMaps 11 years ago
Michael Peter Christen 1bbc0fe6d2 added a properties file format for the status_p api to support reading 11 years ago
Michael Peter Christen e40511f307 extended the status_p api with disk space information 11 years ago
Michael Peter Christen 0f6b72f24b do not use luke requests for remote solr servers if the result is 11 years ago
orbiter f6e441dd77 refactoring 11 years ago
Michael Peter Christen 6e59ca4ebf removed jena library and all code that depended on jena. When jena was 11 years ago
reger 193b8235c2 remove double jquery-1.3.1.js and adjust header links to jquery-1.3.2 11 years ago
Michael Peter Christen 77531850b5 reverted crawling strategy from latest commit. 11 years ago
Michael Peter Christen c0da966dfa enhanced crawler speed 11 years ago
reger 97e84439fb adjusted ConfigHeuristic and changed QueryGoal.getOriginalQueryString to .getQueryString 11 years ago
reger e05320b776 upd: to open more external links in new browser-tab 11 years ago
Michael Peter Christen 74466d731a use pre-compiled patterns in ymark 11 years ago
Michael Peter Christen 0db8e34625 enhanced webgraph processing 11 years ago
orbiter 19a051bec8 more monitoring for postprocessing and enhanced layout in Crawler 12 years ago
Michael Peter Christen fceac8cffd more monitoring for postprocessing 12 years ago
Michael Peter Christen 9d5895f643 enhanced and fixed postprocessing 12 years ago
Michael Peter Christen 1a4a69c226 set more logger to 'final static' 12 years ago
Michael Peter Christen 5e31bad711 - the webgraph shall store all links which appear on a web page and not 12 years ago
Michael Peter Christen 765943a4b7 Redesign of crawler identification and robots steering. A non-p2p user 12 years ago
Michael Peter Christen 76afcccaaf fix for default boolean post values: the default value MUST NOT be TRUE, 12 years ago
orbiter 252c525709 fixed feed api servlet and enhanced RSSReader class 12 years ago
Michael Peter Christen 58fe986cca Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 12 years ago
Michael Peter Christen cf12835f20 replaced the single-text description solr field with a multi-value 12 years ago
sixcooler 7d53ac86a3 fix for Blacklist (-Administration) 12 years ago
Roland Haeder e2ee412160 Use SwitchboardConstants.LISTS_PATH_DEFAULT instead of 'DATA/LISTS' 12 years ago
Roland Haeder 59225487ea Fix for blacklist export, also applied the filename filter here 12 years ago
Michael Peter Christen 4c242f9af9 always use a default value for boolean options to have transparency for 12 years ago
orbiter 86b514cf46 added load info to status_p.xml 12 years ago
orbiter 056b42f5aa - added information about segment count to status_p.xml 12 years ago
orbiter 232100301c removed double-occurring value assignments 12 years ago
Roland Haeder 841a28ae76 Added 'final' for all exception blocks as this helps the Java compiler 12 years ago
Roland Haeder ebbb3bc5c1 Fixed CHMOD on many files + added missing loggers (e.g. jena) and made some noisy loggers quiet 12 years ago
Michael Peter Christen bcc623a843 refactoring of load_delay: this is a matter of client identification 12 years ago
orbiter 2be456e7fb added a postprocessing field into api/status_p.xml to show if the 12 years ago
orbiter c4efb612e2 added list of crawls to status_p.xml 12 years ago
orbiter dac88561ae minimum access time has a tight connection to ClientIdentification, 12 years ago
Michael Peter Christen 5878c1d599 - refactoring of log to ConcurrentLog: 12 years ago
orbiter c8e94ad7c7 fix for citation search in case that the citation is very fresh 12 years ago
Michael Peter Christen fd1776a3b0 added a new 'Citations' function: each search result item can now be 12 years ago
Michael Peter Christen 8f2d3ce2f9 reduced locking situation in crawler: shifted synchronized location and 12 years ago
Michael Peter Christen 038f956821 fix for sitemap detection: the sitemap url was not visible if it 12 years ago
Michael Peter Christen 008288719c fix for schema export to consider also automatically generated 12 years ago
Michael Peter Christen 58e1e6fa2b fixes to schema 12 years ago
Michael Peter Christen 788288eb9e added the generation of 50 (!!) new solr fields in the core 'webgraph'. 12 years ago
Michael Peter Christen 91a0401d59 introduced a second core named 'webgraph'. This core will hold the link 12 years ago
Michael Peter Christen b6de1f42dc Full redesign of solr connection architecture. This was done to support 12 years ago
Michael Peter Christen dee8b24d3c better error handling for bookmarks 12 years ago
Michael Peter Christen 3834829b37 bugfixes and more logging for solr connector 12 years ago
Michael Peter Christen 99185d7048 one more fix for author_sxt 12 years ago
Michael Peter Christen b6ae6262f6 - add the copyField author_sxt only if author exists 12 years ago
Michael Peter Christen e23a596c1d added a copyField for author_sxt for automated schema generation 12 years ago
Michael Peter Christen 244b157299 fix for external solr schema definition 12 years ago
reger f301336adf fix: no results with configuration citation reference index switched off 12 years ago