4503 Commits (69599566f913b137702da4c413e777ff788de54a)

Author SHA1 Message Date
Michael Peter Christen 5e31bad711 - the webgraph shall store all links which appear on a web page and not
12 years ago
Michael Peter Christen 3e22d05290 added option for daterange properties in GSA interface to use an left-
12 years ago
Michael Peter Christen 35ab2cef7b added parsing of 'date', 'dc:date', 'dc.date' and 'last-modified' in
12 years ago
Michael Peter Christen dbef8ccfcb forced deletion of ZURL entries for a specific host for each host that
12 years ago
Michael Peter Christen e137ff4171 refactoring (in preparation for new removeHost method)
12 years ago
Michael Peter Christen 9e12fdff23 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
Michael Peter Christen 049c3b3f2e added an option to exclude image search results from text search. This
12 years ago
Michael Peter Christen 5d71a4c8bc fix for dc:description field
12 years ago
reger 392174de8c remove all_words, all_strings lists from QueryGoal
12 years ago
Michael Peter Christen cb85b22725 redesign of the image search process (with much better results,
12 years ago
Michael Peter Christen 6184fd9d9a fix for solr/gsa result logging
12 years ago
reger 29967102a2 optimized QueryGoal (reducing mem and computation by removing all_hashes)
12 years ago
orbiter f106345eef link strings should not be tokenized
12 years ago
orbiter 5b14bdfffd npe fix
12 years ago
orbiter 1ca4b9612c added special handling of the BinaryResponseWriter in the solr interface
12 years ago
Michael Peter Christen a88a62f7aa added a feature to set a collection for a crawl result based on a
12 years ago
Michael Peter Christen 765943a4b7 Redesign of crawler identification and robots steering. A non-p2p user
12 years ago
Michael Peter Christen 47b1c81d08 - refactoring
12 years ago
Michael Peter Christen e6b423c4d9 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
reger 94bec24d14 add back menu to Surftips page (currently no menu is displayed)
12 years ago
Michael Peter Christen 1f299b0d42 removed link.gif as link button because this image is now shown
12 years ago
Michael Peter Christen 48ddd50a6c html fix
12 years ago
reger 96ae332427 revert del _blank (last commit) in template
12 years ago
reger 43348a98a9 add some href target=_blank to ext. links with external icon
12 years ago
reger 82d81a57bd info msg if no embedded Solr http://bugs.yacy.net/view.php?id=279
12 years ago
reger 02fe8b43ba Field Re-Indexing: display list of fields in reindex queue
12 years ago
sixcooler 7f501b7c38 clear some caches before reporting low Memory
12 years ago
reger 070bf85b33 css fix for IE10 showing border on all img within <a /> tag since introduction of external link icon (commit 112836dcc9)
12 years ago
sixcooler 8a96140f92 fix / workaround for
12 years ago
Michael Peter Christen 2674d28ef4 protection against self-ping (may be caused by fraud attempts)
12 years ago
orbiter f3d001c7ab more space in the about section
12 years ago
Michael Peter Christen e879b97b0a added line to enhance debugging
12 years ago
Michael Peter Christen 76afcccaaf fix for default boolean post values: the default value MUST NOT be TRUE,
12 years ago
orbiter 252c525709 fixed feed api servlet and enhanced RSSReader class
12 years ago
Marc Nause 112836dcc9 Improved external links.
12 years ago
Marc Nause d64a094f0e External links in HTML interface are marked as external with small icon.
12 years ago
Michael Peter Christen 58fe986cca Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
Michael Peter Christen cf12835f20 replaced the single-text description solr field with a multi-value
12 years ago
sixcooler 7d53ac86a3 fix for Blacklist (-Administration)
12 years ago
orbiter f425b2c61c re-try to fetch url after a soft commit
12 years ago
orbiter bf0ad04e1b apply load limitation also to dht-in
12 years ago
Roland Haeder b58ca8622d Some cleanups:
12 years ago
Roland Haeder e2ee412160 Use SwitchboardConstants.LISTS_PATH_DEFAULT instead of 'DATA/LISTS'
12 years ago
Roland Haeder ae19401af0 Removed another duplicate occurrence of Blacklist.BLACKLIST_FILENAME_FILTER
12 years ago
Roland Haeder 59225487ea Fix for blacklist export, also applied the filename filter here
12 years ago
Roland Haeder 952fc0e7bd Removed superfluous check for files ending '.black' as the previous commit already excluded all other files (e.g. .ser dumps), added logging in catch-all block
12 years ago
Roland Haeder 060fec1577 Reuse Blacklist.BLACKLIST_FILENAME_FILTER
12 years ago
Roland Haeder 29049c71f5 Possible fix for ticket http://bugs.yacy.net/view.php?id=270, the filter for only including *.black must be applied
12 years ago
Michael Peter Christen 4c242f9af9 always use a default value for boolean options to have transparency for
12 years ago
orbiter 9c681cc00d added segment sizes, postprocessing status and cpu load to crawler
12 years ago
orbiter 86b514cf46 added load info to status_p.xml
12 years ago
orbiter 056b42f5aa - added information about segment count to status_p.xml
12 years ago
orbiter 6fb2811e68 fixes for problems with remote solr and non-activated webgraph index
12 years ago
orbiter e24016e30a added the property federated.service.solr.indexing.timeout to yacy.init
12 years ago
orbiter 232100301c removed double-occurring value assignments
12 years ago
Roland Haeder aaedc0405d Fixes and avoid of catching bad exceptions (some):
12 years ago
Roland Haeder 841a28ae76 Added 'final' for all exception blocks as this helps the Java compiler
12 years ago
Felix Ableitner 376f9cd9d0 Merge branch 'master' of git://gitorious.org/yacy/rc1 into blacklist_structure
12 years ago
Michael Peter Christen 89c0aa0e74 added collection_sxt to error documents
12 years ago
Michael Peter Christen 0df5195cb0 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
Michael Peter Christen 1fd006cc56 fixes using the embedded connector
12 years ago
orbiter aba7cc5de7 added cpu load information to status page
12 years ago
Roland Haeder 59b4fdd5ad Merge remote-tracking branch 'upstream/master'
12 years ago
orbiter 5493389576 stealth mode shall only be available for authorized users, because
12 years ago
Roland Haeder ebbb3bc5c1 Fixed CHMOD on many files + added missing loggers (e.g. jena) and made some noisy loggers quiet
12 years ago
Michael Peter Christen bcc623a843 refactoring of load_delay: this is a matter of client identification
12 years ago
orbiter 2be456e7fb added a postprocessing field into api/status_p.xml to show if the
12 years ago
orbiter 575f913154 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
orbiter c4efb612e2 added list of crawls to status_p.xml
12 years ago
Lotus bb6caa346c Do not allow automatic update in case YaCy is installed to the Program
12 years ago
orbiter dac88561ae minimum access time has a tight connection to ClientIdentification,
12 years ago
Felix Ableitner a020697d64 Fixed problems with blacklist entry insertion.
12 years ago
sixcooler bff8c753c6 re-insert this file - was deleted by mistake
12 years ago
Michael Peter Christen 5878c1d599 - refactoring of log to ConcurrentLog:
12 years ago
orbiter c79f687110 enhanced the network scanner: find more hosts automatically by removal
12 years ago
orbiter b4677d1cad fix for bug #252
12 years ago
Michael Peter Christen 07261fe274 Merge remote-tracking branch 'nutomics/blacklist_structure'
12 years ago
Michael Peter Christen dea71851d2 - better concurrency for network scanner
12 years ago
orbiter 9f0cc9b401 enhanced network scanner
12 years ago
orbiter f8c28efd66 fix for rssTerminal coloring
12 years ago
Felix Ableitner 44f8fcf62e Changed class structure of Blacklist.
12 years ago
Michael Peter Christen 3054a6d4b9 added a patch from Sebastian M.B., submitted by email for coloring of
12 years ago
Michael Peter Christen 78af998f8f Merge commit 'fd90fcc4e08f80acbfd1c9a7ec62ce04cd309594'
12 years ago
Michael Peter Christen 57ffdfad4c added a crawl option to obey html-meta-robots-noindex. This is on by
12 years ago
Felix Ableitner fd90fcc4e0 Fixes #196.
12 years ago
Michael Peter Christen f1c5338210 preparation for greedy crawl profiles and refactoring
12 years ago
Michael Peter Christen e6f361f474 adding the canonical tag to crawl queues
12 years ago
Michael Peter Christen 203921006a redesign of citation index storage
12 years ago
Michael Peter Christen e92b9275ce Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
Michael Peter Christen 56cdcfa2fa fixed greedy learning mode - global is not a search attribute in
12 years ago
Michael Peter Christen 32aa1d4569 removed unused option for queries
12 years ago
Michael Peter Christen 0c5bed7e2c added configuration option for greedy learning function to ConfigPortal
12 years ago
sixcooler 5d1f619f07 possibly helpful closing of solr-requests
12 years ago
Michael Peter Christen 9d291764d1 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
sixcooler e5abccdfe4 added optimize-option
12 years ago
Michael Peter Christen 8ea6ddf636 removed attributes from ConfigPortal.html which are redundant to
12 years ago
Michael Peter Christen 64140f35cd fix for solr requests if no query part is given (prevent npe)
12 years ago
Michael Peter Christen 23fb458963 - fix to gsa searchresult answer in case that no query part is given
12 years ago
Michael Peter Christen 660a196989 refactoring
12 years ago
Michael Peter Christen 54024958ac added url_file_name_s in query for live-search of urls
12 years ago