Commit Graph

4865 Commits (6e0f4557f80ffea782d05b33263d70f8d805ba33)

Author SHA1 Message Date
Michael Peter Christen 461a0ce052 removed warnings
13 years ago
Michael Peter Christen 407fdf6968 more bug fixes and performance hacks for search process
13 years ago
Michael Peter Christen a1fe65b115 performance hacks
13 years ago
Michael Peter Christen 2fe207f813 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
13 years ago
Michael Peter Christen 0284a4d88f more fixes for double precision of coordinates
13 years ago
Michael Peter Christen e0d8643226 - performance hacks
13 years ago
Michael Peter Christen 9b4c699526 ehanced location search:
13 years ago
Michael Peter Christen 43c2c6e588 better logging
13 years ago
Michael Peter Christen 20e0cc0822 fix for bad location evaluation
13 years ago
Michael Peter Christen eff7667554 fix for http://bugs.yacy.net/view.php?id=188
13 years ago
Michael Peter Christen 8b974905ee changed log-in text for all servlets with authentication:
13 years ago
Michael Peter Christen 16b21f7a5b Added more steering in Crawler_p.html interface
13 years ago
Michael Peter Christen acc19e190d hack against 100% cpu during crawl delete
13 years ago
Michael Peter Christen c15fcde1c8 add-on to latest commit
13 years ago
Michael Peter Christen cf47d94888 performance hack to parse numbers inside of substrings without actually
13 years ago
Michael Peter Christen 7e0ddbd275 added a "fromCache" flag in Response object to omit one cache.has()
13 years ago
Michael Peter Christen 125d47b3c1 added more interruptions in DidYouMean because that was the cause for
13 years ago
Michael Peter Christen f294f2e295 bugfix to http://bugs.yacy.net/view.php?id=181
13 years ago
Michael Peter Christen 3e1bc9477f Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
13 years ago
Roland 'Quix0r' Haeder b3ae2aa41f With or without 'final'? At least please try it in other methods
13 years ago
Michael Peter Christen 5b3acc12cd Pattern.quote() replaces \\Q and \\E according to publication in
13 years ago
Michael Peter Christen 89142d1e8d removed (not all) warnings
13 years ago
Michael Peter Christen e7e381d110 added configuration to switch off redirection following in crawler
13 years ago
Michael Peter Christen 70505107ca enhanced crawler/balancer: better remaining waiting-time guessing
13 years ago
Michael Peter Christen f150bc218b fixed bug in solr error document
13 years ago
Roland 'Quix0r' Haeder a093ccf5eb Now used synchronization in all close() methods to make sure all objects
13 years ago
Michael Peter Christen ba6aaabc51 refactoring + parser bugfixes
13 years ago
Michael Peter Christen 659178942f - Redesigned crawler and parser to accept embedded links from the NOLOAD
13 years ago
Michael Peter Christen f5efdb21fd refactoring
13 years ago
Michael Peter Christen f8cd57c92f new indexing strategy: ALL links that appear anywhere are indexed, not
13 years ago
Michael Peter Christen a1a5b015d8 refactoring: moved document Classification to cora package
13 years ago
Michael Peter Christen a5d7da68a0 refactoring: removed dependency from switchboard in Balancer/CrawlQueues
13 years ago
Michael Peter Christen 33d1062c79 refactoring: the cache belongs to the crawler
13 years ago
Michael Peter Christen 046f3a7e8d check if httpc has decompressed the release file and rename the file
13 years ago
Michael Christen 22f05c83ff fixed default must-match filter for full domain crawls - the old filter
13 years ago
Michael Peter Christen 0cc0290978 bugfix for a must-not-match pattern check. This bug did not make the
13 years ago
Michael Peter Christen 2fc8ecee36 ConcurrentLinkedQueue has a VERY long return time on the .size() method.
13 years ago
Michael Peter Christen 8aba045ba1 if a new pop-up page is set in config portal, then this page applies
13 years ago
Michael Peter Christen c6c61be3f0 fix for http://bugs.yacy.net/view.php?id=148
13 years ago
Michael Peter Christen 0d148c3353 more logging in resource observer
13 years ago
Michael Peter Christen 2fa037ae1d enhanced crawler
13 years ago
low012 2120db289a *) Small change which should solve problem with cgitb module in Python CGI scripts.
13 years ago
Lotus ee89cf5ae5 fix must match filter for full domain crawl
13 years ago
Michael Peter Christen 9ad1d8dde2 complete redesign of crawl queue monitoring: do not look at a
13 years ago
Michael Peter Christen 4540174fe0 memory hacks
13 years ago
Michael Peter Christen 9ebcae2fbc enhanced url parser to understand urls with &amp; instead of & in post
13 years ago
Michael Peter Christen 1f4f60654a Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
13 years ago
Michael Peter Christen e6d26a023f fix for bookmark crash with possible side-effects on crawl start after
13 years ago
Michael Peter Christen 190b77c55e added Ukrainian translation
13 years ago
Marek Otahal 72adbeae90 !Important: move from Hashtable to HashMap
13 years ago
Marek Otahal c1af123ddd just a little faster toString
13 years ago
Marek Otahal 64e4bcee82 serverSwitch get(App/Data)Path() use common helper method
13 years ago
Marek Otahal 371fbb4deb just comment + shorter code in serverSwitch
13 years ago
Marek Otahal ed253b7aff update javadoc, does not throw IOException
13 years ago
Michael Peter Christen 2ee8cbeb2c Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
13 years ago
Michael Peter Christen 992dbdf4bb added noload statistic to servlets
13 years ago
Michael Christen 354b976110 fix for concurrency problem and endless loop in /suggest.json
13 years ago
Michael Christen c21966bb43 fix
13 years ago
Michael Christen 13b05f9c08 fix
13 years ago
Michael Christen e5d878c59e Merge branch 'master' of ssh://gitorious.org/yacy/rc1
13 years ago
Michael Christen ec26b2bea4 Merge commit 'fa08ed5ae5d72bddc3cc6a662b23103579e86109' into quix0r
13 years ago
Michael Christen 216a287a85 Merge commit '6d4e08ed06c5cd28c45981b2ebe31c7f7ec6fd83' into quix0r
13 years ago
stbrumm d18095dc48 Patch fuer Issue 0000102
13 years ago
Roland 'Quix0r' Haeder 901f37d608 Also this ... :( #2
13 years ago
Roland 'Quix0r' Haeder a985717ed2 Also this ... :(
13 years ago
Roland 'Quix0r' Haeder 5f490de554 Fix for ported fix from my old days ...
13 years ago
Roland 'Quix0r' Haeder fa08ed5ae5 Fixed a lot CHMOD rights (no need for execute flag on *.java/*.html) and introduced local/remote crawl size ratio based check
13 years ago
Michael Christen 9e5894c784 Removed handling of components objects for URIMetadataRows.
13 years ago
Michael Christen c04bfaa51b refactoring
13 years ago
Michael Christen 17f962fceb translator updates:
13 years ago
Michael Christen 752b092b8a Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
13 years ago
admin 23afee58fe Merge branch 'master' of git://github.com/f1ori/yacy
13 years ago
Michael Christen 3eccdca63c protection against too long running snippet fetch processes
13 years ago
apfelmaennchen ff19fcdb28 bugfix for YMarks XBEL import and export; thanks to Dominic
13 years ago
Michael Christen 044f83feed added some pauses into the search process which shall produce
13 years ago
Michael Christen 6e66c9d7f1 fix for http://bugs.yacy.net/view.php?id=87
13 years ago
Michael Christen e7e429705a - less automatic indexing after a search (needs to reset the default
13 years ago
admin a4ac051029 Merge branch 'master' of git://github.com/f1ori/yacy
13 years ago
low012 7cfdc2c092 Improved CGI capabilities:
13 years ago
Michael Christen 9cd469e6d6 added pull request from als plus an NPE fix
13 years ago
orbiter 11729061f2 added an option in the bookmark import process to put everything into the crawler
13 years ago
apfelmaennchen 70bcfc150a - small bug fix to ymarks html importer
13 years ago
apfelmaennchen b5d9f631e3 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8128 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter 35a9e8f307 - fixed network graphic
13 years ago
Al Sutton 8993cac4d8 Initial performance improvements
13 years ago
orbiter 8895d8c1cd removed unnecessary log entries
13 years ago
apfelmaennchen 77a080ced9 smaller fixes for YMarks
13 years ago
orbiter 5a55397f99 some last-minute performance hacks
13 years ago
apfelmaennchen dd1482aaf5 further update to YMarks
13 years ago
orbiter c584db991f creating a bookmark from the search results now works again .. with new YMarks
13 years ago
apfelmaennchen 564374d1fe - included YMarks in addition to old bookmarks in yacysearchitem.html; don't get confused by the old bookmark dialog, the ymark is automatically added silently beforehand.
13 years ago
orbiter c93f10417a add a bookmark automatically each time a new crawl is started
13 years ago
orbiter e4a82ddd8b produce a bookmark entry from every crawl start. these bookmarks are always private.
13 years ago
apfelmaennchen 6287c2b4a9 YMarks:
13 years ago
cominch 2236e01137 Minor correction to prevent useless comma at beginning of string, created from list
13 years ago
apfelmaennchen 5581be12fb YMarks:
13 years ago
apfelmaennchen a3eebfdcba YMarks:
14 years ago
orbiter c50f8f9a06 code cleanup
14 years ago
apfelmaennchen 4f95f72124 YMarks:
14 years ago
orbiter aa322bc6d0 fix
14 years ago
orbiter 97d1347adb added also a default accept field to robots.txt downloads
14 years ago
orbiter f183d3822c added a default accept header in http requests since some http fraud detection functions check that this header field exist
14 years ago
orbiter 06352b8d6b more logging
14 years ago
orbiter a99934226e more logging for debugging of robots.txt
14 years ago
orbiter 7a5841e061 fix for robot parser
14 years ago
orbiter 458c20ff72 fix for robot parser
14 years ago
orbiter 017a01714d - enhanced logging in robots.txt parser for remote debugging
14 years ago
apfelmaennchen a8dfe787ed - updated to jquery flexigrid 1.1
14 years ago
orbiter eb1c7c041d write info about robots.txt evaluation into getpageinfo_p.xml
14 years ago
apfelmaennchen abba31f02e - bugfix for correctly sorting ymarks
14 years ago
orbiter 775b44017e refactoring
14 years ago
apfelmaennchen 5f7dbe1c42 - some refactoring (ymarks)
14 years ago
orbiter 78ce3b13be typo
14 years ago
orbiter 85d6bf4ac4 fixed urls to media content during indexing
14 years ago
orbiter 0d858d48ec replaced String with StringBuilder in suggestion process
14 years ago
orbiter 3a807e10cf - added a cache for active crawl profiles to the crawl switchboard
14 years ago
orbiter 37e35f2741 normalization of url using urlencoding/decoding
14 years ago
orbiter 1b86d06d1e fix for http://bugs.yacy.net/view.php?id=62
14 years ago
orbiter 9e4875230f performance hacks
14 years ago
orbiter a9838f8b99 fix for http://bugs.yacy.net/view.php?id=59
14 years ago
orbiter a7df70221e refactoring
14 years ago
orbiter cf4fd525ee added directDocByURL attribute in crawl profile
14 years ago
orbiter c61e4cfd78 - fix for incomplete clear() in balancer
14 years ago
orbiter 813f297a95 another performance hack: re-use of known host addresses for isLocal property; avoids look-up in local hash
14 years ago
orbiter 035ebfbf3b - performance hacks (should affect the crawl balancer and reduce CPU load during crawl stack re-fill)
14 years ago
orbiter b250e6466d implemented crawl restrictions for IP pattern and country lists
14 years ago
f1ori e207c41c8e * fix urlproxy for urls containing dolar signs
14 years ago
orbiter 5ad7f9612b added crawl settings for three new filters for each crawl:
14 years ago
orbiter d2ea250d99 refactoring:
14 years ago
low012 42b5f09f68 *) this should fix a bug in snippet creation (also cleaned up a little bit)
14 years ago
orbiter 6b22865dbc - removed some warinings
14 years ago
orbiter 0c6d95e57b - more tolerance against failure of table opening
14 years ago
orbiter 4f31869c5a enhanced search result timing
14 years ago
orbiter 6b02b696b0 - add number of search results to end of rss and json output to reflect latest status of retrieval
14 years ago
f1ori 87e6abd168 * fix urls containing a port number in urlproxy
14 years ago
f1ori 97045022fa * pass cookies to Server Side Includes
14 years ago
orbiter ce2a76d603 performance hack for search process
14 years ago
orbiter 2c4a672fe2 bugfixes and performance hacks for tabe index
14 years ago
orbiter dad5b586a4 added a concurrent warmin-up of Table data structures. that should speed-up the start-up process but may also cause stronger CPU load at that time.
14 years ago
orbiter 734059d33e performance hacks
14 years ago
orbiter 23e81b28b2 synchronization enhancements
14 years ago
orbiter dd4635e323 patches
14 years ago
orbiter bb0c045036 fix for problem with relocation of network
14 years ago
orbiter 85a5487d6d YaCy can now use the solr index to compute text snippets. This makes search result preparation MUCH faster because no document fetching and parsing is necessary any more.
14 years ago
orbiter 52a2b3f110 try to fix bug http://bugs.yacy.net/view.php?id=26
14 years ago
orbiter 2cba860693 - fix for wrong entries in NOLOAD indexing queue (that caused that urls had been only indexed based on their url and not loaded)
14 years ago
orbiter cec3836e73 added reference limitation to IndexControlRWIs_p.html servlet
14 years ago
orbiter 49e5ca579f added new configuration property "crawler.embedLinksAsDocuments". If this is switched on (this is default now), the all embedded image, audio and video links from all parsed documents are added to the search index as individual document. This will increase the search index size dramatically but will also enable us to create a much faster image, audio and video search. If the flag is switched on, the index entries are also stored to a solr index, if this is also enabled.
14 years ago
f1ori 41e146116a fixes size of document in case the server doesn't give the size in the header
14 years ago
orbiter e1a3d609aa moved merger object from Segment to IndexCell to enable a correct shutdown sequence. This solves a bug where yacy cannot be shut down during an index merge that appears during the shutdown phase.
14 years ago