5296 Commits (72adbeae90120840ec67fd1ddd963f4ca270c73f)

Author SHA1 Message Date
orbiter 8895d8c1cd removed unnecessary log entries
13 years ago
orbiter 550c881d80 remove more news (all older than one day) because they can be a performance problem if we have too many peers sending news
13 years ago
orbiter ebd840ebf6 - enhanced description on search front page
13 years ago
apfelmaennchen 77a080ced9 smaller fixes for YMarks
13 years ago
orbiter e22f8497c9 - tested the ARC methods
13 years ago
orbiter bc5df0eef5 updated ranking tables (fresh computation)
13 years ago
orbiter 5a55397f99 some last-minute performance hacks
13 years ago
apfelmaennchen dd1482aaf5 further update to YMarks
13 years ago
orbiter c9216d5adf fixed secondary remote search (the process that finds distributed join situations)
13 years ago
orbiter 64fd20b857 new default ranking profile
13 years ago
orbiter 0cf9ebc3b0 speed enhancements when parsing RWI rows (makes search slightly faster)
13 years ago
orbiter c9a0dbd25a added a security check
13 years ago
orbiter ee8b1d4de1 fixed unresolved pattern and unwanted local/global switch when using votes on search results
13 years ago
orbiter c584db991f creating a bookmark from the search results now works again .. with new YMarks
13 years ago
orbiter 1120f0c93c update to network graphics: slightly less crawling activity, slightly stronger color for query activity
13 years ago
orbiter 6cd27473f5 - better default values for caching and cache usage
13 years ago
orbiter 709013385a fix for language fix
13 years ago
orbiter 1019c36dad bug fixes and speed enhancements for search
13 years ago
orbiter 507c9d478d much better timing when searching globally; less blocking; more results earlier!
13 years ago
orbiter 8e0b2c5832 fixed cluster search
13 years ago
orbiter c0c6e9e7a5 fix for bad language encoding
13 years ago
apfelmaennchen 564374d1fe - included YMarks in addition to old bookmarks in yacysearchitem.html; don't get confused by the old bookmark dialog, the ymark is automatically added silently beforehand.
13 years ago
orbiter 05f34a3fa7 added a full, complete, database insert, update and delete API for the tables.
13 years ago
lotus 3cc93325f0 temporarily remove compare search from tray
13 years ago
orbiter c93f10417a add a bookmark automatically each time a new crawl is started
13 years ago
orbiter e4a82ddd8b produce a bookmark entry from every crawl start. these bookmarks are always private.
13 years ago
apfelmaennchen 6287c2b4a9 YMarks:
13 years ago
cominch 2236e01137 Minor correction to prevent useless comma at beginning of string, created from list
13 years ago
apfelmaennchen 5581be12fb YMarks:
13 years ago
orbiter 804e48888b smaller bug fixes for search behavior; should produce less unnecessary removals and an exact number of results as shown in counter
13 years ago
apfelmaennchen a3eebfdcba YMarks:
13 years ago
orbiter c50f8f9a06 code cleanup
13 years ago
orbiter 84c3fc9d97 local/global fixes in search, better abstraction
13 years ago
apfelmaennchen 4f95f72124 YMarks:
13 years ago
orbiter aa322bc6d0 fix
13 years ago
orbiter 97d1347adb added also a default accept field to robots.txt downloads
13 years ago
orbiter f183d3822c added a default accept header in http requests since some http fraud detection functions check that this header field exists
13 years ago
orbiter 06352b8d6b more logging
13 years ago
orbiter a99934226e more logging for debugging of robots.txt
13 years ago
orbiter 7a5841e061 fix for robot parser
13 years ago
orbiter 458c20ff72 fix for robot parser
13 years ago
orbiter 017a01714d - enhanced logging in robots.txt parser for remote debugging
13 years ago
apfelmaennchen a8dfe787ed - updated to jquery flexigrid 1.1
13 years ago
orbiter eb1c7c041d write info about robots.txt evaluation into getpageinfo_p.xml
13 years ago
apfelmaennchen abba31f02e - bugfix for correctly sorting ymarks
13 years ago
orbiter 3a15e58e28 - increased stability when opening the robots table
13 years ago
orbiter 775b44017e refactoring
13 years ago
orbiter e914a30099 fix for npe
13 years ago
apfelmaennchen 5f7dbe1c42 - some refactoring (ymarks)
13 years ago
orbiter 78ce3b13be typo
13 years ago
orbiter 85d6bf4ac4 fixed urls to media content during indexing
13 years ago
orbiter 0d858d48ec replaced String with StringBuilder in suggestion process
13 years ago
orbiter 3a807e10cf - added a cache for active crawl profiles to the crawl switchboard
13 years ago
orbiter 37e35f2741 normalization of url using urlencoding/decoding
13 years ago
orbiter e58438c01c - added a new retry connector for solr (for cases where solr responses are slow)
13 years ago
orbiter d8d9735b4f stability bugfix
13 years ago
orbiter c31564ef08 stability bugfixes
13 years ago
orbiter f121f4bb45 fix for link in Supporter and Suftipps page
13 years ago
orbiter 94eab08794 - updated opensearchdescription text and icon
13 years ago
orbiter 279482a76d fix for npe
13 years ago
orbiter 1b86d06d1e fix for http://bugs.yacy.net/view.php?id=62
13 years ago
orbiter 9e4875230f performance hacks
13 years ago
orbiter eb9c9edb01 enhanced table method (used by almost all yacy api interfaces)
13 years ago
orbiter 4ad9fc2bff new snippet strategy for search hits in metadata: show beginning of text instead of hit position
13 years ago
orbiter a9838f8b99 fix for http://bugs.yacy.net/view.php?id=59
13 years ago
hermens d3df03838a make sure myself-target is always inserted at its appropriate position
13 years ago
hermens c3e7efa846 added sender side prevention of rwi flooding as mentioned in SVN 7993
13 years ago
orbiter 5af9598bd1 enhanced exported row parsing during row import
13 years ago
orbiter 7598a9e26b fix for thread dump
14 years ago
orbiter 8eef8722d1 update to ThreadDump analysis: freerunner and thread state recognition
14 years ago
orbiter 1df43b137d another performance hack
14 years ago
orbiter 7df0643f0e performance hacks
14 years ago
orbiter a7df70221e refactoring
14 years ago
orbiter 1b45e33f04 added robots tag parser to solr scheme
14 years ago
orbiter cf4fd525ee added directDocByURL attribute in crawl profile
14 years ago
orbiter c61e4cfd78 - fix for incomplete clear() in balancer
14 years ago
orbiter 813f297a95 another performance hack: re-use of known host addresses for isLocal property; avoids look-up in local hash
14 years ago
orbiter 035ebfbf3b - performance hacks (should affect the crawl balancer and reduce CPU load during crawl stack re-fill)
14 years ago
orbiter b250e6466d implemented crawl restrictions for IP pattern and country lists
14 years ago
f1ori e207c41c8e * fix urlproxy for urls containing dollar signs
14 years ago
orbiter 57d5529a01 performance hacks
14 years ago
orbiter 5ad7f9612b added crawl settings for three new filters for each crawl:
14 years ago
orbiter 47a8c69745 added a new feature to MultiProtocolURIs to get the locale for each url:
14 years ago
orbiter 2c3161b4ac refactoring:
14 years ago
orbiter d2ea250d99 refactoring:
14 years ago
low012 42b5f09f68 *) this should fix a bug in snippet creation (also cleaned up a little bit)
14 years ago
low012 277b454a62 *) added comments
14 years ago
orbiter 6b22865dbc - removed some warnings
14 years ago
orbiter 0c6d95e57b - more tolerance against failure of table opening
14 years ago
orbiter 4f31869c5a enhanced search result timing
14 years ago
orbiter 6b02b696b0 - add number of search results to end of rss and json output to reflect latest status of retrieval
14 years ago
f1ori 87e6abd168 * fix urls containing a port number in urlproxy
14 years ago
f1ori 97045022fa * pass cookies to Server Side Includes
14 years ago
orbiter ce2a76d603 performance hack for search process
14 years ago
orbiter aaf7a0feaa yet another cache strategy
14 years ago
orbiter 8a428d3e77 ensure termination of pdf parser to avoid deadlocking of other processes during search result preparation
14 years ago
orbiter 2c4a672fe2 bugfixes and performance hacks for table index
14 years ago
orbiter dad5b586a4 added a concurrent warm-up of Table data structures. that should speed up the start-up process but may also cause stronger CPU load at that time.
14 years ago
orbiter 734059d33e performance hacks
14 years ago
orbiter 23e81b28b2 synchronization enhancements
14 years ago
orbiter dd4635e323 patches
14 years ago
orbiter bb0c045036 fix for problem with relocation of network
14 years ago
orbiter 85a5487d6d YaCy can now use the solr index to compute text snippets. This makes search result preparation MUCH faster because no document fetching and parsing is necessary any more.
14 years ago
orbiter 0819e1d397 protection against OOM cases in image parser. See also bugs.yacy.net/view.php?id=54
14 years ago
orbiter 52a2b3f110 try to fix bug http://bugs.yacy.net/view.php?id=26
14 years ago
orbiter 2cba860693 - fix for wrong entries in NOLOAD indexing queue (that caused that urls had been only indexed based on their url and not loaded)
14 years ago
orbiter 2842ce30d6 added synchronization in ReferenceContainer and logging for shrinking
14 years ago
orbiter cec3836e73 added reference limitation to IndexControlRWIs_p.html servlet
14 years ago
sixcooler ecb4986b38 refactored stuff from last commit to ReferenceContainer
14 years ago
sixcooler f7c4abfdd7 limit references per blob & term to the 100.000 youngest
14 years ago
orbiter 28f5b79deb added a fast mass-deletion method
14 years ago
orbiter a70dbce41c added another file tool class to yacy-cora
14 years ago
orbiter 49e5ca579f added new configuration property "crawler.embedLinksAsDocuments". If this is switched on (this is default now), then all embedded image, audio and video links from all parsed documents are added to the search index as individual documents. This will increase the search index size dramatically but will also enable us to create a much faster image, audio and video search. If the flag is switched on, the index entries are also stored to a solr index, if this is also enabled.
14 years ago
orbiter e02bfbde56 fix for solr url
14 years ago
f1ori 41e146116a fixes size of document in case the server doesn't give the size in the header
14 years ago
orbiter 580beb12a5 reverting SVN 7863; the synchronization was needed and no synchronization causes repeated DNS lookup for the same hosts
14 years ago
orbiter 44d6416e2d ensure termination of shrink()
14 years ago
orbiter 52230a6864 replaced catching of Exception with Throwable, which catches also Errors
14 years ago
orbiter 877eaf6bcb switched off logging of org.apache.http which was suddenly switched on by default (??)
14 years ago
orbiter e1a3d609aa moved merger object from Segment to IndexCell to enable a correct shutdown sequence. This solves a bug where yacy cannot be shut down during an index merge that appears during the shutdown phase.
14 years ago
sixcooler 2cf61a40ce fixed a bug from 7856, where Snippet returned an error by mistake when Metadata was found
14 years ago
orbiter 610b01e1c3 - added a 'add every media object linked in a html document as a new document' to the html parser. This causes that all image, app, video or audio file that is linked in a html file is added as document. In fact that means that parsing a single html document may cause that a number of documents is inserted into the search index.
14 years ago
orbiter 3da21c4266 protection against starting of a (second) yacy peer while another one is already running on the same port
14 years ago
orbiter b5252ef91f added new word recommendation library in DictionaryLoader_p.html
14 years ago
orbiter 1c007188ad bugfixes in html parser
14 years ago
orbiter 231074bf0a fixed a parsing bug by reverting SVN 7766
14 years ago
low012 30a8a2f76b *) replacing one ugly hack with an extended ugly hack ;-)
14 years ago
low012 95379ce0b1 *) should fix some problems with RSS Importer (see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=3253)
14 years ago
low012 24e76a7b69 *) Replaced occurrences of "Wikimedia" with "MediaWiki" where applicable. (Thanks to the folks of 0x20.be for pointing this out.)
14 years ago
sixcooler d40a177c05 Generation Memory Strategy fine tuning
14 years ago
sixcooler 839f407fe4 Generation Memory Strategy fine tuning:
14 years ago
orbiter 3e6767d66c limitation of reference evaluation (protection against crawler pits)
14 years ago
orbiter a5541751a8 - added memory computation to termlist_p.xml
14 years ago
orbiter 45e497a9bd fix for term iteration
14 years ago
orbiter 5dd2efc9a2 - bugfixes in html parser
14 years ago
orbiter 2c595a6a47 added new methods to count the number of objects in RWIs. lots of refactoring was necessary to introduce new Rating class and to unify naming of methods
14 years ago
orbiter 75df87832c refactoring/better naming of methods and classes
14 years ago
orbiter 9f9f634de2 fix in search
14 years ago
sixcooler 5f8a5ca32d - not doing merge-jobs while short on Memory
14 years ago
orbiter 965fabfb87 enhanced sorting speed (affects all DB operations)
14 years ago
orbiter 41a8ee4569 added iterable implementation in KeyList
14 years ago
orbiter 22d69a6368 refactoring in cora: added sorting package
14 years ago
orbiter 51cf697acd refactoring: moved all score-related classes to new ranking package
14 years ago
orbiter a0d5e7b6e6 added new score comparator
14 years ago
sixcooler 169236c6d9 almost revert changes in this class of 7880 and 7882
14 years ago
sixcooler 4fec99115b Implementation of strategies for controlling memory resources.
14 years ago
sixcooler 63a375b801 do not look at external dtd, because this makes this reader stay forever(?) on faulty dtd-locations
14 years ago
orbiter 2c58af6874 - added a short memory status simulation mode
14 years ago
orbiter c64faf41e2 addon to svn 7880
14 years ago
sixcooler 7b7a196243 ignore cookies in httpclient per default
14 years ago