750 Commits (2c4a672fe26027669af76f962ba1c1a4a09f7027)

Author SHA1 Message Date
orbiter 2c4a672fe2 bugfixes and performance hacks for table index
13 years ago
orbiter dad5b586a4 added a concurrent warm-up of Table data structures. That should speed up the start-up process but may also cause higher CPU load at that time.
13 years ago
orbiter 734059d33e performance hacks
13 years ago
orbiter 23e81b28b2 synchronization enhancements
13 years ago
orbiter dd4635e323 patches
13 years ago
orbiter 85a5487d6d YaCy can now use the solr index to compute text snippets. This makes search result preparation MUCH faster because no document fetching and parsing is necessary any more.
13 years ago
orbiter 0819e1d397 protection against OOM cases in image parser. See also bugs.yacy.net/view.php?id=54
13 years ago
orbiter 2cba860693 - fix for wrong entries in NOLOAD indexing queue (which caused URLs to be indexed based only on their URL without being loaded)
13 years ago
orbiter 2842ce30d6 added synchronization in ReferenceContainer and logging for shrinking
13 years ago
orbiter cec3836e73 added reference limitation to IndexControlRWIs_p.html servlet
13 years ago
sixcooler ecb4986b38 refactored stuff from last commit to ReferenceContainer
13 years ago
sixcooler f7c4abfdd7 limit references per blob & term to the 100,000 youngest
13 years ago
orbiter 28f5b79deb added a fast mass-deletion method
13 years ago
orbiter a70dbce41c added another file tool class to yacy-cora
13 years ago
orbiter 49e5ca579f added new configuration property "crawler.embedLinksAsDocuments". If this is switched on (this is default now), then all embedded image, audio and video links from all parsed documents are added to the search index as individual documents. This will increase the search index size dramatically but will also enable us to create a much faster image, audio and video search. If the flag is switched on, the index entries are also stored to a solr index, if this is also enabled.
13 years ago
orbiter e02bfbde56 fix for solr url
13 years ago
orbiter 580beb12a5 reverting SVN 7863; the synchronization was needed and no synchronization causes repeated DNS lookup for the same hosts
13 years ago
orbiter 44d6416e2d ensure termination of shrink()
13 years ago
orbiter 52230a6864 replaced catching of Exception with Throwable, which also catches Errors
13 years ago
orbiter 877eaf6bcb switched off logging of org.apache.http which was suddenly switched on by default (??)
13 years ago
orbiter e1a3d609aa moved merger object from Segment to IndexCell to enable a correct shutdown sequence. This solves a bug where yacy cannot be shut down during an index merge that appears during the shutdown phase.
13 years ago
orbiter 610b01e1c3 - added an 'add every media object linked in an html document as a new document' feature to the html parser. As a result, every image, app, video or audio file that is linked in an html file is added as a document. This means that parsing a single html document may insert a number of documents into the search index.
13 years ago
orbiter 3da21c4266 protection against starting of a (second) yacy peer while another one is already running on the same port
13 years ago
orbiter b5252ef91f added new word recommendation library in DictionaryLoader_p.html
13 years ago
orbiter 1c007188ad bugfixes in html parser
13 years ago
orbiter 231074bf0a fixed a parsing bug by reverting SVN 7766
13 years ago
low012 30a8a2f76b *) replacing one ugly hack with an extended ugly hack ;-)
13 years ago
low012 95379ce0b1 *) should fix some problems with RSS Importer (see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=3253)
13 years ago
low012 24e76a7b69 *) Replaced occurrences of "Wikimedia" with "MediaWiki" where applicable. (Thanks to the folks of 0x20.be for pointing this out.)
13 years ago
sixcooler d40a177c05 Generation Memory Strategy fine tuning
13 years ago
sixcooler 839f407fe4 Generation Memory Strategy fine tuning:
13 years ago
orbiter a5541751a8 - added memory computation to termlist_p.xml
13 years ago
orbiter 45e497a9bd fix for term iteration
13 years ago
orbiter 5dd2efc9a2 - bugfixes in html parser
13 years ago
orbiter 2c595a6a47 added new methods to count the number of objects in RWIs. lots of refactoring was necessary to introduce new Rating class and to unify naming of methods
13 years ago
orbiter 75df87832c refactoring/better naming of methods and classes
13 years ago
sixcooler 5f8a5ca32d - not doing merge-jobs while short on Memory
13 years ago
orbiter 965fabfb87 enhanced sorting speed (affects all DB operations)
13 years ago
orbiter 41a8ee4569 added iterable implementation in KeyList
13 years ago
orbiter 22d69a6368 refactoring in cora: added sorting package
13 years ago
orbiter 51cf697acd refactoring: moved all score-related classes to new ranking package
13 years ago
orbiter a0d5e7b6e6 added new score comparator
13 years ago
sixcooler 4fec99115b Implementation of strategies for controlling memory resources.
13 years ago
sixcooler 63a375b801 do not look at external dtd, because this makes the reader hang forever(?) on faulty dtd-locations
13 years ago
orbiter 2c58af6874 - added a short memory status simulation mode
13 years ago
orbiter c64faf41e2 addon to svn 7880
13 years ago
sixcooler 7b7a196243 ignore cookies in httpclient per default
13 years ago
sixcooler 411ed159f8 do some extra sleep while running low on memory
13 years ago
sixcooler 9ab0ba41e2 using GzipDecompressingEntity from httpclient instead of our own
13 years ago
sixcooler 07f5954570 try better handling of corrupt blobs
13 years ago