335 Commits (b73ea6581d8269ddf120e15e270688baf28df2ff)

Author SHA1 Message Date
orbiter d39a5b42ca more care about open file handles. Now files also close on Windows and can be deleted afterwards.
16 years ago
orbiter 029495e64d fixed bug introduced in SVN 5756 in EcoTable.put()
16 years ago
orbiter 587838bd09 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5758 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter 96eaecda3e - added migration class to go from index collections to the index cell data structure.
16 years ago
orbiter 37f892b988 added new concurrent merger class for IndexCell RWI data
16 years ago
borg-0300 8c494afcfe svn attributes added
16 years ago
orbiter 67aaffc0a2 - added Latency control to the crawler:
16 years ago
orbiter 61f9dbf0cc - fixed a display problem in watch crawler
16 years ago
orbiter b3f75e48fa - enhanced balancer: auto-solving of waiting-deadlocks
16 years ago
orbiter d99ff745aa fix for http://forum.yacy-websuche.de/viewtopic.php?p=13378#p13378
16 years ago
borg-0300 fd0976c0a7 refactoring
16 years ago
borg-0300 ce79239322 "typo"
16 years ago
orbiter 7dff1cba62 removed option to use different primary keys in kelondro tables
16 years ago
orbiter 7f67238f8b refactoring of plasmaWordIndex: less methods in the class, separated the index to CachedIndexCollection
16 years ago
orbiter 14a1c33823 refactoring of wordIndex class
16 years ago
orbiter f6d989aa04 added new class RowSetArray which arranges RowSet objects like Elements in a hashtable, but still provides the functionality of sorted enumeration. The new class is now integrated into the ObjectIndexCache, which is the core class to provide index functions to all database files. The new index access is about twice as fast as before. This has strong speed enhancement effects on all parts of YaCy.
16 years ago
orbiter efcd95dc37 simplification of (internal) query process / refactoring
16 years ago
orbiter aa44d9bad9 more refactoring of kelondro.text / deleted de.anomic.index
16 years ago
orbiter 76ef5f0f14 refactoring of index package: better names for the classes (to be continued)
16 years ago
orbiter 8444357291 added new row iterator in kelondro table files that enumerates rows
16 years ago
orbiter 9559bc23fd automatic clean-up of dead connections
16 years ago
orbiter 4f9dae2571 remove reference in crawl entries
16 years ago
orbiter c12bb8a6d0 - refactoring of the http client
16 years ago
orbiter 62505bb3cb more bugfixes as recommended by findbugs
16 years ago
orbiter 411f2212f2 more memory leak fixing hacks
16 years ago
orbiter 6c627dbdff update to the server core
16 years ago
orbiter 6a876ecb88 first fixes to the DHT transmission process
16 years ago
orbiter c25c334b75 replaced old DHT transmission method with new method. Many things have changed! some of them:
16 years ago
orbiter 65a1de6c05 longer timeout for remote crawl queries
16 years ago
orbiter 94110df85a moved logging partially to kelondro
16 years ago
orbiter 024da2916b refactoring of logging
16 years ago
orbiter 83ce65707a (almost) completed partition of classes in kelondro
16 years ago
orbiter 7ee494fde5 more refactoring of kelondro:
16 years ago
orbiter bf93767ec6 refactoring of kelondro database classes
16 years ago
orbiter fc27bf8c4c refactoring of kelondro classes:
16 years ago
orbiter 91af105373 last changes before release
16 years ago
orbiter 05c235de32 fix for npe
16 years ago
orbiter 2b32248079 fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1516&p=10545#p10545
16 years ago
orbiter 4d5b401f00 try to fix some performance problems with the internal index management:
16 years ago
orbiter c6880ce28b removed the permanent cache flush and replaced it with a periodic cache flush
16 years ago
orbiter 07fc115e90 removed active profiling in kelondroRowSet
16 years ago
orbiter e004da48d3 - added fast fingerprint computation for files (any). Will be used in new index dump method
16 years ago
orbiter bb935fdbb0 less organization overhead for DNS caching and prefetching
16 years ago
orbiter e34ac22fbd - added new monitoring servlet at
16 years ago
orbiter d376d81fc4 replaced busy thread control of crawl stacker by blocking threads
16 years ago
f1ori 0881190b19 * Robots.txt: don't interpret Crawl-Delays for other robots
16 years ago
orbiter 243e73f53b removed unnecessary usage of kelondroBLOBTree
16 years ago
orbiter 7535fd7447 - refactoring of CrawlEntry and CrawlStacker
16 years ago
orbiter 2802138787 - refactoring of CrawlStacker (to prepare it for new multi-Threading to remove DNS lookup bottleneck)
16 years ago
orbiter 1779c3c507 - added a read cache to the RAFile interface to RandomAccessFile
16 years ago
orbiter 47292e696a more performance hacks
16 years ago
orbiter d39d420b39 performance hacks
16 years ago
orbiter fa26a8f25a fix for deadlock-like behavior in balancer
16 years ago
orbiter 1918a0173e added more exception handling during crawling
16 years ago
orbiter dba7ef5144 extended crawling constraints:
16 years ago
orbiter ef66438662 - more space in error db to store larger error messages
16 years ago
orbiter 674ad2d55b different handling of error cases that occur during loading files with http or ftp:
16 years ago
lotus 16723d0fa6 ask another peer if crawljob loading fails
16 years ago
orbiter 1b18d4bcf3 enhancement to crawling and remote crawling:
16 years ago
orbiter 3f746be5d4 - consolidation and refactoring of many DHT target - computing methods
16 years ago
lotus 5cf0cbb47e javadoc
16 years ago
lotus 8d07607d1d update to resource observer:
16 years ago
orbiter 1778fb420d - added some performance tweaks to the new BLOB buffer
16 years ago
orbiter 382226da94 fix for bug introduced in SVN 5281: parameters were switched
16 years ago
danielr f2fd043797 refactoring (moved duplicate code into methods)
16 years ago
orbiter 826ca79735 refactoring and new architecture to store the files of the web cache:
16 years ago
orbiter 6fb865fbdc - fix of bug in iterator in kelondroBLOBHeap which caused bug in crawl profile listing
16 years ago
orbiter 2d65887723 - fix for bug in new profile handling
16 years ago
orbiter ff68f394dd fix for problem with balancer and lost crawl profiles:
16 years ago
orbiter 9ac16f565b - fixed several bugs in database management functions
16 years ago
orbiter c8bdd965ec - larger update time for status page
16 years ago
orbiter ce57de6cb3 - fixed re-setting of DHT Send/Receive settings
16 years ago
orbiter e1f67262f7 - added and removed some debugging output
16 years ago
orbiter 21dbb39afa switched two balancer cases
16 years ago
orbiter 1bbf362cef update to the crawl balancer: better organization and better crawl delay prediction
16 years ago
orbiter ddcf285499 - fixed a bug in performance setting (did not work with German translation)
16 years ago
orbiter 0cd0fee546 fixed bug with wrong proxy result enqueueing. See:
16 years ago
lotus fd9233244e configurable free disk space via disk.free
16 years ago
lotus 73f233bb11 * set resource observer to 1000MB
16 years ago
orbiter a28faabfd2 fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1351&p=9242#p9242
16 years ago
f1ori bea6c13139 * with r5137 robotParser didn't work at all -> fix
16 years ago
f1ori ae677e1738 * fix problem in robotparser, see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1421&p=9742
16 years ago
orbiter 39964e88fa fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1329#p9121
16 years ago
orbiter 3f3673b6e5 extended balancer:
16 years ago
orbiter d09ddabd09 corrected a design mistake (5-byte hashes not necessary)
16 years ago
orbiter 77ee0765a4 - added domain statistic generation to IndexControlURLs_p.html servlet
16 years ago
orbiter 80a7bc93d6 - added statistical evaluation about domains that appear during crawling
16 years ago
orbiter 05dbba4bab added logging conditions to all fine and finest log line calls
16 years ago
orbiter d3d41e2ee4 - fixed problem with searching with quotes (still not complete, but not as bad as before)
16 years ago
danielr 9ff4fc11da partial fix (images,audio,video) for proxy and content-type problem http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1374
16 years ago
lotus d9d9c522a1 addendum to last commit
16 years ago
lotus 480497f7c9 changed recrawl
16 years ago
orbiter 536e77e8b7 modifications towards a single database operation to read/write http header and cached file at once:
16 years ago
danielr 3c68905540 remove redundant null checks
16 years ago
orbiter 7989335ed6 Preparations to replace the HTCache with a new storage data structure:
16 years ago
danielr be28af50f5 - fixed "yacy2yacy no proxy"-problem
16 years ago
danielr a087090bbb fixed starting crawl results in "No parser available to parse mimetype 'application/octet-stream'"
17 years ago
danielr 621b473b18 * removed some warnings of findbugs (http://findbugs.sf.net)
17 years ago
danielr 17b7845eb5 * refactoring
17 years ago
danielr 3bb870bfcd added final where possible
17 years ago
orbiter 50ef5c406f - refactoring of robots parser (removed opaque Objects[] result vector)
17 years ago
orbiter c3d461d191 - removed superfluous copyright statement
17 years ago
orbiter 3ca98fee42 removed superfluous copyright statement
17 years ago
orbiter 05c26d58d9 fixed missing remove operation in balancer
17 years ago
orbiter 606b323a2d fixed bug that appeared when a new crawl is started
17 years ago
orbiter 28d5703f8a - fixed a bug in Robots.txt loader which could have caused robots.txt files to be loaded from the same domain more than once
17 years ago
orbiter 7b1c9e6aee discovered and removed a (possibly large) memory leak:
17 years ago
orbiter 0f5fe8cc53 refactoring of method calling for objects from kelondroMapDataMining
17 years ago
orbiter 4acf0a61cd refactoring of kelondroObjects (mainly renaming to kelondroMap)
17 years ago
orbiter 1e6d12f146 Major update to BLOB data structures:
17 years ago
orbiter 81f75f5056 - removed unnecessary classes (these objects are much easier to handle using generics)
17 years ago
orbiter 7052f2f61f - added copyright header of ResourceObserver
17 years ago
orbiter 1400cdc91e - refactoring of resourceObserver (moved it to crawler)
17 years ago
orbiter a6719dfd2b - refactoring of robots parser
17 years ago
orbiter e81be7d4f2 added many missing user-agent declarations for yacy http client connections.
17 years ago
orbiter 474e29ce4a added options to configure the 'corporate identity'-icons, the home page link and the greeting line from
17 years ago
orbiter 474659a71f - modified and enhanced the crawl balancer: better list export, fixing of damaged crawl queue at start-up, re-sorting at start-up to enhance domain order
17 years ago
danielr 63eadfdf84 fixed unlimited FileSizeLimit
17 years ago
orbiter b928ae492a some code-cleanup and possible speed enhancements in different core methods
17 years ago
danielr 6a9cc29cdd workaround for IndexOutOfBoundsException in ResultURLs.getExecutorHash() seen @ CrawlResults.html?process=4
17 years ago
danielr 68c38c2d34 - WatchCrawler shows status without JavaScript
17 years ago
orbiter 3330181aa0 refactoring:
17 years ago
danielr 7feae906aa - organize imports
17 years ago
orbiter 2f381b8d7a - fixed at least two causes for a NPE after a use case switch.
17 years ago
orbiter 25192e0d36 added a deletion button to indexControlRWIs that deletes the complete web index
17 years ago
orbiter 8be462986e fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1174&p=7841#p7841
17 years ago
orbiter d9d1c8de70 more protection against remote shutdown attacks: prevent loading using the crawler
17 years ago
orbiter 2f29ab8779 more target server access security
17 years ago
orbiter cfe6790498 - added option to switch between yacy networks, especially between the two default networks (freeworld and intranet),
17 years ago
orbiter dd75b3cabc - patch for bad profiles
17 years ago
orbiter f42c8cf69c updated terminal and dynamic webstructure applet: can now change when crawl is running
17 years ago
orbiter ad0f905124 fix for npe in crawler
17 years ago
orbiter b32736762c enhanced rssTerminal
17 years ago
orbiter fbb712c669 refactoring:
17 years ago
orbiter 1689030ee8 refactoring: moved all crawler classes into their own package
17 years ago