Commit Graph

1049 Commits (daf2e15f59f2d5a6c4efbd45ae371e723bb8de1b)

Author SHA1 Message Date
theli 2e43e744de *) Suppress stacktrace on crawler error for "connect timed out" 20 years ago
theli 36cbe04e3e *) Bugfix for Crawler Redirection Bug 20 years ago
theli b70de495a0 *) Remembering Crawler-isPaused setting 20 years ago
theli e569a84dc0 *) Using the same configuration settings for all indexing threads on server Startup 20 years ago
theli 17be77a468 *) Bugfix for "Crawler data will not be removed from htcache if content parsing failed" 20 years ago
theli 5f55dff297 *) Bugfix for "Binary zeros on the page: Index Creation: Indexing Queue" 20 years ago
allo eb6365c069 local Bootstrapping bug. 20 years ago
theli 330eae7cf3 *) Normalizing CrawlerStartURL now before crawling is started 20 years ago
theli ab894d26bc *) Bugfix for "plasmaSwitchboard.deQueue: null" Bug (hopefully) 20 years ago
theli eaf9f26cc3 *) Bugfix for NULL PROFILE HANDLE 'null' Bug: 20 years ago
rramthun 4cb382decb Adding changes by borg-0300 from http://www.yacy-forum.de/viewtopic.php?t=997 20 years ago
theli ec4c70d722 *) If there are at most 10 entries left while doing an index transfer, these entries will also be appended 20 years ago
theli d4a045d7b1 *) Trying to solve "de.anomic.plasma.plasmaSwitchboard.deQueue': null" Bug 20 years ago
theli ea9a992f05 *) Before the crawler retries to download a URL it checks if the server is already doing a shutdown 20 years ago
theli ea26b84eed *) Bugfix for http://www.yacy-forum.de/viewtopic.php?t=954 20 years ago
theli 0c8a48e2cb *) converting php Session ID to lower case in function isCGI 20 years ago
orbiter e616395c3b latest changes and cut for 0.40 20 years ago
orbiter c47bb1182d bugfix for assortment initialization error 20 years ago
theli 4654eae4e2 *) adding php Session ID to argument in function isCGI 20 years ago
orbiter 25f632dbd9 more DHT bugfixes and better logging of DHT effects 20 years ago
orbiter 5cb00889d9 enhancements to dht selection, search and search presentation 20 years ago
orbiter ba0a486328 moved printStackTrace() to logging 20 years ago
orbiter 3094045d34 fix for http://www.yacy-forum.de/viewtopic.php?p=7454#7454 20 years ago
orbiter cd10370992 several bugfixes and dht selection / logging improvement 20 years ago
orbiter 3610fe6b3a see http://www.yacy-forum.de/viewtopic.php?p=7410#7410 20 years ago
orbiter c8a7a85ce2 fix for http://www.yacy-forum.de/viewtopic.php?p=7384#7384 20 years ago
orbiter 6594541ef5 fix for http://www.yacy-forum.de/viewtopic.php?p=7361#7361 20 years ago
orbiter 7db543a9fa fixes for several dht misbehaviours 20 years ago
orbiter 5716f8521d bug fixes for word ordering and dht index selection 20 years ago
orbiter f5259f29e8 word cache behaviour fix and other fixes 20 years ago
orbiter 2c234e1b82 better log output for search result 20 years ago
theli 89c9faa89e *) More graceful logging output in crawler 20 years ago
orbiter 248c24b60a intermission-feature usage in case of local and remote search 20 years ago
theli b32e7c516c git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@507 6c8d7289-2bf4-0310-a012-ef5d649a1542 20 years ago
theli 86305f051d *) Trying to solve "java.net.BindException: Address already in use: JVM_Bind" Problem 20 years ago
theli 865b9490a2 *) Making DHT Transfer while Crawling configurable 20 years ago
theli 1d83d7e4d7 *) httpdFileHandler.java: 20 years ago
orbiter 2d8557cb10 minor changes 20 years ago
orbiter 91163db52e fix for more time-related problems in proxy 20 years ago
orbiter fb6f238d70 fix for expires-problem 20 years ago
rramthun eacff63eda Typos... 20 years ago
orbiter 40da910f41 bugfixes and automatic news-cleanup 20 years ago
theli 228b04b499 *) Bugfix for "wrong seed-upload timestamp" problem 20 years ago
theli 470839a16a *) Crawler/Session pool settings will now be stored properly into configfile 20 years ago
orbiter 4377e119f3 bugfix for http://www.yacy-forum.de/viewtopic.php?p=6620#6620 20 years ago
orbiter e84a177c49 many bugfixes 20 years ago
orbiter 7e3e9ba0de fix for http://www.yacy-forum.de/viewtopic.php?p=6563#6563 20 years ago
orbiter 1022fbeb65 many YaCyNews fixes 20 years ago
orbiter 13abd8b6e7 added news-creation at crawl start 20 years ago
rramthun f555b9d5f2 Translation, spelling... 20 years ago
orbiter cdbbfd50fb fixed bad remote crawl behavior 20 years ago
orbiter 36707586c7 filtering of jsessionid 20 years ago
rramthun 6f2f54a312 Translation, spelling... 20 years ago
orbiter 81e564edb8 faster crawl profile list cleanup 20 years ago
orbiter ad90f0ad13 activated RWI distribution to DHT for senior peers (default redundancy 3), necessary now for network growth 20 years ago
orbiter b9d18d40cb configuration of proxy idle time in performance menu 20 years ago
orbiter 3470a72d48 fixed div by zero, set default delays, fixed release number format and display 20 years ago
orbiter be1f324fca performance setting for remote indexing configuration and latest changes for 0.39 20 years ago
orbiter c64970fa47 re-implemented proxy-busy-check and fixed some other things 20 years ago
orbiter b73557ed2d better assortment monitoring and enhanced profile menu 20 years ago
orbiter 1f36bf4dae enhanced assortment capacity; added extended WORDS migration 20 years ago
rramthun 0f11399d16 Some corrections... 20 years ago
orbiter 9f505af7aa preparations for bulk remote crawls 20 years ago
orbiter 9c72b4cdec replaced index dump stack by a dump array and limited url number in assortment ram (prevents too much RAM occupation) 20 years ago
orbiter 51962d55bf added 'PPM', page-per-minute statistics 20 years ago
orbiter 159f795f65 bugfix (null pointer exception in assortments) 20 years ago
orbiter 1d2155675b changed assortment memory cache flush 20 years ago
orbiter 19dbed7cc8 code clean-up 20 years ago
orbiter 00f63ea00d fail-safe patch for pattern matching 20 years ago
orbiter 0a6be961ea added pattern organization 20 years ago
orbiter 40036ba69c fixed dht transmission; added url-blacklist blocking also for remote search 20 years ago
orbiter 311e627363 blocking of blacklisted urls in indexReceive and small changes 20 years ago
orbiter 2f0d7ea8d3 removed htcache stati (superfluous now) 20 years ago
orbiter 277048501e bugfix 20 years ago
orbiter 8b89c46afe fixed problem with cache write 20 years ago
orbiter 455ae9f55f fixed htcache-store problem and due-time for remote crawls 20 years ago
theli 55d10b864c *) further improvements in shutdown behaviour 20 years ago
orbiter 419f8fb398 fixed bugs/missing code regarding new crawl stack 20 years ago
orbiter 112c5d3332 the new file-based indexing queue 20 years ago
orbiter 858cd94299 replaced indexing ram-queue by file-based stack-queue 20 years ago
theli 57c30f1d78 *) bugfix for usage of httpc without gzip content encoding 20 years ago
theli 0e2c33ee55 *) Network.html/Network.java: 20 years ago
orbiter 5159a090b0 fixed parser bug with lowercase force (appeared in: http://spellbound.sourceforge.net/) 20 years ago
orbiter 7f7cbc5019 fixed bug with snippets 20 years ago
orbiter eb74fa0c82 fixed a bug with snippet-length 20 years ago
orbiter 86f2aa8478 fixed seed-load date bug (evaluating server date for age computation) 20 years ago
orbiter 664bceced5 removed debug-lines 20 years ago
orbiter 75ebdbc852 enhanced snippet-generation (case where snippet is too long) 20 years ago
orbiter 8a4f297324 fixed/enhanced snippet error-handling; suppression of results where no snippet exists 20 years ago
orbiter 712fe9ef18 bugfixed utf-8 decoding and parser 20 years ago
theli eee6322aaf *) Adding redirection support to plasmaCrawlWorker.java 20 years ago
theli cd279907c0 *) Adding redirection support to plasmaCrawlWorker.java 20 years ago
theli 6697d5e52e *) correcting function mediaExtContains 20 years ago
orbiter 3addf58046 enhanced snippet-loading with threads 20 years ago
orbiter 56d28a16f0 bugfixes 20 years ago
orbiter d6c85228a6 enhanced snippet computation 20 years ago
theli fafda068f9 *) allowing crawler to process resources with statuscode 203 20 years ago
theli aae9a433a6 *) correcting usage of supportedFileExt-List 20 years ago
orbiter 1e7f062350 many bugfixes, memory leak fixes, performance enhancements; new kelondroHashtable; activated snippets 20 years ago
orbiter 68dc2b0c6b added kelondroArray, the basis for upcoming kelondroHash and some bug fixes 20 years ago
orbiter a19541e563 code-enhancements after analysis with AppPerfect 20 years ago
orbiter 85075269a6 extended fail-safe memory-management. prevents too much allocation, too often GC and should help for the 100%CPU-bug 20 years ago
orbiter e3c92818db avoiding OutOfMemoryError routines 20 years ago
orbiter 3e8ee5a46d enhanced caching in kelondroRecords and added better synchronization/finalizer 20 years ago
theli db3ed75728 *) closing stream correctly 20 years ago
orbiter 5d06ded005 enhanced html parser speed 20 years ago
orbiter 5a490aa065 fixed html parser 20 years ago
orbiter a25b5b4986 fixed possible memory leak in htmlScraper: be aware that now links can get lost; further work necessary 20 years ago
theli 9e47ba5ad6 *) adding missing calls for function close() to avoid "too many open file" bug 20 years ago
theli 9a98988c3c *) Bugfix for SSL/NIO Bug 20 years ago
orbiter a1ffc27041 preparations for image/movie/music indexing 20 years ago
orbiter a5b40923b6 added word migration to assortments (start with 'java -classpath classes yacy -migratewords') 20 years ago
theli 890e3f4d4a *) adding missing calls for function close() to avoid "too many open file" bug*) adding 20 years ago
theli 6dd3ec0dc4 *) Adding debug="true" debuglevel="lines,vars,source" to ant build files 20 years ago
orbiter 4f9c30ef49 using mime-type instead of file extension for doctype 20 years ago
theli ee9e110366 *) removing old logging configuration properties from yacy.init 20 years ago
theli c1a4e0dc28 *) changing reference to logger 20 years ago
theli d0083f845f *) changing reference to logger 20 years ago
theli 1b5ae054f8 *) changing reference to logger 20 years ago
theli 68f30811fa *) changing reference to logger 20 years ago
theli fbbea813c5 *) changing references to logger 20 years ago
orbiter 4574fa4ce7 bugfixes 20 years ago
theli 83b41ef2f7 *) Adding timeouts for shutdown 20 years ago
theli ef6851798b *) changing thread priority while parsing a pdf file to avoid 100% CPU usage. 20 years ago
orbiter 33f9315e58 implemented multithreading of indexing 20 years ago
orbiter ca3b4ccaf4 added snippet-routines (not yet finished) 20 years ago
orbiter ee0758fe4d bugfixes/empty-dir-deletion/snippet-test-activation 20 years ago
orbiter 594c591223 changes towards 0.38 20 years ago
orbiter d8fdc2526e added experimental snippet-generation (to be disabled for 0.38) 20 years ago
orbiter 3771b10b89 implemented automated migration indexCache 0.37 -> indexAssortmentCluster 20 years ago
orbiter e89ded9e41 bugfixes 20 years ago
orbiter 650ca3955a added flush-thread for index cache and added language-name mapping in Language_p 20 years ago
orbiter 3d8a2ff937 enhanced parallelization of local/global/remote crawling 20 years ago
orbiter a05d738ea4 enhanced caching, removed bug causing outOfMemory 20 years ago
orbiter 21110dcd5e fixed bugs with open files and caching 20 years ago
orbiter f8f8dd05db fixed "Too many open files" - bug 20 years ago
theli 74eb21f62e *) adding image tag into rss template 20 years ago
orbiter 5f90daa265 implemented localization environment 20 years ago
theli 84f9d8f7f0 *) migrating ant build files to generate a single extension tar per default 20 years ago
orbiter fdd606c8c8 fixed bugs 20 years ago
theli 8bd49ba535 *) setting root dir for all tar files properly 20 years ago
orbiter 0c35171c85 assortment fine-tuning 20 years ago
orbiter 76dc892017 refined assortment 20 years ago
theli 0484c41a84 *) replacing system.xxx.println with logging statements 20 years ago
theli 7994c485f1 *) Trying to set the document title properly 20 years ago
theli 285936d778 *) trying to set document title properly 20 years ago
theli 573a8e8047 *) setting document title properly 20 years ago
orbiter 4b01ff7548 activated assortments, removed write-queues 20 years ago
orbiter e26ac60c3e modified assortment data structures 20 years ago
orbiter 79be6f003d enhanced Assortment class 20 years ago
theli 9ee3e69021 *) Solving "Warning: You did not close the PDF Document" problem when an OutOfMemory Exception occurred ... 20 years ago
orbiter 5c6147a54c introduced assortment structure (generalization of singletons) 20 years ago
theli 73e297f30f *) adding proper default values for RealtimeParsableMimeTypes if something goes wrong with the configuration file 20 years ago
theli 893a662329 *) Adding missing cast statement 20 years ago
theli 361f05978d Multiple updates regarding the yacy seedUpload facility, 20 years ago
theli ddc5675781 *) Correcting typo 20 years ago
theli d2c4e9a55e *) Implementing yacy forum wishlist item: "Pause Crawling" 20 years ago
orbiter 287d2e6f10 further enhanced caching (new cache flush methods) 20 years ago
orbiter 376b917c91 fixed shut-down by stopYACY.sh 20 years ago
orbiter ea478f3975 enhanced indexing-caching 20 years ago
orbiter b4030e5023 implemented serverSwitchActions - action-hooks 20 years ago
theli 6f4d2e5272 *) fixing replace bug. 20 years ago
orbiter 10a4a2741d fixed missing close 20 years ago
orbiter db1da3345d introduced singleton-database 20 years ago
orbiter a9b22647dc fixed bug in indexDump.stack - generation 20 years ago
orbiter 1d7fed87dc redesign of index caching - removed indexCache.db 20 years ago
rramthun 3f85978519 Fixed one spelling mistake, limited input for ICQ numbers to 9 digits and made ICQ number in peer profiles clickable. 20 years ago
theli 1dad015b0b *) Migration of Ant build files 20 years ago
theli 2aa5fe8f50 *) Import statements reorganized 20 years ago
theli 351c86d5d9 *) Migration of optional Content Parser integration 20 years ago
orbiter d0010ff0b0 last changes for release 0.37 20 years ago
orbiter c7c6aaf06e many bug-fixes 20 years ago
orbiter 48650c082c fixed 100%-CPU-Bug in plasmaCondenser 20 years ago
orbiter 995673d795 several bugfixes 20 years ago
orbiter 2de90020ed fixed caching+synchronization+brute-force-denial 20 years ago
orbiter 9156fd53bc fixed bugs in last commit 20 years ago
orbiter e25f2354c2 removed synchronization and thread blockings 20 years ago
theli 58a65b60bd *) synchronized keyword removed from function processLocalCrawling to avoid deadlocks. 20 years ago
theli 65fc650109 *) plasmaCrawlLoader shutdown problem fixed (hopefully) 20 years ago
orbiter ba16da72b4 fixed not-working kelondroRecords-Cache 20 years ago
orbiter 7fb645b0ab enhanced crawling performance, changed memory settings, new performance options 20 years ago
theli fd584c113c *) some minor changes 20 years ago
theli f44b219e44 *) Eclipse has accidentally copied in the wrong file header into the new files (because these headers were accidentally set as default for the whole workspace instead of the project) 20 years ago
theli 081ebd5517 *) I've accidentally used Java 5.0 syntax for enumerations 20 years ago
theli 58b1a0ba40 *) adding a new package for extra content parsers 20 years ago
orbiter 8b31f9e202 enhanced shut-down behaviour & added experimental nio-wrapper for kelondroRA (not active yet) 20 years ago
orbiter 00f223cfc1 fixed post-parsing (a case when the bluelist is empty) 20 years ago
theli c9c0a1f11c *) Trying to speedup local crawling 20 years ago
orbiter 97ec8d65e4 fixed makerelease & clean-up of dead code 20 years ago
(no author) 1fec00bc24 *) Bugfix to avoid Nullpointer-Exceptions 20 years ago
(no author) f39812da91 *) Some performance improvements 20 years ago
orbiter b9203bdb50 bug fixes and code cleaning 20 years ago
orbiter c0807abd33 new crawl/proxy/cache design + fixes 20 years ago
orbiter e7d055b98e very experimental integration of the new generic parser and optional disabling of bluelist filtering in proxy. Does not yet work properly. To disable the disable-feature, the presence of a non-empty bluelist is necessary 20 years ago
orbiter 96516fc9d8 fixed bugs (search+kelondroException, dns) 20 years ago
orbiter a87a17a3c8 prepared generic text parser environment 20 years ago
orbiter e374aca2cd enhanced exception handling in kelondro 20 years ago
orbiter 89eb9a2292 fixed bug with crawl profiles 20 years ago
orbiter 248077d3f0 initial load with yacy 0.36 20 years ago