Commit Graph

426 Commits (f1643228f550d66ab9605e347bb6af86eff72375)

Author SHA1 Message Date
orbiter 7fc822a59b changed handling of time-zones
20 years ago
theli 9b7f37fc37 *) Minor changes
20 years ago
theli b5a8992d29 *) Setting some object fields to final
20 years ago
theli 023be89586 *) Bugfix for "Robots.txt is loaded again and again"
20 years ago
theli 35c6c5ead7 *) Bugfix for "Blacklist and Crawling" Bug.
20 years ago
orbiter 9e2fc7e5fe load balancing of crawl target domains
20 years ago
orbiter 3fcc95a82c integrated crawl-profiles db in memory-performance monitor
20 years ago
theli fe6a6abc0b *) Adding robots.txt db to Performance Settings for Memory menu
20 years ago
orbiter 3274ae725e increased cache size of robots database; however, this should be integrated into new memory control
20 years ago
orbiter c6d2f50375 changed order of robots and double-check
20 years ago
orbiter 68d5ff2ef1 added stringbuffer in condenser
20 years ago
orbiter 495bc8bec6 removed cache-control from low and medium priority caches which reduces memory use and computation overhead
20 years ago
orbiter 18d9e1a256 fix for http://www.yacy-forum.de/viewtopic.php?p=10026#10026
20 years ago
orbiter 07f30931ec various configuration options in memory performance
20 years ago
theli b990dc1ad1 *) Replacing jsch 0.1.19 lib with newer version 0.1.21
20 years ago
borg-0300 6d1de8abfd finals; cleaned;
20 years ago
orbiter 14bc880fa4 fixed bug with crashed profile database
20 years ago
orbiter 71a31f0902 integrated and extended new memory performance menu; found and fixed bug in DHT caching
20 years ago
orbiter fb52a82008 added new performance page for memory settings
20 years ago
orbiter cddd9aaa33 fixed SERIOUS bug with kelondroStack; affected all stack processing since 729
20 years ago
orbiter 416c126815 fix for a profile = null problem and new monitor in crawl queue
20 years ago
orbiter 2148c0cf49 replaced kelondro storage core; much less objects in kelondro cache now; less IO from DB
20 years ago
theli beefddf0e8 *) Adding option which allows doing an Index-Transfer without deletion of index
20 years ago
rramthun 4036ee812a Updated German language file
20 years ago
theli 40925f4fb7 *) Improving complete index transfer performance by automatically increasing size of transferred word chunk
20 years ago
theli 91ab4d044b *) Adding automatic retry functionality to complete index transfer function
20 years ago
theli a62677f761 *) Adding additional logging output for complete index transfer
20 years ago
theli b991d2e7dd *) Additional logging message for complete index transfer
20 years ago
theli 3c00c5f6c7 *) Complete Index Transfer
20 years ago
theli 2cb084d426 *) Complete Index Transfer
20 years ago
theli d1de71e9f6 *) Suppress stacktrace on proxy error for "No route to host Exception"
20 years ago
theli 56160cbd01 *) Bugfix for "YaCy miscounts ..." Bug.
20 years ago
orbiter 43b42854a0 fix for null-entries and http://www.yacy-forum.de/viewtopic.php?p=8649
20 years ago
theli 3587407039 *) Fixing problems of list operation if index and queue size are both 0.
20 years ago
theli 51b48a10e8 *) Suppress stacktrace on proxy error for "ValidatorException: No trusted certificate found"
20 years ago
theli 7fe8784231 *) URLs pointing to a server having a private IP address will not be indexed anymore
20 years ago
theli 0aafb83edc *) Bugfix for robots.txt isDisallowed Check.
20 years ago
borg-0300 8260128ee9 changed getFreeSize();
20 years ago
theli f8ad65eae1 *) First trial implementation of robots.txt support
20 years ago
borg-0300 0a57fbcde5 Added new HashSet filesInUse;
20 years ago
borg-0300 8cd6a52dd0 Convention
20 years ago
borg-0300 c0e3d18bbf *) remove import java.lang
20 years ago
borg-0300 b1cd1fa917 cleaned
20 years ago
borg-0300 da9c6857fb *) corrected a misunderstanding, no BUG ;)
20 years ago
borg-0300 fbac053c03 small change
20 years ago
theli 578f36ae18 *) Speedup of indexer. Proxy files will not be enqueued by the cachemanager
20 years ago
theli 1219ef99f0 *) Bugfix for NullPointerException in yacyDebugMode Init
20 years ago
theli 6c722706b7 *) Moving yacyDebugMode initialization to switchboard
20 years ago
theli 4e07828807 *) httpdProxyHandler.java
20 years ago
borg-0300 81cb8feb15 back to 649 :/
20 years ago
borg-0300 5194511e8e *) attempt to find bug
20 years ago
theli 6991b9e2b9 *) Suppress stacktrace on crawler error for "Connection reset"
20 years ago
theli a47f9238fe *) Blacklist is now also used by the crawler
20 years ago
theli dc0a2d4c11 *) Bugfix for Loader Queue:
20 years ago
theli 732a107160 *) Bugfix for "-UNRESOLVED_PATTERN-" Bug on IndexCreateWWWLocalQueue_p.html and "urlEntry.url() == null" Bug
20 years ago
theli 33aaffbfc6 *) Displaying content size of each entry in indexing queue
20 years ago
borg-0300 7626823519 BUGFIX for last 'commit'
20 years ago
borg-0300 971756e8dd the delete size is smaller
20 years ago
theli 0471019606 *) IndexCreateIndexingQueue_p.html now also shows indexing jobs that are currently in process
20 years ago
borg-0300 cc493ef8c1 Added change from Hermes
20 years ago
theli bead8a32aa *) IndexCreate_p.java:
20 years ago
theli 48aaf703cc *) Adding additional logging output to detect crawling problems
20 years ago
theli 59b8a98c7e *) Bugfix for suppressing of stacktrace in log on crawler error "MalformedURLException"
20 years ago
borg-0300 c1d7527929 better cache cleanup
20 years ago
theli 2e6df95786 *) adding toString method
20 years ago
theli 4fd5b95b1f *) Renaming Logger function names to reflect the proper Java Logging API Loglevels
20 years ago
theli 6adf8a4bde *) Renaming Logger function names to reflect the proper Java Logging API Loglevels
20 years ago
theli f19c09b227 *) Suppress stacktrace on crawler error for "MalformedURLException"
20 years ago
theli cc1df08069 *) Adding missing synchronized blocks
20 years ago
borg-0300 bf14e6def5 *) proxyCache, proxyCacheSize can be changed under 'Proxy Indexing'
20 years ago
theli 9b818b1ce3 *) Pausing Crawlers if there is not enough space on disk
20 years ago
theli b33094e925 *) Trying to solve "Too many open files bug"
20 years ago
theli 34790acf02 *) Bugfix for suppressing of stacktrace in log on crawler error "unknown host"
20 years ago
theli af7b8f75bd *) Making proxyAccessLogging configurable via yacy.logging file
20 years ago
theli 2a081c9ee5 *) Adding additional logging message for "NURL.entry() == null" Bug
20 years ago
theli cb1f11c96b *) Suppress stacktrace on crawler error for "Unknown Host"
20 years ago
theli e338a13de3 *) Suppress stacktrace on crawler error for "Read timed out"
20 years ago
theli 2e43e744de *) Suppress stacktrace on crawler error for "connect timed out"
20 years ago
theli 36cbe04e3e *) Bugfix for Crawler Redirection Bug
20 years ago
theli b70de495a0 *) Remembering Crawler-isPaused setting
20 years ago
theli e569a84dc0 *) Using the same configuration settings for all indexing threads on server Startup
20 years ago
theli 17be77a468 *) Bugfix for "Crawler data will not be removed from htcache if content parsing failed"
20 years ago
theli 5f55dff297 *) Bugfix for "Binary zeros on the page: Index Creation: Indexing Queue"
20 years ago
allo eb6365c069 local Bootstrapping bug.
20 years ago
theli 330eae7cf3 *) Normalizing CrawlerStartURL now before crawling is started
20 years ago
theli ab894d26bc *) Bugfix for "plasmaSwitchboard.deQueue: null" Bug (hopefully)
20 years ago
theli eaf9f26cc3 *) Bugfix for NULL PROFILE HANDLE 'null' Bug:
20 years ago
rramthun 4cb382decb Adding changes by borg-0300 from http://www.yacy-forum.de/viewtopic.php?t=997
20 years ago
theli ec4c70d722 *) If there are at most 10 entries left while doing an index transfer, these entries will also be appended
20 years ago
theli d4a045d7b1 *) Trying to solve "de.anomic.plasma.plasmaSwitchboard.deQueue': null" Bug
20 years ago
theli ea9a992f05 *) Before the crawler retries to download a URL it checks if the server is already doing a shutdown
20 years ago
theli ea26b84eed *) Bugfix for http://www.yacy-forum.de/viewtopic.php?t=954
20 years ago
theli 0c8a48e2cb *) converting PHP Session ID to lower case in function isCGI
20 years ago
orbiter e616395c3b latest changes and cut for 0.40
20 years ago
orbiter c47bb1182d bugfix for assortment initialization error
20 years ago
theli 4654eae4e2 *) adding PHP Session ID to argument in function isCGI
20 years ago
orbiter 25f632dbd9 more DHT bugfixes and better logging of DHT effects
20 years ago
orbiter 5cb00889d9 enhancements to dht selection, search and search presentation
20 years ago
orbiter ba0a486328 moved printStackTrace() to logging
20 years ago
orbiter 3094045d34 fix for http://www.yacy-forum.de/viewtopic.php?p=7454#7454
20 years ago
orbiter cd10370992 several bugfixes and dht selection / logging improvement
20 years ago
orbiter 3610fe6b3a see http://www.yacy-forum.de/viewtopic.php?p=7410#7410
20 years ago
orbiter c8a7a85ce2 fix for http://www.yacy-forum.de/viewtopic.php?p=7384#7384
20 years ago
orbiter 6594541ef5 fix for http://www.yacy-forum.de/viewtopic.php?p=7361#7361
20 years ago
orbiter 7db543a9fa fixes for several dht misbehaviours
20 years ago
orbiter 5716f8521d bug fixes for word ordering and dht index selection
20 years ago
orbiter f5259f29e8 word cache behaviour fix and other fixes
20 years ago
orbiter 2c234e1b82 better log output for search result
20 years ago
theli 89c9faa89e *) More graceful logging output in crawler
20 years ago
orbiter 248c24b60a intermission-feature usage in case of local and remote search
20 years ago
theli b32e7c516c git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@507 6c8d7289-2bf4-0310-a012-ef5d649a1542
20 years ago
theli 86305f051d *) Trying to solve "java.net.BindException: Address already in use: JVM_Bind" Problem
20 years ago
theli 865b9490a2 *) Making DHT Transfer while Crawling configurable
20 years ago
theli 1d83d7e4d7 *) httpdFileHandler.java:
20 years ago
orbiter 2d8557cb10 minor changes
20 years ago
orbiter 91163db52e fix for more time-related problems in proxy
20 years ago
orbiter fb6f238d70 fix for expires-problem
20 years ago
rramthun eacff63eda Typos...
20 years ago
orbiter 40da910f41 bugfixes and automatic news-cleanup
20 years ago
theli 228b04b499 *) Bugfix for "wrong seed-upload timestamp" problem
20 years ago
theli 470839a16a *) Crawler/Session pool settings will now be stored properly into configfile
20 years ago
orbiter 4377e119f3 bugfix for http://www.yacy-forum.de/viewtopic.php?p=6620#6620
20 years ago
orbiter e84a177c49 many bugfixes
20 years ago
orbiter 7e3e9ba0de fix for http://www.yacy-forum.de/viewtopic.php?p=6563#6563
20 years ago
orbiter 1022fbeb65 many YaCyNews fixes
20 years ago
orbiter 13abd8b6e7 added news-creation at crawl start
20 years ago
rramthun f555b9d5f2 Translation, spelling...
20 years ago
orbiter cdbbfd50fb fixed bad remote crawl behavior
20 years ago
orbiter 36707586c7 filtering of jsessionid
20 years ago
rramthun 6f2f54a312 Translation, spelling...
20 years ago
orbiter 81e564edb8 faster crawl profile list cleanup
20 years ago
orbiter ad90f0ad13 activated RWI distribution to DHT for senior peers (default redundancy 3), necessary now for network growth
20 years ago
orbiter b9d18d40cb configuration of proxy idle time in performance menu
20 years ago
orbiter 3470a72d48 fixed div by zero, set default delays, fixed release number format and display
20 years ago
orbiter be1f324fca performance setting for remote indexing configuration and latest changes for 0.39
20 years ago
orbiter c64970fa47 re-implemented proxy-busy-check and fixed some other things
20 years ago
orbiter b73557ed2d better assortment monitoring and enhanced profile menu
20 years ago
orbiter 1f36bf4dae enhanced assortment capacity; added extended WORDS migration
20 years ago
rramthun 0f11399d16 Some corrections...
20 years ago
orbiter 9f505af7aa preparations for bulk remote crawls
20 years ago
orbiter 9c72b4cdec replaced index dump stack by a dump array and limited url number in assortment ram (prevents too much RAM occupation)
20 years ago
orbiter 51962d55bf added 'PPM', page-per-minute statistics
20 years ago
orbiter 159f795f65 bugfix (null pointer exception in assortments)
20 years ago
orbiter 1d2155675b changed assortment memory cache flush
20 years ago
orbiter 19dbed7cc8 code clean-up
20 years ago
orbiter 00f63ea00d fail-safe patch for pattern matching
20 years ago
orbiter 0a6be961ea added pattern organization
20 years ago
orbiter 40036ba69c fixed dht transmission; added url-blacklist blocking also for remote search
20 years ago
orbiter 311e627363 blocking of blacklisted urls in indexReceive and small changes
20 years ago
orbiter 2f0d7ea8d3 removed htcache stati (superfluous now)
20 years ago
orbiter 277048501e bugfix
20 years ago
orbiter 8b89c46afe fixed problem with cache write
20 years ago
orbiter 455ae9f55f fixed htcache-store problem and due-time for remote crawls
20 years ago
theli 55d10b864c *) further improvements in shutdown behaviour
20 years ago
orbiter 419f8fb398 fixed bugs/missing code regarding new crawl stack
20 years ago
orbiter 112c5d3332 the new file-based indexing queue
20 years ago
orbiter 858cd94299 replaced indexing ram-queue by file-based stack-queue
20 years ago
theli 57c30f1d78 *) bugfix for usage of httpc without gzip content encoding
20 years ago
theli 0e2c33ee55 *) Network.html/Network.java:
20 years ago
orbiter 5159a090b0 fixed parser bug with lowercase force (appeared in: http://spellbound.sourceforge.net/)
20 years ago
orbiter 7f7cbc5019 fixed bug with snippets
20 years ago
orbiter eb74fa0c82 fixed a bug with snippet-length
20 years ago
orbiter 86f2aa8478 fixed seed-load date bug (evaluating server date for age computation)
20 years ago
orbiter 664bceced5 removed debug-lines
20 years ago
orbiter 75ebdbc852 enhanced snippet-generation (case where snippet is too long)
20 years ago
orbiter 8a4f297324 fixed/enhanced snippet error-handling; suppression of results where no snippet exists
20 years ago
orbiter 712fe9ef18 bugfixed utf-8 decoding and parser
20 years ago
theli eee6322aaf *) Adding redirection support to plasmaCrawlWorker.java
20 years ago
theli cd279907c0 *) Adding redirection support to plasmaCrawlWorker.java
20 years ago
theli 6697d5e52e *) correcting function mediaExtContains
20 years ago
orbiter 3addf58046 enhanced snippet-loading with threads
20 years ago
orbiter 56d28a16f0 bugfixes
20 years ago
orbiter d6c85228a6 enhanced snippet computation
20 years ago
theli fafda068f9 *) allowing crawler to process resources with statuscode 203
20 years ago
theli aae9a433a6 *) correcting usage of supportedFileExt-List
20 years ago
orbiter 1e7f062350 many bugfixes, memory leak fixes, performance enhancements; new kelondroHashtable; activated snippets
20 years ago
orbiter 68dc2b0c6b added kelondroArray, the basis for upcoming kelondroHash and some bug fixes
20 years ago
orbiter a19541e563 code-enhancements after analysis with AppPerfect
20 years ago
orbiter 85075269a6 extended fail-safe memory management. prevents too much allocation, too frequent GC and should help for the 100%CPU-bug
20 years ago
orbiter e3c92818db avoiding OutOfMemoryError routines
20 years ago
orbiter 3e8ee5a46d enhanced caching in kelondroRecords and added better synchronization/finalizer
20 years ago
theli db3ed75728 *) closing stream correctly
20 years ago
orbiter 5d06ded005 enhanced html parser speed
20 years ago
orbiter 5a490aa065 fixed html parser
20 years ago
orbiter a25b5b4986 fixed possible memory leak in htmlScraper: be aware that now links can get lost; further work necessary
20 years ago
theli 9e47ba5ad6 *) adding missing calls for function close() to avoid "too many open file" bug
20 years ago
theli 9a98988c3c *) Bugfix for SSL/NIO Bug
20 years ago
orbiter a1ffc27041 preparations for image/movie/music indexing
20 years ago
orbiter a5b40923b6 added word migration to assortments (start with 'java -classpath classes yacy -migratewords')
20 years ago
theli 890e3f4d4a *) adding missing calls for function close() to avoid "too many open file" bug
20 years ago
theli 6dd3ec0dc4 *) Adding debug="true" debuglevel="lines,vars,source" to ant build files
20 years ago
orbiter 4f9c30ef49 using mime-type instead of file extension for doctype
20 years ago
theli ee9e110366 *) removing old logging configuration properties from yacy.init
20 years ago
theli c1a4e0dc28 *) changing reference to logger
20 years ago
theli d0083f845f *) changing reference to logger
20 years ago
theli 1b5ae054f8 *) changing reference to logger
20 years ago
theli 68f30811fa *) changing reference to logger
20 years ago
theli fbbea813c5 *) changing references to logger
20 years ago
orbiter 4574fa4ce7 bugfixes
20 years ago
theli 83b41ef2f7 *) Adding timeouts for shutdown
20 years ago