Commit Graph

704 Commits (a16534cb0a60a3cdd873ca1a9b925f84e2b8b77b)

Author SHA1 Message Date
Michael Peter Christen 1795a7325b made HandleSet serializable 13 years ago
Roland 'Quix0r' Haeder a093ccf5eb Now used synchronization in all close() methods to make sure all objects 13 years ago
Michael Peter Christen 0cf3d36eae more tolerance in case of corrupted file 13 years ago
Michael Peter Christen 34f4225d7e less 'wellformed' calls without asserts 13 years ago
Michael Peter Christen ba6aaabc51 refactoring + parser bugfixes 13 years ago
Michael Christen e32055aa15 added stub classes for 13 years ago
Michael Peter Christen 2fc8ecee36 ConcurrentLinkedQueue has a VERY long return time on the .size() method. 13 years ago
Michael Peter Christen 213c8d97f2 use less processes in process pool 13 years ago
Michael Peter Christen c639248c23 protection against strange answers from remote peers during search 13 years ago
Michael Peter Christen 1cd711d005 added classes for citation references (for new citation ranking) 13 years ago
Michael Peter Christen e0f1e7d904 added new citation reference data structure that shall be used for a 13 years ago
Michael Peter Christen e18a4f6b74 more tolerant merge iterator 13 years ago
Michael Peter Christen 7e4e3fe5b6 free some memory after parsing html 13 years ago
Michael Peter Christen 4540174fe0 memory hacks 13 years ago
Michael Peter Christen b4409cc803 small redesign of blob column index and usage 13 years ago
Michael Peter Christen d5c1f2746e performance hack 13 years ago
Michael Peter Christen 803963aebd performance hack: better space grow in CharBuffer (speeds up html 13 years ago
Michael Peter Christen e2f8f263e8 changed storage of search words: keep order 13 years ago
Michael Peter Christen 0b67a0a5d8 added a column index for tables in blob files. This is heavily used 13 years ago
Michael Peter Christen e3bb73c3d6 serialized some database access methods 13 years ago
Michael Peter Christen 2ea585d616 fix for host navigator 13 years ago
Michael Peter Christen ef78f22ee1 performance hack 13 years ago
Michael Peter Christen a02fdf8625 better error messages 13 years ago
Michael Peter Christen c6ba44468e timeout = 5000 instead 3000 13 years ago
low012 8776b84c10 *) small fix to make password change function of reconfigureYACY.sh work 13 years ago
Michael Peter Christen 4901cee3cc suppress auto-tagged subject entries when sending out or receiving 13 years ago
sixcooler 985b78cf89 correct 'avaiable()' to use max of young / eden 13 years ago
sixcooler 4da8746275 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 13 years ago
sixcooler c9aaa9e00a respect non-reserved Memory in GenerationMemoryStrategy 13 years ago
Michael Peter Christen 37f2d1b3e9 replaced Thread initialization with ExecutorService pool for delete 13 years ago
Michael Peter Christen 0d6176804b emergency disabling of GenerationMemoryStrategy because of non-working 13 years ago
Michael Peter Christen 87f0210480 enriched log output to find NPE in HeapReader 13 years ago
Michael Peter Christen 254adea51c small fixes 13 years ago
Michael Peter Christen 49be60a7c8 WorkflowProcess is forced to make small pauses if shortMemoryStatus is 13 years ago
Michael Peter Christen b7bb84c0bb set a limit to CharBuffer object size to fight against bad/too large 13 years ago
Marek Otahal 72adbeae90 !Important: move from Hashtable to HashMap 13 years ago
Marek Otahal f75b5e40e0 little fix in copy() 13 years ago
Michael Christen 216a287a85 Merge commit '6d4e08ed06c5cd28c45981b2ebe31c7f7ec6fd83' into quix0r 13 years ago
Michael Christen 20962a4ed7 added metadata node stub for metadata from blobs 13 years ago
Michael Christen 575dbbaa93 enhancements in Blob retrieval: try to use less CPU resources by testing 13 years ago
Roland 'Quix0r' Haeder 6d4e08ed06 Rewrote filesize() to (hopefully) avoid a NPE, rewrote Blacklist class to concurrent classes to avoid a CME 13 years ago
Roland 'Quix0r' Haeder fa08ed5ae5 Fixed a lot CHMOD rights (no need for execute flag on *.java/*.html) and introduced local/remote crawl size ratio based check 13 years ago
Michael Christen 9e5894c784 Removed handling of components objects for URIMetadataRows. 13 years ago
Michael Christen c04bfaa51b refactoring 13 years ago
Michael Peter Christen 613ab6a69d added BEncodedHeapBag and BEncodedHeapShard which are storage container 13 years ago
Michael Christen 6fecd0db88 one more performance hack to prevent costly md5 computation 13 years ago
Michael Christen e13441b069 better digest pool size (smaller by default but unlimited) 13 years ago
Michael Christen 1f4afb4dc0 performance hacks 13 years ago
Michael Christen e9dc99fe15 added rules to set specific RWIs as private RWIs which are not 13 years ago
Michael Peter Christen 0bcef2d156 added feature as requested in 13 years ago
Michael Christen 204c29f010 small bugfixes for search result display and cache display 13 years ago
Michael Christen 078fcde0dd bad initialization 13 years ago
Michael Christen 14e45e90fd patch for a bug that I don't understand by now. 13 years ago
Michael Christen 86b3385847 fixed a deadlock during secondary remote search 13 years ago
Michael Christen 404758698a less io operations 13 years ago
Michael Christen 044f83feed added some pauses into the search process which shall produce 13 years ago
sixcooler 448656087a probably fix for http://bugs.yacy.net/view.php?id=94 13 years ago
Michael Christen d35bdc2df6 removed npe 13 years ago
Michael Christen e7e429705a - less automatic indexing after a search (needs to reset the default 13 years ago
Michael Christen 9cd469e6d6 added pull request from als plus an NPE fix 13 years ago
orbiter 83335c3b09 fix for http://bugs.yacy.net/view.php?id=78 13 years ago
orbiter 35a9e8f307 - fixed network graphic 13 years ago
Al Sutton 8993cac4d8 Initial performance improvements 13 years ago
orbiter 8895d8c1cd removed unnecessary log entries 13 years ago
orbiter bc5df0eef5 updated ranking tables (fresh computation) 13 years ago
orbiter 5a55397f99 some last-minute performance hacks 13 years ago
orbiter c9216d5adf fixed secondary remote search (the process that finds distributed join situations) 13 years ago
orbiter 0cf9ebc3b0 speed enhancements when parsing RWI rows (makes search slightly faster) 13 years ago
orbiter 709013385a fix for language fix 13 years ago
orbiter c0c6e9e7a5 fix for bad language encoding 13 years ago
orbiter 05f34a3fa7 added a full, complete, database insert, update and delete API for the tables. 13 years ago
orbiter 3a15e58e28 - increased stability when opening the robots table 14 years ago
orbiter 775b44017e refactoring 14 years ago
orbiter e914a30099 fix for npe 14 years ago
orbiter 0d858d48ec replaced String with StringBuilder in suggestion process 14 years ago
orbiter e58438c01c - added a new retry connector for solr (for cases where solr responses are slow) 14 years ago
orbiter d8d9735b4f stability bugfix 14 years ago
orbiter f121f4bb45 fix for link in Supporter and Suftipps page 14 years ago
orbiter 1b86d06d1e fix for http://bugs.yacy.net/view.php?id=62 14 years ago
orbiter eb9c9edb01 enhanced table method (used by almost all yacy api interfaces) 14 years ago
orbiter 5af9598bd1 enhanced exported row parsing during row import 14 years ago
orbiter 7598a9e26b fix for thread dump 14 years ago
orbiter 8eef8722d1 update to ThreadDump analysis: freerunner and thread state recognition 14 years ago
orbiter 1df43b137d another performance hack 14 years ago
orbiter 7df0643f0e performance hacks 14 years ago
orbiter 813f297a95 another performance hack: re-use of known host addresses for isLocal property; avoids look-up in local hash 14 years ago
orbiter 035ebfbf3b - performance hacks (should affect the crawl balancer and reduce CPU load during crawl stack re-fill) 14 years ago
orbiter b250e6466d implemented crawl restrictions for IP pattern and country lists 14 years ago
orbiter 57d5529a01 performance hacks 14 years ago
orbiter d2ea250d99 refactoring: 14 years ago
orbiter 0c6d95e57b - more tolerance against failure of table opening 14 years ago
orbiter ce2a76d603 performance hack for search process 14 years ago
orbiter 2c4a672fe2 bugfixes and performance hacks for table index 14 years ago
orbiter dad5b586a4 added a concurrent warm-up of Table data structures. that should speed-up the start-up process but may also cause stronger CPU load at that time. 14 years ago
orbiter 734059d33e performance hacks 14 years ago
orbiter dd4635e323 patches 14 years ago
orbiter 2842ce30d6 added synchronization in ReferenceContainer and logging for shrinking 14 years ago
orbiter cec3836e73 added reference limitation to IndexControlRWIs_p.html servlet 14 years ago
sixcooler ecb4986b38 refactored stuff from last commit to ReferenceContainer 14 years ago
sixcooler f7c4abfdd7 limit references per blob & term to the 100.000 youngest 14 years ago
orbiter 28f5b79deb added a fast mass-deletion method 14 years ago
orbiter 44d6416e2d ensure termination of shrink() 14 years ago
orbiter 52230a6864 replaced catching of Exception with Throwable, which catches also Errors 14 years ago
orbiter e1a3d609aa moved merger object from Segment to IndexCell to enable a correct shutdown sequence. This solves a bug where yacy cannot be shut down during an index merge that appears during the shutdown phase. 14 years ago
sixcooler d40a177c05 Generation Memory Strategy fine tuning 14 years ago
sixcooler 839f407fe4 Generation Memory Strategy fine tuning: 14 years ago
orbiter a5541751a8 - added memory computation to termlist_p.xml 14 years ago
orbiter 45e497a9bd fix for term iteration 14 years ago
orbiter 2c595a6a47 added new methods to count the number of objects in RWIs. lots of refactoring was necessary to introduce new Rating class and to unify naming of methods 14 years ago
orbiter 75df87832c refactoring/better naming of methods and classes 14 years ago
sixcooler 5f8a5ca32d - not doing merge-jobs while short on Memory 14 years ago
orbiter 965fabfb87 enhanced sorting speed (affects all DB operations) 14 years ago
orbiter 22d69a6368 refactoring in cora: added sorting package 14 years ago
orbiter 51cf697acd refactoring: moved all score-related classes to new ranking package 14 years ago
sixcooler 4fec99115b Implementation of strategies for controlling memory resources. 14 years ago
orbiter 2c58af6874 - added a short memory status simulation mode 14 years ago
orbiter c64faf41e2 addon to svn 7880 14 years ago
sixcooler 411ed159f8 do some extra sleep while running low on memory 14 years ago
sixcooler 07f5954570 try better handling of corrupt blobs 14 years ago
orbiter 0a3ab7da1b do not sort concurrently the same array 14 years ago
orbiter 44d74f8f89 performance hacks for seed generation (because thread dumps showed multiple occurrences at these code points) 14 years ago
sixcooler 5cd07d7f84 early freeing resources on deleting index reference if search-verification fails (aka Switchboard.cleanupJob) 14 years ago
sixcooler 9170a434ed throwing an exception again in FileUtils.copy(reader, writer) 14 years ago
sixcooler 916d79111e Runtime.maxMemory() DOES change @ runtime: 14 years ago
orbiter 1f300217f8 more protection for the cleanup thread 14 years ago
orbiter d13103a0a7 changed the way how the index cache is flushed: do not flush when a put was made because that could cause that many put calls synchronize for a long time when the dump or a merge is performed. Instead a watchdog thread is doing the dump and therefore puts cannot block any more which is good when a put happens during a search result preparation. 14 years ago
orbiter 6a6f27eaf3 do not sort arrays again if arrays are already sorted 14 years ago
orbiter 3d043ce9d6 - refactoring 14 years ago
orbiter 48b78e9ff4 disabling concurrency in new sort since that is not working yet correctly 14 years ago
orbiter 62ac73a108 fixed bugs and deadlocks in core database indexing structures: 14 years ago
orbiter 1912d0cccc changed handling of RowSet element retrieval: until today all elements had been copied from the underlying byte[] arrays into a new Entry object that again had a copy of a portion of that byte[] in its own byte[]. There was an option to just refer to the underlying byte[] with a pointer but that was almost never used. This commit now changes an interface to the Row class where it is now necessary to tell if a copy is always required. Fortunately the copy is only needed in very rare cases. That means that this change should cause much less memory allocation; it is expected that this happens especially during search situations. 14 years ago
orbiter bb8e3f8523 code cleanup 14 years ago
orbiter 11dc653de3 added a visualization of peer pings to the performance graphic 14 years ago
orbiter 6d2e252bcf fix for: 14 years ago
orbiter b666a929e7 fixed Semaphore handling in case of interruptions 14 years ago
orbiter 267290a821 removed the semaphores from the cache dump process because I believe some of the semaphores may be lost somewhere which then causes that the cache is never flushed and then the peer dies from a OOM. The re-introduced synchronization may not be the best solution but should ensure that the caches are flushed. 14 years ago
orbiter f803da8aae code cleanup 14 years ago
orbiter 31283ecd07 - added a search option to filter only specific network protocols. i.e. get only results from ftp servers. Just add '/ftp' to your search. 14 years ago
orbiter 7db208c992 performance hacks: more pre-allocated StringBuilder 14 years ago
orbiter 996f0a8764 disabled assert in Base64Order which eats away too much performance during testing with -l 14 years ago
orbiter f30d36b101 enhanced template engine 14 years ago
orbiter 0c1b29f3c9 - applied many small performance hacks 14 years ago
orbiter fe0c08455b more concurrency (enhancement) hacks 14 years ago
orbiter 87082f407e less String object creation during search 14 years ago
orbiter a36fda991e hack to increase speed of url hash computation 14 years ago
orbiter dbea40d536 - changed snippet fetch strategy logic: do not check if entry is in cache. This should reduce IO load on the HTCACHE which is a showstopper during large number of search requests 14 years ago
orbiter 4bea3f9714 hack to reduce resource contention caused by massive UTF8 decodings which use java.nio resources: 14 years ago
orbiter 746e3c3b06 Replaced a widely-used Property Object in the httpd with HashMap<String, Object> which is not synchronized like Properties 14 years ago
orbiter e28bd0d038 fix for some possible causes of memory leaks 14 years ago
orbiter 09ba6814c0 - non-blocking word hash computation with dynamic digest object generation (this was important!) 14 years ago
orbiter 10e2f588f8 - enhanced ybr ranking computation 14 years ago
orbiter bd55dcee50 - commented out experimental distributed ranking loading 14 years ago
orbiter 3ed4a09368 small features, some bug fixes and performance hacks 14 years ago
orbiter b45701d20f this is a re-implementation of the YaCy Block Rank feature 14 years ago
orbiter d27a0a67ff fix in log initialization according to hint from Dominic 14 years ago
orbiter 123375bfba added a new yacy protocol servlet 'idx'. This returns an index to one of the data entities that is stored in YaCy. 14 years ago
orbiter 5b579e21a3 code cleanup 14 years ago
orbiter 039126cfaf better handling of on/off switched solr indexing 14 years ago
orbiter dc54915df4 fix for very bad compare 14 years ago
orbiter deda54d684 - relaxed matching of string-search (this is now case-insensitive) 14 years ago
orbiter b77b8cac0c - enhanced html parser: recognized much more details in the content 14 years ago
orbiter 17530ca7b5 fix for bug http://bugs.yacy.net/view.php?id=10 14 years ago
orbiter 0430a94eaa the location search shows now not re-evaluated locations but only such locations that are attached as metadata to web pages 14 years ago
orbiter 8412f8787d fix for http://bugs.yacy.net/view.php?id=8 14 years ago
orbiter 9b25d07295 - added geo information parsing to html parser 14 years ago
orbiter b1a8d0c020 enhancements to web cache and less strict caching rules 14 years ago
f1ori df71776929 * fix bug 14 years ago
orbiter 78d4c45d09 enhancement during search process: fast fail of search in case that all index feeder have terminated. 14 years ago
orbiter 2b5f8585bf performance hack for Balancer and ip address parsing 14 years ago
orbiter b1d133b69f another enhancement to the ThreadDump function: better multiple dumps and filtering out of not interesting dump parts 14 years ago
orbiter a35d513bd8 fix for not-deleted .gap and .idx files 14 years ago
orbiter 859c99886c fix for multiple thread dump 14 years ago
orbiter 61acf55da4 avoided using a synchronized(this) for the hash computation to prevent that the lock on the object is (accidentally) stolen by another thread and replaced this synchronization using the protocol object. Made also the protocol object final. 14 years ago
orbiter c2a968c23f fix for bug in formatting in ThreadDump 14 years ago
orbiter 078ecacf61 avoid synchronization in DigestURI hash requests 14 years ago
orbiter 1989ebc24b removed more warnings 14 years ago
orbiter 0324de1467 removed debug line 14 years ago
orbiter 1aba7869bf patch for Windows: do not use the thread lock feature from previous commit if used on Windows 14 years ago
orbiter 0a11727374 added new feature for Thread dump: 14 years ago
orbiter a07a1a8b1e removed type cast warnings 14 years ago
orbiter e6c3507b17 disabled some of the previous changes (did not work in openjdk) 14 years ago
orbiter f9e5c21083 update to thread dump logs 14 years ago
orbiter 8f11d3a5bb redesigned the ScoreMap classes: 14 years ago
orbiter a564230c48 more enhancements against blocked threads occurred in seed age evaluation (blocks httpd in some cases) 14 years ago
orbiter dc0db3550e avoid string conversion 14 years ago
orbiter 694fa3a2a5 - replaced more direct string-based UTF-8 conversions by predefined UTF-8 conversion 14 years ago
orbiter 30aed9824a moved getBytes() to UTF8.getBytes() to use a default String encoding 14 years ago
orbiter 3820525464 more memory protection: auto-flush of caches in case of memory shortage 14 years ago
orbiter 96bb33ed9b added default size to StringBuffer in logger (and it is not possible to replace the StringBuffer with a StringBuilder...) 14 years ago
orbiter e1b6916423 always try to guess the size of a StringBuilder to prevent too many memory re-allocations 14 years ago
low012 3b40b98256 *) set SVN properties 14 years ago
orbiter 619b561a4a enhanced secondary search: index abstracts decompression is now much faster and does not cause strong CPU load after several searches with more than one word 14 years ago
low012 bf27a72d53 *) set SVN properties 14 years ago
low012 b649ce2dd7 *) minor changes 14 years ago
orbiter 70a996a06c reverted SVN 7557 because these classes are called using reflection. The class declaration is in the log configuration. Without these classes you get errors during runtime and a non-formatted log output, i.e.: 14 years ago
orbiter cb1f49d0f2 replaced all 'new String' with default encoding (missing) or UTF-8 encoding with a String generation method that uses a pre-defined Charset constant for UTF-8. This avoids a cache-lookup for the Charset object using String hashing of the String 'UTF-8'. 14 years ago
low012 9d366ee9d7 *) removed unused code (I assume that most of the code was really dead, but if you need any of the classes, tell me and I will put it back in.) 14 years ago
orbiter 7138f4036b less synchronization, better thread dump tool 14 years ago
orbiter 8d14916c74 more patches for a better out-of-memory management 14 years ago
orbiter ce0c8247fc removed (most probably!?!) superfluous System.err output 14 years ago
orbiter 799c534935 one more patch against OOM during secondary remote search 14 years ago
orbiter f8d0454c53 small bug fixes and experiments with search speed enhancement 14 years ago
orbiter 993b9bc1a8 memory/performance hacks, less synchronization, better concurrency 14 years ago
orbiter 42d90664f3 - fixed a memory leak in the httpc.post method (no finish) 14 years ago
orbiter 38dce547c0 better concurrency (less locking on date formatting) more logging and minor bug fixes 14 years ago
orbiter b1781d7aae some more performance hacks 14 years ago
orbiter b2f147d28e performance hack: excluded map encoding in many cases from synchronization block, especially when doing an iteration 14 years ago
orbiter 5e186e0122 continuing the fight against deadlocks during time formatting: better caching. 14 years ago
orbiter 1110d16af9 performance hack: replaced generic row.getColBytes() call with row.getPrimaryKeyBytes() where the column is 0 14 years ago
orbiter 19b2a50578 - enhanced date formatter cache 14 years ago
orbiter 48a61c39a3 speed hacks in BLOB ArrayStack: 14 years ago
orbiter ad7fcb9d61 Enhanced Base64Order transformation: less overhead (transformation between StringBuilder and byte[]) 14 years ago
orbiter 0ce17d823a - fixed bug in ordering 14 years ago
orbiter dec4f36700 - fix for missing favicons in search widgets 14 years ago
orbiter 804ae2275b - do not delete idx and gap files if the heap is not modified 14 years ago
orbiter 5e45ded8e2 - removed locks from WordReference 14 years ago
orbiter cd19d0517e added dns resolve to HTTPClient POST using a dns cache to prevent that the not-thread-safe built-in dns cache inside apache http client is used 14 years ago
orbiter af87af0d4c - removed synchronization in serverSwitch which should improve speed 14 years ago
orbiter d84b4a072e healing for some OOM problems 14 years ago
orbiter 6083f2f171 fix for (false) oom 14 years ago
orbiter fe93caac5a added flags and administration options to show advanced search and to show search result attributes (for each search result) 14 years ago
orbiter 431f780f41 patch for bad data in url metadata 14 years ago
orbiter 0cdfb82963 replaced more appearance of double values by float values 14 years ago
orbiter eb12e15738 moved all Double values to Float values because of 14 years ago
f1ori 982aa689ef * fix StringIndexOutOfBoundException in WebStructureGraph 14 years ago
orbiter 090c73e32e catch a OOM in HeapReader iteration 14 years ago
orbiter feefe17568 npe assert fix 14 years ago
orbiter 733903f2c9 fix for http://forum.yacy-websuche.de/viewtopic.php?p=21489#p21489 14 years ago
orbiter 10ae8d961b - cora package has now no dependencies to other yacy packages and becomes a 'base' package (refactoring) 14 years ago
lotus b1484299b2 same units for memory observer configuration (MiB) 14 years ago
orbiter 387db84087 maybe found bug in non-working index dumper 14 years ago
orbiter a4c9d27287 - moved some variables from Switchboard to new class AccessTracker 14 years ago
orbiter cdfe8afe3f fix for really bad table iteration implementation: reduction of IO 14 years ago
orbiter b2ed4cfaf8 more small bugfixes and light refactoring 14 years ago
orbiter e753027c43 fix for http://forum.yacy-websuche.de/viewtopic.php?p=21439#p21439 14 years ago
orbiter bf4ef1513e - fix for map view 14 years ago
orbiter 56264dcc17 - added CamelCase parser to MultiProtocolURI: generate better to-be-indexed words from urls 14 years ago
orbiter 99a7fe87f9 - removed old intranet scanner (the generic scanner now completely subsumes the old one) 14 years ago
orbiter a563b05b60 enhanced crawler: 14 years ago
orbiter db99db4be9 some redesign of the search-fail-response mechanism: 14 years ago
f1ori 4915d1781a * use local backup-file, if remote network-definition is not available 14 years ago
orbiter f0651e5f2f added image search to yacyinteractive.html 14 years ago
orbiter a9f754c45f removed unused CR accumulation and distribution process 14 years ago
low012 9b3fae9496 *) cleaning up the code a little bit 14 years ago
orbiter 321eb012fe removed two warnings and reverted one change 14 years ago
low012 eb79b952ef *) cleaner code 14 years ago
sixcooler b87bf88ac8 using less memory on merging and rewriting blobs 15 years ago
orbiter 4c50d3428e smaller file size for array stacks to support smaller deletion sizes 15 years ago
orbiter becc463d8a enhanced did-you-mean 15 years ago
orbiter 93c535d111 fixed http://forum.yacy-websuche.de/viewtopic.php?p=21113#p21113 15 years ago