749 Commits (1027f3d04a267c72aebf6d0fd1504bde3055e3f9)

Author SHA1 Message Date
Michael Peter Christen 528d6763fa - added new solr fields:
12 years ago
Michael Peter Christen 316b5fe116 - added a solr type definition verifier
12 years ago
Michael Peter Christen e8acd542b5 - added faceted drill-down for host and geolocation to solr queries
12 years ago
orbiter 2094df2e4e - correct length computation for BStringObject (bugfix suggested by
12 years ago
Michael Peter Christen 4716546ef5 - reduced memory usage in index transmission using a transformation of
12 years ago
Michael Peter Christen 06b0081fdc fix for NPE during host navigation computation
12 years ago
orbiter acb9f04e80 removed unused classes
12 years ago
Michael Peter Christen 755f5e76cf removed strange assert statements and simplified code in metadata
12 years ago
orbiter ee01c12e56 fixes for putDocument and putMetadata
12 years ago
Michael Peter Christen f9fc5cfaba better check for bad urls in url transmission
12 years ago
Michael Peter Christen 40c0856489 refactoring
12 years ago
Michael Peter Christen 9bece5ac5f enhanced snippet fetch - removed a bug that caused documents to be
12 years ago
Michael Peter Christen 395b78a0d8 using the solr search index to concurrently search within solr and the
12 years ago
Michael Peter Christen e5ef840f40 - renamed DoubleSolrConnector to MirrorSolrConnector and added a
12 years ago
Michael Peter Christen 94a334f128 another fix to the Solr metadata reading process and to the shutdown
12 years ago
Michael Peter Christen b51df6c7e8 - added coordinate storage in solr schema
12 years ago
Michael Peter Christen f9c0e6e950 - Implemented and integrated the URIMetadataNode object which is a
12 years ago
Michael Peter Christen dcc72799c4 better abstraction for result writers using controlled vocabularies and
12 years ago
Michael Peter Christen a12f693ec9 added two response writers for embedded solr interface:
12 years ago
sixcooler f32aa9a49c prevent merge of blobs that can't be handled in memory
12 years ago
Michael Peter Christen 1687737771 Abstraction of HandleMap and HandleSet
12 years ago
Michael Peter Christen e432bb9cd9 better calculation of possible saving in HeapReader index data structure
12 years ago
Michael Peter Christen 9549984c65 documentation/comments
12 years ago
Michael Peter Christen 826967513b changed options in IndexFederated_p to switch on/off parts of the index
12 years ago
orbiter 69e743d9e3 - more abstraction for the RWI index as preparation for solr integration
12 years ago
Michael Peter Christen f0a079ac9f allow larger log entries
13 years ago
Michael Peter Christen 784a4abb18 enhancement in internal data organization which should generate less
13 years ago
Michael Peter Christen f78ce93a80 collection of speed and memory saving hacks
13 years ago
orbiter a196f24f60 prevent enqueueing of non-loggable logging entries
13 years ago
orbiter 482afed07c reduced logging overhead (a bit)
13 years ago
orbiter e76159040b Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
13 years ago
orbiter bbfa497a3c replaced more size() > 0 by !isEmpty()
13 years ago
Michael Peter Christen 83da68c4c1 fixed a memory leak inside the logger which appeared if the log was
13 years ago
orbiter 0cbda0b2b8 - replaced all length() == 0 and size() == 0 with isEmpty()
13 years ago
Michael Peter Christen 1addbc792c use less memory for md5 cache
13 years ago
Michael Peter Christen f32de94723 more logging
13 years ago
Michael Peter Christen 8efc1c1078 - fixed a memory leak (or bad usage) during parsing/snippet fetch
13 years ago
Michael Peter Christen b0c408788b made class methods static where possible
13 years ago
Michael Peter Christen 5bd3c90907 - removed unnecessary semicolons
13 years ago
Michael Peter Christen 132afaf687 removed inaccessible code
13 years ago
Michael Peter Christen 7c1ba99755 removed more unused method parameters
13 years ago
Michael Peter Christen 83701a1b4c removed unused ImageReference package
13 years ago
Michael Peter Christen 0301aba1e9 removed unused method parameters
13 years ago
Michael Peter Christen d3964253ae - added @SuppressWarnings to unused servlet method parameters
13 years ago
Michael Peter Christen ea10766bfd cleaned unnecessary nested code
13 years ago
Michael Peter Christen 1481037820 replaced non-generic array with collection
13 years ago
Michael Peter Christen 613b45f604 - better data structures in secondary search
13 years ago
Michael Peter Christen 8a82609360 - smaller caches to save memory
13 years ago
Michael Peter Christen ce8d4b87d9 fixes for new eclipse 'Juno' warning 'Resource leak'.
13 years ago
Michael Peter Christen 0c345d1559 giving threads names so it's easier to see what's happening during
13 years ago
Michael Peter Christen b9d42fd9c8 using com.google.common.io.Files instead of homebrew methods
13 years ago
Michael Peter Christen de3ef8ad73 removed unimportant warnings
13 years ago
Michael Peter Christen 9264d8b4af removed old navigation practice using subject tags in favor of
13 years ago
Michael Peter Christen 61bb52d55c - using http://purl.org/dc/terms/references to refer from an
13 years ago
Michael Peter Christen 8b53771db2 changed behavior of navigation processing:
13 years ago
Michael Peter Christen bef823c247 close the reader if finished
13 years ago
cominch 9cbfc1a1c0 augmentedProxy, which forwards every proxy request to a
13 years ago
Michael Peter Christen 3b992e6b00 using utf8 String compression in Webstructure database
13 years ago
Michael Peter Christen 2280a7b276 - changed initialization order to prefer allocation of memory for table
13 years ago
Michael Peter Christen 0746308bc2 only the metadata tables shall be able to use the tail cache
13 years ago
Michael Peter Christen 7ec9bef0c3 fix for OOM
13 years ago
Michael Peter Christen 41c02cb10e - less restrictions for usage of Table RAM copy
13 years ago
Michael Peter Christen b8f56a9803 npe bugfix
13 years ago
Michael Peter Christen ba10caf89a lazy initialization of database tables
13 years ago
Michael Peter Christen 701b9a28a0 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
13 years ago
Michael Peter Christen 10c9c17d51 fixed handlemap spread factor and null iterator handling
13 years ago
Michael Peter Christen b0095c8d3c flush the compressor cache when a cleanup is done
13 years ago
Michael Peter Christen 96e9d77270 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
13 years ago
Michael Peter Christen 00f2df1120 a variety of possible memory leak fixes
13 years ago
Michael Peter Christen 3dd8376825 added automatic cleaning of cache if metadata and file database size is
13 years ago
Michael Peter Christen 6bb07afcc3 accept also files with other file prefix; used to read 'foreign' cache
13 years ago
Michael Peter Christen 461a0ce052 removed warnings
13 years ago
Michael Peter Christen 407fdf6968 more bug fixes and performance hacks for search process
13 years ago
Michael Peter Christen a1fe65b115 performance hacks
13 years ago
Michael Peter Christen e0d8643226 - performance hacks
13 years ago
Michael Peter Christen 9b4c699526 enhanced location search:
13 years ago
Michael Peter Christen 1f48d1528b performance hacks
13 years ago
Michael Peter Christen 10da7335ea performance hack: use a hash cache for all hashes that are computed by a
13 years ago
Michael Peter Christen 7c1feefb28 introduced a default 10 second time-out in rwi normalization time
13 years ago
Michael Peter Christen 8d997d55b6 better logging
13 years ago
Michael Peter Christen 43c2c6e588 better logging
13 years ago
Michael Peter Christen c15fcde1c8 add-on to latest commit
13 years ago
Michael Peter Christen cf47d94888 performance hack to parse numbers inside of substrings without actually
13 years ago
Michael Peter Christen 7e0ddbd275 added a "fromCache" flag in Response object to omit one cache.has()
13 years ago
Michael Peter Christen c6a09eab0b synchronization needed
13 years ago
reger 6696cb1313 bugfix: lookup of peernames no result for active peer in page IndexControlRWIs_p.html -> Transfer RWI to other Peer
13 years ago
Michael Peter Christen f294f2e295 bugfix to http://bugs.yacy.net/view.php?id=181
13 years ago
Michael Peter Christen acf8d521a2 fix for http://bugs.yacy.net/view.php?id=126
13 years ago
Michael Peter Christen fa735f4f04 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
13 years ago
Michael Peter Christen 3e1bc9477f Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
13 years ago
Michael Peter Christen 6f8a2fef1f small speed enhancement using a column factory
13 years ago
Roland 'Quix0r' Haeder d10627d591 More sync in close() methods
13 years ago
Roland 'Quix0r' Haeder fbb946f913 Made a method static (Eclipse suggested it), removed unused import, pk=null check does now output a warning in logfile
13 years ago
Michael Peter Christen 89142d1e8d removed (not all) warnings
13 years ago
Michael Peter Christen 15db703808 added missing serialization to remove all warnings
13 years ago
Michael Peter Christen 1795a7325b made HandleSet serializable
13 years ago
Roland 'Quix0r' Haeder a093ccf5eb Now used synchronization in all close() methods to make sure all objects
13 years ago
Michael Peter Christen 0cf3d36eae more tolerance in case of corrupted file
13 years ago
Michael Peter Christen 34f4225d7e less 'wellformed' calls without asserts
13 years ago
Michael Peter Christen ba6aaabc51 refactoring + parser bugfixes
13 years ago
Michael Christen e32055aa15 added stub classes for
13 years ago
Michael Peter Christen 2fc8ecee36 ConcurrentLinkedQueue has a VERY long return time on the .size() method.
13 years ago
Michael Peter Christen 213c8d97f2 use fewer processes in process pool
13 years ago
Michael Peter Christen c639248c23 protection against strange answers from remote peers during search
13 years ago
Michael Peter Christen 1cd711d005 added classes for citation references (for new citation ranking)
13 years ago
Michael Peter Christen e0f1e7d904 added new citation reference data structure that shall be used for a
13 years ago
Michael Peter Christen e18a4f6b74 more tolerant merge iterator
13 years ago
Michael Peter Christen 7e4e3fe5b6 free some memory after parsing html
13 years ago
Michael Peter Christen 4540174fe0 memory hacks
13 years ago
Michael Peter Christen b4409cc803 small redesign of blob column index and usage
13 years ago
Michael Peter Christen d5c1f2746e performance hack
13 years ago
Michael Peter Christen 803963aebd performance hack: better space grow in CharBuffer (speeds up html
13 years ago
Michael Peter Christen e2f8f263e8 changed storage of search words: keep order
13 years ago
Michael Peter Christen 0b67a0a5d8 added a column index for tables in blob files. This is heavily used
13 years ago
Michael Peter Christen e3bb73c3d6 serialized some database access methods
13 years ago
Michael Peter Christen 2ea585d616 fix for host navigator
13 years ago
Michael Peter Christen ef78f22ee1 performance hack
13 years ago
Michael Peter Christen a02fdf8625 better error messages
13 years ago
Michael Peter Christen c6ba44468e timeout = 5000 instead of 3000
13 years ago
low012 8776b84c10 *) small fix to make password change function of reconfigureYACY.sh work
13 years ago
Michael Peter Christen 4901cee3cc suppress auto-tagged subject entries when sending out or receiving
13 years ago
sixcooler 985b78cf89 correct 'avaiable()' to use max of young / eden
13 years ago
sixcooler 4da8746275 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
13 years ago
sixcooler c9aaa9e00a respect non-reserved Memory in GenerationMemoryStrategy
13 years ago
Michael Peter Christen 37f2d1b3e9 replaced Thread initialization with ExecutorService pool for delete
13 years ago
Michael Peter Christen 0d6176804b emergency disabling of GenerationMemoryStrategy because of non-working
13 years ago
Michael Peter Christen 87f0210480 enriched log output to find NPE in HeapReader
13 years ago
Michael Peter Christen 254adea51c small fixes
13 years ago
Michael Peter Christen 49be60a7c8 WorkflowProcess is forced to make small pauses if shortMemoryStatus is
13 years ago
Michael Peter Christen b7bb84c0bb set a limit to CharBuffer object size to fight against bad/too large
13 years ago
Marek Otahal 72adbeae90 !Important: move from Hashtable to HashMap
13 years ago
Marek Otahal f75b5e40e0 little fix in copy()
13 years ago
Michael Christen 216a287a85 Merge commit '6d4e08ed06c5cd28c45981b2ebe31c7f7ec6fd83' into quix0r
13 years ago
Michael Christen 20962a4ed7 added metadata node stub for metadata from blobs
13 years ago
Michael Christen 575dbbaa93 enhancements in Blob retrieval: try to use less CPU resources by testing
13 years ago
Roland 'Quix0r' Haeder 6d4e08ed06 Rewrote filesize() to (hopefully) avoid a NPE, rewrote Blacklist class to concurrent classes to avoid a CME
13 years ago
Roland 'Quix0r' Haeder fa08ed5ae5 Fixed a lot of CHMOD rights (no need for execute flag on *.java/*.html) and introduced local/remote crawl size ratio based check
13 years ago
Michael Christen 9e5894c784 Removed handling of components objects for URIMetadataRows.
13 years ago
Michael Christen c04bfaa51b refactoring
13 years ago
Michael Peter Christen 613ab6a69d added BEncodedHeapBag and BEncodedHeapShard which are storage container
13 years ago
Michael Christen 6fecd0db88 one more performance hack to prevent costly md5 computation
13 years ago
Michael Christen e13441b069 better digest pool size (smaller by default but unlimited)
13 years ago
Michael Christen 1f4afb4dc0 performance hacks
13 years ago
Michael Christen e9dc99fe15 added rules to set specific RWIs as private RWIs which are not
13 years ago
Michael Peter Christen 0bcef2d156 added feature as requested in
13 years ago
Michael Christen 204c29f010 small bugfixes for search result display and cache display
13 years ago
Michael Christen 078fcde0dd bad initialization
13 years ago
Michael Christen 14e45e90fd patch for a bug that I don't understand by now.
13 years ago
Michael Christen 86b3385847 fixed a deadlock during secondary remote search
13 years ago
Michael Christen 404758698a less io operations
13 years ago
Michael Christen 044f83feed added some pauses into the search process which shall produce
13 years ago
sixcooler 448656087a probable fix for http://bugs.yacy.net/view.php?id=94
13 years ago
Michael Christen d35bdc2df6 removed npe
13 years ago
Michael Christen e7e429705a - less automatic indexing after a search (needs to reset the default
13 years ago
Michael Christen 9cd469e6d6 added pull request from als plus an NPE fix
13 years ago
orbiter 83335c3b09 fix for http://bugs.yacy.net/view.php?id=78
13 years ago
orbiter 35a9e8f307 - fixed network graphic
13 years ago
Al Sutton 8993cac4d8 Initial performance improvements
13 years ago
orbiter 8895d8c1cd removed unnecessary log entries
13 years ago
orbiter bc5df0eef5 updated ranking tables (fresh computation)
13 years ago
orbiter 5a55397f99 some last-minute performance hacks
13 years ago
orbiter c9216d5adf fixed secondary remote search (the process that finds distributed join situations)
13 years ago
orbiter 0cf9ebc3b0 speed enhancements when parsing RWI rows (makes search slightly faster)
13 years ago
orbiter 709013385a fix for language fix
13 years ago
orbiter c0c6e9e7a5 fix for bad language encoding
13 years ago
orbiter 05f34a3fa7 added a full, complete, database insert, update and delete API for the tables.
13 years ago
orbiter 3a15e58e28 - increased stability when opening the robots table
13 years ago
orbiter 775b44017e refactoring
13 years ago
orbiter e914a30099 fix for npe
13 years ago
orbiter 0d858d48ec replaced String with StringBuilder in suggestion process
13 years ago
orbiter e58438c01c - added a new retry connector for solr (for cases where solr responses are slow)
13 years ago
orbiter d8d9735b4f stability bugfix
13 years ago
orbiter f121f4bb45 fix for link in Supporter and Suftipps page
13 years ago
orbiter 1b86d06d1e fix for http://bugs.yacy.net/view.php?id=62
13 years ago
orbiter eb9c9edb01 enhanced table method (used by almost all yacy api interfaces)
13 years ago
orbiter 5af9598bd1 enhanced exported row parsing during row import
13 years ago
orbiter 7598a9e26b fix for thread dump
13 years ago
orbiter 8eef8722d1 update to ThreadDump analysis: freerunner and thread state recognition
13 years ago
orbiter 1df43b137d another performance hack
13 years ago
orbiter 7df0643f0e performance hacks
13 years ago
orbiter 813f297a95 another performance hack: re-use of known host addresses for isLocal property; avoids look-up in local hash
13 years ago
orbiter 035ebfbf3b - performance hacks (should affect the crawl balancer and reduce CPU load during crawl stack re-fill)
13 years ago
orbiter b250e6466d implemented crawl restrictions for IP pattern and country lists
13 years ago
orbiter 57d5529a01 performance hacks
13 years ago
orbiter d2ea250d99 refactoring:
13 years ago
orbiter 0c6d95e57b - more tolerance against failure of table opening
13 years ago
orbiter ce2a76d603 performance hack for search process
13 years ago
orbiter 2c4a672fe2 bugfixes and performance hacks for table index
13 years ago
orbiter dad5b586a4 added a concurrent warm-up of Table data structures. That should speed up the start-up process but may also cause higher CPU load at that time.
13 years ago
orbiter 734059d33e performance hacks
13 years ago
orbiter dd4635e323 patches
13 years ago
orbiter 2842ce30d6 added synchronization in ReferenceContainer and logging for shrinking
13 years ago
orbiter cec3836e73 added reference limitation to IndexControlRWIs_p.html servlet
13 years ago
sixcooler ecb4986b38 refactored stuff from last commit to ReferenceContainer
13 years ago
sixcooler f7c4abfdd7 limit references per blob & term to the 100.000 youngest
13 years ago
orbiter 28f5b79deb added a fast mass-deletion method
13 years ago
orbiter 44d6416e2d ensure termination of shrink()
13 years ago
orbiter 52230a6864 replaced catching of Exception with Throwable, which catches also Errors
13 years ago
orbiter e1a3d609aa moved merger object from Segment to IndexCell to enable a correct shutdown sequence. This solves a bug where yacy cannot be shut down during an index merge that appears during the shutdown phase.
13 years ago
sixcooler d40a177c05 Generation Memory Strategy fine tuning
13 years ago