778 Commits (6578ff3ddb1c9300e311360dfa2f02a1f64c7c61)

Author SHA1 Message Date
Michael Peter Christen 14186e815e npe fix
12 years ago
Michael Peter Christen f7e77a21bf Added a citation reference computation for intra-domain link structures.
12 years ago
Michael Peter Christen e20450e798 patch in HTCache and CitationIndex loading in case that a file is
12 years ago
reger 7480e87386 - fix stopword handling for RWI see example http://bugs.yacy.net/view.php?id=247
12 years ago
Michael Peter Christen a1644ca0fd new workflow processor in Segment to enqueue indexing documents to solr
12 years ago
Michael Peter Christen 5344a1c5f7 getting the trash out
12 years ago
orbiter 888a985dc6 set a higher limit for table copy usage
12 years ago
Michael Peter Christen 8dbc80da70 redesign of index.exist-test: this shall now not be done using a single
12 years ago
Michael Peter Christen 44e363f37f refactoring of WorkflowProcessor, added process counter, update of
12 years ago
orbiter aeff31cd44 fix for workflow processor (cause: latest redesign for less threads)
12 years ago
orbiter a1c989002b fix for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=4652
12 years ago
orbiter 7de5b9cfa0 fix for http://bugs.yacy.net/view.php?id=233
12 years ago
Michael Peter Christen bb4bf3d8fd infinity timeout bug protection patch
12 years ago
orbiter e1bfe9d07a - reduction of the concurrently running processes to make YaCy more
12 years ago
Michael Peter Christen c1a2175fbc added transparency to gif image animation and the integration to the
12 years ago
Michael Peter Christen ada3f27de7 added three new fields for a better ranking: references_internal_i,
12 years ago
Michael Peter Christen 342ba1049b - callback fix
12 years ago
orbiter 47114910d5 fix for possible memory leaks
12 years ago
Michael Peter Christen addba047e2 changes in ranking computation
12 years ago
Michael Peter Christen 2b6c79d347 in method exists() also use the new caching-stacks for
12 years ago
Michael Peter Christen 3b1d9dc884 made index storage from DHT search result concurrently. This prevents
12 years ago
orbiter d74472f562 corrected result counter
12 years ago
Michael Peter Christen c95a84103a complete redesign of search process:
12 years ago
Michael Peter Christen 35fa718b77 testing to use solr for portalsearch caused some bugfixing but no full
12 years ago
Michael Peter Christen 089dee1770 - generalized SchemaConfiguration into super-class Configuration and
12 years ago
Michael Peter Christen 788288eb9e added the generation of 50 (!!) new solr fields in the core 'webgraph'.
12 years ago
Michael Peter Christen 91a0401d59 introduced a second core named 'webgraph'. This core will hold the link
12 years ago
Marc Nause 75f9568472 *) only install files from the RELEASE directory
12 years ago
Marc Nause 3bc5ee6e3d *) added protection against CSRF in update download page
12 years ago
reger 3897bb4409 added (manual) urldb migration (link on: Index Administration -> Federated Solr Index)
12 years ago
Michael Peter Christen 38d3feae65 added separate delete commands for the local+remote solr index, the old
12 years ago
Michael Peter Christen 0f5b6f38c1 enhanced root-url detection
12 years ago
Michael Peter Christen 5c0c56cfe1 Preparations to produce a click depth attribute in the search index.
12 years ago
reger 276e63401e small sanitary fixes
12 years ago
Michael Peter Christen 24c9bb35f7 extended the Scheduler: introduced scheduled events
12 years ago
reger ad71747525 fix: set default language to "en"
12 years ago
orbiter 712cc37c40 if maxFileSize < 0 then the file size limit is without limit.
12 years ago
Michael Peter Christen 8fc3679c66 using more pre-compile pattern for split methods
12 years ago
Michael Peter Christen 5e182a566f - added another enumeration method in kelondro data structure to get a
12 years ago
Michael Peter Christen d6b82840f8 added a feature to find similarities in documents.
12 years ago
Michael Peter Christen f5ca5cea44 - added field options to all solr queries. This can be used to restrict
12 years ago
Michael Peter Christen 832eead998 Merge remote-tracking branch 'regerdev/master'
12 years ago
Michael Peter Christen 570e42c4e3 fix for filetype navigator
12 years ago
reger 633fbe9188 Fix Metadata handling
12 years ago
Michael Peter Christen c5f67a5d6d fixed a problem with local search from solr results: now all results
12 years ago
Michael Peter Christen f8f05ecba7 - added a delete button in host browser to delete a complete subpath
12 years ago
Michael Peter Christen a33e2742cb - removed unnecessary synchronized and deadlock in crawler
12 years ago
orbiter 354f0d9acd moved static method from ClusteredScoreMap to MapDataMining because it
12 years ago
Michael Peter Christen 1baf498d59 - show more lines in online log
12 years ago
Michael Peter Christen f2d0418218 because the new PngEncoder had a problem with the PixelGrabber which is
12 years ago
orbiter 276dd6452b removed warnings
12 years ago
Michael Peter Christen ce0e5b1e17 - more refactoring / private methods
12 years ago
Michael Peter Christen ccc3760a47 Refactoring and redesign of data architecture to make URIMetadataRow
12 years ago
Michael Peter Christen b400fc7b4d fix for file parser problem
12 years ago
Michael Peter Christen e5b3c172ff removed hack which translated Solr documents to virtual RWI entries
12 years ago
Michael Peter Christen 6017691522 added an exception catch
12 years ago
Michael Peter Christen 43f3345c90 - removed dependencies from URIMetadataRow and made direct access to
12 years ago
Michael Peter Christen 21fe8339b4 - enhanced generation of url objects
12 years ago
Michael Peter Christen 613cf7da7f enhancement to post argument parsing - possible fix to zero-filled
12 years ago
Michael Peter Christen 5f0ab25382 removed the option to prevent removal of &amp; parts inside of the
12 years ago
Michael Peter Christen a06930662c replaced some more .getBytes() with UTF8/ASCII.getBytes()
12 years ago
Michael Peter Christen 2f536cb54d code cleanup: removed unused methods and made more methods and objects
12 years ago
Michael Peter Christen 584663ae8c - redesign of solr query construction
12 years ago
Michael Peter Christen a8167e6e5b clean-up: removed unused methods in kelondro
12 years ago
Michael Peter Christen 24d2ee3c52 - better date ranking
13 years ago
Michael Peter Christen ca313e404f - if a "/date" modifier is used, the solr remote query applies an
13 years ago
Michael Peter Christen 24f4ca4d85 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
13 years ago
apfelmaennchen 116f429e35 fix for java.lang.RuntimeException: TableColumnIndex not available...
13 years ago
Michael Peter Christen 1533bfd63b refactoring
13 years ago
Michael Peter Christen 872f83ebe0 refactoring
13 years ago
Michael Peter Christen 8219a445f3 refactoring
13 years ago
Michael Peter Christen 00c1c777fa refactoring
13 years ago
orbiter 563d584420 removed more dependencies in cora from kelondro
13 years ago
Michael Peter Christen e072632a54 no complaints about memory if the database is empty
13 years ago
Michael Peter Christen e65cecc419 - updated lucene libraries to 3.6.1
13 years ago
Michael Peter Christen 4d29f59a27 removed warnings
13 years ago
Michael Peter Christen 8c099d2106 Merge remote-tracking branch 'origin/master'
13 years ago
apfelmaennchen d31a632951 - added dmoz RDF dump importer
13 years ago
Michael Peter Christen d8425e6809 added collections to crawl monitor
13 years ago
Michael Peter Christen 528d6763fa - added new solr fields:
13 years ago
Michael Peter Christen 316b5fe116 - added a solr type definition verifier
13 years ago
Michael Peter Christen e8acd542b5 - added faceted drill-down for host and geolocation to solr queries
13 years ago
orbiter 2094df2e4e - correct length computation for BStringObject (bugfix suggested by
13 years ago
Michael Peter Christen 4716546ef5 - reduced memory usage in index transmission using a transformation of
13 years ago
Michael Peter Christen 06b0081fdc fix for NPE during host navigation computation
13 years ago
orbiter acb9f04e80 removed unused classes
13 years ago
Michael Peter Christen 755f5e76cf removed strange assert statements and simplified code in metadata
13 years ago
orbiter ee01c12e56 fixes for putDocument and putMetadata
13 years ago
Michael Peter Christen f9fc5cfaba better check for bad urls in url transmission
13 years ago
Michael Peter Christen 40c0856489 refactoring
13 years ago
Michael Peter Christen 9bece5ac5f enhanced snippet fetch - removed a bug that caused documents to be
13 years ago
Michael Peter Christen 395b78a0d8 using the solr search index to concurrently search within solr and the
13 years ago
Michael Peter Christen e5ef840f40 - renamed DoubleSolrConnector to MirrorSolrConnector and added a
13 years ago
Michael Peter Christen 94a334f128 another fix to the Solr metadata reading process and to the shutdown
13 years ago
Michael Peter Christen b51df6c7e8 - added coordinate storage in solr schema
13 years ago
Michael Peter Christen f9c0e6e950 - Implemented and integrated the URIMetadataNode object which is a
13 years ago
Michael Peter Christen dcc72799c4 better abstraction for result writers using controlled vocabularies and
13 years ago
Michael Peter Christen a12f693ec9 added two response writers for embedded solr interface:
13 years ago
sixcooler f32aa9a49c prevent merge of blobs that can't be handled in memory
13 years ago
Michael Peter Christen 1687737771 Abstraction of HandleMap and HandleSet
13 years ago
Michael Peter Christen e432bb9cd9 better calculation of possible saving in HeapReader index data structure
13 years ago
Michael Peter Christen 9549984c65 documentation/comments
13 years ago
Michael Peter Christen 826967513b changed options in IndexFederated_p to switch on/off parts of the index
13 years ago
orbiter 69e743d9e3 - more abstraction for the RWI index as preparation for solr integration
13 years ago
Michael Peter Christen f0a079ac9f allow larger log entries
13 years ago
Michael Peter Christen 784a4abb18 enhancement in internal data organization which should generate less
13 years ago
Michael Peter Christen f78ce93a80 collection of speed and memory saving hacks
13 years ago
orbiter a196f24f60 prevent enqueueing of non-loggeable logging entries
13 years ago
orbiter 482afed07c reduced logging overhead (a bit)
13 years ago
orbiter e76159040b Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
13 years ago
orbiter bbfa497a3c replaced more size() > 0 by !isEmpty()
13 years ago
Michael Peter Christen 83da68c4c1 fixed a memory leak inside the logger which appeared if the log was
13 years ago
orbiter 0cbda0b2b8 - replaced all length() == 0 and size() == 0 with isEmpty()
13 years ago
Michael Peter Christen 1addbc792c use less memory for md5 cache
13 years ago
Michael Peter Christen f32de94723 more logging
13 years ago
Michael Peter Christen 8efc1c1078 - fixed a memory leak (or bad usage) during parsing/snippet fetch
13 years ago
Michael Peter Christen b0c408788b made class methods static where possible
13 years ago
Michael Peter Christen 5bd3c90907 - removed unnecessary semicolons
13 years ago
Michael Peter Christen 132afaf687 removed unaccessible code
13 years ago
Michael Peter Christen 7c1ba99755 removed more unused method parameters
13 years ago
Michael Peter Christen 83701a1b4c removed unused ImageReference package
13 years ago
Michael Peter Christen 0301aba1e9 removed unused method parameters
13 years ago
Michael Peter Christen d3964253ae - added @SuppressWarnings to unused servlet method parameters
13 years ago
Michael Peter Christen ea10766bfd cleaned unnecessary nested code
13 years ago
Michael Peter Christen 1481037820 replaced non-generic array with collection
13 years ago
Michael Peter Christen 613b45f604 - better data structures in secondary search
13 years ago
Michael Peter Christen 8a82609360 - smaller caches to save memory
13 years ago
Michael Peter Christen ce8d4b87d9 fixes for new eclipse 'Juno' warning 'Resource leak'.
13 years ago
Michael Peter Christen 0c345d1559 giving threads names so it's easier to see what's happening during
13 years ago
Michael Peter Christen b9d42fd9c8 using com.google.common.io.Files instead of homebrew methods
13 years ago
Michael Peter Christen de3ef8ad73 removed unimportant warnings
13 years ago
Michael Peter Christen 9264d8b4af removed old navigation practice using subject tags in favor of
13 years ago
Michael Peter Christen 61bb52d55c - using http://purl.org/dc/terms/references to refer from an
13 years ago
Michael Peter Christen 8b53771db2 changed behavior of navigation processing:
13 years ago
Michael Peter Christen bef823c247 close the reader if finished
13 years ago
cominch 9cbfc1a1c0 augmentedProxy, which forwards every proxy request to a
13 years ago
Michael Peter Christen 3b992e6b00 using utf8 String compression in Webstructure database
13 years ago
Michael Peter Christen 2280a7b276 - changed initialization order to prefer allocation of memory for table
13 years ago
Michael Peter Christen 0746308bc2 only the metadata tables shall be able to use the tail cache
13 years ago
Michael Peter Christen 7ec9bef0c3 fix for OOM
13 years ago
Michael Peter Christen 41c02cb10e - less restrictions for usage of Table RAM copy
13 years ago
Michael Peter Christen b8f56a9803 npe bugfix
13 years ago
Michael Peter Christen ba10caf89a lazy initialization of database tables
13 years ago
Michael Peter Christen 701b9a28a0 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
13 years ago
Michael Peter Christen 10c9c17d51 fixed handlemap spread factor and null iterator handling
13 years ago
Michael Peter Christen b0095c8d3c flush the compressor cache when a cleanup is done
13 years ago
Michael Peter Christen 96e9d77270 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
13 years ago
Michael Peter Christen 00f2df1120 a variety of possible memory leak fixes
13 years ago
Michael Peter Christen 3dd8376825 added automatic cleaning of cache if metadata and file database size is
13 years ago
Michael Peter Christen 6bb07afcc3 accept also files with other file prefix; used to read 'foreign' cache
13 years ago