1645 Commits (a6ad1d6fd1d03eea9f524878586c61f6f88c6a8c)

Author SHA1 Message Date
reger 469efcdb9d fix: display and calculate authors and namespace search navigator if configured (otherwise skip overhead)
12 years ago
Michael Peter Christen eca68fa197 added debug code to crawler monitor
12 years ago
Michael Peter Christen 205f8b222b Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
orbiter ee612e8b93 start the local search only if this peer is doing a remote search or
12 years ago
Michael Peter Christen d465773a37 - removed multi-add of documents (not used)
12 years ago
Michael Peter Christen a1a4d9aa94 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
Michael Peter Christen b7004043ea - added a field cache for solr queries which call only for a single
12 years ago
orbiter 5aa5202adf fixes for filesystem indexing
12 years ago
Michael Peter Christen efd2c4622d added a new fail type attribute for the index to distinguish two
12 years ago
Michael Peter Christen 5e182a566f - added another enumeration method in kelondro data structure to get a
12 years ago
Michael Peter Christen 4eab3aae60 removed overhead by preventing generation of full search results when
12 years ago
Michael Peter Christen a114bb23bb - using edismax in gsa interface
12 years ago
Michael Peter Christen d6b82840f8 added a feature to find similarities in documents.
12 years ago
Michael Peter Christen f5ca5cea44 - added field options to all solr queries. This can be used to restrict
12 years ago
Michael Peter Christen 46be4af5b9 Merge commit '2bb8f045cc92f31fc7e720cc30b38af417563890'
12 years ago
Michael Peter Christen 832eead998 Merge remote-tracking branch 'regerdev/master'
12 years ago
Michael Peter Christen 952e143580 FINALLY YaCy can now search for full strings using double- or
12 years ago
orbiter 5dfd6359cb redesign of the QueryParams class: introduced QueryGoal which holds the
12 years ago
cominch 2bb8f045cc content control: use up-to-date definitions
12 years ago
Michael Peter Christen 5fd3b93661 added deletion of hosts during crawl start if deleteold option was given
12 years ago
Michael Peter Christen d64445c3cb because we have the inurl:<term> - searchmodifier, we don't actually
12 years ago
cominch a67ff1c8ac SMW Import: replaced JSON import routines with stable ones
12 years ago
cominch d2a94cc55e refactor package
12 years ago
cominch 05742b4562 remove old SMW importer which was part of the ymarks package
12 years ago
cominch 21df1ad9e0 update and generalization of the SMW import and content control routines
13 years ago
Michael Peter Christen 842faf96a2 fixed media search
13 years ago
Michael Peter Christen 93001586a0 removed warnings, removed too-fast pausing of crawls
13 years ago
Michael Peter Christen 8041742e48 added matching of path to query pattern
13 years ago
Michael Peter Christen 8b1c9cba3d fixed a problem with non-terminating crawls
13 years ago
Michael Peter Christen 61a1d32356 fix to ftp client
13 years ago
Michael Peter Christen 5105256927 update to search result logging (this was a remaining issue from the
13 years ago
Michael Peter Christen 570e42c4e3 fix for filetype navigator
13 years ago
Michael Peter Christen 71ed8e5e07 bugfixes for crawler
13 years ago
Michael Peter Christen 12c0db20e5 fixed npe for surrogate import
13 years ago
Michael Peter Christen 52df6ee369 more logging
13 years ago
Michael Peter Christen 158732af37 automatically delete entries from the crawl profile list if crawl is
13 years ago
Michael Peter Christen 15d1460b40 added information about the reason of pausing of crawls
13 years ago
Michael Peter Christen 2371ef031c added solr faceted search support to YaCy search results
13 years ago
Michael Peter Christen b30a7162fa added more thread-renaming for search processes
13 years ago
Michael Peter Christen 900445d8e9 set the thread name during solr queries to the solr query to get better
13 years ago
Michael Peter Christen d481abd087 added the visualization of error-urls to host browser
13 years ago
Michael Peter Christen a15819fbec fix for some interface problems
13 years ago
Michael Peter Christen 791e1dcfdf when a new crawl is started, delete all entries about error-urls for
13 years ago
Michael Peter Christen 619bf7e875 fixed filetype modifier for media types in text search
13 years ago
Michael Peter Christen 97f82994a6 automatically pause the crawler if there is a problem with solr
13 years ago
Michael Peter Christen 8fb370d9f8 renovated the way search results are counted. should be correct now...
13 years ago
Michael Peter Christen 7bec253bb0 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
13 years ago
Michael Peter Christen d88eb657fd Merge branch 'master' of git://gitorious.org/~reger/yacy/bbyacy-rc1
13 years ago
orbiter 354ef8000d - added 'deleteold' option to crawler which causes that documents are
13 years ago
reger 633fbe9188 Fix Metadata handling
13 years ago
Michael Peter Christen 75dd706e1b update to HostBrowser:
13 years ago
Michael Peter Christen e2c4c3c7d3 migration to solr 4.0.0
13 years ago
Michael Peter Christen b764de424a code cleanup
13 years ago
Michael Peter Christen 9330ad4838 - fixed the delete option in host browser
13 years ago
Michael Peter Christen a63179f3f9 added the MIME attribute for the R tag in GSA search result writer
13 years ago
Michael Peter Christen 1168d09de8 more refactoring - integrated the code of SnippetProcess into
13 years ago
Michael Peter Christen 6629e37685 tried to clean up the search process mess
13 years ago
Michael Peter Christen c5f67a5d6d fixed a problem with local search from solr results: now all results
13 years ago
Michael Peter Christen f8f05ecba7 - added a delete button in host browser to delete a complete subpath
13 years ago
Michael Peter Christen 0716a24737 added more / all new crawl profile fields into crawl profile editor
13 years ago
Michael Peter Christen 4a14122ba7 in case that a crawl profile has a collection assigned, use the
13 years ago
Michael Peter Christen 0fe8be7981 enhanced data structures for balancer and latency computation which
13 years ago
Michael Peter Christen ac9540dfb6 removed options for stopwords which are not used
13 years ago
Michael Peter Christen ce3fed8882 added the Google Search Appliance (GSA) api interface to the main menu.
13 years ago
Michael Peter Christen b2ffd49817 less latency
13 years ago
Michael Peter Christen 0833937c1c better balancing and duetime-computation also for no-delay intranet
13 years ago
Michael Peter Christen c326aa8f67 disabled writing new entries to crawl stacks to prevent that a domain
13 years ago
Michael Peter Christen 6905182d41 - fix for number of words log message
13 years ago
Michael Peter Christen c25d7bcb80 - added concurrency for robots.txt loading
13 years ago
Michael Peter Christen a94c537afc fixed getSize() which can use the cache size while the crawl is running
13 years ago
Michael Peter Christen 96912c9471 enhancement to solr caching: consider that during a get() the document
13 years ago
Michael Peter Christen a87811bc38 more auto-commit calls when a search interface is opened, but not when a
13 years ago
Michael Peter Christen 3d3d654e88 if a network configuration is chosen which does not allow DHT and no
13 years ago
Michael Peter Christen 2d9e577ad0 replaced the custom robots.txt loader by the standard http loader
13 years ago
Michael Peter Christen 799d71bc67 enhanced solr caching:
13 years ago
Michael Peter Christen a33e2742cb - removed unnecessary synchronized and deadlock in crawler
13 years ago
orbiter 8952153ecf update to Balancer algorithm:
13 years ago
orbiter 354f0d9acd moved static method from ClusteredScoreMap to MapDataMining because it
13 years ago
reger 722a447b0d - optimize code of augmented parsing to enhance document tags
13 years ago
Michael Peter Christen 8e1248ffe3 force a commit in advance of a search for the administrator to get most
13 years ago
Michael Peter Christen 3b48c78190 added an option to force a commit to solr.
13 years ago
sixcooler 2d972f289a raise commitWithinMs to default-value from SwitchBoard
13 years ago
orbiter 8fde1dd3b6 another performance and memory hack to graphics: this makes it possible
13 years ago
Michael Peter Christen 1baf498d59 - show more lines in online log
13 years ago
Michael Peter Christen 55bdafbaf1 more image processing hacks
13 years ago
Michael Peter Christen f2d0418218 because the new PngEncoder had a problem with the PixelGrabber which is
13 years ago
Michael Peter Christen d5d64019e5 - added a method for the RasterPlotter to draw arrow endings to lines
13 years ago
Michael Peter Christen 85ca07b90e when a new crawl is started, an equal crawl, if still running, is
13 years ago
Michael Peter Christen 906e51214a the web structure image shows the pivot dot in a different color
13 years ago
Michael Peter Christen b3ffcde0c7 - prepared PngEncoder for concurrency: PixelGrabber.grabPixels is the
13 years ago
Michael Peter Christen e9c6f4ce2e - new order of data computation: first compute the size of
13 years ago
orbiter c6a1b21399 added a 9-year old png encoder from David Eisenberg which I rewrote
13 years ago
orbiter 276dd6452b removed warnings
13 years ago
Michael Peter Christen b991685782 Merge branch 'master' of git://gitorious.org/~reger/yacy/bbyacy-rc1
13 years ago
Michael Peter Christen ea11a1efea fix for highlighting in gsa search
13 years ago
Michael Peter Christen 9eaede50e7 enhanced web structure images
13 years ago
Michael Peter Christen b7ac1da6a3 gsa results shall have only one title in metadata and that should be the
13 years ago
Michael Peter Christen ae6feb5610 showing the web structure graph as animation in the crawl monitor
13 years ago
reger 87aab9aa7c - fix: with augmented parsing = on; missing metadata in index (like title) due to overwriting metadata by adding multiple result docs from augmentparser with same url
13 years ago
Michael Peter Christen 39317a6c66 enhanced webstructure image: introduced
13 years ago