456 Commits (f0587d4af5f27887e6aa3d401cbe4e8afe2790fc)

Author SHA1 Message Date
Michael Peter Christen 1a3e42eca4 index migration to lucene 4.4
11 years ago
Michael Peter Christen 765943a4b7 Redesign of crawler identification and robots steering. A non-p2p user
11 years ago
sixcooler 1bc6003057 raise autoCommit maxTime to 3 Minutes to reduce IO
11 years ago
orbiter 944ae5686c added donation plea to the about box as default (you can replace this in
11 years ago
Michael Peter Christen 58fe986cca Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
11 years ago
Michael Peter Christen cf12835f20 replaced the single-text description solr field with a multi-value
11 years ago
orbiter e7fcb81cea we should not do too much greedylearning at this time as we don't have
11 years ago
orbiter bf0ad04e1b apply load limitation also to dht-in
11 years ago
orbiter f50b596e0b do not run dht distribution if system load is over 2.5
11 years ago
orbiter e24016e30a added the property federated.service.solr.indexing.timeout to yacy.init
11 years ago
Roland Haeder 98e10f95e2 Added some cora package loggers
11 years ago
orbiter 1b43e02b86 Merge branch 'master' of git://gitorious.org/~quix0r/yacy/quix0rs-yacy-rc1
12 years ago
orbiter a548354c71 replaced type of solr schema object sku of text_en_splitting_tight by
12 years ago
Roland Haeder ebbb3bc5c1 Fixed CHMOD on many files + added missing loggers (e.g. jena) and made some noisy loggers quiet
12 years ago
orbiter e609ec388a metager whitelist update
12 years ago
Michael Peter Christen 2716dfc46c increase crawler speed by reduction of the busysleep time
12 years ago
Michael Peter Christen 57ffdfad4c added a crawl option to obey html-meta-robots-noindex. This is on by
12 years ago
Michael Peter Christen 5a5d411ec0 new robots_i attribute fields
12 years ago
orbiter 7c6ccc426c set crawlingQ to true by default because most webpages are dynamic and
12 years ago
Michael Peter Christen 16d1d744fa added url_file_name_s in default collection schema for the file name
12 years ago
orbiter 8792e6c6e9 stub for better image indexing
12 years ago
Michael Peter Christen 570511f3c8 removed fields references_internal_id_sxt and
12 years ago
Michael Peter Christen fd1776a3b0 added a new 'Citations' function: each search result item can now be
12 years ago
Michael Peter Christen 7754a1263b switching back to the merge factor 10; the solr default.
12 years ago
Michael Peter Christen 1762911f57 added synchronizations and timeouts in solr api; missing
12 years ago
Michael Peter Christen 959ccc4675 increased the solr merge factor because 4 was too much IO load for
12 years ago
Michael Peter Christen 20fab1feb6 allip net has greedy learning disabled
12 years ago
Michael Peter Christen 6115bef335 added a 'greedy learning' mechanism which will cause that a 'fresh'
12 years ago
Michael Peter Christen 856e5c42ae the line "Web Search by the People, for the People" is more generic for
12 years ago
Michael Peter Christen 713a6199ef activated citation ranking by default
12 years ago
Michael Peter Christen f7a4377812 usage of the new normalized link popularity CRn as default ranking
12 years ago
Michael Peter Christen f7e77a21bf Added a citation reference computation for intra-domain link structures.
12 years ago
reger 8a7fcb391d enable use of solrcore.properties for property substitution of solrconfig.xml
12 years ago
Michael Peter Christen eb9d0ba5b1 ranking and boost function update, small bugfixes, better default search
12 years ago
Michael Peter Christen a8dc4346e8 default configuration of MMapDirectoryFactory for solr, increased lock
12 years ago
Michael Peter Christen 0c1a018bbd removed 'later' tactic because it used too much RAM, reduced number of
12 years ago
Michael Peter Christen 536fd1450e added new keys for update locations
12 years ago
orbiter a83c2fe833 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
orbiter 4baa0d4a97 Added a default keystore for ssl encryption of the YaCy web interface.
12 years ago
reger da191c839d reduce SolrConnectorLogging setting (from default ALL to INFO)
12 years ago
Michael Peter Christen 9bd2aee180 migrated to solr 4.3.0
12 years ago
Michael Peter Christen cca19d94d4 re-declared some fields to be of type string rather than text which
12 years ago
Michael Peter Christen cc90f82dbb increased default proxy client timeout to one minute
12 years ago
Michael Peter Christen 50421171c3 added new schema fields:
12 years ago
Michael Peter Christen d05dc07cff setting of new default values for ranking
12 years ago
Michael Peter Christen 97775fbebc fixed ranking for add-function queries: this did not work. The option
12 years ago
Michael Peter Christen 7ab5093321 added new solr title_exact_signature_l and
12 years ago
Michael Peter Christen 27d6222880 added new field host_extent_i which, after a crawl and postprocessing,
12 years ago
Michael Peter Christen ada3f27de7 added three new fields for a better ranking: references_internal_i,
12 years ago
reger e89491271f - fix opensearch discover err msg - webgraph not enabled - if no opensearchdescription link found in index
12 years ago
orbiter 17ae51e741 increased number of links limitation from 1000 to 10000 for rss feeds
12 years ago
Michael Peter Christen 2d36a7eaf5 - do not create a new query for all remote peers
12 years ago
Michael Peter Christen 4af0839be2 use appropriate ranking for each search situation:
12 years ago
Michael Peter Christen 2080fc7406 removed unused tag fields
12 years ago
orbiter 6b13dd0d3d added clickdepth field writing for webgraph core (unfinished)
12 years ago
Michael Peter Christen addba047e2 changes in ranking computation
12 years ago
Michael Peter Christen 25300913fa fixes to search debugging after testing with the different search
12 years ago
orbiter b1140e3d82 added debug switches for detailed search testing
12 years ago
Michael Peter Christen 0d7b4bc891 better protection against OOM during search flush and fixed missing
12 years ago
Michael Peter Christen 3b1d9dc884 made index storage from DHT search result concurrently. This prevents
12 years ago
orbiter 0f7ea7ad9f - enhanced solr.add procedure for mass adds
12 years ago
Michael Peter Christen 089dee1770 - generalized SchemaConfiguration into super-class Configuration and
12 years ago
Michael Peter Christen 56d5946a59 - added flags in IndexFederated_p.html to switch on or off the webgraph
12 years ago
Michael Peter Christen 461d46101d - Removed log4j from libraries. This can be removed because the package
12 years ago
Michael Peter Christen 788288eb9e added the generation of 50 (!!) new solr fields in the core 'webgraph'.
12 years ago
Michael Peter Christen 91a0401d59 introduced a second core named 'webgraph'. This core will hold the link
12 years ago
Michael Peter Christen 4111606654 removed the commitWithin attribute because that is not the way how the
12 years ago
Michael Peter Christen d70d99fab5 added more metadata fields and facets to OpensearchResponseWriter.
12 years ago
Michael Peter Christen 8651ec35fe turned author_s into the multi-valued field author_sxt
12 years ago
Michael Peter Christen 4735bd47f4 - changed solr commit call and added an optimize option. Since Solr
12 years ago
Michael Peter Christen db024a4e19 added new solr fields (unused yet; implementation will follow)
12 years ago
Michael Peter Christen 9b5bdae1b4 Reverted setting of MMapDirectoryFactory from solrconfig; see
12 years ago
orbiter eb68a30947 solr performance settings
12 years ago
Michael Peter Christen f53703df62 using MMapDirectoryFactory as solution for ClosedChannelException given
12 years ago
Michael Peter Christen 22c694f906 activated the clickdepth_i attribute for solr again because the
12 years ago
Michael Peter Christen 5a0eb1b268 clickpath should not be active by default because it needs extensive
12 years ago
Michael Peter Christen 5c0c56cfe1 Preparations to produce a click depth attribute in the search index.
12 years ago
Michael Peter Christen 295884fd54 - Merge commit '168b1d130d9d67b5e8855a0b50c4ba7ad4a416f8'
12 years ago
reger 168b1d130d Adding heuristic to get search results from configured systems which support opensearch specification
12 years ago
reger 7761b60325 fix: Broken Link on Crawler_p.html - issue 218
12 years ago
reger e9e0d63897 Add config option to show HostBrowser link in search result
12 years ago
Michael Peter Christen 98819ec3d9 use solr boost configuration to select search fields. At this time it is
12 years ago
Michael Peter Christen 01200f06cc using the author field as solr-native facet. this makes it necessary to
12 years ago
Michael Peter Christen eac9650b31 added another solr field clickdepth_i which reflects the number of
12 years ago
Michael Peter Christen 1052263af3 - added a new solr field references_i which stores the number of
12 years ago
Michael Peter Christen 72f165d58b added a Boost class which stores solr query boost values. The class can
12 years ago
Michael Peter Christen ea033f8f8e added number of characters in url to default index to be able to use
12 years ago
Michael Peter Christen efd2c4622d added a new fail type attribute for the index to distinguish two
12 years ago
Michael Peter Christen d6b82840f8 added a feature to find similarities in documents.
12 years ago
reger 328ce0b297 fix: remove fixed individual testing IP (85.25.151.30 = server4you.de) from default/yacy.network.freeworld.unit
12 years ago
Michael Peter Christen e2c4c3c7d3 migration to solr 4.0.0
12 years ago
sixcooler 2d972f289a raise commitWithinMs to default-value from SwitchBoard
12 years ago
Michael Peter Christen 1baf498d59 - show more lines in online log
12 years ago
sixcooler 206e7bcf94 whitelist yacyportalsearch aka search.yacy.net
12 years ago
Michael Peter Christen 43f3345c90 - removed dependencies from URIMetadataRow and made direct access to
12 years ago
Michael Peter Christen 7e3e45fd04 added Open Graph Metadata default fields, see http://ogp.me/ns#
12 years ago
Michael Peter Christen c3e5f667a7 added schema.org breadcrumb counter to parser and solr schema
12 years ago
Michael Peter Christen 42e525ca9a enhanced the host browser
12 years ago
sof 5cb244b79b Merge remote branch 'origin/master'
12 years ago
apfelmaennchen 88b062210c Added a parser for audio file tags (e.g. ID3 tags for MP3 files) based
12 years ago