51 Commits (f8f1959ebb3f96b66e75d7d83cd70ae9714e85bd)

Author SHA1 Message Date
luccioman 8da3174867 Ensure lower case conversion consistency with any default locale.
7 years ago
Michael Peter Christen 3b1d640a3c enhanced debugging
8 years ago
Michael Peter Christen 973d74712f added yacy grid flatjson surrogate parser
8 years ago
luccioman ac766327d3 Switched a few more Solr fields from strictly mandatory to optional
8 years ago
luccioman cdc7f3e431 Switched some Solr fields from mandatory to optional
8 years ago
luccioman c68a8be2d9 Refactored and enforced Solr mandatory fields for proper operation
8 years ago
luc 3cc5619d93 Improved HTML icons indexing and rendering in search results.
9 years ago
sixcooler 646afe9183 do not store subfield *_coordinate + make all num-fields being docvalues
9 years ago
reger 802ccaead6 fix init of error cache, use latest faildates => load_date_dt
9 years ago
sixcooler 87e4abe393 fight the fieldcache by using DocValues: in Solr-5.x the fieldcache has
9 years ago
Michael Peter Christen 0aa6fcf259 remove old vocabularies and synonyms before adding new
9 years ago
Michael Peter Christen b060ba900d added parsing of contentprop attribute in html tags for
10 years ago
Michael Peter Christen 4cb4f67f38 added parsing of dd, dt and article html fields. The parsed result is
10 years ago
Michael Peter Christen 535f1ebe3b added a new way of content browsing in search results:
10 years ago
reger 73ba5d8ef7 adjust fieldtype and description of field httpstatus_redirect_s in CollectionSchema
10 years ago
Michael Peter Christen 66b5a56976 Added and integrated new date detection class which can identify date
10 years ago
Michael Peter Christen c67c5c0709 added new solr schema fields which record the occurrences of vocabulary
10 years ago
Michael Peter Christen b1cfbc4a04 added new solr field url_paths_count_i which can be used to enhance the
10 years ago
reger fb1fcc2b03 handle noarchive tag, skip writing page to cache
10 years ago
Michael Peter Christen 1092e798a5 fixed double content postprocessing
11 years ago
Michael Peter Christen ff5b3ac84d added new fields http_unique_b and www_unique_b which can be used for
11 years ago
Michael Peter Christen 9a5ab4e2c1 removed clickdepth_i field and related postprocessing. This information
11 years ago
Michael Peter Christen cca851a417 introduced new solr field crawldepth_i which records the crawl depth of
11 years ago
Michael Peter Christen 8b44fcf0f4 added missing @Override annotation
11 years ago
Michael Peter Christen e515dd460d added linkscount_i and linksnofollowcount_i to the default solr schema
11 years ago
Michael Peter Christen 1b61bd40ed - Added new solr field url_file_name_tokens_t which stores the file name
11 years ago
orbiter 5f5a97bafc added the anchor text within web pages to the searchable entities of a
11 years ago
orbiter 705b3338ee list more fields available for search and for ranking boosts
11 years ago
Michael Peter Christen 21aa6a0321 migration to Solr 4.5.0
11 years ago
Michael Peter Christen 4f83d5f18c added the new field harvestkey_s to the collection index and the
11 years ago
Michael Peter Christen 85456f46b2 added two new fields, exact_signature_copycount_i and
11 years ago
Michael Peter Christen a2511b5600 turned images_alt_txt back to images_alt_sxt because it is not necessary
11 years ago
orbiter f106345eef link strings should not be tokenized
11 years ago
orbiter deadeb406e image alt tag strings should be tokenized
11 years ago
Michael Peter Christen 2857499467 fix to collection schema; bug appeared for _txt fields with empty String
11 years ago
Michael Peter Christen cf12835f20 replaced the single-text description solr field with a multi-value
11 years ago
orbiter a548354c71 replaced type of solr schema object sku of text_en_splitting_tight by
12 years ago
Michael Peter Christen 16d1d744fa added url_file_name_s in default collection schema for the file name
12 years ago
orbiter 8792e6c6e9 stub for better image indexing
12 years ago
Michael Peter Christen 570511f3c8 removed fields references_internal_id_sxt and
12 years ago
Michael Peter Christen f7e77a21bf Added a citation reference computation for intra-domain link structures.
12 years ago
Michael Peter Christen cca19d94d4 re-declared some fields to be of type string rather than text which
12 years ago
Michael Peter Christen 50421171c3 added new schema fields:
12 years ago
Michael Peter Christen 7ab5093321 added new solr title_exact_signature_l and
12 years ago
Michael Peter Christen 27d6222880 added new field host_extent_i which, after a crawl and postprocessing,
12 years ago
Michael Peter Christen ada3f27de7 added three new fields for a better ranking: references_internal_i,
12 years ago
Michael Peter Christen 2080fc7406 removed unused tag fields
12 years ago
Michael Peter Christen addba047e2 changes in ranking computation
12 years ago
Michael Peter Christen 008288719c fix for schema export to consider also automatically generated
12 years ago
Michael Peter Christen 788288eb9e added the generation of 50 (!!) new solr fields in the core 'webgraph'.
12 years ago