Commit Graph

240 Commits (ce8702546260a405e09fa8081c5874b45e27cf7d)

Author SHA1 Message Date
luccioman ac766327d3 Switched a few more Solr fields from strictly mandatory to optional
8 years ago
luccioman cdc7f3e431 Switched some Solr fields from mandatory to optional
8 years ago
luccioman c68a8be2d9 Refactored and enforced Solr mandatory fields for proper operation
8 years ago
reger 5e8879beb7 Reduce self generated content for text_t (visible text index field)
8 years ago
reger 1f497ccad5 Add consistency check for related index fields upon load and save of
8 years ago
reger 581b00cc20 remove obsolete lastmodified calculation in WebgraphConfig
8 years ago
luccioman 6a4d51d8f9 Cleaned up some Javadoc warnings.
8 years ago
reger 4c9be29a55 fix concurrency issue with htmlParser using not current scraper data
8 years ago
reger b522d540b9 Include itemprop latitude/longitude (see schema.org) in attribute
8 years ago
reger 9db68acb4f remove obsolete X_YACY... header declarations
8 years ago
luccioman 6e1959f469 Merge branch 'master' of https://github.com/yacy/yacy_search_server.git
8 years ago
luccioman 8c49a755da Postprocessing refactoring
9 years ago
luccioman 42f45760ed Refactored postprocessing
9 years ago
reger 4c7a77662a eliminate dependency on file-extension in storeDocument but use supported mime-type
9 years ago
luccioman 6e96c7341a Merge remote-tracking branch 'origin/master'
9 years ago
reger caf9e98f09 put metadata dc_publisher in corresponding schema field
9 years ago
luc 3f338777f7 Also check and index eventual icon url information from metadata.
9 years ago
luc 3cc5619d93 Improved HTML icons indexing and rendering in search results.
9 years ago
luc 571bc55937 Refactoring : use StandardCharsets constants instead of hard-coded
9 years ago
reger 45b9bd8403 adjust MultiProtocolURL.protocol detection to handle mailto with "://" in parameters,
9 years ago
sixcooler 646afe9183 do not store subfield *_coordinate + make all num-fields being docvalues
9 years ago
reger a58ee49307 Optimize internal imagequery focus on using content_type to select images
9 years ago
Michael Peter Christen 151ccd50a9 fix for image size field values (must be multi-valued)
9 years ago
reger 802ccaead6 fix init of error cache, use latest faildates => load_date_dt
10 years ago
sixcooler 87e4abe393 fight the fieldcache by using DocValues: in Solr-5.x the fieldcache has
10 years ago
reger eaf0e8ff2c start recording/indexing pixel size for image document
10 years ago
reger c33229fc0c check mime prior to ext for metadata modification for images
10 years ago
Michael Peter Christen 8028410ab7 Merge branch 'master' of git@github.com:yacy/yacy_search_server.git
10 years ago
Michael Peter Christen df3314ac1a added a new facet type based on a probabilistic classifier using
10 years ago
reger 1409cabe8b exclude more default search fields from text copy to text_t
10 years ago
Michael Peter Christen 0aa6fcf259 remove old vocabularies and synonyms before adding new
10 years ago
reger f91298d3b6 fix one implicit Integer/Long type conversion
10 years ago
reger 821262a179 add CommonPattern for multiple spaces
10 years ago
Michael Peter Christen 90f75c8c3d added enrichment of synonyms and vocabularies for imported documents
10 years ago
reger f3ce99bfb8 fix extract of inboundlinks_protocol_sxt
10 years ago
reger 5408448a56 skip redundant add. of keywords to text
10 years ago
Michael Peter Christen b060ba900d added parsing of contentprop attribute in html tags for
10 years ago
Michael Peter Christen 4cb4f67f38 added parsing of dd, dt and article html fields. The parsed result is
10 years ago
reger 1395f10e95 fix typecast for css links
10 years ago
reger 7e09bff4a1 exclude default search fields from text copy to text_t
10 years ago
Michael Peter Christen 535f1ebe3b added a new way of content browsing in search results:
10 years ago
reger 9e1ec5fec4 refactor: just some more usages of constant for term ":[* TO *]"
10 years ago
Michael Peter Christen 68c605d637 replace with CommonPattern.SPACE for split
10 years ago
Michael Peter Christen 3e6c3e2237 documents pushed over the api/push_p.html interface will have their
10 years ago
Michael Peter Christen d2792a43fd do not write iframe and embed links into webgraph, but use them anyway
10 years ago
reger 73ba5d8ef7 adjust fieldtype and description of field httpstatus_redirect_s in CollectionSchema
10 years ago
Michael Peter Christen eb78388a98 changed prefer strategy for http unique in such a way that http is
10 years ago
Michael Peter Christen 66b5a56976 Added and integrated new date detection class which can identify date
10 years ago
Michael Peter Christen 6a1865f507 refactoring date -> lastModified
10 years ago
reger 70cf7060a4 coding fixes suggested in
10 years ago