Commit Graph

1084 Commits (135a123a773c32843240c4bb5faf9b0329e141f2)

Author SHA1 Message Date
luc f01d49c37a Process large or local file images dealing directly with content
9 years ago
luc 3c4c77099d If available, check content length before downloading. Check also
9 years ago
reger 2985baaa01 Exclude repetitive protocol part in tokenized url
9 years ago
Michael Peter Christen d1ae999ef9 replaced HashMap with LinkedHashMap to preserve the object order
9 years ago
reger c9937973e3 unescape MultiProtocolURL getAttributes() return values.
9 years ago
reger 43c27aa550 upd to solr/lucene 5.3.1
9 years ago
reger 688f7b2a5c allow/display svg images in image results previews
9 years ago
Michael Peter Christen 8e555d79a3 add also 1-character tokens to the token list because that could be also
9 years ago
reger bad34804fe optimize parseInt for <img> tag attribute parsing
9 years ago
reger 52e3eb4ce8 harmonize/correct assignment to Ymarkmeta.mime
10 years ago
Michael Peter Christen 87f358058e Fix for index entries which have id's not computed as hash from the url.
10 years ago
Michael Peter Christen 5f706797cb patch for a bug inside of solr since solr 5.0 when using a boost
10 years ago
reger b4cbdea1e7 adapt SolrServerConnector.add to handle error on partial update input document.
10 years ago
reger e37a4f0b3d prevent metadata records in index w/o valid url
10 years ago
reger 4cf875336c complete TODO: getFileExtension handle dot in query part
10 years ago
sixcooler 87e4abe393 fight the fieldcache by using DocValues: in Solr-5.x the fieldcache has
10 years ago
reger c33229fc0c check mime prior to ext for metadata modification for images
10 years ago
reger 206883f80d fix: Preserve protocol in url proxy
10 years ago
Michael Peter Christen 8028410ab7 Merge branch 'master' of git@github.com:yacy/yacy_search_server.git
10 years ago
Michael Peter Christen df3314ac1a added a new facet type based on a probabilistic classifier using
10 years ago
reger e2e73258ca remove obsolete interface SearchAccumulator
10 years ago
Michael Peter Christen dbbad23e12 removed warnings
10 years ago
Michael Peter Christen 500cfa9457 enhanced logging
10 years ago
Michael Peter Christen 203df5a750 Merge branch 'master' of git@github.com:yacy/yacy_search_server.git
10 years ago
reger ee77f24e52 use some more declared HeaderFramework constants
10 years ago
Michael Peter Christen 11a848da5a Merge branch 'master' of git@github.com:yacy/yacy_search_server.git
10 years ago
Michael Peter Christen b94bd7f20a a collection of search query enhancements:
10 years ago
Michael Peter Christen 1ccbf739b1 added bayes filter from Philipp Nolte, originally taken from
10 years ago
Michael Peter Christen de8cfbe1d7 added export option to export the fulltext of the search index text only
10 years ago
Michael Peter Christen 03ea723889 added log lines for query performance profiling
10 years ago
Michael Peter Christen 0e87a99ab8 more fixes for special windows paths
10 years ago
Michael Peter Christen e5b6424eed patch for bad windows file paths
10 years ago
Michael Peter Christen 0aa6fcf259 remove old vocabularies and synonyms before adding new
10 years ago
reger 821262a179 add CommonPattern for multiple spaces
10 years ago
Michael Peter Christen 694b22f165 migration to Solr 5.2: huge benefits - this is a lot faster!
10 years ago
Michael Peter Christen 34de1e8cbc gzip compression will perform more efficient and with better compression
10 years ago
Michael Peter Christen b43811d38c added surrogate import process for exported solr dumps.
10 years ago
Michael Peter Christen c7576d6028 added a full solr export to the IndexControlURLs_p.html servlet. The
10 years ago
reger cd31633369 improve MultiprotocolURL.getFileExtension()
10 years ago
Michael Peter Christen f5f88272e4 Merge branch 'master' of git@github.com:yacy/yacy_search_server.git
10 years ago
Michael Peter Christen 5c67c4d460 fix for latest commit, see
10 years ago
reger c37dda8849 fix NPE on MultiProtocolURL on url with parameter value and '='
10 years ago
Michael Peter Christen f810915717 added crawl start from a clone with very, very large url: they are now
10 years ago
Michael Peter Christen 51de86c992 disabled debug thread dumps
10 years ago
Michael Peter Christen 0710648c31 enable api calls with very long urls
10 years ago
reger 1481a8ab56 add opensearch rss results to dht collection (due to text = snippet)
10 years ago
Michael Peter Christen fbf85a1561 added temporary debug output in http client
10 years ago
Michael Peter Christen ff29b0e503 added option to re-index exported xml snapshot dumps to
10 years ago
Michael Peter Christen fed26f33a8 enhanced timezone management for indexed data:
10 years ago
Michael Peter Christen b060ba900d added parsing of contentprop attribute in html tags for
10 years ago