- added default filename filter to select field (as only addition to *.black list is permanent)
- modified Blacklist_p header/legend to show all active blacklists
(to support understanding that all configured lists are active)
- removed obsolete code in Blacklist_p servlet
- metatags may be null
Caused by: java.lang.NullPointerException
at net.yacy.search.query.QueryParams.getFacets(QueryParams.java:445)
at net.yacy.search.query.QueryParams.getBasicParams(QueryParams.java:400)
at net.yacy.search.query.QueryParams.solrTextQuery(QueryParams.java:345)
at net.yacy.search.query.QueryParams.solrQuery(QueryParams.java:334)
at net.yacy.search.query.SearchEvent.<init>(SearchEvent.java:290)
at net.yacy.search.query.SearchEventCache.getEvent(SearchEventCache.java:176)
at IndexControlRWIs_p.genSearchresult(IndexControlRWIs_p.java:641)
at IndexControlRWIs_p.respond(IndexControlRWIs_p.java:141)
regular expressions cause no results. Using '*' followed by a dot or
any expression now causes that expression to be used as a filetype
search.
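
A minimal sketch of the '*' plus dot case, not the actual YaCy code: the
class and method names are hypothetical, and it assumes that only the
form "*.ext" is treated as a filetype search.

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper for illustration only; not part of YaCy.
final class FiletypeQuerySketch {

    /** Returns the extension if the query has the form "*.ext", else empty. */
    static Optional<String> filetypeOf(String query) {
        if (query == null) return Optional.empty();
        String q = query.trim();
        // Assumption: a leading "*." marks a filetype search.
        if (q.startsWith("*.") && q.length() > 2) {
            return Optional.of(q.substring(2).toLowerCase(Locale.ROOT));
        }
        return Optional.empty();
    }
}
```

A caller could then turn a non-empty result into a filter on the file
extension instead of evaluating the input as a regular expression.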
- see numerous idx entries with content_type image without url_file_ext_s (for various reasons) which should be included in the result
- try it yourself with the following sample query
/solr/select?q=content_type:image/* AND -url_file_ext_s:[* TO *]&defType=edismax&fl=sku,url_file_ext_s,content_type
also addresses possible URLs with no extension or a deviating extension.
the right content domain (i.e. identifying that it is an image, text
etc.) because it used the file extension and not an existing mime type
assignment.
- fixed the new setting that images shall be loaded for a better image
search.
- both fixes together now make it possible to crawl
commons.wikimedia.org which makes use of 'funny' document names (i.e.
ending with .jpg while the document is html)
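
The idea of preferring an existing mime type over the file extension can
be sketched as follows. This is an illustration under assumptions, not
the YaCy implementation: the class, the enum and the extension table are
hypothetical.

```java
import java.util.Locale;

// Hypothetical classifier for illustration only; not part of YaCy.
final class ContentDomainSketch {

    enum Domain { TEXT, IMAGE, AUDIO, VIDEO, UNKNOWN }

    /** Classifies by mime type first and uses the extension only as fallback. */
    static Domain classify(String mimeType, String url) {
        if (mimeType != null && !mimeType.isEmpty()) {
            String mime = mimeType.toLowerCase(Locale.ROOT);
            if (mime.startsWith("image/")) return Domain.IMAGE;
            if (mime.startsWith("audio/")) return Domain.AUDIO;
            if (mime.startsWith("video/")) return Domain.VIDEO;
            if (mime.startsWith("text/")) return Domain.TEXT;
        }
        // No usable mime type: fall back to the file extension.
        String path = url == null ? "" : url.toLowerCase(Locale.ROOT);
        int dot = path.lastIndexOf('.');
        String ext = dot < 0 ? "" : path.substring(dot + 1);
        switch (ext) {
            case "jpg": case "jpeg": case "png": case "gif":
                return Domain.IMAGE;
            case "html": case "htm": case "txt":
                return Domain.TEXT;
            case "mp3":
                return Domain.AUDIO;
            case "mp4": case "avi":
                return Domain.VIDEO;
            default:
                return Domain.UNKNOWN;
        }
    }
}
```

With this order, a document whose name ends with .jpg but which is
delivered as text/html is classified as text and not as an image.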
is visible whenever a location is available in the search result.
To activate this, the search.navigation property in yacy.conf must be
modified to the new default values.
all unique links! This made it necessary to adapt a large portion of the
parser and link processing classes to carry a different type of link
collection, one whose entries carry the property attributes that are
attached to web anchors.
- introduction of a new URL class, AnchorURL
- the other url classes, DigestURI and MultiProtocolURI, have been renamed
and refactored to fit into a new document package schema, document.id
- cleanup of net.yacy.cora.document package and refactoring
fuzzy_signature_copycount_i, which count the number of copies of
non-unique documents and assign this number to each document. Thus, each
document is assigned a number which shows how many copies of this
document exist.
These fields are disabled by default.
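
How such a copy count could be derived from document signatures is
sketched below. The class and method are hypothetical, and it is an
assumption that the count is the total number of documents sharing a
signature; the actual field computation may differ.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch for illustration only; not part of YaCy.
final class CopyCountSketch {

    /** Maps each signature to the number of documents that carry it. */
    static Map<Long, Integer> copyCounts(List<Long> signatures) {
        Map<Long, Integer> counts = new HashMap<>();
        for (Long signature : signatures) {
            // Add one for every document with this signature.
            counts.merge(signature, 1, Integer::sum);
        }
        return counts;
    }
}
```

Each document would then be assigned the count stored for its own
signature.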
by checking vocabulary tags also for rwi results (currently a filter is applied to the solr query)
TODO: as vocabularies are only locally valid, auto-switch to Searchdom.LOCAL could be considered.
jdk-based loggers tend to block
at java.util.logging.Logger.log(Logger.java:476) in concurrent
environments. This makes logging a main performance issue. To overcome
this problem, this is an add-on to jdk logging which puts log entries on a
concurrent message queue and logs the messages one by one using a
separate process.
- FTPClient uses the concurrent logging instead of the log4j logger
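
The queue-based approach can be sketched as follows. This is a minimal
illustration and not the YaCy logging class: all names are hypothetical,
and it assumes a single daemon thread that drains the queue.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch for illustration only; not part of YaCy.
final class QueuedLogSketch {

    private static final class Entry {
        final Logger logger;
        final Level level;
        final String message;

        Entry(Logger logger, Level level, String message) {
            this.logger = logger;
            this.level = level;
            this.message = message;
        }
    }

    // Marker entry that tells the worker to stop.
    private static final Entry POISON = new Entry(null, null, null);

    private final BlockingQueue<Entry> queue = new LinkedBlockingQueue<>();
    private final Thread worker;

    QueuedLogSketch() {
        worker = new Thread(() -> {
            try {
                Entry e;
                // Only this thread calls Logger.log, one entry at a time.
                while ((e = queue.take()) != POISON) {
                    e.logger.log(e.level, e.message);
                }
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
        }, "queued-log-sketch");
        worker.setDaemon(true);
        worker.start();
    }

    /** Enqueues the entry and returns without waiting for the logger. */
    void log(Logger logger, Level level, String message) {
        queue.add(new Entry(logger, level, message));
    }

    /** Writes out all pending entries, then stops the worker. */
    void shutdown() {
        queue.add(POISON);
        try {
            worker.join();
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Callers only pay for the enqueue operation, so concurrent threads no
longer wait on each other inside the logger.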
signal in case that a cleanup process wants to remove the search
process. Added also a new cleanup process which can reduce the number of
stored searches to a specific number which can be higher or lower
according to the remaining RAM. The cleanup process is called every time
a search is started.
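
A rough sketch of a cleanup that reduces the stored searches to a
RAM-dependent number. The class, the thresholds and the
oldest-first removal order are assumptions made for illustration; this
is not the SearchEventCache code.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch for illustration only; not part of YaCy.
final class SearchCacheCleanupSketch {

    // LinkedHashMap keeps insertion order, so the oldest search comes first.
    private final Map<String, Object> events = new LinkedHashMap<>();

    /** Picks a higher or lower limit depending on available memory. */
    static int limitFor(long availableBytes) {
        // Illustrative thresholds, chosen arbitrarily.
        return availableBytes < 100L * 1024 * 1024 ? 10 : 100;
    }

    void put(String id, Object event) { events.put(id, event); }

    boolean contains(String id) { return events.containsKey(id); }

    int size() { return events.size(); }

    /** Removes the oldest stored searches until at most maxStored remain. */
    void cleanup(int maxStored) {
        Iterator<String> it = events.keySet().iterator();
        while (events.size() > maxStored && it.hasNext()) {
            it.next();
            it.remove();
        }
    }
}
```

A search start could then call cleanup(limitFor(free)), where free is
an estimate of the remaining RAM, for example derived from
Runtime.getRuntime().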