Michael Peter Christen
eb90d38cd7
added missing extension 'mkv' for navigation
12 years ago
Michael Peter Christen
433143ba40
removed protocol, tld, ext from the urlmask and created specific
navigation field for these
12 years ago
Michael Peter Christen
84f82541e8
search process enhancements
12 years ago
Michael Peter Christen
02020b590b
- removed all extension types from extension navigation which are not
proper/known
- automatically show the protocol navigation if there is more than http
and https
- automatically show the extension navigation if there is some media
content
12 years ago
Michael Peter Christen
01200f06cc
using the author field as solr-native facet. this makes it necessary to
introduce a copy-field for the author field to be copied to a string
field. This field is then used to generate facets. Without this field,
the facet would consist only of the words of the author names, not of
the full author string.
12 years ago
Michael Peter Christen
34f8786508
removed dependency of vocabulary navigation from Jena and its
triplestore; the vocabulary search is now done using generic solr fields
which are created on-the-fly during runtime.
12 years ago
Michael Peter Christen
8fc3679c66
using more pre-compiled patterns for split methods
12 years ago
Michael Peter Christen
d48e9788d2
enhanced search result processing behavior
- query less at one time; query more often
- in between the small queries, evaluate results
- remove fields from search results which are not needed
12 years ago
reger
469efcdb9d
fix: display and calculate authors and namespace search navigator if configured (otherwise skip overhead)
(leave the hosts and topics navigators, and the filetype and protocol navigators not included in ConfigPortal, untouched)
12 years ago
orbiter
ee612e8b93
start the local search only if this peer is doing a remote search or
when it is doing a local search and the peer is old
12 years ago
Michael Peter Christen
d6b82840f8
added a feature to find similarities in documents.
This uses an enhanced version of the Nutch/Solr TextProfileSignature.
As a result, a signature of the document is written to the solr search
index. Additionally, each time a signature is written, it is
checked whether the signature already exists in the index. If the signature
does not exist, the document is marked as unique. The unique attribute
can now be used to sort document lists and bring duplicates to the end
of a result list.
To enable this, a large portion of the search api to Solr had to be
changed. This mainly affected caching of 'exists' searches, to speed up
the check for existing signatures and do this without actually doing a
solr query.
Because this is the first time a long number is used as a value in the Solr
store, the value naming in the YaCySchema also had to be adapted and
normalized. As a result, many files had to be changed.
12 years ago
Michael Peter Christen
46be4af5b9
Merge commit '2bb8f045cc92f31fc7e720cc30b38af417563890'
12 years ago
orbiter
5dfd6359cb
redesign of the QueryParams class: introduced QueryGoal which holds the
query string parser. This shall be used to create a proper full-string
matching which is handled then by QueryGoal.
12 years ago
cominch
d2a94cc55e
refactor package
12 years ago
cominch
21df1ad9e0
update and generalization of the SMW import and content control routines
12 years ago
Michael Peter Christen
842faf96a2
fixed media search
12 years ago
Michael Peter Christen
93001586a0
removed warnings, removed too-fast pausing of crawls
12 years ago
Michael Peter Christen
570e42c4e3
fix for filetype navigator
12 years ago
Michael Peter Christen
158732af37
automatically delete entries from the crawl profile list if crawl is
terminated.
12 years ago
Michael Peter Christen
2371ef031c
added solr faceted search support to YaCy search results
added solr highlighting / YaCy snippets to YaCy search results
- facets are now much more complete
- facets are computed and searched much faster
- snippet computation is done by solr if solr knows the snippet
12 years ago
Michael Peter Christen
619bf7e875
fixed filetype modifier for media types in text search
12 years ago
Michael Peter Christen
8fb370d9f8
renovated the way search results are counted. should be correct now...
12 years ago
Michael Peter Christen
b764de424a
code cleanup
12 years ago
Michael Peter Christen
1168d09de8
more refactoring - integrated the code of SnippetProcess into
SearchEvent
12 years ago
Michael Peter Christen
6629e37685
tried to clean up the search process mess
12 years ago
Michael Peter Christen
c5f67a5d6d
fixed a problem with local search from solr results: now all results
from solr are shown (again)
12 years ago
orbiter
276dd6452b
removed warnings
12 years ago
Michael Peter Christen
ce0e5b1e17
- more refactoring / private methods
- fix for usage of custom solr field names
12 years ago
Michael Peter Christen
36c13ed15b
less solr prefetch
12 years ago
Michael Peter Christen
584663ae8c
- redesign of solr query construction
- fix for solr boosts and location search
- fix for number of search results in local search
13 years ago
orbiter
4fed4a86d8
another fix to location search
13 years ago
Michael Peter Christen
1533bfd63b
refactoring
13 years ago
Michael Peter Christen
e57bf2ca39
simplified DHT classes
13 years ago
Michael Peter Christen
8219a445f3
refactoring
13 years ago
Michael Peter Christen
00c1c777fa
refactoring
13 years ago
Michael Peter Christen
4d29f59a27
removed warnings
13 years ago
Michael Peter Christen
e8acd542b5
- added faceted drill-down for host and geolocation to solr queries
- added a new geolocation field to index schema, the old values are
migrated if possible
13 years ago
Michael Peter Christen
ff3eaa21b0
added remote search to solr on YaCy peers!
- when doing a remote search, node peers are selected for solr queries
- the solr query is done concurrently to the standard YaCy rwi search
- the solr search result is fed into the same data structure that
prepares the rwi search result
- the same remote search that is done to several outside peers is done to
the local solr index
- the search process works now also without any 'old' RWI data using
solr
13 years ago
Michael Peter Christen
a06123aec6
more abstraction and less parameter overhead for remote search
13 years ago
Michael Peter Christen
f00733186b
code simplifications
13 years ago
orbiter
404b0aab09
refactoring in remote search and stub for remote node peer selection
13 years ago
orbiter
0cbda0b2b8
- replaced all length() == 0 and size() == 0 with isEmpty()
- replaced some length() > 0 and size() > 0 with !isEmpty() - cannot be
done automatically
- implemented some isEmpty() methods
13 years ago
orbiter
62202e2d71
refactoring of query attribute variable names for better consistency
with (next) stored query words
13 years ago
Michael Peter Christen
241dd8410a
removed snippet pattern filter - it was not used
13 years ago
Michael Peter Christen
613b45f604
- better data structures in secondary search
- fixed a big memory leak in secondary search
13 years ago
Michael Peter Christen
0c345d1559
giving threads names so it's easier to see what's happening during
debugging and within a thread dump
13 years ago
Michael Peter Christen
ab7107b34b
fixed RWIProcess queue limits: now discovering hidden results for mass
result retrieval
13 years ago
Michael Peter Christen
a1fe65b115
performance hacks
13 years ago
Michael Peter Christen
89142d1e8d
removed (not all) warnings
13 years ago
Michael Peter Christen
ba6aaabc51
refactoring + parser bugfixes
13 years ago