orbiter
5dfd6359cb
redesign of the QueryParams class: introduced QueryGoal which holds the
...
query string parser. This shall be used to create a proper full-string
matching which is handled then by QueryGoal.
12 years ago
Michael Peter Christen
2371ef031c
added solr faceted search support to YaCy search results
...
added solr highlighting / YaCy snippets to YaCy search results
- facets are now much more complete
- facets are computed and searched much faster
- snippet computation is done by solr if solr knows the snippet
12 years ago
Michael Peter Christen
ccc3760a47
Refactoring and redesign of data architecture to make URIMetadataRow
...
superfluous. The target is to make a solr document as the core of YaCy
documents which would cause that many conversions can be removed. On the
way to this target the Equivalence of URIMetadataRow and URIMetadataNode
had to be removed to expose the usage of the old URIMetadataRow data
structure.
This refactoring already removes unneccessary conversions and should
make memory usage during indexing lower.
12 years ago
Michael Peter Christen
5d16c23a1f
specified more URIMetadata as URIMetadataNode
12 years ago
Michael Peter Christen
0cec7e761a
enhanced snippet extractor to find snippets also inside of tokens of an
...
url
12 years ago
Michael Peter Christen
1533bfd63b
refactoring
12 years ago
Michael Peter Christen
8219a445f3
refactoring
12 years ago
Michael Peter Christen
00c1c777fa
refactoring
12 years ago
Michael Peter Christen
9bece5ac5f
enhanced snippet fetch - removed a bug that caused documents to be
...
parsed even if a solr text was available
12 years ago
Michael Peter Christen
24d9db1613
snippet retrieval loading processes may use a smaller minimum load time
...
value than crawling processes. This speeds up the search result
preparation dramatically.
12 years ago
Michael Peter Christen
1687737771
Abstraction of HandleMap and HandleSet
12 years ago
orbiter
69e743d9e3
- more abstraction for the RWI index as preparation for solr integration
...
- added options in search index to switch parts of the index on or off
12 years ago
orbiter
0cbda0b2b8
- replaced all length() == 0 and size() == 0 with isEmpty()
...
- replaced some length() > 0 and size() > 0 with !isEmpty() - cannot be
done automatically
- implemented some isEmpty() methods
13 years ago
Michael Peter Christen
0301aba1e9
removed unused method parameters
13 years ago
Michael Peter Christen
ea10766bfd
cleaned unnecessary nested code
13 years ago
orbiter
fc0f9543fe
More SentenceReader cleanup
13 years ago
orbiter
78fc3cf8f8
refactoring and new usage of SentenceReader: this class appeared as one
...
of the major CPU users during snippet verification. The class was not
efficient for two reasons:
- it used a too complex input stream; generated from sources and UTF8
byte-conversions. The BufferedReader applied a strong overhead.
- to feed data into the SentenceReader, multiple toString/getBytes had
been applied until a buffered Reader from an input stream was possible.
These superfluous conversions had been removed.
- the best source for the Sentence Reader is a String. Therefore the
production of Strings had been forced inside the Document class.
13 years ago
Michael Peter Christen
de903a53a0
parser refactoring & hacks
13 years ago
Michael Peter Christen
1825f165b8
better integration of blacklist according to use case
13 years ago
Michael Peter Christen
00f2df1120
a variety of possible memory leak fixes
13 years ago
Michael Peter Christen
10da7335ea
performance hack: use a hash cache for all hashes that are computed by a
...
byte array. If this hash is used in a HashMap (which is very often the
case) then this hack eliminates a lot of re-computations of the same
hash.
13 years ago
Michael Peter Christen
7e0ddbd275
added a "fromCache" flag in Response object to omit one cache.has()
...
check during snippet generation. This should cause less blockings
13 years ago
Michael Peter Christen
76157dc2c3
bugfix for http://bugs.yacy.net/view.php?id=173
13 years ago
Michael Peter Christen
a3badd3205
changed search process for images: no more media snippet load process,
...
show only links from index which had been on the text search page
before. This creates a superfast search process for images!
13 years ago
Michael Peter Christen
33d1062c79
refactoring: the cache belongs to the crawler
13 years ago
Michael Peter Christen
ef78f22ee1
performance hack
13 years ago
Michael Christen
9e5894c784
Removed handling of components objects for URIMetadataRows.
...
This is a preparation to replace this rows with nodes from the node
store.
13 years ago
orbiter
4ad9fc2bff
new snippet strategy for search hits in metadata: show beginning of text instead of hit position
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7999 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
a7df70221e
refactoring
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7987 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
d2ea250d99
refactoring:
...
- moved many classes from de.anomic to net.yacy
- made more sub-packages for search classes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7973 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago