reger
ea633a794c
including small junit test case for WordTokenizer
10 years ago
reger
5790c7242e
skip tokenizing punctuation as a word in WordTokenizer
remove unused variables in condenser related to Tokenizer
10 years ago
reger
f07392ff17
add. use host port parameter in YaCyApp
10 years ago
Michael Peter Christen
ad0da5f246
added new web page snapshot infrastructure which will lead to the
ability to have web page previews in the search results.
(This is a stub, no function available with this yet...)
10 years ago
reger
aa0faeabc5
adjust translation text of error msg on empty query
(ru: needs correction)
10 years ago
reger
c475be2937
fix (enable) error msg on empty query
10 years ago
reger
ef5c5b4489
update to Jetty 9.2.4
10 years ago
reger
f709132961
remove obsolete alternate link
fix api link
10 years ago
Michael Peter Christen
3c71e1c872
show vocabularies in search result (in case of debugging)
10 years ago
Michael Peter Christen
1d45d9405a
security bugfix
10 years ago
Michael Peter Christen
ff728b4aa5
ignore url errors during search
10 years ago
Michael Peter Christen
c94c24638f
disabled postprocessing by default. If you read this: please disable
postprocessing in your peer as well: open /IndexSchema_p.html, then
deselect field process_sxt
10 years ago
Michael Peter Christen
2fce2e2697
larger boost fields for ranking
10 years ago
Michael Peter Christen
6c03ff8355
bold words in snippets should not be coloured black in the base style
because there are styles with dark backgrounds which make the bold word
invisible
10 years ago
Michael Peter Christen
8317914ce3
changed vocabulary navigator object type to TreeMap to get a specific
order into the vocabularies. This is now lexicographic, which is less
random than a hashed order
10 years ago
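The ordering effect described in this commit can be illustrated with a minimal sketch. The class and method names below are hypothetical and not taken from the YaCy code base; the sketch only shows that a `TreeMap` with `String` keys iterates in natural (lexicographic) key order, regardless of the order in which a hashed map would return the entries.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class, not part of YaCy.
class VocabularyOrderDemo {

    // Copies the given map into a TreeMap and returns the keys in
    // iteration order. For String keys this is the natural order of
    // String.compareTo, i.e. lexicographic by UTF-16 code unit
    // (so uppercase letters sort before lowercase ones).
    static List<String> orderedNames(Map<String, Integer> counts) {
        return new ArrayList<>(new TreeMap<>(counts).keySet());
    }
}
```

Passing a `HashMap` with the keys "places", "animals" and "colors" to `orderedNames` yields them sorted alphabetically, independent of hash order.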
Michael Peter Christen
d5c1b07768
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
10 years ago
Michael Peter Christen
c0f9f6ac66
added option to change the navbar-default, i.e. usable for dark skins
10 years ago
Michael Peter Christen
10794e8efd
trying facet.method fc instead of fcs to handle large facets
10 years ago
Michael Peter Christen
041b605cfe
Merge branch 'master' of git@gitorious.org:yacy/rc1.git
10 years ago
Michael Peter Christen
f1f74e8626
toString fix
10 years ago
Michael Peter Christen
30276a2b48
prevent that a local Solr search and a local RWI search are running
concurrently. When a RWI search result is flushed into the result set,
it does Solr Queries (which replaced the old-style Metadata Queries) and
they are possibly running concurrently to a previously started Solr
search. Both methods may block each other with IO. To enhance the speed,
they are now serialized. Because the Solr search results may result in
better results using the more advanced and configurable Ranking methods,
this result is preferred over the RWI search result. However, remote RWI
search results are still fed concurrently into the search result as
well.
10 years ago
Michael Peter Christen
84763126e0
added option to make the YaCy proxy act as if the cache is never stale. If
set to 'Always Fresh' the cache is always used if the entry in the cache
exists. This is a good way to archive web content and access it without
going online again in case the documents exist.
To do so, open /Settings_p.html?page=ProxyAccess and check the "Always
Fresh" checkbox.
This is set to false by default, which behaves as before.
If you set this to true, then you have your web archive in DATA/HTCACHE.
Copy this to carry around your private copy of the internet!
10 years ago
reger
1e7ee72240
fix path lookup to ./defaults/yacy.badwords
(fix of commit ee277b9b3e)
10 years ago
reger
7d863d6254
fix empty text facet entry
(noticed on Author facet)
10 years ago
Michael Peter Christen
a39419f2ef
more stacks shall be considered for on-demand loading, not only
deep-depth stacks to prevent "too many open files" problem
10 years ago
Michael Peter Christen
5bb52f79be
reduce number of calls to queue.size() because that may be a bottleneck
during crawling
10 years ago
Michael Peter Christen
4920ab7b76
optimize usage of size() cache
10 years ago
reger
ee277b9b3e
allow for local yacy.stopwords and yacy.badwords list (in DATA/SETTINGS/)
if the file exists in DATA/SETTINGS it is loaded, otherwise the file in ./defaults is loaded
(if locale ./defaults/stopwords.xx doesn't exist take solr/lang/stopwords_xx.txt as default)
move yacy.stopwords, yacy.stopwords.de and yacy.badwords.example out of root directory to ./defaults directory
10 years ago
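The lookup order described in this commit (a local file in DATA/SETTINGS first, the shipped file in ./defaults otherwise) can be sketched as follows. This is an illustration under assumptions, not the actual YaCy implementation: the class name `WordListLocator` and the method `resolve` are hypothetical, and the additional fallback to solr/lang/stopwords_xx.txt is left out.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of YaCy.
class WordListLocator {

    // Returns settingsDir/fileName if that is an existing regular file,
    // otherwise defaultsDir/fileName. The fallback path is returned
    // without checking that it exists.
    static Path resolve(Path settingsDir, Path defaultsDir, String fileName) {
        Path local = settingsDir.resolve(fileName);
        return Files.isRegularFile(local) ? local : defaultsDir.resolve(fileName);
    }
}
```

A call such as `WordListLocator.resolve(Paths.get("DATA/SETTINGS"), Paths.get("defaults"), "yacy.stopwords")` would then pick the user's list when present and the default list otherwise.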
reger
de56266bcb
remove redundant toLower for topwords
10 years ago
Michael Peter Christen
a34f837592
better delete all files in path when removing host crawl stack
10 years ago
Michael Peter Christen
10b1db430a
if we have many hosts, use on-demand earlier
10 years ago
Michael Peter Christen
1324927e66
prevent division by zero
10 years ago
Michael Peter Christen
2beb6abeb6
disabled crazy sleep loop
10 years ago
Michael Peter Christen
092d97d7ac
when importing vocabulary csv files, accept also files without semicolon
and truncate quotes from literals
10 years ago
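A tolerant line parser of the kind this commit describes might look like the sketch below. The column layout of YaCy's vocabulary CSV files is not given in the commit message, so this assumes only that the literal is the first semicolon-separated column. The class `VocabularyCsvLine` and its methods are hypothetical.

```java
// Hypothetical helper, not part of YaCy.
class VocabularyCsvLine {

    // Trims whitespace and removes one pair of surrounding double quotes,
    // if present. A lone quote character is left unchanged.
    static String stripQuotes(String s) {
        String t = s.trim();
        if (t.length() >= 2 && t.startsWith("\"") && t.endsWith("\"")) {
            t = t.substring(1, t.length() - 1).trim();
        }
        return t;
    }

    // Returns the literal of a line: the text before the first ';',
    // or the whole line if it contains no semicolon at all.
    static String literal(String line) {
        int p = line.indexOf(';');
        String first = p < 0 ? line : line.substring(0, p);
        return stripQuotes(first);
    }
}
```

With this sketch, a line with further columns and a line consisting of a bare or quoted term both yield the unquoted literal.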
Michael Peter Christen
ee9ec40048
added hints to ranking to make ranking boosts using vocabularies easier
10 years ago
Michael Peter Christen
70f03f7c8e
do not cache search requests to Solr if the result is used for
doublechecking. If a double-check comes from cached results the
doublecheck fails.
10 years ago
Michael Peter Christen
a0b84e4def
use a LinkedHashMap for facets to maintain facet order as given by solr
10 years ago
reger
ef5dc68313
include domtype to searcheventcache id
to differentiate between local / global events for reuse of cached events
fix for http://mantis.tokeek.de/view.php?id=493
10 years ago
Michael Peter Christen
0dc6e0a5f2
added option to enrich vocabularies with synonyms from synonym database
10 years ago
Michael Peter Christen
6a2a669db4
added loading of the synonyms file from addon/synonyms into the
knowledge loader
10 years ago
Michael Peter Christen
c67c5c0709
added new solr schema fields which record the occurrences of vocabulary
matches. These matches can be used for result boosting, i.e. if a
document contains words from a specific vocabulary, boost it.
10 years ago
Michael Peter Christen
a67a465415
fix field counter for multi-fields in html writer for the solr servlet
10 years ago
Michael Peter Christen
fdba8e2fa0
fix for 2-day network stats table: showing 48 instead of 24 hours from
peer history
10 years ago
Michael Peter Christen
ec9d021568
added option in vocabulary editor to import CSV files with different
encodings (preselected windows-type character encoding which is typical
for CSV files). Fixed also other problems with character encoding in
dictionary files. Automatically generated vocabularies are now also
noted in the API steering.
10 years ago
reger
b558433211
adjust tag cloud font size calculation
to limit max font size to ~ TOPWORDS_MAXSIZE
10 years ago
reger
3c818fc912
add a check of java version string >=1.7 to startup class
stopping start with error msg on version < 1.7
10 years ago
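A version-string check of the kind this commit describes can be sketched as below. This is not the code from the YaCy startup class: the class name `JavaVersionCheck` and the method `isAtLeast17` are hypothetical, and the parsing rule (split on `.`, `_`, `-`, `+` and compare the leading numbers) is an assumption that covers both the old "1.x" scheme and newer strings such as "11.0.2".

```java
// Hypothetical helper, not part of YaCy.
class JavaVersionCheck {

    // Returns true if the given java.version string denotes Java 1.7
    // or later. Strings that cannot be parsed are treated as unsupported.
    static boolean isAtLeast17(String version) {
        String[] parts = version.split("[._\\-+]");
        try {
            int major = Integer.parseInt(parts[0]);
            if (major > 1) {
                return true; // newer scheme, e.g. "9" or "11.0.2"
            }
            int minor = parts.length > 1 ? Integer.parseInt(parts[1]) : 0;
            return major == 1 && minor >= 7; // old scheme, e.g. "1.7.0_67"
        } catch (NumberFormatException e) {
            return false;
        }
    }
}
```

A startup class could call `isAtLeast17(System.getProperty("java.version"))` and print an error message and exit when it returns false.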
Michael Peter Christen
0550b54d56
added fix to postprocessing: avoid caching of postprocessing collection
to always get fresh lists of documents. This is necessary since the
postprocessing changes the same documents which the
postprocessing-collection query selects.
10 years ago
Michael Peter Christen
68e8039fd1
added high-precision scheduler for API processes. This also allows
making the execution depend on available RAM or CPU load. The
default value for CPU load is 4.0 and the check runs once a minute.
10 years ago
Michael Peter Christen
8aee7f940e
added missing class for latest changes
10 years ago
Michael Peter Christen
97039049e4
fix in key enumeration methods for cases where the enumeration is done
in reverse order.
10 years ago