Michael Peter Christen
46be4af5b9
Merge commit '2bb8f045cc92f31fc7e720cc30b38af417563890'
12 years ago
Michael Peter Christen
832eead998
Merge remote-tracking branch 'regerdev/master'
12 years ago
Michael Peter Christen
952e143580
FINALLY YaCy can now search for full strings using double- or
single-quoted strings in the search query line!!!
12 years ago
orbiter
5dfd6359cb
redesign of the QueryParams class: introduced QueryGoal which holds the
query string parser. This shall be used to create proper full-string
matching, which is then handled by QueryGoal.
12 years ago
cominch
2bb8f045cc
content control: use up-to-date definitions
13 years ago
Michael Peter Christen
5fd3b93661
added deletion of hosts during crawl start if deleteold option was given
13 years ago
Michael Peter Christen
d64445c3cb
because we have the inurl:<term> search modifier, we don't actually
need regular expressions as search attributes. They have now been removed
from the advanced search page while they are still created internally.
The filter is then expressed against solr as a regular expression filter
query. If the expression selects a specific protocol,
host or filetype, this is then translated into a faceted query.
13 years ago
cominch
a67ff1c8ac
SMW Import: replaced JSON import routines with stable ones
13 years ago
cominch
d2a94cc55e
refactor package
13 years ago
cominch
05742b4562
remove old SMW importer which was part of the ymarks package
13 years ago
cominch
21df1ad9e0
update and generalization of the SMW import and content control routines
13 years ago
Michael Peter Christen
842faf96a2
fixed media search
13 years ago
Michael Peter Christen
93001586a0
removed warnings, removed too-fast pausing of crawls
13 years ago
Michael Peter Christen
8041742e48
added matching of path to query pattern
13 years ago
Michael Peter Christen
8b1c9cba3d
fixed a problem with non-terminating crawls
13 years ago
Michael Peter Christen
61a1d32356
fix to ftp client
13 years ago
Michael Peter Christen
5105256927
update to search result logging (this was a remaining issue from the
solr 4.0.0 migration)
13 years ago
Michael Peter Christen
570e42c4e3
fix for filetype navigator
13 years ago
Michael Peter Christen
71ed8e5e07
bugfixes for crawler
13 years ago
Michael Peter Christen
12c0db20e5
fixed npe for surrogate import
13 years ago
Michael Peter Christen
52df6ee369
more logging
13 years ago
Michael Peter Christen
158732af37
automatically delete entries from the crawl profile list if crawl is
terminated.
13 years ago
Michael Peter Christen
15d1460b40
added information about the reason of pausing of crawls
13 years ago
Michael Peter Christen
2371ef031c
added solr faceted search support to YaCy search results
added solr highlighting / YaCy snippets to YaCy search results
- facets are now much more complete
- facets are computed and searched much faster
- snippet computation is done by solr if solr knows the snippet
13 years ago
Michael Peter Christen
b30a7162fa
added more thread-renaming for search processes
13 years ago
Michael Peter Christen
900445d8e9
set the thread name during solr queries to the solr query to get better
debugging options
13 years ago
Michael Peter Christen
d481abd087
added the visualization of error-urls to host browser
- only visible for admins
- a faceted search generates a huge list for all hosts in the host list
- the faceted search algorithms had to be modified for that
- within the browsing of the directory path, the error cause is written
to the url which is presented as error-url
- the errors are also accumulated for directory sums
13 years ago
Michael Peter Christen
a15819fbec
fix for some interface problems
13 years ago
Michael Peter Christen
791e1dcfdf
when a new crawl is started, delete all entries about error-urls for
crawl-start domains
13 years ago
Michael Peter Christen
619bf7e875
fixed filetype modifier for media types in text search
13 years ago
Michael Peter Christen
97f82994a6
automatically pause the crawler if there is a problem with solr
13 years ago
Michael Peter Christen
8fb370d9f8
renovated the way search results are counted. should be correct now...
13 years ago
Michael Peter Christen
7bec253bb0
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
13 years ago
Michael Peter Christen
d88eb657fd
Merge branch 'master' of git://gitorious.org/~reger/yacy/bbyacy-rc1
13 years ago
orbiter
354ef8000d
- added 'deleteold' option to crawler which causes documents
selected by a crawl filter (host or subpath) to be deleted
- site crawl uses this option by default now
- made option to deleteDomain() concurrency
13 years ago
reger
633fbe9188
Fix Metadata handling
- language default on missing lang property to "uk" (fix set to nothing)
- language set to TLD (added call to existing language calculation from TLD)
- coordinate number exception on possible lat/lon content of "NaN,NaN"
adjust Netbeans IDE classpath (for Solr/Lucene 4.0.0 jars)
13 years ago
Michael Peter Christen
75dd706e1b
update to HostBrowser:
- time-out after 3 seconds to speed up display (may be incomplete)
- showing also all links from the balancer queue in the host list (after
the '/') and in the result browser view with tag 'loading'
13 years ago
Michael Peter Christen
e2c4c3c7d3
migration to solr 4.0.0
13 years ago
Michael Peter Christen
b764de424a
code cleanup
13 years ago
Michael Peter Christen
9330ad4838
- fixed the delete option in host browser
- added a delete method which can be used to delete a full subpath in
solr.
13 years ago
Michael Peter Christen
a63179f3f9
added the MIME attribute for the R tag in GSA search result writer
13 years ago
Michael Peter Christen
1168d09de8
more refactoring - integrated the code of SnippetProcess into
SearchEvent
13 years ago
Michael Peter Christen
6629e37685
tried to clean up the search process mess
13 years ago
Michael Peter Christen
c5f67a5d6d
fixed a problem with local search from solr results: now all results
from solr are shown (again)
13 years ago
Michael Peter Christen
f8f05ecba7
- added a delete button in host browser to delete a complete subpath
- removed storage of default collection name - default is now "user"
- made stacking of crawl start points concurrent
13 years ago
Michael Peter Christen
0716a24737
added more / all new crawl profile fields into crawl profile editor
13 years ago
Michael Peter Christen
4a14122ba7
in case a crawl profile has a collection assigned, use the
collection to show a name in the web interface. This should prevent
overly long names from making the interface unusable.
13 years ago
Michael Peter Christen
0fe8be7981
enhanced data structures for balancer and latency computation which
should produce a somewhat better prognosis of forced waiting times.
13 years ago
Michael Peter Christen
ac9540dfb6
removed options for stopwords which are not used
13 years ago
Michael Peter Christen
ce3fed8882
added the Google Search Appliance (GSA) api interface to the main menu.
See:
https://developers.google.com/search-appliance/documentation/68/xml_reference#request_overview
13 years ago