Michael Peter Christen
e6f361f474
adding the canonical tag to crawl queues
12 years ago
orbiter
40c5ee47c1
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
orbiter
ae23a0badb
updated copyright message; included LGPL for 'cora' and a warranty
warning.
12 years ago
reger
a6bf44212e
bugfix: location (lat/lon) meta data retrieval (Double.NaN check)
12 years ago
Michael Peter Christen
203921006a
redesign of citation index storage
12 years ago
orbiter
7c6ccc426c
set crawlingQ to true by default because most webpages are dynamic and
crawlingQ should only be switched off in case of crawler traps
12 years ago
Lotus
5de4267a9d
windows installer: update to latest jre
12 years ago
reger
83763ee4a4
jpeg parser: extract GPS location from meta data
12 years ago
Michael Peter Christen
e92b9275ce
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
Michael Peter Christen
56cdcfa2fa
fixed greedy learning mode - global is not a search attribute in
searchitems
12 years ago
Michael Peter Christen
32aa1d4569
removed unused option for queries
12 years ago
Michael Peter Christen
0c5bed7e2c
added configuration option for greedy learning function to ConfigPortal
servlet
12 years ago
sixcooler
5d1f619f07
possibly helpful closing of solr-requests
12 years ago
Michael Peter Christen
9d291764d1
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
sixcooler
e5abccdfe4
added optimize-option
12 years ago
Michael Peter Christen
8ea6ddf636
removed attributes from ConfigPortal.html which are redundant to
ConfigSearchPage_p.html
12 years ago
Michael Peter Christen
64140f35cd
fix for solr requests if no query part is given (prevent npe)
12 years ago
Michael Peter Christen
8caaf6203a
fixed false multiple-generation of remote facet search which
caused high cpu usage on remote side.
12 years ago
Michael Peter Christen
23fb458963
- fix to gsa searchresult answer in case that no query part is given
- fix to gsa default number of results (is 'num')
12 years ago
Michael Peter Christen
823ae4d6a7
added url_protocol_s to error documents
12 years ago
Michael Peter Christen
660a196989
refactoring
12 years ago
Michael Peter Christen
c4538d8d91
added metadata-extractor-2.6.2.jar to eclipse classpath, removed old lib
12 years ago
reger
3760e2616b
bump up lib/metadata-extractor-2.6.2.jar (used for image parser) with needed code adjustments
12 years ago
Michael Peter Christen
9a6fcdf597
npe fix
12 years ago
Michael Peter Christen
54024958ac
added url_file_name_s in query for live-search of urls
12 years ago
Michael Peter Christen
16d1d744fa
added url_file_name_s to the default collection schema for the file name
without the file extension. This part of the file path is removed from
the multi-field url_paths_sxt, which no longer has the file name as the
last part of the path list.
The same applies to the new fields source_file_name_s and
target_file_name_s in the webgraph schema.
12 years ago
reger
8d1c4c423d
make imageparser fileextension detection case insensitive (extensions are often upper case)
12 years ago
Michael Peter Christen
f542cf7d9c
fix for daterange: the to-date is inclusive
12 years ago
Michael Peter Christen
f9d859f5dc
now writing image alt texts and (camelcase-)parsed urls into a text
search field for a better image retrieval
12 years ago
Michael Peter Christen
c36720d45f
added daterange option to gsa api
12 years ago
Michael Peter Christen
e441a9d4c8
to avoid confusion, the gsa api is available at /search? and
/searchresult?
12 years ago
orbiter
8792e6c6e9
stub for better image indexing
12 years ago
orbiter
97f2ac9091
added hint to gsa response writer that the result comes from a yacy peer
12 years ago
orbiter
d62464f129
start of next development cycle with small version number 0.01 (as in
the past)
12 years ago
Michael Peter Christen
363e955a0c
Release 1.5
12 years ago
Michael Peter Christen
14186e815e
npe fix
12 years ago
Michael Peter Christen
4e3007f4a0
typo
12 years ago
Michael Peter Christen
bdf306e0a7
increased time-out for loading of seed-lists
12 years ago
Michael Peter Christen
2cb6b6bc21
added target="_blank" to shutdown links
12 years ago
orbiter
c8e94ad7c7
fix for citation search in case that the citation is very fresh
12 years ago
orbiter
57dcf68665
added a feed-back message inside the shutdown page
12 years ago
Michael Peter Christen
0600d510e1
show the citation report also in ViewFile
12 years ago
Michael Peter Christen
1a92b61d69
fixed usage of ViewFile which needs a commit before showing latest crawl
result pages.
12 years ago
Michael Peter Christen
374d2e2a52
removed warning message during crawling
12 years ago
Michael Peter Christen
570511f3c8
removed fields references_internal_id_sxt and
references_internal_url_sxt because they had been shown to be
superfluous. The citation of referrers in the host browser is possible
without them. Therefore the host browser now shows not only internal
but also external referrers for each link.
12 years ago
Michael Peter Christen
fd1776a3b0
added a new 'Citations' function: each search result item can now be
explored for citations within other documents. A click on the
'Citations' link shows an analysis with all text lines in the document
each with a complete list of documents which contain the same line. A
second section shows the linking documents in ascending order of number
of citations from the original document. Because documents from
different hosts are most interesting here, they are listed at the top of
the page as possible 'copypasta' source.
12 years ago
Michael Peter Christen
fc3ff92c69
npe fix
12 years ago
Michael Peter Christen
7754a1263b
switching back to the merge factor 10; the solr default.
12 years ago
Michael Peter Christen
1762911f57
added synchronizations and timeouts in solr api; missing
synchronizations in index modification methods cause deadlocks inside
solr.
12 years ago
Michael Peter Christen
3e1e358fdc
calling pdf cache flush on class initialization because calling of the
methods during runtime can conflict with dynamic solr class loader and
cause a deadlock (seriously!)
12 years ago