Michael Peter Christen
a1e8bdd5e9
log ppm instead of docs/second
10 years ago
Michael Peter Christen
cc0ded7abd
set process type of web graph according to fields as defined in the schema
10 years ago
Michael Peter Christen
12fb9d7cd1
log postprocessing constraints in case that postprocessing is not performed
10 years ago
Michael Peter Christen
3c23b89823
less logging
10 years ago
Michael Peter Christen
a0c53174c5
better solr query logging to detect unnecessary sort requests for more
performance profiling
10 years ago
Michael Peter Christen
338f574bdc
no sorting if http/www unique fields are not demanded (makes query
faster) and some code restructuring
10 years ago
Michael Peter Christen
1609763be5
toString fix
10 years ago
Michael Peter Christen
b983e68254
more retries, less sleep
10 years ago
Michael Peter Christen
1503ba7794
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
10 years ago
reger
8f77719091
fix "Ljava.lang.String" in crawl queue anchor name
(e.g. IndexCreateQueues_p.html?stack=LOCAL with images in queue)
10 years ago
Michael Peter Christen
0ceeceb35e
more logic on Solr queries; usage of the query terms in postprocessing,
saving one query for double document detection now per document
10 years ago
reger
3963bca3b6
catch IndexControlRWIs_p error if RWI not connected
10 years ago
orbiter
38864ae004
Merge branch 'master' of git@gitorious.org:yacy/rc1.git
10 years ago
orbiter
4099296b45
added new classes which shall reduce call overhead to Solr (stub)
10 years ago
reger
d0c02e1de7
adjust rss lat/lon to double
(common format across other classes)
10 years ago
orbiter
3491ab4c38
removed unused images from webgraph edge computation
10 years ago
orbiter
2371d6b8db
target linktexts must be string to enable search facets on these fields
10 years ago
Michael Peter Christen
001e05bb80
do not store failure of loading of robots.txt into the index as a fail
document
10 years ago
Michael Peter Christen
05d58e4df0
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
10 years ago
Michael Peter Christen
98f45c9032
fix for image alt attachment to AnchorURLs in html parser.
10 years ago
orbiter
22ce4fb4dd
better error handling for remote solr queries and exists-checks
10 years ago
reger
b510b182d8
- update Maven pom
- add ppt parser test case
10 years ago
Marc Nause
3dcfc717eb
This hopefully fixes http://mantis.tokeek.de/view.php?id=424
10 years ago
Marc Nause
9df14fc126
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
10 years ago
Marc Nause
477be17c51
Replaced old UPNP library with Weupnp. UPNP should
work now, at least it does on my network. UPNP code in YaCy can still
be improved though (see TODO comment: make port on gateway configurable
or find free one).
*) removed old code
*) added new lib
*) changed code to work with new lib
10 years ago
orbiter
738989aab7
reverted commit f94c91315b because the
webgraph has not enough performance for that
10 years ago
orbiter
e9163e7e10
fix for malformed hostpath names in crawl balancer
10 years ago
orbiter
161a11070c
yacystats is gone :(
10 years ago
Michael Peter Christen
c115f3869c
enhanced snippet computation and test method in ViewFile
10 years ago
reger
6c10b59f3e
move bootstrap peers test systems to its test class
var assignment not needed elsewhere.
10 years ago
reger
7328c2883b
fix typo in .init description
http://mantis.tokeek.de/view.php?id=430
10 years ago
reger
94819f0797
set .ini default boost fields to same as assigned by button "reset to default"
(in RankingSolr_p)
- fix typo http://mantis.tokeek.de/view.php?id=430
10 years ago
reger
b4b937a046
update to pdfbox 1.8.6
10 years ago
orbiter
1027f3d04a
fix for the usage of ready-prepared solr queries, some queries are
formulated as edismax query but this was not set as query attribute. The
defType=edismax property needs a qf-field, so this was added as well. Do
not remove that field again! This fixes also a problem with title-unique
computation.
10 years ago
Michael Peter Christen
f94c91315b
if the webgraph is used, then use it also for reference computation to
avoid contradictions with references_i in the collection index.
10 years ago
Michael Peter Christen
6e1dc444c3
added a snippet test function in ViewFile: you can now search for a
specific word on the document; the servlet returns the snippet in the
same way as it would be shown in a search result.
10 years ago
Michael Peter Christen
c63e93df46
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
10 years ago
Michael Peter Christen
1bf605b6d1
toString() fix
10 years ago
orbiter
4b06adb751
fix for file urls
10 years ago
orbiter
08409ec680
no idea why the words max was an ordered one. This change increases speed
during document processing a bit
10 years ago
reger
dd311ddac9
Merge origin/master
10 years ago
reger
e5854a5cdb
fix localhost link to opensearchdescription.xml
10 years ago
reger
29d1945c16
fix double &query parameter (index.html)
?query=word&query=
10 years ago
Marc Nause
172d7e68da
Updated commandline reconfiguration tool.
*) fixed "set HTTP port" (root cause was sloppy implementation of method
which gets values from config file)
*) added "set HTTPS port"
10 years ago
Michael Peter Christen
b44626e55b
fixed target_alt_t in webgraph
10 years ago
Michael Peter Christen
504327b15c
fix for condition for writing the webgraph
10 years ago
Michael Peter Christen
542c20a597
changed handling of crawl profile field crawlingIfOlder: this should be
filled with the date at which a url is considered outdated. That
field was partly misinterpreted and the time interval was filled in. In
case that all the urls which are in the index shall be treated as
outdated, the field is filled now with Long.MAX_VALUE because then all
crawl dates are before that date and therefore outdated.
10 years ago
Michael Peter Christen
4eec1a7452
refactoring (change Metadata name of load time data structure to avoid
confusion with Node data which is also called metadata)
10 years ago
reger
c95ba52cf0
improve logexception info
- log a message or class name instead of msgtxt "null"
10 years ago
reger
7f0e757bb5
fix bookmark.rss
- channel end tag position
- link with html entity
10 years ago