Michael Peter Christen
a8dc4346e8
default configuration of MMapDirectoryFactory for solr, increased lock
timeout, fewer documents from remote searches (too many results could
easily block a peer)
12 years ago
Michael Peter Christen
0c1a018bbd
removed 'later' tactic because it used too much RAM, reduced number of
soft commits, reduced caching size of search events, ensured that solr
results are processed before the connection is closed so that they do
not stay in RAM too long
12 years ago
Michael Peter Christen
5344a1c5f7
getting the trash out
12 years ago
Michael Peter Christen
709e9b8ce7
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
Michael Peter Christen
1eb9626cca
less logging
12 years ago
Michael Peter Christen
281959a2d7
added option to re-boot the embedded solr during run-time. Added also
API recording for this method so it can be repeated automatically. The
index dump generation is now also available for API recording. Added
some synchronization in backend which was necessary for this.
12 years ago
orbiter
da621e827e
prevent NPE in case RWI is disabled
12 years ago
Michael Peter Christen
c2bcfd8afb
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
Michael Peter Christen
67757b425a
use a retry handler with retryCount=0 because we usually expect requests
to fail if we access non-permanently available resources (peers, web
pages) and want to fail fast without repeating the same request which is
doomed to fail. The previous http client connection setup had a
1-2-4-8-second timeout scheme, which made connection attempts last for
16 seconds.
12 years ago
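As a rough illustration of the fail-fast reasoning in commit 67757b425a: assuming the 1-2-4-8 scheme means waits that double from one second, four retries add 15 seconds of pure waiting, which is in line with the roughly 16 seconds the message cites. The class and method names below are hypothetical and are not taken from YaCy or from httpclient.

```java
import java.util.function.BooleanSupplier;

final class FailFastRetry {
    /** Sum of the waits in a doubling back-off schedule: 1 + 2 + 4 + ... seconds. */
    static long totalBackoffSeconds(int retryCount) {
        long total = 0;
        for (int i = 0; i < retryCount; i++) {
            total += 1L << i; // 1, 2, 4, 8, ...
        }
        return total;
    }

    /** Runs the request once plus at most retryCount retries; returns the attempts made. */
    static int attemptsUntilGivingUp(int retryCount, BooleanSupplier request) {
        int attempts = 0;
        do {
            attempts++;
            if (request.getAsBoolean()) break; // success, stop immediately
        } while (attempts <= retryCount);
        return attempts;
    }

    public static void main(String[] args) {
        // A request that always fails, like an unreachable peer (assumption for the demo).
        BooleanSupplier unreachable = () -> false;
        System.out.println("retryCount=4: attempts=" + attemptsUntilGivingUp(4, unreachable)
                + ", waiting=" + totalBackoffSeconds(4) + "s");
        System.out.println("retryCount=0: attempts=" + attemptsUntilGivingUp(0, unreachable)
                + ", waiting=" + totalBackoffSeconds(0) + "s");
    }
}
```

With retryCount=0 the request is tried exactly once and no back-off time is spent, which is the behaviour the commit message asks for.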
Michael Peter Christen
c2b1075dcf
activating pollImmediately in case that DHT receive is off. This will
cause a much faster search result when running in public robinson mode.
12 years ago
orbiter
888a985dc6
set a higher limit for table copy usage
12 years ago
Michael Peter Christen
2b563debbf
javadoc of new multiple-exist test
12 years ago
Marc Nause
8fb1b1e290
*) simplified banner creation code
12 years ago
Michael Peter Christen
8f2d3ce2f9
reduced locking situation in crawler: shifted synchronized location and
reduced time-out of robots.txt load limit
12 years ago
reger
97ab5b90e8
- odt & ooxml (office document) parser correction to add content to fulltext index
- adjust Junit yacyVersionTest & ParserTest
- update yacyVersion.combined2prettyVersion to the default 4-digit minor ver.
12 years ago
Michael Peter Christen
b68fbe7d21
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
Conflicts:
source/net/yacy/migration.java
12 years ago
Michael Peter Christen
06d3063dc9
- no downcase when using collection modifier
- removed warnings
12 years ago
Michael Peter Christen
8dbc80da70
redesign of index.exist-test: this is no longer done with a single
id to be tested, but with a collection of ids. This will cause only a
single call to solr instead of many. The result is much better
performance when testing the existence of many urls. The effect should
be far less IO during index transmission, both on sender and
receiver side.
12 years ago
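The batching idea in commit 8dbc80da70 can be sketched as building one query for a whole collection of ids instead of one query per id. This is only an illustration: the field name `id`, the class name and the method name are assumptions, and the ids are assumed to contain no quote characters. It is not the query YaCy actually sends.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.StringJoiner;

final class BatchExistQuery {
    /** Builds a single Lucene-style query that matches any of the given ids. */
    static String existQuery(Collection<String> ids) {
        if (ids.isEmpty()) {
            throw new IllegalArgumentException("at least one id is required");
        }
        StringJoiner joiner = new StringJoiner(" OR ", "id:(", ")");
        for (String id : ids) {
            joiner.add('"' + id + '"'); // quote each id; assumes ids contain no quotes
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // One request covers all three ids instead of three separate requests.
        System.out.println(existQuery(Arrays.asList("aaa", "bbb", "ccc")));
    }
}
```

The ids returned by that single query are the ones that exist; every requested id missing from the result does not.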
reger
7f63d3747d
more generic field selection for reindex option of documents with disabled fields
using Luke request to compare config with actual fields in index
12 years ago
Michael Peter Christen
c91c67c3cd
reject bad solr requests
12 years ago
Michael Peter Christen
44e363f37f
refactoring of WorkflowProcessor, added process counter, update of
process counter if a blocking thread dies. Also added a new column in
PerformanceConcurrency_p servlet to show the actual number of concurrent
processes.
12 years ago
Michael Peter Christen
4058369288
fixed query expressions for collection selection (added quotes)
12 years ago
Michael Peter Christen
f2e36fbd06
enhanced deletion process for very large number of documents
12 years ago
reger
79401cb938
added reindex option for documents with disabled or obsolete fields to Solr Schema Editor page (IndexSchema_p.html)
this allows removing obsolete fields from the index (according to current schema config)
by selecting all documents containing disabled fields.
12 years ago
orbiter
cf36c1614f
prevent that concurrent deletion process causes wrong double-check in
crawl start
12 years ago
orbiter
aeff31cd44
fix for workflow processor (cause: latest redesign for less threads)
12 years ago
Michael Peter Christen
77faeada4d
small memory leak patch
12 years ago
Michael Peter Christen
b24d1d18e4
removed synchronization and concurrency in Fulltext class, concurrent
deletions are now handled in ConcurrentUpdateSolrConnector
12 years ago
Michael Peter Christen
b9b446bca6
- added ssl configuration sign (a lock) to network statistic/table
- fixed a bug in bitfield
12 years ago
Michael Peter Christen
e6c8b545c2
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
orbiter
a83c2fe833
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
orbiter
4baa0d4a97
Added a default keystore for ssl encryption of the YaCy web interface.
This will enable https-access to YaCy, but this feature is disabled by
default using the new server.https=false attribute. This has two
purposes:
- make it easier for everyone to use https (just set server.https=true)
- provide the basis for secure yacy-to-yacy communication in the future
12 years ago
Michael Peter Christen
aaddb4809c
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
Michael Peter Christen
038f956821
fix for sitemap detection: the sitemap url was not visible if it
appeared after the declaration of robots allow/deny for the crawler
because the sitemap parser terminated after the allow/deny rules had
been found. Now the parser reads the robots.txt until the end to
discover also sitemap rules at the end of the file.
12 years ago
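The fix described in commit 038f956821 amounts to scanning the whole robots.txt for sitemap lines rather than stopping after the allow/deny rules. A minimal sketch of that idea follows; the class and method names are hypothetical and this is not the YaCy parser itself.

```java
import java.util.ArrayList;
import java.util.List;

final class RobotsSitemapScan {
    private static final String PREFIX = "sitemap:";

    /** Collects every "Sitemap:" URL, wherever it appears in the file. */
    static List<String> sitemapUrls(String robotsTxt) {
        List<String> urls = new ArrayList<>();
        for (String raw : robotsTxt.split("\\R")) { // read every line up to the end
            String line = raw.trim();
            // case-insensitive prefix check, independent of allow/deny rules seen before
            if (line.regionMatches(true, 0, PREFIX, 0, PREFIX.length())) {
                String url = line.substring(PREFIX.length()).trim();
                if (!url.isEmpty()) urls.add(url);
            }
        }
        return urls;
    }

    public static void main(String[] args) {
        String robots = "User-agent: *\nDisallow: /private/\nAllow: /\n\n"
                + "Sitemap: http://example.org/sitemap.xml\n";
        System.out.println(sitemapUrls(robots));
    }
}
```

Because the loop never exits early, a sitemap declared after the last Disallow or Allow line is still found.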
reger
4fc6837690
- fix monitor url of crawl job in PerformanceQueues_p.html
- reduce logging of every index add (switch embeddedsolr.add from info to debug)
12 years ago
Michael Peter Christen
442ed50be0
removed some unnecessary synchronizations
12 years ago
Michael Peter Christen
ad050ec88d
- upgraded httpclient, httpcore and httpmime
- removed httpclient 3.1 which has been used by solrj < 4.x.x and is now
not used any more
- fixed some parts in YaCy which used methods from httpclient 3.1
12 years ago
orbiter
a1c989002b
fix for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=4652
generate dht data even if dht receive and dht transmission is switched
off
12 years ago
Michael Peter Christen
e26bdd4a52
fixes to deletion methods (removed unnecessary concurrency and added
removal of crawl queue entries)
12 years ago
Michael Peter Christen
f2c9b0b5f2
better robustness of Concurrent Solr Connector against update/deletion
thread failure
12 years ago
Michael Peter Christen
f7f3e28c5e
prevent that the size of the index is computed too many times.
Because the index size is now provided by solr, and the only way to do
that is a match for [* TO *], a size computation is quite complex and
time-consuming. Therefore this patch prevents that the method is called
at all and if necessary puts a DOS-preventing barrier in front of it.
12 years ago
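Commit f7f3e28c5e does not say how its DOS-preventing barrier works. One common way to protect an expensive count is to cache the value and recompute it at most once per interval; the sketch below assumes that approach, and all names in it are hypothetical.

```java
import java.util.function.LongSupplier;

final class ThrottledIndexSize {
    private final LongSupplier expensiveCount; // stands in for the costly index count
    private final long minIntervalMillis;      // minimum time between two real computations
    private long cachedSize;
    private long lastComputedAt;
    private boolean hasValue = false;

    ThrottledIndexSize(LongSupplier expensiveCount, long minIntervalMillis) {
        this.expensiveCount = expensiveCount;
        this.minIntervalMillis = minIntervalMillis;
    }

    /** Returns the cached size unless the interval has elapsed since the last computation. */
    synchronized long size(long nowMillis) {
        if (!hasValue || nowMillis - lastComputedAt >= minIntervalMillis) {
            cachedSize = expensiveCount.getAsLong();
            lastComputedAt = nowMillis;
            hasValue = true;
        }
        return cachedSize;
    }

    public static void main(String[] args) {
        ThrottledIndexSize size = new ThrottledIndexSize(() -> 123456L, 10_000L);
        // Repeated calls inside the interval reuse the cached value.
        System.out.println(size.size(System.currentTimeMillis()));
        System.out.println(size.size(System.currentTimeMillis()));
    }
}
```

Passing the current time as a parameter keeps the class easy to test; callers would normally pass System.currentTimeMillis().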
Michael Peter Christen
cca19d94d4
re-declared some fields to be of type string rather than text which
makes them more efficient and smaller
12 years ago
Michael Peter Christen
cc90f82dbb
increased default proxy client timeout to one minute
12 years ago
Michael Peter Christen
ed1d5bace6
draw the names of other peers which receive/send dht into the network
graphic
12 years ago
Michael Peter Christen
b528448332
enlarge network graph circle according to image height and reduce the
image height in the Network servlet. Overall, the image is now larger
but takes less space on the web page.
12 years ago
reger
24d2b4baee
remove pre 1.0 migration statement which possibly overwrites user navigator setting
12 years ago
Michael Peter Christen
3841854c97
abstraction of catchall term
12 years ago
Michael Peter Christen
ea85674be2
added the date to error documents
12 years ago
Michael Peter Christen
6fafed2180
fix for solr cache when a delete buffer is filled and a document, which
is in the delete queue, is replaced with a new one.
12 years ago
Michael Peter Christen
20b767f35e
preventing score computation in solr where applicable
12 years ago