Michael Peter Christen
c3dcbdc8d5
try to recover from an OOM during citation index reading and fail-over
to second solr core in case of unrecoverable OOM.
11 years ago
Michael Peter Christen
7b69c438f7
more methods for the table class
11 years ago
Michael Peter Christen
5e31bad711
- the webgraph shall store all links which appear on a web page, not
only the unique links. This made it necessary to adapt a large portion
of the parser and link processing classes to carry a different type of
link collection, one that carries the property attributes attached to
web anchors.
- introduction of a new URL class, AnchorURL
- the other url classes, DigestURI and MultiProtocolURI, have been renamed
and refactored to fit into a new document package schema, document.id
- cleanup of net.yacy.cora.document package and refactoring
11 years ago
Michael Peter Christen
47b1c81d08
- refactoring
- generalized writing of url attributes to solr documents
- added more url attributes to error documents
11 years ago
Roland Haeder
13433d41a1
Log this exception better
Conflicts:
source/net/yacy/kelondro/blob/Tables.java
11 years ago
Michael Peter Christen
aeac2fb763
replaced more containsKey()-then-get() usages with a single get() followed
by a test for null. This should increase the application speed and
reduce the lookup time for the affected methods by 50%
11 years ago
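The containsKey()/get() change described in the commit above can be illustrated with a minimal sketch. The class and method names here (`SingleLookup`, `lookupTwice`, `lookupOnce`) are hypothetical and not taken from the YaCy sources; the sketch also assumes the map never stores null values, since otherwise a null result would be ambiguous.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; not code from the YaCy repository.
class SingleLookup {

    // Before: two hash lookups, one in containsKey() and one in get().
    static String lookupTwice(final Map<String, String> map, final String key) {
        if (map.containsKey(key)) {
            return map.get(key);
        }
        return "";
    }

    // After: one hash lookup followed by a null test.
    // Assumption: the map never holds null values.
    static String lookupOnce(final Map<String, String> map, final String key) {
        final String value = map.get(key);
        return value == null ? "" : value;
    }

    public static void main(final String[] args) {
        final Map<String, String> map = new HashMap<>();
        map.put("host", "example.org");
        System.out.println(lookupTwice(map, "host"));
        System.out.println(lookupOnce(map, "host"));
    }
}
```

Both methods return the same result for maps without null values; the second simply performs one lookup per call instead of two.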
Roland Haeder
841a28ae76
Added 'final' for all exception blocks as this helps the Java compiler
to optimize memory usage
Conflicts:
source/net/yacy/search/Switchboard.java
11 years ago
Michael Peter Christen
5878c1d599
- refactoring of log to ConcurrentLog:
JDK-based loggers tend to block
at java.util.logging.Logger.log(Logger.java:476) in concurrent
environments. This makes logging a major performance issue. To overcome
this problem, this is an add-on to JDK logging which puts log entries on a
concurrent message queue and logs the messages one by one using a
separate process.
- FTPClient uses the concurrent logging instead of the log4j logger
12 years ago
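The queue-based logging idea from the ConcurrentLog commit above could look roughly like the following. This is a minimal sketch under stated assumptions, not the actual ConcurrentLog implementation: the class name `QueuedLog`, its `info` method, and the use of a single worker thread with a sentinel record for shutdown are all hypothetical choices made for illustration.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical sketch; not the YaCy ConcurrentLog class.
class QueuedLog implements AutoCloseable {

    // Sentinel record that tells the worker to stop (compared by identity).
    private static final LogRecord POISON = new LogRecord(Level.OFF, "");

    private final BlockingQueue<LogRecord> queue = new LinkedBlockingQueue<>();
    private final Thread worker;

    QueuedLog(final Logger target) {
        // A single worker drains the queue and writes records one by one,
        // so callers never wait on the JDK logger itself.
        this.worker = new Thread(() -> {
            try {
                LogRecord record;
                while ((record = this.queue.take()) != POISON) {
                    target.log(record);
                }
            } catch (final InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "QueuedLog-worker");
        this.worker.setDaemon(true);
        this.worker.start();
    }

    // Enqueue only; the unbounded queue means this call does not block.
    void info(final String message) {
        this.queue.add(new LogRecord(Level.INFO, message));
    }

    // Flush everything queued so far, then stop the worker.
    @Override
    public void close() {
        this.queue.add(POISON);
        try {
            this.worker.join();
        } catch (final InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

A caller would construct it with an existing `java.util.logging.Logger`, call `info(...)` from any thread, and call `close()` at shutdown so that queued entries are written before the program exits.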
Michael Peter Christen
14186e815e
npe fix
12 years ago
Michael Peter Christen
e20450e798
patch in HTCache and CitationIndex loading in case that a file is
broken: do not crash; instead ignore the file and delete it.
12 years ago
orbiter
a1c989002b
fix for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=4652
generate dht data even if dht receive and dht transmission are switched
off
12 years ago
orbiter
e1bfe9d07a
- reduction of the concurrently running processes to make YaCy better
suited to smaller and single-core devices.
- the workflow processor now starts no process at all. Processes are started
as soon as parser/condenser/indexing queues are filled.
- better abstraction
12 years ago
Michael Peter Christen
089dee1770
- generalized SchemaConfiguration into super-class Configuration and
adapted other classes which used the configuration-only access for that
class
- removed many warnings
- adjusted logging
12 years ago
orbiter
712cc37c40
if maxFileSize < 0 then there is no file size limit.
12 years ago
orbiter
354f0d9acd
moved static method from ClusteredScoreMap to MapDataMining because it
was not used in the ClusteredScoreMap class but only in MapDataMining
12 years ago
orbiter
276dd6452b
removed warnings
12 years ago
Michael Peter Christen
2f536cb54d
code cleanup: removed unused methods and made more methods and objects
private
12 years ago
apfelmaennchen
116f429e35
fix for java.lang.RuntimeException: TableColumnIndex not available...
12 years ago
Michael Peter Christen
8219a445f3
refactoring
12 years ago
Michael Peter Christen
00c1c777fa
refactoring
12 years ago
orbiter
563d584420
removed more dependencies in cora from kelondro
12 years ago
Michael Peter Christen
e65cecc419
- updated lucene libraries to 3.6.1
- added lucene-grouping which enables faceted search; try this:
http://localhost:8090/solr/select?q=*:*&start=0&rows=3&facet=true&facet.field=host_s
12 years ago
Michael Peter Christen
4d29f59a27
removed warnings
12 years ago
apfelmaennchen
d31a632951
- added dmoz RDF dump importer
- added indexing to Tables columns to support larger bookmark
collections
- added RDF output (HTTP) for public bookmarks at /YMarks.rdf
- YMarkRDF also provides a Jena RDF Model as "internal" API
- various other changes/fixes for YMarks (mainly backend)
12 years ago
orbiter
2094df2e4e
- correct length computation for BStringObject (bugfix suggested by
apfelmaennchen)
- using ASCII for string conversion for Strings generated from Integer
12 years ago
Michael Peter Christen
94a334f128
another fix to the Solr metadata reading process and to the shutdown
process
12 years ago
sixcooler
f32aa9a49c
prevent merge of blobs that can't be handled in memory
12 years ago
Michael Peter Christen
1687737771
Abstraction of HandleMap and HandleSet
12 years ago
Michael Peter Christen
e432bb9cd9
better calculation of possible saving in HeapReader index data structure
12 years ago
Michael Peter Christen
9549984c65
documentation/comments
12 years ago
Michael Peter Christen
f78ce93a80
collection of speed and memory saving hacks
13 years ago
orbiter
482afed07c
reduced logging overhead (a bit)
13 years ago
orbiter
0cbda0b2b8
- replaced all length() == 0 and size() == 0 with isEmpty()
- replaced some length() > 0 and size() > 0 with !isEmpty() - cannot be
done automatically
- implemented some isEmpty() methods
13 years ago
Michael Peter Christen
b0c408788b
made class methods static where possible
13 years ago
Michael Peter Christen
0301aba1e9
removed unused method parameters
13 years ago
Michael Peter Christen
ea10766bfd
cleaned unnecessary nested code
13 years ago
Michael Peter Christen
8a82609360
- smaller caches to save memory
- close cloneable iterators to free memory
13 years ago
Michael Peter Christen
0c345d1559
giving threads names so it's easier to see what's happening during
debugging and within a thread dump
13 years ago
Michael Peter Christen
de3ef8ad73
removed unimportant warnings
13 years ago
Michael Peter Christen
bef823c247
close the reader if finished
13 years ago
cominch
9cbfc1a1c0
augmentedProxy, which forwards every proxy request to a
rewrite engine to customize existing webpages. originally implemented by
Florian Richter.
Conflicts:
source/de/anomic/http/server/HTTPDProxyHandler.java
13 years ago
Michael Peter Christen
ba10caf89a
lazy initialization of database tables
13 years ago
Michael Peter Christen
701b9a28a0
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
Conflicts:
htroot/PerformanceMemory_p.java
13 years ago
Michael Peter Christen
10c9c17d51
fixed handlemap spread factor and null iterator handling
13 years ago
Michael Peter Christen
b0095c8d3c
flush the compressor cache when a cleanup is done
13 years ago
Michael Peter Christen
96e9d77270
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
Conflicts:
source/net/yacy/cora/sorting/WeakPriorityBlockingQueue.java
13 years ago
Michael Peter Christen
3dd8376825
added automatic cleaning of the cache if the metadata and file database sizes
are not equal. These can differ because one of
the caches is cleaned after a while or when it grows too big. The metadata
was then not cleaned; it is now wiped after a checkup process at every
application start. This should cause a bit less memory usage.
13 years ago
Michael Peter Christen
6bb07afcc3
accept also files with other file prefix; used to read 'foreign' cache
files
13 years ago
Michael Peter Christen
461a0ce052
removed warnings
13 years ago
reger
6696cb1313
bugfix: lookup of peer names returned no result for an active peer in page IndexControlRWIs_p.html -> Transfer RWI to other Peer
SeedDB.lookupByName searches for lowercase peer names, while MapColumnIndex.getIndex uses the peer name as is in the keyset.
Changed the index init to insert lowercase peer names as keys
13 years ago