orbiter
6aa474f529
- better logging for web cache access and fail reasons
...
- better Exception handling for web cache access
- distinction between access of web cache for proxy and crawler
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6367 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
3671c37989
added experimental oai-pmh reader and integrated it with the existing dublin core parser
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6366 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
58a00205d5
re-activated the emergency close when too many server connections exist
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6364 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
c57d2070e6
more logging
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6363 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
a995b95367
tried a fix for the httpd access bug (too many unclosed sessions)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6362 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
e1fba41cad
better logging
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6361 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
2275f885a8
possible fix for concurrency problem
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6360 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
low012
a6a3090c3d
*) blacklist cleaner supports usage of regular expressions now
...
*) refactored BlacklistCleaner_p.java for better readability
*) moved check of validity of patterns to the Blacklist implementation since patterns might be valid in one implementation, but not in another
*) added method to check validity to Blacklist interface
*) fixed some minor issues like typos or wrong whitespaces
*) set subversion properties for a whole bunch of files
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6359 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
5a93807781
improved web cache speed:
...
- removed one computation out of a synchronization
- removed one unnecessary has() call
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6358 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
2e8b2867ff
doubled performance of store method because it avoids one 'has'
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6357 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
afda5b1adc
new join method for indexes (not yet used)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6356 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
65b66c2c18
better handling of array files of length 0
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6355 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
1957b5797a
fix for seed generation
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6354 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
432154f725
new strategy for concurrent database index key retrieval
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6353 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
a11cd9f80f
- removed reverse name lookup for http access logging (grr..)
...
- removed a synchronization in seed info string generation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6351 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
2e6bdce086
- added more logging to balancer
...
- changed balancer logic slightly
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6350 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
1171a72006
fix for deadlock as seen in http://forum.yacy-websuche.de/viewtopic.php?p=17521#p17521
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6343 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
031e6eefbd
some updates to dublin core, metadata browsing, file indexing and parser stability
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6342 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
hermens
62a7341c4d
Fix for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=2204
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6341 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
low012
f65bfaa9af
*) Removed base tag from error page. I added this a long time ago as a workaround for some weird behavior of my router, but as it turns out, it does more harm than good in general: if HTTPS is used for communication with YaCy, entering a wrong password led to an error page with a form which would send username and password unencrypted, with the user possibly being unaware of this.
...
*) changed some comments, added some annotations, added SVN properties here and there
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6340 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
e4797ebcde
fix for http://forum.yacy-websuche.de/viewtopic.php?p=17509#p17509
...
corrupted files are ignored
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6339 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
efa7fb34f0
better oom-awareness of miss-cache in cache
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6338 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
3e9dcfc204
fix for http://forum.yacy-websuche.de/viewtopic.php?p=17504#p17504
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6337 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
c3a4aee255
some redesign with a possible fix for the ReferenceContainerCache.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6336 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
aca8a78eb8
fix for shutdown of DocumentIndex objects
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6333 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
23ab6fbca4
- navigation appears at the correct position when opengeodb results are also presented after a search
...
- show an about box if about.headline and about.body are set
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6332 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
4db34eea73
fix for OOM problem in kelondro Cache
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6331 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
8ea1d7ab59
fix for wrong assert condition in search abstract generation
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6330 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
fbd77bd77c
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6328 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
54c7cbf1d9
- fast result for local search when fewer than 10 hits exist
...
- small change in display of RAM in profiling
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6326 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
28d4b921b6
different approach for file search
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6325 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
f99f86c5c5
added concurrency to file indexing class
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6324 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
902d16cf6c
fixes to parser
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6323 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
4a1c852435
fix in usage of RAM copy for Table objects and some cosmetics in asserts.
...
This bug affected Tables when removeOne() was called while a RAM copy of the table was active. It may happen for peer owners with a lot of RAM assigned to YaCy. The bug appeared especially during crawling when the balancer tried to get new entries from the crawl queue.
This fix may help to resolve the report at
http://forum.yacy-websuche.de/viewtopic.php?p=17417#p17417
and will be tracked there
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6322 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
lotus
dce450e2e0
possible fix for "hung" doc-documents
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6320 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
e627f75415
one more fix to badwords and stopwords
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6316 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
721b88efbd
- fixed a problem loading blacklists with new yacycore.jar
...
- fixed badwords and stopwords initialization
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6315 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
80d5005044
fixed seed upload methods - replaced reflection with direct instantiation
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6314 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
68465c37af
added a convenience class to add files into a YaCy index
...
to make this possible, the yacyURL must be able to process file:// urls, which has also been implemented
testing of the new class resulted in some bugfixes in other classes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6313 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
2e41e10ffd
- updates to yacyVersion parser (remove old targets)
...
- added javadoc target to build script (does not yet run without errors)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6312 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
27d00285aa
- added a new file reader cache that may serve as full-file-copy of blob database files. This is not yet used
...
- removed class FileWriter and replaced all usage of that class with CachedFileWriter
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6309 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
fd6b9cb7dc
refactoring of IO access classes
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6308 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
d64569aa39
return only recommendations of words that have a greater count than the original word
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6307 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
604c37927f
used comparator for did-you-mean that uses index sizes for comparison, but:
...
- limit comparison to only the first 10 elements that had been sorted before without IO
- added a size cache to index computation because the size is computed at least twice in set comparator
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6306 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
a58d9cae7d
- show location name in geolocalization search result
...
- added link from location icon to openstreetmap browser with coordinates
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6305 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
573d03c7d7
added configuration to enable ram table copy
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6304 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
3be54e1891
fix to rule when to use a ram table copy
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6302 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
700218846c
disabled or removed sleep calls
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6301 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
342c5d0fd4
fixed city name detection: now also finds substrings of city names
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6300 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
18aa0609ca
fix for caching of word hash computation
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6299 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
a10a6cce45
patch for http://forum.yacy-websuche.de/viewtopic.php?p=17289#p17289
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6298 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
low012
53bbdfd19a
*) setting SVN keywords
...
*) minor changes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6297 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
low012
25f6145934
*) preventing null pointer exception when an empty search word or only one character is entered, or when all search words are removed by filters
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6296 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
low012
248f3fd9b5
*) cleaned up code for better readability
...
*) added a few copyright notices
*) removed redundancy in constructors of ListToken
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6295 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
eaddf2d464
- corrected layout of map preview
...
- added caption to maps containing latitude and longitude information
- prevented maps from appearing on the second search page
- added location names to did-you-mean
- some refactoring of did-you-mean
- added equal and compareTo test to Coordinates class to make that work in set
- fixed utf-8 support for library files
- fixed a bug in images search icon view caption
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6294 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
hermens
4b83875abd
Small fixes for the heapCacheIterator in ReferenceContainerCache:
...
- Start the iteration at startWordHash
- When used with rotation, let the iteration stop when the cache is empty
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6293 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
fd668f531b
fixed map layout
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6292 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
2740d9dd79
added integration of osm maps for search
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6291 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
af3a696fc4
added a fast-fail concept in search processes. The search now has better control if all the remote searches may bring any result. If all processes are finished, then all search tasks fail fast.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6290 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
ce972ff4ef
update to default ranking profile which has now some settings to deny some phpbb3 pages which are redundant in the index when crawling phpbb3.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6288 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
44579fa06d
- fixed a problem loading images through yacy's document loader,
...
this denied non-parseable documents which excluded all images
- fixed url of osm tile server
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6287 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
67eddaec4b
changed way to integrate dictionary files:
...
they must be downloaded manually by the user and placed in DATA/DICTIONARIES/source
for each externally imported dictionary file there will be a translator that converts the input file once
into a YaCy-internal data format.
Files that will be provided together with yacy releases may still be placed in <root>/dictionaries
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6286 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
d656a94f55
fix for bad paths in dictionary processing
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6285 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
3b9aaf9e9f
- inserted new library tests inside DidYouMean
...
- some redesign of DidYouMean that was necessary to follow
  a special rule how a library should be used:
  - the library provides words that start or end with a test
    word which may be possibly also an empty set of words
  - all words that the DidYouMean produced with the four
    production rules are used to generate a set of
    library-completed words
  - if this process results in any words from the library,
    only library-generated words are taken
  - if there is no library-generated word at all, take the
    artificially generated word
  - all words that result from these rules are tested against
    the index
  - the result is ordered using a lightweight comparator that
    prefers short words
  - a not-so-much-io test against the index is being prepared
    next
- inserted the library initialization into the switchboard
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6284 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
8c35ffe34c
fixes to the dymlib
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6283 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
bfa273bcc1
added a library provider which holds libraries in static objects,
...
which can be used by any other classes to support their functions.
libraries are designed in such a way that users can create and
insert their own library files, but can also be imported from
other sources. As an example the "Korpusbasierte Wortgrundformliste
DeReWo des Institut für Deutsche Sprache" from
http://www.ids-mannheim.de has been integrated. This dictionary
is licensed to be used for all non-profit purposes. In case that
YaCy is used for commercial uses, this library must be removed.
The new library provider reads the original source and translates
it into a simple word list to be used for the did-you-mean library
provider. More libraries may be provided in the future using
a download-servlet which puts files from the internet into the
<application-root>/dictionaries/ path.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6282 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
1762a7bcd6
- moved DidYouMean to the data package
...
- added a DidYouMeanLibrary class that shall support the did you mean function with additional word lists
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6281 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
bf8ed00e9e
removed debugging code
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6280 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
ead48c4b25
fix for preparation of search result pages with offset > 10:
...
- less pages are fetched in advance
- just-in-time fetch of next required pages
- fix for missing hand-over of offset to fetch threads
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6279 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
39a311d608
better care not to lose the merge/dump thread
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6278 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
10d3e856b5
better concurrency, less blocking & performance hacks
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6277 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
1a9cfd8718
some performance hacks (CPU only, not IO)
...
this will cause better computation speed for single- and multi-core;
there are enhancements that will speed up old and slow machines as well
as multi-core CPUs. Indexing of surrogates has been sped up
from 4000 PPM to over 20000 PPM on a simple dual core office computer.
Since the enhancements are mostly in core routines, the hack should also
speed up search performance.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6276 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
92407009b2
cleanup
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6275 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
0ba1beaf56
separated rwi constraint evaluation from rwi ranking and added concurrency
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6274 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
ce7924d712
better concurrency for rwi entry parsing during search processing
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6273 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
b0637600d5
enhanced url constraint computation: better position of constraint check during retrieval process
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6272 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
61748285c3
more refactoring of search
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6270 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
323a8e733d
removed unused classes
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6269 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
72e5407115
refactoring of snippet cache
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6268 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
0e471ba33b
- fixed a bug in fast digest computation
...
- added an open-on-demand hack to heap files: when a heap file is
opened the first time, it is first scanned to get a key index
and then it is closed again. This will free up file pointers
in cases where a really large number of blob files are opened
upon initialization of ArrayStack objects. This should solve
also a problem reported in
http://forum.yacy-websuche.de/viewtopic.php?p=17191#p17191
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6267 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
low012
93b2622503
*) repaired and added IM online status indicators
...
*) added some missing SVN properties
*) removed unnecessary comment, added missing copyright notice
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6266 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
e7736d9c8d
more refactoring: made all variables in SearchEvent private
...
to prepare splitting of the class into two parts: local and remote search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6265 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
4b92d0b9b7
patch for possible problems with normalization of '/' in urls. This applies in rare cases when '/' appears in post-properties
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6264 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
d8ca6e6bf1
more refactoring for search
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6263 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
fe4a4e3f6b
added missing class
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6261 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
72ac5bd80f
refactoring of search process.
...
this is the beginning of some architecture changes that will hopefully bring some more stability, speed and transparency to the search process.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6260 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
hermens
c4d0e22a77
Further speed-up of concurrent DHT-receive
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6259 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
hermens
2fbc0696bf
Fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=2334
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6258 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
f1ori
d515bc11e2
added ooxmlparser
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6256 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
d9744b1b5d
replaced old caching strategy control class with lightweight simplearc
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6254 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
8e56c2ace6
fix for fixes from this afternoon
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6253 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
cf739edc2e
fix for possible deadlock, see
...
http://forum.yacy-websuche.de/viewtopic.php?p=17017#p17017
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6252 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
6354b5e447
removed possible deadlock, see
...
http://forum.yacy-websuche.de/viewtopic.php?p=17017#p17017
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6251 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
5cc17ccf8a
a better caching with less overhead and more appropriate
...
synchronisation use in more than 10 different data objects
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6250 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
92edd24e70
fixed problem with switching of networks
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6247 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
0575f12838
fix for deadlock
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6246 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
fbfdaf063d
- patch to omit IndexOutOfBoundsException when a b64-encoded key appears not to be well-formed. In that case the key is still accepted but rated higher than other regular keys to create a virtual ordering between well-formed and ill-formed keys
...
- check routine at the beginning of the import of table keys that checks that all imported keys are well-formed. All records that have an ill-formed key are deleted. This is a hack and is not tested since I don't have bad data here to test with. If the effect is seen in the wild, please report in the forum.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6245 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
c0e17de2fb
- fixes for some problems with the new crawling/caching strategies
...
- speed enhancements for the cache-only cache policy by using special no-delay rules in the balancer
- fixed some deadlock- and 100% CPU problems in the balancer
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6243 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
634a01a9a4
replaced wget-requests with caching requests
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6242 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
c6c97f23ad
- added cache usage properties to crawl start
...
- added special rule to balancer to omit forced delays if cache is used exclusively
- extended the htCache size by default to 32GB
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6241 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
c4ae2cd03f
fixed bug that caused deletion of crawl profiles at every application startup
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6240 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
161d2fd2ef
redesign of access to the HTCache (now http.client.Cache):
...
- better control to the cache by using combined request-header and content access methods
- refactoring of many classes to comply to this new access method
- make sure that the cache is always written if something was loaded
- some redesign of the process how http response results are fed into the new indexing queue
- introduction of a cache read policy:
* never use the cache
* use the cache if the entry exists
* use the cache if the proxy freshness rule confirms it
* use only the cache and never go online
- added configuration options for the crawl profiles to use the new cache policies. There is not yet an input during crawl start to set the policy but this will be added in another step.
- set the default policies for the existing crawl profiles. If you want them to appear in your default profiles you must delete the crawl profiles database; otherwise the policy is 'proxy freshness rule'
- enhanced some cache access methods in such a way that unnecessary retrievals are omitted (i.e. for size computation). That should reduce some IO but also a lot of CPU computation because sizes were computed after decompression of content after retrieval of the content from the disc.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6239 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
f1ori
ba2e6de538
fix empty version string again
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6236 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
51534df0cb
fix for possible synchronization problem
...
see also: http://forum.yacy-websuche.de/viewtopic.php?f=6&t=2292&hilit=&p=16787#p16787
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6234 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
4da9042e8a
code simplification
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6233 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
1d8d51075c
refactoring:
...
- removed the plasma package. The name of that package came from a very early pre-version of YaCy, even before YaCy was named AnomicHTTPProxy. The Proxy project introduced search for cache contents using class files that had been developed during the plasma project. Information from 2002 about plasma can be found here:
http://web.archive.org/web/20020802110827/http://anomic.de/AnomicPlasma/index.html
We still have one class that comes mostly unchanged from the plasma project, the Condenser class. But this is now part of the document package and all other classes in the plasma package can be assigned to other packages.
- cleaned up the http package: better structure of that class and clean isolation of server and client classes. The old HTCache becomes part of the client sub-package of http.
- because the plasmaSwitchboard is now part of the search package all servlets had to be touched to declare a different package source.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6232 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
f1ori
67da20647f
* add new odf parser based on sax-xml-parser
...
* remove odf_utils-jar
* test metadata in ParserTest
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6231 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
f1ori
6d0e6d591b
* oops, fix compiler error :(
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6227 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
f1ori
3e5beb1654
* fix for empty version in seedlist
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6226 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
5bb8074150
removed the indexing queue. This queue was superfluous since the introduction of the blocking queues last year, where documents are parsed, analysed and stored in the index with concurrency.
...
- The indexing queue was a historic data structure that was introduced at the very beginning of the project as a part of the switchboard organisation object structure. Without the indexing queue the switchboard queue becomes also superfluous. It has been removed as well.
- Removing the switchboard queue requires that all servlets are called without an opaque generic ('<?>'). As a result, all servlets had to be modified.
- Many servlets displayed the indexing queue or the size of that queue. In the past months the indexer was so fast that mostly the indexing queue appeared empty, so there was no use of it any more. Because the queue has been removed, the display in the servlets had also to be removed.
- The surrogate work task had been a part of the indexing queue control structure. Without the indexing queue the surrogates needed their own task management. That has been integrated here.
- Because the indexing queue had a special queue entry object and properties attached to this object, the properties had to be moved to the queue entry object which is part of the new indexing queue within the blocking queue, the Response Object. That object has now also the new properties of the removed indexing queue entry object.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6225 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
597393db3b
changed default visibility of classes/objects in upnp lib
...
(eclipse tells me that this would improve performance,
however, this removes compiler warnings)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6224 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
eea4c17ef2
removed rpm parser
...
- no-one used that thing
- loading huge rpm files may be a cause of crashes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6223 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
b332dfad67
- inserted request object into response object, which now carries it instead of generating new objects
...
- fixed a problem with the crawler introduced in SVN 6216
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6222 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
ca72ed7526
-removed superfluous crawl cache
...
-refactoring of crawler classes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6221 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
8103ccec4c
removed compiler warnings in imported classes
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6220 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
lotus
52e371b8f7
suppress warnings for upnplib code
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6219 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
lotus
477807e0e6
* updated jxpath to latest v1.3
...
* added upnplib as source
without packages:
jmx
remote
samples
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6218 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
13c63f4082
a set of small fixes to crawling behaviour
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6216 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
a564df3984
update to mime types in parsers and httpd.mime
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6215 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
43c8defd79
enhanced parser with more extension + mime attributes
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6214 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
aee35bff6f
replaced StringBuffer with StringBuilder in tar lib
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6213 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
49bbb9bd45
replaced tar library with integrated apache ant tar lib
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6212 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
f987fc6b4a
added tar classes from apache ant tools
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6211 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
b2263bc720
enhanced document type recognition
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6209 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
lotus
aa38eb5a20
* maxfilesize -1 for infinite filesize
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6208 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
lotus
9cfe89c8fc
* process content-length as soon as it is received
...
* corrected indentation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6206 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
50cf80056f
removed jmimemagic library
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6203 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
3f113f38a8
removed unused imports
...
removed unused libs from eclipse class path
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6201 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
lotus
9f083bb6b2
check filetype before loading (no more mp4 loading)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6200 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
f1ori
076ae02c44
* added pl and py to extensions accepted by htmlParser
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6198 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
f1ori
d5e51cfd09
* workaround for non-working build property replacements
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6197 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
f1ori
f814e0fa81
enable warnings and fix most of them
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6196 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
f1ori
8931c8d6b4
improvements to debian package:
...
* autoupdate completely disabled, display hint
* restart-button in interface works!
* moved all build-Variables to yacyBuildProperties
* fixed some warnings
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6195 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
low012
fc1dc38b55
*) added spaces to make sure that no words are concatenated by accident
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6194 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
low012
f242e7d7bc
*) using Apache POI library to parse Word documents now
...
*) removed tm-extractors library (can be found at http://www.textmining.org/ if necessary again)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6193 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
caedd72400
- enhanced logging and exception details for parsers
...
- removed inconsistencies in mime type declaration (one mime type should only appear once in all parsers)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6192 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
4b74ad0a46
fixed setting of parser configuration servlets
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6191 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
57a88d435b
redesign of parser mime type detection and parser steering
...
There is now a mime-blacklist instead of a mime-whitelist
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6190 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
lotus
e15d27bc63
avoiding double/wrong parser errors
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6189 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
21b8704fb4
refactoring of the ParserDispatcher and ParserConfig: resulted in Idiom, Parser and Classification classes
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6188 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
8ca1f5d400
- some work to integrate the html parser the same way as the other parsers are integrated (not finished)
...
- added migration of code of settings pages (hmm.. does not work correctly yet, sorry)
- more refactoring
- removed more unused code
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6187 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
low012
1ee109761f
*) added changes which were lost
...
*) additional annotations
*) additional svn properties
*) _no_ functional changes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6186 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
499723891d
removed all non-http daemons; they had not been used and may be a potential security risk.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6185 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
0e8647d62f
refactoring of search classes
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6184 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
dafffd0153
refactoring of parsers and document processing
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6182 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
low012
8041e91f56
*) Ooops!
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6181 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
low012
69551ff3d9
*) added several MIME types (derived from http://filext.com/ ), some of them might be rather uncommon
...
*) added an annotation forgotten in last commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6180 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
low012
11dfb2d54f
minor changes:
...
*) added annotations
*) set svn properties and added keywords to comments of parser classes
*) made a variable final to prevent (theoretical case of) change of object instance in synchronized block
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6179 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
77d2a3782c
removed strange debugging strings
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6177 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
lotus
4320f69574
universal handling for crashed parsers
...
reverting r6090/1
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6176 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
024744245c
small refactoring to prepare for new queues
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6173 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
16efcd0366
fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=2252&hilit=&p=16389#p16389
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6172 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
f1ori
0f3246e90a
* fix debian package
...
* add Class containing buildvariables
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6171 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
f1ori
8544cfd5a6
* remove separate build-files for parsers
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6170 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
24cb6d68bc
- renamed Stack to RecordStack to avoid name confusion with new classes
...
- added new Stack class that implements a stack on BLOB files
- added new Stacks class that can be used for a set of Stacks (a 'Stack Database')
- added methods to other classes to support the new stacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6169 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
995da28c73
all stack/heap files that had been stored in DATA/PLASMA are now stored in the network-specific QUEUES path
...
There is no migration. All crawls must be restarted.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6167 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
aac89bf8ca
trying to avoid "exceeding limit" message of server
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6166 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
f1ori
48d78166ed
* fix double copy of libraries
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6164 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
lotus
7f868ca3c2
resource observer: support for yacyroot\DATA on an NTFS hardlink (Windows)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6162 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
409538e17a
code cleanup and code simplification
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6161 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
160031758d
fix for problem with initializer
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6160 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
302a02cec8
moved all libraries from libx to lib
...
removed libx directory
all libraries are now in lib, except the test libraries in libt, which are not part of releases
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6157 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
1f1399e5c5
extending visibility of objects and methods to avoid synthetic accessor methods and increase performance
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6156 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
154bbc3364
code cleanup: call of static methods directly to the class
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6155 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
222850414e
simplification of the code: removed unused classes, methods and variables
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6154 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
93dfb51fd4
problems with code style
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6153 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
adf01c676e
reduce lookup time when merging a large number of BLOBs
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6152 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
9a674d8047
- After the removal of the Tree class some code simplifications are possible. This mostly affects the Records class, which can be refactored, resulting in a reduced number of classes.
...
- The EcoTable was renamed to Table.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6151 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
c5122d6836
completed migration of BLOBTree to BLOBHeaps:
...
- removed migration code
- removed BLOBTree
after the removal of the BLOBTree, a lot of dead code appeared:
- removed dead code that was needed for BLOBTree
Some more classes may not have much use any more after the removal of BLOBTree, but still have some components that are needed elsewhere. Additional refactoring steps are needed to clean up dependencies, and then more code may appear that is unused and can be removed as well.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6150 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
d1083a6913
maybe we have fewer problems with open connections to the server if we don't do BF forced sleeps (just a test)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6149 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
low012
ebe6c823ac
*) changed svn properties again (hopefully doing it right this time)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6147 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
low012
a80ac3a415
*) fixed wrong parser descriptions
...
*) changed svn properties
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6146 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
low012
457b6c0d6d
*) updated Apache POI library to be able to parse Visio files
...
*) updated PPT and XLS parsers to use new Apache POI library
*) added new Visio (VSD) parser
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6145 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
apfelmaennchen
a10c8022d1
DidYouMean:
...
- limit the number of consumer threads to available CPUs
- added some javadoc
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6144 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
f1ori
7eb3bff5b3
* workaround for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=2220&hilit=#p16128
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6143 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
99fa265e1d
fix for search bug caused by tenant patch
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6125 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
79875782af
be a bit more lazy when removing domain navigation entries
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6120 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
57af311627
fix for wrong urls in navigator when a tenant is used
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6119 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
lotus
76b96337e2
just some chatty code
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6118 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
low012
91785d895c
*) minor changes in comments
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6109 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
bdda140c02
fix for json output (no doublequotes any more, doublequote quoting did not work)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6105 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
2f84736120
ignore signature files that cannot be downloaded because of failed encoding
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6103 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
041d9c253e
some refactoring and more error-awareness in LogalizeHandler
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6102 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
6b307d6d59
more tolerance for corrupted index entries in exported row sets
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6099 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
33aafa9b4b
better logging when writing merged dumps
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6098 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
lotus
db70badcf0
possibility to set remote host on upnp device
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6097 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
4d29e90708
uaeh
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6096 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
3c3e6499ae
added more logging for merge operation
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6095 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
15180fc95e
- patch for future computation in SplitTable
...
- added same concurrent process for has() from SplitTable in ArrayStack
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6093 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
9a5ec20b3c
avoid merge during startup
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6092 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
lotus
bf6b92343c
try to avoid stuck pdf parser
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6091 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
lotus
c695c7f512
try to remove hung swf parser from queue
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6090 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
fc69a76197
update to web structure picture:
...
- allow bigger size
- better instructions for api usage
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6089 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
ae015e8e98
refactoring of blob package classes
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6088 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
8b8877c233
moved image collector
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6087 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
be1c7ddc64
refactoring of search classes -- moved Ranking Profile to search package
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6086 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
fd31a3616a
- more logging in server process
...
- fix for bad ascii in comment
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6084 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
5a7fd6b4c8
just some comment lines
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6081 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
31f60a3b3e
when doing searches, also apply an online caution to DHT transmission and stop transmissions during heavy load caused by searching. This avoids the many requests to the URL database that are needed for DHT transfer, and it avoids collisions with URL retrieval needed for search results.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6080 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
17dc6d4be5
small fix for new Logger
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6079 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago