184 Commits (6db8921a0f6b3c519e88c5cfcfd7172c99b04619)

Author SHA1 Message Date
orbiter 3014e5f6f9 - integrated live search in the IndexControlURLs input window for URLs:
15 years ago
orbiter 0769517129 added a robots.txt monitor in the crawler monitor submenu
15 years ago
orbiter 840527689b more simplification of bookmark class
15 years ago
orbiter ada0ce9de3 refactoring of bookmarks: there is a big performance problem in the bookmarks code and furthermore the bookmarks
15 years ago
orbiter 2113fcd7e5 - fixed usage of isEmpty() which is not available in java 1.5
15 years ago
orbiter dd459281c8 applied code changes that are recommended by PMD
15 years ago
orbiter 362b7a929b added extensive memory protection logic to avoid out of memory errors that may be caused by the RowCollection memory allocation function
15 years ago
orbiter e34e63a039 preset of proper HashMap dimensions: should prevent re-hashing and increase performance
15 years ago
orbiter 4a5100789f replaced _all_ size() == 0 with isEmpty() and all size() > 0 with !isEmpty(). The isEmpty() method is much faster in some cases, especially when used to access badly balanced hashtables where a size() operation becomes a large iteration.
15 years ago
orbiter 5399d1e2bc refactoring (reason: get more abstraction to use the blacklist class; for integration in other servlets)
15 years ago
orbiter 4c99d4683d possible fix for lost crawl profile handles: clean-up job did wrong measurement to see if crawl is still running.
15 years ago
orbiter 4431b9767e added about 450 replacements for printStackTrace() methods to pipe such traces into the log at DATA/LOG/
15 years ago
orbiter 5e8038ac4d - refactoring of blacklists
15 years ago
orbiter 26fafd85a5 - more refactoring
15 years ago
orbiter 3528b970d6 - refactoring
15 years ago
orbiter b79f4f062f refactoring of yacy documents and parsers: they depend now only on the kelondro classes
15 years ago
orbiter 5841ee83d3 refactoring
15 years ago
orbiter ce8dc575ca refactoring
15 years ago
orbiter bea3b99aff moved table and util classes
15 years ago
orbiter 1e4f8b56ed accumulated classes from different packages into the new rwi package
15 years ago
orbiter 4446acc8cd moved kelondro order
15 years ago
orbiter 735e2737e3 * added index segments
15 years ago
orbiter 031e6eefbd some updates to dublin core, metadata browsing, file indexing and parser stability
15 years ago
orbiter c0e17de2fb - fixes for some problems with the new crawling/caching strategies
16 years ago
orbiter 634a01a9a4 replaced wget-requests with caching requests
16 years ago
orbiter 1d8d51075c refactoring:
16 years ago
orbiter 5bb8074150 removed the indexing queue. This queue was superfluous since the introduction of the blocking queues last year, where documents are parsed, analysed and stored in the index with concurrency.
16 years ago
orbiter ca72ed7526 -removed superfluous crawl cache
16 years ago
orbiter 13c63f4082 a set of small fixes to crawling behaviour
16 years ago
f1ori 8931c8d6b4 improvements to debianpackage:
16 years ago
orbiter 0e8647d62f refactoring of search classes
16 years ago
orbiter dafffd0153 refactoring of parsers and document processing
16 years ago
orbiter 154bbc3364 code cleanup: call of static methods directly to the class
16 years ago
orbiter bc6dd8194b refactoring: moved search query class to new search package
16 years ago
orbiter 945777aa80 replaced rwi term counting method by one that computes the maximum of the blobs that contribute to the RWI. Adding the blob sizes is incorrect and does not reflect the real size. Truncating the size operation to the maximum of all blobs is also incorrect, but not as wrong as the sum of all blob sizes, which double-counts many rwi entries.
16 years ago
orbiter cc49aedf12 - fixed problem with remote search NPE
16 years ago
orbiter 88426912ad more refactoring to make the segment object easier to use and to be prepared to integrate author navigation
16 years ago
orbiter 99bf0b8e41 refactoring of plasmaWordIndex:
16 years ago
orbiter fec6f9054f some refactoring of search methods
16 years ago
orbiter 63a0255166 - refactoring: added new content package, which will contain connector classes for different types of data sources to import texts into the YaCy index
16 years ago
orbiter e16c25ddf7 (peak-) performance hacks
16 years ago
orbiter c8624903c6 full redesign of index access data model:
16 years ago
f1ori dd6b5005ff * fix missing charset handling in getpageinfo_p
16 years ago
orbiter 89ec3acb3e - full abstraction of index content type: the kelondro full text index may now also contain indexes about other content than text, i.e. navigation indexes or reverse linking indexes.
16 years ago
orbiter c2359f20dd refactoring: better abstraction of reference and metadata prototypes.
16 years ago
orbiter a29a11e526 added evaluation of incoming links in webstructure api
16 years ago
orbiter 7ba078daa1 - added fast site-operator
16 years ago
orbiter bd409fb7ba added web structure analysis for a special domain that can be requested from the api.
16 years ago
borg-0300 8c494afcfe svn attributes added
16 years ago
orbiter 67aaffc0a2 - added Latency control to the crawler:
16 years ago
orbiter 61f9dbf0cc - fixed a display problem in watch crawler
16 years ago
orbiter 83792d9233 more refactoring
16 years ago
orbiter 7f67238f8b refactoring of plasmaWordIndex: fewer methods in the class, separated the index to CachedIndexCollection
16 years ago
orbiter 14a1c33823 refactoring of wordIndex class
16 years ago
orbiter d7a493b4f5 added experimental timeline api
16 years ago
orbiter efcd95dc37 simplification of (internal) query process / refactoring
16 years ago
orbiter aa44d9bad9 more refactoring of kelondro.text / deleted de.anomic.index
16 years ago
orbiter 76ef5f0f14 refactoring of index package: better names for the classes (to be continued)
16 years ago
orbiter c12bb8a6d0 - refactoring of the http client
16 years ago
orbiter 62505bb3cb more bugfixes as recommended by findbugs
16 years ago
orbiter 6a32193916 - refactoring of cache naming in web index cache (no more dht semantics there)
16 years ago
orbiter c25c334b75 replaced old DHT transmission method with new method. Many things have changed! some of them:
16 years ago
orbiter 01b97ef3f8 added new cybertag-tracking feature that was inspired by itgrl
16 years ago
orbiter b57c9da1f8 - fixes to doc, ppt, xls parser: better title
16 years ago
orbiter 75bef03ac6 fix for bad encoding in yacydoc.html and yacydoc.xml
16 years ago
apfelmaennchen ee3fe19c0b added /api/bookmarks/get_folders.xml
16 years ago
apfelmaennchen 7a159dc745 update for api/bookmarks/get_folders
16 years ago
f1ori bacccda6d7 * blacklist_p.xml: attrOnly = only give parameters of blacklists, no content
16 years ago
orbiter 94110df85a moved logging partially to kelondro
16 years ago
orbiter 83ce65707a (almost) completed partition of classes in kelondro
16 years ago
orbiter 7ee494fde5 more refactoring of kelondro:
16 years ago
orbiter bf93767ec6 refactoring of kelondro database classes
16 years ago
orbiter fc27bf8c4c refactoring of kelondro classes:
16 years ago
apfelmaennchen 3905caf8a1 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5536 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
apfelmaennchen 08ed14603e - fixed YaCy-UI sciencenet search
16 years ago
apfelmaennchen 9bd9ccade2 refactoring
16 years ago
apfelmaennchen 96684df1a9 - security fix for addTag.java and editTag.java
16 years ago
apfelmaennchen 6dd52422ea - added two dialogs to manage bookmark tags in YaCy-UI
16 years ago
apfelmaennchen 9317650272 forgot to post this one...
16 years ago
apfelmaennchen 92d77c3bef Major update to YaCy-UI...still not perfect...but I thought I share my progress :-)
16 years ago
orbiter dedfc7df7f removed distinction between DHT-in and DHT-out. This is necessary to make room for the new cell data structure, which cannot use this distinction in the first place, but will enable the same meaning with different mechanisms (segments, later)
16 years ago
f1ori 34da04c7dd * fix for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1754
16 years ago
orbiter b423d0a036 moved all servlets from htroot/xml to htroot/api
16 years ago
orbiter 4bd927d513 the Semantic Web moves in!
16 years ago