Michael Peter Christen
1533bfd63b
refactoring
12 years ago
Michael Peter Christen
00c1c777fa
refactoring
12 years ago
orbiter
63762d8f89
removed kelondro dependencies from cora
12 years ago
Michael Peter Christen
24d9db1613
snippet retrieval loading processes may use a smaller minimum load time
value than crawling processes. This speeds up the search result
preparation dramatically.
12 years ago
Michael Peter Christen
d3964253ae
- added @SuppressWarnings to unused servlet method parameters
- removed unnecessary casts
- removed unnecessary throw statements
13 years ago
Michael Peter Christen
1825f165b8
better integration of blacklist according to use case
13 years ago
Michael Peter Christen
24bbe359ca
also integrate geonames library files for fewer cities. These are more
useful for tagging since fewer normal words are falsely identified as
locations
13 years ago
Michael Peter Christen
f1aa4c4390
- accept only location names with a minimum length
- remove comma from synonym terms
13 years ago
Michael Peter Christen
cc9ad7198a
- use only names which consist of at least two parts
- remove word from derewo from locations
13 years ago
Michael Peter Christen
eeb4fd8b8c
refactoring (geolocalzation -> geolocation)
13 years ago
Michael Peter Christen
a0f1decd82
- added loading of the dbpedia pnd triplestore in the dictionary loader
- renamed the dictionary loader to knowledge loader
- some refactoring in the library provider method names
13 years ago
Michael Peter Christen
d45718251e
refactoring (Localization -> Location)
13 years ago
Michael Peter Christen
b8b3c87ba7
- renamed localization to location (that was confusing)
- renamed 'Locale' navigator to 'Location'
- produce Location navigation only if geolocation libraries are loaded
13 years ago
Michael Christen
bd40a10230
added autotagging stub; only reading and parsing of vocabularies at
this time
13 years ago
orbiter
d2ea250d99
refactoring:
- moved many classes from de.anomic to net.yacy
- made more sub-packages for search classes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7973 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
b5252ef91f
added new word recommendation library in DictionaryLoader_p.html
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7913 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
sixcooler
59b767eebd
stop loading via http at a defined maximum of bytes, even if the size is unknown before loading
using max-file-size of type int for parsing documents
(since content is used as byte-arrays, 'integer' should be maximum)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7855 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
115abc8917
- more attributes for search progress bar
- moved cache strategy to cora package
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7778 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
4588b5a291
- fixed document number limitation for crawls that restrict the number of documents per domain
- some restructuring of the document counting and logging structures was necessary
- better abstraction of CrawlProfiles
- added deletion of logs to the index deletion option (if the index is deleted using the servlets) which is necessary to reset the domain counters for the page limitation
- more refactoring to make the LibraryProvider cleaner
- some refactoring of the Condenser class
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7478 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
f1ori
9d2159582f
* fix system update if urls are in blacklist (for example for very general blacklists like *.de)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7375 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
3197ca42ed
preparations to move the HTCache into cora:
- move the header framework classes to cora
- move the ARC caching classes to cora
- refactoring of code to call these classes from cora
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7068 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
777195e8d1
more abstraction for access of LoaderDispatcher and cache
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6937 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
e43e61e502
added another geolocalization data source: GeoNames
- added downloader option in DictionaryLoader
- added generalization (interfaces and overarching localization)
- more abstraction using the libraries
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6879 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
2126c03a62
- removed download-limit that can be given for the crawler for non-crawler download tasks. This was necessary because the same procedure was used for other downloads like for the download of dictionary files where a limit is not useful. The limit still stays for the indexer
- migrated the opengeodb downloader to a new version of the opengeodb-dump
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6873 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
3661cb692c
added dictionary loader servlet that can be used to get the geolocalization file:
/DictionaryLoader_p.html
Will also be used for more dictionary files in the future
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6872 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago