yacy_search_server

Commit Graph

Author	SHA1	Message	Date
low012	c0274bd123	*) minor changes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7394 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	fe46536f6e	enhanced network scanner (less name resolving during scanning and no name resolving during search) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7392 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	e753027c43	fix for http://forum.yacy-websuche.de/viewtopic.php?p=21439#p21439 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7390 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	bf4ef1513e	- fix for map view - remove some UNRESOLVED PATTERN - maybe a fix for non-flushing cache git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7389 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	6b70393d1d	- new java version 1.6 - replaced old gif animator by java 1.6 gif animator git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7388 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	e88c428008	fix to ftp loader git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7387 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	59b70a5a92	another fix to the ftp crawler: now correct directory listings according to rfc2640 (path with spaces) and better title names for such files git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7386 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	9b25a33fd9	- fixed numerous bugs - better document names - fixed problem with ftp crawling - added automatic removal of search results from services that are not online according to the latest network scan: this does not delete the index but just does not show them. after the next network scan when the server is available again, the results are again showed. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7385 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	7bdb13bf7f	more fixes to smb crawling: better file names git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7384 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	94c48500cc	several fixes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7383 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	0ac7311a62	fix for token parser git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7382 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	58b59f9bc8	- a collection of bug fixes and some redesign of the Scanner class - fixed smb crawling - added smbget to download script generation git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7381 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	c288fcf634	redesigned CrawlStartScanner user interface and added more features: - multiple hosts for environment scans can be given (comma-separated) - each service (ftp, smb, http, https) for the scan can be selected - the scan result can be accumulated or refreshed each time a network scan is made - a scheduler was added to repeat a scan and add all found urls to the indexer automatically git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7378 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
f1ori	9d2159582f	* fix system update if urls are in blacklist (for example for very general blacklists like *.de) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7375 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	56264dcc17	- added CamelCase parser to MultiProtocolURI: generate better to-be-indexed words from urls - integrated new parser into loader processes: enrich document parser - fixed a concurrent modification exception in kelondro iterator - hand-over of document size from crawler to indexer git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7374 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	99a7fe87f9	- removed old intranet scanner (the generic scanner now completely subsumes the old one) - added information about granted access - enhanced servlet design - added submit-feedback (because it is a long-running task) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7372 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	acab6801d9	added new network scanner - you can scan any ip or host in the internet for services - this replaces the intranet scanner git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7371 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	14e4fae8e9	fixes to ftp client git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7369 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	a563b05b60	enhanced crawler: - added a new queue 'noload' which can be filled with urls where it is already known that the content cannot be loaded. This may be because there is no parser available or the file is too big - the noload queue is emptied with the parser process which indexes the file names only - the 'start from file' functionality now also reads from ftp crawler git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7368 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	c36da90261	added a very fast ftp file list generator to site crawler: - when a site-crawl for ftp sites is now started, then a special directory-tree harvester gets the complete directory structure of a ftp server at once - the harvester runs concurrently and feeds into the normal crawl queue also in this: - fixed the 'start from file' crawl function - added a link detector for the html parser. The html parser can now also extract links that are not included in <a> tags. - this causes that a crawl start is now also possible from clear text link files git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7367 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	db99db4be9	some redesign of the search-fail-response mechanism: when a search fails for a single url because the snippet cannot be generated, then the url reference is deleted from the index. This mechanism was redesign and enhanced. The process now also writes into the work tables into the table searchfl to prepare a re-indexing mechanism. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7364 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
f1ori	4915d1781a	* use local backup-file, if remote network-definition is not availible * resolve single point of failure in networks, managed by central network-definitions git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7363 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	4e2c14efbb	fixed bugs in parser and ftp client git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7360 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	d78e322e84	added a directory-structure reader to ftp client git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7359 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	f0651e5f2f	added image search to yacyinteractive.html this causes that the search result view switches from list format to image preview format when a search is restricted to png, gif or jpg documents git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7358 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	b769cce433	- added a catch-all parser for all documents that cannot be parsed: they will contributed with their document url for the search index only - enhanced the pdf and torrent parser: better documents titles - enhanced the ftp client: more time-out time - fixed bugs in json for search results - enhanced yacyinteractive.html: added a file type navigator and a download-script generator for search result files Please have a look at yacyinteractive.html: this will become the hacker-download tool for 27c3! git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7355 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	21e84539e8	one more fix to Domains git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7353 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	e192d61972	fix for latest commit git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7352 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	22453b13ad	implemented local host address discovery as posted in http://forum.yacy-websuche.de/viewtopic.php?p=21310#p21310 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7351 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	cc6499bf8d	- added http://blekko.com as search heuristic (like scroogle). This was easy since they deliver their search results also as rss feed - renamed YaCys search result modifications keywords for RECENT, NEAR and language: to the blekko slashtag naming scheme. YaCy now supports the following blekko-like slash built-in slashtags: /date - for search results ordered by date (most recent up) /near - for search results where search words appear near to each other (closest up) /language/<lang> - for a sorting by language where the wanted language gets up. Example: /language/de git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7350 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	a9f754c45f	removed unused CR accumulation and distribution process this was never used and extended in the last years. The resulting YBR ranking criteria is still a good idea and will be used in the future. Possible generation methods for YBR ranking are: - "trust-rank" using the link structure as can be discovered in a single crawl (idea from FSCONS) - "block-rank" calculated from the local link structure - a distributed "block-rank" using the xml API to the link structure from other peers git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7349 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	3d945bb442	fix for ftp client: suppress bad directory listing time-out git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7348 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	d4a1a1850b	removed warnings git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7347 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
low012	9b3fae9496	*) cleaning up the code a little bit *) program to interface, not implementation git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7345 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	321eb012fe	removed two warnings and reverted one change git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7340 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
f1ori	fd74bc388c	* fix small bug in sessionid-removal * add testcase for seesionid-removal git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7333 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
low012	eb79b952ef	*) cleaner code git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7331 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
low012	38fdf43587	*) renamed classes according to standard Java coding conventions *) String.isEmpty() was introduced in Java 1.6, but we still use Java 1.5 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7330 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
low012	025e3f4790	*) renamed classes according to standard Java coding conventions *) removed unsused code git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7328 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
f1ori	a025b1da89	* fix bug when browsing local filesystem (e. g. repository) with yacy git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7323 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
sixcooler	b87bf88ac8	using less memory on merging and rewriting blobs git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7317 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
f1ori	d62e449a11	* fix FilterEngine, forgot comparision-operator git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7314 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	441fbc26e2	security patch for WeakPriorityBlockingQueue (produced a deadlock) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7307 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	5dcb838293	- removed thread overhead when calling dns services - fixed localsearch (changed it by accident) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7306 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	4c50d3428e	smaller file size for array stacks to support smaller deletion sizes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7305 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	becc463d8a	enhanced did-you-mean git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7300 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	93c535d111	fixed http://forum.yacy-websuche.de/viewtopic.php?p=21113#p21113 fixed a concurrent modification exception during search and a time-out problem git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7298 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	04932dc268	added rdf data structure for rss feeds git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7297 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	84f2953cd8	fix for rss loader / rss type recognition git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7296 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	4c72885cba	added a sitemap entry parser and loader for sitemaps (a recursion if a sitemap refers to another sitemap) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7295 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago

457 Commits (c0274bd1233871c987a544c6fd4c5346141c665c)