yacy_search_server

Commit Graph

Author	SHA1	Message	Date
Michael Peter Christen	c5f67a5d6d	fixed a problem with local search from solr results: now all results from solr are shown (again)	12 years ago
Michael Peter Christen	a87811bc38	more auto-commit calls when a search interface is opened, but not when a search is done there to prevent blocking during search-time.	12 years ago
Michael Peter Christen	8e1248ffe3	force a commit in advance of a search for the administrator to get most recent results even if commit time is high and an indexing is ongoing.	12 years ago
Michael Peter Christen	43f3345c90	- removed dependencies from URIMetadataRow and made direct access to URIMetadataNode which creates the opportunity to access Solr objects directly and use their information richness - lazy initialization of the URIMetadataNode object - should cause less computation and memory usage during search. - removed dead code	12 years ago
Michael Peter Christen	5f0ab25382	removed the option to prevent removal of & parts inside of the MultiProtocolURI during normalform computation because that should always be done and also be done during initialization of the MultiProtocolURI Object. The new normalform method takes only one argument which should be 'true' unless you know exactly what you are doing.	12 years ago
Michael Peter Christen	24d2ee3c52	- better date ranking - more protection against NPE and time travel effects	12 years ago
Michael Peter Christen	1533bfd63b	refactoring	12 years ago
Michael Peter Christen	e49359cc95	removed tenant query attribute since it is not used any more and is replaced by the site-operator in the GSA interface. This operator can also be simulated in the Solr interface using the collections_sxt field.	12 years ago
Michael Peter Christen	8219a445f3	refactoring	12 years ago
Michael Peter Christen	00c1c777fa	refactoring	12 years ago
orbiter	63762d8f89	removed kelondro dependencies from cora	12 years ago
orbiter	a55e77a115	added twitter search heuristic	12 years ago
Michael Peter Christen	31d4d38804	- extended the solr interface by a references-by-word-count method - reduced danger that a non-existing RWI database causes NPEs - added Solr queries to did-you-mean: this makes it possible that our did-you-mean algorithm works together with only Solr and without RWIs	12 years ago
Michael Peter Christen	0cab06c47c	refactoring	12 years ago
Michael Peter Christen	18f989dfb1	- refactoring (load -> getMetadata) - added getDocument to retrieve Solr documents which shall replace getMetadata	12 years ago
Michael Peter Christen	6197caf698	added clear-text search words in query params	12 years ago
Michael Peter Christen	24d9db1613	snippet retrieval loading processes may use a smaller minimum load time value than crawling processes. This speeds up the search result preparation dramatically.	12 years ago
Michael Peter Christen	1687737771	Abstraction of HandleMap and HandleSet	12 years ago
Michael Peter Christen	315d83cfa0	cleanup	12 years ago
reger	36c9875b6e	removed localized number formatting from num-results_totalcount response (this is only used in xml and json where localized format is not valid)	12 years ago
orbiter	69e743d9e3	- more abstraction for the RWI index as preparation for solr integration - added options in search index to switch parts of the index on or off	12 years ago
orbiter	c00a3cf74d	less usage of generic logger to avoid logger generation overhead	13 years ago
orbiter	62202e2d71	refactoring of query attribute variable names for better consistency with (next) stored query words	13 years ago
Michael Peter Christen	b0c408788b	made class methods static where possible	13 years ago
Michael Peter Christen	7c1ba99755	removed more unused method parameters	13 years ago
Michael Peter Christen	241dd8410a	removed snippet pattern filter - it was not used	13 years ago
Michael Peter Christen	1825f165b8	better integration of blacklist according to use case	13 years ago
Michael Peter Christen	03280fb161	removed segments-concept and the Segments class: the segments had been there to create a tenant-infrastructure but were never used since that was all much too complex. There will be a replacement using a solr navigation using a segment field in the search index.	13 years ago
Michael Peter Christen	96aeb127e3	generalized localhost naming. this is also a preparation for a better IPv6 implementation.	13 years ago
Michael Peter Christen	77f795756c	fixing redirects and status codes: storing of status code in ResponseHeader to make it available for late evaluations, like storage in solr.	13 years ago
Michael Peter Christen	9264d8b4af	removed old navigation practice using subject tags in favor of triplestore-tags	13 years ago
Michael Peter Christen	eeb4fd8b8c	refactoring (geolocalzation -> geolocation)	13 years ago
Michael Peter Christen	8b53771db2	changed behavior of navigation processing: - vocabulary annotation is not done any more into the metadata of urldb - vocabularies are written into the jena triplestore using a rdf vocabulary - vocabularies for rdf triple must be updated; refactoring done - with the new navigation tags in the triplestore a faster pre-urldb-lookup is possible: navigation is processed now within the RWI during pre-ranking retrieval - added also an Owl vocabulary stub to add the plain-text url to the triplestore using the owl:sameas predicate	13 years ago
Michael Peter Christen	dd14b19c26	lazy initialization of block rank table ... only normal web search uses this. When interactive search or location search is used, the block rank is switched off	13 years ago
Michael Peter Christen	a61f44f9e4	lazy initialization of block rank table. As a result, the table is not initialized when no search is done. The effect is strongest if YaCy is started headless, which opens no browser pop-up that would otherwise load the search page and thereby trigger the initialization of the table.	13 years ago
Michael Peter Christen	96c8119b50	added GeoLocation / GeoPoint classes which use less memory than Location/Coordinates and have initializers with correct order of lat,lon coordinates	13 years ago
Michael Peter Christen	a1fe65b115	performance hacks	13 years ago
Michael Peter Christen	2fe207f813	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	13 years ago
Michael Peter Christen	5aee19daa4	added show from cache in search results (not yet finished)	13 years ago
Michael Peter Christen	e0d8643226	- performance hacks - added log warnings in case that search processes run into time-out situations - better concurrency for Integer formatter (used a non-synchronized formatter before) - bugfix for search termination (a poison pill was missing) - added timeout parameters for search (again) -> target is, that they are never reached.	13 years ago
Michael Peter Christen	9b4c699526	enhanced location search: - search requests are now made using a map boundary - search results are only computed for the map boundary - the number of results is adapted to the results in the visible range - added a double-buffering for the search result markers - added a search query option for the search results: /radius/<lat>/<lon>/<radius>	13 years ago
Michael Peter Christen	8b974905ee	changed log-in text for all servlets with authentication: - added hint how to set the password using a shell script - added a shell script to change the password	13 years ago
Michael Peter Christen	7bf421b9dd	- fixed image search page navigation - removed some deadlocks and ConcurrentModificationExceptions during DidYouMean collection	13 years ago
Roland 'Quix0r' Haeder	5f983faef9	No & in JavaScript-embedded URLs, added ability to stop focus in ConfigPortal.html preview (is this not secured with _p????) Conflicts: htroot/yacyinteractive.java htroot/yacysearch.java	13 years ago
Michael Peter Christen	77f8e9fb9b	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	13 years ago
Michael Peter Christen	ba6aaabc51	refactoring + parser bugfixes	13 years ago
Michael Peter Christen	a18b6dee04	Merge remote branch 'bbyacy-rc1/master'	13 years ago
reger	ea932f841c	changed link to opensearchdescription document to an absolute URI (in yacysearch.html and yacysearch.rss) see http://www.opensearch.org/Specifications/OpenSearch/1.1/Draft_5#The_.22Description.22_element	13 years ago
Michael Peter Christen	5c66880be2	fix for search result selection in case that contentdom is not set	13 years ago
Michael Peter Christen	4aa0eedead	one more scroogle...	13 years ago
Michael Peter Christen	347612ddd4	removed scroogle parser	13 years ago
Michael Peter Christen	f8cd57c92f	new indexing strategy: ALL links that appear anywhere are indexed, not only links where the content can be parsed. All non-parseable links are placed into the noload queue. The search process must therefore be able to filter out non-text search results. - This fixes the problem that image search results appeared in the text search. - The interactive search can retrieve now ALL types of links - The p2p interface is now extended to retrieve only certain types of links (text, image, video, apps) - The search process has an extension to filter the right document type according to the search query	13 years ago
Michael Peter Christen	14f67f217c	refactoring of ContentDomain: now subclass of Classification	13 years ago
Lotus	78f0d8f046	no focus on preview frames for search integration fixes bug http://bugs.yacy.net/view.php?id=161	13 years ago
Michael Peter Christen	e2f8f263e8	changed storage of search words: keep order	13 years ago
Michael Peter Christen	88b86afc89	no DoS protection for intranet mode	13 years ago
Michael Peter Christen	e8d24fd802	author navigator can be switched off	13 years ago
Michael Peter Christen	d5ead5314d	changed navigation links: now using checkboxes. This looks better and allows negative checkboxes (ones that remove the navigation). These are not yet implemented (coming next)	13 years ago
Michael Peter Christen	dc165275ad	bugfix for usage of multiple vocabulary navigators	13 years ago
Michael Peter Christen	83009d86f7	added the vocabulary navigator. It can be very simply tested by switching on the locale dictionaries.	13 years ago
stbrumm	d18095dc48	Patch for Issue 0000102 and fixes to Patch (private peer status is a property of a peer, not a status)	13 years ago
Michael Christen	5bfb287753	make a bad fix even worse	13 years ago
Michael Christen	9e5894c784	Removed handling of components objects for URIMetadataRows. This is a preparation to replace this rows with nodes from the node store.	13 years ago
Michael Peter Christen	0bcef2d156	added feature as requested in http://forum.yacy-websuche.de/viewtopic.php?f=18&t=3461 The search can now be configured with a non-display host list. The search will always exclude the given list of hosts unless they are requested directly using the host navigation	13 years ago
orbiter	4b8ff84705	- search bugfixes (page counter and number of results per page; recognition of new search) - experiments to speed-up the network image production (commented out) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8130 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
orbiter	6c9320e82a	fix for latest navigation feature... git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8121 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
orbiter	5b2e68b60d	fixed page navigation counter git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8113 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
orbiter	ebd840ebf6	- enhanced description on search front page - fixed language and heuristic modifier - added hint to crawl start that we can do also ftp and smb crawls - added a protocol extension to remote crawls to transport all search modifiers to remote peers git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8108 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
orbiter	e22f8497c9	- tested the ARC methods - removed strict authentication (if password is empty; this was buggy and not useful; can be switched on if necessary globally and not for each interface method) - increased speed of CrawlResults page (no dns lookup any more) - increased speed of favicon display (removed dns lookup) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8104 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
orbiter	5a55397f99	some last-minute performance hacks git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8101 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
orbiter	0cf9ebc3b0	speed enhancements when parsing RWI rows (makes search slightly faster) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8096 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
orbiter	ee8b1d4de1	fixed unresolved pattern and unwanted local/global switch when using votes on search results git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8093 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
orbiter	c584db991f	creating a bookmark from the search results now works again .. with new YMarks git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8092 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
orbiter	1019c36dad	bug fixes and speed enhancements for search git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8085 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
orbiter	8e0b2c5832	fixed cluster search git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8083 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
orbiter	804e48888b	smaller bug fixes for search behavior; should produce less unnecessary removals and an exact number of results as shown in counter should also be a little bit faster git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8057 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
orbiter	84c3fc9d97	local/global fixes in search, better abstraction git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8054 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
orbiter	0d858d48ec	replaced String with StringBuilder in suggestion process git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8020 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
orbiter	94eab08794	- updated opensearchdescription text and icon - removed automatic setting of maxitems during search (can be set now elsewhere) - updated RSSMessage.java git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8009 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
orbiter	9e4875230f	performance hacks git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8001 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
orbiter	a7df70221e	refactoring git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7987 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
orbiter	d2ea250d99	refactoring: - moved many classes from de.anomic to net.yacy - made more sub-packages for search classes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7973 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
orbiter	594d8f546a	#cccamp11 maintenance fix: anons may find up to 1000 items in interactive search (was: 100) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7866 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
sixcooler	59b767eebd	stop loading via http at defined maximum of bytes - even size is unknown before loading using max-file-size of type int for parsing documents (since content is used as byte-arrays, 'integer' should be maximum) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7855 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
orbiter	11dc653de3	added a visualization of peer pings to the performance graphic git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7837 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	6d9e5865ee	faster appearance of search result page (but complete search time is the same) this was inspired by http://bugs.yacy.net/view.php?id=37 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7801 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	31283ecd07	- added a search option to filter only specific network protocols. i.e. get only results from ftp servers. Just add '/ftp' to your search. for example search for "passwd /ftp". This can also be done with /http /https and /smb - fixed some search throttling processes that should protect your peer against search DoS or strong search load git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7794 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	115abc8917	- more attributes for search progress bar - moved cache strategy to cora package git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7778 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	87082f407e	less String object creation during search git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7756 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	4bea3f9714	hack to reduce resource contention caused by massive UTF8 decodings which use java.nio resources: used an ASCII String <-> byte[] conversion wherever possible. Many Strings in YaCy are hashes which are pure ASCII (base64 hashes). The new ASCII String <-> byte[] conversion methods have less computation overhead than the UTF8 conversion. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7746 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	e28bd0d038	fix for some possible causes of memory leaks git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7741 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	8d9b5dda3b	disabled did-you-mean computation for json and rss search results where this info is not used git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7739 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	5b579e21a3	code cleanup git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7713 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	0621a15f89	fix for wrong search result counter: added a counter for all filtered out entities see also http://bugs.yacy.net/view.php?id=5 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7704 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	6e42d4de88	- added full-String search function: find things that match exactly what is quoted in the query - re-structuring authentication methods to fix a problem with API steering git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7697 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
low012	1ff9947f91	*) added new user right: extended search right (allows to define users who can query more results than anonymous users) *) cleaned up code a little bit git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7635 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	156cf02703	- added an index constraint 'has location' to the condenser - added evaluation of the 'has location' constraint to search using the /location operator git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7633 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
low012	16cd919795	*) fixed Exceptions which caused 500 error when entering invalid URL mask or invalid prefer mask, invalid masks are ignored, error message is displayed on yacysearch.html (what about yacysearch.rss and yacysearch.json?) *) fixed "more options" link on yacysearch.html git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7623 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	ba03ca8620	added more configuration options for search: - removed configuration button for 'search only for admin' from index.html and added this to ConfigPortal - added configuration of link verification options (iffresh, cacheonly, nocache, ifexist) to ConfigPortal - added configuration of navigation options to ConfigPortal - added an option to switch off automatic index cleaning in case that a link verification method fails git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7613 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
low012	2861d0888a	*) simplified code *) fixed potential NumberFormatExceptions git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7600 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
low012	bea8137997	*) minor changes *) fixed potential NPE in suggest.java git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7571 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
low012	3e03963b1c	*) minor changes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7570 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	cb1f49d0f2	replaced all 'new String' with default encoding (missing) or UTF-8 encoding with a String generation method that uses a pre-defined Charset constant for UTF-8. This avoids a cache-lookup for the Charset object using String hashing of the String 'UTF-8'. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7558 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	bed79402be	introduction of a new remote search load control: the remote search has taken 10 results per peer with a time-out of 3 seconds so far. The attributes of number of results per peer and time-out time can now be configured. This has two aspects: the user who searches may want to increase these values to get more results and more load on the remote side and the user of the server which is accessed for this search may want to restrict the load. Both sides can now be configured. The server-site maximum load parameters are defined by a network definition and the client-side search request load can be defined by each user individually but when the remote search is done the requested service is limited to the network definition. You can find now in the network definition file: network.unit.remotesearch.maxcount and network.unit.remotesearch.maxtime and in the yacy.conf file: remotesearch.maxcount and remotesearch.maxtime There is currently no web interface to define the client-side remote search attributes, please set them manually git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7548 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	24909b3006	slightly less restrictive values for DoS git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7509 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	311f57d360	DoS to prevent online snippet fetch: allow read from cache. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7508 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	5892fff51f	introduction of dht-burst modes: this can expand the number of target peers in some cases where a better heuristic is needed. The problematic cases are either when a multi-word search is made (still a hard case for our term-oriented DHT) or when a network operator wants that all robinson peers are asked. We therefore introduced two new network steering values that switch on more peers during the peer selection. Because the number of peers can now be very large, the number of maximum httpc connections was also increased. Please see new comments in yacy.network.freeworld.unit for details of the new DHT selection methods. The number of maximum peers is now not fixed to a specific number but may increase with - the partition exponent - the number of redundant peers - the robinson burst percentage - the multiword burst percentage The maximum can then be the number of senior peers (all visible peers). git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7479 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	4588b5a291	- fixed document number limitation for crawls that restrict the number of documents per domain - some restructuring of the document counting and logging structures was necessary - better abstraction of CrawlProfiles - added deletion of logs to the index deletion option (if the index is deleted using the servlets) which is necessary to reset the domain counters for the page limitation - more refactoring to get the LibraryProvider more clean - some refactoring of the Condenser class git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7478 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	0cdfb82963	replaced more appearance of double values by float values git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7461 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	88773e4daa	changed the default port from 8080 to 8090 see also: http://forum.yacy-websuche.de/viewtopic.php?p=21683#p21683 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7454 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	efb4ca8fa8	modified auto-delete of search failure-words: - words are now not deleted from the search index automatically if index receive is switched off - a flag in the network definition defines if this feature is switched on at all - the search filter for not-found word references is switched off for server-side remote searches git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7441 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	c93f4dda72	- cleaned up yacy news - removed unused methods - avoid news generation in case that the peer runs in robinson mode git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7431 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	a4c9d27287	- moved some variables from Switchboard to new class AccessTracker - added a limitation in access tracking to delete queries which are older than 10 minutes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7410 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	b2ed4cfaf8	more small bugfixes and light refactoring git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7401 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	1b6702146f	remove '*' from query string (people believe that this is a wild card) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7400 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	4565b2f2c0	removed the display option from index.html, yacysearch.html and yacyinteractive.html instead, a setting at ConfigPortal.html can be made to define if the topmenu shall be shown at these pages or if there is no navigation at all. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7366 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	db99db4be9	some redesign of the search-fail-response mechanism: when a search fails for a single url because the snippet cannot be generated, then the url reference is deleted from the index. This mechanism was redesigned and enhanced. The process now also writes into the work tables into the table searchfl to prepare a re-indexing mechanism. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7364 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	18d33b5c6d	fixed several search result navigation bugs fixed bad behaviours during search result collection git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7362 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	49b5a206cd	- better calculation of search result size - predefined search recommendations git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7361 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	f0651e5f2f	added image search to yacyinteractive.html this causes that the search result view switches from list format to image preview format when a search is restricted to png, gif or jpg documents git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7358 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	cc6499bf8d	- added http://blekko.com as search heuristic (like scroogle). This was easy since they deliver their search results also as rss feed - renamed YaCy's search result modification keywords for RECENT, NEAR and language: to the blekko slashtag naming scheme. YaCy now supports the following blekko-like built-in slashtags: /date - for search results ordered by date (most recent up) /near - for search results where search words appear near to each other (closest up) /language/<lang> - for a sorting by language where the wanted language gets up. Example: /language/de git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7350 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	d4a1a1850b	removed warnings git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7347 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
low012	9b3fae9496	*) cleaning up the code a little bit *) program to interface, not implementation git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7345 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	7bb4b001ed	- view image files from cache - fixed generic header settings; affects CORS functionality git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7344 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	70c95608d4	Added CORS Access header for yacysearch.rss output used some of the recommendations from Copro: http://forum.yacy-websuche.de/viewtopic.php?p=21015#p21015 Original Request: http://forum.yacy-websuche.de/viewtopic.php?p=20829#p20829 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7288 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	facfd204e9	added a parent configuration option. see /ConfigPortal.html requested here: http://forum.yacy-websuche.de/viewtopic.php?p=21099#p21099 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7271 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	863065abc4	added user agent logging to access tracker git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7256 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	de722090b5	enhancements in did-you-mean guessing git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7243 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	a59c885ee0	autocomplete and did-you-mean can now understand _all_ languages and can generate suggestions in all languages and character types git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7242 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	45b1ab3d07	custom + generic skins: - added a generic skin which is filled with actual color assignment using a servlet - enabled css servlets - added a generic color scheme in configuration file - added configuration input in Customization/Appearance servlet - added a jquery color picker widget - placed color picked widget to input field of generic colour definition input fields git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7235 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	0d363a94d7	more performance hacks this makes YaCy search results VERY fast for all verify=false search cases and it enhances the search speed also for all other snippet-fetch cases. With this change my peer performed 100 Queries Per Second (!!!) while doing 10 queries simultaneously (!!!) in an intranet index of 20000 URLs on my 16-core Mac Check this yourself by doing: cd bin ./searchtestmulti.sh after finishing the run, divide 1000 by the given time per query (which is the qps for one thread) and then multiply again by 10 (because 10 search threads have been started) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7231 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	10a9cb1971	simplified snippet computation process and separated the algorithm into two classes also enhances selection criteria for best snippet line computation git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7182 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	84a023cbc8	fixed several search bugs git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7180 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
lotus	937dd956d3	save default number of search items via web interface git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7179 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
low012	f32bb5e51f	*) Changed image in Steering.html from linked image to embedded image because shutdown is so fast now, browsers can't load image before Yacy instance is gone already. Had to make image smaller since IE does not accept large Base64 encoded images. *) Decreases wait time in Steering.html before first check since *) HTML fixes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7165 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	906c572621	- enhanced index create menu structure - clear search log caches each time a search is done git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7142 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	64860dc1bb	enhanced search event logging (to be used for further improvements) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7140 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	34a25856a5	- added navigation to next/prev search page using arrow keys (left/right) - better information text for YaCy GUI application git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7134 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
lotus	b73ea6581d	fix json in case of query includes " git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7125 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	3197ca42ed	preparations to move the HTCache into cora: - move the header framework classes to cora - move the ARC caching classes to cora - refactoring of code to call these classes from cora git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7068 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	171f2bd84e	- removed unused network oanet - added new network definition 'allip' which can be used in networks where intranet and internet-addresses shall be indexed - added an auto-switch-off for global search if there are no global peers git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7030 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	43e6ce62af	use heuristics only if user is authenticated git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6962 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	b6fb239e74	redesign of parser interface: some file types are containers for several files. These containers had been parsed in such a way that the set of resulting parsed content was merged into one single document before parsing. Using this parser infrastructure it is not possible to parse document containers that contain individual files. An example is a rss file where the rss messages can be treated as individual documents with their own url reference. Another example is a surrogate file which was treated with a special operation outside of the parser infrastructure. This commit introduces a redesigned parser interface and a new abstract parser implementation. The new parser interface now has only one entry point and always returns a set of parsed documents. In case of single documents the parser method returns a set of one document. To be compliant with the new interface, the zip and tar parsers have also been completely redesigned. All parsers are now much simpler and cleaner in their structure. The switchboard operations have been extended to operate with sets of parsed files, not single parsed files. Additionally, parsing of jar manifest files has been added. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6955 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	11b7853940	added a configuration page for search heuristics. currently you can switch on there: - a site-operation heuristic that loads all direct links from a portal page if the site-operator is used - a direct crawl for search results from scroogle for the given search terms The configuration page can be found directly beside the network configuration page git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6951 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	1557e0f2d0	- some refactoring for internal RSSFeed (protocol of all actions as seen on status page) - added dht-out to internal RSSFeed (you can see now messages about distributed indexes on status page) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6948 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	dcd01698b4	added a 'transition feature' that shall lower the barrier to move from g**gle to yacy (yes!): Here a new concept called 'search heuristics' is introduced. A heuristic is a kind of 'shortcut' to good results in IT, here for good search results. In this case it will be used to get a very transparent way to compare what YaCy is able to produce as search result and what g**gle produces as search result. Here is what you can do now: - add the phrase 'heuristic:scroogle' to your search query, like 'oil spill heuristic:scroogle' and then a call to scroogle is made to get anonymous search results from g**gle. - these results are _not_ taken as meta-search results, but are used to instantly feed a crawling and indexing process. This happens very fast, here 20 results from scroogle are taken and loaded all simultaneously, parsed and indexed immediately and from the results of the parsed content the search result is fed, along with the normal p2p search - when new results from that heuristic (more to come) become part of the search results, then it is verified if such results are redundant to existing (they had been part of the normal YaCy search result anyway) or if they had been completely new to YaCy. - in the search results the new search results from heuristics are marked with a 'H ++' and search results from heuristics that had been already found by YaCy are marked with a 'H ='. That means: - you can now see YaCy and Scroogle search results in one result page but you also see that you would not have 'missed' the g**gle results when you would only have used YaCy. - to make it short: YaCy now subsumes g**gle results. If you use only YaCy, you miss nothing. to come: a configuration page that lets you configure the usage of heuristics and get this feature by default. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6944 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	3a9dc52ac2	added a fascinating new way to search _and_ start a web crawl at the same time: implemented a hint from dulcedo "use site: - operator as crawl start point". YaCy already was able to search using a site-constraint. This function is now extended with an instant crawling feature. When you now use the site-operator, then the landing page of the site and every page that is linked from this page are loaded, indexed and selected for the search result within that search request. When the remote server responds quickly enough, then this process can result in search results during the normal search result preparation .. within just a few seconds. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6941 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	d7767e7589	IFFRESH is too strong, IFEXIST sufficient for cache policy when doing a link verification (this is as it was two commits before) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6938 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	777195e8d1	more abstraction for access of LoaderDispatcher and cache git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6937 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	7bcfa033c9	more abstraction of the htcache when using the LoaderDispatcher: a cache access shall not be made directly to the cache any more, all loading attempts shall use the LoaderDispatcher. To control the usage of the cache, an enum instance from CrawlProfile.CacheStrategy shall be used. Some direct loading methods without the usage of a cache strategy have been removed. This affects also the verify-option of the yacysearch servlet. If there is a 'verify=false' now after this commit this does not necessarily mean that no snippets are generated. Instead, all snippets that can be retrieved using the cache only are presented. This still means that the search hit was not verified because the snippet was generated using the cache. If a cache-based generation of snippets is not possible, then verify=false means that the link is not rejected. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6936 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago


554 Commits (49b79987c91c2f4734384d8f68a806e5a5370dbb)