This is performed using a three-stage sorting process (a sketch follows the list):
- sort by relevance, then do the snippet fetch
- sort snippets by relevance, then do image link extraction
- sort image links by image size; unknown sizes are treated like small sizes
- only the exact number of images requested is shown
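A minimal Java sketch of the final size-ordering and truncation stage described above; the ImageCandidate type, its fields, and the select(...) helper are illustrative placeholders, not YaCy's actual classes.

    import java.util.ArrayList;
    import java.util.Comparator;
    import java.util.List;

    public class ImageResultSorter {

        static class ImageCandidate {
            double relevance;   // score carried over from the relevance-sorting stages
            int width, height;  // -1 if the size is unknown

            ImageCandidate(double relevance, int width, int height) {
                this.relevance = relevance;
                this.width = width;
                this.height = height;
            }

            long area() {
                // unknown sizes are treated like small sizes
                return (width < 0 || height < 0) ? 0L : (long) width * height;
            }
        }

        static List<ImageCandidate> select(List<ImageCandidate> candidates, int wanted) {
            List<ImageCandidate> sorted = new ArrayList<>(candidates);
            // order by image size, largest first; unknown sizes sort last
            sorted.sort(Comparator.comparingLong((ImageCandidate c) -> c.area()).reversed());
            // show only the exact number of images requested
            return new ArrayList<>(sorted.subList(0, Math.min(wanted, sorted.size())));
        }
    }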
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4499 6c8d7289-2bf4-0310-a012-ef5d649a1542
- introduced a generalized method to organize ranked results (two new classes)
- added a post-ranking step after the snippet fetch (previously results were only listed), using the new ranking data structures; see the sketch below
- fixed some missing data fields in the RWI ranking attributes and corrected the hand-over between data structures
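A minimal sketch of such a post-ranking pass, assuming a result type that carries the pre-ranking score and the fetched snippet; RankedResult and postRank(...) are hypothetical names, not the two new classes mentioned above.

    import java.util.Comparator;
    import java.util.List;

    public class PostRanking {

        static class RankedResult {
            long preRanking;   // ranking attributes handed over from the RWI stage
            String snippet;    // text snippet fetched for this result, may be empty

            RankedResult(long preRanking, String snippet) {
                this.preRanking = preRanking;
                this.snippet = snippet;
            }
        }

        // re-order results once snippets are known: results with a snippet are
        // promoted, ties are broken by the original pre-ranking score
        static void postRank(List<RankedResult> results) {
            Comparator<RankedResult> missingSnippetLast =
                    Comparator.comparing(r -> r.snippet == null || r.snippet.isEmpty());
            Comparator<RankedResult> byPreRanking =
                    Comparator.comparingLong(r -> r.preRanking);
            results.sort(missingSnippetLast.thenComparing(byPreRanking.reversed()));
        }
    }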
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4498 6c8d7289-2bf4-0310-a012-ef5d649a1542
they appear as a separate, floating window above the search results,
not in a new window
- added the Highslide JavaScript library for the feature mentioned above
- removed the dir servlet. It was not used as intended (as an example applet)
and was a major problem for intranet indexing when files are hosted on the same peer.
- added a yacy-httpd-internal directory listing. Because YaCy is a search engine,
directory listings are presented like search result listings. Intranet indexing from the same peer
thus gets nice index pages for document collections.
- removed unused test applet
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4494 6c8d7289-2bf4-0310-a012-ef5d649a1542
- no more table copy for the error-eco table
- optional table copy for lurl-entries
- more abstraction (fewer single constant strings)
- better logging (using host names instead of IPs)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4459 6c8d7289-2bf4-0310-a012-ef5d649a1542
- redesign of the ordering structures in kelondro (the old ones did not work with strict generics); illustrated below
- 50% I/O reduction during read access on kelondroFlex (omitting the read on the index table)
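A rough illustration of an ordering abstraction that plays well with strict generics, in the spirit of the redesign mentioned above; the interface and class names are placeholders, not the actual kelondro types.

    import java.util.Comparator;

    // a byte-key ordering that is itself a strongly typed Comparator<byte[]>
    public interface ByteOrder extends Comparator<byte[]> {
        boolean wellformed(byte[] key);
    }

    class NaturalByteOrder implements ByteOrder {

        @Override
        public int compare(byte[] a, byte[] b) {
            int n = Math.min(a.length, b.length);
            for (int i = 0; i < n; i++) {
                int d = (a[i] & 0xff) - (b[i] & 0xff);  // unsigned byte comparison
                if (d != 0) return d;
            }
            return a.length - b.length;
        }

        @Override
        public boolean wellformed(byte[] key) {
            return key != null && key.length > 0;
        }
    }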
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4320 6c8d7289-2bf4-0310-a012-ef5d649a1542
- changed yacy logo
- removed crawlOrder protocol (unused)
- removed the file index in kelondroFlex (it will not work; it takes too long to maintain)
- fixed remote crawl for clusters (now denies remote crawls from peers outside the cluster)
- 0.562
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4317 6c8d7289-2bf4-0310-a012-ef5d649a1542
- remove direct access to the SimpleDateFormat fields in serverDate and use the static parse... methods instead
- remove nowDate(), as a Date doesn't store time zone information and a new Date() is always faster
- the default formatter methods now use a GMT time zone; this is important for interchangeability, as some of the date formats we use don't include a time zone offset (see the sketch below)
- continued renaming and rearranging the (formatter) methods; all should follow the general naming scheme formatWHAT(...)
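A minimal sketch of the convention, assuming a pair of static formatWHAT/parse methods on a GMT-based pattern; the class name, the pattern, and the method names are illustrative, not the actual serverDate API.

    import java.text.ParseException;
    import java.text.SimpleDateFormat;
    import java.util.Date;
    import java.util.Locale;
    import java.util.TimeZone;

    public final class DateFormatterSketch {

        private static final TimeZone GMT = TimeZone.getTimeZone("GMT");

        // the pattern carries no zone offset, so GMT keeps peers interchangeable
        private static SimpleDateFormat shortSecond() {
            SimpleDateFormat f = new SimpleDateFormat("yyyyMMddHHmmss", Locale.US);
            f.setTimeZone(GMT);
            return f;  // a fresh instance per call avoids shared, non-thread-safe formatter fields
        }

        public static String formatShortSecond(Date date) {
            return shortSecond().format(date);
        }

        public static Date parseShortSecond(String s) throws ParseException {
            return shortSecond().parse(s);
        }
    }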
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4285 6c8d7289-2bf4-0310-a012-ef5d649a1542
- instead of pushing URLs to other peers, the URLs are now actively pulled
by the peer that wants to do a remote crawl (see the sketch below)
- the remote crawl push process has been removed
- a process that adds URLs from remote peers has been added
- the server-side interface for providing 'limit'-urls has existed since 0.55 and works with this version
- the list interface has been removed
- servlets using the list interface have been removed (that implementation did not properly manage the double-check)
- changes in the configuration file to support the new pull process
- fixed a bug in the crawl balancer (the status was not saved/closed properly)
- the yacy/urls protocol was extended to support different networks/clusters
- many interface adaptations to the new stack counters
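A hedged sketch of the pull side: the crawling peer fetches a batch of crawl URLs from a remote peer over HTTP. The endpoint path, the count parameter, and the line-based parsing are placeholders, not the exact yacy/urls protocol.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URL;
    import java.net.URLConnection;
    import java.nio.charset.StandardCharsets;
    import java.util.ArrayList;
    import java.util.List;

    public class RemoteCrawlPuller {

        // hypothetical request for a batch of 'limit'-urls from a remote peer
        static List<String> pullUrls(String peerAddress, int count) throws Exception {
            URL request = new URL("http://" + peerAddress + "/yacy/urls.xml?count=" + count);
            URLConnection connection = request.openConnection();
            List<String> urls = new ArrayList<>();
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    line = line.trim();
                    if (line.startsWith("http")) urls.add(line); // naive parsing, for the sketch only
                }
            }
            return urls;
        }
    }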
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4232 6c8d7289-2bf4-0310-a012-ef5d649a1542
- the index administration now uses the same code base for URL selection and collection
as the search interface. The index administration is therefore a good test environment for
ranking order control
- removed the old post-sorting algorithms; they will be replaced with a new one
- fixed many bugs that occurred during ranking; in particular, the constraint filtering method
removed too many links
- fixed the media search flags; they had been attached to too many URLs. The effect should be better
pre-sorting before media loading within the snippet fetch
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4223 6c8d7289-2bf4-0310-a012-ef5d649a1542
- before, absolute paths were expanded incorrectly, e.g. fooPath=/a/b/c would become /path/to/yacy/root/a/b/c. Now nearly all dynamically generated data with a configurable path can be placed in a location outside of YaCy's root dir without having to use symlinks (probably good for third-party distribution packaging).
- abstractServerSwitch.getConfigPath(setting, default) returns a File instance, either with an absolute path or relative to the application's root path (see the sketch after this list).
- exceptions (hardcoded):
DATA/LOG/yacy.logging
DATA/SETTINGS/httpProxy.conf
DATA/SETTINGS/user.db
TODO: all of these are global configuration files; they should probably be put into _one_ command-line-configurable settings path, so it would be possible to package them in /etc/, for example.
- add missing workPath to yacy.init (it was used in code, but there was no default in the file)
- fix the broken skinPath setting (yacy.init and the code disagreed on the name, skinPath vs. skinsPath) + a few other broken config readings caused by typos.
- replaced path setting names and their default values with the related static fields in plasmaSwitchboard where not already done/existing
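A minimal sketch of the getConfigPath(setting, default) behaviour described above, assuming a switchboard-like class that holds the application's root path and a settings map; everything except the getConfigPath name is an illustrative placeholder.

    import java.io.File;
    import java.util.Map;

    public class ConfigPathSketch {

        private final File rootPath;                 // the application's root path
        private final Map<String, String> settings;  // parsed configuration values

        public ConfigPathSketch(File rootPath, Map<String, String> settings) {
            this.rootPath = rootPath;
            this.settings = settings;
        }

        public String getConfig(String key, String dflt) {
            return settings.getOrDefault(key, dflt);
        }

        // absolute paths are kept as-is, everything else is resolved against the
        // application root, so fooPath=/a/b/c stays /a/b/c instead of becoming
        // /path/to/yacy/root/a/b/c
        public File getConfigPath(String setting, String dflt) {
            String value = getConfig(setting, dflt);
            File path = new File(value);
            return path.isAbsolute() ? path : new File(rootPath, value);
        }
    }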
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4196 6c8d7289-2bf4-0310-a012-ef5d649a1542
Two main changes must be implemented to enable mass remote crawls:
- shift control of robots.txt to the crawl queue (away from the stacker). This is necessary since remote
crawls can contain unchecked URLs. Each peer must check robots.txt itself to prevent being misused
as a crawl agent for unwanted file retrieval (a sketch follows below)
- implement new index files that control the double-check of remotely crawled URLs
After the removal of robots.txt checking from the stacker threads, multi-threading of this process is void,
so it has been removed. The thread pools for the crawl threads have also been removed, since
creating these threads is not resource-consuming; for a detailed explanation see svn 4106.
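A hedged sketch of doing the robots.txt check when a URL leaves the crawl queue instead of when it is stacked, so URLs handed over unchecked by remote peers are still filtered before fetching; RobotsTxt and the queue class are placeholder types, not YaCy's implementation.

    import java.net.URL;
    import java.util.ArrayDeque;
    import java.util.Queue;

    public class CrawlQueueSketch {

        interface RobotsTxt {
            boolean isDisallowed(URL url); // fetches/caches the host's robots.txt as needed
        }

        private final Queue<URL> queue = new ArrayDeque<>();
        private final RobotsTxt robots;

        public CrawlQueueSketch(RobotsTxt robots) {
            this.robots = robots;
        }

        public void push(URL url) {
            // no robots check here: remote peers may hand over unchecked URLs
            queue.add(url);
        }

        public URL pop() {
            URL next;
            while ((next = queue.poll()) != null) {
                // the check happens when the URL leaves the queue, so this peer is
                // never misused as a crawl agent for disallowed resources
                if (!robots.isDisallowed(next)) return next;
            }
            return null;
        }
    }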
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4181 6c8d7289-2bf4-0310-a012-ef5d649a1542
- the put(key, value) methods are now used if a value added to the map should be kept as it is. Numbers are transformed (but not formatted) into an equivalent String representation.
- putASIS(...) has been removed; this is now done with a simple put(...) (see above).
- putNum(...) can be used for number values which should be stored in a formatted way, either depending on the current locale setting for YaCy (default) or in a "none" locale (see the javadocs and setLocalize()).
- putHTML(...) escapes special characters into the corresponding HTML entities ('<' => '&lt;'), which was done with put(...) before and so was called far too often, because it is necessary in only a few cases. Additionally there is a "forXML" mode which only replaces < > & ".
In short: use put(...) for almost everything, and use putXY(...) if you need some special transformation of the value (a small sketch follows below).
A few bugs have been fixed as well, and there should be a small performance improvement for complex pages with a lot of values.
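A minimal sketch of the put / putNum / putHTML split on a plain map, with simplified number formatting and escaping; this is an illustration of the convention, not YaCy's actual template-properties class.

    import java.text.NumberFormat;
    import java.util.HashMap;
    import java.util.Locale;

    public class TemplatePropsSketch extends HashMap<String, String> {

        // put(...): keep the value as it is; numbers only become their String form
        public void put(String key, long value) {
            super.put(key, Long.toString(value));
        }

        // putNum(...): store the number formatted for the current locale
        public void putNum(String key, long value) {
            super.put(key, NumberFormat.getInstance(Locale.getDefault()).format(value));
        }

        // putHTML(...): escape special characters into HTML entities
        public void putHTML(String key, String value) {
            super.put(key, value
                    .replace("&", "&amp;")
                    .replace("<", "&lt;")
                    .replace(">", "&gt;")
                    .replace("\"", "&quot;"));
        }
    }

Typical usage then would be prop.put("peername", name) for raw values, prop.putNum("linkcount", 1234567L) for locale-formatted numbers, and prop.putHTML("query", userInput) where escaping is actually needed (names here are examples only).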
* added additional Sum/Avg rows to access tracker pages, see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=456
* removed duplicate code (mostly related to the big changes above).
TODO:
- make sure number formats work as expected _everywhere_; report overlooked spots at http://forum.yacy-websuche.de/viewtopic.php?f=5&t=437
- probably a good idea to add special putDate() methods, as dates are used on many pages and create duplicated formatting code + maybe some centralized handling for memory value formatting.
- further improve the speed of page creation for the WatchCrawler.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4178 6c8d7289-2bf4-0310-a012-ef5d649a1542