yacy_search_server

Commit Graph

Author	SHA1	Message	Date
orbiter	8313d58ae7	- integrated the collage into the Web Visualization menu - added a counter for the public and private queue on the page (testing..) - fixed wrong public/private categorization git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4686 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	82bf9ac1c8	- added Collage servlet from datengrab and modified it: * all images are queued * private/public is respected * inserted into switchboard * added collageQueue class that stores all the queued images git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4683 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	202a3adb3e	refactoring of HttpClient Writer processes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4678 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	e356625b22	- refactoring of stream copy handling to support time-consuming operations - made usage of BufferedStreams explicit to distinguish the different copy methods in serverFileUtils (byte-by-byte and using an own buffer) - introduced another timeout setting (java internal property) - more restrictions to clients accessing a single host (a security setting to prevent DoS by mistake) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4674 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	c3342e1178	- removed class with only one static method - removed connection method with too long time-out git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4672 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
danielr	5c3c1fdf41	replaced httpc with Apache Jakarta Commons HttpClient (includes some refactoring ;) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4640 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	7f9f639d20	- refactoring and abstraction of index reference (urls) handling: blacklisting is part of reference filtering - refactoring of word/phrase handling: word abstraction from condenser becomes part of index element handling - removed unused code parts from condenser git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4603 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	d6050b9ffb	- separated the LURL data storage and Crawl result stack for process supervision. this is another step to enable multiple, concurrent fulltext-indexes - another try to make the yacy-httpc more stable git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4602 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	541b817502	refactoring of switchboard queueing git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4591 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	275a226cc5	refactoring git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4524 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
apfelmaennchen	bc3d3b4c97	fixed rebuildTags() to correctly rebuild folders... git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4523 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	2327451653	- changed order of database initialisation (index first) - removed mainly unused init-time for databases (was only used for tree tables, which are not used any more) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4496 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
lulabad	9ecc17baef	fixed duplicate Blog entries git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4492 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
lulabad	94e256e13b	* removed single Blogview, now links direct to BlogComments.html * some other small changes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4483 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
lulabad	00f5f917de	- more refactoring to blog - fixed moderate comment bug. see http://forum.yacy-websuche.de/viewtopic.php?f=9&t=860 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4478 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	7f445f34a6	please enable the typical Java 5 warnings! (an unboxing error pointed to a program bug, and a type declaration was missing) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4476 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
lulabad	c1b9a03304	* some refactoring to Blog * changed default sort order to reverse (newest first) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4475 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
lulabad	766a04bc06	fixed sort problem in Blog. see http://forum.yacy-websuche.de/viewtopic.php?f=6&t=639 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4474 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	bd63999801	- faster search: using different data structures that avoid multiple calculations - no more table copy for error-eco table - optional table copy for lurl-entries - more abstractions (less single constant strings) - better logging (using host names instead of ips) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4459 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
lulabad	8358652fa9	some small changes to blog git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4457 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
lulabad	6a85764e1a	Second bugfix for numberbug in Blog. This update automatically fixes existing Blog entries. A backup is not needed but still a good idea ;) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4451 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
lulabad	40a0591942	Fixed numberbug in Blog, see http://forum.yacy-websuche.de/viewtopic.php?f=6&t=639 . This won't fix existing Blog entries (comes later). git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4443 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	7d875290b2	more generics git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4417 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	9d693ee635	more generics git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4415 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	0f5c4abaca	more generics git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4414 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	4a80902081	- added ViewProfile as rdf in foaf syntax - added link to rdf and vCard version on html page - can be seen on http://localhost:8080/ViewProfile.html?hash=localhash - more generics git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4411 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
apfelmaennchen	b1fae9b5af	fixed import Netscape Bookmarks git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4401 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
apfelmaennchen	f3a9e9c542	added getFolderList() to bookmarksDB added cleanTagsString() to bookmarksDB added getFoldersString() to Bookmark modified getTagsString() to exclude folderTags git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4383 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
apfelmaennchen	e81bced2bd	reorganized the code and adjusted getTagIterator() to suit folders git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4357 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
borg-0300	53367d941a	more information (BASE64) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4324 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
apfelmaennchen	704de4dee8	Added new function, needed for restricting the tag cloud: public Iterator getTagIterator(String tagName, boolean priv) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4313 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	03e7782269	more generics git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4305 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	d517e96714	last cleanup bits to serverDate before the release. only safe refactoring (method renaming) changes outside of serverDate. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4289 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
hermens	4748d5c1ab	Some enhancements to time management: - remove unnecessary generation of Calendar and Date objects - synchronized SimpleDateFormat objects in blog-, message- and wikiBoard - correct use of TimeZones and SimpleDateFormats git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4288 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	1cb6e431a6	Replace the ISO8601 aka W3C datetime parser by one that supports every representation allowed by this standard, see http://www.w3.org/TR/NOTE-datetime - useful especially for sitemaps parsing, where this date format is used git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4286 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	33ee6745f6	more cleanup in serverDate - remove direct accesses to SimpleDateFormat fields in serverDate and use the static parse... methods instead - remove nowDate() as a Date doesn't store timezone information and a new Date() is always faster - default formatter methods use a GMT timezone by default now, this is important for interchangeability as some date formats we use don't include a timezone offset. - continued renaming and rearranging (formatter) methods. all should follow the general naming scheme formatWHAT(...) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4285 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	21b8d1b918	small cosmetic change for static fields in serverCore (special protocol ASCII entities) to improve readability git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4275 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	c527969185	- enhanced monitoring of ranking parameters for details, please try http://localhost:8080/IndexControlRWIs_p.html - fixed computation of ranking ordering in some cases git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4220 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	6eaa5a0e64	enhanced local search speed. The ranking process is now 6 times faster than before. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4197 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	425e4ead66	Allow absolute paths in configuration settings. - before absolute paths would be expanded incorrectly, e.g.: fooPath=/a/b/c would become /path/to/yacy/root/a/b/c. Now you can put nearly every dynamically generated data with a configurable path to a location outside of yacy's root dir without having to use symlinks (probably good for third party distribution packaging). - abstractServerSwitch.getConfigPath(setting, default) returns a File instance, either with an absolute path or relative to the application's root path. - exceptions (hardcoded): DATA/LOG/yacy.logging DATA/SETTINGS/httpProxy.conf DATA/SETTINGS/user.db TODO: all of these are the global configuration files and they should probably be put into _one_ command line configurable settings path, so it would be possible to package them in /etc/ for example. - add missing workPath to yacy.init (it was used in code, but there was no default in the file) - fix broken skinPath (was skinsPath in yacy.init but skinsPath in the code) + a few other broken config reading caused by typos. - replaced path setting names and their default values with the related static fields in plasmaSwitchboard where not already done/existing git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4196 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	a31b9097a4	preparations for mass remote crawls: two main changes must be implemented to enable mass remote crawls: - shift control of robots.txt to crawl queue (away from stacker). This is necessary since remote crawls can contain unchecked urls. Each peer must check the robots to prevent that it is misused as crawl agent for unwanted file retrieval - implement new index files that control double-check of remotely crawled urls After removal of robots.txt checking from stacker threads, the multi-threading of this process is void. Multithreading has been removed. Also the thread pools for the crawl threads had been removed, since creation of these threads is not resource-consuming, for a detailed explanation see svn 4106 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4181 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	0e1738899f	* Complete number localization and provide a more reasonable interface to serverObjects: - put(key, value) methods are now used if a value added to the map should be kept as it is. Numbers are transformed (but not formatted) to an equivalent String representation. - putASIS(...) have been removed, now done with simple put(...) (see above). - putNum(...) can be used for number values which should be stored in a formatted way, either depending on the current locale setting for yacy (default) or in a "none" locale (see javadocs and setLocalize()). - putHTML(...) escapes special characters into corresponding HTML entities ('<' => '&lt;') which was done with put(...) before and so was called too often, because it is necessary only for very few cases. Additionally there is a "forXML" mode which only replaces < > & ". In short: Use put(...) for almost everything, use putXY(...) if you need some special transformation of the value. A few bugs have been fixed as well, and there should be a small performance improvement for complex pages with a lot of values. * added additional Sum/Avg rows to access tracker pages, see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=456 * removed duplicate code (mostly related to the big changes above). TODO: - make sure, number formats work as expected _everywhere_, report overlooked stuff http://forum.yacy-websuche.de/viewtopic.php?f=5&t=437 - probably a good idea to add special putDate() methods as they are used in many pages and create duplicated formatting code + maybe some centralized handling for memory value formatting. - further improve the speed of page creation for the WatchCrawler. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4178 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	06e6a1ff62	Add a generalized Formatter class yFormatter inspired by http://forum.yacy-websuche.de/viewtopic.php?f=5&t=437 At the current state it allows formatting of numbers (integer + decimal types) for output according to the Locale derived from the language setting in yacy. Network.(html|xml) and Status.html have been changed to use it for now (TODO: should be integrated into other servlets as well to reduce duplicate formatting code). NOTE: For now the output format for Network.xml simulates the old behaviour which is wrong (it uses '.' as decimal and grouping separator), to make sure external scripts like the yacystats.de one won't break with this update. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4162 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	9b0948cb4c	gnarf. mixed up the positions. finally fixed... git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4143 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	c0f5fc51ef	bugfix for last commit git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4142 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	c5a8585ac6	fix more encoding problems in yacysearch.rss. - URL encoding for search terms where required - removed "ugly" CDATA escaping - UTF-8 encoding for the XML - no HTML style escaping for XML/RSS element values Note: some unicode characters might still be encoded in a wrong way. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4140 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	01e0669264	re-designed some parts of DHT position calculation (effect is the same as before) and replaced old first hash computation by new method that tries to find a gap in the current dht. to do this, it is necessary that the network bootstrapping is done before the own hash is computed. this made further redesigns in peer initialization order necessary git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4117 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	842308ea97	- redesigned crawl start menu, integrated monitoring pages - removed web structure picture from indexing menu and grouped it together with htcache monitor - added a database for terminated crawls, when a crawl is finished it is automatically moved to the new database - extended crawl profile edit servlet, shows now also terminated crawls - option that was used to delete profiles is now redesigned to a function that moves the current crawl to the terminated crawls and removes all urls from the current queues! - fixed here and there problems with indexing queues - enhanced indexing speed by changing cache flush sizes. - changed behaviour of crawl result servlet: the list of crawled urls is shown if there is one, otherwise the overview window is shown attention: the new profile databases are not compatible with the old one. current crawls will be lost! the web index is not touched. next steps: the database of terminated crawls can be used to start with them a new crawl. This is useful if one wants to re-crawl specific pages and wants to use an old crawl profile. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4113 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	11b4f80bde	- fixed non-closing client connections - added client connection tracker in connections servlet git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4108 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	1488769e1f	cleanup of unmaintained and outdated performance methods: removed object pools in httpc. Object pooling is not recommended, if the creation of the object is not time-intensive. Object pools are only useful, if there is much computation necessary to create some basic data that is stored in the object pool and can be re-used. This does not apply to object pools in YaCy. Object pooling of client sessions would make sense if they would allow re-use of living connections to other yacy clients. But every connection is closed after usage of an object in the client pool, therefore the YaCy server client objects are not such that hold hardware/network-allocated entities. See: http://www.javaperformancetuning.com/news/qotm033.shtml http://java.sun.com/docs/hotspot/HotSpotFAQ.html#gc_pooling http://docs.sun.com/source/816-7159-10/pt_chap5.html http://www.microjava.com/articles/techtalk/recylcle2 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4106 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago

395 Commits (c7021c14bbfb65f631c48fc9cd395293a2d49727)