yacy_search_server

Commit Graph

Author	SHA1	Message	Date
Michael Peter Christen	5e31bad711	- the webgraph shall store all links which appear on a web page and not all unique links! This made it necessary, that a large portion of the parser and link processing classes must be adapted to carry a different type of link collection which carry a property attribute which are attached to web anchors. - introduction of a new URL class, AnchorURL - the other url classes, DigestURI and MultiProtocolURI had been renamed and refactored to fit into a new document package schema, document.id - cleanup of net.yacy.cora.document package and refactoring	11 years ago
Roland Haeder	841a28ae76	Added 'final' for all exception blocks as this helps the Java compiler to optimize memory usage Conflicts: source/net/yacy/search/Switchboard.java	11 years ago
Michael Peter Christen	5878c1d599	- refactoring of log to ConcurrentLog: jdk-based loggers tend to block at java.util.logging.Logger.log(Logger.java:476) in concurrent environments. This makes logging a main performance issue. To overcome this problem, this is an add-on to jdk logging to put log entries on a concurrent message queue and log the messages one by one using a separate process. - FTPClient uses the concurrent logging instead of the log4j logger	12 years ago
Marc Nause	75f9568472	*) only install files from the RELEASE directory *) minor changes	12 years ago
Marc Nause	3bc5ee6e3d	*) added protection against CSRF in update download page (http://localhost:8090/ConfigUpdate_p.html?releaseinstall=../../test.txt&deleteRelease=Delete+Release does not work anymore)	12 years ago
Michael Peter Christen	c5f67a5d6d	fixed a problem with local search from solr results: now all results from solr are shown (again)	12 years ago
Michael Peter Christen	f8f05ecba7	- added a delete button in host browser to delete a complete subpath - removed storage of default collection name - default is now "user" - made stacking of crawl start points concurrently	12 years ago
Michael Peter Christen	b400fc7b4d	fix for file parser problem	12 years ago
Michael Peter Christen	6017691522	added an exception catch	12 years ago
Michael Peter Christen	613cf7da7f	enhancement to post argument parsing - possible fix to zero-filled parameter values	12 years ago
Michael Peter Christen	a8167e6e5b	clean-up: removed unused methods in kelondro	12 years ago
orbiter	0cbda0b2b8	- replaced all length() == 0 and size() == 0 with isEmpty() - replaced some length() > 0 and size() > 0 with !isEmpty() - cannot be done automatically - implemented some isEmpty() methods	13 years ago
Michael Peter Christen	ce8d4b87d9	fixes for new eclipse 'Juno' warning 'Resource leak'.	13 years ago
Michael Peter Christen	b9d42fd9c8	using com.google.common.io.Files instead of homebrew methods	13 years ago
Michael Peter Christen	3b992e6b00	using utf8 String compression in Webstructure database	13 years ago
Michael Peter Christen	c639248c23	protection against strange answers from remote peers during search	13 years ago
Marek Otahal	f75b5e40e0	little fix in copy() Signed-off-by: Marek Otahal <markotahal@gmail.com>	13 years ago
Michael Christen	e7e429705a	- less automatic indexing after a search (needs to reset the default crawl profiles) - fix for concurrency problem in storage of serverSwitch Properties - markup update	13 years ago
orbiter	775b44017e	refactoring git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8033 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
orbiter	eb9c9edb01	enhanced table method (used by almost all yacy api interfaces) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8000 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
orbiter	035ebfbf3b	- performance hacks (should affect the crawl balancer and reduce CPU load during crawl stack re-fill) - this may have also (good) performance side effects on other parts of YaCy git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7982 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
sixcooler	9170a434ed	throwing an exception again in FileUtils.copy(reader, writer); OOMs could occur here and should not be ignored git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7858 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
orbiter	fe0c08455b	more concurrency (enhancement) hacks git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7759 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	a36fda991e	hack to increase speed of url hash computation git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7751 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	4bea3f9714	hack to reduce resource contention caused by massive UTF8 decodings which use java.nio resources: used an ASCII String <-> byte[] conversion wherever possible. Many Strings in YaCy are hashes which are pure ASCII (base64 hashes). The new ASCII String <-> byte[] conversion methods have less computation overhead than the UTF8 conversion. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7746 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	746e3c3b06	Replaced a widely-used Property Object in the httpd with HashMap<String, Object> which is not synchronized like Properties A synchronization is not needed here and applies an overhead to the httpd process which is now removed. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7745 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	10e2f588f8	- enhanced ybr ranking computation - many speed/performance hacks - added solr sharding and new sharding web interface - added option to switch off the yacy index when using solr - added new fail-url categories which are used to make a distinction which fail-urls to be sent to solr - refactoring/renaming of some method names to distinguish host/url hashes better - a large number of bug/npe fixes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7738 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	1989ebc24b	removed more warnings git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7598 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	a07a1a8b1e	removed type cast warnings git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7593 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	694fa3a2a5	- replaced more direct string-based UTF-8 conversions by predefined UTF-8 conversion - changed menu structure slightly git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7583 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	e1b6916423	always try to guess the size of a StringBuilder to prevent too many memory re-allocations git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7572 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
low012	3b40b98256	*) set SVN properties *) minor changes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7567 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	cb1f49d0f2	replaced all 'new String' with default encoding (missing) or UTF-8 encoding with a String generation method that uses a pre-defined Charset constant for UTF-8. This avoids a cache-lookup for the Charset object using String hashing of the String 'UTF-8'. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7558 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	993b9bc1a8	memory/performance hacks, less synchronization, better concurrency git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7544 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	1110d16af9	performance hack: replaced generic row.getColBytes() call with row.getPrimaryKeyBytes() where the column is 0 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7529 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	804ae2275b	- do not delete idx and gap files if the heap is not modified this change may have bugs in it which may cause damage to your existing data. please use with care. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7516 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	af87af0d4c	- removed synchronization in serverSwitch which should improve speed - fixed wrong assert in network graph - enhanced double check method in table class git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7511 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
f1ori	982aa689ef	* fix StringIndexOutOfBoundException in WebStructureGraph * add better escaping to saveMap and loadMap git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7458 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
f1ori	4915d1781a	* use local backup-file, if remote network-definition is not available * resolve single point of failure in networks, managed by central network-definitions git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7363 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	48c0d508ac	fixes for crawling of smb links (file length not always available) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7190 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	b6fb239e74	redesign of parser interface: some file types are containers for several files. These containers had been parsed in such a way that the set of resulting parsed content was merged into one single document before parsing. Using this parser infrastructure it is not possible to parse document containers that contain individual files. An example is a rss file where the rss messages can be treated as individual documents with their own url reference. Another example is a surrogate file which was treated with a special operation outside of the parser infrastructure. This commit introduces a redesigned parser interface and a new abstract parser implementation. The new parser interface has now only one entry point and returns always a set of parsed documents. In case of single documents the parser method returns a set of one document. To be compliant with the new interface, the zip and tar parser had been also completely redesigned. All parsers are now much more simple and cleaner in their structure. The switchboard operations had been extended to operate with sets of parsed files, not single parsed files. additionally, parsing of jar manifest files had been added. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6955 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	60e71876ad	- more abstraction (HashMap -> Map) - more concurrency-awareness (HashMap -> ConcurrentHashMap) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6910 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	fc5efcc05a	enhanced and fixed OAI-PMH import - now importing OAI-PMH server list from two sources - simultaneous import from several servers (even > 2000) - check buttons on OAI-PMH server list to select multiple servers for import start - it is possible to select all servers at once for import - imported XML data is gzipped after import from surrogate reader git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6847 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	1a8a134e0c	continuing String-hash - to - byte[]-hash redesign that was started in SVN 6775 and continued in SVN 6790 The result should be a less usage of new String() and less memory usage (since a String-encapsulated byte[] has 40 bytes overhead) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6815 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	25aef069a6	continuing String-hash - to - byte[]-hash redesign that was started in SVN 6775 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6790 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	0f8004f9da	enhanced html parser to recognize a href tags inside header tags git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6743 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	a06f7ddb33	more PMD recommendations git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6572 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	dd459281c8	applied code changes that are recommended by PMD git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6563 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	d77a8f3b3e	added some modifications recommended by PMD for better performance git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6560 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	5afd9f7a91	fix for crlf writing git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6477 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago

54 Commits (e56aa4fe93e3ecc82728f1e253ce1b3ff564852a)