yacy_search_server

Commit Graph

Author	SHA1	Message	Date
Michael Peter Christen	6e59ca4ebf	removed jena library and all code that depended on jena. When jena was introduced, it was also used for search facets. The generic search facets are now deduced from generic solr fields which makes jena as tool for facet semantics superfluous.	11 years ago
sixcooler	0512e46c6a	bump to httpclient-4.3.2	11 years ago
Michael Peter Christen	84cf7e8e9f	backmigration from solrj 4.6.0 to 4.5.1. This is necessary because solrj.4.6.0 has a bug which prevents the attachment of a remote solr (as tested with a SolrCloud). See bug report https://issues.apache.org/jira/browse/SOLR-5532 This bug shall be fixed in Solr 4.6.1. Fortunately, solrj-4.5.1 works together with solr-4.6.0 thus the current index does not need to be changed.	11 years ago
reger	37f2a82a5d	making root context (htroot) a WebAppContext - this allows additional features, like servlet configuration via web.xml and many more things. - currently the standard servlets are still configured in the code (so the supplied defaults/web.xml is not really needed, yet), but could be expanded - lookup for web.xml - 1. in /DATA/SETTINGS then in /defaults	11 years ago
reger	1b6d173b14	update to Jetty 8.1.14	11 years ago
sixcooler	8954b2d25f	removed classpathentry to 'remove obsolete htroot/solr htroot/gsa YaCy-servlets'	11 years ago
sixcooler	37859dfc85	missing entries for: 'updated poi-3.9 / poi-scratchpad-3.9'	11 years ago
Michael Peter Christen	7603e879dc	Merge branch 'master' into HEAD Conflicts: .classpath source/net/yacy/cora/federate/solr/SolrServlet.java	11 years ago
Michael Peter Christen	8b97489ff2	updated guava to 15.0	11 years ago
Michael Peter Christen	34b4eda4a8	upgraded json-simple to 1.1.1	11 years ago
Michael Peter Christen	75ae36da9c	upgraded jsch to 0.1.50	11 years ago
Michael Peter Christen	db793a2a5e	removed mysql connector which was used only for testing in the past	11 years ago
Michael Peter Christen	7ebc74b76a	migrated to pdfbox 1.8.3	11 years ago
Michael Peter Christen	2f16770681	migrated to solr 4.6.0	11 years ago
reger	cb2dbcb843	add graceful Jetty shutdown option - as Jetty stop is not synced, yet - include jetty jars and servlet-3.0 api jar in Eclipse .classpath	11 years ago
reger	1adb4b8741	merge rc1/master	11 years ago
sixcooler	dfb73c9519	bump to httpclient-4.3.1 - a bugfix release	11 years ago
reger	a44eede8b8	merge rc1/master	11 years ago
Michael Peter Christen	21aa6a0321	migration to Solr 4.5.0	11 years ago
sixcooler	15b1bb2513	bump to httpClient-4.3	11 years ago
reger	105cf8f593	changes to adjust jetty to recent code changes	11 years ago
reger	aafef72a8a	merged current rc1/master into jetty branch to allow further development with latest version ServerSideIncludes and servlet return values need further work (for working jetty integration) - TODO: added nasty quickfix to allow SSI - needs further work - TODO: YaCy servlet return values/parameters are not handled	11 years ago
Michael Peter Christen	5b7c0d0745	update to pdfbox 1.8.2	11 years ago
Michael Peter Christen	f13df9dbb6	migration to solr 4.4.0	11 years ago
Michael Peter Christen	dc1002e511	cleaned sourcepaths from eclipse classpath	11 years ago
Michael Peter Christen	c4538d8d91	added metadata-extractor-2.6.2.jar to eclipse classpath, removed old lib	12 years ago
Michael Peter Christen	9bd2aee180	migrated to solr 4.3.0	12 years ago
Michael Peter Christen	ad050ec88d	- upgraded httpclient, httpcore and httpmime - removed httpclient 3.1 which has been used by solrj < 4.x.x and is now not used any more - fixed some parts in YaCy which used methods from httpclient 3.1	12 years ago
Michael Peter Christen	4b100f8b48	Merge branch 'master' of ssh://gitorious.org/yacy/rc1	12 years ago
Michael Peter Christen	3abf516ca7	merged classpath	12 years ago
orbiter	48e9a54e80	updated pdf parser	12 years ago
Michael Peter Christen	27907c9739	added missing library after solr upgrade	12 years ago
Michael Peter Christen	cf0acd2cb4	upgrade to solr 4.2.1	12 years ago
Michael Peter Christen	461d46101d	- Removed log4j from libraries. This can be removed because the package log4j-over-slf4j is there. From slf4j all loggings are routed to the jdk logger. Now all loggings are consistently done to the jdk logger. - added some lines to the logging properties to suppress many solr logging statements. The number of the logging entries had already become a performance issue, therefore removing these from the log should increase performance.	12 years ago
Michael Peter Christen	b349c8145b	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	12 years ago
orbiter	36f9b0fc16	updated wstx-asl to 3.2.9	12 years ago
Michael Peter Christen	788288eb9e	added the generation of 50 (!!) new solr fields in the core 'webgraph'. The default schema uses only some of them and the resulting search index has now the following properties: - webgraph size will have about 40 times as many entries as the default index - the complete index size will increase and may be about double the current size As testing showed, not much indexing performance is lost. The default index will be smaller (moved fields out of it); thus searching can be faster. The new index will cause that some old parts in YaCy can be removed, i.e. specialized webgraph data and the noload crawler. The new index will make it possible to: - search within link texts of linked but not indexed documents (about 20 times of document index in size!!) - get a very detailed link graph - enhance ranking using a complete link graph To get the full access to the new index, the API to solr has now two access points: one with attribute core=collection1 for the default search index and core=webgraph to the new webgraph search index. This is also available for p2p operation but client access is not yet implemented.	12 years ago
Michael Peter Christen	09a2b09c48	guava update	12 years ago
Michael Peter Christen	80fe3d7860	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git Conflicts: source/net/yacy/cora/federate/solr/connector/EmbeddedSolrConnector.java	12 years ago
Michael Peter Christen	4323621a76	update to Solr 4.1.0	12 years ago
sixcooler	639c114199	remove jetty from classpath - as it was moved last commit	12 years ago
sixcooler	f3e705c4fe	bump to httpclient / httpcore 4.2.3 (bugfix-release)	12 years ago
Michael Peter Christen	9dfc9c95d8	updated slf4j and log4j	12 years ago
Michael Peter Christen	95712fdc8b	update to pdf parser	12 years ago
Michael Peter Christen	a1a4d9aa94	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git Conflicts: source/net/yacy/cora/federate/solr/connector/MirrorSolrConnector.java	12 years ago
Michael Peter Christen	e2c4c3c7d3	migration to solr 4.0.0	12 years ago
Michael Peter Christen	69aa39d664	update to libraries required by solr 4.0.0	12 years ago
sixcooler	9d062873d2	bump to httpclient-4.2.2	12 years ago
sof	5cb244b79b	Merge remote branch 'origin/master'	12 years ago
apfelmaennchen	88b062210c	Added a parser for audio file tags (e.g. ID3 tags for MP3 files) based on the jaudiotagger library. The parser is disabled by default as it needs to store temporary files for non file:// protocols, which might be disliked. For your local MP3-collection it loads nicely Artist, Title, Album etc. from the audio files meta data.	12 years ago
sixcooler	9aa21506be	bump to httpcore-4.2.2 (maintenance release)	12 years ago
Michael Peter Christen	d0015df61c	added lucene memory library which is now necessary as solr has to process more complex queries	12 years ago
Michael Peter Christen	e65cecc419	- updated lucene libraries to 3.6.1 - added lucene-grouping which enables faceted search; try this: http://localhost:8090/solr/select?q=*:*&start=0&rows=3&facet=true&facet.field=host_s	12 years ago
Michael Peter Christen	ff3eaa21b0	added remote search to solr on YaCy peers! - when doing a remote search, node peers are selected for solr queries - the solr query is done concurrently to the standard YaCy rwi search - the solr search result is fed into the same data structure that prepares the rwi search result - the same remote search that is done to several outside peers is done to the local solr index - the search process works now also without any 'old' RWI data using solr	12 years ago
Michael Peter Christen	d39463a85c	added deleteByQuery to solr connectors	12 years ago
Michael Peter Christen	2ccf1dba71	upgrade to solr 3.6.1	12 years ago
Michael Peter Christen	ea49a8aa8c	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	12 years ago
Michael Peter Christen	d988ba50cf	added a very rudimentary, incomplete, non-verified GSA response writer for solr. Try this: http://localhost:8090/gsa/searchresult?q=pdf&site=col1&num=10	12 years ago
cominch	e74d66e28c	augmented browsing: remove htmlparser library	12 years ago
cominch	e2119f4e76	augmented browsing: replace htmlparser by jsoup, which is more stable and reliable	12 years ago
Michael Peter Christen	bf4968d748	source change in classpath	13 years ago
sixcooler	a99ef68422	bump to httpclient-4.2.1	13 years ago
Michael Peter Christen	65f56b1fd4	Merge branch 'master' of ssh://gitorious.org/yacy/rc1 into jetty Conflicts: .classpath build.xml htroot/Status.java source/de/anomic/http/server/HTTPDProxyHandler.java source/net/yacy/yacy.java	13 years ago
Michael Peter Christen	7b53be141f	upgraded to pdfbox 1.7.0 changes in http://www.apache.org/dist/pdfbox/1.7.0/RELEASE-NOTES.txt with many bugfixes, including performance related	13 years ago
Michael Peter Christen	fad3b14813	added jetty libraries, needed for future use as web server and as application server for the solr search interface	13 years ago
Michael Peter Christen	b9d42fd9c8	using com.google.common.io.Files instead of homebrew methods	13 years ago
Michael Peter Christen	1be0025a9c	- added test for EmbeddedSolrConnector - added needed libraries for this test this includes most (all) files needed for an embedded solr	13 years ago
Michael Peter Christen	90b82ce994	using guava for host resolution (non-blocking for ips) and time-out	13 years ago
Michael Peter Christen	3f55dc7c1e	- added solr core and libraries that solr needs (lucene is missing, will follow later) - added embedded solr connector which can connect to solr programmatically (without using a server in between)	13 years ago
Michael Peter Christen	5fc6524ca8	- moved triple store to net.yacy.cora.lod (should be generalized there later - added abstract add, delete, get methods in the triplestore - added generation of triples after auto-annotation - migrated all MultiProtocolURI objects to DigestURI in the parser since the url hash is needed as subject value in the triples in the triple store	13 years ago
cominch	5d20cd324a	Add Triplestore and RDF query interface Conflicts: build.xml defaults/yacy.init source/net/yacy/interaction/AugmentHtmlStream.java	13 years ago
cominch	b21048892b	augmentedParser add features and integrate external html parser to modify existing web pages Conflicts: addon/YaCy.app/Contents/Info.plist build.xml	13 years ago
sixcooler	56087c1f23	bump to httpclient- httpcore-, httpmime- 4.2	13 years ago
Michael Peter Christen	4d3cc02168	replaced old bzip2 library against better documented commons-compress package from http://commons.apache.org/compress/	13 years ago
Michael Peter Christen	1795a7325b	made HandleSet serializable	13 years ago
Michael Peter Christen	62f2554a01	- fixed build problems (deprecated methods using httpclient 3.1) - removed httpclient 3.1 lib which was used by solrj (solrj now uses httpclient 4)	13 years ago
Michael Peter Christen	248299d10f	updated solrj lib	13 years ago
Michael Peter Christen	f838997126	updated commons io from 2.0.1 to 2.1	13 years ago
Michael Peter Christen	eeb57ae824	updated http client libraries	13 years ago
Michael Peter Christen	ef5192f8c9	using the generic document parser for crawl starts instead of the html parser. This makes it possible that every type of document can be a crawl start point, not only text documents or html documents. Tested this with a pdf document.	13 years ago
Michael Peter Christen	a30b028cc0	updated libraries	13 years ago
Michael Christen	e69afae87e	class path for servlets in eclipse	13 years ago
Al Sutton	8993cac4d8	Initial performance improvements	13 years ago
orbiter	5a7cec59f3	moved ynetSearch to get all files out of htroot/api/util/ git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8042 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
orbiter	65ab067491	migration to solrj 3.4.0 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7952 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
sixcooler	52b477cf6f	bump to httpclient-4.1.2, httpcore-4.1.3 - bugfixrelease git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7876 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
sixcooler	48560a44a9	bump to httpcore-4.1.2: a bugfixrelease git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7853 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	c0d9474b31	update to eclipse class path environment git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7834 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	528b59e078	replaced xerces.jar library that was originally added 2005 with SVN 126 to the libx directory and that was moved to lib in SVN 5781 the new replacement is taken from http://xerces.apache.org and has the version 2.11.0 and was inside the file Xerces-J-bin.2.11.0.tar.gz and consists of two files named xercesImpl.jar and xml-apis.jar The original purpose of that library was to support: - content parsers - optional seed uploader - SOAP API (which will be committed later) Since the SOAP API does not exist any more the purpose is to support content parser and an optional seed uploader git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7819 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	77fe69395d	added jempbox-1.5.0.jar which is required by pdfbox-1.5 as stated in http://pdfbox.apache.org/dependencies.html git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7774 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
sixcooler	efcd21e0ed	new httpclient, httpcore (bugfixrelease) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7769 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	761b1c71dc	added latest pdfbox git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7761 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
sixcooler	0abd99621c	correct slip of click in classpath from last commit - I wonder there are 7658'is around apflemaenchen, please don't take this amiss git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7659 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
apfelmaennchen	a0e4960a4d	YMark: - first attempt for a firefox json bookmark importer - added JSON library json-simple-1.1.jar git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7658 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	19fd13d3bc	Added federated index storage to solr. YaCy supports now the storage to remote solr indexes. More federated storage (and search) methods may follow. The remote index scheme is the same as produced by the SolrCell; see http://wiki.apache.org/solr/ExtractingRequestHandler Because this default scheme is used, the default example scheme can be used as solr configuration This is also the same scheme that solr uses if documents are imported with apache tika. federated solr storage is switched off by default. To use this, do the following: - set federated.service.solr.indexing.enabled = true - download solr from http://www.apache.org/dyn/closer.cgi/lucene/solr/ - extract the solr (3.1) package, 'cd example' and start solr with 'java -jar start.jar' - start yacy and then start a crawler. The crawler will fill both, YaCy and solr indexes. - to check whats in solr after indexing, open http://localhost:8983/solr/admin/ Until now it is not possible to use the solr index to search with YaCy in that solr index. This functionality is now available for two reasons: 1) to compare the functionality of Solr and YaCy and to compare the search speed 2) to use YaCy as a search appliance for people who need a crawler or other source harvesting methods that YaCy provides (like dublin core reading, wikimedia dump reading, rss feed reader etc) if people still want to use solr instead of YaCy. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7654 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
Florian Richter	351d264a48	* yacy domain handler for jetty * rewrite from / to /index.html	14 years ago
Florian Richter	68ca0fbb2e	* add copyright info * implement basic authentication * update jetty to 7.3.0	14 years ago
sixcooler	9199b9e3c6	also putting jcifs-1.3.15 into classpath (lets me build YaCy again :-) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7588 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
Florian Richter	1989ba64c0	* jetty	14 years ago
sixcooler	45dcfa3460	update to httpclient-4.1 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7473 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	ca738ac924	- added a tag cloud to search results (using the topics) - some refactoring of score classes - added default package for new classes add_ymark and delete_ymark git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7251 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
sixcooler	f4357dff03	bump to httpclient-4.0.3 which fixes a number of bugs git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7197 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
f1ori	e670e1ef8e	add charset auto-detection for htmlParser git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7186 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	3552476fbe	terminated migration from apache httpclient-3.1 to 4.1: - remove the library - added two classes from the httpclient-3.1 library as source code to YaCy because these classes were used by the YaCy HTTP Server - modified the added classes ChunkedInputStream and ContentLengthInputStream in such a way that: * there are no more dependencies to httpclient-3.1 * these classes had been simplified to serve only the purpose for the YaCy httpd git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7171 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	f9a27a05e5	migrated to log4j 1.2.16 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7153 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	5c67e6ca49	migrated to latest apache commons fileupload 1.2.2 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7152 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	5fe828fa06	- replaced pdfbox and fontbox version 1.1.0 with 1.2.1 - added some clear statements that shall clear static cache size within the pdfbox library - the pdfbox library contains a memory leak; it is unsafe to run a peer with pdf parser permanently on. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7120 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
lotus	965aa97993	including sbbi upnplib as source again http://www.sbbi.net/site/upnp/index.html renamed package to yacy all options are also named "yacy" instead of "sbbi" git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6986 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
sixcooler	c5c67f0504	start migrating to HttpComponents-Client-4.x see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=2872 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6965 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	b5e190099d	- updated pdfbox and fontbox to 1.1.0 - added license file to sbbi-upnplib git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6946 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	37b8827a7a	- removed the UPnP library sources from sbbi and added the jar library again. The library was included to get support for fedora releases, but after this time the fact that the sbbi cannot be part of fedora should be re-discussed. If this will still not be possible, then we may integrate the sbbi UPnP package using reflection. - cleaned up the code. The new eclipse helios provided new warnings for dead code. This change cleans up most of these warnings git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6945 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	56ff9d5fd4	- extended news size from 512 to 1024 characters - a new news db will be created (news1024.db), the old one (news.db) can be deleted - peers with too large news payload are not ignored any more (they may have been invisible because they had a too large news payload!) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6917 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	fc5efcc05a	enhanced and fixed OAI-PMH import - now importing OAI-PMH server list from two sources - simultaneous import from several servers (even > 2000) - check buttons on OAI-PMH server list to select multiple servers for import start - it is possible to select all servers at once for import - imported XML data is gzipped after import from surrogate reader git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6847 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	24e5faee75	added exif parsing for jpg images git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6745 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	1bbe14d23f	SVN 6716 unfortunately contained parts of the unfinished SMB integration. To fix compile errors the remaining parts of the SMB implementation stub is added with this commit. This adds the jcifs smb library. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6717 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	f5ec7ad077	replaced four old libraries with latest version git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6702 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	1e2c011c98	updated the jsch lib from 0.1.21 to 0.1.42 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6688 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	c2b505ae87	updated bouncy castle libraries git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6687 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	681f4d185f	replaced microsoft office document parser POI 3.5 with latest version 3.6 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6686 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	e9cdddcd0f	updated parser libraries fontbox and pdfbox with latest version of jar files git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6685 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
lotus	945e0ba5a5	allow global search if res. observer disabled index transmission git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6658 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	cde1611919	updated junit git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6428 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	9db928ce53	replaced fontbox 0.7.3 with fontbox 0.8.0 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6414 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
lotus	0975b1b493	update for apache poi library possibly solves http://forum.yacy-websuche.de/viewtopic.php?p=17736#p17736 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6411 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	72ac5bd80f	refactoring of search process. this is the beginning of some architecture changes that will hopefully bring some more stability, speed and transparency to the search process. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6260 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
f1ori	d515bc11e2	added ooxmlparser git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6256 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
f1ori	67da20647f	* add new odf parser based on sax-xml-parser * remove odf_utils-jar * test metadata in ParserTest git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6231 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	eea4c17ef2	removed rpm parser - no-one used that thing - loading huge rpm files may be a cause of crashes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6223 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
lotus	477807e0e6	* updated jxpath to latest v1.3 * added upnplib as source without packages: jmx remote samples git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6218 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	43c8defd79	enhanced parser with more extension + mime attributes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6214 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	49bbb9bd45	replaced tar library with integrated apache ant tar lib git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6212 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	3d26161dd1	removed unused libraries git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6204 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	50cf80056f	removed jmimemagic library git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6203 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	3f113f38a8	removed unused imports removed unused libs from eclipse class path git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6201 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
lotus	9f083bb6b2	check filetype before loading (no more mp4 loading) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6200 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	302a02cec8	moved all libraries from libx to lib removed libx directory all libraries are now in lib, instead the test libraries in libt which are not part of releases git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6157 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	34be6f82d2	fixed build path for eclipse git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6148 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	d793bb0d76	the mysql lib was not in releases included; moved library from libx to lib git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5987 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	c72a5cf326	added stub for PHPBB3 extraction code using direct access to mySQL git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5979 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	c08f9b36a4	refactoring of wiki parser. This was done to prepare the wiki parser as parser for wikipedia dumps, which will be used for performance test (to omit crawling) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5785 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	4905a17f6a	moved xerces.jar from libx to lib git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5781 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
lotus	4aad461100	added UPnP support YaCy can now automatically forward ports on home routers off by default git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5609 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	01b97ef3f8	added new cybertag-tracking feature that was inspired by itgrl from the forum discussion in http://forum.yacy-websuche.de/viewtopic.php?p=12612#p12612 The feature will provide two basic entities: - you can integrate image links which point to your yacy installation anywhere in the web. the image can be loaded with <img src="http://<yourpeer>:<yourport>/cytag.png?icon=invisible&nick=<yournickname_or_community_id>&tag=<anything>"> This will place a invisible 1-pixel image. If you change the icon=invisible to icon=redpill, you will see a red pill Use this, to track your activity in the web. - you can view your tracks at http://localhost:8080/Tracks.html - There is a public api to your tracks at http://localhost:8080/api/tracks_p.json which needs authentication git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5581 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	b423d0a036	moved all servlets from htroot/xml to htroot/api the file server contains a patch that temporary matches all xml paths to api, that means all interfaces still work. Please adopt all your interfaces to the new path. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5497 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	a70c3d5599	classpath for new api directory git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5491 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
lotus	d162cce6b4	classpath update git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5477 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
f1ori	963da8c3f9	* updated tm-extractors to new version 1.0 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5405 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
f1ori	51f1a1927c	* remove saaj.jar and axis.jar and references to it (was for soap-stuff?) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5404 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
lotus	214277dad6	- revert r5202 - cleanup - installer checks for JRE 1.6 only git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5210 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	bdae051d9a	- extended new performance graph (better timing) - added paths for new libraries in classpath for eclipse - refactoring to remove compiler warnings (static access to finals variables) - removed some unused import git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5055 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
lotus	f8a1e3175e	new yacyTray this will make a YaCy icon in the tray area on supported platforms enabled by default the search page will open on double click used JDIC 0.9.4 from https://jdic.dev.java.net/ git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4992 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	f5ef7f222e	- fixed a bug in parser (directory paths had not been recognized) - no access check when a search is made only local without snippet fetch - added comment and status message in resourceObserver (this takes very long at startup time!) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4911 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
danielr	74b1a60043	fixed "java.lang.NoClassDefFoundError: org/a" git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4784 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
danielr	ae03a54d23	pdfParser: updated lib, fixed ClassNotFoundException: CMSError git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4776 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	58830e9b28	added new terminal servlet using current visualization methods and a new one: a processing (processing.org) applet. the new servlet can be found at http://localhost:8080/terminal_p.html ..to be enhanced.. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4773 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	3082edfdbc	ups git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4766 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	d2ba1fd2ab	major step forward to network switching (target is easy switch to intranet or other networks .. and back) This change is inspired by the need to see a network connected to the index it creates in an indexing team. It is not possible to divide the network and the index. Therefore all control files for the network were moved to the network within the INDEX/<network-name> subfolder. The remaining YACYDB is superfluous and can be deleted. The yacyDB and yacyNews data structures are now part of plasmaWordIndex. Therefore all methods, using static access to yacySeedDB had to be rewritten. A special problem had been all the port forwarding methods which had been tightly mixed with seed construction. It was not possible to move the port forwarding functions to the place, meaning and usage of plasmaWordIndex. Therefore the port forwarding had been deleted (I guess nobody used it and it can be simulated by methods outside of YaCy). The mySeed.txt is automatically moved to the current network position. A new effect causes that every network will create a different local seed file, which is ok, since the seed identifies the peer only against the network (it is the purpose of the seed hash to give a peer a location within the DHT). No other functional change has been made. The next steps to enable network switching are: - shift of crawler tables from PLASMADB into the network (crawls are also network-specific) - possibly shift of plasmaWordIndex code into yacy package (index management is network-specific) - servlet to switch networks git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4765 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
danielr	c5d1d7faca	undo wrongly committed files git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4685 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
danielr	2617f4dcdb	Connections_p.html: better formatting and remove very old entries git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4684 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	ac8592a102	eclipse build path update git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4655 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
danielr	5c3c1fdf41	replaced httpc with Apache Jakarta Commons HttpClient (includes some refactoring ;) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4640 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	7cc4ff05c9	some code enhancements and bugfixes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4542 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	3441ec3928	- some small changes to highslide integration to get it working... (does not work yet) - performance enhancement for url list parser git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4495 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	6c3cd2b4f2	- added new way to watch images from the image search: they appear as separate, floating window above the search results, not in a new window - added highslide javascript library for feature mentioned above - removed dir servlet. This thing was not used as it was supposed to be (as an example applet) and was a major problem for intranet-indexing when files are hosted on the same peer. - added yacy-httpd-internal directory listing. Because YaCy is a search engine, directory listings are similar to search result listings. Intranet indexing from the same peer will get nice index pages for document collections. - removed unused test applet git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4494 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	47b98bde86	classpath update git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4486 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	cfe499d8c9	first test of alternative search interface (only a stub but working!) try http://localhost:8080/yacy/user/ysearch.html git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4482 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	39566d598a	classpath update git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4316 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	69521d92e5	Add another external dependency from PDFBox package ("Bouncy Castle"). This is necessary for parsing of some encrypted PDF files. bcprov-jdk14-132.jar is the binary jar as it is provided in the PDFBox-0.7.3 package (same as our FontBox, PDFBox packages). Closes: http://forum.yacy-websuche.de/viewtopic.php?f=6&t=453 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4231 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	ca83f5a8d9	Add external lib FontBox which is part of the PDFBox (they extracted the font handling code into this package in 0.7.3). Add the packages to the eclipse .classpath. Closes: http://forum.yacy-websuche.de/viewtopic.php?f=5&t=453 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4165 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	f323e1813d	added commons.logging again (is used by mimeTypeParser) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3989 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	03847bebc1	removed unused libs git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3971 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	9da0e53fe8	repaired rss feed reader - removed old rss parser - removed unused rss parser libraries - added new rss reader - added previously removed FeedReader_p.java and adopted it to new rss parser - adopted parser interface for rss indexing to new rss parser git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3970 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	2f34f32ce3	added .classpath git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3756 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	3562fe1706	should not be there git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3751 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
karlchenofhell	0a64047081	- plasmaParserDocument can process subdocuments now (other archive-parsers may want to use this method) - added 7zip parser - added 'text/sgml' to realtime parseable mimetypes (sometimes returned by the mime type parser) - added new cached output stream class, very suitable for parsers because of limited memory git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3740 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
rramthun	e12e934ade	*) Fixed broken compile process. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3650 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
allo	782db9099d	version independent name for commons-pool lib git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3082 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	c3ac9aac11	bugfix for latest bugfix git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2914 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	f1ed55a5fc	bugfix for last commit git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2913 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
octoate	1c4076da8a	First version of the MS Powerpoint parser based on Apache POI git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2753 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	db294687ea	enhanced logging - more logging output - fix in log line preparation - added filter to log page - some small bugfixes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2707 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
borg-0300	08aa9d4c07	duplicate removes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2706 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
allo	b114def2f8	duplicate classpath entry git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2699 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
allo	2ab09e71a7	removing absolute Classpaths git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2698 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	df1629b05a	- code cleanup - version 0.471 - moved surftipps to own web page git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2676 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
theli	523e80445f	*) adding libs to eclipse classpath file git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2331 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
allo	7b0e2521bb	Support for a supertemplate, which can do all thing, a normal template can do. Its a layer under the servlets, this means, #[page]# will be replaced by serverletcode, the rest can be set by you. (TODO: if we use this for layout, we need to read "TITLE" from the servlet's tp, to set it outside of the servlet.) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2302 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
allo	0f750c2ed6	new Templates removed locales from Buildpath git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1391 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	a603b1f5bc	.classpath is not superfluous git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1338 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	ec13ce9cdf	removed superfluous .classpath file git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1335 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
(no author)	55f3232219	Patch for the cookie management. Version 0.1. Start YaCy, go to localhost:8080/CookieTest.html. Play around with cookies. Look into CookieTest.java to see how it works. This behavior will be changed such that httpHeader will be responsible for the cookies in the future git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1332 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
allo	ba96cefe0c	packages for xml/* bugfix for servlets with packages from theli. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1272 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
allo	26bab876db	more del.icio.us Api Bugfix for http in gettitle_p git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1268 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
allo	2e2fa99501	bookmarksManager: -gettitle_p.xml and AJAX to use it -classpath change httpc: -simple wget function git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1267 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
allo	64f5d980c1	changing classpath for htroot/xml git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1266 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	4ff3d219e8	increased delay for cacheScan start and slowed down scan process to provide more time to other tasks git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1210 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	97990299fa	*) Changing lib names after migration to newer versions of PDFBox + jsch Thanks to Hydrox for the advice git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@799 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
allo	508599eeb3	eclipse files from Goligo see http://www.yacy-forum.de/viewtopic.php?t=810 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@484 6c8d7289-2bf4-0310-a012-ef5d649a1542	20 years ago

348 Commits (8303e15419e789cad94b94a1d65e00f9627cd5f1)