yacy_search_server

Commit Graph

Author	SHA1	Message	Date
Michael Peter Christen	9dfc9c95d8	updated slf4j and log4j	12 years ago
Michael Peter Christen	95712fdc8b	update to pdf parser	12 years ago
Michael Peter Christen	a1a4d9aa94	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git Conflicts: source/net/yacy/cora/federate/solr/connector/MirrorSolrConnector.java	12 years ago
Michael Peter Christen	e2c4c3c7d3	migration to solr 4.0.0	12 years ago
Michael Peter Christen	69aa39d664	update to libraries required by solr 4.0.0	12 years ago
sixcooler	9d062873d2	bump to httpclient-4.2.2	12 years ago
sof	5cb244b79b	Merge remote branch 'origin/master'	12 years ago
apfelmaennchen	88b062210c	Added a parser for audio file tags (e.g. ID3 tags for MP3 files) based on the jaudiotagger library. The parser is disabled by default as it needs to store temporary files for non file:// protocols, which might be disliked. For your local MP3-collection it loads nicely Artist, Title, Album etc. from the audio files meta data.	12 years ago
sixcooler	9aa21506be	bump to httpcore-4.2.2 (maintenance release)	12 years ago
Michael Peter Christen	d0015df61c	added lucene memory library which is now necessary as solr has to process more complex queries	12 years ago
Michael Peter Christen	e65cecc419	- updated lucene libraries to 3.6.1 - added lucene-grouping which enables faceted search; try this: http://localhost:8090/solr/select?q=*:*&start=0&rows=3&facet=true&facet.field=host_s	12 years ago
Michael Peter Christen	ff3eaa21b0	added remote search to solr on YaCy peers! - when doing a remote search, node peers are selected for solr queries - the solr query is done concurrently to the standard YaCy rwi search - the solr search result is fed into the same data structure that prepares the rwi search result - the same remote search that is done to several outside peers is done to the local solr index - the search process works now also without any 'old' RWI data using solr	12 years ago
Michael Peter Christen	d39463a85c	added deleteByQuery to solr connectors	12 years ago
Michael Peter Christen	2ccf1dba71	upgrade to solr 3.6.1	12 years ago
Michael Peter Christen	ea49a8aa8c	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	12 years ago
Michael Peter Christen	d988ba50cf	added a very rudimentary, incomplete, non-verified GSA response writer for solr. Try this: http://localhost:8090/gsa/searchresult?q=pdf&site=col1&num=10	12 years ago
cominch	e74d66e28c	augmented browsing: remove htmlparser library	12 years ago
cominch	e2119f4e76	augmented browsing: replace htmlparser by jsoup, which is more stable and reliable	12 years ago
Michael Peter Christen	bf4968d748	source change in classpath	13 years ago
sixcooler	a99ef68422	bump to httpclient-4.2.1	13 years ago
Michael Peter Christen	65f56b1fd4	Merge branch 'master' of ssh://gitorious.org/yacy/rc1 into jetty Conflicts: .classpath build.xml htroot/Status.java source/de/anomic/http/server/HTTPDProxyHandler.java source/net/yacy/yacy.java	13 years ago
Michael Peter Christen	7b53be141f	upgraded to pdfbox 1.7.0 changes in http://www.apache.org/dist/pdfbox/1.7.0/RELEASE-NOTES.txt with many bugfixes, including performance related	13 years ago
Michael Peter Christen	fad3b14813	added jetty libraries, needed for future use as web server and as application server for the solr search interface	13 years ago
Michael Peter Christen	b9d42fd9c8	using com.google.common.io.Files instead of homebrew methods	13 years ago
Michael Peter Christen	1be0025a9c	- added test for EmbeddedSolrConnector - added needed libraries for this test this includes most (all) files needed for an embedded solr	13 years ago
Michael Peter Christen	90b82ce994	using guava for host resolution (non-blocking for ips) and time-out	13 years ago
Michael Peter Christen	3f55dc7c1e	- added solr core and libraries that solr needs (lucene is missing, will follow later) - added embedded solr connector which can connect to solr programmatically (without using a server in between)	13 years ago
Michael Peter Christen	5fc6524ca8	- moved triple store to net.yacy.cora.lod (should be generalized there later) - added abstract add, delete, get methods in the triplestore - added generation of triples after auto-annotation - migrated all MultiProtocolURI objects to DigestURI in the parser since the url hash is needed as subject value in the triples in the triple store	13 years ago
cominch	5d20cd324a	Add Triplestore and RDF query interface Conflicts: build.xml defaults/yacy.init source/net/yacy/interaction/AugmentHtmlStream.java	13 years ago
cominch	b21048892b	augmentedParser add features and integrate external html parser to modify existing web pages Conflicts: addon/YaCy.app/Contents/Info.plist build.xml	13 years ago
sixcooler	56087c1f23	bump to httpclient- httpcore-, httpmime- 4.2	13 years ago
Michael Peter Christen	4d3cc02168	replaced old bzip2 library against better documented commons-compress package from http://commons.apache.org/compress/	13 years ago
Michael Peter Christen	1795a7325b	made HandleSet serializable	13 years ago
Michael Peter Christen	62f2554a01	- fixed build problems (deprecated methods using httpclient 3.1) - removed httpclient 3.1 lib which was used by solrj (solrj now uses httpclient 4)	13 years ago
Michael Peter Christen	248299d10f	updated solrj lib	13 years ago
Michael Peter Christen	f838997126	updated commons io from 2.0.1 to 2.1	13 years ago
Michael Peter Christen	eeb57ae824	updated http client libraries	13 years ago
Michael Peter Christen	ef5192f8c9	using the generic document parser for crawl starts instead of the html parser. This makes it possible that every type of document can be a crawl start point, not only text documents or html documents. Tested this with a pdf document.	13 years ago
Michael Peter Christen	a30b028cc0	updated libraries	13 years ago
Michael Christen	e69afae87e	class path for servlets in eclipse	13 years ago
Al Sutton	8993cac4d8	Initial performance improvements	13 years ago
orbiter	5a7cec59f3	moved ynetSearch to get all files out of htroot/api/util/ git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8042 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
orbiter	65ab067491	migration to solrj 3.4.0 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7952 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
sixcooler	52b477cf6f	bump to httpclient-4.1.2, httpcore-4.1.3 - bugfixrelease git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7876 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
sixcooler	48560a44a9	bump to httpcore-4.1.2: a bugfixrelease git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7853 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	c0d9474b31	update to eclipse class path environment git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7834 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	528b59e078	replaced xerces.jar library that was originally added 2005 with SVN 126 to the libx directory and that was moved to lib in SVN 5781. The new replacement is taken from http://xerces.apache.org, has the version 2.11.0, was inside the file Xerces-J-bin.2.11.0.tar.gz and consists of two files named xercesImpl.jar and xml-apis.jar. The original purpose of that library was to support: - content parsers - optional seed uploader - SOAP API (which will be committed later). Since the SOAP API does not exist any more, the purpose is to support content parsers and an optional seed uploader git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7819 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	77fe69395d	added jempbox-1.5.0.jar which is required by pdfbox-1.5 as stated in http://pdfbox.apache.org/dependencies.html git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7774 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
sixcooler	efcd21e0ed	new httpclient, httpcore (bugfixrelease) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7769 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	761b1c71dc	added latest pdfbox git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7761 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
sixcooler	0abd99621c	correct slip of click in classpath from last commit - I wonder there are 7658'is around apfelmaennchen, please don't take this amiss git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7659 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
apfelmaennchen	a0e4960a4d	YMark: - first attempt for a firefox json bookmark importer - added JSON library json-simple-1.1.jar git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7658 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	19fd13d3bc	Added federated index storage to solr. YaCy supports now the storage to remote solr indexes. More federated storage (and search) methods may follow. The remote index scheme is the same as produced by the SolrCell; see http://wiki.apache.org/solr/ExtractingRequestHandler Because this default scheme is used, the default example scheme can be used as solr configuration. This is also the same scheme that solr uses if documents are imported with apache tika. Federated solr storage is switched off by default. To use this, do the following: - set federated.service.solr.indexing.enabled = true - download solr from http://www.apache.org/dyn/closer.cgi/lucene/solr/ - extract the solr (3.1) package, 'cd example' and start solr with 'java -jar start.jar' - start yacy and then start a crawler. The crawler will fill both the YaCy and the solr index. - to check what's in solr after indexing, open http://localhost:8983/solr/admin/ Until now it is not possible to use the solr index to search with YaCy in that solr index. This functionality is now available for two reasons: 1) to compare the functionality of Solr and YaCy and to compare the search speed 2) to use YaCy as a search appliance for people who need a crawler or other source harvesting methods that YaCy provides (like dublin core reading, wikimedia dump reading, rss feed reader etc) if people still want to use solr instead of YaCy. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7654 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
Florian Richter	351d264a48	* yacy domain handler for jetty * rewrite from / to /index.html	14 years ago
Florian Richter	68ca0fbb2e	* add copyright info * implement basic authentication * update jetty to 7.3.0	14 years ago
sixcooler	9199b9e3c6	also putting jcifs-1.3.15 into classpath (let me build YaCy again :-) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7588 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
Florian Richter	1989ba64c0	* jetty	14 years ago
sixcooler	45dcfa3460	update to httpclient-4.1 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7473 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	ca738ac924	- added a tag cloud to search results (using the topics) - some refactoring of score classes - added default package for new classes add_ymark and delete_ymark git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7251 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
sixcooler	f4357dff03	bump to httpclient-4.0.3 which fixes a number of bugs git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7197 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
f1ori	e670e1ef8e	add charset auto-detection for htmlParser git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7186 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	3552476fbe	terminated migration from apache httpclient-3.1 to 4.1: - remove the library - added two classes from the httpclient-3.1 library as source code to YaCy because these classes were used by the YaCy HTTP Server - modified the added classes ChunkedInputStream and ContentLengthInputStream in such a way that: * there are no more dependencies to httpclient-3.1 * these classes had been simplified to serve only the purpose for the YaCy httpd git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7171 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	f9a27a05e5	migrated to log4j 1.2.16 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7153 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	5c67e6ca49	migrated to latest apache commons fileupload 1.2.2 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7152 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	5fe828fa06	- replaced pdfbox and fontbox version 1.1.0 with 1.2.1 - added some clear statements that shall clear static cache size within the pdfbox library - the pdfbox library contains a memory leak; it is unsafe to run a peer with pdf parser permanently on. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7120 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
lotus	965aa97993	including sbbi upnplib as source again http://www.sbbi.net/site/upnp/index.html renamed package to yacy all options are also named "yacy" instead of "sbbi" git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6986 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
sixcooler	c5c67f0504	start migrating to HttpComponents-Client-4.x see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=2872 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6965 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	b5e190099d	- updated pdfbox and fontbox to 1.1.0 - added license file to sbbi-upnplib git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6946 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	37b8827a7a	- removed the UPnP library sources from sbbi and added the jar library again. The library was included to get support for fedora releases, but after this time the fact that the sbbi cannot be part of fedora should be re-discussed. If this will still not be possible, then we may integrate the sbbi UPnP package using reflection. - cleaned up the code. The new eclipse helios provided new warnings for dead code. This change cleans up most of these warnings git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6945 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	56ff9d5fd4	- extended news size from 512 to 1024 characters - a new news db will be created (news1024.db), the old one (news.db) can be deleted - peers with too large news payload are not ignored any more (they may have been invisible because they had a too large news payload!) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6917 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	fc5efcc05a	enhanced and fixed OAI-PMH import - now importing OAI-PMH server list from two sources - simultaneous import from several servers (even > 2000) - check buttons on OAI-PMH server list to select multiple servers for import start - it is possible to select all servers at once for import - imported XML data is gzipped after import from surrogate reader git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6847 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	24e5faee75	added exif parsing for jpg images git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6745 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	1bbe14d23f	SVN 6716 unfortunately contained parts of the unfinished SMB integration. To fix compile errors the remaining parts of the SMB implementation stub is added with this commit. This adds the jcifs smb library. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6717 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	f5ec7ad077	replaced four old libraries with latest version git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6702 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	1e2c011c98	updated the jsch lib from 0.1.21 to 0.1.42 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6688 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	c2b505ae87	updated bouncy castle libraries git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6687 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	681f4d185f	replaced microsoft office document parser POI 3.5 with latest version 3.6 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6686 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	e9cdddcd0f	updated parser libraries fontbox and pdfbox with latest version of jar files git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6685 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
lotus	945e0ba5a5	allow global search if res. observer disabled index transmission git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6658 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	cde1611919	updated junit git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6428 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	9db928ce53	replaced fontbox 0.7.3 with fontbox 0.8.0 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6414 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
lotus	0975b1b493	update for apache poi library, possibly solves http://forum.yacy-websuche.de/viewtopic.php?p=17736#p17736 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6411 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	72ac5bd80f	refactoring of search process. this is the beginning of some architecture changes that will hopefully bring some more stability, speed and transparency to the search process. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6260 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
f1ori	d515bc11e2	added ooxmlparser git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6256 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
f1ori	67da20647f	* add new odf parser based on sax-xml-parser * remove odf_utils-jar * test metadata in ParserTest git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6231 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	eea4c17ef2	removed rpm parser - no-one used that thing - loading huge rpm files may be a cause of crashes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6223 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
lotus	477807e0e6	* updated jxpath to latest v1.3 * added upnplib as source without packages: jmx remote samples git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6218 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	43c8defd79	enhanced parser with more extension + mime attributes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6214 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	49bbb9bd45	replaced tar library with integrated apache ant tar lib git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6212 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	3d26161dd1	removed unused libraries git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6204 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	50cf80056f	removed jmimemagic library git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6203 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	3f113f38a8	removed unused imports removed unused libs from eclipse class path git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6201 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
lotus	9f083bb6b2	check filetype before loading (no more mp4 loading) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6200 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	302a02cec8	moved all libraries from libx to lib removed libx directory all libraries are now in lib, instead the test libraries in libt which are not part of releases git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6157 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	34be6f82d2	fixed build path for eclipse git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6148 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	d793bb0d76	the mysql lib was not in releases included; moved library from libx to lib git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5987 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	c72a5cf326	added stub for PHPBB3 extraction code using direct access to mySQL git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5979 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	c08f9b36a4	refactoring of wiki parser. This was done to prepare the wiki parser as parser for wikipedia dumps, which will be used for performance test (to omit crawling) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5785 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	4905a17f6a	moved xerces.jar from libx to lib git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5781 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
lotus	4aad461100	added UPnP support YaCy can now automatically forward ports on home routers off by default git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5609 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago


206 Commits (a1ac4c3b76ab0ea01c7a2f2e52721d94bff01717)