yacy_search_server

Commit Graph

Author	SHA1	Message	Date
reger	ac6e198bd1	add unit test for Domains.stripToPort, simplify ipv6 check	8 years ago
luccioman	a0dfbaca6a	FileUtils : added some JavaDocs and unit test cases	8 years ago
reger	395f2e8946	Make ServletRequest implement the standardized HttpServletRequest interface, to make all readily available information from the original ServletRequest available to YaCy servlets (without converting data to internal structures). The implementation of the common interface allows easier integration of YaCy servlets with the servlet standard (e.g. shared login service with the servlet container etc.)	8 years ago
luccioman	7296e3884f	Switched even more URLs to pure relative ones. Thus a YaCy peer can run behind a reverse proxy subfolder without need for the reverse proxy to rewrite HTML links (a CPU costly operation). Tested on Debian Jessie with an apache2 reverse proxy. See related mantis issues http://mantis.tokeek.de/view.php?id=106 and http://mantis.tokeek.de/view.php?id=701	8 years ago
luccioman	731684105a	Improved absolute URLs rendering in OpenSearch desc and RSS feeds. When the peer is behind a reverse proxy providing SSL/TLS encryption, the rendered absolute URLs should start with https when the user browser requested https : added limited support to the X-Forwarded-Proto HTTP header notably provided on Heroku platform. Also added some unit tests.	8 years ago
reger	c9e81d2fa0	fix Column parsing from celldefinition string, without cellwidth def. (outofbound exception)	8 years ago
reger	af39a76bf6	Reduce number of default max. search navigator lines (from 10000) to 100 + make it configurable	8 years ago
reger	20a1b29ed3	add simple test case for ReferenceContainer helpful for debugging calculated ranking parameter	8 years ago
reger	3c7220bc7b	Refactor rwi reference word position and word distance calculation used for rwi ranking. Main changes: - introduce a posintext() to access the stored value. This also reduces mem alloc of position array for WordReferenceRow (index access) - use the positions() array for joined references on multi-word queries if needed (otherwise allow positions() to be null) - adjust assignments and the min() max() and distance() calculation accordingly	8 years ago
luccioman	c3c4a52408	Added more examples in Blacklist JUnit test.	8 years ago
reger	8b74a6bf57	fix min/max calculation of WordReferenceVars.distance() Issue was the calculation in AbstractReference with positions.clear() call, this made distance result always 0 (distance needs min 2 positions) and created concurrency issues. + unit test of changes	8 years ago
luccioman	93ea366778	Updated license header file name	8 years ago
luccioman	4c0be4d5d4	Fixed maven compilation error Removed unit test yacysearchitemTest from default maven Junit tests path, as yacysearchitem class is not in maven build classpath.	8 years ago
luccioman	7717a3d43d	Fixed license headers on files created to improve favicon management.	8 years ago
luccioman	6e1959f469	Merge branch 'master' of https://github.com/yacy/yacy_search_server.git Conflicts: htroot/yacysearchitem.java source/net/yacy/cora/federate/solr/responsewriter/YJsonResponseWriter.java source/net/yacy/search/schema/CollectionConfiguration.java source/net/yacy/server/serverObjects.java	8 years ago
luccioman	7136b1ad60	HTML validation : fixed URL encoding of Pictures link.	8 years ago
luccioman	3ccd89e274	Fixed MultiProtocolURL.resolveBackpath to handle remaining '..' segments	8 years ago
luccioman	f1f4459f88	Added some unit tests for Blacklist.isListed()	8 years ago
reger	e68b00678e	prevent negative score on URIMetadataNode - in the special case where no solr score is supplied. + assert before use & test case	8 years ago
reger	b752bcfecb	adjust date in text detection to ignore some program version strings like "3.1.2.0102" see http://mantis.tokeek.de/view.php?id=650 + expand test case	8 years ago
reger	b017e97421	optimize condenser language detection a little. langdetect probabilities take letter case into account, add words from description and anchors etc. as is. + add it to javadoc	8 years ago
reger	ae3717d087	adjust Tokenizer sentence count to ignore repeated punctuation (like !!!! ) + remove unused sentenceword map (we use only the count) + upd test case for sentence count	8 years ago
reger	474f0476c6	adjust Tokenizer sentence count on trailing text after last recognized sentence + upd test case for rwi multi-word-query (leaving results known to fail untested)	8 years ago
reger	1a79c64495	generalize DateDetection with holiday date rules readily available in icu to make sure current dates are recognized (was fixed to 2014 - 2016) + adjust holiday date parser from pattern.match to pattern.find to deal with leading and trailing text + moved relative date recognition (morgen, tomorrow) to parseline (used by query parser only), as not working and problematic for indexing + add test case for parseline (used by query parser)	8 years ago
reger	32a2e3a22a	have RSSFeed.getChannel return empty message on missing channel element, a) required b) prevent NPE in rss servlets + add test	8 years ago
luccioman	4585a60d7e	Made use of the constant corresponding to the hard-coded value.	8 years ago
luccioman	1bb0b135ac	Avoid duplication of various MS Windows file URLs flavors Fix for mantis 692 (http://mantis.tokeek.de/view.php?id=692)	8 years ago
reger	6f8c3ccea4	improve url hash computation for file path with mixed java & windows file.separator to compute equal hashes (by normalizing path for computation) + expand test case to check mixed java / windows file url notation like e.g. file:///c:/test/file.html vs. file:///c:\test/file.html - relates partially to http://mantis.tokeek.de/view.php?id=692	8 years ago
reger	330768c8a2	fix for solr write.lock after mode change http://mantis.tokeek.de/view.php?id=686 The embedded core holds a lock on the index and must be closed. Earlier commit comment states that core should be closed with solr instance instead of on close of connector. Adjusted the InstanceMirror.close() to take care of closing the embedded instance to release the lock. In 2 routines of fulltext this was already explicitly implemented (disconnectLocalSolr). Now this disconnect is part of the InstanceMirror.close().	8 years ago
reger	11786457b7	add test case for EmbeddedSolrConnector close() for issue http://mantis.tokeek.de/view.php?id=686 (without solving the issue here)	8 years ago
reger	585d2a6441	test case: for NewsPool to check the id modificator (for unique id) and observe the distribution order .. hands on. + add test/DATA to gitignore	8 years ago
reger	ff6589fc0f	test case: simulating multi word query for local rwi index Purpose of the test case is to be able to (controlled) analyse the rwi ranking for multi word searches (with focus on posintext and word-distance ranking)	8 years ago
reger	7f63fc50f3	prepare a IndexSegment test case for RWI index testing + prevent NPE in Segment.clear() on missing embedded solr instance.	8 years ago
reger	272cdd496a	reactivate sentence counter in WordTokenizer for phrasepos ranking, by counting punctuation (delivered as 1 char word) again.	8 years ago
Michael Peter Christen	5e165a8150	removed unused imports	8 years ago
reger	e310ec5f70	fix posInText ranking calculation to score 0 on no position info + fix Word posInText calc in Tokenizer to start with 1 + test case	8 years ago
reger	39dd244693	fix ConcurrentScoreMap.set() calculation of totalCount() + test case	8 years ago
reger	ebde21079a	refactor xlsParser to include Excel file attribute (like author) in parser result doc. Similar to ppt and doc parser, completing a TODO in xlsParser.	8 years ago
reger	5e335b32da	fix Blacklist.contains() matching path pattern to string similar to `5e9e871192` + add proof testcase	8 years ago
reger	f89d4eb51d	fix MultiProtocolURL init (assign of host) for urls with '/' in query part + add to test case	8 years ago
reger	87fcfc6d78	Adjusted hash computation and toNormalform for file:// protocol to deliver same hash same file on Windows filesystem path with forward- and backslash in path. Background see http://mantis.tokeek.de/view.php?id=671 +Test case	8 years ago
reger	7b226afc33	fix HostQueueTest - changed open parameter	8 years ago
luccioman	893a40995a	Merge branch 'master' of https://github.com/yacy/yacy_search_server.git	8 years ago
reger	fcc29c36f0	test case for HostBalancer issue in intranet mode with file:// protocol, 2 hostqueues accessing same cache file concurrently http://mantis.tokeek.de/view.php?id=668 Reason seems to be diff. hosthash key of hostqueues on reopen. Internal queue key and external representation (directoryname currently hostname.port) must be adjusted to fix it (not done yet).	8 years ago
luccioman	6e96c7341a	Merge remote-tracking branch 'origin/master' Conflicts: htroot/Load_MediawikiWiki.java htroot/Load_PHPBB3.java htroot/ViewImage.java	8 years ago
reger	a476d06aec	wiki header code test string add "closing" tag	9 years ago
reger	d4da4805a8	internal wiki code, require header line to start with markup (to allow something like "one=two" as text) + incl. test case	9 years ago
reger	223071337b	Translator now respects word boundaries when identifying the text portion to be translated, to avoid translating partial terms/words (e.g. key="TEST" in sourcetext="this is a myTESTcase for it"). Add check of word boundary before and after sourcetext (incl. taking care of the current practice of delimiting keys by > <) + add test case	9 years ago
reger	a6ba1faa80	introduce a translation edit servlet Translator_p.html YaCy's UI text translation This is the 1st rudimentary approach to support the translation utilities. It allows currently to edit untranslated text and save it in a local translation file in the DATA/LOCALE directory. + refactor Translator (less static's) to leverage on class overrides and support garbage collection for this 1 time routine + adjust TranslatorXliff to check for local translations in DATA/LOCALE, this includes storing manually downloaded translation files in DATA as well (to keep default untouched) + on 1st call of Translator_p a master translation file is generated, checking the supported languages for missing translation text (later this masterfile is planned to be part of the distribution, to harmonize translation key text between the languages) Outlook: the local modifications (possibly as translation fragments instead of complete file) to be shared with maintainer using xliff features.	9 years ago
reger	b74cddc49c	upd to Jetty v9.2.16.v20160414 - exclude unused mime4j - remove unused yacy-cora build	9 years ago
reger	24b0fa2a38	extend snapshot Html2Image.pdf2image to use PDFBox image export capability if no external tool installed (and for Win) Resulting jpg are not always perfect (if graphic included) but imho sufficient.	9 years ago
reger	902e79e261	Introduce a TranslatorXliff which can read/write xliff from/to internal translation map. This eases up suggested initiatives from http://mantis.tokeek.de/view.php?id=649 Allows longer term also to store translation maps for the htroot files in standardized/reusable xliff format ( http://docs.oasis-open.org/xliff/xliff-core/xliff-core.html ). + added test case creating and comparing xliff file with internal custom prop file. (currently the introduced class is not used in core code)	9 years ago
reger	ec24a0c85a	add test case for optimized toTokens()	9 years ago
luc	26f1ead57c	Created ViewFavicon class specialized in favicon viewing. Main image processing is now in ImageViewer, used by both ViewImage and ViewFavicon. Fixed URIMetadataNode.getFavicon to use non-standard icons with no size as fallback.	9 years ago
luc	07222b3e1a	Added favicon url transmission in RWI chunks.	9 years ago
luc	53781299d8	Extracted intranet and filetype related rules from getFaviconURL func	9 years ago
luc	3cc5619d93	Improved HTML icons indexing and rendering in search results. See http://mantis.tokeek.de/view.php?id=629	9 years ago
luc	ef83e34b8a	Merge branch 'master' of https://github.com/yacy/yacy_search_server	9 years ago
reger	84c970eaec	move test classes to test/java (subdirectory as in Maven standard subdir layout) because ViewImage*Test.java breaks test run	9 years ago
luc	cfdbc2b487	Improved URLLicence reliability for use by concurrent non-authorized users. Removed URLLicence generation when unnecessary (authorized users)	9 years ago
luc	571bc55937	Refactoring : use StandardCharsets constants instead of hard-coded charset names.	9 years ago
reger	1af0e9ef74	remove workaround for Solr bug regarding multivalued date fields fixed in 5.4.0 http://issues.apache.org/jira/browse/SOLR-8050	9 years ago
reger	4d2b934487	prevent mailto links getting into parser result document's in/outbound link collection by checking mailto scheme early. - fix upper case mailto protocol assignment - add test case for getProtocol	9 years ago
reger	288acceac3	fix test htmlParserTest, charset parameter + upd maven templating-plugin version	9 years ago
luc	f01d49c37a	Process large or local file images dealing directly with content InputStream.	9 years ago
luc	0de6988604	Added links to more image test suites.	9 years ago
luc	745e97a575	Merge branch 'master' of https://github.com/yacy/yacy_search_server	9 years ago
luc	2895ab552a	Made ViewImagePerfTest extend ViewImageTest to ease automated image render tests	9 years ago
luc	4a03cf06e1	Corrected encoding extension arg parsing	9 years ago
reger	d223cf0ae4	adjust MediaWiki importer geo coordinate calculation - allow lat/long 0.xxx - south / west assignment include test class	9 years ago
luc	8da20718aa	Created a class to test ViewImage rendering against multiple image files.	9 years ago
luc	ec04d27473	Corrected APNG test suite link name.	9 years ago
luc	cbb84ba073	Detailed javadoc.	9 years ago
luc	70111876d2	Filled ViewImageTest.html with all remaining IANA image file formats. Added some links to test suites and specifications.	9 years ago
luc	e093fb228d	Created a generic ViewImage performance render test.	9 years ago
luc	3ad564e2e4	Created a ViewImage rendering performance measurement test.	9 years ago
luc	b3f044072e	Updated table headers and SVG file url for case sensitive OS.	9 years ago
luc	f5746b5490	Added ico and bmp sample pictures	9 years ago
luc	baede48161	Added JPEG 2000 and FITS samples	9 years ago
luc	7c9d80c5d0	Added image formats and informations for each format.	9 years ago
luc	0ae9297ca5	Created a html test page to check ViewImage rendering with different file formats.	9 years ago
reger	bad34804fe	optimize parseInt for <img> tag attribute parsing Performance better as using Numberformat.parse or parseInt(substring())	9 years ago
reger	d2cc11ea8f	fix html parser taking <style> content as text. Noticed some result descriptions contain css content from style tag. Added <style> to tag list to scrape its content not as text + test case included	9 years ago
reger	e594130aec	add test case for partial update - to discover effect on YaCy for update of documents with multivalued date fields (like dates_in_content_dts) current result: loss of fields/information in index document, see EmbeddedSolrConnectorTest.testUdate_withMultivaluedDateField()	9 years ago
reger	d5da9e5a38	fix test method (add throw for URIMetadataNode)	9 years ago
reger	4cf875336c	complete TODO: getFileExtension handle dot in query part + testcase	9 years ago
reger	c37dda8849	fix NPE on MultiProtocolURL on url with parameter value and '=' in getAttribute - added test case for it	10 years ago
reger	71bf95af8a	upd parser calls in test cases	10 years ago
reger	f63fff9008	fix snippet containing number with comma as decimal point http://mantis.tokeek.de/view.php?id=344 to keep it as one word (by altering the split regex) - added snippet test case with number - regex for word split to match multiple split chars	10 years ago
reger	2ef8ffdb60	apply UTF-8 encoding copied from escape()	10 years ago
reger	7120ea42f1	fix for path with char code > 255 (causing index out of bound exception) + test case for it	10 years ago
reger	1d81bd0687	fix url encoding for path see http://mantis.tokeek.de/view.php?id=559 So far we used same escape procedure for all parts of the url (which includes x-www-form-urlencoded for all url components) Added capability to use different encoding rules for the different url components (through specific bitset for each component). (this is inspired by org.apache.http.client and java.net.uri implementation). - Added test case for http://mantis.tokeek.de/view.php?id=559	10 years ago
reger	f94e34058c	fix url (path) %-decoding http://mantis.tokeek.de/view.php?id=519 - add test case for this	10 years ago
reger	16bc267a32	add test case for snippet html encoding check	10 years ago
reger	77851fa53c	fix parser test cases (Vocabulary parameter)	10 years ago
reger	df83fcc4fc	disable optimistic GC assumption in StandardMemoryStrategy After several tests found that eom is not prevented. Major reason in testing was assumption future GC will free avg of last 5 GC. Disabling this check improved eom exceptions. Added simplest testcase used for verification	10 years ago
Michael Peter Christen	68c605d637	replace with CommonPattern.SPACE for split	10 years ago
reger	9edc7308aa	update to metadata-extractor-2.7.0.jar add 2 simple JUnit test cases for jpeg and tif parsing	10 years ago
reger	5d67e165d9	remove redundant null check in ResponseHeader.lastModified added a JUnit testcase for ResponseHeader dates (using age()), adjusted age() to pass all tests	10 years ago
reger	ea633a794c	including small junit test case for WordTokenizer	10 years ago
reger	aa2e15d846	allow url parameter in worktable apicall allow url=wwwl?param=a&param=b (with ?, & encoded) fix: http://mantis.tokeek.de/view.php?id=100 fix double adding of '&' in MultiProtocolURL.escape()	10 years ago
reger	e88537522d	allow single quote " ' " in query see http://mantis.tokeek.de/view.php?id=379 -add QueryGoal test case for this	10 years ago
reger	e50b2b4d04	fix test case MultiProtocolURL.toString() (only allowed on AnchorURL)	10 years ago
reger	b510b182d8	- update Maven pom - add ppt parser test case	10 years ago
Michael Peter Christen	2de159719b	added an option to set 'obey nofollow' for links with rel="nofollow" attribute in the <a> tag for each crawl. This introduces a lot of changes because it extends the usage of the AnchorURL Object type which now also has a different toString method than the underlying DigestURL.toString. It is therefore not advised to use .toString at all for urls, just use toNormalform(false) instead.	10 years ago
reger	1f2eba977d	add test case for Records (used in HostBalancer) - simulating seek error (http://mantis.tokeek.de/view.php?id=411)	11 years ago
reger	e94efd4d7c	update to JUnit 4.11 - fix build.xml -> parserTest error on Windows due to javac encoding	11 years ago
reger	3b77e41f1a	adding test for HostQueue crawl stack - simulating problem with zero length stack file (but not fixing it) - adding test data clean to maven pom	11 years ago
reger	431a5f9c4e	added test case for TextSnippet, removed obsolete/unused parameter and reference to MediaSnippet	11 years ago
reger	7847a93558	fix AbstractParser.singleList not adding null strings - prevents null titles in oo... parser (as detected by ParserTest) - correct ParserTest dc_description check (dc_description allowed to return 0 length array)	11 years ago
reger	0b6db04e40	fix contentscraper img height/width parsing prevent numberformat exception on common "100px" property - include in test case	11 years ago
reger	bb8181b2be	fix: resolve url without path but searchpart e.g. http://yacy.net?q=test was resolved as host "yacy.net?q=test" now host="yacy.net" path="/" fixes http://mantis.tokeek.de/view.php?id=47 added test case for getHost	11 years ago
reger	86f6975edc	exclude html tags in in/outboundlinks_anchortext_txt parsed text - some outboundlinks_anchortext_txt in index contain e.g. <span>text</span> or more tags, remove all tags for text property (inline img tags are still parsed) - added test case for above (to htmlParserTest) - fix solr test case	11 years ago
reger	71649bf22d	add test case htmlParser.parse - getCharset (which fails)	11 years ago
reger	6878c90f99	fix: IPv6 INTRANET_PATTERNS for local ip (see http://bugs.yacy.net/view.php?id=378 ) requiring following ":" for fc and fd prefix and made pattern match case insensitive - add some more ipv6 test cases to MultiProtocolURLTest.java	11 years ago
reger	c8d437b69a	clean up test sources rename to current package names and move to default location	11 years ago
reger	18a56446ce	reorg URL test classes add isLocal test with some IPv6 examples - putting in default location and clean old package names - add some valid RFC IPv6 sample urls (which don't pass the isLocal test)	11 years ago
reger	10a6346056	clean-up test cases to work with current source	11 years ago
reger	b4fdb8c887	cleanup test directory from Jetty 9 implementation samples - current Jetty implementation advances so that it seems not beneficial to keep the code as it makes the test unusable and use of Jetty 9 is due to Java 1.7 dependency not in sight.	11 years ago
reger	71d2655c02	downgrade to Jetty 8 to assure support of JRE 1.6 - introduce a YaCyHttp interface to modularize/separate http server - adjust the Jetty version specific implementation part (in package net.yacy.http) - putting the version specific code in classes starting with Jetty8xxxx - moved existing Jetty9xxx implementation into a test class (to keep the code) - adjust build to the changed jars - make use of the introduced YaCyHttpServer interface in related htroot servlets - adjust other test cases/classes	11 years ago
reger	f7f86d8a5d	update to Jetty 9 jars - include javax.servlet 3.0	11 years ago
reger	fe87fb638a	adjust test/ParserTest to dc_description data type	11 years ago
Roland Haeder	841a28ae76	Added 'final' for all exception blocks as this helps the Java compiler to optimize memory usage Conflicts: source/net/yacy/search/Switchboard.java	11 years ago
reger	97ab5b90e8	- odt & ooxml (office document) parser correction to add content to fulltext index - adjust Junit yacyVersionTest & ParserTest - update yacyVersion.combined2prettyVersion to the default 4-digit minor ver.	12 years ago
reger	4fec35a665	adjust Test case EmbeddedSolrConnector	12 years ago
reger	160ce568b3	move testing SolrServlet.main to test, making include of jetty.jar in distribution and classpath obsolete - move jetty.jar to test library - move SolrServlet.main as is to test, add also a junit test simulating main - add build.xml cleanup for EmbeddedSolrConnectorTest created test/DATA - adjust some test compile errors	12 years ago
orbiter	d2ea250d99	refactoring: - moved many classes from de.anomic to net.yacy - made more sub-packages for search classes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7973 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
orbiter	49e5ca579f	added new configuration property "crawler.embedLinksAsDocuments". If this is switched on (this is default now), then all embedded image, audio and video links from all parsed documents are added to the search index as individual documents. This will increase the search index size dramatically but will also enable us to create a much faster image, audio and video search. If the flag is switched on, the index entries are also stored to a solr index, if this is also enabled. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7931 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
orbiter	cb1f49d0f2	replaced all 'new String' with default encoding (missing) or UTF-8 encoding with a String generation method that uses a pre-defined Charset constant for UTF-8. This avoids a cache-lookup for the Charset object using String hashing of the String 'UTF-8'. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7558 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	cd19d0517e	added dns resolve to HTTPClient POST using a dns cache to prevent the not-thread-safe built-in dns cache inside apache http client from being used git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7513 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
f1ori	01cb3bbaec	* fix patchCharsetEncoding-test (patchCharsetEncoding now returns null on input null) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7465 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
f1ori	fd74bc388c	* fix small bug in sessionid-removal * add testcase for sessionid-removal git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7333 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	3197ca42ed	preparations to move the HTCache into cora: - move the header framework classes to cora - move the ARC caching classes to cora - refactoring of code to call these classes from cora git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7068 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	844f158686	- removed dependencies in header framework: moved http date methods from DateFormatter to HeaderFramework changed logging to log4j - added ftp load access to MultiProtocolURI - ensured termination of RSS feed iteration git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7067 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	b6fb239e74	redesign of parser interface: some file types are containers for several files. These containers had been parsed in such a way that the set of resulting parsed content was merged into one single document before parsing. Using this parser infrastructure it is not possible to parse document containers that contain individual files. An example is a rss file where the rss messages can be treated as individual documents with their own url reference. Another example is a surrogate file which was treated with a special operation outside of the parser infrastructure. This commit introduces a redesigned parser interface and a new abstract parser implementation. The new parser interface has now only one entry point and returns always a set of parsed documents. In case of single documents the parser method returns a set of one documents. To be compliant with the new interface, the zip and tar parser had been also completely redesigned. All parsers are now much more simple and cleaner in its structure. The switchboard operations had been extended to operate with sets of parsed files, not single parsed files. additionally, parsing of jar manifest files had been added. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6955 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	11639aef35	- added new protocol loader for 'file'-type URLs - it is now possible to crawl the local file system with an intranet peer - redesign of URL handling - refactoring: created LGPLed package cora: 'content retrieval api' which may be used externally by other applications without yacy core elements because it has no dependencies to other parts of yacy git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6902 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	b68deb407a	- moved test data from /bin to /test/words - refactoring of stopYACY.sh by introduction of /bin/apicall which is able to call any api file with attached authorization git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6691 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	3528b970d6	- refactoring - added new experimental (not-yet-working) image parser - added new test image git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6431 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	b79f4f062f	refactoring of yacy documents and parsers: they depend now only on the kelondro classes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6426 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
f1ori	34c71b22e8	fix and enable parser unit tests (tested with eclipse) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6419 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	ce8dc575ca	refactoring git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6398 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	bea3b99aff	moved table and util classes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6397 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	ce7924d712	better concurrency for rwi entry parsing during search processing git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6273 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	72ac5bd80f	refactoring of search process. this is the beginning of some architecture changes that will hopefully bring some more stability, speed and transparency to the search process. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6260 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
f1ori	d515bc11e2	added ooxmlparser git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6256 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
f1ori	8c1b02af04	* fix warning in testcase git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6255 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	65b1d51e70	added xml version of windows office test files git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6244 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
f1ori	67da20647f	* add new odf parser based on sax-xml-parser * remove odf_utils-jar * test metadata in ParserTest git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6231 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
f1ori	06557485f5	* added parser unittest! git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6229 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
f1ori	69dfd03985	reactivate unittests * fix old tests * add buildtarget "ant test" git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6228 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago

287 Commits (13e42c2dd27894043892a2600679cbecfba05339)