yacy_search_server

Commit Graph

Author	SHA1	Message	Date
Michael Peter Christen	1092e798a5	fixed double content postprocessing	11 years ago
Michael Peter Christen	aee5b108e5	added linkScraperParser, a parser which ignores the text like the generic parser but extracts links like the htmlParser. This should be used for ASCII documents without known text format annotation like source code files or json documents. Probably also good for xml files without known schema.	11 years ago
Michael Peter Christen	f384fd624b	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	11 years ago
reger	2b8cc5832c	fix seek error for 0 file size records file by adding an extra check for file size = 0 in cleanlast() - (http://mantis.tokeek.de/view.php?id=411)	11 years ago
reger	1f2eba977d	add test case for Records (used in HostBalancer) - simulating seek error (http://mantis.tokeek.de/view.php?id=411)	11 years ago
reger	2ba394333f	fix Crawler HostQueue release of stackfile - close stackfile inputstream at end of ChunkIterator This should solve startup delay while unfinished crawl jobs exist (maybe also too many open file situation)	11 years ago
reger	40133ba2d0	fix NPE in Condenser, discovered by calling IndexControlRWI, "Word Deletion" with "for every resolvable and deleted URL reference"	11 years ago
reger	e94efd4d7c	update to JUnit 4.11 - fix build.xml -> parserTest error on Windows due to javac encoding	11 years ago
reger	3b77e41f1a	adding test for HostQueue crawl stack - simulating problem with zero length stack file (but not fixing it) - adding test data clean to maven pom	11 years ago
reger	ba5a59a28d	make search result also avail. as atom feed via /yacysearch.atom - fix logo in rss feed	11 years ago
orbiter	59160984cc	timeline performance update	11 years ago
orbiter	54bea96e67	Merge branch 'master' of git@gitorious.org:yacy/rc1.git	11 years ago
Michael Peter Christen	15b2fad6a2	reverted latest change for reindexing because that actually works only for internal Solr indexes. This is mainly caused by the fact that an external Solr may also be a SolrCloud, which does not support LukeRequests, which are needed to request the old Schema.	11 years ago
Michael Peter Christen	841cc77391	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	11 years ago
Michael Peter Christen	e09218129c	remove check for local solr. This check was made during a time when Solr was optional and another alternative metadata store was available. Since that store is now removed, Solr is always available (internally or externally)	11 years ago
orbiter	2073e69034	fix for long periods in timeline	11 years ago
reger	1f94df29e7	fix NPE in solr rss where snippet contains only the title text and adjusted xslt, for solr snippets (&hl=true) to decode the xml encoded html <b> tag by adding disable-output-escaping (still open item description may be double as dc: tag and rss.description tag)	11 years ago
Michael Peter Christen	09dcdb9b19	update to solr 4.9.0	11 years ago
Michael Peter Christen	282b53db42	update of commons-io and slf4j-api (as preparation for Solr 4.9.0)	11 years ago
Michael Peter Christen	1cd4b2e8be	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	11 years ago
Michael Peter Christen	8c52f0651b	refactoring of AccessTracker events & timeline fix	11 years ago
reger	431a5f9c4e	added test case for TextSnippet, removed obsolete/unused parameter and reference to MediaSnippet	11 years ago
Michael Peter Christen	5b94a257ce	no timeout for large reference collections	11 years ago
Michael Peter Christen	f5b817bac4	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	11 years ago
reger	cb2c17d236	extract author and keywords in .doc and .ppt parser	11 years ago
reger	a5707cd2eb	enable proper Author navigator - author facet is based on omitted author_sxt field - adjust to make author nav available on exist of author field but keep using author_sxt to construct the facet (why!?) - add check for querymodifier author in searchevent	11 years ago
Michael Peter Christen	1b279d7a7e	fixed external link	11 years ago
Michael Peter Christen	74206a10c7	refactoring	11 years ago
orbiter	fec673c9d1	Merge branch 'master' of git@gitorious.org:yacy/rc1.git	11 years ago
orbiter	4a66af716d	added apkParser stub (work in progress)	11 years ago
orbiter	c59da9fe7a	added access tracker log reader stub	11 years ago
reger	2d67f29244	adjust mergeDocument after parsing to - preserve charset and languages - fix merge of author	11 years ago
Michael Peter Christen	0d29b972cc	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	11 years ago
Michael Peter Christen	36e623d8bf	enhanced metadata enrichment for media file type search: - Web servers may now deliver YaCy-specific http header fields with a title and keywords. The new http header fields are: X-YaCy-Media-Title - to be used for media (image, audio, video) titles X-YaCy-Media-Keywords - to be used for media (image, audio, video) keywords - both fields are written to document fields title and keywords and are searched also during image search. - to make the usage of arbitrary http header fields (including these new fields) possible in the /api/push_p.json servlet, a new POST argument is also introduced to push http header fields. The new POST attribute is named "responseHeader-X" (where X is the counter). It is allowed to use this attribute as multi-attribute several times, each can be filled with a http header line. - see /api/push_p.html for examples	11 years ago
Michael Peter Christen	49886fab08	enhanced debugging	11 years ago
Michael Peter Christen	b893c42a0f	bugfix for image search	11 years ago
Michael Peter Christen	c7995d3e2a	increased fixed limit for http POST request sizes to 100MB	11 years ago
reger	7847a93558	fix AbstractParser.singleList not adding null strings - prevents null titles in oo... parser (as detected by ParserTest) - correct ParserTest dc_description check (dc_description allowed to return 0 length array)	11 years ago
Michael Peter Christen	8acae852a0	write <em>-tagged texts also into the bold_txt field	11 years ago
reger	a88ea14e09	harmonize use of style for "delete" button - apply the mostly used btn-danger class	11 years ago
sixcooler	66c784c552	bump to httpclient-4.3.4	11 years ago
reger	b9f6acee23	update to Jetty 9.2.1	11 years ago
reger	90c4576361	add a link to recrawl index entry to metadata html page - to allow manually renew index content for this url (e.g. in case it is a remote search result with metadata only) - use simply a QuickCrawlLink_p javascript snippet (minimalistic 1st solution)	11 years ago
Michael Peter Christen	8fd72b5e8b	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	11 years ago
Michael Peter Christen	81d0f01a6f	added 'synchronous' and 'commit' flags in push api	11 years ago
Michael Peter Christen	2626c8f6db	using concurrency to do base64 encoding in file POST commands	11 years ago
Michael Peter Christen	e132689818	fixed and enhanced Base64 (en)coder (again)	11 years ago
Michael Peter Christen	2415e3db43	enhanced ASCII byte[] -> String conversion	11 years ago
reger	5043eff33a	move page navigation below results (image search): force page navigation to be displayed below results in image search for any number of displayed images instead of being displayed to the right of the last image.	11 years ago
Michael Peter Christen	4751ed974f	enhanced base64 encoding	11 years ago
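
Commit 36e623d8bf above describes a repeatable "responseHeader-X" POST attribute for /api/push_p.json, where each occurrence carries one http header line. The following is only a minimal sketch of how such form fields might be assembled. It assumes a header line has the form "Name: value" and that keywords are passed as one ready-made string. The helper name build_header_fields and the sample values are hypothetical. The other arguments the servlet expects are not listed in the commit message and are left out here; /api/push_p.html has the actual examples.

```python
from urllib.parse import urlencode


def build_header_fields(counter, title, keywords):
    """Return form fields carrying two media header lines for one pushed document.

    Hypothetical helper: the attribute name "responseHeader-X" and the two
    header names come from the commit message; the "Name: value" line
    format is an assumption.
    """
    name = f"responseHeader-{counter}"
    # The attribute is a multi-attribute, so the same name is used twice,
    # once per header line.
    return [
        (name, f"X-YaCy-Media-Title: {title}"),
        (name, f"X-YaCy-Media-Keywords: {keywords}"),
    ]


# Sample values are made up for illustration.
fields = build_header_fields(0, "Sunset over the harbour", "sunset harbour boats")

# urlencode accepts a list of pairs, so repeated field names are preserved.
body = urlencode(fields)
print(body)
```

The fields are kept as a list of pairs rather than a dict because a dict cannot hold the same attribute name twice.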


11104 Commits (a65df4ce7ee4a4030e6bfc0da344d09d00c7e0c1) All Branches