yacy_search_server

Commit Graph

Author	SHA1	Message	Date
reger	4eeb448eb3	use DigestURL in UrlProxyServlet as parameter to pass requested url to handler. UrlProxyServlet splits url in parts to pass it on as parameter and HeaderFramework constructs a url from param parts. This is obsolete if already created url is used (makes HeaderFramework.getRequestURL obsolete = removed)	8 years ago
reger	66cc0dd173	refactor: move GSA specific date formatter to GSAservlet adjust return type to String for HeaderFrameWork.getSingle	8 years ago
reger	d525967999	refactor: move convertHeaderFromJetty to ProxHandler (only used with active proxy not needed for standard servlets)	8 years ago
luccioman	0806de8fdc	Ensure file input stream are closed in both normal and error cases.	8 years ago
reger	d631fbc019	make more use of the new ServletRequest interface methods getScheme, getServerPort (in QuickCrawlLink_p & YaCyDefaultServlet)	8 years ago
reger	395f2e8946	Make ServletRequest implement the standardized HttpServletRequest interface, to make all readily available information from the original ServletRequest available to YaCy servlets (without converting data to internal structures). The implementation of the common interface allows easier integration of YaCy servlets with the servlet standard (e.g. shared login service with the servlet container etc.)	8 years ago
luccioman	7296e3884f	Switched even more URLs to pure relative ones. Thus a YaCy peer can run behind a reverse proxy subfolder without need for the reverse proxy to rewrite HTML links (a CPU costly operation). Tested on Debian Jessie with an apache2 reverse proxy. See related mantis issues http://mantis.tokeek.de/view.php?id=106 and http://mantis.tokeek.de/view.php?id=701	8 years ago
luccioman	731684105a	Improved absolute URLs rendering in OpenSearch desc and RSS feeds. When the peer is behind a reverse proxy providing SSL/TLS encryption, the rendered absolute URLs should start with https when the user browser requested https : added limited support to the X-Forwarded-Proto HTTP header notably provided on Heroku platform. Also added some unit tests.	8 years ago
luccioman	2da5f339f8	Fixed /News.html and /Wiki.html pages in Search Portal mode (issue #87). Also fixes rendering of these pages when the peer is not online. Re-factored code in common with /opensearchdescription.xml and ConfigPortal.html.	8 years ago
luccioman	f0639d810c	Customized name for Threads still using the default "Thread-n" pattern. This makes threads monitoring easier to read.	8 years ago
luccioman	a588ed7628	Applied image headers customization to the new ViewFavicon servlet.	8 years ago
luccioman	7717a3d43d	Fixed license headers on files created to improve favicon management.	8 years ago
luccioman	6e1959f469	Merge branch 'master' of https://github.com/yacy/yacy_search_server.git Conflicts: htroot/yacysearchitem.java source/net/yacy/cora/federate/solr/responsewriter/YJsonResponseWriter.java source/net/yacy/search/schema/CollectionConfiguration.java source/net/yacy/server/serverObjects.java	8 years ago
luccioman	b3b75b0498	Accessibility : add a customizable alternative text to YaCy log Applied W3C recommendations : https://www.w3.org/TR/html51/semantics-embedded-content.html#a-link-or-button-containing-nothing-but-an-image and https://www.w3.org/TR/html51/semantics-embedded-content.html#logos-insignia-flags-or-emblems	8 years ago
Michael Peter Christen	9934f546bb	added default fl to solr query, removed large texts retrieval and changed snippet to description tag if no other description is available	8 years ago
Michael Peter Christen	a9316ceff6	force browser-caching of favicons from search results	8 years ago
reger	16e8ed3f01	Introduce additional language setting "browser/Browser Language" for UI internationalization. If language is set to "browser" the client/user browser language is used to choose from available translations. Simply: one user's browser speaks English -> YaCy responds in English, another user's browser speaks French -> YaCy responds in French. ! To make a translation/language available you have to activate the language once ! (or manually use the utility class TranslateAll) In ConfigBasic.html available translations are marked green on setting language=Browser The client language is determined by http header Accept-Language (checked in DefaultServlet)	8 years ago
reger	3b47a07dd1	change unused servletProperties entry CONNECTION_PROP_CLIENT_REQUEST_HEADER to use directly HttpServletRequest. This is used to get the http protocol version in HTTPDProxyHandler.fulfillRequestFromWeb() for error response to client. - adjust YaCyProxyServlet and UrlProxyServlet accordingly - use more http_version constants in headerframework and httpdeamon - equalize servlets (3) use of HeaderFramework.CONNECTION_PROP_HOST to HeaderFramework.HOST	8 years ago
Michael Peter Christen	d8504418b6	enhanced browser-caching of static content	8 years ago
luccioman	744c9a2615	Opensearch desc : handle https protocol url with default port (443) This completes modifications made for mantis 669 (http://mantis.tokeek.de/view.php?id=669)	8 years ago
reger	bf6ce33da3	Correct use of _htDocsPath config in YaCyDefaultServlet to use servlet config variable + add some javadoc and remove a not useful static declaration	8 years ago
reger	3811184abd	fix GSA servlet clientIP retrieval	8 years ago
luccioman	6e96c7341a	Merge remote-tracking branch 'origin/master' Conflicts: htroot/Load_MediawikiWiki.java htroot/Load_PHPBB3.java htroot/ViewImage.java	8 years ago
reger	6bf9c55584	adjust Solr select servlet to latest bugfix for boostquery (bq param) to split query into multiple parameters on line separator in input query. e.g. split "crawldepth_i_0^10.0 \n crawldepth_i:1^5.0" but allow "url_file_ext_s:jpg OR url_file_ext_s:png" to remain unsplit	9 years ago
reger	d9adc2c255	load handler for Transparent Proxy on startup only if feature is activated to save the resources and keep handler chain small if the feature is not used. +add a warning message on settingsack_p page to restart on first activation	9 years ago
Michael Peter Christen	f12a900f3e	harmonization of http post of files for one and several files - this had been differently - and wrong for several files. also: base64-encoding for gzipped push files because our data structures currently only supports ASCII POST pushes..	9 years ago
reger	58a959403d	fix mixed logfactory in UrlProxyServlet, Class doesn't use functions of declared ancestor, change to extend on httpservlet	9 years ago
reger	42a7bdb2af	fix SolrSelectServlet authentication to default to true	9 years ago
luc	7aa1a29e33	Return more accurate HTTP status 400 with detail message when some error occurs on ViewImage : - missing required parameters - url licence invalid	9 years ago
luc	0076f9f97d	Updated documented sample url	9 years ago
reger	c91e712178	further refactor using standard java / (one) utf-8 charset variable extending initiative of commit `9a25751850`	9 years ago
luc	571bc55937	Refactoring : use StandardCharsets constants instead of hard-coded charset names.	9 years ago
reger	e9539b1086	reintroduce special handling of file upload multipart/form-data from HTTPDemon.parseMultipart - add filename to parameter fieldname - add filecontent to special parameter fieldname$file (some servlets use this $file parameter) fix for http://mantis.tokeek.de/view.php?id=542	9 years ago
reger	9da1712a31	increase http header EXPIRES for css and images in DefaultServlet to increase browser cache hits for not changing content	9 years ago
reger	d5fd031449	fix reading of ippattern config array in URLProxy	9 years ago
reger	b7e8358645	make use of header.getContentType where possible (mime is normalized afterwards) otherwise use header.mime() differentiated in prev. commit.	9 years ago
luc	2a67d2ba6f	Corrected error management for unsupported image formats, parsing errors, and unavailable resources : avoid logging too many Exceptions as these errors easily occur when searching images.	9 years ago
luc	1565559df8	Refactoring : extracted write InputStream method.	9 years ago
luc	07437986e7	Merge branch 'master' of https://github.com/yacy/yacy_search_server	9 years ago
reger	97cc03ef6a	start using a template for urlproxy header. It is included as iframe /proxmsg/urlproxyheader.html to allow full servlet functionality and flexibility to display some index/meta data in future.	9 years ago
luc	4e673ffc9a	Ensure closing of InputStream even when an exception occurs.	9 years ago
luc	aa70ff4ff6	Corrected images alpha channel rendering	9 years ago
reger	367fe388b9	fix exception throw after sendError in DefaultServlet - reduce debug exception logs in crawler	9 years ago
reger	206883f80d	fix: Preserve protocol in url proxy to connect to http/https. Display warning if https target is viewed over http	9 years ago
sixcooler	e427efbe54	Next try for a fix for upload-connection staying in blocked state. This was caused by reading via GZIP from close-wait connection and caused high cpu- and system-loads. Instead of implementing handling of the RedListener, I now found a time-limited 'get' "really" solving this problem.	10 years ago
sixcooler	ef6a64b2a4	Fix for upload-connection staying in blocked state. This was caused by reading via GZIP from close-wait connection and caused high cpu- and system-loads. Solved by implementing handling of the RedListener.	10 years ago
reger	572cfe8fd4	improve character encoding for urlproxy servlet for none utf-8 pages	10 years ago
reger	6bc8a9b11e	make Quality of Service Servlet available to prioritize requests from local host This assigns priorities to incoming requests. Higher priority numbers are served before lower. (disabled by default in defaults/web.xml, uncomment or copy entry to DATA/Settings/web.xml)	10 years ago
Michael Peter Christen	fed26f33a8	enhanced timezone management for indexed data: to support the new time parser and search functions in YaCy a high precision detection of date and time on the day is necessary. That requires that the time zone of the document content and the time zone of the user, doing a search, is detected. The time zone of the search request is done automatically using the browsers time zone offset which is delivered to the search request automatically and invisible to the user. The time zone for the content of web pages cannot be detected automatically and must be an attribute of crawl starts. The advanced crawl start now provides an input field to set the time zone in minutes as an offset number. All parsers must get a time zone offset passed, so this required the change of the parser java api. A lot of other changes had been made which corrects the wrong handling of dates in YaCy which was to add a correction based on the time zone of the server. Now no correction is added and all dates in YaCy are UTC/GMT time zone, a normalized time zone for all peers.	10 years ago
Michael Peter Christen	f5a032f293	split query into filter query and text query to get better ranking results and faster results	10 years ago
Michael Peter Christen	f9ba50379d	added an expansion option to search facets on result page: - if less or equal of 8 facet options are present, they are shown by default - if more facet options are present, they are hidden To view or hide all facets, just click on the facet header bar	10 years ago
reger	de56d934b2	apply query parameter getQueryFields() to GSA servlet	10 years ago
reger	9b0de2de64	introduce getQueryFields to return default query fields (query parameter QF) calculated from boostfields config, making sure title, description, keywords and content is always searched. - apply change to solrServlet makes sure every remote query uses at least all locally defined boost fields for search - apply to local solr search - simplify select query by using QF defaults	10 years ago
Michael Peter Christen	b5ac29c9a5	added a html field scraper which reads text from html entities of a given css class and extends a given vocabulary with a term consisting with the text content of the html class tag. Additionally, the term is included into the semantic facet of the document. This allows the creation of faceted search to documents without the pre-creation of vocabularies; instead, the vocabulary is created on-the-fly, possibly for use in other crawls. If any of the term scraping for a specific vocabulary is successful on a document, this vocabulary is excluded for auto-annotation on the page. To use this feature, do the following: - create a vocabulary on /Vocabulary_p.html (if not existent) - in /CrawlStartExpert.html you will now see the vocabularies as column in a table. The second column provides text fields where you can name the class of html entities where the literal of the corresponding vocabulary shall be scraped out - when doing a search, you will see the content of the scraped fields in a navigation facet for the given vocabulary	10 years ago
Michael Peter Christen	bee5ee7cce	removed some warnings	10 years ago
Michael Peter Christen	4c9d2a7c64	reverted 'do not show all options' strategy. This is actually confusing new users. Will be activated maybe again if there is an optional tutorial mode which can be switched on for this special purpose of running a tutorial.	10 years ago
reger	4eb89d7f15	revert clickservlet (default was indeed a mistake)	10 years ago
Michael Peter Christen	c9e2128260	please commit new files under your own name, this file was not created by me.	10 years ago
reger	d44d8996d0	Added a “don't store remote search results” option This is intended for peers who want to participate in the P2P network but don't wish to load/fill-up their index with metadata of every received search result. The DHT transfer is not affected by this option (and will work as usual, so that a peer disabling the new store to index switch still receives and holds the metadata according to DHT rules). Downside for the local peer is that search speed will not improve if search terms are only avail. remote or by quick hits in local index. To be able to improve the local index a Click-Servlet option was added additionally. If switched on, all search result links point to this servlet, which forwards the user's browser (by html header) to the desired page and feeds the page to the fulltext-index. The servlet accepts a parameter defining the action to perform (see defaults/web.xml, index, crawl, crawllinks) The option check-boxes are placed in ConfigPortal.html	10 years ago
reger	1f9389396a	fix NPE related 500 (Bad Request) response of UrlProxy on blacklisted urls, by adding parameter HTTPDeamon and removing unused hostAddress lookup code in sendRespondError	10 years ago
Michael Peter Christen	28683530cd	fixes to usage of no-cache: use and recognize also the no-store directive	10 years ago
Michael Peter Christen	c9c700b510	reduction of http requests to YaCy using the correct cache-control, expires and last-modified headers in http response.	10 years ago
Michael Peter Christen	1cfddea578	added (very experimental) Solr response writer for snapshot image results	10 years ago
Michael Peter Christen	3354cd63be	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	10 years ago
reger	63846ddb89	add final SolrQueryRequest.close to SolrServlet	10 years ago
Michael Peter Christen	578ae29f1e	added a note that the servlet is linked using web.xml	10 years ago
reger	6c3f36def1	- fix path to default heuristic.cfg - deprecate unused ProxyServlet	10 years ago
Michael Peter Christen	226aea5914	added a servlet which can create preview images, preview thumbnails and preview pdfs from web pages, i.e.: http://localhost:8090/api/snapshot.png?url=http://yacy.net/en/&width=128&height=128 http://localhost:8090/api/snapshot.jpg?url=http://yacy.net/en/&width=128&height=128 http://localhost:8090/api/snapshot.pdf?url=http://yacy.net/en/ This supports also an on-the-fly generation of the preview documents if the user is an administrator. Otherwise, the servlet fails. To enable this, you must add wkhtmltopdf, imagemagick and (on headless servers) xvfb to your operating system. for detailed instructions, see `97f6089a41`	10 years ago
Michael Peter Christen	c0f9f6ac66	added option to change the navbar-default, i.e. usable for dark skins	10 years ago
reger	fe9f1c594e	fix char encoding parameter in UrlProxy	10 years ago
orbiter	a922b122a3	added a hack to forward solr search results from an external attached solr to the YaCy built-in solr search servlet. It's not complete and not fully correct (there is still a utf8 encoding problem) but it is a way to get easily requests forwarded through YaCy to an external Solr.	10 years ago
Michael Peter Christen	eab0d3e1a9	bugfix for wrong lock display, see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=5321&p=30484#p30484	10 years ago
orbiter	49d4f95faf	bugfix to latest commit	10 years ago
orbiter	68211f8244	enable Crawler_p servlet if a rss feed or a wiki dump import was submitted.	10 years ago
orbiter	b4f2a1db6e	added a unlock icon for all protected pages that are unlocked because the administrator is logged in.	10 years ago
Michael Peter Christen	6e1dc444c3	added a snippet test function in ViewFile: you can now search for a specific word on the document; the servlet returns the snippet in the same way as it would be shown in a search result.	10 years ago
reger	47f201a6b8	Add Solr default query fields (&qf) to select servlet according to the ranking profiles boost fields defined by the peer (if df/qf is not specified in query). This allows for pretty simple queries ( q=word) without the need to know about the specific index configuration. Making sure all relevant fields (as determined by the index owner) are searched, still maintaining the option to query specific fields and does not rely on the duplication of text to text_t. - add author to reset-default boost fields (support results for author nav)	10 years ago
reger	b24572f304	fix GSA filter query assignment - use more parameter constants	10 years ago
reger	665e12f88e	move startup time from old serverCore to switchboard (most used here) to make servercore eventually obsolete.	11 years ago
Michael Peter Christen	c7995d3e2a	increased fixed limit for http POST request sizes to 100MB	11 years ago
Michael Peter Christen	2626c8f6db	using concurrency to do base64 encoding in file POST commands	11 years ago
orbiter	0bbb5040b8	Merge branch 'master' of git@gitorious.org:yacy/rc1.git	11 years ago
orbiter	9d5d86cd03	Added filter query options to the ranking servlet /RankingSolr_p.html. Filter queries are not actually related to ranking, but user requests have pointed out that specific boost queries to move results to the end of the result list are not sufficient. Such boost filters may be better executed as actual filter and therefore such a filter can now be statically applied to every search request. A typical use could be the expression "http_unique_b:true AND www_unique_b:true" which uses the recently introduced fields http_unique_b and www_unique_b which are true only for one of the alternatives with/without http(s) and with/without prefix 'www.' in host names.	11 years ago
Michael Peter Christen	d2151857f1	Added collection navigation: The collection field (can be filled e.g. in Crawl Start) can be used to add categories to YaCy index entries. The usage of that field was restricted to solr searches and post argument filters as implemented in commit `f7571386a3`. This commit extends collections to a full navigation option in the standard YaCy search interface. The field is not active by default but can be activated easily in the /ConfigSearchPage_p.html servlet (just check the 'Collection' facet field). Collections can now be used for (at least) two purposes: - to provide search tenants (through post argument collection) - to provide self-made category navigation Search requests may now have (independently from switched on or off collection facet) a "collection:<collection-name>" modifier attached; furthermore collection names may use disjunctions using the '|' pipe symbol. For example, this is a valid search request: www collection:user|proxy	11 years ago
Michael Peter Christen	f13c8aa7dd	re-implementation of file push option in the context of POST http requests. The internal representation of post-arguments is String and therefore not appropriate for byte[] object as submitted by file pushes. Therefore all pushed files are encoded to base64 _after_ uploading with an http form (you do not need to do that encoding yourself) to hand-over the byte[] as string in the post argument. Servlets which read such files must decode the base64 data to get the original byte[] array. This is considered as a temporary solution for file uploads and a proper implementations would need to consider all attributes as handed over as Objects with either String or byte[] Object instances. This would be a major code change and is not done at this time here now. The feature was submitted to realize a feature as pushed with the next commit.	11 years ago
reger	8e233e2eb4	- fix typo in Message_p (defaultpath) - use more existing switchboardconstants for getproperties - replace deprecated call defaultservlet	11 years ago
orbiter	97983ba89f	fixed generics warnings for generic array instantiation that appeared after migration to Java 7	11 years ago
orbiter	c9f66be20b	move unnecessary nested else out of condition	11 years ago
reger	cd8c0dbda9	assign serialVersionUID for proxyservlet, too.	11 years ago
reger	b300d7f4ce	set serialVersionUID on urlproxyservlet to skip compiler warning - remove commented out code	11 years ago
reger	e9060d31bd	update to Jetty 9 besides adjustments in code it makes the servlet settings in web.xml significant. This applies to solr, gsa and proxy servlet. There is no longer a default setup in code during init (as jetty 9 checks for double definition).	11 years ago
Michael Peter Christen	4e734815e8	enhanced snippets: remove lines which are identical to the title and choose longer versions if possible. Prefer the description part.	11 years ago
reger	d812f80784	add exit proxy link to UrlProxy on proxied pages a link to exit proxy is added to top of page. Link text can be configured in web.xml init-parameter (see default/web.xml). If missing no link is displayed.	11 years ago
reger	d51f9cc863	add custom Jetty errorhandler to provide custom error page footer line - remove redundant mime check in UrlProxyServlet	11 years ago
reger	710054bb37	implement gzip input handling directly in defaultservlet (making reference to legacy httpdemon obsolete)	11 years ago
Michael Peter Christen	734778c0c8	fixed a time-out problem in the default servlet which is also a logging problem because the error log showed the wrong reason (file not found) instead of the actual reason (time-out).	11 years ago
orbiter	41730c8048	better logging in template engine: shows filename of servlets where errors in templates occur	11 years ago
reger	da413af664	move baseurl after parsing orig source in urlproxyservlet to calculate absolute href links for rewrite from unmodified source.	11 years ago
orbiter	b1ba764d81	fix for first start options and added german translation for popup texts	11 years ago
orbiter	429a874222	- added COLS field in GSA response (non-gsa standard by customer request) - updated document link in GSA response writer	11 years ago


191 Commits (15b7461bc7968bcfd8dfc516cf91c3548195e4a3)