yacy_search_server

Commit Graph

Author	SHA1	Message	Date
Michael Christen	57484eb1cc	xss protection	5 years ago
luccioman	42c8a251c8	Render a relevant message and status on blocked search requests When unauthenticated (or with insufficient rights) client is blocked either because blacklisted or excessive request rate, render an error message and a relevant HTTP status for API requests, instead of an empty response that appears broken.	6 years ago
luccioman	a8316c79da	Allow JS resorting of search results by unauthenticated users Access rate limitations to this search mode by unauthenticated users are set low by default to prevent unwanted server overload but can be customized through the SearchAccessRate_p.html configuration page Fixes #291	6 years ago
luccioman	0ab2b49c31	Made /yacysearch access rate limitations user configurable With a new admin page at /SearchAccessRate_p.html in menu Network Access > Local Search > Access Rate Limitations	6 years ago
luccioman	9782a98a9c	Added the possibility to customize facets sort type and direction Previously search navigators/facets elements were sorted only by counts. Now from the ConfigSearchPage_p.html admin page, sort direction (ascending/descending) and type (on counts or labels) can be customized independently for each navigator.	6 years ago
luccioman	b726b2b532	Removed unnecessary '+' character URL decoding from search query Manually replacing '+' character or "%20" by a space character in the search query parameter was necessary in YaCy a long time ago to properly decode application/x-www-form-urlencoded format (commit `9842fab6e4` in 2010). Since the introduction of Jetty as the embedded HTTP server (commit `4b77733e59` in 2013), this is no more necessary as Jetty internals already do this for us in org.eclipse.jetty.util.UrlEncoded.decodeUtf8To(). So we can remove now this duplicated decoding as it prevents a proper use of the '+' character in search requests, as reported in issue #216.	6 years ago
luccioman	07e8628853	Added HTML5 embedded audio for results playing on supporting browsers Restricted to authenticated or localhost users only to prevent redistribution license issues.	7 years ago
luccioman	0cdee4e26a	Fixed loss of "meanCount" search param when using facets or page buttons Then on new search queries, no suggestions at all could be displayed.	7 years ago
luccioman	a9dc0874c0	Remove old query terms from search results suggestions links. Especially when old terms were misspelled, suggestions links then provided most of the time empty results.	7 years ago
luccioman	c71b545235	Enable results suggestions (Did you Mean) even when RWI is not enabled. RWI is no more necessary for suggestions processing since commit `c40ba51ca6`. Revealed by a question about spell check from ouahpiti on YaCy forum (http://forum.yacy-websuche.de/viewtopic.php?f=23&t=6084 ).	7 years ago
luccioman	8a4ea1c11e	Added UI switch to control content domain constraint per search request	7 years ago
luccioman	e6907fdab3	Added optional search parameter/setting to control content domain filter Thus allowing to choose at configuration or per search request, whether extending or not results beyond strict content domain filter (image, video, audio or application). Related graphical controls to be added to user interface.	7 years ago
luccioman	f9cba827c0	Made "tld:" modifier case insensitive and IDN compliant. Thus allowing typing internationalized top-level domains with non ASCII characters as tld: modifier.	7 years ago
luccioman	8e732d437c	Enable HTTP Digest authentication for non admin users. Also ensure authentication is not lost by Digest timeout when navigating between index.html and search results page. This way, running searches with extended features on a remote peer or a password protected peer works with a regular user (with "Extended search" rights). When authenticating on the search page with a user without "Extended search" rights, it appears as authenticated, but has just its usual access to the public search features.	7 years ago
luccioman	d0bed78d02	Use the same top nav bar on index.html and search results. Thus eventually including the same optional login link/status in the search start page than in the results page, for the same convenient login without the need to use the Administration section.	7 years ago
luccioman	af198b990b	Added an optional login link/status to the search public top nav bar. Thus allowing a more convenient way (without the need to go to the admin section) to login when searching on your remote or password protected peer and benefit from extended search features such as Heuristics, Bookmarking or JavaScript resorting. Can be disabled using the ConfigSearchPage_p.html.	7 years ago
luccioman	27ab733685	Ensure private search features are not lost on Digest auth timeout This is a fix for mantis 766 ( http://mantis.tokeek.de/view.php?id=766 ) Since the upgrade to Digest authentication, access to protected search features was indeed disabled once the Digest nonce timed out. After Digest auth timeout the browser no longer sent authentication information and as the search results page is not private, protected features were simply hidden without asking browser again for authentication. Adding a supplementary parameter when accessing the search results as authenticated fixes this.	7 years ago
luccioman	ef8aea7f8d	Made the dates navigator max elements number user configurable. Also used object properties on QueryParams instances, rather than using mutable class (static) properties.	7 years ago
luccioman	9049a926a5	Restrict JS results resorting to authenticated users. Until a more efficient DOM refresh model needing less XHR requests per search is implemented.	7 years ago
luccioman	d00a35576c	Apply JS resort only when currently relevant : p2p text search	7 years ago
luccioman	9e86d183b8	Disable manual search results resorting when resorting is done with JS Also added a constant for the js resorting setting key.	7 years ago
JeremyRand	d37df75afa	(WIP) Optionally sort HTML search items via Javascript. TODO: Expose a GUI setting for this.	7 years ago
luccioman	a28428047a	Fixed count of filtered results from local solr. Was inadequately modified in my previous related commits (making next pages buttons unavailable in Search portal mode), as SearchEvent.local_solr_available did not count the total filtered results but only the ones within the currently fetched result page(s).	7 years ago
luccioman	30c2f50e0b	Use final results counts in progress bar detailed statistics. Using unfiltered detailed counts (local and remote entries found before doubles detection and before applying query modifiers) was confusing and inconsistent with the total count. It could let think more results are to come in the next pages, without understanding why they are not displayed.	7 years ago
luccioman	a1a0515312	Added a button to manually refresh sorting of p2p search results. As a server-side oriented alternative to the JavaScript realtime resorting feature proposed in PR #104. The goal is the same as in this PR : having the possibility to compensate the network latency of various peers results fetching and obtain once possible a consistently ranked result set.	7 years ago
luccioman	870a5eae26	Removed temporary test main method committed by mistake.	8 years ago
reger	396ed3c769	On negative result vote also delete document from fulltext index (not only from dht)	8 years ago
luccioman	c25e48e969	Enabled displaying results after 14th page for local search queries. Fixes issue #90 for local queries only: Stealth mode, Portal mode or Intranet mode. For P2p mode, the issue would probably be difficult to solve with reasonable performance. This is still to dig. Also switched some InterruptedException catch log messages to warn level as this is normal behavior when shutting down a peer. Fixed yacysearch buttons navbar behavior to deal correctly with total results count or offset over 1000. Also improved the buttons navbar to be able to navigate over 10th page for local queries.	8 years ago
reger	395f2e8946	Make ServletRequest implement the standardized HttpServletRequest interface, to make all readily available information from the original ServletRequest available to YaCy servlets (without converting data to internal structures). The implementation of the common interface allows easier integration of YaCy servlets with the servlet standard (e.g. shared login service with the servlet container etc.)	8 years ago
luccioman	6a0b218ae5	Absolute URLs in yacysearch.* : ensure no downgrade from https to http Also removed unnecessary use of deprecated Seed.getIP().	8 years ago
luccioman	60df09fff9	Fixed some HTML validation errors : Illegal character in query Now encode space characters in URLs query part.	8 years ago
reger	7bac756720	prevent dealing with -UNRESOLVED_PATTERN- eventID parameter in html includes on first landing on search page	8 years ago
reger	8c9684cc45	optimize surftip data load, double load (index, loader) not necessary, getMetadata already sufficient + lng file adjustments	9 years ago
reger	4765e374e6	altered calc. of search result items per page to display taking the existing limits into account but make it consistent with search option screen for admin and public user changes: - configured default number of items per page (ConfigPortal_p.html) is used as is (no hardcoded limit) - otherwise requests are limited to 100 results per page ( = search option, index.html) (this basically is the major change, inc. limit from 20 to 100 for public user) P.S. - the older grant of more (1000), if no online snippet calculation, is kept (for the time being) see http://mantis.tokeek.de/view.php?id=627	9 years ago
reger	abd8ecb503	remove contentdom depending override of search result items per page initially introduced `e4570bffaf (diff-ae6c130fc11088c830b00ed9256ab56b)` (as one part of unexpected difference in actual vs requested results, partial bugfix for http://mantis.tokeek.de/view.php?id=627 )	9 years ago
reger	e8256bb3b1	remove blekko from opensearch config (not available) see https://blekko.com/ http://searchengineland.com/goodbye-blekko-search-engine-joins-ibms-watson-team-217633	9 years ago
reger	28b8bc290a	fix use of NETWORK_SEARCHVERIFY for rwi verification was not used to set the searchevent parameter (done in SearchEventCache.getEvent) - remove unused corresponding QueryParams.filterfailurls param.	9 years ago
reger	020630efd8	remove unused network scanner parameter from queryparameter Search event is not using networkscanner (removed filterscannerfail param always init to false)	9 years ago
reger	a60b1fb6c2	differentiate api call getLocalPort() from getConfigInt()	9 years ago
Michael Peter Christen	df3314ac1a	added a new facet type based on a probabilistic classifier using bayesian filters. This can be used to classify documents during indexing-time using a pre-defined bayesian filter. New wordings: - a context is a class where different categories are possible. The context name is equal to a facet name. - a category is a facet type within a facet navigation. Each context must have several categories, at least one custom name (things you want to discover) and one with the exact name "negative". To use this, you must do: - for each context, you must create a directory within DATA/CLASSIFICATION with the name of the context (the facet name) - within each context directory, you must create text files with one document each per line for every category. One of these categories MUST have the name 'negative.txt'. Then, each new document is classified to match within one of the given categories for each context.	9 years ago
Michael Peter Christen	dbbad23e12	removed warnings	9 years ago
Michael Peter Christen	1fec7fb3c1	suppress access to solr when doing search suggestions in case that the index has more than two million documents. This protects the index from being flooded with search requests that cannot be resolved before the real search query has to be computed.	10 years ago
reger	b47267b79c	precaution against NPE on createorgetBookmark on search result	10 years ago
reger	8a5b8f8789	on bookmarking of search result, remember orig. query in separate bookmark property (instead of using the description field) - adjust display and autosearch - don't overwrite existing bookmark but combine info	10 years ago
Michael Peter Christen	fed26f33a8	enhanced timezone management for indexed data: to support the new time parser and search functions in YaCy a high precision detection of date and time on the day is necessary. That requires that the time zone of the document content and the time zone of the user, doing a search, is detected. The time zone of the search request is done automatically using the browsers time zone offset which is delivered to the search request automatically and invisible to the user. The time zone for the content of web pages cannot be detected automatically and must be an attribute of crawl starts. The advanced crawl start now provides an input field to set the time zone in minutes as an offset number. All parsers must get a time zone offset passed, so this required the change of the parser java api. A lot of other changes had been made which correct the wrong handling of dates in YaCy which was to add a correction based on the time zone of the server. Now no correction is added and all dates in YaCy are UTC/GMT time zone, a normalized time zone for all peers.	10 years ago
Michael Peter Christen	6578ff3ddb	enhanced suggest function	10 years ago
Michael Peter Christen	efbc9a3561	introducing a new getConfig method which parses comma-separated lists from setting fields; refactoring for all places where such lists are parsed	10 years ago
Michael Peter Christen	69eacdf4eb	applying precompiled CommonPattern.COMMA.split to all places where split(",") was used	10 years ago
Michael Peter Christen	3d717b749a	fix for urlmaskfilter	10 years ago
reger	24f68a4eb7	refactor opensearch heuristic introduce FederateSearchManager handling search heuristic to external systems via specific FederateSearchConnectors, which provide the query() functionality, the translation to YaCy schema .toYaCySchema() and the search() routine to deliver results to searchevents, which is generally implemented in Abstract connector. The manager enforces now a min 15s delay between calls to external systems. Besides the OpensearchConnector a SolrFederateSearchConnector is available. It uses an additional config file for fieldname translation. default heuristicopensearch.conf: - openbdb.com removed - seems to no longer deliver results - config via solrconnector to datacite.org added (large technical library archive)	10 years ago


596 Commits (80785b785e9db4ff0464cd370c6994a381fa59b0)