yacy_search_server

Commit Graph

Author	SHA1	Message	Date
reger	588c6e96fb	upd version for typeahead.jquery.js in jslicense.html	7 years ago
luccioman	8100c033a2	URL Viewer : apply crawler size limits when adding to local index. This allows large files parsing and preview, while preventing unwanted OutOfMemory errors which are likely to occur when adding to the Solr Index resources larger than configured crawler limits.	7 years ago
reger	e5cff062b5	Clean up redundant but obsolete jquery.rdfquery-core-1.0.js script lib	7 years ago
reger	23bda133d2	Fix css conflict of YMarks.html to make it viewable. yacy-ymarks.css sidebar conflicts with bootstraps sidebar (different overlay settings). Simply renamed it to ymark-sidebar.	7 years ago
reger	a21789d4e7	Fix unresolved pattern in api/share.html by init some display var's	7 years ago
luccioman	bf55f1d6e5	Started support of partial parsing on large streamed resources. Thus enable getpageinfo_p API to return something in a reasonable amount of time on resources over MegaBytes size range. Support added first with the generic XML parser, for other formats regular crawler limits apply as usual.	7 years ago
luccioman	1b3c169a9c	URL Viewer : decode raw text using the response charset when provided, or decode as UTF-8 as previously done.	7 years ago
reger	e6e20dab52	upd to Jetty 9.4.6.v20170531. Modify loginservice to the changes in Jetty, partially based on pull request #101 https://github.com/yacy/yacy_search_server/pull/101 by @automenta	7 years ago
luccioman	e4c730b99f	Updated PerformanceQueues_p.xml API with last related servlet changes	7 years ago
luccioman	dcc56318bb	Made remote search max system load limits configurable from UI. As reported by davide on YaCy forums ( http://forum.yacy-websuche.de/viewtopic.php?f=23&t=6004 ) when the system is on high load, unless reading carefully YaCy configuration file, it could be difficult to understand why remote search results are not fetched.	7 years ago
luccioman	4b72b29ea2	Added an informative title on the crawl start robots.txt status icon	7 years ago
luccioman	d08f31c3a8	Crawl start Ajax request : properly handle eventual XML parsing errors. Otherwise on a malformed getpageinfo_p XML response (from the browser point of view), JavaScript errors were thrown and the ajax status steering wheel remained displayed indefinitely.	7 years ago
luccioman	8da3174867	Ensure lower case conversion consistency with any default locale. Especially for Turkish speaking users using "tr" as their system default locale : strings for technical stuff (URLs, tag names, constants...) must not be lower cased with the default locale, as 'I' doesn't become 'i' like in other locales such as "en", but becomes 'ı'.	7 years ago
luccioman	c41b31dcb3	Cleaned up memory usage page HTML - fixed validation errors - removed deprecated attributes - improved accessibility with richer table semantics (headers and caption elements) and language declaration	8 years ago
luccioman	0487336ec3	Prevent integer overflow in table statistics and use strong typing	8 years ago
luccioman	0f80c978d6	Limit the number of initially previewed links in crawl start pages. This prevents rendering a big and inconvenient scrollbar on resources containing many links. If really needed, preview of all links is still available with a "Show all links" button. Doesn't affect the number of links used once the crawl is effectively started, as the list is then loaded again server-side.	8 years ago
luccioman	32288a8999	Merge branch 'master' of https://github.com/yacy/yacy_search_server	8 years ago
luccioman	e9b4b29f90	Limit scope of some local JavaScript variables.	8 years ago
Michael Peter Christen	369b8e0e0b	added json(p) endpoint for crawl start	8 years ago
luccioman	9dd790087d	Added HT Cache basic statistics (hit rate)	8 years ago
luccioman	28b451a0b3	Made Cache compression level and lock timeout user configurable	8 years ago
Michael Peter Christen	6fe735945d	migrated Solr 5.5 -> Solr 6.6 and from Java 1.7 -> 1.8 Also: now Version 1.921	8 years ago
luccioman	8399275142	Properly close file output streams even on exceptions scenarios.	8 years ago
reger	632354e2ff	Tokenize result entry keywords and add some styling for display	8 years ago
reger	a814f3d885	Introduce keyword query parameter This enables keyword navigator to filter on keywords. Added search page output and layout config for keywords, allowing e.g. in Intranet use to display the keywords. No styling or links applied to the keyword text (but is desirable possibly in combination with bootstrap-tagsinput for future/intranet).	8 years ago
luccioman	cbccf97361	Added JavaDoc to the getpageinfo_p API servlet.	8 years ago
luccioman	bd88fd303e	Deprecated duplicated and internally unused getpageinfo servlet. Redirections set for the transition of any eventual external uses: - /api/getpageinfo.xml to /api/getpageinfo_p.xml - /api/getpageinfo.json to /api/getpageinfo_p.json	8 years ago
luccioman	1be4d32f99	Restored search page default behavior for Tab, Page Up and Down keys. Replaced by shortcuts defined by the HTML "accesskey" attribute which has the advantage to be advertised by screen readers when focusing the corresponding buttons, contrary to custom JavaScript key handlers. Now with Firefox : - "Alt + Shift + n" for next page - "Alt + Shift + p" for previous page Following ARIA recommendation : "keyboard shortcuts enhance, not replace, standard keyboard access." ( see https://www.w3.org/TR/wai-aria-practices/#kbd_shortcuts_behavior_design) Fix for mantis 711 (http://mantis.tokeek.de/view.php?id=711)	8 years ago
luccioman	45346c1be8	Added missing accessibility attributes on search results progress bar.	8 years ago
luccioman	91a06bc669	Annotated search result information separators for screen readers.	8 years ago
luccioman	31ad043bb9	Added user interface feedback on results feeding termination status. Added as an additional icon with title in the search progress bar, to inform about background search feeder threads terminated or still running. While giving a bit more information to users about the p2p search process, this can help choosing whether or not wait a little bit more time before going to the next page, in order to get results from various sources sorted as best as possible (see #91 for a discussion about sorting accuracy and network latency). Other related modifications included : - regular updates to statistics in the progress bar until the background feeders are completely terminated. - removed some uses of unsecure and discouraged JavaScript elements	8 years ago
luccioman	d90b001e1b	Improved previous merge "Show ranking in HTML UI". - added the new setting as configurable in the "Debug/Analysis" settings page. Debug/analysis is its main purpose for now as there is currently no nice and "understandable" ranking score info servlet (see forum discussion http://forum.yacy-websuche.de/viewtopic.php?f=8&t=5884 ) - render in the "Search Page Layout" page preview when enabled - added constants	8 years ago
luccioman	efe1232d90	Merge branch 'html-show-ranking' of https://github.com/JeremyRand/yacy_search_server Conflicts: defaults/yacy.init	8 years ago
luccioman	4564541b3b	Fixed blacklist Regex containing '+' characters rendering. As reported on YaCy forum by shni (http://forum.yacy-websuche.de/viewtopic.php?f=5&t=5970) when a blacklist entry contained both '?' and '+' characters, the '+' chars were wrongly decoded and rendered as spaces.	8 years ago
luccioman	0612a8f4f2	Fixed the previously added link to scheduled dump operations.	8 years ago
luccioman	a87281b498	Added MediaWiki dump import scheduling feature. Checking the last modified date by default to prevent unnecessary long running operations.	8 years ago
luccioman	10c03c6c64	Improved MediaWiki dump import monitoring. When import thread is terminated : - now stop refreshing and stay on the monitoring page to give user a feedback after a long running import - added link to the next monitoring step : results from surrogates reader - added link to new import On the new import page, added a link on the eventual last import report.	8 years ago
luccioman	8d288f5dba	Crawl results page : apply table lines number limit. Take into account the already existing default limit value (especially useful after a long crawl or surrogates import), or a custom one from parameter "count". Added a "Show all" link for convenience.	8 years ago
reger	c77e43a391	Take out mailto collect in internal parsed document As earlier plans to make use of mailto as separate webgraph entity didn't materialize (see http://forum.yacy-websuche.de/viewtopic.php?f=8&t=5726&p=32493&hilit=mailto#p32493) free the unused handling and resources.	8 years ago
reger	bec34d3546	Add url input field as source for WarcImporter allowing to import warc from url without prior download.	8 years ago
reger	d3df8a46c4	fix unresolved_pattern on missing post parameter api/message.html	8 years ago
luccioman	f66438442e	Extended Mediawiki dump import to remote URLs. When using a public HTTP URL in /IndexImportMediawiki_p.html, the remote file now is directly streamed and processed, allowing import of several GB dumps even with a low memory remote peer, and without need to manually download the dump file first.	8 years ago
luccioman	7edddd7b0d	Improved error reports on various wiki dump prerequisites failure cases. Also added some JavaDoc.	8 years ago
luccioman	dfe8d4139b	Used a text input for wiki dump import file selection. Using an HTML "file" input was confusing (as reported by promocore on YaCy forum : http://forum.yacy-websuche.de/viewtopic.php?f=5&t=5965) , and it only worked with MS IE/Edge on a local YaCy peer : - for security reasons some current major browsers such as Firefox or Chrome do not allow to send full file path information when using a file form input - the local file system selection popup doesn't make sense when you want to import a dump on a remote YaCy server	8 years ago
reger	3a71430030	Adjust ConfigSearchPage_p to activated hosts navigator as plugin	8 years ago
reger	7b80189bda	Activate hosts navigator plugin. This includes rwi results in the navigator count. This might be tangentially related to http://mantis.tokeek.de/view.php?id=736 as the example includes a local index search, while rwi results are not counted.	8 years ago
reger	05a1b14b4a	add missing text from ConfigRobotsTxt_p to master.lng and link to Translation Editor to Translation News page.	8 years ago
reger	a39c00a93f	add servlet to list user in UserDB and made user editor available in separate servlet for a quick and easy overview of configured user and selection for edit.	8 years ago
reger	a4498e17c0	fix edit current user form to required post method introduced with `cde237b687`	8 years ago
luccioman	665d087d76	Enforced access controls on a few more administration pages. - ensure use of HTTP POST method when performing server side effect operations - transaction token required to ensure the request has effectively been requested by user interaction	8 years ago
luccioman	0feded21dd	Escaped HTML eventually active content from recorded API call comments.	8 years ago
luccioman	09e72eb0a4	Set Config Portal as a private administration page. Consistently with its required action from submission credentials, and because external unauthenticated users do not need to access these settings.	8 years ago
reger	9339a6a4c5	use css error class for error msg in IndexImportOAIPMH_p.html, adjust to xhtml <p> usage rule	8 years ago
reger	ba339a2a45	Add servlet to import warc file from filesystem IndexImportWarc_p.html. Apply Importer interface to WarcImporter	8 years ago
Michael Peter Christen	1d81b8f102	Merge branch 'master' of git@github.com:yacy/yacy_search_server.git	8 years ago
Michael Peter Christen	69081bce00	added export to elasticsearch. The export dump can easily be imported to elasticsearch using the command curl -XPOST localhost:9200/collection1/yacy/_bulk --data-binary @yacy_dump_XXX.flatjson	8 years ago
luccioman	5b5b9d5d96	URL Viewer : only display the link to metadata when metadata exists	8 years ago
luccioman	39ffa42a3c	Modified RWI settings page radio click event to use HTTP POST	8 years ago
luccioman	af28a07780	Updated API calls recording/replay with recent changes. - enabled HTTP POST calls with Digest HTTP authentication - made API calls compatible with API newly restricted to HTTP POST only with transaction token validation - ensured backward compatibility with older entries recorded as HTTP GET	8 years ago
luccioman	cde237b687	Enforced access controls on some administrative actions. - ensure use of HTTP POST method : HTTP GET should only be used for information retrieval and not to perform server side effect operations (see HTTP standard https://tools.ietf.org/html/rfc7231#section-4.2.1) - a transaction token is now required for these administrative form submissions to ensure the request can not be included in an external site and performed silently/by mistake by the user browser	8 years ago
reger	cbf58d5f0a	Add hint text to default ServerAcess Port Settings page	8 years ago
reger	f05976c017	Display the local search word statistic in alphabetic order	8 years ago
reger	3dd23c178b	Introduce the option to configure a shutdown port. A port value of -1 will disable this option. If set to a value greater than 0, YaCy listens on this port on the local loopback address (127.0.0.1) for a shutdown or restart signal. E.g. connect to http://localhost:8005/shutdown will stop the YaCy server. http://localhost:8005/restart will restart it. This option allows to stop YaCy locally independently from the web frontend (which might be configured for password protected remote access).	8 years ago
reger	a2afb4bae0	add switchboardconstants for server ports config keys	8 years ago
reger	038b9cd98e	update translation for ConfigNetwork_p.html	8 years ago
luccioman	8e77fe3860	Fixed unresolved pattern case in search results progress bar. This is a fix for mantis 715 (http://mantis.tokeek.de/view.php?id=715). A possible path scenario that could lead to this case : - YaCy is running low in memory - a search is requested - before the end of search results rendering, the cleanup job runs and deletes the running search event from the cache because of short memory - then yacysearchitem renders with "-UNRESOLVED_PATTERN-" parameter values passed to the statistics() JavaScript function	8 years ago
luccioman	79df5bb20a	Fixed settingsAck_p.html back link for case where referrer is stripped.	8 years ago
luccioman	5b03feb776	Fixed unresolved pattern case on /yacysearchlatestinfo.json api	8 years ago
luccioman	0173b0bc32	Added an advanced settings page for referrer policy settings. Feedback will be welcome, notably on the descriptive content of this page.	8 years ago
luccioman	cdcd923375	Privacy enhancement : added settings to control referrer policy. HTTP "Referer" header sent by the browser when using YaCy can now be controlled either with the referrer meta tag as a global policy, or only for search result links by adding the attribute rel="noreferrer". To improve privacy with the less possible regressions, the default is set as meta tag with value "origin-when-cross-origin" : internal YaCy links behavior is not affected, but when visiting external websites referrer url is not empty but stripped from query parameters and path. Older browsers, Safari, MS IE and Edge do not support the referrer meta tag, so the standard but less flexible noreferrer link type can also be enabled as an alternative. User-friendly settings page to be implemented.	8 years ago
reger	0aa0dd0b5b	fix delta time calculation in PerformanceSearch_p for the 1. entry (INITIALIZATION displayed absolute date, set delta to 0 for 1. entry)	8 years ago
luccioman	9e626f6b00	Added a hint title for required fields in the Solr Schema editor	8 years ago
reger	7c188ad092	Add extract of queries.log in form of top search word cloud (last 7 days) to AccessTracker_p.html (Network Access -> Local Search Log page). It displays top 20 words of search queries.	8 years ago
luccioman	3475d8c1a9	Merge branch 'master' of https://github.com/yacy/yacy_search_server.git	8 years ago
luccioman	c68a8be2d9	Refactored and enforced Solr mandatory fields for proper operation - Added a new method to check activation of mandatory fields on Collection Configuration commit, consistently with checks previously performed in Switchboard startup and with mandatory fields in the default schema. - Reorganized default schema and CollectionConfiguration enumeration : moved no more mandatory fields in a specific section, and moved fields enabled at startup to the mandatory section. - Marked mandatory fields as required and with stronger font in the IndexSchema_p.html page	8 years ago
reger	334c70c37a	correct fromDate init value on missing param in api/timeline_p servlet revert test modification from last commit in AccessTracker.main	8 years ago
luccioman	6e89d125f2	Added robots.txt support for heuristics federated search. As noticed by @reger24, abusive use of OpenSearch systems should be prevented, especially if allowing to parse and reuse HTML results. robots.txt file is now checked before requesting an external OpenSearch system to respect the host exclusions and eventual crawl-delay value. The check is also performed when trying to add a new OpenSearch URL template through the /ConfigHeuristics_p.html admin page.	8 years ago
reger	a011a97de9	make ConfigParser a protected page, for consistent behavior of locked menu items.	8 years ago
luccioman	54405577aa	Replaced absolute redirection locations by relative ones when possible. This makes integration of YaCy behind a reverse proxy subfolder easier.	8 years ago
luccioman	1857651988	Added a new Debug/Analysis advanced settings subsection. As discussed in PR #93 with @JeremyRand and @reger24 this new advanced settings page includes: - a new setting to control remote Solr responses encoding - some existing debug settings which could not be set through the admin user interface	8 years ago
luccioman	94af489f14	Removed deprecated "localMissCount" prop from yacysearchlatestinfo.json. This property has been deprecated four years ago by commit `d74472f562`. For any active search event id, it was then always filled with "-UNRESOLVED_PATTERN-".	8 years ago
luccioman	f6ad927a14	Refactored the DHT-Trigger section in Performance_p.html page. This is to be more easily understandable and to reflect more accurately the current memory strategies implementations that eventually set the "proper" state not only because DHT reception.	8 years ago
luccioman	b51fd9467c	Fixed unresolved pattern on directory entries in HostBrowser.xml api. As described in mantis 725 (http://mantis.tokeek.de/view.php?id=725) the HostBrowser.xml api directory entries had incorrect count attribute value. This was because the HostBrowser html page and backing template servlet evolved, but modifications were not reported on the xml api.	8 years ago
reger	f6b08443f0	adjust column layout in Settings_Proxy.inc	8 years ago
luccioman	95b63f5126	Added a CSS class for infobox block. This will prevent mistakenly hiding a div element not designed to be an infobox but having a ".info" parent (After having previously added the possibility for a div - and not only a span element - to be an infobox).	8 years ago
luccioman	68afe900d0	Added user-friendly controls over disk usage configuration settings. As mentioned in issue #103, control settings over YaCy disk usage already existed but lacked a user-friendly way to set them. I added it to the Performance_p.html administration page with a little refactoring on the "Resource Observer" fieldset for improved accessibility and HTML standards respect. Also added the possibility to enable/disable the autoregulation function from this page.	8 years ago
luccioman	d0182e4797	Improved Index Browser accessibility with semantically richer html tags. Made use of ol, li, thead, th, tbody, h1 and h2 html tags. Added aria-label attributes to provide alternative textual information previously only conveyed by color cue. Tested behavior with NVDA 2016.4 screen reader.	8 years ago
luccioman	254060bda1	Index Browser : fixed display of "Count colors" for authorized users.	8 years ago
luccioman	c82c8351dd	Fixed Index Browser page HTML validation errors and switched to HTML5. Also removed deprecated HTML attributes uses. Validation performed with Nu Html Checker 17.1.0. Cross browser tested with : - Debian Jessie : Firefox ESR 45.6.0 - MS Windows 10 : Firefox 50.1.0, Chrome 55.0.2883.87, MS Edge	8 years ago
luccioman	826e5bbadd	Documented /HostBrowser.html related configuration settings	8 years ago
luccioman	9adba36754	Fixed "-UNRESOLVED_PATTERN-" admin parameter in "load & index" links.	8 years ago
luccioman	4e2bc644cb	Display Index Browser links requiring auth only when authenticated. In the /HostBrowser.html page "only hosts with urls pending in the crawler", "only with load errors" and "Administration Options" all require administration credentials. But they were displayed even to unauthenticated users, and clicking them did nothing and returned the /HostBrowser.html page empty.	8 years ago
reger	e61ee180a7	Group all proxy settings on System Administration by adding settings of UrlProxyAccss page (moved from deleted AugmentedBrowsing_p), adjust submenu (remove Augmented Browsing) and translation files.	8 years ago
luccioman	39e081ef38	Fixed display of crawler pending URLs counts in HostBrowser.html page. As described in mantis 722 (http://mantis.tokeek.de/view.php?id=722) Also updated some Javadoc.	8 years ago
luccioman	870a5eae26	Removed temporary test main method committed by mistake.	8 years ago
reger	c4017f2e87	upd to commons-compress-1.13.jar hide external icon on forge logo (was also out of position in IE)	8 years ago
luccioman	e048e74072	Added an optional parameter to webstructure.xml api. This new "documentStructure" parameter can be set to false to only get hosts accumulated references on a resource and thus prevent scraping the specified URL and getting citations references. Also set WebStructureGraph constants as final and updated the Javadoc with example api call URLs.	8 years ago
luccioman	17b7c92009	Made sure webstructure.xml API produces valid XML. Host names should not contain XML special characters such as quotation mark, but at this stage the WebGraph may have mistakenly recorded a host name with such characters. What's more the DigestURL constructor does not prevent this. By the way using serverObjects.putXML to encode host names we ensure here the rendered XML is well formed and can be parsed by external tools even if a structure entry is incorrect.	8 years ago
luccioman	d9766ca981	Fixed WatchWebStructure_p.html render to include https URLs. As described in mantis 721 (http://mantis.tokeek.de/view.php?id=721) WatchWebStructure_p.html failed to include in its structure view https and other protocols and ports than default http.	8 years ago
luccioman	ed3dd5e31a	Fixed webstructure.xml API used with a domain name 'about' parameter. As described in mantis 720 (http://mantis.tokeek.de/view.php?id=720), when requesting this API with a domain name instead of a complete URL only HTTP references on default port were listed.	8 years ago

5756 Commits (dd9cb06d250d8bbfc798c23ab8779a92018557f1)