yacy_search_server

Commit Graph

Author	SHA1	Message	Date
reger24	f28d705cd0	update IndexBrowser_p "Add to blacklist" button: add feedback to user on success	3 years ago
reger24	d7d977569b	Optimize IndexBrowser_p - "Add to blacklist" button to better match hosts with www. using the default match substitution (to match www.host.com/ and host.com/)	3 years ago
reger24	6a5f0b3684	Servlet IndexBrowser_p: add button "Add to blacklist", which adds the displayed host to the default blacklist	3 years ago
unknown	a0f6c3be00	Update htroot compare_yacy servlet - removed metager2.de (it is down for me); others also didn't work today but were left unchanged. Added an HTML onerror event to inform the user if the connection was refused.	3 years ago
Michael Christen	1d7f657620	Merge pull request #437 from lifeofguenter/feature/fix-banner-typo Fix typo + remove dead seeds	3 years ago
Burkhard	b943e0fb03	Merge pull request #420 from iansmirlis/IndexBrowser_p In case of reload404, load only failed documents	3 years ago
reger24	8d0f3d4208	Removed the Blacklist_p default=shared for newly created blacklists, issue https://github.com/yacy/yacy_search_server/issues/374 It is nice to be able to import blacklists from other peers, but shared by default is likely not intentionally chosen by the user	3 years ago
Lukas Fülling	e8a00007f6	add setting for public facing port	3 years ago
Andreas	590f39b403	Add Sorting functionality to Crawler Queue Table Allow to sort for count and host	3 years ago
lifeofguenter	870319e769	Fix typo + remove dead seeds	3 years ago
ZeroCool940711	7e765b8483	Improved the Image search page to have bigger thumbnails, use a bigger area for results and a smaller left sidebar.	3 years ago
Michael Peter Christen	6fe905bb82	feature https://github.com/yacy/yacy_search_server/issues/434	3 years ago
Michael Peter Christen	9c38b1254e	proper deletion of loadtime index	3 years ago
Michael Peter Christen	bd3f2483a1	replaced url and date retrieval by only url retrieval. This should prevent the search index from being used to check the freshness of the index entry.	3 years ago
Michael Peter Christen	163ba26d90	replaced check for load time method: instead of loading the solr document, an index only for the last loading time was created. This prevents solr from having to fetch from its index while the index is created. Excessive re-loading of documents while indexing has been shown to produce deadlocks, so this should now be prevented.	3 years ago
sgaebel	cdf901270c	always use HTTPClient by 'try with resources' pattern to free up resources	3 years ago
Michael Christen	f4834e8e31	link fix	3 years ago
Michael Peter Christen	3c86b7b780	attempt to make a Mac Release using gradle. This is almost working with many workarounds: - run rm lib/yacycore.jar - run ./gradlew clean build bundleNative - run ant clean all - run again rm lib/yacycore.jar - run ./fixMacBuild.sh The build is then inside build/mac/YaCy.app. Right now this works so far but it does not have the correct release number inside. Target is to make this work for Windows releases and to embed the JRE entirely.	3 years ago
Michael Peter Christen	63ad8ce6b2	removed ymarks: had not been used for a long time	3 years ago
Michael Peter Christen	ef5a71a592	enhanced crawl start response time for very very large crawl start lists	3 years ago
Michael Peter Christen	1bab4ffe20	calculating the correct size of an export. This can be seen as a fix for https://github.com/yacy/yacy_search_server/issues/343; however, the export was not flawed. It only gives the impression that something is wrong: the export size must be smaller than the index size because the index also contains error documents. Now an information line is presented that shows e.g.: "The local index currently contains 181,319 documents, only 106,887 exportable with status code 200 - the remaining are error documents."	3 years ago
admin	9b7668fa58	reduced memory footprint during indexing/crawling	3 years ago
Ian Smirlis	53518a91ab	In case of reload404, load only failed documents	3 years ago
Michael Peter Christen	e6a87e0426	enhanced crawler: a main problem when crawling is long waiting time caused by crawl-delay values from robots.txt entries. That attribute is not supported by google and interpreted by yandex and bing in different ways. In large crawls there is always one host which blocks the whole crawl with extremely large values. YaCy now still obeys crawl-delay but limits it to 10 seconds. Additionally the blocking logic when loading new robots.txt was analyzed and a deadlock was removed. Furthermore the construction of new queue lists was redesigned and it was ensured that always a large list of different hosts for host-balancing is provided for the loader.	3 years ago
Michael Peter Christen	9182b3dfca	enhanced default value	3 years ago
Michael Peter Christen	3959d43a5c	fixed doku link	3 years ago
Michael Peter Christen	15b7461bc7	removed Xms java memory startup parameter. We will use the default value from now on. This is much better for resource economy and fits better into a container/docker/kubernetes strategy. Furthermore, a small memory footprint is essential for the usage on small devices like RaspberryPi.	3 years ago
Michael Peter Christen	4377bd2b70	fix for wrong crawlName construction	3 years ago
Michael Peter Christen	e81b770f79	enabled crawl starts with very large sets of start urls, e.g. a 10MB large url list with approx 0.5 million start points	3 years ago
Michael Peter Christen	dbd211a1ad	removed/replaced reflection in memory tool	4 years ago
Michael Peter Christen	1cdb21592b	added hazelcast and some modifications to align legacy YaCy with YaCyGrid	4 years ago
sgaebel	7fecd859e5	fixes showing metadata from Searchresult, by removing defType=edismax also removes defType=edismax from IndexBrowser, but still does not show dates	4 years ago
sgaebel	f16cd154f7	removes unused imports and variables	4 years ago
sgaebel	c69c462a15	replaces an expensive getLoadTimeURL() by exists(); refactors urlExists to getHarvestProcess as that is what it does	4 years ago
sgaebel	26223dc25a	replaces getLoadTime() by exists() with a simpler query since solr-8.8.1 getLoadTime() causes a high cpu usage	4 years ago
Michael Peter Christen	b46513f4a1	added stub of rc3assembly style a little bit late but whatever	4 years ago
Michael Peter Christen	3da7628117	use environment variables to overwrite configuration variables. You can e.g. do: export YACY_PORT=8092 && ./startYACY.sh Just prepend "YACY_" to the uppercase version of the configuration variable name and replace all "." with "_".	4 years ago
Michael Peter Christen	13a2e6dc6e	Merge branch 'master' of https://github.com/yacy/yacy_search_server.git	4 years ago
Michael Peter Christen	0ae8ccf657	Make it possible to set an empty password disabling the authentication protocol completely. If you now set an empty password, the http server will not ask to authenticate. This is required for environments where we attach an outside authentication service like keycloak or similar using authentication in an ingress proxy. This change is part of the approach to run YaCy inside of a kubernetes cluster where we do not want individual authentication of peers and want to apply an ingress authentication.	4 years ago
Michael Peter Christen	96592a10cf	added option to set yacy configuration values using environment variables. To use that feature, set an environment variable with prefix "yacy." and suffix identical to the yacy configuration attribute name. Additionally we implemented a way to set a peer name using the setting "network.unit.agent". This can therefore now be used to set a peer name with the java call parameter -Dyacy.network.unit.agent=anonymous The purpose of this feature is the ability to set peer names in mass-deployed kubernetes clusters to the same name, to prevent flooding peer name statistics with auto-deployment-generated names.	4 years ago
Michael Peter Christen	198826c362	added network scanner process to discover all YaCy peers in the intranet this will be used to wire YaCy peers in a kubernetes cluster	4 years ago
Michael Peter Christen	d9602e8325	Implemented a new syntax in the template engine to simplify json APIs Added also an example for one of the existing APIs. The problem is the comma separator between objects which must not be there for the last entry in a sequence. The new syntax adds the separator symbol automatically.	4 years ago
Michael Peter Christen	5a7f12a9c1	allow network scans for non-standard http/https ports	4 years ago
Michael Peter Christen	022fb15670	fix for https://github.com/yacy/yacy_search_server/issues/385	4 years ago
Michael Peter Christen	17672fcbb4	adding hint how to shrink the disk size after an index deletion. implements https://github.com/yacy/yacy_search_server/issues/360	4 years ago
Michael Peter Christen	907f121d0c	do not overwrite PW with random PW	4 years ago
Michael Peter Christen	256fa3d985	new limitation documentation just replaced two by four	4 years ago
Michael Peter Christen	7997836506	fixed lock image	4 years ago
Michael Peter Christen	d0abb0cedb	enabling all crawl profiles in all network modes also: increased default internet crawl speed to 4 urls/s/host	4 years ago
Michael Peter Christen	a9befbba5f	Merge branch 'master' of git@github.com:yacy/yacy_search_server.git	4 years ago
Michael Peter Christen	fed8bd6325	automatically refresh css cache when switching skin and setting of default skin to current skin in selector	4 years ago
Michael Peter Christen	9a5694261a	design update more space	4 years ago
Michael Peter Christen	4ec55289a8	using a lock symbol which looks also good in dark designs	4 years ago
Michael Peter Christen	43a9f4f574	updated solr 6.6.6 -> 7.7.3 dropped GSA support (GSA API is still in YaCy Grid) The 6.6.6 solr index works without migration also with 7.7.3	4 years ago
Michael Peter Christen	c0d9a3e9a7	turned HostBrowser into an admin-only page, now called IndexBrowser. This was required because spiders and bots crawled through this page and created load on the peer without use for the user or the YaCy network.	4 years ago
Michael Peter Christen	d359d521a1	fixed warc importer. The importer tried to import a gzipped file as plain warc. It will now check the file extension and unzip automatically on-the-fly.	4 years ago
Michael Peter Christen	cef5fde343	adding message to UI to make port change transparent	4 years ago
Michael Peter Christen	22841ffbf1	creating a threaddump during every cleanup process to be able to find out what a peer did (not) last time before a crash	4 years ago
Michael Peter Christen	d7b2d82faa	showing MB instead of KB in PerformanceMemory	4 years ago
sgaebel	3431f91db9	removes unused 'unused' tokens	4 years ago
sgaebel	dd9d4b1188	replace org.junit.Assert.assertThat by org.hamcrest.MatcherAssert.assertThat from hamcrest 2.2 to avoid deprecation-warning	4 years ago
sgaebel	df9ea0a42a	removes some warnings: unused imports, params	4 years ago
sgaebel	80785b785e	adds deleting during recrawl	4 years ago
Michael Peter Christen	e0ad8ca9da	replaced json library from JSON.org with libandroid-json-java This fixes https://github.com/yacy/yacy_search_server/issues/347	5 years ago
Michael Peter Christen	6d7dc01670	Merge branch 'master' of git@github.com:yacy/yacy_search_server.git	5 years ago
Michael Peter Christen	0a7bda2a21	removed JSON-evil license line. These classes had been my own creative work. The copyright line had appeared possibly due to a bad copy-paste activity, unaware that the line is a non-free addition.	5 years ago
Michael Christen	57484eb1cc	xss protection	5 years ago
Michael Peter Christen	37827b6788	removed doubles from getpageinfo	5 years ago
Michael Peter Christen	f03e16d3df	enhanced crawl start url check experience: urls are now urlencoded and a check is also performed in case that an url is copied into the url field using copy-paste	5 years ago
Michael Christen	41f9b8517f	Merge branch 'master' of https://github.com/yacy/yacy_search_server.git	5 years ago
Michael Christen	4ccd1ea3c0	new servlet path "p2p" with a test class. Call the class with http://localhost:8090/p2p/seeds.json	5 years ago
Michael Peter Christen	f7c97fd99e	scanner crawl starts wants non-parseable files	5 years ago
Michael Peter Christen	a20b61f5c0	fix for bad json	5 years ago
Michael Peter Christen	d62a8ec542	masking connects	5 years ago
Michael Peter Christen	5eb0033aef	typo	5 years ago
Michael Peter Christen	2c0742fc43	added json version of peer list	5 years ago
Michael Christen	cfa27d2fd5	fixed links	5 years ago
Michael Peter Christen	0bddf2d895	switched url and snippet position	5 years ago
Michael Peter Christen	2999f4b985	Merge branch 'master' of https://github.com/yacy/yacy_search_server.git	5 years ago
Michael Peter Christen	449780f762	enhanced search result design	5 years ago
Michael Christen	cdc7adedc2	added sponsor link	5 years ago
Michael Christen	f2d45ebb87	design updates + added link to new forum	5 years ago
Michael Peter Christen	789670bd8c	design changes - more space	5 years ago
Michael Christen	3a46b07603	fixed many links to old forum, now https://searchlab.eu	5 years ago
luccioman	6b45cd5799	New optional crawl filter on the URL a doc must match to crawl its links For finer control over which parsed documents can trigger an addition of their links to the crawl stack, complementary to the existing crawl depth parameter.	6 years ago
luccioman	d16bc99835	Added "Show Metadata" links to the ViewFile.html links mode To conveniently follow parsed links in the file viewer	6 years ago
luccioman	8c068a9c99	Better HTML text semantics for technical descriptions	6 years ago
luccioman	a5771b1f14	Made SNI extension user configurable without the need for server restart TLS Server Name Indication (SNI) extension activation can now be configured with the new Settings_p.html?page=httpClient administration page. SNI extension is also now enabled by default, as in 2019 the unrecognized_name(112) alert is more properly handled by major web servers TLS implementations, following the RFC 6066 standard. Related YaCy issues : #153 #189 and #272 JDK 1.7 bug : https://bugs.java.com/bugdatabase/view_bug.do?bug_id=7127374 Apache httpd issue : https://bz.apache.org/bugzilla/show_bug.cgi?id=56241 RFC 6066 : https://tools.ietf.org/html/rfc6066#section-3	6 years ago
luccioman	42c8a251c8	Render a relevant message and status on blocked search requests When unauthenticated (or with insufficient rights) client is blocked either because blacklisted or excessive request rate, render an error message and a relevant HTTP status for API requests, instead of an empty response that appears broken.	6 years ago
luccioman	a8316c79da	Allow JS resorting of search results by unauthenticated users. Access rate limitations to this search mode by unauthenticated users are set low by default to prevent unwanted server overload but can be customized through the SearchAccessRate_p.html configuration page. Fixes #291	6 years ago
luccioman	0ab2b49c31	Made /yacysearch access rate limitations user configurable With a new admin page at /SearchAccessRate_p.html in menu Network Access > Local Search > Access Rate Limitations	6 years ago
luccioman	630fa0015a	P2P/Privacy switch buttons support with JavaScript disabled	6 years ago
luccioman	74fd2f30fa	Support for search result switch buttons with JavaScript disabled	6 years ago
luccioman	ebc583cdb2	Properly render the href attribute of the active page button	6 years ago
luccioman	093ea9586c	Properly fill current page number to new server side pagination template When current page is automatically reset to zero because of a new search request.	6 years ago
luccioman	6e9d5f60ad	Server side initial pagination links rendering For better support of the search page usage with JavaScript disabled. Reduces also the number of initial refreshes of the paginations links. When JavaScript is enabled, pagination links are still regularly refreshed until all the search feeds are terminated on server side.	6 years ago
luccioman	4b9cc4746d	Upgraded Bootstrap dependency from v3.3.7 to v3.4.1 Non regressions tested on the following platforms : Linux Debian Stretch : - Firefox 60.5.1esr - Chromium 72.0.3626.96 Windows 10 : - Firefox 65.0.1 - Chrome 72.0.3626.109 - Edge 25.10586.672.0 - IE 11.1540.10586.0 Mac OS : - Safari 11.0	6 years ago
luccioman	c617ea58a0	Render additional embedded audios from links on extended audio search	6 years ago
luccioman	69f1971052	Added basic controls to play all audio results. Not displayed when JavaScript is disabled.	6 years ago
luccioman	9782a98a9c	Added the possibility to customize facets sort type and direction Previously search navigators/facets elements were sorted only by counts. Now from the ConfigSearchPage_p.html admin page, sort direction (ascending/descending) and type (on counts or labels) can be customized independently for each navigator.	6 years ago
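
The naming rule described in commit 3da7628117 (prefix "YACY_", uppercase the configuration key, replace every "." with "_") can be sketched as below. This is only an illustration of the rule as stated in the commit message, not YaCy's actual code; the helper name env_name_for is hypothetical.

```python
def env_name_for(config_key: str) -> str:
    """Hypothetical helper: derive the environment variable name
    for a configuration key, following the rule in the commit message."""
    # Uppercase the key, replace dots with underscores, add the prefix.
    return "YACY_" + config_key.upper().replace(".", "_")


if __name__ == "__main__":
    # "port" is the key used in the commit's own example (YACY_PORT=8092).
    print(env_name_for("port"))
    # "network.unit.agent" is a key named in commit 96592a10cf.
    print(env_name_for("network.unit.agent"))
```

With this rule, the commit's example export YACY_PORT=8092 corresponds to the configuration key "port".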


6082 Commits (fd45ccf76ebb7f18f63ce508e43ff66d118f2cc3)