yacy_search_server

Commit Graph

Author	SHA1	Message	Date
Michael Peter Christen	046f5a03cb	one more SolrIndexSearcher bugfix	11 years ago
sixcooler	78c01b3eff	fix for 'AlreadyClosedException: this IndexReader is closed'	11 years ago
Michael Peter Christen	1b5e3d523a	better control over close-state of remote solr connections	11 years ago
Michael Peter Christen	1a364572a5	fix for "org.apache.solr.core.SolrCore Too many close [count:-1] on org.apache.solr.core.SolrCore@51af7c57" -error	11 years ago
Michael Peter Christen	69391e5d9e	changed strategy to test existence of documents in Solr: using the update time. The reason for that is a better caching for the crawler double-check, which needs the update time for crawler steering.	11 years ago
Michael Peter Christen	790f103f32	delete fail-docs during postprocessing to prevent that they will appear again and stay in postprocessing forever.	11 years ago
Michael Peter Christen	ff656ce860	explicit call to optimize to add an expungeDeleted flag	11 years ago
Michael Peter Christen	9eb668e951	enhanced the resource observer. The resource observer is now able to recognize free disk space AND available space for YaCy. The amount of space assigned to YaCy is defined in new settings in the configuration file. Furthermore, there is now a cleanup process which deletes files if autodelete is activated. Autodelete is now ON BY DEFAULT if the disk space is low, which means that YaCy starts to delete documents when the disk is full!	11 years ago
Michael Peter Christen	fbee98c06f	fixed shortcut self-reference bug	11 years ago
Michael Peter Christen	e7a29a2851	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	11 years ago
Michael Peter Christen	bf97e38b83	removed clearURLIndex, which is a stub remaining from the old metadata database and not needed any more	11 years ago
orbiter	14764632b5	clear solr caches in case that an exception occurs. The reason behind this hack is the occurrence of Exceptions like: W 2014/02/11 18:51:33 ConcurrentLog GC overhead limit exceeded java.io.IOException: GC overhead limit exceeded at net.yacy.cora.federate.solr.connector.AbstractSolrConnector.getDocumentById(AbstractSolrConnector.java:334) at net.yacy.cora.federate.solr.connector.MirrorSolrConnector.getDocumentById(MirrorSolrConnector.java:173) at net.yacy.cora.federate.solr.connector.ConcurrentUpdateSolrConnector.getDocumentById(ConcurrentUpdateSolrConnector.java:415) at net.yacy.search.index.Fulltext.getMetadata(Fulltext.java:331) at net.yacy.search.index.Fulltext.getMetadata(Fulltext.java:317) at net.yacy.search.query.SearchEvent.pullOneRWI(SearchEvent.java:1024) at net.yacy.search.query.SearchEvent.pullOneFilteredFromRWI(SearchEvent.java:1047) at net.yacy.search.query.SearchEvent$3.run(SearchEvent.java:1263) Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded at java.util.Arrays.copyOfRange(Arrays.java:3077) at java.lang.StringCoding.decode(StringCoding.java:196) at java.lang.String.<init>(String.java:491) at java.lang.String.<init>(String.java:547) at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.readField(CompressingStoredFieldsReader.java:187) at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:351) at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:276) at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110) at org.apache.lucene.index.IndexReader.document(IndexReader.java:436) at org.apache.solr.search.SolrIndexSearcher.doc(SolrIndexSearcher.java:657) at net.yacy.cora.federate.solr.connector.EmbeddedSolrConnector.SolrQueryResponse2SolrDocumentList(EmbeddedSolrConnector.java:230) at net.yacy.cora.federate.solr.connector.EmbeddedSolrConnector.getDocumentListByParams(EmbeddedSolrConnector.java:320) at net.yacy.cora.federate.solr.connector.AbstractSolrConnector.getDocumentById(AbstractSolrConnector.java:330) ... 7 more This problem was analysed with the Eclipse Memory Analyser after a heap dump, where the following problem was reported as the main Problem Suspect: One instance of "org.apache.solr.util.ConcurrentLRUCache" loaded by "sun.misc.Launcher$AppClassLoader @ 0x42e940a0" occupies 902.898.256 (61,80%) bytes. The memory is accumulated in one instance of "java.util.concurrent.ConcurrentHashMap$Segment[]" loaded by "<system class loader>". This memory is part of the result cache of Solr. Flushing this cache appears to be the most appropriate solution to that problem.	11 years ago
Michael Peter Christen	bc28247089	Added methods in resource observer to calculate the available and the occupied disc space. These values are also shown on the status page. The disc space calculation shall be used for a disk-limitation of the search index.	11 years ago
Michael Peter Christen	0dda979801	adopted network image drawing to increased number of peers	11 years ago
Michael Peter Christen	ca8b100f96	run the cleanup process even when load is high, do postprocessing even if load > 1 (but < 2) but only if there is enough memory (now: 0.5 GB RAM available). The memory amount of the postprocessing is the cause that systems block because they run into a frequent-GC chain which almost locks the peer. If running with enough memory, the postprocessing is fast and not damaging to the system. Because the required RAM of 0.5 GB is never available in default setting, the postprocessing will not run if the peer is not reconfigured to use more memory.	11 years ago
Michael Peter Christen	195e5868d3	catch solr close exceptions	11 years ago
Michael Peter Christen	751c128544	extra sleep for remote searches enhances search results because there is more time for more remote peers to contribute on the first result page	11 years ago
Michael Peter Christen	0cabcbbe83	more efficient wordcount	11 years ago
Michael Peter Christen	3d474a843e	added memory protection for postprocessing	11 years ago
Michael Peter Christen	412d55523c	enhanced memory protection and OOM exception handling in Solr connector	11 years ago
Michael Peter Christen	d9858e1b8a	removed warnings and superfluous logging	11 years ago
Michael Peter Christen	acc8d7faa7	fixed setting of shortMemoryStatus in MemoryControl	11 years ago
Michael Peter Christen	94245ce0a8	fixed "Size in KBytes" calculation in PerformanceQueues_p.html, see http://bugs.yacy.net/view.php?id=362	11 years ago
Michael Peter Christen	726e8c3ad5	removed unused classes and servlets	11 years ago
Michael Peter Christen	6e59ca4ebf	removed jena library and all code that depended on jena. When jena was introduced, it was also used for search facets. The generic search facets are now deduced from generic solr fields which makes jena as tool for facet semantics superfluous.	11 years ago
Michael Peter Christen	9228214f9b	enrichment of PerformanceMemory display of SolrInfoMBean table	11 years ago
Michael Peter Christen	e8bdf16ea7	added statistic information for solr resources in PerformanceMemory	11 years ago
Michael Peter Christen	931541d198	re-inserted default value re-set button to performance queues and patched missing values for recent new queues	11 years ago
Michael Peter Christen	456e52e0d5	enhanced strategy to clear solr caches - redesigned the instance mirror class (which was a mess) - added final method to close a searcher (which otherwise keeps a cache) - changed cache clear method which iterates over resources and calls clear to all caches in the searcher resources	11 years ago
reger	bd1685c94a	fix not needed getFileExtension().toLower (double) add missing .getFileExtension	11 years ago
orbiter	a11f072504	enhanced didyoumean	11 years ago
Michael Peter Christen	c0e6a65ec3	enhanced didyoumean	11 years ago
Michael Peter Christen	6d2dab7b21	fixed 'resource leak' warning	11 years ago
orbiter	22e3524797	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	11 years ago
orbiter	c40ba51ca6	added new suggest method which replaces more-than-one suggestions: instead of computing suggest permutations of the given words, the completion of a phrase using the given words is searched in the fulltext index.	11 years ago
reger	ad4b213145	remove unused static var from HTTPDProxyHandler	11 years ago
reger	b693ce9759	allow combining selection of different search nav's (facets) - selecting more than one nav combines the 2 selections (with AND) - unselecting one nav clears all selected (e.g. select filetype:pdf and /language/fr shows ~ french pdf's only)	11 years ago
reger	cb71413d19	fix page nav to keep modifier (was new issue)	11 years ago
orbiter	416481c33e	added a boost on appearance of combined words (in the same order the user submitted that) when searching for more than one word	11 years ago
reger	c589ee8c6e	URLproxy access check too tight respect config ip pattern (was own ip)	11 years ago
Michael Peter Christen	ebfaf753b7	- faster initialization of index files - removal of not used space if index files shrink (rare, but possible)	11 years ago
Michael Peter Christen	d2b8f2b477	enhancements for staticIP and ipv6 handling	11 years ago
reger	a71718a459	add config value for ssl/https port (default=8443) adjust server routines to use config	11 years ago
reger	a3e2cca8e9	improve isOlder check to not overwrite node index with metadata on equal load date	11 years ago
reger	9b24dae2b7	add language navigation filter clause to rwi results	11 years ago
reger	f307d65dcf	prepare for a language navigator works fine to restrict language for local solrSearches. More work needs to be done to make rwi/remote searches respect the modifier.language restriction.	11 years ago
reger	cf553e5045	added hint to web.xml and for completeness the full set of hardcoded mappings	11 years ago
Michael Peter Christen	c84bcc878a	first try to add a generic solr servlet as luke request servlet	11 years ago
Michael Peter Christen	4cb7e2a2ca	refactoring: renamed the SolrServlet to SolrSelectServlet for better naming of more Solr Servlets	11 years ago
Michael Peter Christen	dc06e407ce	added two virtual instances of solr for both cores: collection1 and webgraph. These cores are now accessible at /solr/collection1/select instead of /solr/select?core=collection1 and /solr/webgraph/select instead of /solr/select?core=webgraph, in addition to the old behavior to support compatibility with the old peers. These new paths are fully solr standard-conform and will allow the cross-linking between YaCy peers using their public solr API.	11 years ago
Michael Peter Christen	8b14e92ba4	added button in host browser to re-load 404/failed documents	11 years ago
orbiter	771d8261c1	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	11 years ago
orbiter	c351e47a84	fix for bad-formatted lonlat	11 years ago
reger	4c603b216e	optimize parse ServerSideInclude	11 years ago
orbiter	5ec0c969c9	fix for http://bugs.yacy.net/view.php?id=354	11 years ago
orbiter	0002abd583	fix for OOM during remote search and too high load protection	11 years ago
sixcooler	5a917e13c6	use less ram on dht-URL transfer by not using a URIMetadataNode[]	11 years ago
Michael Peter Christen	c87cdfca2e	do not set a load prerequisite that prevents the start of one-time-jobs	11 years ago
sixcooler	4d77ca52c9	workaround to let dht-out run on small systems like a Pi	11 years ago
Michael Peter Christen	6ada0daae9	making latency_factor and maximum number of same hosts in loader queue settings available in Crawler_p.html servlet for steering.	11 years ago
Michael Peter Christen	489c3fbc90	code simplifications / removed warnings	11 years ago
Michael Peter Christen	0168f80c28	new crawling factors can now be changed during runtime	11 years ago
Michael Peter Christen	be5e808236	- removed hardcoded load-test which is now handled in BusyQueues steering, see /PerformanceQueues_p.html - changed default values for crawler queue load limit (high, because these jobs are started upon user request)	11 years ago
sixcooler	40a4030b55	configurable max-load values for YaCy-Threads: try lower values on small systems like a Pi	11 years ago
sixcooler	6d8c023a5e	lower client-connection for single-cpu-systems	11 years ago
Michael Peter Christen	77531850b5	reverted crawling strategy from latest commit.	11 years ago
Michael Peter Christen	c0da966dfa	enhanced crawler speed	11 years ago
Michael Peter Christen	79809342fa	added synchronization to exists() call because the concurrent call to that method showed up in thread dumps close to deadlock situations. It's also better to synchronize IO operations because they become faster then.	11 years ago
Michael Peter Christen	9a6912f2e6	if a http client thread is still running but we do not wait for it any more, call an interrupt	11 years ago
Michael Peter Christen	0d235a565b	cleanup crawl loader jobs	11 years ago
Michael Peter Christen	1ea17bd9f3	- removed old metadata database and all migration code - refactored all code which uses URIMetadataRow as standard for word hash length and word hash ordering and moved that to the class 'Word', because the class URIMetadataRow defined the old metadata data structure and should be superfluous in the future - removed unused methods from URIMetadataRow as preparation for further removal of that class	11 years ago
reger	d3de309953	fix IOexception logging issue in DefaultServlet reason not sure but .logException triggers another exception	11 years ago
reger	97e84439fb	adjusted ConfigHeuristic and changed QueryGoal.getOriginalQueryString to .getQueryString - since the specific heuristics Twitter & Blekko are no longer available or redundant with OpenSearchHeuristic, adjusted ConfigHeuristic to use OpensearchHeuristic settings only. For this the default OSD search target list is made available (copied) by default and the other configs are removed. - the return of QueryGoal.getOriginalQueryString includes the queryModifier, which are held separately in a modifier object, but in most (all) cases just the query term is expected, clarified and renamed it to QueryGoal.getQueryString which returns just the search term (if needed a .getOriginalQueryString could be implemented in Queryparameters, adding the modifiers) - started to adjust internal html href references from absolute to relative (currently it is mixed). For future development we should prefer relative href targets (less trouble with context aware servlets)	11 years ago
Michael Peter Christen	022c6d3ce1	do YaCy p2p connections using a timeout-request which wraps the http request in a separate thread and ignores the further result of a request if it does not answer within the requested time-out. This is an attempt to solve a problem with the peer-ping, which hangs whenever a peer appears to be dead or blocked.	11 years ago
Michael Peter Christen	42f3733a05	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	11 years ago
Michael Peter Christen	25a6c05008	experimental removal of synchronization. This should work for all cases where the size() and isEmpty() method is used only for statistics, which happens at many locations in YaCy. If these methods are used for structural reasons (like accessing the last element in an array) then it may fail or cause other problems. As far as visible, this is not the case.	11 years ago
Michael Peter Christen	5695280edd	removed superfluous synchronization	11 years ago
Michael Peter Christen	a1977b7a75	removed debug code	11 years ago
orbiter	fd4abc0565	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	11 years ago
orbiter	d5b8e473c8	added load limit for DHT transfer: RWI acceptance only if local load is not too high	11 years ago
reger	2614fa7aeb	Skip remote Solr search if last try showed error As the solr servlet may not be available (e.g. no public search page, old version, individual access setting) a /solr/select error is remembered in the seed.dna of the remote peer. This is not permanent, as flag is not stored and the seed is reloaded on several occasions, it is just a memory of the recent past status. Might also be set to "not available" on time-out of last try.	11 years ago
orbiter	a07e9b3582	concurrency-solid version of transmission limitation	11 years ago
orbiter	60ead31273	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	11 years ago
orbiter	52bf7d1ac8	reduce load during dht transfer	11 years ago
sixcooler	f0587d4af5	NP-fix, which was found on a Pi under 'heavy' load	11 years ago
Michael Peter Christen	0bf3cab8c7	- better 'extra'-peer selection - logging of health status for 'extra'-peer selection - concurrency for remote peer IO and interrupting the threads if time-out occurs	11 years ago
orbiter	e3c4456c8e	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	11 years ago
orbiter	7f21d21d1d	added synchronization to deeply-embedded solr connector EmbeddedSolrConnector because deadlock situations show that methods in lucene class seem to block.	11 years ago
reger	9b06774414	fix role name in GSA servlet	11 years ago
reger	0c754dd794	implemented DIGEST authentication, which is more secure for remote login than BASIC, where the pwd is transmitted in near clear text (B64enc). This has some implication as RFC 2617 requires and recommends a password hash MD5(user:realm:pwd) for DIGEST. !!! before activating DIGEST you have to reassign all passwords !!! to allow new calculation of the hash - default authentication is still BASIC - configuration at this time only manually in (DATA/settings) or defaults/web.xml (<auth-method>) - the realmname is in defaults/yacy.init adminRealm=YaCy-AdminUI - fyi: the realmname is shown on login screen - changing the realm name invalidates all passwords - but for security you are encouraged to do so (as localhostadmin) - implemented to support both, old hashes for BASIC and new hashes for BASIC and DIGEST - to differentiate old / new hash, the hash-prefix "MD5:" used in Jetty is used for new pwd-hashes ( "MD5:hash" )	11 years ago
Michael Peter Christen	ba44eb1160	when scaling the number of remote peers, also consider the machine load and the number of cores	11 years ago
Michael Peter Christen	f8ce7040ab	remote search peer selection schema change: - all non-dht targets (previously separated into 'robinson' for dht-like queries and 'node' for solr queries) are non 'extra' peers, which are queries using solr - these extra-peers are now selected using a ranking on last-seen, peer-tag-matches, node-peer flags, peer age, and link count. The ranking is done using a weight and a random factor. - the number of extra peers is 50% of the dht peers - the dht peers now exclude too young peers to prevent bad results during strong growth of the network - the number of dht peers (and therefore extra-peers) is reduced when the memory of the peer is low and/or some documents still appear in the indexing-queue. This shall prevent a peer from deadlocks when p2p queries are made in a fast sequence on weak hardware.	11 years ago
Michael Peter Christen	47a82e471c	less blocking in SeedDB which caused deadlocks in peer ping	11 years ago
Michael Peter Christen	ec10ed45bd	better logging in logger	11 years ago
Michael Peter Christen	a5d7961812	replaced old caching in SolrConnector with a new one which is better for concurrency and should prevent from 100% CPU usage after a long run of a peer with a large number of documents.	11 years ago
reger	6e2fe777af	simulate Authorization cookie for yacy servlet header	11 years ago
reger	ea7cef5d05	fix NPE in TemplateEngine StackTrace For input string: "" java.lang.NumberFormatException: For input string: "" at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) at java.lang.Integer.parseInt(Integer.java:504) at java.lang.Integer.parseInt(Integer.java:527) at net.yacy.server.http.TemplateEngine.writeTemplate(TemplateEngine.java:241) at net.yacy.server.http.TemplateEngine.writeTemplate(TemplateEngine.java:199) at net.yacy.http.servlets.YaCyDefaultServlet.handleTemplate(YaCyDefaultServlet.java:896)	11 years ago
reger	cb6d0c2113	implementing YaCy legacy role names - taking out customized SecurityHandler code as the original/default seems to just work fine - with this individual sec. constraints can be applied via web.xml (using legacy role names)	11 years ago
reger	f09dbbef96	make SecurityHandler webappcontext ready	11 years ago
reger	37f2a82a5d	making root context (htroot) a WebAppContext - this allows additional features, like servlet configuration via web.xml and many more things. - currently the standard servlets are still configured in the code (so the supplied defaults/web.xml is not really needed, yet), but could be expanded - lookup for web.xml - 1. in /DATA/SETTINGS then in /defaults	11 years ago
reger	28eae57e8b	spend CrawlQueues a fremem routine - clears errorStack - will not get hit often (but better little than nothing on low mem)	11 years ago
reger	b931bf6b48	fix use of url proxy access pattern pattern of transparent was used.	11 years ago
reger	280c4a3ac1	exclude terms with " for didYouMean suggestion causes Solr error (and wordindex likely finds suggestion) org.apache.solr.core.SolrCore org.apache.solr.common.SolrException: org.apache.solr.search.SyntaxError: Cannot parse 'text_t:""d"': Lexical error at line 1, column 12. Encountered: <EOF> after : "" at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:171) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:187) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) at net.yacy.cora.federate.solr.connector.EmbeddedSolrConnector.query(EmbeddedSolrConnector.java:179) at net.yacy.cora.federate.solr.connector.EmbeddedSolrConnector$DocListSearcher.<init>(EmbeddedSolrConnector.java:345) at net.yacy.cora.federate.solr.connector.EmbeddedSolrConnector.getCountByQuery(EmbeddedSolrConnector.java:364) at net.yacy.cora.federate.solr.connector.MirrorSolrConnector.getCountByQuery(MirrorSolrConnector.java:326) at net.yacy.cora.federate.solr.connector.ConcurrentUpdateSolrConnector.getCountByQuery(ConcurrentUpdateSolrConnector.java:440) at net.yacy.search.index.Segment.getWordCountGuess(Segment.java:464) at net.yacy.data.DidYouMean.getSuggestions(DidYouMean.java:181) at suggest.respond(suggest.java:73)	11 years ago
reger	fbc1071f6d	Merge origin/master	11 years ago
reger	7b800a0c8e	fix: NPE on shutdown via script	11 years ago
Michael Peter Christen	ce4d42d77c	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	11 years ago
Michael Peter Christen	644573cfc4	using the adminAccountUserName from yacy.conf within apicall.sh	11 years ago
reger	6932aa4d7a	use configured admin-username for api calls - the admin user name can be configured, in apiExec calls the default "admin" username is used. TODO: the bin/apicall.sh script should likely take that into account.	11 years ago
orbiter	2ead4e44d9	introduced a new storage path ARCHIVE inside of DATA which will be used as path for solr index dumps (instead of the SEGMENTS path). This will make a maintenance of index backups easier. It will also provide a tool to migrate from an freeworld index to a webportal index.	11 years ago
sixcooler	add0e42804	fix double-escaped urls from proxy-usage	11 years ago
sixcooler	865ce6f974	check blacklist proxyClient config	11 years ago
sixcooler	345f9aba27	make use of our DNS-cache again - this really speeds up the lookup	11 years ago
reger	e6d284fe1e	better solution for prev. commit with MultiMapSolrParams.getFieldInt not returning default parameter	11 years ago
reger	0bc2fc14ab	improve NPE chance on missing parameters java.lang.NullPointerException at net.yacy.http.servlets.SolrServlet.service(SolrServlet.java:145) at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:501)	11 years ago
reger	f06cef5d5b	reimplement proxy access by configured whitelist pattern; was currently limited to own ip.	11 years ago
reger	05d6cc6ea3	setting of IPv4Stack moved earlier it seems even better to call system.setproperty before isrunning check (if nothing helps we have to set it in startup script)	11 years ago
reger	30d925a96e	reimplemented server access restriction via Jetty IPAccessHandler to allow only configured IP's to access. Handler is only loaded if a restriction is configured. Since IPAccessHandler (Jetty 8) does not support IPv6, the system property java.net.preferIPv4Stack=true is set. Testing showed system.setProperty seems to be sensitive to point of calling (earliest possible time seems to be best = early in yacy.main). Moved the "isrunning..." just open browser check also to the new routine to preread the yacy.config only once.	11 years ago
orbiter	3cb6c7861f	fixed shutdown authentication problem	11 years ago
Michael Peter Christen	ed06b5b94b	set a realm message to log-in input window which explains that a password for the account 'admin' can be (re-)set with the script bin/passwd.sh	11 years ago
Michael Peter Christen	7005ecdabd	cleanup	11 years ago
Michael Peter Christen	2939b47986	removed non-working realm setting in http client (auth for localhost was added in previous commit)	11 years ago
orbiter	9d52b337f3	added http authentication to YaCy http client for all localhost accesses to enable self-steering of the peer using the API table. This is necessary in case a password for the administration pages is set.	11 years ago
Michael Peter Christen	c951945666	modified log-in detail to enable admin-login from localhost with stored hash even if localhost access is disabled. This is urgently needed for the apicall.sh script since that is used for high-availability set-up (checkalive and indexdump for index mirroring)	11 years ago
Michael Peter Christen	9bd71fdbb4	made the access tracker class static because it shall be used by the jetty auth module	11 years ago
Michael Peter Christen	1c56befb93	fixed mess with test on localhost (which means local hosts for some cases)	11 years ago
Michael Peter Christen	7d6fc79eb8	refactoring (usage of constant names for attributes of authentication check)	11 years ago
Michael Peter Christen	b9d36e45e0	removed the &amp explicit encoding of ampersand character since this is double-translated within the template replacement process.	11 years ago
reger	e2ccb6ce9d	modified DefaultServlet parameter on invoke templates call response with post=0 (if post empty) simulating previous behavior. (template servlets typically test for post==null, found one more Crawler.p.java were empty post caused problem, = defaults not correctly set)	11 years ago
reger	4c38bceafc	handle http connect for proxy refactor header cleanup (reuse existing code)	11 years ago
reger	cfabe8f67a	harmonize access restriction for urlproxy servlet with proxy handler, what is currently - use switched on in config - access from a local IP / hostname fix shutdown exception for crashprotection handler on interrupted connections.	11 years ago
reger	e6b9643fd6	extended request for local peer check to by hostname resolved ip the current islocal() check did not detect a domain.com address as request for the local peer.	11 years ago
reger	c797f108a1	add error response on denied proxy access: send http 403 response	11 years ago
reger	0583f44306	reimplement proxy access log (to Jetty ProxyHandler) - using existing HTTPDProxyHandler logger - allow local loopback ip to access proxy	11 years ago
reger	8cbc1c970a	Security Hot-Fix: for transparent proxy.	11 years ago
reger	58ecf5e4dd	add to blacklist button in CrawlResults http://bugs.yacy.net/view.php?id=220 introduced Blacklist.add with sourcefile only parameter	11 years ago
reger	e9081c0f17	moved startup execAPIActions call after Jetty startup execAPIActions require http to be up. The 10s sleep was sufficient to allow Jetty to start, but it's more robust to place the call after http is assigned to switchboard/serverSwitch.	11 years ago
reger	19c1a7a5ca	change SolrServlet from Filter to Servlet (as no multicore required) this allows to simplify context/servlet initialization in Jetty init.	11 years ago
reger	14c977dd26	fix NPE GSAresponseWriter on query=null java.lang.NullPointerException at net.yacy.cora.federate.solr.responsewriter.GSAResponseWriter.highlight(GSAResponseWriter.java:328) at net.yacy.cora.federate.solr.responsewriter.GSAResponseWriter.write(GSAResponseWriter.java:263) at net.yacy.http.servlets.SolrServlet.service(SolrServlet.java:235)	11 years ago
orbiter	c3dee2d6bd	added security patch	11 years ago
orbiter	dcf46ce8f6	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	11 years ago
orbiter	343d2ef49a	new data type for access tracker (unfinished)	11 years ago
reger	dd8ea0cdd6	fix "add to blacklist" button style in IndexControlRWIs_p - added default filename filter to select field (as only addition to *.black list is permanent) - modified Blacklist_p header/legend to show all active blacklists (to support understanding that all configured lists are active) - removed obsolete code in Blacklist_p servlet	11 years ago
reger	abbf487023	fix QueryGoal Image query (missing space) see query log example .. url_file_ext_s:(jpg OR png OR gif) ORcontent_type:(image/*)) ..	11 years ago
reger	26e9d7e066	fix NPE in IndexControlRWIs_p.html - metatags may be null Caused by: java.lang.NullPointerException at net.yacy.search.query.QueryParams.getFacets(QueryParams.java:445) at net.yacy.search.query.QueryParams.getBasicParams(QueryParams.java:400) at net.yacy.search.query.QueryParams.solrTextQuery(QueryParams.java:345) at net.yacy.search.query.QueryParams.solrQuery(QueryParams.java:334) at net.yacy.search.query.SearchEvent.<init>(SearchEvent.java:290) at net.yacy.search.query.SearchEventCache.getEvent(SearchEventCache.java:176) at IndexControlRWIs_p.genSearchresult(IndexControlRWIs_p.java:641) at IndexControlRWIs_p.respond(IndexControlRWIs_p.java:141)	11 years ago
reger	7f9b9315fe	Merge origin/master	11 years ago
reger	8eaabb9600	remove dependency from old serverCore.java - remaining getPortNr not needed (as current release allows only to set plain integer as port, see ConfigBasic)	11 years ago
orbiter	2018e55f8b	switched index deletion back on (was accidentally off because the new jetty framework never delivers null to post arguments .. there may be more problems of that kind)	11 years ago
orbiter	3961b643a3	write solr searches to search log	11 years ago
orbiter	15882beb19	fix for strange NPE java.lang.NullPointerException at net.yacy.search.Switchboard.updateMySeed(Switchboard.java:3667) at net.yacy.peers.Network.peerPing(Network.java:195) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at net.yacy.kelondro.workflow.InstantBusyThread.job(InstantBusyThread.java:107) at net.yacy.kelondro.workflow.AbstractBusyThread.run(AbstractBusyThread.java:165)	11 years ago
orbiter	f3ac923a7e	ftp client shall be able to open non-anonymous ftp servers if login details are given	11 years ago
reger	3d913558ab	display configured adminUserName in ConfigAccounts_p - fix read default username in loginservice	11 years ago
reger	fbdd89e198	Merge origin/master	11 years ago
reger	65a2f3d5e7	tweak Jetty credentials to work with YaCy UserDB - user entry in UserDB with admin right can login to access protected pages - dto. admin user, chosen username is stored in conf (adminAccountUserName=)	11 years ago
Michael Peter Christen	ffdfe5fb9b	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	11 years ago
reger	7d6b34a89f	Merge origin/master	11 years ago
reger	45e8750ba5	nasty quick fix for admin login with other username as admin - userDB is not sync'ed with Jetty credentials as of now only the std. admin account can login switched initial browser open with ssl active back to std. http port	11 years ago
Michael Peter Christen	ee17bd0b69	added option to attach remote solr servers in read-only mode	11 years ago
Michael Peter Christen	25f9c35033	add patch which shall prevent that naive search mistakes like usage of regular expressions cause no results. Usage of '*' followed by a dot or any expression will now cause that this expression is used as a filetype search.	11 years ago
Michael Peter Christen	667a6adddb	- use default files from yacy.init property "defaultFiles" if no jetty-configuration is given for default files. - fix a problem with default paths if no path is given (i.e. http://localhost:8090 instead of http://localhost:8090/). Without this patch the path was resolved automatically to http://localhost:8090//	11 years ago
Michael Peter Christen	77aeb288a2	suppress deprecation warning (for now); TODO: find alternatives	11 years ago
reger	fca7f1d043	run SSL/HTTPS port (8443) ping test in migration only if SSL/HTTPS is on - see last commit	11 years ago
reger	71cac1a278	added SSL/HTTPS connector to support SSL/https connection on port 8443 !!! attention !!! to make sure YaCy can start, https will be disabled if port 8443 is used - added ping test for above to migration - as of now port for https is hardcoded to default 8443 - if not urgently required I'd leave it this way (it's standard) to use different ports for http and https - post https port on ConfigBasic.html (if active)	11 years ago
Michael Peter Christen	82c0525e71	wrong logger fix	11 years ago
Michael Peter Christen	e17624b6dd	added html retrieval from alternative DATA/HTDOCS path	11 years ago
Michael Peter Christen	07cee6b99c	removed more unused code	11 years ago
Michael Peter Christen	20b48f894f	refactoring: moving all servlets to the same package (the solr servlet is currently actually a filter which should be changed somehow)	11 years ago
Michael Peter Christen	84167adb49	removed unused anomichttpd code after migration to jetty	11 years ago
Michael Peter Christen	b461a27abb	fixed the SolrServlet	11 years ago
Michael Peter Christen	7603e879dc	Merge branch 'master' into HEAD Conflicts: .classpath source/net/yacy/cora/federate/solr/SolrServlet.java	11 years ago
Michael Peter Christen	25250405f1	solr servlet preparation for join with jetty branch	11 years ago
Michael Peter Christen	2f16770681	migrated to solr 4.6.0	11 years ago
Michael Peter Christen	57f0f71ac6	added patch to allow binary response writer	11 years ago
orbiter	937273d4e3	added parsing of metadata to surrogate reading: a dublin core record inside of surrogate input files may now contain tokens within the namespace 'md' (short for: metadata). The token names must be valid within the namespace of the solr field names. All md-tokens inside of surrogate files then overwrite values within solr documents before they are written to the solr index. This makes it possible to assign collection names to each surrogate entry and also ranking information can be added. Please see the example file.	11 years ago
reger	18497f6475	remove unused init parameter from DefaultServlet - remove "RelativeResourceBase" parameter	11 years ago
orbiter	4de3fefdb5	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	11 years ago
orbiter	7e346e1d79	using stringbuilder in query construction	11 years ago
reger	c84c313fe1	Merge origin/master into jetty	11 years ago
Michael Peter Christen	2702d9e56b	- added a SolrQueryResponse2SolrDocumentList method which is able to work around the unfolding process in Solr's BinaryResponseWriter. This was a huge performance bottleneck in the embedded solr connector and the problem is actually on Solr side, but we have now a workaround. - This made it possible to abstract a high-performance index access method which is implemented as method getDocumentListByParams. That method is also implemented in the SolrServerConnector and provides a very efficient access to a solr index if the index is embedded. - a popular use of the document list retrieval is a result count which can now also make use of the new method, via getDocumentCountByParams. - enhanced the Error cache which now does not store error documents within the ram cache if the document is also written to solr. When documents are retrieved from the cache, they are partly read from the ram cache and if not existent there, from the Solr index.	11 years ago
Michael Peter Christen	74466d731a	use pre-compiled patterns in ymark	11 years ago
Michael Peter Christen	34633044b4	made pattern computation static	11 years ago
Michael Peter Christen	ef7ddbc933	added date parser caches to prevent re-calculation of costly date parsing	11 years ago
Michael Peter Christen	552ef9f18e	fix for bad ErrorCache.exists test (bug from latest commit)	11 years ago
Michael Peter Christen	09412ea3a4	counting search requests in solr interface	11 years ago
Michael Peter Christen	303f5694ba	avoid usage of existsByQuery. If a document can be loaded by the ID before testing other fields from the existsByQuery request, then a document cache fills and queries after that one can be avoided.	11 years ago
reger	b43bbd3cc4	join DefaultServlet and Jetty8 implementation - removing Jetty 8 specific dependencies	11 years ago
reger	089c5007ee	move conditionalHeader to DefaultServlet - by removing Jetty specific implementation detail	11 years ago
Michael Peter Christen	79771c60c0	IPv6 fixes	11 years ago
reger	92d9c56f9f	Merge origin/master into jetty	11 years ago
Michael Peter Christen	78eac85161	better calibration of caches and queue maximum sizes	11 years ago
Michael Peter Christen	c8af19bd37	removed unnecessary check which causes a NPE when searching with empty search string	11 years ago
Michael Peter Christen	e3c2f09de9	- reduce computation in case that specific postprocessing fields are not selected - de-select citation rank computation	11 years ago
Michael Peter Christen	cfa08024c7	removed optimization before postprocessing because that may cause a time-out which will cause postprocessing to fail.	11 years ago
Michael Peter Christen	6f3a923691	fixed urlmask which was not able to combine several constraints	11 years ago
Michael Peter Christen	9a27bf6e82	removed filter computation in Protocol class for remote searches because that is already done in the QueryParams class	11 years ago
Michael Peter Christen	f1b5db2c45	- performance graph does not show peer ping in memory monitor any more - after a forced GC, the PerformanceMemory view switches to automatic update by default	11 years ago
Michael Peter Christen	a125904a1c	fixed a NPE in surrogate processing	11 years ago
Michael Peter Christen	0db8e34625	enhanced webgraph processing	11 years ago
reger	ac067b5236	clean-up Jetty handler classes	11 years ago
reger	b75e92aac3	add read queryparameter in gsaservlet	11 years ago
reger	1e94719084	fix NPE on mime detection of unknown file extension	11 years ago
reger	effea4bca0	Merge origin/master into jetty Conflicts: source/net/yacy/cora/federate/solr/SolrServlet.java	11 years ago
sixcooler	2c2ebb0d92	tried some hardening in order not letting any Solr-Searchers open	11 years ago
Michael Peter Christen	a16534cb0a	tried to fix timeout and connection-lost problems when using an outside solr.	11 years ago
Michael Peter Christen	c3dcbdc8d5	try to recover from an OOM during citation index reading and fail-over to second solr core in case of unrecoverable OOM.	11 years ago
Michael Peter Christen	9932c441c8	fixed a problem with Date fields parsing Solr results if a remote Solr is attached.	11 years ago
sixcooler	94db054aff	memory-leak-fix: the DocListSearcher fires a query in its constructor and it is highly recommended to close every SolrRequest. Every Request which is not closed leaves a Searcher with its Caches and cannot be garbage-collected.	11 years ago
reger	26bb1e37b7	implement core selection in SolrServlet - making initcore() obsolete	11 years ago
Michael Peter Christen	ae55d69ef6	include/exclude size NPE fix (recently added)	11 years ago
Michael Peter Christen	2c39b65409	fixes for searches containing stopwords. The fix was done using a reconstruction of the search word set access method to protect that words are deleted from the sets from the outside of the QueryGoal class.	11 years ago
Michael Peter Christen	5592ea57f0	hack to remove compiler warnings about deprecated classes. It would be better to remove the deprecated usage but to do this the Solr core must adopt the latest apache http core changes as well .. this is not our fault.	11 years ago
orbiter	037cd0a57c	using the BinaryResponseWriter which is supported within the YaCy solr servlet since YaCy 1.63. This is much more performant for the client than using the XMLResponseWriter because parsing of XML data is very CPU intensive. Older YaCy peers are still requested using the XMLResponseWriter but the majority of YaCy peers already respond with the binary writer. This makes remote searches much faster and less CPU intensive.	11 years ago
orbiter	61409788eb	less word hash computations (removing some overhead because of MD5 calcs) using the clear word in a normalized form.	11 years ago
reger	f23471c471	add check to prevent index entries containing url_file_ext_s with ";jsession=xyz" note: check could be implemented in MultiProtocolURL (but at this time didn't oversee possible implication)	11 years ago
reger	5c4a3d1c01	Merge origin/master into jetty	11 years ago
reger	444a9ae674	remove unused options and attributes from DefaultServlet cleanup obsolete class files	11 years ago
reger	8da75a4b0c	fix contentType definition for Solr html responsewriter from xml to html (hint: value is currently not used, but is in SolrServlet)	11 years ago
Michael Peter Christen	ccf2f4e43b	refactoring of seed attributes (introduced more constants)	11 years ago
Michael Peter Christen	1f0bfa8fec	added test to Base64Order (runs successfully!)	11 years ago
orbiter	b7f1e5af51	added new servlet which generates the same file as the principal peers upload to a bootstrap position you can call it either with http://localhost:8090/yacy/seedlist.html or to generate json (or jsonp) with http://localhost:8090/yacy/seedlist.json http://localhost:8090/yacy/seedlist.json?callback=seedlist	11 years ago
orbiter	3e552550d1	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	11 years ago
orbiter	c2d720cdaf	purge a lucene cache - possible memory leak fix	11 years ago
reger	e4f49fb175	for searchresults with empty title use filename as title - to not store a title in index which isn't extracted from source the title is empty check only added to ResultEntry class	11 years ago
reger	b1dc9a6f52	- disable Jetty servlet defaultUseCache (prevent double caching) - include short memory status check for class cache in DefaultServlet - remove obsolete Resource interface for Jetty8YaCyDefaultServlet	11 years ago
reger	f111f30ace	Merge origin/master into jetty	11 years ago
reger	94293176a3	use writeOptionHeaders with ServletResponse parameter only	11 years ago
orbiter	ff86cb683f	fixed some XSS bugs reported by Marius from http://ctf365.com/	11 years ago
orbiter	da33ee0d77	also extended timeout for webgraph postprocessing	11 years ago
orbiter	74f9e40747	extended timeout during postprocessing of 30 minutes.	11 years ago
orbiter	19a051bec8	more monitoring for postprocessing and enhanced layout in Crawler monitor page	11 years ago
Michael Peter Christen	9cf9727685	fix for wrong counter	11 years ago
Michael Peter Christen	fceac8cffd	more monitoring for postprocessing	11 years ago
Michael Peter Christen	6842783761	fixed and enhanced postprocessing	11 years ago
Michael Peter Christen	219d5934a4	fixed termination bug in Solr Connector	11 years ago
Michael Peter Christen	bf1bdd52a6	prevent requesting of 0-facets (which actually exist)	11 years ago
Michael Peter Christen	9d5895f643	enhanced and fixed postprocessing	11 years ago
Michael Peter Christen	f86fe90eda	enhanced mass storage speed to remote solr servers	11 years ago
Michael Peter Christen	6ed9821209	fixed several problems in solr connectors	11 years ago
Michael Peter Christen	191fd3d7e7	added an optimization option to HandleSet mass data storage structure	11 years ago
Michael Peter Christen	94b565ea0d	fixed keepalive min value	11 years ago
reger	b26787dc2d	- DefaultServlet: remove static gzip option (YaCy doesn't use pre-gzip'ed static html pages) - ProxyServlet: remove not needed procedure - Server init: skip one overlapping servlet context	11 years ago
Michael Peter Christen	24a052ecb9	removed debug code for existsByIds	11 years ago
Michael Peter Christen	087df05e24	added option to Config_Network_p.html to enable remote search while DHT-Receive is switched off.	11 years ago
Michael Peter Christen	1a4a69c226	set more logger to 'final static'	11 years ago
Michael Peter Christen	c60947360d	logger should be static	11 years ago
Michael Peter Christen	69b8d61c47	fix for search requests in GSA interface which contain 'funny' characters (like ':' etc.)	11 years ago
orbiter	b085cb522b	replaced old existsByIds for embedded Solr with obviously much faster new selection method (including still existing debug code to test that this is in fact better)	11 years ago
reger	b29d262e70	implement Jetty8HttpServerImpl.generateSocketAddress (code 1:1 copied from serverCore)	11 years ago
orbiter	4234b0ed6c	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	11 years ago
orbiter	909bbb49d8	added (partly commented) test code for url rewrite methods .. to be completed	11 years ago
reger	066a1ecf0a	add highlight queryparams to solrservlet if missing - modify query params in Solr parameter map (instead of querystring)	11 years ago

2672 Commits (c947ee06bf0be10d70dcc65c87e95daa916eefd7)