yacy_search_server

Commit Graph

Author	SHA1	Message	Date
Dmitriy Kazimirov	5bed1a7893	Russian localization update	12 years ago
Michael Peter Christen	31e854bef6	Merge remote-tracking branch 'copro/master'	12 years ago
Michael Peter Christen	4735bd47f4	- changed solr commit call and added an optimize option. Since Solr 4.0.0 there is a new softcommit feature which implements a near-real-time (NRT) search option. The softcommit does not do IO and does not cause performance issues. YaCy now has an extension in its solr connectors to use the softcommit feature. The softcommit call now replaces all places where a hard commit was used. Furthermore, the commit strategy when doing a search from the web interface was changed (a commit is done every time before a search is done). The softcommit feature was implemented because it was needed for the following changes (customer demands), which are also included in this git commit: - added a feature to identify all documents which have unique titles and/or unique descriptions. These unique flags are disabled by default. - also added a feature to set a flag when the url from a canonical tag is equal to the document url. This is also disabled by default. To support the new softcommit strategy, the commitWithinMs option was set to -1 to disable automatic commits based on document insert times. If documents are inserted continuously, a commit would also happen continuously whenever the commitWithinMs time is reached. This would conflict with the regular autocommit of 10 minutes and the new softcommit strategy.	12 years ago
Copro	0025983993	Fix typo embedd -> embed	12 years ago
Copro	3ea8380959	Adding Vimeo tag to wiki commands to embed a video by id	12 years ago
Copro	ee9d7fd93d	Added feature to embed Youtube videos to wiki commands for usage in Wiki, Blog or other servlets	12 years ago
Michael Peter Christen	ec927ea72b	Merge remote-tracking branch 'reger/master'	12 years ago
Michael Peter Christen	7159ed2a7d	Merge remote-tracking branch 'copro/master'	12 years ago
Copro	946fad48c7	Some more German translation reducing the amount of Unused String messages	12 years ago
Aleksej	6690dac845	Russian translation fixes not merged due to conflict	12 years ago
Michael Peter Christen	9ccdd21d76	Merge remote-tracking branch 'aleksejs/fixtrans' Conflicts: locales/ru.lng Tried to merge this but I had to do this 'blind'. Sorry if I deleted something that was right.	12 years ago
Copro	de7c3d95b4	Added German translation for HostBrowser.html	12 years ago
Dmitriy Kazimirov	5e5ae01909	updated Russian localization for update system	12 years ago
Dmitriy Kazimirov	f9c65078f0	A little more fixes for Russian localization	12 years ago
Dmitriy Kazimirov	ca01d225db	A little more fixes for Russian localization	12 years ago
Dmitriy Kazimirov	9dc0bea1dc	Little more correct and readable Russian localization	12 years ago
Dmitriy Kazimirov	c1b9113a68	Little more correct and readable Russian localization	12 years ago
Dmitriy Kazimirov	9cc72df176	More Russian translations. And if some text is not translated it will be in English and not German	12 years ago
Michael Peter Christen	db024a4e19	added new solr fields (unused yet; implementation will follow)	12 years ago
Michael Peter Christen	f5fd2aea18	removed archaic migration code	12 years ago
Michael Peter Christen	9b5bdae1b4	Reverted setting of MMapDirectoryFactory from solrconfig; see http://forum.yacy-websuche.de/viewtopic.php?p=27509#p27509 Instead, the start script checks whether the host is a 64-bit host and, if so, sets -Dsolr.directoryFactory=solr.MMapDirectoryFactory as a java option. Reverted the ramBufferSizeMB setting (this was not enabled anyway) because that may be too much memory for small peers and embedded systems. Activated the mergeFactor 4; this was commented out by mistake	12 years ago
reger	f8f7f33596	add Maven build script	12 years ago
orbiter	eb68a30947	solr performance settings: the target of these performance settings is the reduction of IO in general and during search in particular. - reduced mergeFactor to 4. This will increase the IO during indexing, but will reduce IO during search. It will also greatly reduce the number of open files, which should make it possible to have overall larger indexes until the limit on open files in an OS is reached. - increased ramBufferSizeMB to 256 MB. This will reduce the number of commits. This change may compensate for the reduction of the mergeFactor. - disabled updateLog. This is a real-time search feature which is available in YaCy anyway because a commit is forced if index.html is called. The updateLog feature causes a lot of IO during indexing and search and produced a lot of files in SEGMENTS/solr_40/data/tlog	12 years ago
Michael Peter Christen	60f2a69331	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	12 years ago
Michael Peter Christen	cba038f97b	one more NPE fix	12 years ago
sixcooler	f3e705c4fe	bump to httpclient / httpcore 4.2.3 (bugfix-release)	12 years ago
Michael Peter Christen	aa067da86b	set the 'all' option as the last option in the list because the 'all' option currently also selects lists which cannot be exported in xml correctly	12 years ago
Michael Peter Christen	af465cdca5	fix for wrong robots.txt loading for https protocol see also: http://forum.yacy-websuche.de/viewtopic.php?f=5&t=4579	12 years ago
Michael Peter Christen	edbc86d2b0	integrated search term into opensearch result title. this makes better bookmark names when subscribing multiple search results from the same peer	12 years ago
Michael Peter Christen	c3d50d91f8	relaxing site operator for www prefix: - when using a site operator search for a domain where the domain has a www prefix, the domain without the www is also included - when using a site operator search for a domain where the domain has no www prefix, the domain with the www is also included - in the host navigator, all domains with and without a www prefix are accumulated. That means that the host navigator never shows a host with a www prefix. This should prevent usage mistakes of the site operator.	12 years ago
Michael Peter Christen	f53703df62	using MMapDirectoryFactory as solution for ClosedChannelException given in https://issues.apache.org/jira/browse/SOLR-2247	12 years ago
Michael Peter Christen	db49e91724	fixed a NPE which may appear for freeworld peers without any rwi index data. The NPE looked like this: Caused by: java.lang.NullPointerException at net.yacy.search.query.SearchEvent.<init>(SearchEvent.java:279) at net.yacy.search.query.SearchEventCache.getEvent(SearchEventCache.java:155) at search.respond(search.java:314) ... 12 more	12 years ago
Michael Peter Christen	4faa07c214	added a timeout for topic computation (solr is here much slower than the old metadata-db)	12 years ago
Michael Peter Christen	d2d5be032d	added an 'inlink' search option according to the suggestion in the YaCy forum at http://forum.yacy-websuche.de/viewtopic.php?f=18&t=4572#p27410 The feature was not called 'haslink' but 'inlink' to have a naming analogous to 'inurl'. This means you can now search for words in links of the document, like: * inlink:yacy searches all documents which link to pages which have a 'yacy' in the url.	12 years ago
Michael Peter Christen	76e1e91b11	with strict compiler settings, IndexFederated_p does not compile without @SuppressWarnings("deprecation")	12 years ago
reger	3897bb4409	added (manual) urldb migration (link on: Index Administration -> Federated Solr Index) - migrates all entries in old urldb Metadata coordinate (lat / lon) NumberFormatException still relatively often (see excerpt below), - added try/catch for URIMetadataRow (seems not to be needed in URIMetaDataNode, as Solr internally checks for number format) - removed possible type conversion for lat() / lon() comparison with 0.0f, changed to 0.0 (leaving it to the compiler/optimizer to choose number format) current log excerpt for NumberFormatException: W 2013/01/14 00:10:07 StackTrace For input string: "-" java.lang.NumberFormatException: For input string: "-" at sun.misc.FloatingDecimal.readJavaFormatString(Unknown Source) at java.lang.Double.parseDouble(Unknown Source) at net.yacy.kelondro.data.meta.URIMetadataRow$Components.lon(URIMetadataRow.java:525) at net.yacy.kelondro.data.meta.URIMetadataRow.lon(URIMetadataRow.java:279) at net.yacy.search.index.SolrConfiguration.metadata2solr(SolrConfiguration.java:277) at net.yacy.search.index.Fulltext.putMetadata(Fulltext.java:329) at transferURL.respond(transferURL.java:152) ... Caused by: java.lang.NumberFormatException: For input string: "-" at sun.misc.FloatingDecimal.readJavaFormatString(Unknown Source) at java.lang.Double.parseDouble(Unknown Source) at net.yacy.kelondro.data.meta.URIMetadataRow$Components.lon(URIMetadataRow.java:525) at net.yacy.kelondro.data.meta.URIMetadataRow.lon(URIMetadataRow.java:279) at net.yacy.search.index.SolrConfiguration.metadata2solr(SolrConfiguration.java:277) at net.yacy.search.index.Fulltext.putMetadata(Fulltext.java:329) at transferURL.respond(transferURL.java:152)	12 years ago
reger	3b6e08b49f	prevent checking of urldb if empty - disconnect urlIndexFile if empty - add missing lock class in submenuSearchConfiguration	12 years ago
reger	1fb452174a	read defaults from yacy.init for "Set to Defaults" button	12 years ago
reger	f143804382	fix configuration for search page navigators - added additional config page (ConfigSearchPage_p) for easy setup of search page layout (to not overload ConfigPortal page) - currently redundant setting with part of ConfigPortal page - added missing config for filetype and protocol navigator - adjusted init of SearchEvent to check navigation config setting - renamed RankingProcess.getTopicNavigator to getTopics (to distinguish it from the added SearchEvent.getTopicNavigator)	12 years ago
Michael Peter Christen	24db2fcd9d	fix for Network info	12 years ago
Michael Peter Christen	22c694f906	activated the clickdepth_i attribute for solr again because the calculation of that value is not as extensive as expected and furthermore the value is very useful for ranking	12 years ago
Michael Peter Christen	becd52a984	added also a re-calculation of reference counts during the post-processing of clickcount calculations. This is a really nice thing to have because the reference count affects ranking.	12 years ago
Michael Peter Christen	fc47109608	added 'Last Hour' to network statistics	12 years ago
Michael Peter Christen	38d3feae65	added separate delete commands for the local+remote solr index, the old metadata and old rwi and for the citation index. The important advancement is the separation of the citation index deletion because that index is responsible for the linkdepth calculation. Now a search index can be deleted without the citation index, so fewer clickdepths should need to be post-processed.	12 years ago
Michael Peter Christen	6f0baaa309	added the clickdepth post-processing: some links may have 'shortcuts' to already calculated click depths. These are then calculated if the crawl buffer is empty and therefore no new 'shortcuts' can be discovered. The status of the clickdepth stack (to-be-processed) can be seen using a solr search command like this: http://localhost:8090/solr/select?q=process_sxt:[%20TO%20]&start=0&rows=30&fl=sku,clickdepth_i,process_sxt	12 years ago
Michael Peter Christen	0f5b6f38c1	enhanced root-url detection	12 years ago
Michael Peter Christen	5a0eb1b268	clickpath should not be active by default because it needs extensive computation - partly to be implemented	12 years ago
Michael Peter Christen	8ae08a2cac	moved HTCache, Heuristics and Parser servlet to a more appropriate menu location	12 years ago
Michael Peter Christen	5c0c56cfe1	Preparations to produce a click depth attribute in the search index. This attribute can be used for ranking and for other purposes (demand by customer). The click depth is computed in two steps: - during indexing the current fill-state of the reverse link index is used to backtrack the current page to the root page. The length of that backtrack is the clickdepth. But this does not discover the shortest click depth. To get this, a second process to check again is needed - added a process tag that can be used to do operations on the existing index after a crawl; i.e. calculating the shortest clickpath. Added a field to control this operation but not a method to operate on this. - added a visualization of the clickpath length in the host browser	12 years ago
Michael Peter Christen	6861af87e2	removed warnings	12 years ago
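
The click depth described in commits 5c0c56cfe1 and 6f0baaa309 is the number of clicks needed to reach a page from the root page, and the post-processing step exists to find the shortest such path. As a rough illustration only, the sketch below computes shortest click depths with a breadth-first search over a forward link graph. This is a swapped-in technique, not YaCy's actual implementation (which, per the commit message, backtracks through a reverse link index), and it is written in Python rather than YaCy's Java. The function name `shortest_click_depths` and the in-memory `links` dictionary are assumptions made for the example.

```python
from collections import deque


def shortest_click_depths(links, root):
    """Return the minimal number of clicks from root to every reachable URL.

    links is a hypothetical in-memory link graph: a dict mapping a URL to
    the URLs it links to. It stands in for a real link index.
    """
    depths = {root: 0}
    queue = deque([root])
    while queue:
        url = queue.popleft()
        for target in links.get(url, ()):
            # Breadth-first order means the first visit is the shortest path.
            if target not in depths:
                depths[target] = depths[url] + 1
                queue.append(target)
    return depths
```

A page that is linked both from a deep page and from a page near the root gets the smaller depth, which corresponds to the 'shortcut' case the post-processing is meant to catch.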

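The www-prefix relaxation from commit c3d50d91f8 can be illustrated with a small sketch: a site operator query matches both the www and non-www form of a host, and the host navigator groups both forms under the name without the prefix. The helper names `site_host_variants` and `navigator_key` are hypothetical, and the code is a Python illustration under those assumptions, not the YaCy source.

```python
def _strip_www(host):
    """Lower-case the host and drop a leading 'www.' if present."""
    host = host.lower()
    return host[4:] if host.startswith("www.") else host


def site_host_variants(host):
    """Hosts that a site operator query for the given host should match."""
    bare = _strip_www(host)
    return {bare, "www." + bare}


def navigator_key(host):
    """Key under which the host navigator accumulates result counts."""
    return _strip_www(host)
```

With these helpers, `site_host_variants("www.example.org")` and `site_host_variants("example.org")` yield the same set of two hosts, so either spelling of the query finds the same documents.
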

9544 Commits (0c1a018bbde9c9e67bc000b6a3fd8dbd1706a6f3)