yacy_search_server

Commit Graph

Author	SHA1	Message	Date
reger	61f42a7928	fix formatting issue in search result display if description contains html code noticed e.g. for id=NmNdJ9uApLaQ http://hswong3i.net/blog/hswong3i/virtualmin-drupal-7-x-ubuntu-12-04-howto	10 years ago
Michael Peter Christen	6578ff3ddb	enhanced suggest function	10 years ago
reger	ab98f69592	fix: searchoption hint for heuristic	10 years ago
Michael Peter Christen	974d58b01f	IPv6 Fix for push interface	10 years ago
Michael Peter Christen	fe50e5aef6	fix for failed selection of terms in faceted search with vocabularies	10 years ago
Michael Peter Christen	1309619a71	remove remote indexing option in crawl start if not in p2p mode	10 years ago
Michael Peter Christen	6324db1213	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	10 years ago
reger	5cb05c3013	adjust table column width to not line wrap crawler traffic line	10 years ago
Michael Peter Christen	606d00c8f2	cloning a crawl now accepts the class name of vocabulary scapers	10 years ago
reger	11b21308c0	fix: malformed filename in image search fix for http://mantis.tokeek.de/view.php?id=533	10 years ago
reger	9e1ec5fec4	refactor: just some more usages of constant for term ":[* TO *]"	10 years ago
Michael Peter Christen	b5ac29c9a5	added a html field scraper which reads text from html entities of a given css class and extends a given vocabulary with a term consisting of the text content of the html class tag. Additionally, the term is included into the semantic facet of the document. This allows the creation of faceted search to documents without the pre-creation of vocabularies; instead, the vocabulary is created on-the-fly, possibly for use in other crawls. If any of the term scraping for a specific vocabulary is successful on a document, this vocabulary is excluded for auto-annotation on the page. To use this feature, do the following: - create a vocabulary on /Vocabulary_p.html (if not existent) - in /CrawlStartExpert.html you will now see the vocabularies as column in a table. The second column provides text fields where you can name the class of html entities where the literal of the corresponding vocabulary shall be scraped out - when doing a search, you will see the content of the scraped fields in a navigation facet for the given vocabulary	10 years ago
Michael Peter Christen	68c605d637	replace with CommonPattern.SPACE for split	10 years ago
Michael Peter Christen	1f5047b15f	using precompiled pattern CommonPattern.SEMICOLON for splits	10 years ago
Michael Peter Christen	a8a2b7a803	persistency for vocabulary facet switch	10 years ago
Michael Peter Christen	efbc9a3561	introducing a new getConfig method which parses comma-separated lists from setting fields; refactoring for all places where such lists are parsed	10 years ago
Michael Peter Christen	69eacdf4eb	applying precompiled CommonPattern.COMMA.split to all places where split(",") was used	10 years ago
Michael Peter Christen	5a060c9f26	refactoring of reindexSolr (just replaced constant string)	10 years ago
Michael Peter Christen	3d717b749a	fix for urlmaskfilter	10 years ago
Michael Peter Christen	2636582435	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	10 years ago
reger	0260d3d800	Allow to hide linkstructure graphic in crawl monitor using/setting the config param DECORATION_GRAFICS_LINKSTRUCTURE	10 years ago
Michael Peter Christen	bee5ee7cce	removed some warnings	10 years ago
Michael Peter Christen	6390454652	fix for vocabulary on/off setting	10 years ago
Michael Peter Christen	29f6e9db7a	write java version to status page	10 years ago
Michael Peter Christen	7db2888336	fixed font size and print page generation in pdf snapshots	10 years ago
reger	24f68a4eb7	refactor opensearch heuristic introduce FederateSearchManager handling search heuristic to external systems via specific FederateSearchConnectors, which provide the query() functionality, the translation to YaCy schema .toYaCySchema() and the search() routine to deliver results to searchevents, which is generally implemented in Abstract connector. The manager enforces now a min 15s delay between calls to external systems. Besides the OpensearchConnector a SolrFederateSearchConnector is available. It uses an additional config file for fieldname translation. default heuristicopensearch.conf: - openbdb.com removed - seems to no longer deliver results - config via solrconnector to datacite.org added (large technical library archive)	10 years ago
Michael Peter Christen	3b51636ecb	fix for mediawiki import	10 years ago
Michael Peter Christen	8cafdb989a	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	10 years ago
reger	4214f250d0	Add option for extended search (Autosearch) to Bookmark.html asking all connected peers for the searchterm added as description to the bookmark created by the bookmark icon. Intended for searches/research projects with not sufficient results from local and DHT selected remote target peers. Function: the process checks newly created bookmarks for description starting with "query=..." and takes this to ask every peer for 20 search results and adds it to the local index in a background job. link to start/stop the process added to /Bookmarks.html	10 years ago
reger	bb37cb32e4	Add title import for bookmark icon if avail in index	10 years ago
reger	8e751d754a	- add javadoc to busythread with hint about the init parameter usage - remove obsolete 10_httpd config parameter	10 years ago
Michael Peter Christen	0871e43fcc	better scale	10 years ago
Michael Peter Christen	35c24608cc	fix for division by zero (rare cases)	10 years ago
reger	4eb89d7f15	revert clickservlet (default was indeed a mistake)	10 years ago
reger	ebe5faeb01	added url to bookmark icon link url is anyway needed, saves index lookup and works w/o commited url. Removed unused order parameter	10 years ago
reger	d44d8996d0	Added a “don't store remote search results” option This is intended for peers who want to participate in the P2P network but don't wish to load/fill-up their index with metadata of every received search result. The DHT transfer is not affected by this option (and will work as usual, so that a peer disabling the new store to index switch still receives and holds the metadata according to DHT rules). Downside for the local peer is that search speed will not improve if search terms are only avail. remote or by quick hits in local index. To be able to improve the local index a Click-Servlet option was added additionally. If switched on, all search result links point to this servlet, which forwards the users browser (by html header) to the desired page and feeds the page to the fulltext-index. The servlet accepts a parameter defining the action to perform (see defaults/web.xml, index, crawl, crawllinks) The option check-boxes are placed in ConfigPortal.html	10 years ago
reger	d729386787	fix NPE in viewimage Caused by: java.lang.NullPointerException at net.yacy.peers.graphics.EncodedImage.<init>(EncodedImage.java:73) at ViewImage.respond(ViewImage.java:156)	10 years ago
reger	4ff018c9e4	fix ConfigPortal jumps to iframe focus add focus parameter to yacysearch.html too	10 years ago
Michael Peter Christen	5b810f6d70	Merge branch 'master' of gitorious.org:yacy/whitrs-rc1	10 years ago
Ryszard Goń	3cdbd5f5c6	Fix for progress table background not resizing when the post-processing started/ended.	10 years ago
reger	0dfeee154a	adjustments for Bookmark icon to act on BookmarkDB, it acts on YMarks but YMark interface seems not maintained, for future features (e.g. query memory) BookmarkDB is the likely choice to expand, besides the crawlstart bookmark also the result bookmark icon now adds to BookmarkDB. The YMark related code is (for now) left untouched so both tables are updated.	10 years ago
Michael Peter Christen	513e9259f5	Merge branch 'master' of git@gitorious.org:yacy/rc1.git	10 years ago
reger	e177d69387	remove obsolete config footer option (ConfigPortal user.login) no footer or footer-option in use remove unused yacy.init item allowUnlimitedReceiveIndexFrom	10 years ago
Michael Peter Christen	5d4167f977	reactivated clear stacks code for termination of all crawls because this did not work without that part of the code	10 years ago
Michael Peter Christen	ecb6a59e9e	do not translate gif images into png images for thumbnails. Instead, stream the original to the search result thumb viewer. This has two reasons: - animated gifs cause 100% cpu and deadlocks in the jvm gif parser; a known bug which is obviously not yet fixed - animated gifs now appear in the search result also as animation	10 years ago
Michael Peter Christen	d9603039ff	automatically set the Q flag for smb/ftp start urls (split pdf support)	10 years ago
Michael Peter Christen	8600ea01dd	automatically switch on query option in case intranet protocols (smb/ftp) are used. This supports the new split-pdf option.	10 years ago
Ryszard Goń	3144313974	Postprocessing progress bar fix (Make it work as [probably] actually intended)	10 years ago
reger	7e4e9f7e32	improve yacysearchitem, prevent allocation of String (modifyURL) if feature not used	10 years ago
Michael Peter Christen	8ef56eda90	Merge branch 'master' of git@gitorious.org:yacy/rc1.git	10 years ago
Michael Peter Christen	9fce8bf2a5	crawling of multi-page pdfs with artificial post part on smb or ftp shares is not possible with the disabled setting; this is now temporarily disabled until a better solution is at hand.	10 years ago
reger	682dd94925	fix div by 0 in hello Caused by: java.lang.ArithmeticException: / by zero at hello.respond(hello.java:159)	10 years ago
Michael Peter Christen	003ec43bee	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	10 years ago
Michael Peter Christen	bef689d0a2	NPE fix	10 years ago
reger	1de33c6a53	add hint to Heuristics Config on "Greedy Learning Mode" in portal config, to point to an option to make this setting permanent.	10 years ago
Michael Peter Christen	84e2cccab4	fix to prevent assertion error in ranking servlet if no vocabularies are present that could be evaluated	10 years ago
Michael Peter Christen	9e588944fa	prevent NPE during initialization of very large vocabularies	10 years ago
Michael Peter Christen	aaf7d4775a	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	10 years ago
Michael Peter Christen	85773ebd4f	removed debug lines	10 years ago
reger	198102304b	refactor size() -> filesize() of URIMetadataNode (harmonize with ResultEntry and to not get confused with Collection.size())	10 years ago
Michael Peter Christen	445fafeb7c	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	10 years ago
Michael Peter Christen	0d69089c61	fix for division by zero	10 years ago
reger	ac61a39828	use peeraddress for link in remote crawl list to make link work without enabled proxy upd pom for Jetty (missing in last commit)	10 years ago
Michael Peter Christen	5516819354	preventing the use of no-cache and expires in case that images are generated dynamically which will stay static in the future. This applies mainly to the search result favicon in front of search hits. These icons will now be generated once, but then cached in the browser. There is also a YaCy-internal cache for these icons which had prevented the re-generation of the icons in YaCy, but this cache is now superfluous since the browser should not call the servlet ViewImage again.	10 years ago
Michael Peter Christen	d3e71ed070	fixes for searches when initialization of large autotagging libraries have not been finished	10 years ago
Michael Peter Christen	28683530cd	fixes to usage of no-cache: use and recognize also the no-store directive	10 years ago
Michael Peter Christen	932faafffe	reactivated on-demand snapshot loading	10 years ago
Michael Peter Christen	2362ad7c34	fix for a count issue in snapshot api	10 years ago
Michael Peter Christen	9971e197e0	Added a transaction interface to the snapshots: all documents in the snapshots can now be processed with transactions using commit and rollback commands. Furthermore, a large number of monitoring methods had been added to check the success of transactions. The transactions for snapshots have two main components: a rss search API to get information about latest/oldest entries and a commit/rollback API to move entries away from the rss results. This is done by usage of two storage locations for the snapshots, INVENTORY and ARCHIVE. New snapshots are placed to INVENTORY, committed snapshots move to ARCHIVE, rollback snapshots move to INVENTORY again. Normal Workflow: Beside all these options below, usually it is sufficient to process data like this: - call http://localhost:8090/api/snapshot.rss?state=INVENTORY&order=LATESTFIRST - process the rss result and use the <guid> value as <urlhash> (see next command) - for each processed result call http://localhost:8090/api/snapshot.json?command=commit&urlhash=<urlhash> - then you can call the rss feed again and the committed urls are omitted from the next set of items. These are the commands to control this: The rss feed: http://localhost:8090/api/snapshot.rss?state=INVENTORY&order=LATESTFIRST http://localhost:8090/api/snapshot.rss?state=INVENTORY&order=OLDESTFIRST http://localhost:8090/api/snapshot.rss?state=INVENTORY&order=ANY http://localhost:8090/api/snapshot.rss?state=ARCHIVE&order=LATESTFIRST http://localhost:8090/api/snapshot.rss?state=ARCHIVE&order=OLDESTFIRST http://localhost:8090/api/snapshot.rss?state=ARCHIVE&order=LATESTFIRST The feed will return a <urlhash> in the <guid> - field of the rss. This must be used for commit/rollback: Commit/Rollback: http://localhost:8090/api/snapshot.json?command=commit&urlhash=<urlhash> http://localhost:8090/api/snapshot.json?command=rollback&urlhash=<urlhash> The json will return a property list containing the property "result" with possible values "success" or "fail", according to the result. If a "fail" occurs, please look into the log for further info. Monitoring: http://localhost:8090/api/snapshot.json?command=status This shows the total number of entries in the INVENTORY and the ARCHIVE http://localhost:8090/api/snapshot.json?command=list This will return a list of all hosts which have snapshots and the number of entries for the hosts. Counts for INVENTORY and ARCHIVE are listed in the properties for "count.INVENTORY" and "count.ARCHIVE" http://localhost:8090/api/snapshot.json?command=list&depth=2 The list can be restricted to such which have a specific depth. The list contains then the same host names, but the count values change because only documents at that specific crawl depth are listed http://localhost:8090/api/snapshot.json?command=list&host=yacy.net.80 This lists all urlhashes for the given host, not only an accumulated list of the number of entries http://localhost:8090/api/snapshot.json?command=list&host=yacy.net.80&depth=0 This restricts the list of urlhashes for that host for the given depth http://localhost:8090/api/snapshot.json?command=list&state=INVENTORY http://localhost:8090/api/snapshot.json?command=list&state=ARCHIVE This selects either the INVENTORY or ARCHIVE for all list commands, default is ALL which means that from both snapshot directories the host information is collected and combined. You can use the state option for all the commands as listed above Detailed Information: http://localhost:8090/api/snapshot.json?command=metadata&urlhash=upiFJ7Fh1hyQ This collects metadata information for the given urlhash. This can also be restricted with state=INVENTORY and state=ARCHIVE to test if the document is either in one of these snapshot directories. If an urlhash is not found, an empty result is returned. If an entry was found and the state was not restricted, then the result contains a state property containing the name of the location where the document is, either INVENTORY or ARCHIVE. Hint: If a very large number of documents is inside of INVENTORY, then it could be better to call the rss feed with http://localhost:8090/api/snapshot.rss?state=INVENTORY&order=ANY because that is very efficient.	10 years ago
reger	6c3f36def1	- fix path to default heuristic.cfg - deprecate unused ProxyServlet	10 years ago
Michael Peter Christen	c3c2b6999b	fixes on wkhtmltopdf	10 years ago
Michael Peter Christen	ff035a20e7	fix for vocabulary import (double term detection)	10 years ago
Michael Peter Christen	e6650050fe	fix for Is Facet checkbox	10 years ago
Michael Peter Christen	bd3ed5cae5	added charset detection to vocabulary reader	10 years ago
Michael Peter Christen	7bfc5b80cb	added new options to vocabulary editor: - new switch 'isFacet' which causes that the usage of the vocabulary for search facets is enabled or disabled. This shall be used for large vocabularies since searches in solr are extremely slow if facets for a large set of alternative terms are generated - new option to disable auto-enrichment from synonyms - new option to add synonyms from another column when importing from csv - automatically recognize double-occurrences in synonyms and bundling terms for such synonyms	10 years ago
Michael Peter Christen	8df8ffbb6d	enhanced the snapshot functionality: - snapshots can now also be xml files which are extracted from the solr index and stored as individual xml files in the snapshot directory along the pdf and jpg images - a transaction layer was placed above of the snapshot directory to distinguish snapshots into 'inventory' and 'archive'. This may be used to do transactions of index fragments using archived solr search results between peers. This is currently unfinished, we need a protocol to move snapshots from inventory to archive - the SNAPSHOT directory was renamed to snapshot and contains now two snapshot subdirectories: inventory and archive - snapshots may now be generated by everyone, not only such peers running on a server with wkhtmltopdf installed. The expert crawl starts provides the option for snapshots to everyone. PDF snapshots are now optional and the option is only shown if wkhtmltopdf is installed. - the snapshot api now provides the request for historised xml files, i.e. call: http://localhost:8090/api/snapshot.xml?urlhash=Q3dQopFh1hyQ The result of such xml files is identical with solr search results with only one hit. The pdf generation has been moved from the http loading process to the solr document storage process. This may slow down the process a lot and a different version of the process may be needed.	10 years ago
Michael Peter Christen	4111d42c81	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	10 years ago
Michael Peter Christen	793ce6d13b	added confirmation dialogs for row deletion	10 years ago
Michael Peter Christen	cdc21d43b1	more robustness for broken table data in Table_API_p.html -- see bug report http://mantis.tokeek.de/view.php?id=495	10 years ago
reger	1d3ea35d69	prevent NPE on host link for too short HeuristicCfg.OpenSearchURL	10 years ago
Michael Peter Christen	a95af11050	enhancement for clearing the crawl queue	10 years ago
reger	5f0bb1214f	modified FieldReIndex to reindex queries with low number of documents first by internally using a score map with number of documents as score and working through the list from low to high.	10 years ago
Michael Peter Christen	d97deb5555	npe fix	10 years ago
Michael Peter Christen	4fe4bf29ad	added rss feed output to snapshot servlet which can be used to get a list of latest/oldest entries in the snapshot database. This is an example: http://localhost:8090/api/snapshot.rss?depth=2&order=LATESTFIRST&host=yacy.net&maxcount=100 The properties depth, order, host and maxcount can be omitted. The meaning of the fields is: host: select only urls from this host or all, if not given depth: select only urls at that crawl depth or all, if not given maxcount: select at most the given number of urls or 10, if not given order: either LATESTFIRST to select the youngest entries, OLDESTFIRST to select the first entries or ANY to select any The rss feed needs administration rights to work, a call to this servlet with rss extension must attach login credentials.	10 years ago
reger	d6539ba597	Merge origin/master	10 years ago
reger	ff18129def	ViewFile servlet: update index if newer, so viewed text and metadata (stored) info is similar - to achieve it, use request with profile to allow indexing (defaultglobaltext) and update index (the resource is loaded, parsed anyway, so it's not an expensive operation) Request: remove 2 unused init parameter - number of anchors of the parent - forkfactor sum of anchors of all ancestors	10 years ago
Michael Peter Christen	d83de9ecf5	added another path for the convert command because on older Macs ImageMagick has a different installation location	10 years ago
Michael Peter Christen	226aea5914	added a servlet which can create preview images, preview thumbnails and preview pdfs from web pages, i.e.: http://localhost:8090/api/snapshot.png?url=http://yacy.net/en/&width=128&height=128 http://localhost:8090/api/snapshot.jpg?url=http://yacy.net/en/&width=128&height=128 http://localhost:8090/api/snapshot.pdf?url=http://yacy.net/en/ This supports also an on-the-fly generation of the preview documents if the user is an administrator. Otherwise, the servlet fails. To enable this, you must add wkhtmltopdf, imagemagick and (on headless servers) xvfb to your operating system. for detailed instructions, see `97f6089a41`	10 years ago
Michael Peter Christen	181911376c	showing list of all thread in threaddump using the ThreadMXBean counter (this obviously show more threads than before?)	10 years ago
Michael Peter Christen	64887f6b21	show number of threads on status page	10 years ago
Michael Peter Christen	6f0167fac1	get cloned crawl start parameter for snapshots	10 years ago
Michael Peter Christen	97f6089a41	YaCy can now create web page snapshots as pdf documents which can later be transcoded into jpg for image previews. To create such pdfs you must do: Add wkhtmltopdf and imagemagick to your OS, which you can do: On a Mac download wkhtmltox-0.12.1_osx-cocoa-x86-64.pkg from http://wkhtmltopdf.org/downloads.html and download http://cactuslab.com/imagemagick/assets/ImageMagick-6.8.9-9.pkg.zip In Debian do "apt-get install wkhtmltopdf imagemagick" Then check in /Settings_p.html?page=ProxyAccess: "Transparent Proxy" and "Always Fresh" - this is used by wkhtmltopdf to fetch web pages using the YaCy proxy. Using "Always Fresh" it is possible to get all pages from the proxy cache. Finally, you will see a new option when starting an expert web crawl. You can set a maximum depth for crawling which should cause a pdf generation. The resulting pdfs are then available in DATA/HTCACHE/SNAPSHOTS/<host>.<port>/<depth>/<shard>/<urlhash>.<date>.pdf	10 years ago
Michael Peter Christen	41d00350e4	moved network configuration to Use Case submenu; this is necessary because the definition of portal peers within the YaCy freeworld network is otherwise split into two different main menus.	10 years ago
reger	221f86dd5e	position api icon (ViewFile.html)	10 years ago
Michael Peter Christen	ad0da5f246	added new web page snapshot infrastructure which will lead to the ability to have web page previews in the search results. (This is a stub, no function available with this yet...)	10 years ago
reger	c475be2937	fix (enable) error msg on empty query	10 years ago
reger	f709132961	remove obsolete alternate link fix api link	10 years ago
Michael Peter Christen	3c71e1c872	show vocabularies in search result (in case of debugging)	10 years ago
Michael Peter Christen	2fce2e2697	larger boost fields for ranking	10 years ago
Michael Peter Christen	6c03ff8355	bold words in snippets should not be coloured black in the base style because there are styles with dark backgrounds which make the bold word invisible	10 years ago
Michael Peter Christen	c0f9f6ac66	added option to change the navbar-default, i.e. usable for dark skins	10 years ago
Michael Peter Christen	84763126e0	added option to make the YaCy proxy act as if the cache is never stale. If set to 'Always Fresh' the cache is always used if the entry in the cache exists. This is a good way to archive web content and access it without going online again in case the documents exist. To do so, open /Settings_p.html?page=ProxyAccess and check the "Always Fresh" checkbox. This is set to false by default, which behaves as before. If you set this to true, then you have your web archive in DATA/HTCACHE. Copy this to carry around your private copy of the internet!	10 years ago
Michael Peter Christen	5bb52f79be	reduce number of calls to queue.size() because that may be a bottleneck during crawling	10 years ago
Michael Peter Christen	092d97d7ac	when importing vocabulary csv files, accept also files without semicolon and truncate quotes from literals	10 years ago
Michael Peter Christen	ee9ec40048	added hints to ranking to make ranking boosts using vocabularies easier	10 years ago
Michael Peter Christen	70f03f7c8e	do not cache search requests to Solr if the result is used for doublechecking. If a double-check comes from cached results the doublecheck fails.	10 years ago
Michael Peter Christen	a0b84e4def	use a LinkedHashMap for facets to maintain facet order as given by solr	10 years ago
Michael Peter Christen	0dc6e0a5f2	added option to enrich vocabularies with synonyms from synonym database	10 years ago
Michael Peter Christen	6a2a669db4	added loading of the synonyms file from addon/synonyms into the knowledge loader	10 years ago
Michael Peter Christen	fdba8e2fa0	fix for 2-day network stats table: showing 48 instead of 24 hours from peer history	10 years ago
Michael Peter Christen	ec9d021568	added option in vocabulary editor to import CSV files with different encodings (preselected windows-type character encoding which is typical for CSV files). Fixed also other problems with character encoding in dictionary files. Automatically generated vocabularies are now also noted in the API steering.	10 years ago
reger	b558433211	adjust tag cloud font size calculation to limit max font size to ~ TOPWORDS_MAXSIZE	10 years ago
Michael Peter Christen	0550b54d56	added fix to postprocessing: avoid caching of postprocessing collection to always get fresh lists of documents. This is necessary since the postprocessing changes the same documents which the postprocessing-collection query selects.	10 years ago
Michael Peter Christen	68e8039fd1	added high-precision scheduler for API processes. This also allows making the execution dependent on available RAM or CPU load. The default value for CPU load is 4.0 and the check runs once a minute.	10 years ago
Michael Peter Christen	0a879c98e7	added new 'firstSeen' database table and necessary data structures which hold a date for each URL to record when a url was first seen. This is then used to overwrite the modification date for urls upon recrawl in case that the first-seen date is before the latest document date. This behaviour is necessary due to the common behaviour of content management systems which attach always the current date to all documents. Using the firstSeen database it is possible to approximate a real first document creation date in case that the crawler starts frequently for the same domain. As a result the search results ordered by date have a much better quality and the usage of YaCy as search agent for latest news has a better quality.	10 years ago
Michael Peter Christen	487a733c99	fix for catchall handling in search	10 years ago
sixcooler	33b0234454	added a input-field for setting 'fileHost' Set this to avoid error-messages like 'proxy use not allowed / granted' on accessing your Peer by its hostname.	10 years ago
Michael Peter Christen	1db476c67e	fix for bad table iteration	10 years ago
Michael Peter Christen	e05b7332b9	html fix	10 years ago
reger	c1ad265efd	remove not used accordion javascript call for facet navs	10 years ago
Michael Peter Christen	ecdfb35f09	added long variables to debug output in index browser	10 years ago
Michael Peter Christen	95d87f00b3	fix for bad query generation in doublecheck in postprocessing	10 years ago
orbiter	a2b5cfb3cf	added reverse button to tables, by default on now (to see latest entries first)	10 years ago
orbiter	fceac5d2d4	added (missing) Tables_p.xml for table xml api	10 years ago
orbiter	dbafd4865e	enhanced debug code in host browser	10 years ago
Michael Peter Christen	8f6587e87b	fix for broken protocol navigation	10 years ago
Michael Peter Christen	5c962dd009	better scaling of network statistic graphs	10 years ago
orbiter	3ffe19b85c	replaced old /api/table_p.xml servlet with /Tables_p.xml to avoid double code	10 years ago
Michael Peter Christen	b4585e9546	added new index size history image in /Status.html page	10 years ago
Michael Peter Christen	9aebbbebc0	added network history in /Network.html?page=5	10 years ago
Michael Peter Christen	26279b0993	added debug code for statistics about document attributes related to domains	10 years ago
reger	d65e3f2b53	RankingSolr: display only available or configured boost fields	10 years ago
Michael Peter Christen	4e56d79fc8	replaced input text field with text field for index deletion with query and replaced GET with POST method. This should make it possible to submit here very large queries for deletion.	10 years ago
orbiter	6f707b4305	removed spaces in seedlist.xml to reduce data	10 years ago
orbiter	78c9d31388	fix for bad json	10 years ago
Michael Peter Christen	8098a86f1d	ipv6 fix for api /yacy/seedlist.[json|xml], multiple IPs are now attached to the seed info. API clients must be adapted. Documentation will be fixed in http://www.yacy-websuche.de/wiki/index.php/Dev:APIseedlist Also added a new retrieval option for seeds, they can now be retrieved by their name with the get parameter name=<name>	10 years ago
Michael Peter Christen	07c5b57953	removed warnings	10 years ago
Michael Peter Christen	509eba2484	automatically zoom to location/POI	10 years ago
orbiter	fa2ad101ec	enhanced graphics computation (avoiding long string parsing for colours)	10 years ago
orbiter	ef813cec91	added proper copyright notice to OSM tiles presented at the search result page	10 years ago
Michael Peter Christen	1269e77dfa	enhanced location search	10 years ago
Michael Peter Christen	75b5f24be4	make browsing of file://z: - paths in index browser easier - this will now show the root paths on a shared drive	10 years ago
Michael Peter Christen	8ac3e9f890	fix for api icon in yacysearch_location.html	10 years ago
Michael Peter Christen	a1dd0ae62c	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	10 years ago
reger	f5967dfedf	add filter to citation page and a on/off button to display only sentences with citations, while maintaining the sentence number. Make the filtered list the default in search result citation link	10 years ago
Michael Peter Christen	f818f84adb	more ipv6 fixes	10 years ago
Michael Peter Christen	2c2b50e65d	refactoring (class name should start with uppercase letter)	10 years ago
Michael Peter Christen	14385057c2	added also the NetworkHistory servlet...	10 years ago
Michael Peter Christen	d8beafba3a	fix for values in CrawlProfileEditor table and xml; now the full profile is available in the xml.	10 years ago
Michael Peter Christen	ec95dfa2e6	fixed crawl profile xml result which did not show the correct crawl status.	10 years ago
Michael Peter Christen	8c1a89cb34	added another decoration flag to switch off network graphics in crawler monitor and index browser: decoration.grafics.linkstructure Please set this to false to remove the graphics from the interface.	10 years ago
Michael Peter Christen	764e4ed673	fixed appearance of RSS icon on search result page	10 years ago
Michael Peter Christen	9b1958e8ca	more ipv6 bugfixes	10 years ago
Michael Peter Christen	7817fc50c9	added a high cpu cycle monitor to PerformanceQueues	10 years ago
Michael Peter Christen	5082feb103	less volume for effect sounds	10 years ago
Michael Peter Christen	0bfc69b29b	more ipv6 bugfixes	10 years ago
Michael Peter Christen	a27563e5c3	removed the atmo sound clips because they had been too large	10 years ago
Michael Peter Christen	ae58b22f5b	ipv6 fixes for Network.html front page	10 years ago
Michael Peter Christen	e413beac04	fix for latest UPnP update	10 years ago
Michael Peter Christen	74957f3760	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	10 years ago
Michael Peter Christen	2a052f446a	Added an experimental audio feedback system. This is the first element of a new 'decoration' component which may hold switches for different external appearance parameters. The first switch in that context is decoration.audio (as usual in yacy.init). This value is set to false by default, that means the audio feedback element is switched off by default. To switch it on, set decoration.audio = true (using /ConfigProperties_p.html). You will then hear sounds for the following events: - remote searches - incoming dht transmissions - new documents from the crawler Sound clips are stored in htroot/env/soundclips/ which is done so because a future implementation will read these files using the http client and with configurable urls which will make it very easy for the user to replace the given sounds with own sounds.	10 years ago
Marc Nause	1e6e69bc40	Finished implementation of UPNP: *) will try other ports if YaCy standard ports are not available *) distinguish between internal and external port (not sure if this works 100%) Still to add: property in config to enter own external port (in case of manually configured NAT)	10 years ago
Michael Peter Christen	e1bc768f9d	more IPv6 bugfixes	10 years ago
Michael Peter Christen	961f06c0b6	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	10 years ago
reger	209e0f2fe8	allow url parameter in worktable apicall allow url=wwwl?param=a&param=b (with ?, & encoded) fix: http://mantis.tokeek.de/view.php?id=100 fix double adding of '&' in MultiProtocolURL.escape()	10 years ago
reger	3562b5e3a4	open rejected urls in new browser	10 years ago
reger	b0c87d8240	fix image search expand box, cut-off of 2nd capture line height tested with IE11 and Firefox 32 (change worked for both to show 2nd line without cutting off height) +fix charset parameter in metadataImageParser +update start errMsgTxt to "java 1.7"	10 years ago
reger	fa99b286cc	add html5 autofocus to query input field (leave onload untouched = redundant, for IE9 http://www.w3schools.com/tags/att_input_autofocus.asp) adjust Peer-to-Peer/ Privacy switch label to display "Peer-to-Peer" as 2nd switch option in active stealth mode	10 years ago
Michael Peter Christen	329262231f	unresolved pattern fix	10 years ago
Michael Peter Christen	528f583d72	ipv6 fixes	10 years ago
Michael Peter Christen	e4ccca9497	fix for xss bugs found by CTF365	10 years ago
Michael Peter Christen	247e626083	IPv6 host parsing bugfixes	10 years ago
Michael Peter Christen	fe917deb2d	when pinging other peers, be able to select the right IP option	10 years ago
Michael Peter Christen	65e6ae52fb	IPv6-enhanced Network monitoring page	10 years ago
reger	7c1707872b	search result showPicture update search parameter used parameter &cat=image is obsolete and returns no results - remove &cat=image and &cat=href references - remove &tenant= references (unused) Use contentdom=image and inurl: parameter to make showPicture link display something (open in new window because of used inurl modifier changes original query)	10 years ago
Michael Peter Christen	3073c69aee	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	10 years ago
Michael Peter Christen	6491270b3a	large IPv6 redesign of peer ping methods! removed preferred IPv4 in start options and added a new field IP6 in peer seeds which will contain one or more IPv6 addresses. Now every peer has one or more IP addresses assigned, even several IPv6 addresses are possible. The peer-ping process must check all given and possible IP addresses for a backping and return the one IP which was successful when pinging the peer. The ping-ing peer must be able to recognize which of the given IPs are available for outside access of the peer and store this accordingly. If only one IPv6 address is available and no IPv4, then the IPv6 is stored in the old IP field of the seed DNA. Many methods in Seed.java are now marked as @deprecated because they had been used for a single IP only. There is still a large construction site left in YaCy now where all these deprecated methods must be replaced with new method calls. The 'extra'-IPs, used by cluster assignment had been removed since that can be replaced with IPv6 usage in p2p clusters. All clusters must now use IPv6 if they want an intranet-routing.	10 years ago
reger	0ecbf32134	update to Jetty 9.2.3	10 years ago
reger	46afdf7d21	add link to thread pool settings in status panel	10 years ago
reger	54019313e7	fix NPE in ViewFile - show snippet on document not in index	10 years ago
reger	4873a2d3a4	adjust link to peer in Network list (www path obsolete)	10 years ago
orbiter	3ac31614a3	added option to reverse-sort YaCy tables (internal API change only)	10 years ago
Michael Peter Christen	6d3d4c4ea6	changed the concurrent enumeration of query results in such a way that it is now possible to get the results in two steps: - first retrieve all IDs as given for a query - then retrieve each document individually This was necessary for very large result sets where a query may run for hours and is possibly terminated by a solr-internal timeout. This occurs regularly during postprocessing and therefore this commit may fix unwanted postprocessing terminations.	10 years ago
reger	ed0d7a80d5	modify description for Field-Reindex to act only on local index http://mantis.tokeek.de/view.php?id=279	10 years ago
Michael Peter Christen	81f9b34da7	increased ability to search for all images on a single server within the p2p remote search	10 years ago
Michael Peter Christen	9b92685771	automatically add a wild card if only a search on a single domain is done. This makes it possible to search all documents on a single domain even if no search word is given. This is in particular interesting when searching for all images on a single domain.	10 years ago
Michael Peter Christen	abde89438b	fix for favicon	10 years ago
Michael Peter Christen	ca8b2bf099	removed www and welcome servlet, these had been demo servlets and are not needed any more	10 years ago
reger	5247d01cd4	implement a forward to remote peer link in P2P Network list Most links in Network.html are only available with transparent proxy = on, which is switched off by default, to make the provided links useable in default setup a small forward servlet added (goto_p.java), which takes the peer hash as parameter and forwards to current public ip (optional with path= parameter). The servlet is protected ( _p ending) to assure forwarding works only for authorized YaCy users.	10 years ago
reger	de7641023c	add recommended link "self" to atom feed output	10 years ago
Michael Peter Christen	805a95a98b	fix for http://mantis.tokeek.de/view.php?id=467	10 years ago
Michael Peter Christen	7527ae63e7	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	10 years ago
Michael Peter Christen	b93ea4e2a6	Added the option to retrieve only the own seed or a selected seed (selected by peer-hash) in the /yacy/seedlist.[json/xml] api. Added also an xml-version of the servlet. The own seed can be retrieved i.e. with http://localhost:8090/yacy/seedlist.xml?my= http://localhost:8090/yacy/seedlist.json?my= and any other peer can be selected with http://localhost:8090/yacy/seedlist.xml?id=<peerhash> http://localhost:8090/yacy/seedlist.json?id=<peerhash>	10 years ago
reger	b5e0f70197	- remove repositoryPath post from ConfigBasic (obsolete) - remove static snippetComputationTime from ResultEntry (not used)	10 years ago
Michael Peter Christen	ffc259c944	changed link to new tutorial repository (yes, Youtube..). The link does not point to youtube directly to prevent that the referer to the peer address is given to youtube. Instead, a forwarder address at yacy.net is used to redirect to the tutorial repository (and can be changed later).	10 years ago
Michael Peter Christen	b0bfafa581	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	10 years ago
Michael Peter Christen	1735dbc9d9	enhanced image search: bugfixes and performance enhancements	10 years ago
reger	1d5d0b82a6	- skip html template specific servlet post variables (show_xxx) for feeds, - add <updated> (in required format) to atom feed	10 years ago
reger	8ed6550261	adding totalResults and id to atom feed output	10 years ago
Michael Peter Christen	7611bf79bd	Merge branch 'master' of gitorious.org:yacy/icewindxs-rc1 Conflicts: locales/ru.lng	10 years ago
Michael Peter Christen	d3b000b089	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	10 years ago
reger	9962b9e548	use configured search items per page if not specified in post - remove verify=cacheonly from admin screen search box to use the configured values (otherwise definition overwrites configured behavior and is used for following searches what might give unexpected/confusing different results compared to using /yacysearch )	10 years ago
Michael Peter Christen	2a52c6f0f1	using htroot/api/blacklists as source folder: removed package declaration of some classes in that folder	10 years ago
Michael Peter Christen	f510fb82dd	css skin fix: visited was not set which caused bad colors on new-user menu design.	10 years ago
Michael Peter Christen	57ce7eeff3	fixed localhost authorization and replaced the adminRealm with an info string which is visible in the browser. That makes it possible that the browser instructs the user how to change a forgotten admin password (during runtime).	10 years ago
Michael Peter Christen	62f48a28d6	moved index administration up ahead of system administration to put more importance on it. People should not feel that it is more important to tweak any settings (which may break things) than to look into the index.	10 years ago
Michael Peter Christen	77b4c6dc5b	moved Table administration and Busy Queues Config out of mini-submenu of advanced settings to a top-menu entry. Moved the advanced setting to a less prominent place of the submenu. Removed the table administration from target analysis submenu because it appeared double, the table administration is now the default in the system administration. Sorry for inconvenience if I constantly move menus around, but this makes just more sense and YaCy is still not finished :)	10 years ago
Michael Peter Christen	c90ae191ab	moved cookie monitoring to the network monitoring submenu	10 years ago
orbiter	0947bea882	fixed wrong submenu title	10 years ago
orbiter	3ba47823cb	switched position of API steering and content semantic	10 years ago
reger	0ff66118bf	exclude nav-header/footer in ServerScannerList.html?embedded fixes display of header in yacyinteractive.html	10 years ago
orbiter	301961c4c1	small fix to the welcome message	10 years ago
orbiter	2dd4b274d4	update to kaskelix	10 years ago
orbiter	46efeb6ea2	Merge branch 'master' of git@gitorious.org:yacy/rc1.git	10 years ago
orbiter	60f97faec6	added hint to the search api examples to remind the user that these interfaces are examples and not actual variants of the current search	10 years ago
orbiter	f642cfbe30	added hint to the regular expression tester	10 years ago
orbiter	73ebf69ca7	changed style of info-icon to be similar to bootstrap.css glyphicons	10 years ago
reger	6654d314f1	add rss version to api/feed.rss IE11 reports error without	10 years ago
orbiter	cbb5f06630	do not remove the index deletion option from the IndexControlURLs_p.html servlet after a deletion happened, instead show but disable the option when the index is empty.	10 years ago
orbiter	73c2e47de3	added a confirmation dialog to complete index deletion	10 years ago
orbiter	688c6d8954	Merge branch 'master' of git@gitorious.org:yacy/rc1.git	10 years ago
orbiter	500e0b9927	fix for browsing of file paths in Index Browser	10 years ago
Marc Nause	2af56fa37d	Improved UPnP. (still not perfect) *) set HTTPS port if enabled *) improved data structures (may not be final) *) moved UPnP to own package	10 years ago
orbiter	b3ebd38079	removed the HTDOCS repository concept because the concept to host files on the YaCy http server is obsolete; YaCy can index file:// and smb:// paths	10 years ago
orbiter	5611d45b65	renamed Host Browser to Index Browser (gives a better hint what it actually does)	10 years ago
reger	ec5b1d9e33	let NETWORK_WHITELIST take precedence over NETWORK_BLACKLIST this makes it easier to config exception (for private networks), like blacklist= .* whitelist= 10\..,127\.. ..... allows only listed ip pattern	10 years ago
reger	70bb3d1b38	update target url for yacy-portalsearch.html to search.yacy.net (peer yacportalsearch2014) (old www.yacy-suche.de not reachable)	10 years ago
reger	29ccbf6491	seedUploadUrl config is lost on restart if no publish event occurred -add a saveMySeed() on uploadurl changes (to keep url setting without retyping even if network down)	10 years ago
reger	e033e79826	remove old description for proxy port settings (Settings_p.html?page=ProxyAccess) - The options were not current (only port number accepted, which is part of ConfigBasic.html) - Deleted options and the port number input field from the proxyaccess page. - joined both transparent proxy setup pages (Settings_Http.inc & Settings_ProxyAccess.inc) in one page - adjustments to the related/linked pages	10 years ago
orbiter	e4e1bdeba0	added 0x40 to image of lockopen-gif image palette (light grey)	10 years ago
orbiter	7028a39abb	changed lock/unlock image design	10 years ago
orbiter	b4f2a1db6e	added an unlock icon for all protected pages that are unlocked because the administrator is logged in.	10 years ago
reger	7267c76881	set default "Search Interfaces"."Solr RSS/Opensearch" query to show latest 10 additions to index	10 years ago
reger	f76d81f5c9	fix: hanging text in input fields of WatchWebStructure_p.html in IE11	10 years ago
orbiter	cf9e7fdbb8	reverted template from latest cherry-picked commit	10 years ago
Alex	f6c7467a90	updated some french translations	10 years ago
reger	19e35a9126	add type attribute to atom feed <link> tag (for /yacysearch.atom)	10 years ago
reger	0a2f4a0e2f	eliminate lat/lon type conversion in osm (define as double)	10 years ago
Michael Peter Christen	01bbb20666	increased default logging line count to max	10 years ago
Michael Peter Christen	9bc3e457dd	fix for termination of all crawls	10 years ago
Michael Peter Christen	8d650ca225	added hint to port forwarding videos	10 years ago
reger	3963bca3b6	catch IndexControlRWIs_p error if RWI not connected	10 years ago
orbiter	2371d6b8db	target linktexts must be string to enable search facets on these fields	10 years ago
Michael Peter Christen	05d58e4df0	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	10 years ago
Michael Peter Christen	98f45c9032	fix for image alt attachment to AnchorURLs in html parser.	10 years ago
orbiter	22ce4fb4dd	better error handling for remote solr queries and exists-checks	10 years ago
orbiter	161a11070c	yacystats is gone :(	10 years ago
Michael Peter Christen	c115f3869c	enhanced snippet computation and test method in ViewFile	10 years ago
Michael Peter Christen	6e1dc444c3	added a snippet test function in ViewFile: you can now search for a specific word on the document; the servlet returns the snippet in the same way as it would be shown in a search result.	10 years ago
reger	29d1945c16	fix double &query parameter (index.html) ?query=word&query=	10 years ago

5349 Commits (caf9e98f09b144933c9f23840cebfc8b5739a931)