Commit Graph

4456 Commits (2af8e337737b4e7cbc4b6394ef6a71f869894885)

Author SHA1 Message Date
orbiter fffb91447a fixed crawl queue delete function
14 years ago
orbiter b769cce433 - added a catch-all parser for all documents that cannot be parsed: they will contribute their document url to the search index only
14 years ago
orbiter 22453b13ad implemented local host address discovery as posted in
14 years ago
orbiter cc6499bf8d - added http://blekko.com as search heuristic (like scroogle). This was easy since they deliver their search results also as rss feed
14 years ago
orbiter a9f754c45f removed unused CR accumulation and distribution process
14 years ago
orbiter d4a1a1850b removed warnings
14 years ago
low012 3b5830b7d4 *) Fixed typo.
14 years ago
low012 9b3fae9496 *) cleaning up the code a little bit
14 years ago
orbiter 7bb4b001ed - view image files from cache
14 years ago
low012 e7552bd719 *) cleaning up the code a little bit
14 years ago
apfelmaennchen 737aaf6952 various small changes to ymarks
14 years ago
apfelmaennchen 8a50670546 some code clean up for the last post
14 years ago
apfelmaennchen 442497868d another step towards an auto tagging function for YMarks
14 years ago
f1ori 741a87a3e9 * make .yacy-domains crawlable (.yacy-domains are local domains, so only in custom networks/peers)
14 years ago
f1ori dca9e16f51 * don't index pages, which redirect, twice
14 years ago
low012 eb79b952ef *) cleaner code
14 years ago
low012 38fdf43587 *) renamed classes according to standard Java coding conventions
14 years ago
low012 025e3f4790 *) renamed classes according to standard Java coding conventions
14 years ago
low012 3b9aa0504e *) removed unused code
14 years ago
low012 db3db0fdb9 *) trying to make this class less confusing (probably failing)
14 years ago
apfelmaennchen 54e63b556e intermediate step for a YMark auto-tagging function based on word frequencies.
14 years ago
apfelmaennchen 403ee9c014 added a drill-down for metadata and word count to /api/ymarks/test_treeview.html
14 years ago
apfelmaennchen 11ae5b108e enabled rebuildIndex for /Table_YMark_p.html (rebuilds the tags and folders index)
14 years ago
apfelmaennchen 94a9be18a4 added a ymark table administration: /Table_YMark_p.html
14 years ago
apfelmaennchen 25339f93c7 more updates to ymarks
14 years ago
apfelmaennchen cdd65aca71 update to ymarks
14 years ago
apfelmaennchen 808edffaf6 ymarks
14 years ago
f1ori 2c539b514a * add domaincheck (local/global/domainlist) to urlcleaner
14 years ago
orbiter 117fc86b3d fix for http://forum.yacy-websuche.de/viewtopic.php?p=21199#p21199
14 years ago
orbiter 09badc697b - low-memory patch for crawler
14 years ago
orbiter becc463d8a enhanced did-you-mean
14 years ago
apfelmaennchen 43586a2ace an update to ymarks (please test if you wish):
14 years ago
orbiter 93c535d111 fixed http://forum.yacy-websuche.de/viewtopic.php?p=21113#p21113
14 years ago
orbiter 4c72885cba added a sitemap entry parser and loader for sitemaps
14 years ago
orbiter 790e0b1894 - enhanced index deletion in IndexControlRWIs_p: delete also robots.txt database and cache if demanded
14 years ago
apfelmaennchen f5324b27f2 more updates to the new bookmarks (ymarks)....
14 years ago
orbiter 445619f3ec added a submenu ConfigHTCache_p.html to set the size of the HTCache separately from the proxy configuration.
14 years ago
f1ori acd93b1b31 * add failsafe mechanism to domainlist retrieval
14 years ago
orbiter 70c95608d4 Added CORS Access header for yacysearch.rss output
14 years ago
lotus 18729351e7 upnp: hint for wrongly detected local ip address
14 years ago
f1ori def4253555 * add option to network definition to provide a domainlist (syntax like in blacklists)
14 years ago
orbiter ac6b503adf untar files without gzip decompression even if the file has gz extension. this is done when the decompression fails.
14 years ago
apfelmaennchen efe0667fdd more new bookmark (ymarks) code with experimental html and xbel import
14 years ago
mikeworks caabebf9be Fixed spelling mistake omiting -> omitting in debug messages in ConfigUpdate_p.java and Switchboard.java
14 years ago
orbiter 155d556568 - better memory protection
14 years ago
f1ori 7d8de34778 * add a bit of documentation to DigestURI, use DigestURI(string) instead of DigestURI(string, null)
14 years ago
orbiter 25a8e55bc9 more logging about bad seeds
14 years ago
orbiter 959b8c6fa0 - allow greater seed size
14 years ago
orbiter e103419a56 - removed <3 peers barrier for peer ping feedback
14 years ago
apfelmaennchen d0e6c03b51 some updates to the new bookmark code...
14 years ago
orbiter facfd204e9 added a parent configuration option.
14 years ago
orbiter e3964f2c31 better catch of network definition load error; continue with secondary network load definition location
14 years ago
low012 65a0381f76 *) cleaning up code (still not done)
14 years ago
orbiter e3e3b49d52 - enhanced main release recognition
14 years ago
apfelmaennchen 9c94ebdee4 small changes to new bookmark code...
14 years ago
apfelmaennchen 244b56e9d3 an update to the new bookmark code...
14 years ago
low012 dc40f51b8d *) added headlines as proposed by Vega
14 years ago
apfelmaennchen f035f257da added some more bookmark code...
14 years ago
low012 22ed9c380c *) fixed bug which was introduced in r7226 (shame on me) which made wiki unusable (all entries were stored with empty subject as key -> edits were lost)
14 years ago
f1ori 60fd2e549d * log failures when writing config file
14 years ago
orbiter 58e74282af added a word counter statistic in condenser which is used by the did-you-mean to calculate best matches for given search words.
14 years ago
orbiter 863065abc4 added user agent logging to access tracker
14 years ago
apfelmaennchen a79728b97d some updates to experimental bookmark code...
14 years ago
apfelmaennchen ef782cd026 and even more experimental bookmark code...
14 years ago
orbiter ed4371dcf3 enhanced navigation implementation and enhanced tag cloud computation
14 years ago
orbiter ca738ac924 - added a tag cloud to search results (using the topics)
14 years ago
apfelmaennchen 7aca763ca8 Some more experimental bookmark code...
14 years ago
apfelmaennchen 4270ed696c Experimental code (I need to transfer the code to my macbook, sorry) for the new bookmarks API based on the Tables concept (same as for crawl starts). Currently you can add a bookmark by api/ymarks/add_ymark.xml?url=http://www.yacy.net&title=YaCy and watch the result via the standard view Tables_p.html.
14 years ago
orbiter e4d561971e added more score cluster options and made score cluster usage more transparent
14 years ago
orbiter e8f90201a5 fix for scheduling of rss feeds
14 years ago
orbiter 7cd9d9d22a - enhanced DidYouMean computation using a faster count on index entries; this causes that results can be ranked better
14 years ago
orbiter de722090b5 enhancements in did-you-mean guessing
14 years ago
orbiter a59c885ee0 autocomplete and did-you-mean can now understand _all_ languages and can generate suggestions in all languages and character types
14 years ago
orbiter b7acd92ce4 Auto-Suggestions for YaCy Search:
14 years ago
orbiter 6a166c2040 patches for bad proxy behaviour
14 years ago
orbiter d607b30b6a performance enhancements for search and code review for database functions
14 years ago
orbiter 45b1ab3d07 custom + generic skins:
14 years ago
orbiter fcd40cd30f - disabled domZones (buggy, must think about better solution)
14 years ago
orbiter 0d363a94d7 more performance hacks
14 years ago
orbiter b8aee6d402 performance hacks for better search performance
14 years ago
orbiter 091dd3f6ec - enhanced intranet search speed
14 years ago
low012 b9f405d1e8 *) added comments
14 years ago
orbiter 6e6994e328 latest bugfixes to search and indexing function after test of demo presentation
14 years ago
orbiter aacf572a26 - enhancements for search speed
14 years ago
sixcooler 61c82f3105 gzip-compression @ transferRWI & transferURL back again
14 years ago
orbiter 2c549ae341 fixed a number of small bugs:
14 years ago
orbiter f6eebb6f99 replaced auto-dom filter with easy-to-understand Site Link-List crawler option
14 years ago
orbiter c60aed4435 no caching in browser of dynamic web pages sent by YaCy http
14 years ago
orbiter e63896f2a8 added an intranet scanner and a servlet which shows all intranet addresses and an option to start a site-crawl for all these addresses at once.
14 years ago
orbiter e54cb7fb0c more bugfixes (also for latest commit)
14 years ago
orbiter d2fd93135c - moved yacybot user agent string definition to MultiProtocolURI since there are basic access mechanisms where the bot string is needed
14 years ago
low012 afa708d552 *) added <s>...</s> tag to WikiCode -> works just as the HTML equivalent
14 years ago
orbiter a83186ac7d fix for bug in cytrails
14 years ago
orbiter 48c0d508ac fixes for crawling of smb links (file length not always available)
14 years ago
orbiter 0bc6284e27 - added bugfix for access tracker in case of concurrency conflicts
14 years ago
orbiter 10a9cb1971 simplified snippet computation process and separated the algorithm into two classes
14 years ago
lotus 4450c240b7 npe fix http://forum.yacy-websuche.de/viewtopic.php?f=6&t=2982
14 years ago
orbiter 84a023cbc8 fixed several search bugs
14 years ago
orbiter 97ee278931 enhanced search speed:
14 years ago
orbiter ee3820c9cc more logging for strange "java.lang.NoClassDefFoundError: de/anomic/http/server/RequestHeader" error
14 years ago
orbiter 377f001e0d sorting of crawl profile names in crawl profile editor, see
14 years ago
orbiter 3552476fbe terminated migration from apache httpclient-3.1 to 4.1:
14 years ago
orbiter a2f9974745 some redesign in the access tracker to realize sixcoolers question about "smartes way for deleting the first Object":
14 years ago
sixcooler 03f0414025 some minor correction of my last commit
14 years ago
sixcooler 42fa0eadb1 fix endless loop:
14 years ago
low012 5a9ea0308f *) further simplification of wiki code parser (less redundancy in code, less magic numbers), still not done with it...
14 years ago
orbiter 37baa8bae3 - fixes for concurrency exceptions and failed database integrity verification
14 years ago
orbiter 29fe401f93 - some layout and text enhancement for site crawl start
14 years ago
orbiter 461a2a6ec7 enhanced remote crawling:
14 years ago
orbiter 670ba4d52b - removed the remote crawl option from the network configuration submenu and
14 years ago
orbiter 89c2d8b81e better initial hash computation
14 years ago
orbiter 34e2f7f487 enhanced snippet fetch strategy: concurrent snippet fetch even for offline-snippet searches. This improves speed since it is now possible to fetch snippets offline and parsing of source files from the htcache can be enhanced using concurrency. This improves local and remote search.
14 years ago
orbiter 0cf006865e refactoring and enhanced concurrency
14 years ago
orbiter 83ac07874f - corrected return value of put() methods (not used anywhere, so it did not harm before)
14 years ago
orbiter 5702419194 fixed a bug in HTTPClient: keep-alive must be set to false, otherwise servers hold connections 2 seconds open until response.
14 years ago
orbiter 5870b13f3a - code cleanup / added debug line for further investigation in HTTPDemon.parseMultipart
14 years ago
orbiter ac1c08924e more performance hacks
14 years ago
orbiter 14c843d364 more performance hacks
14 years ago
orbiter 39f409a7bb performance hacks
14 years ago
orbiter 7ebef56add - redesign of a part of the remote search client to make it possible to have a test environment for remote search performance tests
14 years ago
orbiter 3c0e07ba72 removed all delays in shutdown process
14 years ago
orbiter 64860dc1bb enhanced search event logging (to be used for further improvements)
14 years ago
sixcooler 17eebd4ef8 counting crawler traffic again:
14 years ago
orbiter 32f73d1aaa added copy for Info.plist for Mac application release updates (this file contains class paths and start parameters)
14 years ago
orbiter 4c21d8dc9d - changed default values for online caution (the pausing may not be necessary any more)
14 years ago
orbiter 570ca577c6 performance hacks
14 years ago
orbiter 348dece62f redesign of the SortStack and SortStore classes:
14 years ago
orbiter 114bdd8ba7 fixed old sitemap importer which was not able to parse urls containing post elements
14 years ago
lotus 6a09f1f7e5 fix dedicated upnp testing
14 years ago
orbiter 5fe828fa06 - replaced pdfbox and fontbox version 1.1.0 with 1.2.1
14 years ago
orbiter c757a4aa9f - corrected lifetime computation for search events
14 years ago
orbiter fb828f3767 - performance enhancements in search response time using faster query ID computation and an ID cache
14 years ago
orbiter 22047ffad5 enhanced computation speed of many replaceAll string operations
14 years ago
orbiter e8228fba09 less locking in time format computation, caching and during secondary (remote) search evaluation
14 years ago
orbiter 9c0c94683c because of a bug in search result caching, count search results had not been generated as fast as possible.
14 years ago
orbiter fa2eb9676e removed unused class
14 years ago
low012 5f391fcfa9 *) cleaned up in wikiCode parser (more to be done)
14 years ago
orbiter b3f0d06444 fixed a problem with restarts in YaCy mac applications: the DATA directory path was not submitted when doing a restart. This solves the problem by:
14 years ago
orbiter d4e4967e19 cleaned up code in yacyRelease (there will be work to do there)
14 years ago
orbiter 1da5241c2d do not block server session if maximum number of sessions is reached, just try to clean up once
14 years ago
orbiter 5de70c3d7c changed way of storage for search requests:
14 years ago
orbiter 9d080f387e change in handling of the all-visible home path for storage in YaCy:
14 years ago
orbiter 65eaf30f77 redesign of crawl profiles data structure. target will be:
14 years ago
f1ori 55da979291 disable revision detection for git
14 years ago
orbiter 104318d58a - added nice colors to feed indexing state messages
14 years ago
orbiter 4f22e2df41 bugfixes for
14 years ago
orbiter 42414a6ae3 added two more tables in rss reader interface:
14 years ago
orbiter 0010cd9db1 Support for indexing of RSS feeds!
14 years ago
orbiter 0f276dd63f - MapHeap now implements Map<byte[], Map<String, String>>
14 years ago
orbiter c60d0282fd more abstraction for tables stored in heaps:
14 years ago