Commit Graph

4979 Commits (16327d1cbe39f054f3445b5c1a2628c9ca916351)

Author SHA1 Message Date
orbiter 0621a15f89 fix for wrong search result counter: added a counter for all filtered out entities
14 years ago
orbiter 9c33b2fb58 fix for String Matcher in case that no snippet is returned (NPE)
14 years ago
orbiter 76f2817e00 a fix for the snippet computation and hopefully better snippets
14 years ago
orbiter deda54d684 - relaxed matching of string-search (this is now case-insensitive)
14 years ago
orbiter 15e3a57b4e removed unused functions in condenser
14 years ago
orbiter 6e42d4de88 - added full-String search function: find things that match exactly what is quoted in the query
14 years ago
orbiter 8e10b82280 small fix for solr export
14 years ago
apfelmaennchen 8b8db2aaba YMarks: some small changes/fixes
14 years ago
apfelmaennchen 441035f1f4 YMarks: some improvements to flexigrid quick search on YMarks.html
14 years ago
orbiter 6fa439c82b - refactoring of robots
14 years ago
apfelmaennchen e7c2ea193b YMark:
14 years ago
orbiter e3d19d0a90 fix in Document inboundlinks/outboundlinks sorting
14 years ago
orbiter 4e8fa03514 added more attributes to html evaluation
14 years ago
orbiter 3b578a28ef some patches to prevent empty or bad IP information from being broadcast
14 years ago
orbiter 361841df16 another patch according to http://bugs.yacy.net/view.php?id=26#c36
14 years ago
orbiter 37fede9d30 better logic for proper seed ip recognition and better error messages
14 years ago
orbiter 8b95a26866 better magic
14 years ago
orbiter 2700a58e5a added a magic to the peer ping that will be used in case that the contacting peer requests that its reported IP shall be used for a back-ping. The back-ping now also returns the same magic, which makes it possible for the requested peer to verify that the back-pinged peer is actually the same peer.
14 years ago
orbiter 8879cc1db2 removed System.out.println
14 years ago
orbiter 528da7c9ea removed unused class and added license header for new class
14 years ago
orbiter f6077b3cc0 added more attributes for html parser and enhanced data structures
14 years ago
f1ori 0b02083e97 * function for simple crawl of one url
14 years ago
f1ori d671de8c17 add ranking weight to json-search-results
14 years ago
sixcooler 4eb9c1e7c3 not setting userAgent from Constructor as default for following calls
14 years ago
orbiter d8e934c085 better abstraction of http client identification
14 years ago
sixcooler a3e707283d not using HTTPConnector anymore
14 years ago
orbiter 9f1f47ec67 added some comments to explain the isLocal patch
14 years ago
orbiter b77b8cac0c - enhanced html parser: recognized much more details in the content
14 years ago
low012 bc84d2bc9d *) fixed typo in stop script
14 years ago
apfelmaennchen b2281f0b7d YMark: intermediate work towards flexigrid support
14 years ago
low012 06d50fd801 *) fixed stupid bug (introduced in r7663 by myself) which caused wrong parsing of Wiki pages
14 years ago
apfelmaennchen 60412d2bb3 YMark:
14 years ago
orbiter 3d5104d357 - fixed a bug in crawl start with file name (npe in new url)
14 years ago
orbiter fd3baa9025 fix for http://bugs.yacy.net/view.php?id=24
14 years ago
low012 2e9694c9e9 *) removed recursion which hopefully prevents exception
14 years ago
apfelmaennchen a2e86daae9 YMark: more bug fixes
14 years ago
apfelmaennchen 62855f9567 YMark: code clean up and some small fixes
14 years ago
apfelmaennchen 667e912b19 YMark:
14 years ago
apfelmaennchen a0e4960a4d YMark:
14 years ago
orbiter 958ff4778e enhanced location search:
14 years ago
sixcooler 8d63f3b70f just cosmetics - keeping my baby clean :-)
14 years ago
orbiter e402622584 removed httpclient-3.1 (this was added with last commit which was a mistake)
14 years ago
orbiter 19fd13d3bc Added federated index storage to solr.
14 years ago
orbiter c17d102bd8 enhanced speed for OrderedScoreMap inc method and size comparison in concurrent environments
14 years ago
orbiter b788182954 some enhancements to scoring speed
14 years ago
orbiter 01690eab86 fix for mediawiki importer and wikicode parser
14 years ago
orbiter c5352e6872 added new SearchResult class (to be used later)
14 years ago
orbiter 4c013d9088 more UTF8 getBytes() performance hacks
14 years ago
apfelmaennchen 78d6d6ca06 refactoring for ymarks
14 years ago
cominch 9ac02caf00 different initialization of empty variables in alternative constructor. This leads to wrong interpretation of user credentials, resulting in unnecessary "@" in front of host, and different urlhash values.
14 years ago
orbiter a47bdc405b better logging for robinson selection according to peer tag
14 years ago
orbiter cafcb1f9ed removed the DNS resolving for web structure computation from the indexing queue and placed it in a concurrent computation queue that does not block the crawler. Makes crawling faster and less DNS-speed-dependent
14 years ago
orbiter 57ce1fb491 reverted synchronization from SVN 7641
14 years ago
orbiter 17530ca7b5 fix for bug http://bugs.yacy.net/view.php?id=10
14 years ago
orbiter 7c8e764201 removed synchronization again...
14 years ago
orbiter 96c32e87b0 fixes to crawler and new user-agent crawl-delay handling
14 years ago
orbiter b2fe4b7b1a added a handling of appearances of yacy bot entries in robots.txt if this entry addresses the yacy peer
14 years ago
orbiter cb6f709a16 - enhancements in surrogate reading
14 years ago
low012 1ff9947f91 *) added new user right: extended search right (allows to define users who can query more results than anonymous users)
14 years ago
orbiter 564184909a enhanced the surrogate parser: better reading of UTF-8 characters
14 years ago
orbiter 156cf02703 - added an index constraint 'has location' to the condenser
14 years ago
orbiter 41b8d7f655 fix for url normalization (no backpath resolving in post parameters)
14 years ago
orbiter 0430a94eaa the location search shows now not re-evaluated locations but only such locations that are attached as metadata to web pages
14 years ago
orbiter 8412f8787d fix for http://bugs.yacy.net/view.php?id=8
14 years ago
orbiter 9b25d07295 - added geo information parsing to html parser
14 years ago
f1ori efcf37a953 * show info in log, if robots.txt is rejected due to wrong mime-type
14 years ago
lotus cbf87fe72f write PID to yacy.running
14 years ago
low012 16cd919795 *) fixed Exceptions which caused 500 error when entering invalid URL mask or invalid prefer mask, invalid masks are ignored, error message is displayed on yacysearch.html (what about yacysearch.rss and yacysearch.json?)
14 years ago
low012 1a24917cea *) fixed NPE which occurred when empty String was entered as search word
14 years ago
orbiter b1a8d0c020 enhancements to web cache and less strict caching rules
14 years ago
orbiter f3baaca920 - enhancements to DNS IP caching and crawler speed
14 years ago
low012 e7860b1239 *) <mode="Homer">D'oh!</Homer>
14 years ago
low012 82f1580a60 *) trying to fix ConcurrentModificationException
14 years ago
f1ori df71776929 * fix bug #7
14 years ago
low012 9f0286b380 *) fixed potential "java.lang.IllegalArgumentException: Illegal group reference" which occurred if special characters which are also used as metacharacters in regular expressions were used inside of <pre>...</pre> (see: http://veerasundar.com/blog/2010/01/java-lang-illegalargumentexception-illegal-group-reference-in-string-replaceall/)
14 years ago
orbiter 78d4c45d09 enhancement during search process: fast fail of search in case that all index feeders have terminated.
14 years ago
orbiter ba03ca8620 added more configuration options for search:
14 years ago
f1ori e0c7d490f9 * fix bug #6
14 years ago
orbiter a50f28e6e7 - fixed missing save operation for peer name change
14 years ago
orbiter 2b5f8585bf performance hack for Balancer and ip address parsing
14 years ago
orbiter b1d133b69f another enhancement to the ThreadDump function: better multiple dumps and filtering out of uninteresting dump parts
14 years ago
orbiter a35d513bd8 fix for not-deleted .gap and .idx files
14 years ago
orbiter a6935e7dc8 fix for active dns resolving: do not resolve in case that the dns server is not available (offline mode)
14 years ago
orbiter 859c99886c fix for multiple thread dump
14 years ago
orbiter 61acf55da4 avoided using a synchronized(this) for the hash computation to prevent the lock on the object from being (accidentally) stolen by another thread, and replaced this synchronization using the protocol object. Also made the protocol object final.
14 years ago
orbiter c2a968c23f fix for bug in formatting in ThreadDump
14 years ago
low012 2861d0888a *) simplified code; *) fixed potential NumberFormatExceptions
14 years ago
orbiter 078ecacf61 avoid synchronization in DigestURI hash requests
14 years ago
orbiter 1989ebc24b removed more warnings
14 years ago
orbiter 0324de1467 removed debug line
14 years ago
orbiter 1aba7869bf patch for Windows: do not use the thread lock feature from previous commit if used on Windows
14 years ago
orbiter 0a11727374 added new feature for Thread dump:
14 years ago
orbiter b62b79675b removed type cast warnings
14 years ago
orbiter a07a1a8b1e removed type cast warnings
14 years ago
orbiter 8edaccfedf removed unused variables
14 years ago
orbiter e6c3507b17 disabled some of the previous changes (did not work in openjdk)
14 years ago
orbiter f9e5c21083 update to thread dump logs
14 years ago
orbiter 8f11d3a5bb redesigned the ScoreMap classes:
14 years ago
orbiter a564230c48 more enhancements against blocked threads occurred in seed age evaluation (blocks httpd in some cases)
14 years ago
orbiter dc0db3550e avoid string conversion
14 years ago