Commit Graph

731 Commits (c659310e89e8a4b4b2d1de0b13c67f604843ad1e)

Author SHA1 Message Date
orbiter e4d561971e added more score cluster options and made score cluster usage more transparent
14 years ago
orbiter 7cd9d9d22a - enhanced DidYouMean computation using a faster count on index entries; this causes that results can be ranked better
14 years ago
orbiter de722090b5 enhancements in did-you-mean guessing
14 years ago
orbiter a59c885ee0 autocomplete and did-you-mean can now understand _all_ languages and can generate suggestions in all languages and character types
14 years ago
low012 b9f405d1e8 *) added comments
14 years ago
orbiter e63896f2a8 added an intranet scanner and a servlet which shows all intranet addresses and an option to start a site-crawl for all these addresses at once.
14 years ago
low012 afa708d552 *) added <s>...</s> tag to WikiCode -> works just as the HTML equivalent
14 years ago
low012 5a9ea0308f *) further simplification of wiki code parser (less redundancy in code, less magic numbers), still not done with it...
14 years ago
orbiter 114bdd8ba7 fixed old sitemap importer which was not able to parse urls containing post elements
14 years ago
orbiter 22047ffad5 enhanced computation speed of many replaceAll string operations
14 years ago
orbiter fa2eb9676e removed unused class
14 years ago
low012 5f391fcfa9 *) cleaned up in wikiCode parser (more to be done)
14 years ago
orbiter 9d080f387e change in handling of the all-visible home path for storage in YaCy:
14 years ago
orbiter 65eaf30f77 redesign of crawl profiles data structure. target will be:
14 years ago
orbiter 4f22e2df41 bugfixes for
14 years ago
orbiter 42414a6ae3 added two more tables in rss reader interface:
14 years ago
orbiter 0f276dd63f - MapHeap now implements Map<byte[], Map<String, String>>
14 years ago
orbiter 3197ca42ed preparations to move the HTCache into cora:
14 years ago
orbiter 90531f78ff refactoring of the cora package to get subpackages for http and ftp (smb to come)
14 years ago
sixcooler 661867923a ... migrating to HttpComponents-Client-4.x ...
14 years ago
orbiter 70dd26ec95 added the new crawl scheduling function to the crawl start menu:
14 years ago
orbiter 5a994c9796 added a scheduler based on API actions
14 years ago
orbiter 189a986ebd - modified api-call interface to record api calls with references to api-call database (carries pk)
14 years ago
orbiter 5924a0d851 - enhanced concurrency in database index access for multicore
14 years ago
orbiter 150cf42a1b migrated all my LGPL 3 -licensed files to the LGPL 2.1 because LGPL 3 is not compatible to the GPL 2
15 years ago
orbiter b03caaa57a better handling of OOM situations
15 years ago
orbiter 56ff9d5fd4 - extended news size from 512 to 1024 characters
15 years ago
orbiter 11639aef35 - added new protocol loader for 'file'-type URLs
15 years ago
orbiter 1defd580bc - added option to localization search to distinguish between a search for a location according to the search word only or for the relation between a web search results and locations found in the metadata fields
15 years ago
orbiter e43e61e502 added another geolocalization data source: GeoNames
15 years ago
orbiter 2a8f70f0ca - fix for caching of OSM tiles. if you want that this fix applies to your peer, please delete the crawl profiles
15 years ago
orbiter 2126c03a62 - removed download-limit that can be given for the crawler for non-crawler download tasks. This was necessary because the same procedure was used for other downloads like for the download of dictionary files where a limit is not useful. The limit still stays for the indexer
15 years ago
orbiter 3661cb692c added dictionary loader servlet that can be used to get the geolocalization file:
15 years ago
orbiter c45117f81f fixed dates in metadata
15 years ago
orbiter 90c3e5d6f6 - cleanup, removed unused imports
15 years ago
sixcooler 13f5b8e7ba fix for storing/getting bookmark-folders
15 years ago
orbiter 64f29f990e a collection of performance hacks and code cleanup:
15 years ago
low012 fc43f3028e *) hopefully fixing NPE issue introduced in r6797
15 years ago
orbiter 4917f96729 fixes for some changes in SVN 6797 that caused NPEs when the bookmarks initialized
15 years ago
low012 dff660441a *) changes for better code readability
15 years ago
low012 15d9ea8375 *) changes for better code readability
15 years ago
low012 2bc459252e *) changes for better code readability
15 years ago
orbiter 25aef069a6 continuing String-hash - to - byte[]-hash redesign that was started in SVN 6775
15 years ago
orbiter 1e8e79b9ef redesign of reference hash (URL-hash) parameter hand-over:
15 years ago
orbiter 748abfcffa added patches to prevent yacy-protocol DoS settings
15 years ago
orbiter 884b262130 - added a new Wiki Namespace Navigator
15 years ago
orbiter 727dd9b193 - fixed a bug in robots.txt parser
15 years ago
orbiter 93b7ddc27d fix for http://forum.yacy-websuche.de/viewtopic.php?p=19376#p19376
15 years ago
orbiter 6538043d89 fix for http://forum.yacy-websuche.de/viewtopic.php?p=19189#p19189
15 years ago
lotus 8faeedd99a not a fix! for:
15 years ago
orbiter be18b5d8cd fix for 'cannot switch back to default language'-bug
15 years ago
orbiter 308a973503 refactoring of tables data organisation
15 years ago
orbiter 8a76f38d26 Added a new steering servlet that can be used to repeat actions that had been made on the yacy interface. This can be used to:
15 years ago
orbiter 840527689b more simplification of bookmark class
15 years ago
orbiter d77782a8d5 removed bookmark tags file, tags are now stored only in RAM
15 years ago
orbiter ada0ce9de3 refactoring of bookmarks: there is a big performance problem in the bookmarks code and furthermore the bookmarks
15 years ago
orbiter 24060885b6 - added Tables abstraction in data.Tables.java
15 years ago
orbiter 3889438db6 fix for bookmarks
15 years ago
orbiter 5df628a2a4 - added BEncoder class
15 years ago
orbiter a06f7ddb33 more PMD recommendations
15 years ago
orbiter 66c0a8e849 more PMD recommendations
15 years ago
orbiter dd459281c8 applied code changes that are recommended by PMD
15 years ago
orbiter d77a8f3b3e added some modifications recommended by PMD for better performance
15 years ago
lotus ab3cf60dbe fix for npe
15 years ago
orbiter 7f20963b41 add-on to last commit
15 years ago
orbiter eeca2ded92 fix for http://forum.yacy-websuche.de/viewtopic.php?p=18500#p18500
15 years ago
orbiter 4ac4fe952c patch for npe in bookmarks
15 years ago
orbiter 362b7a929b added extensive memory protection logic to avoid out of memory errors that may be caused by the RowCollection memory allocation function
15 years ago
orbiter e34e63a039 preset of proper HashMap dimensions: should prevent re-hashing and increase performance
15 years ago
orbiter 4a5100789f replaced _all_ size() == 0 with isEmpty() and all size() > 0 with !isEmpty(). The isEmpty() method is much faster in some cases, especially when used to access badly balanced hashtables where an size() operation becomes a large iteration.
15 years ago
orbiter 969123385b added json and rss output for image search
15 years ago
orbiter 2d8f3ee301 some performance hacks
15 years ago
orbiter 5399d1e2bc refactoring (reason: get more abstraction to use the blacklist class; for integration in other servlets)
15 years ago
orbiter 4431b9767e added about 450 replacements for printStackTrace() methods to pipe such traces into the log at DATA/LOG/
15 years ago
orbiter 5e8038ac4d - refactoring of blacklists
15 years ago
orbiter 3528b970d6 - refactoring
15 years ago
orbiter b79f4f062f refactoring of yacy documents and parsers: they depend now only on the kelondro classes
15 years ago
orbiter e7f18ba24b refactoring
15 years ago
orbiter ce8dc575ca refactoring
15 years ago
orbiter bea3b99aff moved table and util classes
15 years ago
orbiter c0e0e1f422 moved blob classes
15 years ago
orbiter 1e4f8b56ed accumulated classes from different packages into the new rwi package
15 years ago
orbiter 194da25a2f moved kelondro index
15 years ago
orbiter 4446acc8cd moved kelondro order
15 years ago
orbiter f677d534b1 start of a really extensive refactoring which will produce a hierarchical package structure with the domain yacy.net as package root
15 years ago
orbiter 735e2737e3 * added index segments
15 years ago
orbiter f8371707e5 - possibly better termination for SplitTable
15 years ago
orbiter 87780f2562 produce did-you-mean also for queries with more than one word
15 years ago
low012 a6a3090c3d *) blacklist cleaner supports usage of regular expressions now
15 years ago
orbiter c3a4aee255 some redesign with a possible fix for the ReferenceContainerCache.
15 years ago
orbiter 721b88efbd - fixed a problem loading blacklists with new yacycore.jar
15 years ago
orbiter d64569aa39 reuturn only recommendations of words that have a greater count than the original word
15 years ago
orbiter 604c37927f used comparator for did-you-mean that uses index sizes for comparisment, but:
15 years ago
orbiter a58d9cae7d - show location name in geolocalization search result
15 years ago
orbiter 573d03c7d7 added configuration to enable ram table copy
15 years ago
orbiter 3be54e1891 fix to rule when to use a ram table copy
15 years ago
orbiter 342c5d0fd4 fixed city name detection: finds now also substrings of city names
15 years ago
low012 248f3fd9b5 *) cleaned up code for better readability
15 years ago
orbiter eaddf2d464 - corrected layout of map preview
15 years ago
orbiter fd668f531b fixed map layout
15 years ago
orbiter 2740d9dd79 added integration of osm maps for search
15 years ago
orbiter ce972ff4ef update to default ranking profile which has now some settings to deny some phpbb3 pages which are redundant in the index when crawling phpbb3.
15 years ago
orbiter 67eddaec4b changed way to integrate dictionary files:
15 years ago
orbiter 3b9aaf9e9f - inserted new library tests inside DidYouMean
15 years ago
orbiter 8c35ffe34c fixes to the dymlib
15 years ago
orbiter bfa273bcc1 added a library provider which holds libraries in static objects,
15 years ago
orbiter 1762a7bcd6 - moved DidYouMean to the data package
15 years ago
orbiter 161d2fd2ef redesign of access to the HTCache (now http.client.Cache):
16 years ago
orbiter 1d8d51075c refactoring:
16 years ago
orbiter 5bb8074150 removed the indexing queue. This queue was superfluous since the introduction of the blocking queues last year, where documents are parsed, analysed and stored in the index with concurrency.
16 years ago
orbiter ca72ed7526 -removed superfluous crawl cache
16 years ago
orbiter dafffd0153 refactoring of parsers and document processing
16 years ago
orbiter 77d2a3782c removed strange debugging strings
16 years ago
orbiter 409538e17a code cleanup and code simplifcation
16 years ago
orbiter 1f1399e5c5 extending visibility of objects and methods to avoid synthetic accessor methods and increase performance
16 years ago
orbiter 154bbc3364 code cleanup: call of static methods directly to the class
16 years ago
orbiter 222850414e simplification of the code: removed unused classes, methods and variables
16 years ago
orbiter c5122d6836 completed migration of BLOBTree to BLOBHeaps:
16 years ago
orbiter ae015e8e98 refactoring of blob package classes
16 years ago
orbiter ce1adf9955 serialized all logging using concurrency:
16 years ago
orbiter 27fa6a66ad - completed the author navigation
16 years ago
orbiter c079b18ee7 - refactoring of IntegerHandleIndex and LongHandleIndex: both classes had been merged into the new HandleMap class, which handles (key<byte[]>,n-byte-long) pairs with arbitraty key and value length. This will be useful to get a memory-enhanced/minimized database table indexing.
16 years ago
orbiter 88426912ad more refactoring to make the segment object easier to use and to be prepared to integrate author navigation
16 years ago
orbiter 99bf0b8e41 refactoring of plasmaWordIndex:
16 years ago
orbiter 3d4b826ca5 migration of all databases that use the deprecated BLOBTree format into the BLOBHeap format. Old databases are migrated automatically.
16 years ago
orbiter 26a46b5521 increased default maximum file size for database files to 2GB
16 years ago
orbiter e005cfea37 fix for bug in -incell option of URLAnalysis
16 years ago
orbiter a7e392f31b The collection index will not be supported any more.
16 years ago
low012 ea27853c59 *) some refactoring
16 years ago
orbiter 89aeb318d3 enhanced the wikimedia dump import process
16 years ago
orbiter c097531e3d added a catch Exception to all thread to check if any of them silently dies without any other notification
16 years ago
low012 ff5f82d780 *) removed description of removed commands from wikiHelp ([= =])
16 years ago
orbiter 9c6ac43f66 fixes for wiki parser
16 years ago
low012 78ffb61297 *) got rid of unnecessary variable which might also fix IndexOutOfBoundsException
16 years ago
orbiter d079d6dfdb small changes in surrogate reader, wiki code and portal test
16 years ago
low012 f1244264b8 *) hopefully fixed bug reported in http://forum.yacy-websuche.de/viewtopic.php?t=2057
16 years ago
low012 d1116c049f *) added new method "contains()" to Blacklist interface
16 years ago
orbiter c8624903c6 full redesign of index access data model:
16 years ago
orbiter d4d87d90c4 - extended experimental wikipedia dump parser
16 years ago
orbiter c08f9b36a4 refactoring of wiki parser.
16 years ago
low012 9180617dd9 *) Classes to handle import of lists (especially blacklists) from XML files, not used yet, but will be used soon.
16 years ago
orbiter c2359f20dd refactoring: better abstraction of reference and metadata prototypes.
16 years ago
orbiter 96eaecda3e - added migration class to go from index collections to the index cell data structure.
16 years ago
orbiter 7dff1cba62 removed option to use different primary keys in kelondro tables
16 years ago
orbiter 14a1c33823 refactoring of wordIndex class
16 years ago
orbiter d49238a637 more performance hacks: better default values for scaling, less memory usage
16 years ago
orbiter d988204875 better shutdown of tools
16 years ago
orbiter 100247bdda added also an export and delete-feature to the URLAnalysis. This completes the clean-up feature for URLs. To do a complete clean-up of the url database, start the following:
16 years ago
orbiter 60078cf322 added next tool for url analysis: check for references, that occur in the URL-DB but not in the RICOLLECTIONS
16 years ago
orbiter dbdd10da84 better logging and startup behaviour for referenceHash computation
16 years ago