Commit Graph

735 Commits (6c9320e82a37b32b99e70cece6c01f6016bdffac)

Author SHA1 Message Date
orbiter 727dd9b193 - fixed a bug in robots.txt parser
15 years ago
orbiter 93b7ddc27d fix for http://forum.yacy-websuche.de/viewtopic.php?p=19376#p19376
15 years ago
orbiter 6538043d89 fix for http://forum.yacy-websuche.de/viewtopic.php?p=19189#p19189
15 years ago
lotus 8faeedd99a not a fix! for:
15 years ago
orbiter be18b5d8cd fix for 'cannot switch back to default language'-bug
15 years ago
orbiter 308a973503 refactoring of tables data organisation
15 years ago
orbiter 8a76f38d26 Added a new steering servlet that can be used to repeat actions that had been made on the yacy interface. This can be used to:
15 years ago
orbiter 840527689b more simplification of bookmark class
15 years ago
orbiter d77782a8d5 removed bookmark tags file, tags are now stored only in RAM
15 years ago
orbiter ada0ce9de3 refactoring of bookmarks: there is a big performance problem in the bookmarks code and furthermore the bookmarks
15 years ago
orbiter 24060885b6 - added Tables abstraction in data.Tables.java
15 years ago
orbiter 3889438db6 fix for bookmarks
15 years ago
orbiter 5df628a2a4 - added BEncoder class
15 years ago
orbiter a06f7ddb33 more PMD recommendations
15 years ago
orbiter 66c0a8e849 more PMD recommendations
15 years ago
orbiter dd459281c8 applied code changes that are recommended by PMD
15 years ago
orbiter d77a8f3b3e added some modifications recommended by PMD for better performance
15 years ago
lotus ab3cf60dbe fix for npe
15 years ago
orbiter 7f20963b41 add-on to last commit
15 years ago
orbiter eeca2ded92 fix for http://forum.yacy-websuche.de/viewtopic.php?p=18500#p18500
15 years ago
orbiter 4ac4fe952c patch for npe in bookmarks
15 years ago
orbiter 362b7a929b added extensive memory protection logic to avoid out of memory errors that may be caused by the RowCollection memory allocation function
15 years ago
orbiter e34e63a039 preset of proper HashMap dimensions: should prevent re-hashing and increase performance
15 years ago
orbiter 4a5100789f replaced _all_ size() == 0 with isEmpty() and all size() > 0 with !isEmpty(). The isEmpty() method is much faster in some cases, especially when used to access badly balanced hashtables where a size() operation becomes a large iteration.
15 years ago
orbiter 969123385b added json and rss output for image search
15 years ago
orbiter 2d8f3ee301 some performance hacks
15 years ago
orbiter 5399d1e2bc refactoring (reason: get more abstraction to use the blacklist class; for integration in other servlets)
15 years ago
orbiter 4431b9767e added about 450 replacements for printStackTrace() methods to pipe such traces into the log at DATA/LOG/
15 years ago
orbiter 5e8038ac4d - refactoring of blacklists
15 years ago
orbiter 3528b970d6 - refactoring
15 years ago
orbiter b79f4f062f refactoring of yacy documents and parsers: they depend now only on the kelondro classes
15 years ago
orbiter e7f18ba24b refactoring
15 years ago
orbiter ce8dc575ca refactoring
15 years ago
orbiter bea3b99aff moved table and util classes
15 years ago
orbiter c0e0e1f422 moved blob classes
15 years ago
orbiter 1e4f8b56ed accumulated classes from different packages into the new rwi package
15 years ago
orbiter 194da25a2f moved kelondro index
15 years ago
orbiter 4446acc8cd moved kelondro order
15 years ago
orbiter f677d534b1 start of a really extensive refactoring which will produce a hierarchical package structure with the domain yacy.net as package root
15 years ago
orbiter 735e2737e3 * added index segments
15 years ago
orbiter f8371707e5 - possibly better termination for SplitTable
15 years ago
orbiter 87780f2562 produce did-you-mean also for queries with more than one word
15 years ago
low012 a6a3090c3d *) blacklist cleaner supports usage of regular expressions now
15 years ago
orbiter c3a4aee255 some redesign with a possible fix for the ReferenceContainerCache.
15 years ago
orbiter 721b88efbd - fixed a problem loading blacklists with new yacycore.jar
15 years ago
orbiter d64569aa39 return only recommendations of words that have a greater count than the original word
15 years ago
orbiter 604c37927f used comparator for did-you-mean that uses index sizes for comparison, but:
15 years ago
orbiter a58d9cae7d - show location name in geolocalization search result
15 years ago
orbiter 573d03c7d7 added configuration to enable ram table copy
15 years ago
orbiter 3be54e1891 fix to rule when to use a ram table copy
15 years ago
orbiter 342c5d0fd4 fixed city name detection: finds now also substrings of city names
15 years ago
low012 248f3fd9b5 *) cleaned up code for better readability
15 years ago
orbiter eaddf2d464 - corrected layout of map preview
15 years ago
orbiter fd668f531b fixed map layout
15 years ago
orbiter 2740d9dd79 added integration of osm maps for search
15 years ago
orbiter ce972ff4ef update to default ranking profile which has now some settings to deny some phpbb3 pages which are redundant in the index when crawling phpbb3.
15 years ago
orbiter 67eddaec4b changed way to integrate dictionary files:
15 years ago
orbiter 3b9aaf9e9f - inserted new library tests inside DidYouMean
15 years ago
orbiter 8c35ffe34c fixes to the dymlib
15 years ago
orbiter bfa273bcc1 added a library provider which holds libraries in static objects,
15 years ago
orbiter 1762a7bcd6 - moved DidYouMean to the data package
15 years ago
orbiter 161d2fd2ef redesign of access to the HTCache (now http.client.Cache):
16 years ago
orbiter 1d8d51075c refactoring:
16 years ago
orbiter 5bb8074150 removed the indexing queue. This queue was superfluous since the introduction of the blocking queues last year, where documents are parsed, analysed and stored in the index with concurrency.
16 years ago
orbiter ca72ed7526 -removed superfluous crawl cache
16 years ago
orbiter dafffd0153 refactoring of parsers and document processing
16 years ago
orbiter 77d2a3782c removed strange debugging strings
16 years ago
orbiter 409538e17a code cleanup and code simplification
16 years ago
orbiter 1f1399e5c5 extending visibility of objects and methods to avoid synthetic accessor methods and increase performance
16 years ago
orbiter 154bbc3364 code cleanup: call of static methods directly to the class
16 years ago
orbiter 222850414e simplification of the code: removed unused classes, methods and variables
16 years ago
orbiter c5122d6836 completed migration of BLOBTree to BLOBHeaps:
16 years ago
orbiter ae015e8e98 refactoring of blob package classes
16 years ago
orbiter ce1adf9955 serialized all logging using concurrency:
16 years ago
orbiter 27fa6a66ad - completed the author navigation
16 years ago
orbiter c079b18ee7 - refactoring of IntegerHandleIndex and LongHandleIndex: both classes had been merged into the new HandleMap class, which handles (key<byte[]>,n-byte-long) pairs with arbitrary key and value length. This will be useful to get a memory-enhanced/minimized database table indexing.
16 years ago
orbiter 88426912ad more refactoring to make the segment object easier to use and to be prepared to integrate author navigation
16 years ago
orbiter 99bf0b8e41 refactoring of plasmaWordIndex:
16 years ago
orbiter 3d4b826ca5 migration of all databases that use the deprecated BLOBTree format into the BLOBHeap format. Old databases are migrated automatically.
16 years ago
orbiter 26a46b5521 increased default maximum file size for database files to 2GB
16 years ago
orbiter e005cfea37 fix for bug in -incell option of URLAnalysis
16 years ago
orbiter a7e392f31b The collection index will not be supported any more.
16 years ago
low012 ea27853c59 *) some refactoring
16 years ago
orbiter 89aeb318d3 enhanced the wikimedia dump import process
16 years ago
orbiter c097531e3d added a catch Exception to all thread to check if any of them silently dies without any other notification
16 years ago
low012 ff5f82d780 *) removed description of removed commands from wikiHelp ([= =])
16 years ago
orbiter 9c6ac43f66 fixes for wiki parser
16 years ago
low012 78ffb61297 *) got rid of unnecessary variable which might also fix IndexOutOfBoundsException
16 years ago
orbiter d079d6dfdb small changes in surrogate reader, wiki code and portal test
16 years ago
low012 f1244264b8 *) hopefully fixed bug reported in http://forum.yacy-websuche.de/viewtopic.php?t=2057
16 years ago
low012 d1116c049f *) added new method "contains()" to Blacklist interface
16 years ago
orbiter c8624903c6 full redesign of index access data model:
16 years ago
orbiter d4d87d90c4 - extended experimental wikipedia dump parser
16 years ago
orbiter c08f9b36a4 refactoring of wiki parser.
16 years ago
low012 9180617dd9 *) Classes to handle import of lists (especially blacklists) from XML files, not used yet, but will be used soon.
16 years ago
orbiter c2359f20dd refactoring: better abstraction of reference and metadata prototypes.
16 years ago
orbiter 96eaecda3e - added migration class to go from index collections to the index cell data structure.
16 years ago
orbiter 7dff1cba62 removed option to use different primary keys in kelondro tables
16 years ago
orbiter 14a1c33823 refactoring of wordIndex class
16 years ago
orbiter d49238a637 more performance hacks: better default values for scaling, less memory usage
16 years ago
orbiter d988204875 better shutdown of tools
16 years ago
orbiter 100247bdda added also an export and delete-feature to the URLAnalysis. This completes the clean-up feature for URLs. To do a complete clean-up of the url database, start the following:
16 years ago
orbiter 60078cf322 added next tool for url analysis: check for references, that occur in the URL-DB but not in the RICOLLECTIONS
16 years ago
orbiter dbdd10da84 better logging and startup behaviour for referenceHash computation
16 years ago
orbiter d64836c34f added statistical analysis of URL reference
16 years ago
orbiter b80db04667 - refactoring of IntegerHandleIndex and LongHandleIndex (better method names)
16 years ago
orbiter efcd95dc37 simplification of (internal) query process / refactoring
16 years ago
orbiter f1b712c29a small corrections to image loading methods in result presentation
16 years ago
orbiter aa44d9bad9 more refactoring of kelondro.text / deleted de.anomic.index
16 years ago
orbiter 6ffc6e3389 more refactoring of indexer and kelondro classes;
16 years ago
orbiter 76ef5f0f14 refactoring of index package: better names for the classes (to be continued)
16 years ago
orbiter d1d9fbae5c enabling the URLAnalysis to operate on multiple input files, just use a wild card when calling the class from the command line
16 years ago
orbiter 7ea53fe47b added another url list transformation option:
16 years ago
orbiter 54625360f7 performance update
16 years ago
orbiter d884c4718a added gzip support for URLAnalysis:
16 years ago
orbiter cf9b74e6e3 added another method to process url lists: extract hosts only
16 years ago
orbiter 89d8e824ed memory protection for URLAnalysis
16 years ago
orbiter 0f6fa804ff performance update to URLAnalysis
16 years ago
orbiter e8f5f2f612 added tool to analyse url strings
16 years ago
orbiter c12bb8a6d0 - refactoring of the http client
16 years ago
orbiter 411f2212f2 more memory leak fixing hacks
16 years ago
orbiter 333489420b - fix for NPE when loading the cytag image
16 years ago
orbiter c25c334b75 replaced old DHT transmission method with new method. Many things have changed! some of them:
16 years ago
orbiter 94110df85a moved logging partially to kelondro
16 years ago
orbiter 024da2916b refactoring of logging
16 years ago
orbiter 83ce65707a (almost) completed partition of classes in kelondro
16 years ago
orbiter 7ee494fde5 more refactoring of kelondro:
16 years ago
orbiter bf93767ec6 refactoring of kelondro database classes
16 years ago
orbiter fc27bf8c4c refactoring of kelondro classes:
16 years ago
apfelmaennchen 3484e55be4 - small fix for bookmarksDB
16 years ago
apfelmaennchen 6dd52422ea - added two dialogs to manage bookmark tags in YaCy-UI
16 years ago
apfelmaennchen 3dc208fad0 bugfix: bookmarks can now handle folder names like /news and /newspaper without getting confused...
16 years ago
low012 f26b8fcb1b *) comment mode is 'moderated' instead of 'activated' by default now (to avoid spam being visible)
16 years ago
orbiter e004da48d3 - added fast fingerprint computation for files (any). Will be used in new index dump method
16 years ago
orbiter 7535fd7447 - refactoring of CrawlEntry and CrawlStacker
16 years ago
lotus 18513e2ee2 npe fix: http://forum.yacy-websuche.de/viewtopic.php?t=1646
16 years ago
orbiter e1acdb952c fix for problem with userDB and bookmarksDB which was caused by changes in kelondroRA in SVN 5376
16 years ago
orbiter 47292e696a more performance hacks
16 years ago
orbiter d39d420b39 performance hacks
16 years ago
orbiter 0b4808ba3d added new interactive search feature:
16 years ago
low012 e423fa9846 *) added method to only get file names in directory listing which match a filter
16 years ago
orbiter dba7ef5144 extended crawling constraints:
16 years ago
f1ori 7e1fe05e3c * added utf8-encoding to many getBytes-calls
16 years ago
low012 baae3d91b1 *) fixed warning when compiling listManager
16 years ago
low012 a99a629ed4 *) quick fix to prevent comments for blog entries which don't exist (http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1554)
16 years ago
low012 00e27e5050 *) fixed bug which made it possible to write files outside of the DATA/LIST directory when creating a new blacklist
16 years ago
orbiter 0edec2b760 FULL redesign of algorithms in htmlTools to encode/decode strings from/to unicode and html.
16 years ago
orbiter 6fb865fbdc - fix of bug in iterator in kelondroBLOBHeap which caused bug in crawl profile listing
16 years ago
apfelmaennchen b97ff24b43 bookmarksDB / xbel.xml:
16 years ago
lotus 0a0cc3bf67 added missing classes to build target "run"
16 years ago
lotus a81cb78211 finally some putHTML on htroot/xml/
16 years ago
apfelmaennchen 7b63c66a08 - bugfix in bookmarksDB.Tag.hasPublicItems()
16 years ago
orbiter 05dbba4bab added logging conditions to all fine and finest log line calls
16 years ago
apfelmaennchen aa6ae77e5e - autoReCrawl: fix for filter settings
16 years ago
apfelmaennchen 8ae29bad57 - fix to previous change of Crawl Profile Names
16 years ago
apfelmaennchen 434104e4a0 - change Crawl profile name for autoreCrawl
16 years ago
lotus 0df2e47012 changed auto recrawl to comply with new date format
16 years ago
orbiter 536e77e8b7 modifications towards a single database operation to read/write http header and cached file at once:
16 years ago
apfelmaennchen bd931a82f7 - added dynamic filters to autoReCrawl.conf
16 years ago
apfelmaennchen b3fc5e96a3 - removed unused import from bookmarksDB
16 years ago
apfelmaennchen bc048db7b6 - bugfix for bookmarksDB's rebuildDates()
16 years ago
danielr 3c68905540 remove redundant null checks
16 years ago
danielr 753a1ae430 - changed default browser from netscape to firefox
16 years ago
danielr be28af50f5 - fixed "yacy2yacy no proxy"-problem
16 years ago
danielr 621b473b18 * removed some warnings of findbugs (http://findbugs.sf.net)
17 years ago
apfelmaennchen 0500b1179e added a 2 min start up delay to serverBusyThread autoReCrawl to avoid a Null Pointer Exception...
17 years ago
apfelmaennchen e1574fe02e - added autoReCrawl folders to bookmarks (DATA/SETTINGS/autoReCrawl.conf)
17 years ago
danielr 17b7845eb5 * refactoring
17 years ago
danielr 3bb870bfcd added final where possible
17 years ago
orbiter c3d461d191 - removed superfluous copyright statement
17 years ago
orbiter 3ca98fee42 removed superfluous copyright statement
17 years ago
orbiter 7b1c9e6aee discovered and removed a (possibly large) memory leak:
17 years ago
orbiter 0f5fe8cc53 refactoring of method calling for objects from kelondroMapDataMining
17 years ago
orbiter 4acf0a61cd refactoring of kelondroObjects (mainly renaming to kelondroMap)
17 years ago
orbiter f7aaeb3fad created new main menu entry 'Customization and Integration'
17 years ago
orbiter 1e6d12f146 Major update to BLOB data structures:
17 years ago
orbiter 81f75f5056 - removed unnecessary classes (these objects are much easier to handle using generics)
17 years ago
orbiter a6719dfd2b - refactoring of robots parser
17 years ago
orbiter e81be7d4f2 added many missing user-agent declarations for yacy http client connections.
17 years ago
danielr 68c38c2d34 - WatchCrawler shows status without JavaScript
17 years ago
orbiter 3330181aa0 refactoring:
17 years ago
danielr 4b71912e76 fixed wrong class name
17 years ago
danielr 7feae906aa - organize imports
17 years ago
orbiter cfe6790498 - added option to switch between yacy networks, especially between the two default networks (freeworld and intranet),
17 years ago
apfelmaennchen 2113672bf2 small fix on tag comparator functions
17 years ago
orbiter fbb712c669 refactoring:
17 years ago
orbiter 1689030ee8 refactoring: moved all crawler classes into their own package
17 years ago
orbiter d2ba1fd2ab major step forward to network switching (target is easy switch to intranet or other networks .. and back)
17 years ago
danielr d4bce6affd refactoring (initialized static fields, removed empty if/else, serialized some fields in serializable classes)
17 years ago
orbiter 1995faef8d - refactoring of Collage back-end: move to plasma package
17 years ago
orbiter 8313d58ae7 - integrated the collage into the Web Visualization menu
17 years ago
orbiter 82bf9ac1c8 - added Collage servlet from datengrab and modified it:
17 years ago
orbiter 202a3adb3e refactoring of HttpClient Writer processes
17 years ago
orbiter e356625b22 - refactoring of stream copy handling to support time-consuming operations
17 years ago
orbiter c3342e1178 - removed class with only one static method
17 years ago
danielr 5c3c1fdf41 replaced httpc with Apache Jakarta Commons HttpClient (includes some refactoring ;)
17 years ago
orbiter 7f9f639d20 - refactoring and abstraction of index reference (urls) handling: blacklisting is part of reference filtering
17 years ago
orbiter d6050b9ffb - separated the LURL data storage and Crawl result stack for process supervision.
17 years ago
orbiter 541b817502 refactoring of switchboard queueing
17 years ago
orbiter 275a226cc5 refactoring
17 years ago