4315 Commits (c0d9a3e9a782607acbe90f70abb5c09eb83fea41)

Author SHA1 Message Date
orbiter 748abfcffa added patches to prevent yacy-protocol DoS settings
15 years ago
orbiter e820ed061a avoiding excessive DNS lookups to determine localhost
15 years ago
orbiter 11983bc936 redesigned some parts of the parser entry point:
15 years ago
orbiter 89b4fff1c2 adapted ant script for new exif library
15 years ago
orbiter 24e5faee75 added exif parsing for jpg images
15 years ago
orbiter 82f76e1296 removed log line
15 years ago
orbiter 0f8004f9da enhanced html parser to recognize a href tags inside header tags
15 years ago
orbiter 3300930fc5 - (almost) fixed FTP crawler
15 years ago
orbiter 1198b9989d bugfixes, more sorttable
15 years ago
orbiter 9623d9e6d2 added a smb loader component for the YaCy crawler
15 years ago
orbiter ae2f3f000f better handling of table copy abandon .. prevent memory leak
15 years ago
orbiter 0769517129 added a robots.txt monitor in the crawler monitor submenu
15 years ago
orbiter 72f00dee59 removed never-used server access account function
15 years ago
orbiter de01fe0e6d fix for bug in url parser
15 years ago
orbiter c4bdb1e7f2 added one more option in ViewFile to show an iframe, as for the original web page content, but using the cache rather than the direct link to the content on the web. Upgraded the very old and no longer used CacheResource_p servlet to a new and working version.
15 years ago
orbiter 1bbe14d23f SVN 6716 unfortunately contained parts of the unfinished SMB integration. To fix compile errors, the remaining parts of the SMB implementation stub are added with this commit.
15 years ago
orbiter 884b262130 - added a new Wiki Namespace Navigator
15 years ago
orbiter 270fb38674 - fixed some bugs in Table viewer
15 years ago
orbiter 727dd9b193 - fixed a bug in robots.txt parser
15 years ago
orbiter 54af9e6b49 - added parsing of robots meta-tag in html headers to detect a noindexing request
15 years ago
sixcooler cd6de83905 next try for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=2703
15 years ago
sixcooler bfe4693e9a fix for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=2703
15 years ago
orbiter 564927ce72 redesign of CrawlResult data structures because of OOM occurrences during URL deletion processes.
15 years ago
orbiter 30c8185139 fix for sid check
15 years ago
orbiter ef62d017e5 integrated session id filtering for crawler
15 years ago
orbiter d8d9984913 added framework for session id filtering (not ready yet)
15 years ago
orbiter 2bc36de336 - fix for bug in svn 6669
15 years ago
orbiter d378ca4604 better handling of concurrency in seed
15 years ago
orbiter 6538043d89 fix for http://forum.yacy-websuche.de/viewtopic.php?p=19189#p19189
15 years ago
sixcooler e071d71f19 fix for yacy-banner-network-values
15 years ago
sixcooler 787b588c33 reverted a part of svn6636:
15 years ago
lotus 11188cd7eb resource observer now uses the Java 6 method to check for free space. thus, disk observing now needs Java 6 installed.
15 years ago
sixcooler 089877f32c my first commit - hopefully fix for merge problem
15 years ago
orbiter d6391f2537 better handling of rewrite cases where the resulting rewrite blob entry is equal in size
15 years ago
orbiter ef9473d92c added another sixcooler suggestion: recycle corrupted records
15 years ago
orbiter fe78edac32 - view API calls in correct date-order
15 years ago
orbiter 308a973503 refactoring of tables data organisation
15 years ago
lotus 85ca96227f fix for re-enable parser
15 years ago
orbiter ada0ce9de3 refactoring of bookmarks: there is a big performance problem in the bookmarks code and furthermore the bookmarks
15 years ago
orbiter 3751ab4ae2 added sixcooler's patch and more checks/removed unnecessary code
15 years ago
orbiter d8d8562c59 fill key with zeros during normalization
15 years ago
orbiter 24060885b6 - added Tables abstraction in data.Tables.java
15 years ago
orbiter 7fdf59a77f misc NPE check
15 years ago
lotus 38a3d55afd added more possible php extensions for html
15 years ago
orbiter 4403304957 bugfix for list()
15 years ago
orbiter 69c29acb6e no exception thread dump if parser cannot parse because that mime-type/extension is in the deny-set
15 years ago
orbiter 0098e6e859 bugfix for heap iterator
15 years ago
orbiter db19a941cf added new image index storage classes (not integrated yet)
15 years ago
orbiter c8aece34a4 update to yacy/ai (just more testing)
15 years ago
orbiter 8ce936bcdd added an api recording function: it shall be possible to record
15 years ago
orbiter 56e0d9bd01 - testings with image parser
15 years ago
orbiter e80e060ca6 - increased thread priority for server threads
15 years ago
orbiter 7d400b17d0 html parser support for .cfm files
15 years ago
orbiter f6731c6240 more logging etc.
15 years ago
orbiter 007f8297de added php3 as extension type for html
15 years ago
orbiter 4f1f4863c4 fix for deadlock when initializing a SplitTable with a file of size 0, see also:
15 years ago
orbiter cc5dcf69ff missing change for last commit
15 years ago
orbiter ca1ef9a079 fix for http://forum.yacy-websuche.de/viewtopic.php?p=18584#p18584
15 years ago
orbiter 938e806182 tried to fix date problem that may have prevented foreign peers from staying in the network
15 years ago
orbiter 5df628a2a4 - added BEncoder class
15 years ago
orbiter 82f57f79e5 more PMD enhancements
15 years ago
orbiter a06f7ddb33 more PMD recommendations
15 years ago
orbiter eb79ceb3ff update to kelondro data structures
15 years ago
orbiter 18172451a0 better search computation:
15 years ago
orbiter 66c0a8e849 more PMD recommendations
15 years ago
orbiter 2113fcd7e5 - fixed usage of isEmpty() which is not available in java 1.5
15 years ago
orbiter dd459281c8 applied code changes that are recommended by PMD
15 years ago
lotus eac2daf2e8 * re-enable DHT if enough memory is available again
15 years ago
lotus 0752634b8b log YaCy version on startup
15 years ago
orbiter d77a8f3b3e added some modifications recommended by PMD for better performance
15 years ago
orbiter d1973bae2a code cleanup: removed unused code and unused methods
15 years ago
orbiter a3b8b7b5c5 some redesign of the main menu structure:
15 years ago
orbiter 7f20963b41 add-on to last commit
15 years ago
orbiter eeca2ded92 fix for http://forum.yacy-websuche.de/viewtopic.php?p=18500#p18500
15 years ago
lotus 32972139af added nice configuration for the resource observer
15 years ago
orbiter 3f771d2a16 fix for rss parser: be lazy when rss is not well-formed
15 years ago
orbiter dff4f95c78 some patches to get the torrent parser working
15 years ago
hermens 574f49903e Prevent blob merge from possibly losing the last container
15 years ago
orbiter 83d05e9176 added sixcooler's hack with some modifications:
15 years ago
orbiter fbd24c2d84 integrated the torrent parser
15 years ago
orbiter bd32f8b8cb added a torrent metadata file parser
15 years ago
orbiter 610e3ffffb Added new classes for the implementation of concurrent greedy algorithms.
15 years ago
orbiter d0b7bf9ca2 added a decoder class for Bencoding
15 years ago
low012 028657f019 *) adding more SVN properties
15 years ago
low012 82d740050f *) adding more SVN properties
15 years ago
low012 e04cb8cef0 *) adding more SVN properties
15 years ago
low012 dcb1096fb0 *) adding more SVN properties
15 years ago
low012 7d610e0063 *) minor changes
15 years ago
lotus 9bee0ac780 more logging for DHTrule
15 years ago
orbiter c14233a933 fix for an OOM in MapView that can cause unavailability of
15 years ago
orbiter 37245430c3 fix for NPE during DHT RWI selection
15 years ago
orbiter 959b38b61b fix for memory tracker
15 years ago
orbiter a37878b7d5 url parser regex performance hack
15 years ago
orbiter 362b7a929b added extensive memory protection logic to avoid out of memory errors that may be caused by the RowCollection memory allocation function
15 years ago
orbiter 8281e29963 - more configuration for profiling graph (number of events)
15 years ago
lotus 713cb26a27 update for memory observer algorithm
15 years ago
orbiter 29fde9ed49 better control of ranking order in sort stack
15 years ago
orbiter 93caa38d55 fix for bug in SortStack (did not appear to shrink according to required size) - caused bad and insufficient search results
15 years ago
orbiter e34e63a039 preset of proper HashMap dimensions: should prevent re-hashing and increase performance
15 years ago
orbiter 4a5100789f replaced _all_ size() == 0 with isEmpty() and all size() > 0 with !isEmpty(). The isEmpty() method is much faster in some cases, especially when used to access badly balanced hashtables where a size() operation becomes a large iteration.
15 years ago
orbiter 491ba6a1ba - some refactoring in workflow
15 years ago
orbiter 969123385b added json and rss output for image search
15 years ago
orbiter d183f8d980 refactoring (moved code from ContentTransformer to TemplateEngine)
15 years ago
orbiter 23aef43786 - better synchronization in SortStack
15 years ago
orbiter 7b1f5b0430 - better media search ranking
15 years ago
orbiter 4df88a4e7a - fixes for missing or bad hashCode computation
15 years ago
orbiter dbdf2570ba added comparator and more fixes for SortStack/SortStore
15 years ago
orbiter d2938c44a1 - added bmp parser to the document parsers
15 years ago
orbiter 1dff620181 Better implementation of SortStack and SortStore and adaptations in all classes that use them to implement the necessary Comparable interface and hash code computation.
15 years ago
orbiter fe41a84330 some enhancements in web caching: avoid double loading of response metadata and/or content
15 years ago
orbiter 06d0dcde20 more enhancements to image search
15 years ago
orbiter 4c6312d103 enhanced image search
15 years ago
orbiter 2d8f3ee301 some performance hacks
15 years ago
orbiter fd0658ce7c avoid forced execution of InetAddress.getLocalHost() at startup, because that hangs on some unusually configured Linux systems. The Domains.localHostAddresses object is first instantiated with simpler logic and then enriched with more host addresses by a concurrent thread that does not block the startup process.
15 years ago
orbiter 013f337d3f - avoid unnecessary host name lookups for localhost
15 years ago
orbiter 5afd9f7a91 fix for crlf writing
15 years ago
orbiter 2d3c98b742 less computation within synchronized blocks
15 years ago
orbiter 1a146b0d73 added a patch to ignore bad mime-ignore patterns
15 years ago
orbiter 29fe436e36 - fixed post-ranking including prefer mask
15 years ago
orbiter 5399d1e2bc refactoring (reason: get more abstraction to use the blacklist class; for integration in other servlets)
15 years ago
orbiter a97fdb4566 catch for NPE in image parser
15 years ago
orbiter 534182559c removed concurrency hacks from SplitTable because they caused deadlock-like situations.
15 years ago
orbiter cd6745b292 accept rss feeds without channel descriptions
15 years ago
orbiter 08f1cbb125 another update to the pdf parser
15 years ago
orbiter 54c54fb144 get a handle for grep: 'StackTrace'
15 years ago
orbiter 605e896d6c more details for exception catching when parsing pdfs
15 years ago
lotus 6edc168cfe option to disable dht by memory limit:
15 years ago
orbiter 4431b9767e added about 450 replacements for printStackTrace() methods to pipe such traces into the log at DATA/LOG/
15 years ago
orbiter 19f31bb043 - moved OAI-PMH source list file from SETTINGS to DICTIONARIES/harvesting
15 years ago
low012 e77c906673 *) minor changes mainly in comments
15 years ago
low012 f1740edbf8 *) added script to change memory settings, password and port (experimental, don't blame me if it messes up your configuration)
15 years ago
orbiter 11f7da06ed - fixes to csv parser
15 years ago
orbiter 9b6762ec2e - added a csv "comma separated values" parser to parse OAI-PMH sources from
15 years ago
orbiter 176e334aa4 fixes
15 years ago
orbiter 2fa6bf440b workflow update to OAI-PMH importer
15 years ago
orbiter b0b7a4f9a5 - added function to OAI-PMH reader that can pull all records from a server by evaluating the resumption token to get the URL for retrieving the remaining records
15 years ago
orbiter 350d13e153 very first working version of oai-pmh importer: if given the right url, the importer can read and index listRecord xml files and calculate the right resumptionURL which is then given as next default start point for the importer url input.
15 years ago
lotus 58616d99e4 patch for yacy disk usage detection on lvm host
15 years ago
orbiter a0e891c63d - some redesign in UI menu structure to make room for new 'Content Integration' main menu containing import servlets for Wikimedia Dumps, phpbb3 forum imports and OAI-PMH imports
15 years ago
orbiter 4240785f20 added anti-alias function for line drawing
15 years ago
orbiter 30f108f97d added stub of oai-pmh importer (not working yet)
15 years ago
orbiter 77c99e500f added more control over memory allocation
15 years ago
orbiter 52470d0de4 - fix for xls parser
15 years ago
orbiter 5e8038ac4d - refactoring of blacklists
15 years ago
orbiter 26fafd85a5 - more refactoring
15 years ago
orbiter 3528b970d6 - refactoring
15 years ago
orbiter a8ce192f63 - shifted main classes to new package net.yacy
15 years ago
orbiter b79f4f062f refactoring of yacy documents and parsers: they depend now only on the kelondro classes
15 years ago
orbiter c864901087 - moved httpd.mime to defaults path
15 years ago
orbiter 6192205533 more final modifier
15 years ago
orbiter 0f6b011e1a fix for new index location and better way to use own classes by reflection
15 years ago
orbiter 7a3bbd950f :-(
15 years ago
orbiter b953f04f90 one more reflection fix
15 years ago
orbiter 77d6604856 fix for npe, see http://forum.yacy-websuche.de/viewtopic.php?p=17727#p17727
15 years ago
orbiter 2a7fe35f92 performance tuning using more final modifiers in the kelondro core
15 years ago
orbiter cb4de9ceee fixed a bug in table iterator (did not recognize elements in write buffer)
15 years ago
orbiter e7f18ba24b refactoring
15 years ago
orbiter ce8dc575ca refactoring
15 years ago
orbiter bea3b99aff moved table and util classes
15 years ago
orbiter bd876eb4b7 moved io classes
15 years ago
orbiter c0e0e1f422 moved blob classes
15 years ago
orbiter 1e4f8b56ed accumulated classes from different packages into the new rwi package
15 years ago
orbiter 194da25a2f moved kelondro index
15 years ago
orbiter 4446acc8cd moved kelondro order
15 years ago
orbiter f677d534b1 start of a really extensive refactoring which will produce a hierarchical package structure with the domain yacy.net as package root
15 years ago
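
Note on commits e34e63a039 and 4a5100789f: the two idioms those messages describe (pre-sizing a HashMap and testing emptiness with isEmpty() instead of size()) can be illustrated with a minimal sketch. This is not code from the YaCy repository; the class CollectionIdioms and its helpers capacityFor, lengths and hasEntries are hypothetical names chosen for illustration. The sketch assumes HashMap's default load factor of 0.75. Whether isEmpty() is actually faster depends on the collection: java.util.HashMap answers size() in constant time, while some collections, for example java.util.concurrent.ConcurrentLinkedQueue, have to traverse their elements to count them.

```java
import java.util.HashMap;
import java.util.Map;

public class CollectionIdioms {

    // Hypothetical helper: an initial capacity large enough that `expected`
    // entries stay below the resize threshold, assuming the default
    // HashMap load factor of 0.75.
    public static int capacityFor(int expected) {
        return (int) Math.ceil(expected / 0.75);
    }

    // Hypothetical helper: builds a word -> length map that is pre-sized,
    // so inserting all words should not trigger a re-hash.
    public static Map<String, Integer> lengths(String[] words) {
        Map<String, Integer> result = new HashMap<>(capacityFor(words.length));
        for (String word : words) {
            result.put(word, word.length());
        }
        return result;
    }

    // Emptiness check written with isEmpty() rather than size() > 0.
    public static boolean hasEntries(Map<?, ?> map) {
        return !map.isEmpty();
    }

    public static void main(String[] args) {
        Map<String, Integer> m = lengths(new String[] {"yacy", "kelondro", "rwi"});
        System.out.println(hasEntries(m));       // true
        System.out.println(m.get("kelondro"));   // 8
        System.out.println(capacityFor(12));     // 16
    }
}
```

Pre-sizing only pays off when the expected number of entries is known reasonably well in advance; an over-estimate trades re-hashing cost for unused table space.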