Commit Graph

2625 Commits (48f81acc0e82f173368bc6de671fc26fa3cecc57)
 

Author SHA1 Message Date
theli b6c7b91582 *) Parser now throws a ParserException instead of returning null on parsing errors (e.g. needed by snippet fetcher)
18 years ago
orbiter aa38721cf6 new features for surftipps
18 years ago
theli 64b2ef5aae *) Trying to bugfix shutdown problem
18 years ago
orbiter e03427871e enhanced surftipps:
18 years ago
theli e745b63c77 *) Bugfix for different behavior of indexDistributeWhileCrawling to other checkboxes on IndexControl_p.html
18 years ago
theli 1dc12d6659 *) Bugfix for shutdown problem caused by cacheScan thread
18 years ago
borg-0300 42173462f5 rename cutUrlText to shortenURLString;
18 years ago
borg-0300 af1d89e381 check url == null added;
18 years ago
theli cc667b0aa5 *) htmlFilterContentScraper.java: adding support for link tag
18 years ago
borg-0300 16ba5d1b46 topwords: only [a-z] words, quality is better;
18 years ago
theli 66a58502df *) configure logging filehandler to use UTF-8 for logging messages
18 years ago
theli 26dfbb7499 *) Bugfix for UTF-8: url names are now stored properly in stackcrawl, crawler, indexing queue and should be displayed correctly on the gui
18 years ago
theli cf6acff2c2 *) Bugfix. htmlFilterInputStream document analysis did not work properly for documents smaller than the
18 years ago
borg-0300 f18304ddd3 unused/not needed imports removed;
18 years ago
orbiter ec031eb993 first version of surftipps
18 years ago
borg-0300 b174fbd0ca "import ...*" removed;
18 years ago
orbiter 807756150e patch for strange bug reported by email
18 years ago
theli 5c6251bced *) some improvements for extended html document charset support
18 years ago
theli 33f0f703c0 *) reinserting type cast again
18 years ago
orbiter 8c11a543dc fixed line ending coding
18 years ago
theli b690597275 *) adding casts to avoid compatibility problems between java 1.4 and java 1.5 writer class usage
18 years ago
theli 5afb0cbce8 *) setting default charset (for unknown documents) to iso-8859-1
18 years ago
orbiter f453c14b5d removed unreachable catch blocks and unused imports
18 years ago
theli ad7f600f25 *) Bugfix. re-enabling inheritance of serverCharBuffer from writer class
18 years ago
theli 97d2a08ef1 *) restructuring needed to support parsing of documents using various charsets
18 years ago
theli fc594e8eda *) adding httpContentLengthInputStream.java class to allow reading of http response bodies
18 years ago
low012 cd636eb00e *) Fix for the fix...
18 years ago
low012 f9a5b55a9e *) Fixed bug described in http://www.yacy-forum.de/viewtopic.php?p=25448#25448
18 years ago
orbiter 3aac5b26da - added automatic tag generation when a web page from the search results is added
18 years ago
low012 8a30c5343d *) Fixed bug where exclamation marks could get lost between [=...=] and <pre>...</pre>
18 years ago
low012 d8f4b17e31 *) Hopefully fixed bug described in http://www.yacy-forum.de/viewtopic.php?t=2825.
18 years ago
michitux 2d9496577f Removed double labels for forms in Blacklist_p.html
18 years ago
michitux aa46269eff Less margin/padding for dls (e.g. in Messages)
18 years ago
michitux 567c40f5f0 Bookmark/delete-links now visible when mouse is over the searchresult, in standard-compliant browsers with css, in Microsoft Internet Explorer via JavaScript
18 years ago
theli 0e84a969d6 *) Bugfix for serverCharBuffer read from file operation
18 years ago
theli 90ef19d778 *) first version of a serverCharBuffer
18 years ago
orbiter d374ef2bbe bugfix for tryRemoveURLs
18 years ago
orbiter f644a1c3a7 better evaluation of index abstracts
18 years ago
orbiter 1b48473bc5 bugfix to utf8 recognition
18 years ago
orbiter 90f7241b59 serverByteBuffer.trim() can now recognize utf-8 characters
18 years ago
allo 2fd610b556 http://www.yacy-forum.de/viewtopic.php?p=25611#25611
18 years ago
rramthun 20e1754379 Various fixes for the languages
18 years ago
theli e34d9b3fec *) charset aware headlines (after the serverByteBuffer.trim problem is solved)
18 years ago
theli 8115ac47b5 *) charset aware metadata parsing
18 years ago
theli 3ac30bdf22 *) some todo markers added for additional charset support
18 years ago
orbiter d54144a4e3 fixed bad snippet behavior (hopefully)
18 years ago
theli 06fa891152 *) htmlFilterContentScraper.java: using proper charset for document title
18 years ago
orbiter 5015e780c2 - simplified watchCrawler code
18 years ago
theli 74c3e7cf29 *) storing document charset into plasmaParserDocument object (is needed later by the condenser)
18 years ago
theli c5d3020941 *) better errorhandling for last commit
18 years ago