167 Commits (b3ffcde0c77b07b0f5f74c34751eeb3999ba8edf)

Author SHA1 Message Date
Michael Peter Christen 43f3345c90 - removed dependencies from URIMetadataRow and made direct access to
12 years ago
Michael Peter Christen 5f0ab25382 removed the option to prevent removal of & parts inside of the
12 years ago
Michael Peter Christen 554db5608b fix for ViewFile
12 years ago
Michael Peter Christen 1533bfd63b refactoring
12 years ago
Michael Peter Christen 00c1c777fa refactoring
12 years ago
Michael Peter Christen d8425e6809 added collections to crawl monitor
12 years ago
Michael Peter Christen 0cab06c47c refactoring
12 years ago
Michael Peter Christen 18f989dfb1 - refactoring (load -> getMetadata)
12 years ago
Michael Peter Christen b51df6c7e8 - added coordinate storage in solr schema
12 years ago
Michael Peter Christen 24d9db1613 snippet retrieval loading processes may use a smaller minimum load time
12 years ago
Michael Peter Christen cba4ab862e fix for http://bugs.yacy.net/view.php?id=202
12 years ago
orbiter 69e743d9e3 - more abstraction for the RWI index as preparation for solr integration
12 years ago
orbiter 0cbda0b2b8 - replaced all length() == 0 and size() == 0 with isEmpty()
13 years ago
Michael Peter Christen ea10766bfd cleaned unnecessary nested code
13 years ago
orbiter 78fc3cf8f8 refactoring and new usage of SentenceReader: this class appeared as one
13 years ago
Michael Peter Christen 1825f165b8 better integration of blacklist according to use case
13 years ago
Michael Peter Christen 03280fb161 removed segments-concept and the Segments class:
13 years ago
Michael Peter Christen 9116013c64 - allow lazy initialization of solr value (if using 'lazy', then no
13 years ago
Michael Peter Christen 52f5d40043 better abstraction of document model generation
13 years ago
Michael Peter Christen 64c0268b2b show triplestore metadata in yacydoc and viewfile
13 years ago
Michael Peter Christen 8b974905ee changed log-in text for all servlets with authentication:
13 years ago
Michael Peter Christen a3badd3205 changed search process for images: no more media snippet load process,
13 years ago
Michael Peter Christen 33d1062c79 refactoring: the cache belongs to the crawler
13 years ago
Michael Christen 9e5894c784 Removed handling of components objects for URIMetadataRows.
13 years ago
Michael Christen 204c29f010 small bugfixes for search result display and cache display
13 years ago
orbiter e22f8497c9 - tested the ARC methods
13 years ago
orbiter 5a55397f99 some last-minute performance hacks
13 years ago
orbiter 0d858d48ec replaced String with StringBuilder in suggestion process
13 years ago
orbiter 37e35f2741 normalization of url using urlencoding/decoding
13 years ago
orbiter d2ea250d99 refactoring:
13 years ago
orbiter b00e69c5df removed test output
13 years ago
orbiter 5dd2efc9a2 - bugfixes in html parser
13 years ago
sixcooler 59b767eebd stop loading via http at defined maximum of bytes - even size is unknown before loading
13 years ago
orbiter 115abc8917 - more attributes for search progress bar
14 years ago
orbiter 4bea3f9714 hack to reduce resource contention caused by massive UTF8 decodings which use java.nio resources:
14 years ago
orbiter 5b579e21a3 code cleanup
14 years ago
orbiter 9b25d07295 - added geo information parsing to html parser
14 years ago
low012 2861d0888a *) simplified code *) fixed potential NumberFormatExceptions
14 years ago
orbiter 694fa3a2a5 - replaced more direct string-based UTF-8 conversions by predefined UTF-8 conversion
14 years ago
orbiter cb1f49d0f2 replaced all 'new String' with default encoding (missing) or UTF-8 encoding with a String generation method that uses a pre-defined Charset constant for UTF-8. This avoids a cache-lookup for the Charset object using String hashing of the String 'UTF-8'.
14 years ago
orbiter 4588b5a291 - fixed document number limitation for crawls that restrict the number of documents per domain
14 years ago
low012 3d95981f7d *) cleaning up the code a little bit
14 years ago
f1ori 9d2159582f * fix system update if urls are in blacklist (for example for very general blacklists like *.de)
14 years ago
orbiter 7bb4b001ed - view image files from cache
14 years ago
f1ori 7d8de34778 * add a bit documentation to DigestURI, use DigestURI(string) instead of DigestURI(string, null)
14 years ago
orbiter 58e74282af added a word counter statistic in condenser which is used by the did-you-mean to calculate best matches for given search words.
14 years ago
mikeworks 61e87c0b14 IndexControlRWIs_p.html, IndexControlURLs_p.html, ViewFile.html/.java: changes to HTML output and &nbsp; in case of empty values for XHTML strict / transitional validation
14 years ago
orbiter 10a9cb1971 simplified snippet computation process and separated the algorithm into two classes
14 years ago
orbiter 3197ca42ed preparations to move the HTCache into cora:
14 years ago
orbiter b6fb239e74 redesign of parser interface:
15 years ago