Commit Graph

5394 Commits (8429967ea7400d640a1cc294f489870c69ed3a7e)

Author SHA1 Message Date
Michael Peter Christen 4d5da75814 fix for parser problem if a <a>-tag is 'within' html tags with unclosed
13 years ago
Michael Peter Christen 91a86f0b06 fixed to network graph testing
13 years ago
Michael Peter Christen 7b5b9baee0 added citation rank to ranking profile
13 years ago
Michael Peter Christen 046f3a7e8d check if httpc has decompressed the release file and rename the file
13 years ago
Michael Christen 02e4dedff2 fix to url citation collection
13 years ago
Michael Christen e32055aa15 added stub classes for
13 years ago
Michael Christen ac5d124ee0 experimental implementation of a citation ranking as post-ranking
13 years ago
Michael Christen 8fc86fe397 added storage of full anchor link structure:
13 years ago
Michael Christen 22f05c83ff fixed default must-match filter for full domain crawls - the old filter
13 years ago
Lotus 0b3f39136e allow custom ppm lower than minimum button on /Crawler_p.html
13 years ago
Michael Peter Christen 532c7cf827 added physics experiment to the graph plotter. not active by default
13 years ago
Michael Peter Christen aba9b1bfa0 better names for elements of a linked graph
13 years ago
Michael Peter Christen 0cc0290978 bugfix for a must-not-match pattern check. This bug did not make the
13 years ago
Michael Peter Christen 2fc8ecee36 ConcurrentLinkedQueue has a VERY long return time on the .size() method.
13 years ago
Michael Peter Christen 8aba045ba1 if a new pop-up page is set in config portal, then this page applies
13 years ago
Michael Peter Christen 8c06925984 animation of the web structure picture
13 years ago
Michael Peter Christen 898fa7c3f3 use tld heuristic to check if a domain is local or global
13 years ago
Michael Peter Christen 213c8d97f2 use less processes in process pool
13 years ago
Michael Peter Christen c639248c23 protection against strange answers from remote peers during search
13 years ago
Michael Peter Christen 36e4d82b27 changed ranking
13 years ago
Michael Peter Christen 096c17e7cd added test code
13 years ago
Michael Peter Christen 665626a51b catch OOM errors during scanning
13 years ago
Michael Peter Christen 1cd711d005 added classes for citation references (for new citation ranking)
13 years ago
Michael Peter Christen 33a405dab8 ipv6 bugfix
13 years ago
Michael Peter Christen c6c61be3f0 fix for http://bugs.yacy.net/view.php?id=148
13 years ago
Michael Peter Christen e0f1e7d904 added new citation reference data structure that shall be used for a
13 years ago
Michael Peter Christen e18a4f6b74 more tolerant merge iterator
13 years ago
Michael Peter Christen 0d148c3353 more logging in resource observer
13 years ago
Michael Peter Christen 2fa037ae1d enhanced crawler
13 years ago
Michael Peter Christen e101c2e0e2 added changes from copperdust (submitted by email):
13 years ago
low012 2120db289a *) Small change which should solve problem with cgitb module in Python CGI scripts.
13 years ago
Lotus ee89cf5ae5 fix must match filter for full domain crawl
13 years ago
Michael Peter Christen 8d63a5887c bugfixes
13 years ago
Michael Peter Christen 9ad1d8dde2 complete redesign of crawl queue monitoring: do not look at a
13 years ago
Michael Peter Christen 7e4e3fe5b6 free some memory after parsing html
13 years ago
Michael Peter Christen 4540174fe0 memory hacks
13 years ago
Michael Peter Christen b4409cc803 small redesign of blob column index and usage
13 years ago
Michael Peter Christen d5c1f2746e performance hack
13 years ago
Michael Peter Christen 803963aebd performance hack: better space grow in CharBuffer (speeds up html
13 years ago
Michael Peter Christen 8b0920b0b5 tried to fix the ipv6 problem as reported in bug
13 years ago
Michael Peter Christen e2f8f263e8 changed storage of search words: keep order
13 years ago
Michael Peter Christen ed39ef2890 changed generation of protocol information
13 years ago
Michael Peter Christen 0b67a0a5d8 added a column index for tables in blob files. This is heavily used
13 years ago
Michael Peter Christen 2e5cd6a1b2 fixed parser extension deny list generation and usage
13 years ago
Michael Peter Christen 8bee1472c9 there is no noindex, only nofollow in links
13 years ago
Michael Peter Christen 3cd6dcd352 do not add new solr fields as activated fields
13 years ago
Michael Peter Christen e3bb73c3d6 serialized some database access methods
13 years ago
Michael Peter Christen 7e728867e5 added a synchronization around iterations to prevent IO-deadlocking
13 years ago
Michael Peter Christen 355ecf330f reduced target file size to 64mb
13 years ago
Michael Peter Christen 10ae6d94a1 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
13 years ago