sixcooler
f64e78497a
fix for reload-feature in Crawler_p
13 years ago
cominch
a120ef660b
RDF demo servlet
13 years ago
Michael Peter Christen
638390930d
another patch to fix the Crawler_p layout
13 years ago
Michael Peter Christen
c846e9ca14
redesign of the crawler monitor page: show crawled pages instead of
queue of urls that shall be crawled
13 years ago
Michael Peter Christen
08dcf3e5d1
hack to get all results if the actual number is between 10 and 64
13 years ago
Michael Peter Christen
f8cd57c92f
new indexing strategy: ALL links that appear anywhere are indexed, not
only links where the content can be parsed. All non-parseable links are
placed into the noload queue. The search process must therefore be able
to filter out non-text search results.
- This fixes the problem that image search results appeared in the text
search.
- The interactive search can now retrieve ALL types of links
- The p2p interface is now extended to retrieve only certain types of
links (text, image, video, apps)
- The search process has an extension to filter the right document type
according to the search query
13 years ago
Michael Peter Christen
fa7b3481b3
better navigation in file search: fewer results on the first try, but much
faster. After the first search is done, buttons appear to get more
results for the same search
13 years ago
Michael Peter Christen
6e51a00a2f
Revert "fix for page navigation: show only as much pages as are available for given navigation constraints, not as given by total results size"
This reverts commit 73f5a9e8b3.
13 years ago
Michael Peter Christen
73f5a9e8b3
fix for page navigation: show only as much pages as are available for
given navigation constraints, not as given by total results size
13 years ago
Michael Peter Christen
9ad1d8dde2
complete redesign of crawl queue monitoring: do not look at a
ready-prepared crawl list but at the stacks of the domains that are
stored for balanced crawling. This affects also the balancer since that
does not need to prepare the pre-selected crawl list for monitoring. As
a consequence:
- it is no longer possible to see the correct order of next to-be-crawled
links, since that depends on the actual state of the balancer stack the
next time another url is requested for loading
- the balancer works better since the next url can be selected according
to the current situation and not according to a pre-selected order.
13 years ago
apfelmaennchen
c7f88f3fd1
fix for http://bugs.yacy.net/view.php?id=101 - the default crawl
depth for bookmarks is now editable.
13 years ago
Michael Peter Christen
f214f6ebb4
added no-load queues to the crawler monitor
13 years ago
Michael Christen
1cf0f35621
the link to the path shall be the path
13 years ago
apfelmaennchen
77317a88e0
Added nice jquery tagsinput to bookmarks dialog - similar to delicious.com ;-)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8133 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
9b0879c184
added a hint that the interactive search is only searching in the local index
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8116 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
5b2e68b60d
fixed page navigation counter
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8113 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
apfelmaennchen
77a080ced9
smaller fixes for YMarks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8105 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
apfelmaennchen
dd1482aaf5
further update to YMarks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8100 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
apfelmaennchen
564374d1fe
- included YMarks in addition to old bookmarks in yacysearchitem.html; don't get confused by the old bookmark dialog, the ymark is automatically added silently beforehand.
- reworked bookmark creation on crawlstart
- many smaller adjustments to ymarks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8072 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
apfelmaennchen
6287c2b4a9
YMarks:
- introduced tag manager - a quite powerful tool (still not 100% stable, so be careful)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8060 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
apfelmaennchen
5581be12fb
YMarks:
- added backend and api for tag management
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8058 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
apfelmaennchen
a3eebfdcba
YMarks:
- show active/running crawls
- execute crawls (works currently only if API entry is available)
- various smaller fixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8056 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
apfelmaennchen
4f95f72124
YMarks:
- working direct importer for YaCy Crawl Starts
- working direct import for old bookmarks.db
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8052 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
apfelmaennchen
a8dfe787ed
- updated to jquery flexigrid 1.1
- YMarks.html automatically recognizes if a bookmark is a crawl start
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8040 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
f8b8c82421
- refactoring of getpageinfo_p.xml (moved out of util)
- added more logging in getpageinfo_p.xml
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8037 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
ff32469272
added a link to /api/util/getpageinfo_p.xml as API to crawl start info and to ViewFile.html
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8035 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
apfelmaennchen
5f7dbe1c42
- some refactoring (ymarks)
- improvement for autotagger (is now able to create/detect multi word tags e.g. 'open source')
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8031 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
2adc30d335
suppressing size if size unknown
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8005 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
b5b09b329c
BOOSTED the image search function. The result page now shows the images as embedded image links to the original source and not via the
built-in image buffering and re-sizing servlet. The result is shown much faster now, not because YaCy no longer needs to re-size the images, but
for a rather strange other reason: because of the RFC specification (http://tools.ietf.org/html/rfc2616#section-8.1.4), a browser does not open more than
two connections to the same server at the same time. If the YaCy image servlet is used, then the target host is the YaCy host for all images,
and that prevents the images from loading in parallel.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7998 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
30d340563e
fix in result count display
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7967 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
e48ce5d80e
- style change for search box: larger font, selected by default
- style change for search results: by default no parser, size, image info
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7949 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
b0b4886618
try to avoid the unresolved pattern in search result
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7940 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
656286347e
fix for javascript error during search (not ready yet)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7923 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
0229029dcf
a bit of protection against search result bugs in interactive search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7920 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
ca09081341
better interaction
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7875 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
8e03b8ee8b
better integration of server list in interactive search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7870 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
594d8f546a
#cccamp11 maintenance fix: anons may find up to 1000 items in interactive search (was: 100)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7866 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
115abc8917
- more attributes for search progress bar
- moved cache strategy to cora package
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7778 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
fcd4b03892
show progress of search after display of results is finished
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7712 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
low012
b0bdf2d9ed
*) Oops!
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7490 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
low012
de065e594f
*) make sure that only positive values are accepted as refresh interval on Crawler Monitor page
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7489 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
621e176071
enhancement in table display of path names
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7417 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
2751c52617
layout
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7415 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
89ae6101b9
fix for NPE and added comment in search result
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7412 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
e38217fe88
small changes to scanner
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7393 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
58b59f9bc8
- a collection of bug fixes and some redesign of the Scanner class
- fixed smb crawling
- added smbget to download script generation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7381 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
c36da90261
added a very fast ftp file list generator to site crawler:
- when a site-crawl for ftp sites is started, a special directory-tree harvester now gets the complete directory structure of an ftp server at once
- the harvester runs concurrently and feeds into the normal crawl queue
also in this:
- fixed the 'start from file' crawl function
- added a link detector for the html parser. The html parser can now also extract links that are not included in <a> tags.
- as a result, a crawl can now also be started from clear-text link files
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7367 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
4565b2f2c0
removed the display option from index.html, yacysearch.html and yacyinteractive.html
instead, a setting at ConfigPortal.html can be made to define whether the topmenu shall be shown on these pages or whether there is no navigation at all.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7366 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
18d33b5c6d
fixed several search result navigation bugs
fixed bad behaviours during search result collection
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7362 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
49b5a206cd
- better calculation of search result size
- predefined search recommendations
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7361 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago