orbiter
a29a11e526
added evaluation of incoming links in webstructure api
the api hash changed, new XML schema.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5774 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
f6691411b5
- migration of files from SplitTable (which are used for the URL-DB) to a different file name format.
- the file generation logic is slightly different: files are now limited to a maximum size of one gigabyte and a maximum age of one month.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5773 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
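The size and age limits described in revision 5773 can be sketched as a simple check. This is only an illustration, not the SplitTable code: the function name `needs_new_file` and the 30-day reading of "one month" are assumptions.

```python
from datetime import datetime, timedelta

MAX_FILE_BYTES = 1024 ** 3         # one gigabyte, as stated in the commit message
MAX_FILE_AGE = timedelta(days=30)  # "one month", approximated as 30 days (assumption)


def needs_new_file(size_bytes: int, created: datetime, now: datetime) -> bool:
    """Return True when a table file has reached its size or age limit.

    Hypothetical helper: a writer would start a fresh file once this is True.
    """
    too_big = size_bytes >= MAX_FILE_BYTES
    too_old = now - created >= MAX_FILE_AGE
    return too_big or too_old
```

A writer would call this before each append and open a new file whenever it returns True.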
shostakovich
1f37cc6107
Robots.txt is now reused after one day. See forum-topic:
http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1669&p=13565#p13565
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5772 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
f21a8c9e9c
a different naming scheme for BLOBArray files. This may be necessary if blobs are written more often than once in a second.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5771 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
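Revision 5771 does not spell out the new naming scheme. The sketch below only illustrates why second-resolution names can collide and how a millisecond timestamp avoids that. The function `blob_file_name`, the `.blob` suffix and the name layout are hypothetical and are not taken from BLOBArray.

```python
from datetime import datetime


def blob_file_name(prefix: str, written_at: datetime) -> str:
    """Build a file name whose timestamp part has millisecond resolution.

    Hypothetical layout: <prefix>.<yyyyMMddHHmmssSSS>.blob
    """
    millis = written_at.microsecond // 1000
    stamp = written_at.strftime("%Y%m%d%H%M%S") + f"{millis:03d}"
    return f"{prefix}.{stamp}.blob"
```

Two writes within the same second then get distinct names as long as they are at least one millisecond apart.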
orbiter
7ba078daa1
- added fast site-operator
- refactoring merge into BLOBArray
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5770 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
b4126432bc
hardening of index dump write process
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5769 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
9bfb2641db
- removed deprecated threads
- added automatic HTTP client reset. This was necessary because excessive intranet crawling caused deadlocks; this hack solved the problem.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5768 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
293290c317
fix for bad assert in last commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5767 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
bd409fb7ba
added web structure analysis for a special domain that can be requested from the api.
Example:
http://localhost:8080/api/webstructure.xml?about=www.yacy.net
returns an XML document with the following content:
<?xml version="1.0"?>
<webstructure>
<domains reference="reverse" count="1" maxref="300">
<domain host="www.yacy.net" id="FXg39Q" date="20090401">
<citation host="java.sun.com" id="o-R3yY" count="1" />
<citation host="yacy-suche.de" id="-KCLaB" count="1" />
<citation host="suma-ev.de" id="VRAHIA" count="1" />
<citation host="www.kit.edu" id="EMaLDQ" count="1" />
<citation host="yacy.net" id="Fh1hyQ" count="1" />
<citation host="www.fzk.de" id="V2Kl-A" count="1" />
<citation host="en.wikipedia.org" id="rwtdfR" count="3" />
<citation host="vimeo.com" id="MmdQDY" count="3" />
<citation host="liebel.fzk.de" id="sX4ozA" count="6" />
</domain>
</domains>
</webstructure>
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5766 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
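A client could read the webstructure.xml response from revision 5766 roughly as follows. This is a minimal sketch using Python's standard library. The helper name `citations` is an assumption, and `SAMPLE` is a shortened copy of the response shown in the commit message.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the example response from the commit message.
SAMPLE = """<?xml version="1.0"?>
<webstructure>
<domains reference="reverse" count="1" maxref="300">
<domain host="www.yacy.net" id="FXg39Q" date="20090401">
<citation host="java.sun.com" id="o-R3yY" count="1" />
<citation host="vimeo.com" id="MmdQDY" count="3" />
<citation host="liebel.fzk.de" id="sX4ozA" count="6" />
</domain>
</domains>
</webstructure>
"""


def citations(xml_text: str) -> dict:
    """Map each citation host to its count attribute (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    return {c.get("host"): int(c.get("count")) for c in root.iter("citation")}


print(citations(SAMPLE))
```

Against a running peer, the same function would be applied to the body returned by the URL given in the commit message.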
orbiter
b6c2167143
- patch for bad web structure dumps
- added automatic slow-down of accesses to specific domains when access to a web page fails
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5765 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
0139988c04
- added writing of temporary file names and renaming to final file name when index dump/merge are done. Interrupted merges can be cleaned up.
- added clean-up of unfinished merges and unused idx/gap files
- enhanced merge file selection method
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5764 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
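The write-to-temporary-name-then-rename pattern from revision 5764 can be sketched like this. It is a generic illustration, not the YaCy dump/merge code: the `.tmp` suffix and both function names are assumptions.

```python
import os

TMP_SUFFIX = ".tmp"  # hypothetical marker for unfinished files


def write_then_rename(final_path: str, data: bytes) -> None:
    """Write data under a temporary name, then rename it to the final name."""
    tmp_path = final_path + TMP_SUFFIX
    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # make sure the bytes are on disk before renaming
    os.replace(tmp_path, final_path)  # the final name appears only when complete


def remove_unfinished(directory: str) -> list:
    """Delete leftovers of interrupted writes and return their names."""
    removed = []
    for name in sorted(os.listdir(directory)):
        if name.endswith(TMP_SUFFIX):
            os.remove(os.path.join(directory, name))
            removed.append(name)
    return removed
```

Because a file only receives its final name after it is fully written, anything still carrying the temporary suffix at start-up can be treated as an interrupted dump or merge and removed.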
orbiter
3621aa96ab
- added a memory protection for the IndexCell migration
- fix for bad cell file selection
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5763 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
568e8f1741
fix in unmountBLOB
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5762 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
9da69d6b68
- better selection of files to be merged
- fix for getChannel().close(), which works on windows but not on macs and linux
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5761 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
d39a5b42ca
more care about open file handles. Now files also close on windows and can be deleted afterwards.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5760 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
029495e64d
fixed bug introduced in SVN 5756 in EcoTable.put()
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5759 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
587838bd09
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5758 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
d2e2420a68
- added another file selection method for index cell merge
- more hacks to check that files are closed properly and that no file handles remain after files are closed.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5757 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
96eaecda3e
- added migration class to go from index collections to the index cell data structure.
- added better control over file deletion, because this sometimes fails, especially on windows
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5756 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
apfelmaennchen
9ab009b16b
fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1890#p13476
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5755 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
0f0b4aec75
better index cell merge logic
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5754 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
832fef670f
migration of urls-files into subdirectory METADATA
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5753 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
fa07234d4e
fix for clear method: now deletes files
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5752 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
apfelmaennchen
eb65990f85
small fix for opera in yacyui-portalsearch
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5751 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
apfelmaennchen
695c420bcd
small fix for yacyui-portalsearch
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5750 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
apfelmaennchen
95885a263a
- added default properties to yacyui-portalsearch
- see http://localhost:8080/yacy/ui/yacyui-portaltest.html
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5749 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
apfelmaennchen
c001a020a9
- small modifications to yacyui-portalsearch
- see http://forum.yacy-websuche.de/viewtopic.php?f=15&t=1762&p=13459#p13459
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5748 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
lulabad
df87e4dbf6
missing count of sent Index and URLs
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5747 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
apfelmaennchen
34a825f90d
small fix for yacyui-portaltest.html
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5746 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
apfelmaennchen
9f9d7f875d
small fix
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5745 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
apfelmaennchen
453f3aaa94
RichClient: further clean-up
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5744 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
apfelmaennchen
e888c9a934
RichClient:
- renamed base theme to start theme
- removed all but start theme
- additional themes can be downloaded from http://jquery-ui.googlecode.com/files/jquery-ui-themes-1.7.zip
- or a custom theme can be generated at http://jqueryui.com/themeroller/
- themes are installed into DATA/LOCALE/htroot/yacy/ui/css/themes
- update for RichClient theme selection will follow soon
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5743 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
apfelmaennchen
42c5f930c8
reverted an accidental commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5742 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
apfelmaennchen
b5e6232f8d
small correction of font-size for portal search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5741 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
apfelmaennchen
7425c6c3ca
added an ajax loading graph to portal search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5740 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
apfelmaennchen
a975ae4a7e
Added YaCy portal search: http://localhost:8080/yacy/ui/yacyui-portaltest.html
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5739 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
apfelmaennchen
b57a1820bd
small fix for jquery-faviconize-1.0.js to handle https properly
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5738 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
apfelmaennchen
075b58a0a9
minor fixes for RichClient search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5737 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
borg-0300
c450e3746b
svn attributes added
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5736 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
37f892b988
added new concurrent merger class for IndexCell RWI data
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5735 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
borg-0300
8c494afcfe
svn attributes added
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5734 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
67aaffc0a2
- added Latency control to the crawler:
because of the strongly enhanced indexing speed when using the new IndexCell RWI data structures (> 2000PPM on my notebook), it is now necessary to control the crawling speed depending on the response time of the target server (which is also YaCy in case of some intranet indexing use cases).
The latency factor in crawl delay times is derived from the time that a target host takes to answer HTTP requests. For internet domains, the crawl delay is at least twice the response time; in intranet cases, the delay time is now half of the response time.
- added API to monitor the latency times of the crawler:
a new API at /api/latency_p.xml returns the current response times of domains, the time when the crawler last accessed each domain, and many more attributes.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5733 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
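The latency rule from revision 5733 can be written down as a small function. This is a sketch of the rule as worded in the commit message, not the crawler's actual code. The name `crawl_delay_ms` and the `configured_min_ms` parameter (standing in for any other minimum delay the crawler applies) are assumptions.

```python
def crawl_delay_ms(response_time_ms: float, intranet: bool,
                   configured_min_ms: float = 0.0) -> float:
    """Derive a per-host crawl delay from the host's measured response time.

    Internet hosts: at least twice the response time.
    Intranet hosts: half of the response time.
    """
    if intranet:
        return response_time_ms / 2
    # "a minimum of twice the response time": never go below 2x the latency
    return max(configured_min_ms, 2 * response_time_ms)


print(crawl_delay_ms(100, intranet=False))  # internet host answering in 100 ms
print(crawl_delay_ms(100, intranet=True))   # intranet host answering in 100 ms
```

Tying the delay to the measured response time means a slow target is automatically crawled more gently, while a fast intranet host can still be indexed at a high rate.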
lotus
7426dde6a6
windows installer:
* install 64 bit JRE in case of 64 bit OS (not tested yet)
* less languages but localized hint boxes
* some cosmetics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5732 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
0926310461
another performance hack
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5731 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
ebe5d69d14
performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5730 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
61f9dbf0cc
- fixed a display problem in watch crawler
- another small enhancement in balancer
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5729 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
b3f75e48fa
- enhanced balancer: auto-solving of waiting-deadlocks
- removed deprecated cache-init size value
- more debug lines for IndexCell cache dump merge
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5728 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
9a90ea05e0
added a merge operation for IndexCell data structures
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5727 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
d99ff745aa
fix for http://forum.yacy-websuche.de/viewtopic.php?p=13378#p13378
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5726 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
0c3ab291c4
fix for http://forum.yacy-websuche.de/viewtopic.php?p=13354#p13354
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5725 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago