orbiter
c08f9b36a4
refactoring of wiki parser.
...
This was done to prepare the wiki parser for use as a parser for Wikipedia dumps, which will be used for performance tests (to avoid crawling).
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5785 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
apfelmaennchen
faeff21012
- fix for display of automatic ReCrawls in surftips
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5784 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
44e01afa5b
- refactoring
...
- a little bit more abstraction
- new interfaces for index abstraction
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5783 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
82fb60a720
increased memory limit for emergency cache flush
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5782 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
4905a17f6a
moved xerces.jar from libx to lib
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5781 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
low012
9180617dd9
*) Classes to handle import of lists (especially blacklists) from XML files, not used yet, but will be used soon.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5780 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
lotus
596e6215dc
fix in case of white space in path name
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5779 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
b887f4a116
keep more free mem
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5778 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
c2359f20dd
refactoring: better abstraction of reference and metadata prototypes.
...
This prepares for introducing index tables other than the reverse text indexes, which are currently the only ones in use. The next application of the reverse index is a citation index.
Moved to version 0.74
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5777 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
ab656687d7
more strict BLOB initialization .. may also help to save some ram
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5776 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
5b138ada16
fixes to web structure reference collection and url construction
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5775 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
a29a11e526
added evaluation of incoming links in webstructure api
...
the api hash changed, new XML schema.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5774 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
f6691411b5
- migration of files from SplitTable (which are used for the URL-DB) to a different file name format.
...
- the file generation logic is slightly different: files are now limited to a maximum size of one gigabyte and a maximum age of one month.
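The size and age limits named above can be illustrated with a small check. This is only a sketch, not the SplitTable code: the function name `needs_new_file`, the reading of "one gigabyte" as 2**30 bytes and the approximation of "one month" as 30 days are all assumptions.

```python
from datetime import datetime, timedelta

MAX_SIZE_BYTES = 1024 ** 3      # assumption: "one gigabyte" read as 2**30 bytes
MAX_AGE = timedelta(days=30)    # assumption: "one month" approximated as 30 days


def needs_new_file(size_bytes: int, created: datetime, now: datetime) -> bool:
    """Return True when a table file has reached either limit (hypothetical helper)."""
    return size_bytes >= MAX_SIZE_BYTES or now - created >= MAX_AGE
```

A writer would call this before each append and start a new file whenever it returns True.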
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5773 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
shostakovich
1f37cc6107
Robots.txt is now reused after one day. See forum-topic:
...
http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1669&p=13565#p13565
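One possible reading of this change is that a cached robots.txt is treated as fresh for one day and fetched again afterwards. The sketch below shows that reading only; the class `RobotsCache`, its `fetch` callback and the in-memory dictionary are hypothetical and not taken from the YaCy sources.

```python
import time

ROBOTS_TTL_SECONDS = 24 * 60 * 60  # one day, as named in the commit message


class RobotsCache:
    """Hypothetical in-memory robots.txt cache keyed by host name."""

    def __init__(self, fetch, clock=time.time):
        self._fetch = fetch    # callable: host -> robots.txt text
        self._clock = clock    # injectable clock, useful for testing
        self._entries = {}     # host -> (loaded_at, text)

    def get(self, host):
        now = self._clock()
        entry = self._entries.get(host)
        # Reuse the cached copy while it is younger than one day.
        if entry is not None and now - entry[0] < ROBOTS_TTL_SECONDS:
            return entry[1]
        # Otherwise fetch again and remember when it was loaded.
        text = self._fetch(host)
        self._entries[host] = (now, text)
        return text
```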
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5772 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
f21a8c9e9c
a different naming scheme for BLOBArray files. This may be necessary if blobs are written more than once per second.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5771 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
7ba078daa1
- added fast site-operator
...
- refactoring merge into BLOBArray
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5770 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
b4126432bc
hardening of index dump write process
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5769 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
9bfb2641db
- removed deprecated threads
...
- added automatic HTTP client reset. This was necessary because excessive intranet crawling caused deadlocks; this hack solved the problem.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5768 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
293290c317
fix for bad assert in last commit
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5767 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
bd409fb7ba
added web structure analysis for a special domain that can be requested from the api.
...
Example:
http://localhost:8080/api/webstructure.xml?about=www.yacy.net
returns an XML document with the following content:
<?xml version="1.0"?>
<webstructure>
  <domains reference="reverse" count="1" maxref="300">
    <domain host="www.yacy.net" id="FXg39Q" date="20090401">
      <citation host="java.sun.com" id="o-R3yY" count="1" />
      <citation host="yacy-suche.de" id="-KCLaB" count="1" />
      <citation host="suma-ev.de" id="VRAHIA" count="1" />
      <citation host="www.kit.edu" id="EMaLDQ" count="1" />
      <citation host="yacy.net" id="Fh1hyQ" count="1" />
      <citation host="www.fzk.de" id="V2Kl-A" count="1" />
      <citation host="en.wikipedia.org" id="rwtdfR" count="3" />
      <citation host="vimeo.com" id="MmdQDY" count="3" />
      <citation host="liebel.fzk.de" id="sX4ozA" count="6" />
    </domain>
  </domains>
</webstructure>
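A client could read such a response roughly as follows. This is a sketch using Python's standard `xml.etree.ElementTree`; the function name `citations_by_host` is hypothetical, and `SAMPLE` is a shortened copy of the response shown in this commit message rather than live output.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the example response from the commit message.
SAMPLE = """<?xml version="1.0"?>
<webstructure>
<domains reference="reverse" count="1" maxref="300">
<domain host="www.yacy.net" id="FXg39Q" date="20090401">
<citation host="en.wikipedia.org" id="rwtdfR" count="3" />
<citation host="liebel.fzk.de" id="sX4ozA" count="6" />
</domain>
</domains>
</webstructure>"""


def citations_by_host(xml_text):
    """Map each <domain> host to a dict of citing host -> citation count."""
    root = ET.fromstring(xml_text)
    result = {}
    for domain in root.iter("domain"):
        result[domain.get("host")] = {
            c.get("host"): int(c.get("count"))
            for c in domain.findall("citation")
        }
    return result
```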
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5766 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
b6c2167143
- patch for bad web structure dumps
...
- added automatic slow-down of accesses to specific domains when access to a web page fails
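A per-domain slow-down of this kind could look like the sketch below. The commit message does not say how the delay is computed, so the class `DomainThrottle`, the doubling factor, the delay values and the reset on success are all assumptions chosen for illustration.

```python
class DomainThrottle:
    """Hypothetical per-host delay that grows after each failed access."""

    def __init__(self, base_delay=0.5, factor=2.0, max_delay=60.0):
        self.base_delay = base_delay   # seconds between accesses when healthy
        self.factor = factor           # multiplier applied after a failure
        self.max_delay = max_delay     # upper bound for the delay
        self._delay = {}               # host -> current delay in seconds

    def delay_for(self, host):
        return self._delay.get(host, self.base_delay)

    def record_failure(self, host):
        # Slow down only the host that failed; other hosts are unaffected.
        self._delay[host] = min(self.delay_for(host) * self.factor, self.max_delay)

    def record_success(self, host):
        # Assumption: a successful access restores the base delay.
        self._delay.pop(host, None)
```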
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5765 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
0139988c04
- added writing of temporary file names and renaming to final file name when index dump/merge are done. Interrupted merges can be cleaned up.
...
- added clean-up of unfinished merges and unused idx/gap files
- enhanced merge file selection method
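The write-to-temporary-name-then-rename pattern described above can be sketched as follows. This is not the YaCy implementation: the `.tmp` suffix and the function names `write_then_rename` and `clean_unfinished` are hypothetical.

```python
import os

TMP_SUFFIX = ".tmp"  # hypothetical marker for unfinished files


def write_then_rename(final_path, data):
    """Write data under a temporary name, then rename it to the final name."""
    tmp_path = final_path + TMP_SUFFIX
    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
    # The final name only appears once the file is completely written.
    os.replace(tmp_path, final_path)


def clean_unfinished(directory):
    """Delete leftover temporary files from interrupted writes; return their names."""
    removed = []
    for name in sorted(os.listdir(directory)):
        if name.endswith(TMP_SUFFIX):
            os.remove(os.path.join(directory, name))
            removed.append(name)
    return removed
```

Because an interrupted write leaves only a file with the temporary suffix, a start-up pass such as `clean_unfinished` can discard it without touching completed files.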
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5764 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
3621aa96ab
- added a memory protection for the IndexCell migration
...
- fix for bad cell file selection
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5763 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
568e8f1741
fix in unmountBLOB
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5762 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
9da69d6b68
- better selection of files to be merged
...
- fix for getChannel().close(), which works on windows but not on macs and linux
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5761 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
d39a5b42ca
more care about open file handles. Now files also close on windows and can be deleted afterwards.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5760 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
029495e64d
fixed bug introduced in SVN 5756 in EcoTable.put()
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5759 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
587838bd09
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5758 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
d2e2420a68
- added another file selection method for index cell merge
...
- more hacks to check that files are closed properly and file handles do not exist after files are closed.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5757 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
96eaecda3e
- added migration class to go from index collections to the index cell data structure.
...
- added better control over file deletion, because this sometimes fails, especially on windows
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5756 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
apfelmaennchen
9ab009b16b
fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1890#p13476
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5755 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
0f0b4aec75
better index cell merge logic
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5754 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
832fef670f
migration of urls-files into subdirectory METADATA
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5753 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
fa07234d4e
fix for clear method: now deletes files
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5752 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
apfelmaennchen
eb65990f85
small fix for opera in yacyui-portalsearch
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5751 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
apfelmaennchen
695c420bcd
small fix for yacyui-portalsearch
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5750 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
apfelmaennchen
95885a263a
- added default properties to yacyui-portalsearch
...
- see http://localhost:8080/yacy/ui/yacyui-portaltest.html
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5749 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
apfelmaennchen
c001a020a9
- small modifications to yacyui-portalsearch
...
- see http://forum.yacy-websuche.de/viewtopic.php?f=15&t=1762&p=13459#p13459
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5748 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
lulabad
df87e4dbf6
missing count of sent index and URLs
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5747 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
apfelmaennchen
34a825f90d
small fix for yacyui-portaltest.html
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5746 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
apfelmaennchen
9f9d7f875d
small fix
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5745 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
apfelmaennchen
453f3aaa94
RichClient: further clean-up
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5744 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
apfelmaennchen
e888c9a934
RichClient:
...
- renamed base theme to start theme
- removed all but start theme
- additional themes can be downloaded from http://jquery-ui.googlecode.com/files/jquery-ui-themes-1.7.zip
- or a custom theme can be generated at http://jqueryui.com/themeroller/
- themes are installed into DATA/LOCALE/htroot/yacy/ui/css/themes
- update for RichClient theme selection will follow soon
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5743 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
apfelmaennchen
42c5f930c8
reverted an accidental commit
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5742 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
apfelmaennchen
b5e6232f8d
small correction of font-size for portal search
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5741 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
apfelmaennchen
7425c6c3ca
added an ajax loading graph to portal search
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5740 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
apfelmaennchen
a975ae4a7e
Added YaCy portal search: http://localhost:8080/yacy/ui/yacyui-portaltest.html
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5739 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
apfelmaennchen
b57a1820bd
small fix for jquery-faviconize-1.0.js to handle https properly
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5738 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
apfelmaennchen
075b58a0a9
minor fixes for RichClient search
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5737 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
borg-0300
c450e3746b
svn attributes added
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5736 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago