this makes YaCy search results VERY fast for all verify=false search cases
and it also improves the search speed for all other snippet-fetch cases.
With this change my peer performed 100 Queries Per Second (!!!) while doing 10 queries simultaneously (!!!)
in an intranet index of 20000 URLs on my 16-core Mac
Check this yourself by doing:
cd bin
./searchtestmulti.sh
after the run has finished, divide 1000 by the reported time per query (this gives the qps for one thread)
and then multiply the result by 10 (because 10 search threads have been started)
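The arithmetic in the two lines above can be sketched as follows. This is only an illustration: the class and method names are made up, it assumes that the script reports the time per query in milliseconds, and the 100 ms used in main is an assumed input chosen because it reproduces the 100 qps figure with 10 threads.

```java
// Hypothetical helper, not part of YaCy: turns the reported time per query into qps.
final class QpsEstimate {

    /** Queries per second for one thread, given the time per query in milliseconds (assumed unit). */
    static double perThread(double millisPerQuery) {
        if (millisPerQuery <= 0) {
            throw new IllegalArgumentException("millisPerQuery must be positive");
        }
        return 1000.0 / millisPerQuery;
    }

    /** Aggregate queries per second across all concurrently running search threads. */
    static double total(double millisPerQuery, int threads) {
        return perThread(millisPerQuery) * threads;
    }

    public static void main(String[] args) {
        // Assumed input: 100 ms per query, 10 search threads.
        System.out.println(total(100.0, 10)); // prints 100.0
    }
}
```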
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7231 6c8d7289-2bf4-0310-a012-ef5d649a1542
*) more beautiful and easier to understand code (IMO)
*) added display= parameter to a lot of links in Wiki.html
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7226 6c8d7289-2bf4-0310-a012-ef5d649a1542
terminal_p.html: Put back the old ID which was really easy to find
IndexCreate.js: Because XHTML 1.0 Strict does not allow the name attribute on some elements, rewrote most element access functions to use getElementById
Table_API_p.html and all other html pages: Some XHTML 1.0 Strict fixes, changed the checkAll JavaScript, marked the first row with checkboxes as unsortable where applicable
Table_API_p.java and all other java pages: Escaped possible ampersands in URL lines (& -> &amp;) so that the generated source validates as XHTML 1.0 Strict
--> All Index Create pages should validate now. Hope I did not break anything else (too much :-)
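The ampersand fix mentioned for the java pages could look roughly like the sketch below. It is not the code from this commit: the class name, the method name and the regular expression are assumptions made for illustration. It replaces every bare & with &amp; and leaves ampersands that already start an entity untouched, so a line can be escaped twice without harm.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not taken from the YaCy sources.
final class AmpersandEscaper {

    // Matches '&' unless it already starts a named, decimal or hexadecimal entity.
    private static final Pattern BARE_AMP =
            Pattern.compile("&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)");

    /** Returns the input with every bare ampersand replaced by the &amp; entity. */
    static String escape(String line) {
        return BARE_AMP.matcher(line).replaceAll("&amp;");
    }

    public static void main(String[] args) {
        // The query string here is an invented example.
        System.out.println(escape("Table_API_p.html?a=1&b=2"));
    }
}
```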
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7225 6c8d7289-2bf4-0310-a012-ef5d649a1542
- better crawl start for file paths and smb paths
- added time-out wrapper for dns resolving and reverse resolving to prevent blocking
- fixed intranet scanner result list check boxes
- prevented htcache usage in case of file and smb crawling (not necessary, documents are locally available)
- fixed rss feed loader
- fixed sitemap loader, which had not been restricted to single files (crawl-depth must be zero)
- crawl result lists are now cleared when the network is switched
- higher maximum file size for crawler
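A time-out wrapper for dns resolving, as mentioned in the list above, can be sketched with the standard java.util.concurrent classes. This is a minimal illustration and not the wrapper added in this commit: the class name, the method names and the choice to return null on failure are assumptions.

```java
import java.net.InetAddress;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not taken from the YaCy sources.
final class TimedLookup {

    // Daemon threads, so that a hung lookup cannot keep the JVM alive.
    private static final ExecutorService POOL = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "timed-lookup");
        t.setDaemon(true);
        return t;
    });

    /** Runs the task and returns its result, or the fallback if it fails or takes too long. */
    static <T> T callWithTimeout(Callable<T> task, long timeoutMillis, T fallback) {
        Future<T> future = POOL.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // The worker may keep running for a while, but the caller is no longer blocked.
            future.cancel(true);
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            future.cancel(true);
            return fallback;
        } catch (ExecutionException e) {
            return fallback; // for example an UnknownHostException
        }
    }

    /** Forward lookup; returns null if the host cannot be resolved in time. */
    static InetAddress resolve(String host, long timeoutMillis) {
        return callWithTimeout(() -> InetAddress.getByName(host), timeoutMillis, null);
    }

    /** Reverse lookup; returns null if no name is available in time. */
    static String reverseResolve(InetAddress address, long timeoutMillis) {
        return callWithTimeout(address::getCanonicalHostName, timeoutMillis, null);
    }
}
```

Running the lookup on a separate thread means a slow name server costs the caller at most the given time-out, which is the point of such a wrapper for the intranet scanner and the crawler.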
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7214 6c8d7289-2bf4-0310-a012-ef5d649a1542
- nobody understands the auto-dom filter without a lengthy introduction about the function of a crawler
- nobody ever used the auto-dom filter other than with a crawl depth of 1
- the auto-dom filter was buggy: the filter did not survive a restart, after which the search index contained waste
- the function of the auto-dom filter was in fact to just load a link list from the given start url and then start separate crawls for all these urls restricted by their domain
- the new Site Link-List option shows the target urls in real-time during input of the start url (like the robots check) and gives transparent feedback about what it does before it can be used
- the new option also fits into the easy site-crawl start menu
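The behaviour described above (take the link list of the start url, then start one crawl per linked url, restricted to its domain) can be illustrated with a small sketch. It is not the YaCy implementation: the class name, the method name and the shape of the filter pattern are assumptions, and fetching the link list from the start url is left out.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper, not taken from the YaCy sources.
final class SiteLinkList {

    /**
     * Maps each distinct host in the link list to a pattern that a url must
     * match to stay inside that host. Entries without a usable host are skipped.
     */
    static Map<String, String> domainFilters(List<String> links) {
        Map<String, String> filters = new LinkedHashMap<>();
        for (String link : links) {
            String host;
            try {
                host = URI.create(link).getHost();
            } catch (IllegalArgumentException e) {
                continue; // not a syntactically valid url
            }
            if (host == null) {
                continue;
            }
            host = host.toLowerCase(Locale.ROOT);
            // Assumed filter shape: http or https, exactly this host, any path.
            filters.putIfAbsent(host, "https?://" + Pattern.quote(host) + "(/.*)?");
        }
        return filters;
    }
}
```

Each entry of the returned map would then correspond to one separate crawl whose urls must match the pattern stored for that host.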
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7213 6c8d7289-2bf4-0310-a012-ef5d649a1542
terminal_p.html: Set new link for starting a crawl to CrawlStartSite_p.html and replaced the old embed element for the Among.us Flash object with their new JS, which takes care of adding the object correctly
de.lng: Moved the translations for the JS part from yacyinteractive.html to the yacyinteractive.js part
--> Terminal page is now valid XHTML 1.0 Transitional
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7211 6c8d7289-2bf4-0310-a012-ef5d649a1542
ConfigPortal.html: Fixed some HTML problems to validate at least as XHTML 1.0 Transitional - for Strict, the target attribute of the a element would have to be removed
yacyinteractive.html: Moved all JS code to an external yacyinteractive.js file in JS folder
yacysearch.html: Removed embedded scripts from in between the body tags - now everything is loaded in the header
de.lng: Added translation for the yacyinteractive.html result counter, just in case JS files will be parsed at some point
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7208 6c8d7289-2bf4-0310-a012-ef5d649a1542
CrawlStartIntranet_p.html: New Intranet Crawl Start Servlet - minor HTML changes to get XHTML 1.0 Strict validation, removed (duplicate) name attributes, removed a stray closing </dt>
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7204 6c8d7289-2bf4-0310-a012-ef5d649a1542
- migrated the 'yacy' user agent to 'yacybot' in many client methods since the 'yacy' user agent is only used for the proxy
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7199 6c8d7289-2bf4-0310-a012-ef5d649a1542
WatchWebStructure_p.html: Added JS verification of RGB color codes (currently only RGB value is checked but this could be enhanced to also check for websafe colors)
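The check described above, plus the possible websafe extension, can be sketched as follows. The commit implements the verification in JS inside the page; this sketch uses Java only to show the logic. The class and method names are made up, and it assumes that a color code is entered as six hexadecimal digits without a leading '#'.

```java
// Hypothetical helper, not taken from the YaCy sources.
final class RgbCode {

    /** True if the code consists of exactly six hexadecimal digits (assumed input format). */
    static boolean isValid(String code) {
        return code != null && code.matches("[0-9A-Fa-f]{6}");
    }

    /** True if each of the three channels is one of 00, 33, 66, 99, CC or FF. */
    static boolean isWebSafe(String code) {
        return code != null && code.matches("(?i)(?:00|33|66|99|CC|FF){3}");
    }

    public static void main(String[] args) {
        System.out.println(isValid("FFCC00") + " " + isWebSafe("FFCC01"));
    }
}
```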
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7195 6c8d7289-2bf4-0310-a012-ef5d649a1542
This should avoid confusion after a search for a word where it is possible to delete the word. If a delete button is shown to delete the word, then there should not be a button available to delete the whole index, to avoid misuse when a user searches for a word only to delete it.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7194 6c8d7289-2bf4-0310-a012-ef5d649a1542