yacy_search_server/htroot/Help.html

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head>
<title>YaCy: Help</title>
#[metas]#
</head>
<body marginheight="0" marginwidth="0" leftmargin="0" topmargin="0">
#[header]#
<br><br>
<h2>Help</h2>

<p>
This is a distributed web crawler and also a caching HTTP proxy. You are using the <i>online-interface</i> of the application. You can use this interface to configure your personal settings, proxy settings, access control and crawling properties. You can also use this interface to start crawls, send messages to other peers and monitor your index, cache status and crawling processes. Most importantly, you can use the search page to search either your own or the <i>global</i> index.
</p>

<p>
For more detailed information, visit the <a href="http://www.yacy.net/yacy">YaCy homepage</a>.
</p>

<h3>Local and Global Search: Options and Functions</h3>
The proxy provides a search interface that accesses your local index, created from web pages that passed the proxy.
The search can also be applied globally, by searching other peers. You can use the following options to enhance your search results:

<table border="0" cellspacing="1" cellpadding="3" width="100%">

<tr><td width="30%" valign="top"><b>Search Word List</b></td><td width="70%">
You can search for several words simultaneously. Words must be separated by a single space.
The words are combined conjunctively: every word must occur in a result, not just any one of them.
If you do a global search (see below), you may get different results each time you search.

</td></tr><tr><td valign="top"><b>Maximum Number of Results</b></td><td>
You can select the maximum number of links to return. We do not yet support multiple result pages that would list virtually every possible link.
Instead, we encourage you to refine the search result by submitting more search words.

</td></tr><tr><td valign="top"><b>Result Order Options</b></td><td>
The search engine provides an experimental 'Quality' ranking. In contrast to other well-known search engines, we also provide
a result order by date. If you change the order to 'Date-Quality', the most recently updated page from the search results is listed first.
For pages that have the same date, the secondary order, 'Quality', is applied.

</td></tr><tr><td valign="top"><b>Resource Domain</b></td><td>
This search engine is designed to search the web pages that pass the proxy. But the search index is distributed to other peers as well,
so you can also search globally. This function is currently only rudimentary, but can be chosen for test cases. Future releases will
automatically distribute index information <i>before</i> a search happens to form an efficient distributed hash table, enabling a very fast global search.

</td></tr><tr><td valign="top"><b>Maximum Search Time</b></td><td>
Searching the local index is extremely fast: it happens within milliseconds, even for a large number (millions) of pages. But searching the
global index takes more time, because the remote peers that hold the best search results must be found first. This is especially the case while the
distributed index is in test mode. Search results get more stable (repeated global searches produce more similar results) the longer
the search time is.

</td></tr></table>
<hr>
You may want to use access keys to navigate through the YaCy web interface:<p>

    <li> Windows and Internet Explorer: Alt + Accesskey + Enter<br>
    <li> Windows and Mozilla/Firefox/Netscape: Alt + Accesskey<br>
    <li> Windows and Opera: Shift + Esc + Accesskey<br>
    <li> Macintosh and Internet Explorer: Ctrl + Accesskey + Enter<br>
    <li> Macintosh and Safari: Ctrl + Accesskey<br>
    <li> Macintosh and Mozilla/Firefox/Netscape: Ctrl + Accesskey<br>
    <li> Macintosh and Opera: Shift + Esc + Accesskey<br>
    <li> Linux Mandrake and Galeon/Mozilla: Alt + Accesskey<br>
    <li> All OS and Amaya: Ctrl + Accesskey<p>
	
s --> Search Page<br>
n --> News<br>
w --> Network<br>
t --> Status<br>
<br>
<br>
<hr>
YaCy uses Regular Expressions for some functions, for example in the blacklist.<br>
<br>
There are several standards for regular expressions; YaCy uses the syntax used by Perl 5.<br>
Here is a short overview of the syntax, which should suffice for most cases:<br>
<br>
<br>
<table>
<tr><td>.</td><td>: arbitrary character</td></tr>
<tr><td>x</td><td>: character x</td></tr>
<tr><td>[^x]</td><td>: any character except x</td></tr>
<tr><td>x*</td><td>: 0 or more times x</td></tr>
<tr><td>x?</td><td>: 0 or 1 time x</td></tr>
<tr><td>x+</td><td>: 1 or more times x</td></tr>
<tr><td>xy</td><td>: concatenation of x and y</td></tr>
<tr><td>x|y</td><td>: x or y</td></tr>
<tr><td>[abc]</td><td>: a or b or c (same as a|b|c)</td></tr>
<tr><td>[a-c]</td><td>: a or b or c (same as above)</td></tr>
<tr><td>x{n}</td><td>: exactly n occurrences of x</td></tr>
<tr><td>x{n,}</td><td>: at least n occurrences of x</td></tr>
<tr><td>x{n,m}</td><td>: at least n and at most m occurrences of x</td></tr>
<tr><td>( )</td><td>: grouping, used to override the precedence of operators</td></tr>
<tr><td>\</td><td>: escape character, used to escape special characters (for example "[" or "*"), so that they lose their special meaning</td></tr>
</table>
<br>
<br>
Regex operators follow a fixed precedence (descending): quantifiers (*, +, ?, {}), concatenation, alternation (|). This can be overridden with parentheses.<br>
<br>
Example:<br>
<br>
.*heise.de/.*/[0-9]+<br>
<br>
This matches heise.de/ with a string in front of it, for example "http://www.", followed by any string, then a slash and a number. The dot in "heise.de" is not escaped with "\" here: an unescaped dot represents any character, so it also matches the "." itself. To match only a literal dot, write "\." instead.<br>
A possible URL which would match this regexp is: http://www.heise.de/newsticker/meldung/59421<br>
A URL which would not match is: http://www.heise.de/tp/r4/artikel/20/20701/1.html<br>
It ends in ".html", which the Regular Expression does not cover, so the URL as a whole does not match.
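<br>
<br>
The example above can be tried out with a few lines of Java. This is only a minimal sketch and not YaCy's own code: the class name BlacklistRegexDemo and the method matchesWhole are made up for illustration, and it assumes that the expression has to cover the entire URL, as the non-matching example above suggests. It also shows the difference between the unescaped dot and an escaped one.<br>
<pre>
```java
import java.util.regex.Pattern;

public class BlacklistRegexDemo {
    // Pattern from the example above; the dot in "heise.de" is unescaped.
    static final String LOOSE = ".*heise.de/.*/[0-9]+";
    // Hypothetical variant with an escaped dot, which only accepts a literal ".".
    static final String STRICT = ".*heise\\.de/.*/[0-9]+";

    // Assumption: the expression must match the whole URL, not just a part of it.
    static boolean matchesWhole(String regex, String url) {
        return Pattern.matches(regex, url);
    }

    public static void main(String[] args) {
        // Ends in a slash followed by digits: matches.
        System.out.println(matchesWhole(LOOSE, "http://www.heise.de/newsticker/meldung/59421"));
        // Ends in ".html": does not match.
        System.out.println(matchesWhole(LOOSE, "http://www.heise.de/tp/r4/artikel/20/20701/1.html"));
        // The unescaped dot also accepts another character in place of ".".
        System.out.println(matchesWhole(LOOSE, "http://www.heiseXde/a/1"));
        // The escaped dot rejects it.
        System.out.println(matchesWhole(STRICT, "http://www.heiseXde/a/1"));
    }
}
```
</pre>
If an expression should only accept the literal host name, the escaped variant is the safer choice.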
#[footer]#
</body>
</html>