You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
175 lines
7.9 KiB
175 lines
7.9 KiB
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
|
|
<html xmlns="http://www.w3.org/1999/xhtml">
|
|
<head>
|
|
<title>YaCy '#[clientname]#': Crawl Results</title>
|
|
#%env/templates/metas.template%#
|
|
</head>
|
|
<body id="CrawlResults">
|
|
#%env/templates/header.template%#
|
|
<div class="SubMenu">
|
|
<h3>Crawl Results</h3>
|
|
<ul class="SubMenu">
|
|
<li><a href="/CrawlResults.html" class="MenuItemLink">Overview</a></li>
|
|
<li><a href="/CrawlResults.html?process=1" class="MenuItemLink lock">(1) Receipts</a></li>
|
|
<li><a href="/CrawlResults.html?process=2" class="MenuItemLink lock">(2) Queries</a></li>
|
|
<li><a href="/CrawlResults.html?process=3" class="MenuItemLink lock">(3) DHT Transfer</a></li>
|
|
<li><a href="/CrawlResults.html?process=4" class="MenuItemLink lock">(4) Proxy Use</a></li>
|
|
<li><a href="/CrawlResults.html?process=5" class="MenuItemLink lock">(5) Local Crawling</a></li>
|
|
<li><a href="/CrawlResults.html?process=6" class="MenuItemLink">(6) Global Crawling</a></li>
|
|
</ul>
|
|
</div>
|
|
|
|
#(process)#
|
|
<h2>Crawl Results Overview</h2>
|
|
<p>These are monitoring pages for the different indexing queues.</p>
|
|
<p>YaCy knows 5 different ways to acquire web indexes. The details of these processes (1-5) are described within the submenu's listed
|
|
above which also will show you a table with indexing results so far. The information in these tables is considered as private,
|
|
so you need to log-in with your administration password.</p>
|
|
<p>Case (6) is a monitor of the local receipt-generator, the opposed case of (1). It contains also an indexing result monitor but is not considered private
|
|
since it shows crawl requests from other peers.
|
|
</p>
|
|
<p><img src="/env/grafics/indexmonitor.png" alt="An illustration how yacy works" /></p>
|
|
<p>The image above illustrates the data flow initiated by web index acquisition.
|
|
Some processes occur double to document the complex index migration structure.
|
|
</p>
|
|
::
|
|
<h2>(1) Results of Remote Crawl Receipts</h2>
|
|
<p>This is the list of web pages that this peer initiated to crawl,
|
|
but had been crawled by <em>other</em> peers.
|
|
This is the 'mirror'-case of process (6).
|
|
</p>
|
|
<p><em>Use Case:</em> You get entries here, if you start a local crawl on the 'Index Creation'-Page and check the
|
|
'Do Remote Indexing'-flag. Every page that a remote peer indexes upon this peer's request
|
|
is reported back and can be monitored here.</p>
|
|
::
|
|
<h2>(2) Results for Result of Search Queries</h2>
|
|
<p>This index transfer was initiated by your peer by doing a search query.
|
|
The index was crawled and contributed by other peers.</p>
|
|
<p><em>Use Case:</em> This list fills up if you do a search query on the 'Search Page'</p>
|
|
::
|
|
<h2>(3) Results for Index Transfer</h2>
|
|
<p>The url fetch was initiated and executed by other peers.
|
|
These links here have been transmitted to you because your peer is the most appropriate for storage according to
|
|
the logic of the Global Distributed Hash Table.</p>
|
|
<p><em>Use Case:</em> This list may fill if you check the 'Index Receive'-flag on the 'Index Control' page</p>
|
|
::
|
|
<h2>(4) Results for Proxy Indexing</h2>
|
|
<p>These web pages had been indexed as result of your proxy usage.
|
|
<strong>No personal or protected page is indexed</strong>;
|
|
such pages are detected by Cookie-Use or POST-Parameters (either in URL or as HTTP protocol)
|
|
and automatically excluded from indexing.</p>
|
|
<p><em>Use Case:</em> You must use YaCy as proxy to fill up this table.
|
|
Set the proxy settings of your browser to the same port as given
|
|
on the 'Settings'-page in the 'Proxy and Administration Port' field.</p>
|
|
::
|
|
<h2>(5) Results for Local Crawling</h2>
|
|
<p>These web pages had been crawled by your own crawl task.</p>
|
|
<p><em>Use Case:</em> start a crawl by setting a crawl start point on the 'Index Create' page.</p>
|
|
::
|
|
<h2>(6) Results for Global Crawling</h2>
|
|
<p>These pages had been indexed by your peer, but the crawl was initiated by a remote peer.
|
|
This is the 'mirror'-case of process (1).</p>
|
|
<p><em>Use Case:</em> This list may fill if you check the 'Accept remote crawling requests'-flag on the 'Index Crate' page</p>
|
|
#(/process)#
|
|
|
|
|
|
#(table)#
|
|
<p><em>The stack is empty.</em></p>
|
|
::
|
|
<p><em>Statistics about #[domains]# domains in this stack:</em></p>
|
|
<table cellpadding="2" cellspacing="1" >
|
|
<tr class="TableHeader">
|
|
<td align="center"></td>
|
|
<td><strong>Domain</strong></td>
|
|
<td><strong>URLs</strong></td>
|
|
</tr>
|
|
#{domains}#
|
|
<tr class="TableCell#(dark)#Light::Dark#(/dark)#">
|
|
<td align="center">
|
|
<form action="#[feedbackpage]#" method="post" enctype="multipart/form-data">
|
|
<div>
|
|
<input type="hidden" name="process" value="#[tabletype]#" />
|
|
<input type="hidden" name="hashpart" value="#[hashpart]#" />
|
|
<input type="hidden" name="domain" value="#[domain]#" />
|
|
<input type="submit" name="deletedomain" value="delete all" />
|
|
</div>
|
|
</form>
|
|
</td>
|
|
<td><a href="http://#[domain]#/">#[domain]#</a></td>
|
|
<td>#[count]#</td>
|
|
</tr>
|
|
#{/domains}#
|
|
</table><br />
|
|
|
|
<p><em>
|
|
#(size)#
|
|
Showing all #[all]# entries in this stack.
|
|
::
|
|
Showing latest #[count]# lines from a stack of #[all]# entries.
|
|
#(/size)#
|
|
</em></p>
|
|
<table cellpadding="2" cellspacing="1" >
|
|
<tr class="TableHeader">
|
|
<td align="center">
|
|
<form action="#[feedbackpage]#" method="post" enctype="multipart/form-data">
|
|
<div>
|
|
<input type="hidden" name="process" value="#[tabletype]#" />
|
|
<input type="submit" name="clearlist" value="clear list" />
|
|
</div>
|
|
</form>
|
|
</td>
|
|
#(showInit)#::<td><strong>Initiator</strong></td>#(/showInit)#
|
|
#(showExec)#::<td><strong>Executor</strong></td>#(/showExec)#
|
|
#(showDate)#::<td><strong>Modified</strong></td>#(/showDate)#
|
|
#(showWords)#::<td><strong>Words</strong></td>#(/showWords)#
|
|
#(showTitle)#::<td><strong>Title</strong></td>#(/showTitle)#
|
|
#(showURL)#::<td><strong>URL</strong></td>#(/showURL)#
|
|
</tr>
|
|
#{indexed}#
|
|
<tr class="TableCell#(dark)#Light::Dark#(/dark)#">
|
|
<td align="center">
|
|
<form action="#[feedbackpage]#" method="post" enctype="multipart/form-data">
|
|
<div>
|
|
<input type="hidden" name="process" value="#[tabletype]#" />
|
|
<input type="hidden" name="hash" value="#[urlhash]#" />
|
|
<input type="submit" name="deleteentry" value="delete" />
|
|
</div>
|
|
</form>
|
|
</td>
|
|
#(showInit)#::<td>#[initiatorSeed]#</td>#(/showInit)#
|
|
#(showExec)#::<td>#[executorSeed]#</td>#(/showExec)#
|
|
|
|
#(showDate)#::<td>#[modified]#</td>#(/showDate)#
|
|
#(showWords)#::<td>#[count]#</td>#(/showWords)#
|
|
|
|
#(showTitle)#
|
|
::
|
|
<td>
|
|
#(available)#
|
|
<span class="tt">-not cached-</span>
|
|
::
|
|
<a href="ViewFile.html?action=info&urlHash=#[urlHash]#" class="small" title="#[urltitle]#">#(nodescr)#no title::#[urldescr]##(/nodescr)#</a>
|
|
#(/available)#
|
|
</td>
|
|
#(/showTitle)#
|
|
|
|
#(showURL)#
|
|
::
|
|
<td>
|
|
#(available)#
|
|
<span class="tt">-not cached-</span>
|
|
::
|
|
<a href="ViewFile.html?action=info&urlHash=#[urlHash]#" class="small" title="#[urltitle]#">#[url]#</a>
|
|
#(/available)#
|
|
</td>
|
|
#(/showURL)#
|
|
|
|
</tr>
|
|
#{/indexed}#
|
|
</table>
|
|
::
|
|
#(/table)#
|
|
#%env/templates/footer.template%#
|
|
</body>
|
|
</html>
|