it is now possible to get the results in two steps:
- first retrieve all IDs as given for a query
- then retieve each document individually
This was necessary for very large result sets where a query may run for
hours and is possibly terminated by a solr-internal timeout. This occurs
regulary during postprocessing and therefore this commit may fix
unwanted postprocessing terminations.
hour. The table can be shown with
http://localhost:8090/Tables_p.html?table=stats
The entries have the following meaning:
aM: activeLastMonth
aW: activeLastWeek
aD: activeLastDay
aH: activeLastHour
cC: countConnected (Active Senior)
cD: countDisconnected (Passive Senior)
cP: countPotential (Junior)
cR: count of the RWI entries
cI: size of the index (number of documents)
The entry keys are abbreviated to reduce the space in the table as the
name is written again for every row.
This is the beginning of a 'yacystats' micro-alternative als built-in
function in YaCy. Graphics may follow after some time if enough test
data is available.
found a situation after crash (reboot) with existing running semaphore but YaCy not running.
Ping generated exception which finally deleted the conf file (during pre-read procedure)
- change to ping (catch exception solved it)
- additionally removed delete yacy.conf file (if needed we need to make a backup)
when the buffer is cleaned after a buffer flush which is not then
available in Solr since that is waiting for a commit. In such cases the
counter would run backwards which is prevented by ignoring the buffer
size.
solr+buffer and the hit cache. This shall help during first crawls to
see a running document counter even if there was no commit meanwhile to
solr. To support that strategy, the hit cache must be written earlier.
during document parsing; instead use the same references that would also
be written into the webgraph. That should cause that the webgraph and
the citation index express the exact same semantic.
string which is visible in the browser. That makes it possible that the
browser instructs the user how to change a forgotten admin password
(during runtime).
attributes are attached to artificial constructed index.html files which
list directories. Such files are naturally rejected by the crawler and
should not appear in the error log because these files are part of the
construction of file crawlers and confuse users if they see them in the
error log.
- default mySeed.ip to hostip in SeedDB.initMySeed() if Intranetmode
this allows to become senior status in intranet hosted search network with view peers,
otherwise peer would stay junior because of default init with loopback ip as public (dna) ip.