sgaebel
80785b785e
adds deleting during recrawl
4 years ago
sgaebel
8d2e7262d9
Recrawl:
- set the chunksize to 100 to meet the max of the embedded solr
- re-enable sorting (the case where we switched it off should be gone)
- enable recrawling on remote-solr
6 years ago
luccioman
8b572b7337
Commit Solr index before simulating or starting recrawl job.
This ensures up-to-date simulation query results, and recrawl
processing.
7 years ago
luccioman
5e2812c060
Automatically refresh running recrawl report when JavaScript is enabled.
For users who would prefer to keep JavaScript disabled, a manual Refresh
button is still available.
7 years ago
luccioman
4e03335625
Added more details to the recrawl job report
7 years ago
luccioman
d95d393a0d
Add a query link to local Solr to browse selected recrawl candidates
7 years ago
luccioman
59f7763af6
Display recrawl job report also when job is actively running
7 years ago
luccioman
0c9e0b3566
Record recrawl calls to make them schedulable
7 years ago
luccioman
433e241e4f
Added a report info box about the last terminated recrawl job, if any
For easier monitoring of recrawls.
7 years ago
luccioman
b2af25b14f
Added a stop condition to the Recrawl busy thread
7 years ago
luccioman
421728d25a
Made it possible to customize the selection query before launching a recrawl
7 years ago
luccioman
fab6e54fec
Enforced controls (HTTP method, token) on ReIndex and ReCrawl operations
7 years ago
reger
72f6a0b0b2
enhance recrawl job
- allow modifying the query that selects documents to process (after the job has started)
- allow including failed URLs (httpstatus <> 200)
10 years ago
reger
ace71a8877
Initial (experimental) implementation of index update/re-crawl job
added to IndexReIndexMonitor_p.html
Selects existing documents from the index and feeds them to the crawler.
Currently only the field fresh_date_dt is used to determine documents for recrawl (fresh_date_dt:[* TO NOW-1DAY]).
Documents are added to the crawler in small chunks (200), and only if no other crawl is running.
10 years ago
Michael Peter Christen
5a060c9f26
refactoring of reindexSolr (just replaced constant string)
10 years ago
reger
5f0bb1214f
modified FieldReIndex to reindex queries with low number of documents first
by internally using a score map with the number of documents as score
and working through the list from low to high.
10 years ago
Michael Peter Christen
15b2fad6a2
reverted latest change for reindexing because that actually works only
for internal Solr indexes. This is mainly caused by the fact that an
external Solr may also be a SolrCloud, which does not support LukeRequests,
which are needed to request the old Schema.
11 years ago
Michael Peter Christen
e09218129c
remove check for local solr. This check was made during a time when Solr
was optional and another alternative metadata store was available. Since
that store is now removed, Solr is always available (internally or
externally)
11 years ago
reger
82d81a57bd
info msg if no embedded Solr http://bugs.yacy.net/view.php?id=279
11 years ago
reger
02fe8b43ba
Field Re-Indexing: display list of fields in reindex queue
change servlet to display statistics on first click (instead of after refresh)
11 years ago
Michael Peter Christen
1fd006cc56
fixes using the embedded connector
11 years ago
reger
79401cb938
added reindex option for documents with disabled or obsolete fields to Solr Schema Editor page (IndexSchema_p.html)
this allows removing obsolete fields from the index (according to the current schema config)
by selecting all documents containing disabled fields.
12 years ago