#(reload)#::#(/reload)# YaCy '#[clientname]#': URL Database Administration #%env/templates/metas.template%# #%env/templates/header.template%# #%env/templates/submenuIndexImport.template%#

Index Export

The local index currently contains #[ucount]# documents; only #[ucount200]# of these are exportable with HTTP status code 200 - the remaining entries are error documents.

#(lurlexport)#::
Loaded URL Export
Export Path
URL Filter
 .*.* (default) is a catch-all; format: java regex
query
 *:* (default) is a catch-all; format: field:value (Solr query syntax)
maximum age (seconds)
 -1 = unlimited -> no document is too old
maximum number of records per chunk
 if exceeded, several chunks are stored; -1 = unlimited (produces a single chunk)
Export Size
full size, all fields; or minified: only fields sku, date, title, description, text_t
Export Format
Full Data Records:
JSON (rich and full-text Elasticsearch data, one document per line in one flat JSON file; can be bulk-imported into Elasticsearch). Here is an example for OpenSearch, using Docker:
Start docker container of opensearch:
docker run --name opensearch -p 9200:9200 -d -e OPENSEARCH_JAVA_OPTS="-Xms2G -Xmx2G" -e discovery.type=single-node -e DISABLE_SECURITY_PLUGIN=true -v $(pwd)/opensearch_data:/usr/share/opensearch/data opensearchproject/opensearch:latest
Unblock index creation:
curl -X PUT "http://localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d' { "persistent": { "cluster.blocks.create_index": null } }'
Create the search index:
curl -X PUT "http://localhost:9200/collection1/yacy"
Bulk-upload the index file:
curl -XPOST "http://localhost:9200/collection1/yacy/_bulk?filter_path=took,errors" -H "Content-Type: application/x-ndjson" --data-binary @yacy_dump_XXX.flatjson
Run a search returning 10 results over the fields text_t, title, and description, with boosts:
curl -X POST "http://localhost:9200/collection1/yacy/_search" -H 'Content-Type: application/json' -d' {"size": 10, "query": {"multi_match": { "query": "one two three", "fields": ["text_t", "title^10", "description^3"], "fuzziness": "AUTO" }}}'
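Before bulk-uploading, it can help to sanity-check the dump file locally; a minimal sketch, where the file name and contents below are stand-ins for a real yacy_dump_*.flatjson export:

```shell
# Stand-in dump file; replace with your actual yacy_dump_*.flatjson export.
DUMP=yacy_dump_example.flatjson
printf '%s\n' \
  '{"sku":"http://example.com/","title":"Example","text_t":"one two three"}' \
  '{"sku":"http://example.com/page","title":"Page","text_t":"four five"}' \
  > "$DUMP"

# The _bulk endpoint expects exactly one JSON object per line (NDJSON),
# so the line count equals the document count.
DOCS=$(grep -c '' "$DUMP")
echo "documents in dump: $DOCS"

# A quick structural check: every line must start with '{' and end with '}'.
BAD=$(grep -cv '^{.*}$' "$DUMP")
echo "malformed lines: $BAD"
```

If the malformed-line count is non-zero, the bulk upload will report errors for those lines in its `errors` response field.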
XML (rich and full-text Solr data, one document per line in one large XML file; can be processed with shell tools and imported via DATA/SURROGATES/in/)
XML (RSS)
Full URL List:
Plain Text List (URLs only)
HTML (URLs with title)
Only Domain:
Plain Text List (domains only)
HTML (domains as URLs, no title)
Only Text:
Fulltext of Search Index Text
 
::
Export to file #[exportfile]# is running ... #[urlcount]# documents so far
:: #(/lurlexport)# #(lurlexportfinished)#::
Finished export of #[urlcount]# Documents to file #[exportfile]#
Import this file by moving it to DATA/SURROGATES/in
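The import step above can be sketched as follows; the YaCy home path and the export file name are assumptions for illustration:

```shell
# Assumed path for illustration; point YACY_HOME at your actual YaCy installation.
YACY_HOME=./yacy
mkdir -p "$YACY_HOME/DATA/SURROGATES/in"

# Stand-in for the exported dump file reported on this page.
touch yacy_dump_example.xml

# YaCy's surrogate importer watches DATA/SURROGATES/in and processes
# files placed there; processed files are typically moved to DATA/SURROGATES/out.
mv yacy_dump_example.xml "$YACY_HOME/DATA/SURROGATES/in/"
ls "$YACY_HOME/DATA/SURROGATES/in"
```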
:: #(/lurlexportfinished)# #(lurlexporterror)#::
Export to file #[exportfile]# failed: #[exportfailmsg]#
:: #(/lurlexporterror)# #(dumprestore)#::
Dump and Restore of Solr Index #(dumpRestoreEnabled)#
This feature is available only when a local embedded Solr is active.
::#(/dumpRestoreEnabled)#
 
Dump File
 
:: #(/dumprestore)# #(indexdump)#:: :: :: #(/indexdump)# #(indexRestore)#:: :: :: #(/indexRestore)# #%env/templates/footer.template%#