dropped GSA support (GSA API is still in YaCy Grid)
The 6.6.6 Solr index also works with 7.7.3 without migration.
pull/402/head
parent c0d9a3e9a7
commit 43a9f4f574
Binary file not shown.
@ -1,202 +0,0 @@

                                 Apache License
                           Version 2.0, January 2004
                        http://www.apache.org/licenses/

   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

   1. Definitions.

      "License" shall mean the terms and conditions for use, reproduction,
      and distribution as defined by Sections 1 through 9 of this document.

      "Licensor" shall mean the copyright owner or entity authorized by
      the copyright owner that is granting the License.

      "Legal Entity" shall mean the union of the acting entity and all
      other entities that control, are controlled by, or are under common
      control with that entity. For the purposes of this definition,
      "control" means (i) the power, direct or indirect, to cause the
      direction or management of such entity, whether by contract or
      otherwise, or (ii) ownership of fifty percent (50%) or more of the
      outstanding shares, or (iii) beneficial ownership of such entity.

      "You" (or "Your") shall mean an individual or Legal Entity
      exercising permissions granted by this License.

      "Source" form shall mean the preferred form for making modifications,
      including but not limited to software source code, documentation
      source, and configuration files.

      "Object" form shall mean any form resulting from mechanical
      transformation or translation of a Source form, including but
      not limited to compiled object code, generated documentation,
      and conversions to other media types.

      "Work" shall mean the work of authorship, whether in Source or
      Object form, made available under the License, as indicated by a
      copyright notice that is included in or attached to the work
      (an example is provided in the Appendix below).

      "Derivative Works" shall mean any work, whether in Source or Object
      form, that is based on (or derived from) the Work and for which the
      editorial revisions, annotations, elaborations, or other modifications
      represent, as a whole, an original work of authorship. For the purposes
      of this License, Derivative Works shall not include works that remain
      separable from, or merely link (or bind by name) to the interfaces of,
      the Work and Derivative Works thereof.

      "Contribution" shall mean any work of authorship, including
      the original version of the Work and any modifications or additions
      to that Work or Derivative Works thereof, that is intentionally
      submitted to Licensor for inclusion in the Work by the copyright owner
      or by an individual or Legal Entity authorized to submit on behalf of
      the copyright owner. For the purposes of this definition, "submitted"
      means any form of electronic, verbal, or written communication sent
      to the Licensor or its representatives, including but not limited to
      communication on electronic mailing lists, source code control systems,
      and issue tracking systems that are managed by, or on behalf of, the
      Licensor for the purpose of discussing and improving the Work, but
      excluding communication that is conspicuously marked or otherwise
      designated in writing by the copyright owner as "Not a Contribution."

      "Contributor" shall mean Licensor and any individual or Legal Entity
      on behalf of whom a Contribution has been received by Licensor and
      subsequently incorporated within the Work.

   2. Grant of Copyright License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      copyright license to reproduce, prepare Derivative Works of,
      publicly display, publicly perform, sublicense, and distribute the
      Work and such Derivative Works in Source or Object form.

   3. Grant of Patent License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      (except as stated in this section) patent license to make, have made,
      use, offer to sell, sell, import, and otherwise transfer the Work,
      where such license applies only to those patent claims licensable
      by such Contributor that are necessarily infringed by their
      Contribution(s) alone or by combination of their Contribution(s)
      with the Work to which such Contribution(s) was submitted. If You
      institute patent litigation against any entity (including a
      cross-claim or counterclaim in a lawsuit) alleging that the Work
      or a Contribution incorporated within the Work constitutes direct
      or contributory patent infringement, then any patent licenses
      granted to You under this License for that Work shall terminate
      as of the date such litigation is filed.

   4. Redistribution. You may reproduce and distribute copies of the
      Work or Derivative Works thereof in any medium, with or without
      modifications, and in Source or Object form, provided that You
      meet the following conditions:

      (a) You must give any other recipients of the Work or
          Derivative Works a copy of this License; and

      (b) You must cause any modified files to carry prominent notices
          stating that You changed the files; and

      (c) You must retain, in the Source form of any Derivative Works
          that You distribute, all copyright, patent, trademark, and
          attribution notices from the Source form of the Work,
          excluding those notices that do not pertain to any part of
          the Derivative Works; and

      (d) If the Work includes a "NOTICE" text file as part of its
          distribution, then any Derivative Works that You distribute must
          include a readable copy of the attribution notices contained
          within such NOTICE file, excluding those notices that do not
          pertain to any part of the Derivative Works, in at least one
          of the following places: within a NOTICE text file distributed
          as part of the Derivative Works; within the Source form or
          documentation, if provided along with the Derivative Works; or,
          within a display generated by the Derivative Works, if and
          wherever such third-party notices normally appear. The contents
          of the NOTICE file are for informational purposes only and
          do not modify the License. You may add Your own attribution
          notices within Derivative Works that You distribute, alongside
          or as an addendum to the NOTICE text from the Work, provided
          that such additional attribution notices cannot be construed
          as modifying the License.

      You may add Your own copyright statement to Your modifications and
      may provide additional or different license terms and conditions
      for use, reproduction, or distribution of Your modifications, or
      for any such Derivative Works as a whole, provided Your use,
      reproduction, and distribution of the Work otherwise complies with
      the conditions stated in this License.

   5. Submission of Contributions. Unless You explicitly state otherwise,
      any Contribution intentionally submitted for inclusion in the Work
      by You to the Licensor shall be under the terms and conditions of
      this License, without any additional terms or conditions.
      Notwithstanding the above, nothing herein shall supersede or modify
      the terms of any separate license agreement you may have executed
      with Licensor regarding such Contributions.

   6. Trademarks. This License does not grant permission to use the trade
      names, trademarks, service marks, or product names of the Licensor,
      except as required for reasonable and customary use in describing the
      origin of the Work and reproducing the content of the NOTICE file.

   7. Disclaimer of Warranty. Unless required by applicable law or
      agreed to in writing, Licensor provides the Work (and each
      Contributor provides its Contributions) on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
      implied, including, without limitation, any warranties or conditions
      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
      PARTICULAR PURPOSE. You are solely responsible for determining the
      appropriateness of using or redistributing the Work and assume any
      risks associated with Your exercise of permissions under this License.

   8. Limitation of Liability. In no event and under no legal theory,
      whether in tort (including negligence), contract, or otherwise,
      unless required by applicable law (such as deliberate and grossly
      negligent acts) or agreed to in writing, shall any Contributor be
      liable to You for damages, including any direct, indirect, special,
      incidental, or consequential damages of any character arising as a
      result of this License or out of the use or inability to use the
      Work (including but not limited to damages for loss of goodwill,
      work stoppage, computer failure or malfunction, or any and all
      other commercial damages or losses), even if such Contributor
      has been advised of the possibility of such damages.

   9. Accepting Warranty or Additional Liability. While redistributing
      the Work or Derivative Works thereof, You may choose to offer,
      and charge a fee for, acceptance of support, warranty, indemnity,
      or other liability obligations and/or rights consistent with this
      License. However, in accepting such obligations, You may act only
      on Your own behalf and on Your sole responsibility, not on behalf
      of any other Contributor, and only if You agree to indemnify,
      defend, and hold each Contributor harmless for any liability
      incurred by, or claims asserted against, such Contributor by reason
      of your accepting any such warranty or additional liability.

   END OF TERMS AND CONDITIONS

   APPENDIX: How to apply the Apache License to your work.

      To apply the Apache License to your work, attach the following
      boilerplate notice, with the fields enclosed by brackets "[]"
      replaced with your own identifying information. (Don't include
      the brackets!)  The text should be enclosed in the appropriate
      comment syntax for the file format. We also recommend that a
      file or class name and description of purpose be included on the
      same "printed page" as the copyright notice for easier
      identification within third-party archives.

   Copyright [yyyy] [name of copyright owner]

   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
Binary file not shown.
@ -1,584 +0,0 @@
/**
 *  GSAResponseWriter
 *  Copyright 2012 by Michael Peter Christen
 *  First released 14.08.2012 at http://yacy.net
 *
 *  This library is free software; you can redistribute it and/or
 *  modify it under the terms of the GNU Lesser General Public
 *  License as published by the Free Software Foundation; either
 *  version 2.1 of the License, or (at your option) any later version.
 *
 *  This library is distributed in the hope that it will be useful,
 *  but WITHOUT ANY WARRANTY; without even the implied warranty of
 *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 *  Lesser General Public License for more details.
 *
 *  You should have received a copy of the GNU Lesser General Public License
 *  along with this program in the file lgpl21.txt
 *  If not, see <http://www.gnu.org/licenses/>.
 */

package net.yacy.cora.federate.solr.responsewriter;

import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.time.DateTimeException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Date;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.Set;
import java.util.regex.Pattern;

import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexableField;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrDocumentList;
import org.apache.solr.common.params.CommonParams;
import org.apache.solr.common.util.NamedList;
import org.apache.solr.common.util.XML;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.response.QueryResponseWriter;
import org.apache.solr.response.ResultContext;
import org.apache.solr.response.SolrQueryResponse;
import org.apache.solr.search.DocIterator;
import org.apache.solr.search.DocList;
import org.apache.solr.search.SolrIndexSearcher;

import net.yacy.cora.date.ISO8601Formatter;
import net.yacy.cora.protocol.HeaderFramework;
import net.yacy.cora.util.CommonPattern;
import net.yacy.http.servlets.GSAsearchServlet;
import net.yacy.peers.operation.yacyVersion;
import net.yacy.search.Switchboard;
import net.yacy.search.schema.CollectionSchema;

/**
 * Implementation of a GSA search result.
 * example: GET /gsa/searchresult?q=chicken+teriyaki&output=xml&client=test&site=test&sort=date:D:S:d1
 * For an XML reference, see https://developers.google.com/search-appliance/documentation/614/xml_reference
 */
public class GSAResponseWriter implements QueryResponseWriter, SolrjResponseWriter {

    private static String YaCyVer = null;
    private static final char lb = '\n';
    private enum GSAToken {
        CACHE_LAST_MODIFIED, // Date that the document was crawled, as specified in the Date HTTP header when the document was crawled for this index.
        CRAWLDATE,  // An optional element that shows the date when the page was crawled. It is shown only for pages that have been crawled within the past two days.
        U,          // The URL of the search result.
        UE,         // The URL-encoded version of the URL that is in the U parameter.
        GD,         // Contains the description of a KeyMatch result.
        T,          // The title of the search result.
        RK,         // Provides a ranking number used internally by the search appliance.
        ENT_SOURCE, // Identifies the application ID (serial number) of the search appliance that contributes to a result. Example: <ENT_SOURCE>S5-KUB000F0ADETLA</ENT_SOURCE>
        FS,         // Additional details about the search result.
        R,          // Details of an individual search result.
        S,          // The snippet for the search result. Query terms appear in bold in the results. Line breaks are included for proper text wrapping.
        LANG,       // Indicates the language of the search result. The LANG element contains a two-letter language code.
        HAS;        // Encapsulates special features that are included for this search result.
    }

    private static final char[] XML_START = (
                    "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\n<GSP VER=\"3.2\">\n<!-- This is a Google Search Appliance API result, provided by YaCy. See https://developers.google.com/search-appliance/documentation/614/xml_reference -->\n").toCharArray();
    private static final char[] XML_STOP = "</GSP>\n".toCharArray();

    // pre-select a set of YaCy schema fields for the Solr searcher, which should improve caching
    private static final CollectionSchema[] extrafields = new CollectionSchema[]{
        CollectionSchema.id, CollectionSchema.sku, CollectionSchema.title, CollectionSchema.description_txt,
        CollectionSchema.last_modified, CollectionSchema.load_date_dt, CollectionSchema.size_i,
        CollectionSchema.language_s, CollectionSchema.collection_sxt
    };

    private static final Set<String> SOLR_FIELDS = new HashSet<>();
    static {
        SOLR_FIELDS.add(CollectionSchema.language_s.getSolrFieldName());
        for (CollectionSchema field: extrafields) SOLR_FIELDS.add(field.getSolrFieldName());
    }

    private static class ResHead {
        public long offset, numFound;
        public int rows;
        //public int status, QTime;
        //public String df, q, wt;
        //public float maxScore;
    }

    public static class Sort {
        public String sort = null, action = null, direction = null, mode = null, format = null;
        public Sort(String d) {
            this.sort = d;
            String[] s = CommonPattern.DOUBLEPOINT.split(d);
            if (s.length < 1) return;
            this.action = s[0]; // date
            this.direction = s.length > 1 ? s[1] : "D"; // A or D
            this.mode = s.length > 2 ? s[2] : "S"; // S, R, L
            this.format = s.length > 3 ? s[3] : "d1"; // d1
        }
        public String toSolr() {
            if (this.action != null && "date".equals(this.action)) {
                return CollectionSchema.last_modified.getSolrFieldName() + " " + (("D".equals(this.direction) ? "desc" : "asc"));
            }
            return null;
        }
    }

    @Override
    public String getContentType(final SolrQueryRequest request, final SolrQueryResponse response) {
        return CONTENT_TYPE_XML_UTF8;
    }

    @Override
    public void init(@SuppressWarnings("rawtypes") NamedList n) {
    }

    @Override
    public void write(final Writer writer, final SolrQueryRequest request, final SolrQueryResponse rsp) throws IOException {

        final long start = System.currentTimeMillis();

        final Object responseObj = rsp.getResponse();

        if (responseObj instanceof ResultContext) {
            /* Regular response object */

            final DocList documents = ((ResultContext) responseObj).getDocList();

            final Object highlightingObj = rsp.getValues().get("highlighting");
            final Map<String, Collection<String>> snippets = highlightingObj instanceof NamedList
                    ? OpensearchResponseWriter.snippetsFromHighlighting((NamedList<?>) highlightingObj)
                    : new HashMap<>();

            // parse response header
            final ResHead resHead = new ResHead();
            resHead.rows = request.getParams().getInt(CommonParams.ROWS, 0);
            resHead.offset = documents.offset(); // equal to 'start'
            resHead.numFound = documents.matches();
            //resHead.df = (String) val0.get("df");
            //resHead.q = (String) val0.get("q");
            //resHead.wt = (String) val0.get("wt");
            //resHead.status = (Integer) responseHeader.get("status");
            //resHead.QTime = (Integer) responseHeader.get("QTime");
            //resHead.maxScore = response.maxScore();

            // write header
            writeHeader(writer, request, resHead, start);

            // body introduction
            writeBodyIntro(writer, request, resHead, documents.size());

            writeDocs(writer, request, documents, snippets, resHead);

            writer.write("</RES>"); writer.write(lb);
            writer.write(XML_STOP);
        } else if (responseObj instanceof SolrDocumentList) {
            /*
             * The response object can be a SolrDocumentList when the response is partial,
             * for example when the allowed processing time has been exceeded
             */
            final SolrDocumentList documents = (SolrDocumentList) responseObj;

            final Object highlightingObj = rsp.getValues().get("highlighting");
            final Map<String, Collection<String>> snippets = highlightingObj instanceof NamedList
                    ? OpensearchResponseWriter.snippetsFromHighlighting((NamedList<?>) highlightingObj)
                    : new HashMap<>();

            writeSolrDocumentList(writer, request, snippets, start, documents);
        } else {
            throw new IOException("Unable to process Solr response format");
        }
    }

    @Override
    public void write(Writer writer, SolrQueryRequest request, String coreName, QueryResponse rsp) throws IOException {
        final long start = System.currentTimeMillis();

        writeSolrDocumentList(writer, request, snippetsFromHighlighting(rsp.getHighlighting()), start,
                rsp.getResults());
    }

    /**
     * Produce snippets from Solr (they call that 'highlighting')
     *
     * @param solrHighlighting highlighting from Solr
     * @return a map from urlhashes to a list of snippets for that url
     */
    private Map<String, Collection<String>> snippetsFromHighlighting(
            final Map<String, Map<String, List<String>>> solrHighlighting) {
        final Map<String, Collection<String>> snippets = new HashMap<>();
        if (solrHighlighting == null) {
            return snippets;
        }
        for (final Entry<String, Map<String, List<String>>> highlightingEntry : solrHighlighting.entrySet()) {
            final String urlHash = highlightingEntry.getKey();
            final Map<String, List<String>> highlights = highlightingEntry.getValue();
            final LinkedHashSet<String> urlSnippets = new LinkedHashSet<>();
            for (final List<String> texts : highlights.values()) {
                urlSnippets.addAll(texts);
            }
            snippets.put(urlHash, urlSnippets);
        }
        return snippets;
    }

    /**
     * Append to the writer a representation of a list of Solr documents. All
     * parameters are required and must not be null.
     *
     * @param writer    an open output writer
     * @param request   the Solr request
     * @param snippets  the snippets computed from the Solr highlighting
     * @param start     the results start index
     * @param documents the Solr documents to process
     * @throws IOException when a write error occurred
     */
    private void writeSolrDocumentList(final Writer writer, final SolrQueryRequest request,
            final Map<String, Collection<String>> snippets, final long start, final SolrDocumentList documents)
            throws IOException {

        // parse response header
        final ResHead resHead = new ResHead();
        resHead.rows = request.getParams().getInt(CommonParams.ROWS, 0);
        resHead.offset = documents.getStart();
        resHead.numFound = documents.getNumFound();

        // write header
        writeHeader(writer, request, resHead, start);

        // body introduction
        writeBodyIntro(writer, request, resHead, documents.size());

        writeDocs(writer, documents, snippets, resHead, request.getParams().get("originalQuery"));

        writer.write("</RES>"); writer.write(lb);
        writer.write(XML_STOP);
    }

    /**
     * Append the response header to the writer. All parameters are required and
     * must not be null.
     *
     * @param writer    an open output writer
     * @param request   the Solr request
     * @param resHead   results header information
     * @param startTime this writer processing start time in milliseconds since
     *                  Epoch
     * @throws IOException when a write error occurred
     */
    private void writeHeader(final Writer writer, final SolrQueryRequest request, final ResHead resHead,
            final long startTime) throws IOException {
        final Map<Object,Object> context = request.getContext();

        writer.write(XML_START);
        final String query = request.getParams().get("originalQuery");
        final String site = getContextString(context, "site", "");
        final String sort = getContextString(context, "sort", "");
        final String client = getContextString(context, "client", "");
        final String ip = getContextString(context, "ip", "");
        final String access = getContextString(context, "access", "");
        final String entqr = getContextString(context, "entqr", "");
        OpensearchResponseWriter.solitaireTag(writer, "TM", Long.toString(System.currentTimeMillis() - startTime));
        OpensearchResponseWriter.solitaireTag(writer, "Q", query);
        paramTag(writer, "sort", sort);
        paramTag(writer, "output", "xml_no_dtd");
        paramTag(writer, "ie", StandardCharsets.UTF_8.name());
        paramTag(writer, "oe", StandardCharsets.UTF_8.name());
        paramTag(writer, "client", client);
        paramTag(writer, "q", query);
        paramTag(writer, "site", site);
        paramTag(writer, "start", Long.toString(resHead.offset));
        paramTag(writer, "num", Integer.toString(resHead.rows));
        paramTag(writer, "ip", ip);
        paramTag(writer, "access", access); // p - search only public content, s - search only secure content, a - search all content, both public and secure
        paramTag(writer, "entqr", entqr); // query expansion policy; (entqr=1) -- Uses only the search appliance's synonym file, (entqr=3) -- Uses both standard and local synonym files.
    }

    /**
     * Append the response body introduction to the writer. All parameters are
     * required and must not be null.
     *
     * @param writer        an open output writer
     * @param request       the Solr request
     * @param resHead       results header information
     * @param responseCount the number of result documents
     * @throws IOException when a write error occurred
     */
    private void writeBodyIntro(final Writer writer, final SolrQueryRequest request, final ResHead resHead,
            final int responseCount) throws IOException {
        final Map<Object,Object> context = request.getContext();
        final String site = getContextString(context, "site", "");
        final String sort = getContextString(context, "sort", "");
        final String client = getContextString(context, "client", "");
        final String access = getContextString(context, "access", "");
        writer.write("<RES SN=\"" + (resHead.offset + 1) + "\" EN=\"" + (resHead.offset + responseCount) + "\">"); writer.write(lb); // The index (1-based) of the first and last search result returned in this result set.
        writer.write("<M>" + resHead.numFound + "</M>"); writer.write(lb); // The estimated total number of results for the search.
        writer.write("<FI/>"); writer.write(lb); // Indicates that document filtering was performed during this search.
        long nextStart = resHead.offset + responseCount;
        long nextNum = Math.min(resHead.numFound - nextStart, responseCount < resHead.rows ? 0 : resHead.rows);
        long prevStart = resHead.offset - resHead.rows;
        if (prevStart >= 0 || nextNum > 0) {
            writer.write("<NB>");
            if (prevStart >= 0) {
                writer.write("<PU>");
                XML.escapeCharData("/gsa/search?q=" + request.getParams().get(CommonParams.Q) + "&site=" + site +
                         "&lr=&ie=UTF-8&oe=UTF-8&output=xml_no_dtd&client=" + client + "&access=" + access +
                         "&sort=" + sort + "&start=" + prevStart + "&sa=N", writer); // a relative URL pointing to the PREVIOUS results page.
                writer.write("</PU>");
            }
            if (nextNum > 0) {
                writer.write("<NU>");
                XML.escapeCharData("/gsa/search?q=" + request.getParams().get(CommonParams.Q) + "&site=" + site +
                         "&lr=&ie=UTF-8&oe=UTF-8&output=xml_no_dtd&client=" + client + "&access=" + access +
                         "&sort=" + sort + "&start=" + nextStart + "&num=" + nextNum + "&sa=N", writer); // a relative URL pointing to the NEXT results page.
                writer.write("</NU>");
            }
            writer.write("</NB>");
        }
        writer.write(lb);
    }

    /**
     * Append to the writer a representation of a list of Solr documents. All
     * parameters are required and must not be null.
     *
     * @param writer    an open output writer
     * @param request   the Solr request
     * @param documents the Solr documents to process
     * @param snippets  the snippets computed from the Solr highlighting
     * @param resHead   results header information
     * @throws IOException when a write error occurred
     */
    private void writeDocs(final Writer writer, final SolrQueryRequest request, final DocList documents,
            final Map<String, Collection<String>> snippets, final ResHead resHead)
            throws IOException {
        // parse body
        final String query = request.getParams().get("originalQuery");
        SolrIndexSearcher searcher = request.getSearcher();
        DocIterator iterator = documents.iterator();
        String urlhash = null;
        final int responseCount = documents.size();
        for (int i = 0; i < responseCount; i++) {
            int id = iterator.nextDoc();
            Document doc = searcher.doc(id, SOLR_FIELDS);
            List<IndexableField> fields = doc.getFields();

            // pre-scan the fields to get the mime-type
            String mime = "";
            for (IndexableField value: fields) {
                String fieldName = value.name();
                if (CollectionSchema.content_type.getSolrFieldName().equals(fieldName)) {
                    mime = value.stringValue();
                    break;
                }
            }

            // write the R header for a search result
            writer.write("<R N=\"" + (resHead.offset + i + 1)  + "\"" + (i == 1 ? " L=\"2\"" : "")  + (mime != null && mime.length() > 0 ? " MIME=\"" + mime + "\"" : "") + ">"); writer.write(lb);
            List<String> descriptions = new ArrayList<>();
            List<String> collections = new ArrayList<>();
            int size = 0;
            boolean title_written = false; // the solr index may contain several; we take only the first which should be the visible tag in <title></title>
            String title = null;
            for (IndexableField value: fields) {
                String fieldName = value.name();

                if (CollectionSchema.language_s.getSolrFieldName().equals(fieldName)) {
                    OpensearchResponseWriter.solitaireTag(writer, GSAToken.LANG.name(), value.stringValue());
                } else if (CollectionSchema.id.getSolrFieldName().equals(fieldName)) {
                    urlhash = value.stringValue();
                } else if (CollectionSchema.sku.getSolrFieldName().equals(fieldName)) {
                    OpensearchResponseWriter.solitaireTag(writer, GSAToken.U.name(), value.stringValue());
                    OpensearchResponseWriter.solitaireTag(writer, GSAToken.UE.name(), value.stringValue());
                } else if (CollectionSchema.title.getSolrFieldName().equals(fieldName) && !title_written) {
                    title = value.stringValue();
                    OpensearchResponseWriter.solitaireTag(writer, GSAToken.T.name(), highlight(title, query));
                    title_written = true;
                } else if (CollectionSchema.description_txt.getSolrFieldName().equals(fieldName)) {
                    descriptions.add(value.stringValue());
                } else if (CollectionSchema.last_modified.getSolrFieldName().equals(fieldName)) {
                    Date d = new Date(Long.parseLong(value.stringValue()));
                    writer.write("<FS NAME=\"date\" VALUE=\"" + formatGSAFS(d) + "\"/>\n");
                } else if (CollectionSchema.load_date_dt.getSolrFieldName().equals(fieldName)) {
                    Date d = new Date(Long.parseLong(value.stringValue()));
                    OpensearchResponseWriter.solitaireTag(writer, GSAToken.CRAWLDATE.name(), HeaderFramework.formatRFC1123(d));
                } else if (CollectionSchema.size_i.getSolrFieldName().equals(fieldName)) {
                    size = value.stringValue() != null && value.stringValue().length() > 0 ? Integer.parseInt(value.stringValue()) : -1;
                } else if (CollectionSchema.collection_sxt.getSolrFieldName().equals(fieldName)) {
                    collections.add(value.stringValue());
                }
            }
            // compute snippet from texts
            Collection<String> snippet = urlhash == null ? null : snippets.get(urlhash);
            OpensearchResponseWriter.removeSubsumedTitle(snippet, title);
            OpensearchResponseWriter.solitaireTag(writer, GSAToken.S.name(), snippet == null || snippet.size() == 0 ? (descriptions.size() > 0 ? descriptions.get(0) : "") : OpensearchResponseWriter.getLargestSnippet(snippet));
            OpensearchResponseWriter.solitaireTag(writer, GSAToken.GD.name(), descriptions.size() > 0 ? descriptions.get(0) : "");
|
||||
String cols = collections.toString();
|
||||
if (collections.size() > 0) OpensearchResponseWriter.solitaireTag(writer, "COLS" /*SPECIAL!*/, collections.size() > 1 ? cols.substring(1, cols.length() - 1).replaceAll(" ", "") : collections.get(0));
|
||||
writer.write("<HAS><L/><C SZ=\""); writer.write(Integer.toString(size / 1024)); writer.write("k\" CID=\""); writer.write(urlhash == null ? "" : urlhash); writer.write("\" ENC=\"UTF-8\"/></HAS>\n"); // guard: Writer.write(String) throws a NullPointerException on null
|
||||
if (YaCyVer == null) YaCyVer = yacyVersion.thisVersion().getName() + "/" + Switchboard.getSwitchboard().peers.mySeed().hash;
|
||||
OpensearchResponseWriter.solitaireTag(writer, GSAToken.ENT_SOURCE.name(), YaCyVer);
|
||||
OpensearchResponseWriter.closeTag(writer, "R");
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Append to the writer a representation of a list of Solr documents. All
|
||||
* parameters are required and must not be null.
|
||||
*
|
||||
* @param writer an open output writer
|
||||
* @param documents the Solr documents to process
|
||||
* @param snippets the snippets computed from the Solr highlighting
|
||||
* @param resHead results header information
|
||||
* @param query the original search query
|
||||
* @throws IOException when a write error occurs
|
||||
*/
|
||||
private void writeDocs(final Writer writer, final SolrDocumentList documents,
|
||||
final Map<String, Collection<String>> snippets, final ResHead resHead, final String query)
|
||||
throws IOException {
|
||||
// parse body
|
||||
String urlhash = null;
|
||||
int i = 0;
|
||||
for (final SolrDocument doc : documents) {
|
||||
|
||||
// pre-scan the fields to get the mime-type
|
||||
final Object contentTypeObj = doc.getFirstValue(CollectionSchema.content_type.getSolrFieldName());
|
||||
final String mime = contentTypeObj != null ? contentTypeObj.toString() : "";
|
||||
|
||||
// write the R header for a search result
|
||||
writer.write("<R N=\"" + (resHead.offset + i + 1) + "\"" + (i == 1 ? " L=\"2\"" : "") + (mime != null && mime.length() > 0 ? " MIME=\"" + mime + "\"" : "") + ">"); writer.write(lb);
|
||||
final List<String> descriptions = new ArrayList<>();
|
||||
final List<String> collections = new ArrayList<>();
|
||||
int size = 0;
|
||||
String title = null;
|
||||
for (final Entry<String, Object> field : doc.entrySet()) {
|
||||
final String fieldName = field.getKey();
|
||||
final Object value = field.getValue();
|
||||
|
||||
if (CollectionSchema.language_s.getSolrFieldName().equals(fieldName)) {
|
||||
OpensearchResponseWriter.solitaireTag(writer, GSAToken.LANG.name(), value.toString());
|
||||
} else if (CollectionSchema.id.getSolrFieldName().equals(fieldName)) {
|
||||
urlhash = value.toString();
|
||||
} else if (CollectionSchema.sku.getSolrFieldName().equals(fieldName)) {
|
||||
OpensearchResponseWriter.solitaireTag(writer, GSAToken.U.name(), value.toString());
|
||||
OpensearchResponseWriter.solitaireTag(writer, GSAToken.UE.name(), value.toString());
|
||||
} else if (CollectionSchema.title.getSolrFieldName().equals(fieldName)) {
|
||||
if(value instanceof Iterable) {
|
||||
for(final Object titleObj : (Iterable<?>)value) {
|
||||
if(titleObj != null) {
|
||||
/* get only the first title */
|
||||
title = titleObj.toString();
|
||||
break;
|
||||
}
|
||||
}
|
||||
} else if(value != null) {
|
||||
title = value.toString();
|
||||
}
|
||||
if(title != null) {
|
||||
OpensearchResponseWriter.solitaireTag(writer, GSAToken.T.name(), highlight(title, query));
|
||||
}
|
||||
} else if (CollectionSchema.description_txt.getSolrFieldName().equals(fieldName)) {
|
||||
if(value instanceof Iterable) {
|
||||
for(final Object descriptionObj : (Iterable<?>)value) {
|
||||
if(descriptionObj != null) {
|
||||
descriptions.add(descriptionObj.toString());
|
||||
}
|
||||
}
|
||||
} else if(value != null) {
|
||||
descriptions.add(value.toString());
|
||||
}
|
||||
} else if (CollectionSchema.last_modified.getSolrFieldName().equals(fieldName) && value instanceof Date) {
|
||||
writer.write("<FS NAME=\"date\" VALUE=\"" + formatGSAFS((Date)value) + "\"/>\n");
|
||||
} else if (CollectionSchema.load_date_dt.getSolrFieldName().equals(fieldName) && value instanceof Date) {
|
||||
OpensearchResponseWriter.solitaireTag(writer, GSAToken.CRAWLDATE.name(), HeaderFramework.formatRFC1123((Date)value));
|
||||
} else if (CollectionSchema.size_i.getSolrFieldName().equals(fieldName)) {
|
||||
size = value instanceof Integer ? (Integer)value : -1;
|
||||
} else if (CollectionSchema.collection_sxt.getSolrFieldName().equals(fieldName)) { // handle collection
|
||||
if(value instanceof Iterable) {
|
||||
for(final Object collectionObj : (Iterable<?>)value) {
|
||||
if(collectionObj != null) {
|
||||
collections.add(collectionObj.toString());
|
||||
}
|
||||
}
|
||||
} else if(value != null) {
|
||||
collections.add(value.toString());
|
||||
}
|
||||
}
|
||||
}
|
||||
// compute snippet from texts
|
||||
Collection<String> snippet = urlhash == null ? null : snippets.get(urlhash);
|
||||
OpensearchResponseWriter.removeSubsumedTitle(snippet, title);
|
||||
OpensearchResponseWriter.solitaireTag(writer, GSAToken.S.name(), snippet == null || snippet.size() == 0 ? (descriptions.size() > 0 ? descriptions.get(0) : "") : OpensearchResponseWriter.getLargestSnippet(snippet));
|
||||
OpensearchResponseWriter.solitaireTag(writer, GSAToken.GD.name(), descriptions.size() > 0 ? descriptions.get(0) : "");
|
||||
String cols = collections.toString();
|
||||
if (!collections.isEmpty()) {
|
||||
OpensearchResponseWriter.solitaireTag(writer, "COLS" /*SPECIAL!*/, collections.size() > 1 ? cols.substring(1, cols.length() - 1).replaceAll(" ", "") : collections.get(0));
|
||||
}
|
||||
writer.write("<HAS><L/><C SZ=\""); writer.write(Integer.toString(size / 1024)); writer.write("k\" CID=\""); writer.write(urlhash == null ? "" : urlhash); writer.write("\" ENC=\"UTF-8\"/></HAS>\n"); // guard: Writer.write(String) throws a NullPointerException on null
|
||||
if (YaCyVer == null) YaCyVer = yacyVersion.thisVersion().getName() + "/" + Switchboard.getSwitchboard().peers.mySeed().hash;
|
||||
OpensearchResponseWriter.solitaireTag(writer, GSAToken.ENT_SOURCE.name(), YaCyVer);
|
||||
OpensearchResponseWriter.closeTag(writer, "R");
|
||||
|
||||
i++;
|
||||
}
|
||||
}
|
||||
|
||||
private static String getContextString(Map<Object,Object> context, String key, String dflt) {
|
||||
Object v = context.get(key);
|
||||
if (v == null) return dflt;
|
||||
if (v instanceof String) return (String) v;
|
||||
if (v instanceof String[]) {
|
||||
String[] va = (String[]) v;
|
||||
return va.length == 0 ? dflt : va[0];
|
||||
}
|
||||
return dflt;
|
||||
}
|
||||
|
||||
public static void paramTag(final Writer writer, final String tagname, String value) throws IOException {
|
||||
if (value == null || value.length() == 0) return;
|
||||
writer.write("<PARAM name=\"");
|
||||
writer.write(tagname);
|
||||
writer.write("\" value=\"");
|
||||
XML.escapeAttributeValue(value, writer);
|
||||
writer.write("\" original_value=\"");
|
||||
XML.escapeAttributeValue(value, writer);
|
||||
writer.write("\"/>"); writer.write(lb);
|
||||
}
|
||||
|
||||
public static String highlight(String text, String query) {
|
||||
if (query != null) {
|
||||
String[] q = CommonPattern.SPACE.split(CommonPattern.PLUS.matcher(query.trim().toLowerCase()).replaceAll(" "));
|
||||
for (String s: q) {
|
||||
int p = text.toLowerCase().indexOf(s.toLowerCase());
|
||||
if (s.isEmpty() || p < 0) continue; // skip empty terms, which would otherwise insert an empty <b></b> pair
|
||||
text = text.substring(0, p) + "<b>" + text.substring(p, p + s.length()) + "</b>" + text.substring(p + s.length());
|
||||
}
|
||||
return text.replaceAll(Pattern.quote("</b> <b>"), " ");
|
||||
}
|
||||
return text;
|
||||
}
|
||||
|
||||
/**
|
||||
* Format date for GSA (short form of ISO8601 date format)
|
||||
* @param date the date to format, may be null
|
||||
* @return the date string in the form "yyyy-MM-dd", or an empty string when the date is null or cannot be formatted
|
||||
* @see ISO8601Formatter
|
||||
*/
|
||||
public final String formatGSAFS(final Date date) {
|
||||
if (date == null) {
|
||||
return "";
|
||||
}
|
||||
try {
|
||||
return GSAsearchServlet.FORMAT_GSAFS.format(date.toInstant());
|
||||
} catch (final DateTimeException e) {
|
||||
return "";
|
||||
}
|
||||
}
|
||||
|
||||
}
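
The `highlight` helper in the removed writer bolds the first occurrence of each query term in the title and then merges adjacent bold runs. The following is a minimal standalone sketch of that idea, not the YaCy implementation: the class name `HighlightSketch` is hypothetical, and it uses plain `String` methods in place of YaCy's `CommonPattern` helpers. It also skips empty terms and null input, which is an assumption about the desired behaviour.

```java
import java.util.Locale;

/** Illustrative stand-in (hypothetical name): bolds the first occurrence of each query term. */
final class HighlightSketch {

    static String highlight(final String text, final String query) {
        if (text == null || query == null) return text;
        String result = text;
        // '+' separates terms in URL-encoded queries, so it is treated like a space
        final String[] terms = query.trim().toLowerCase(Locale.ROOT).replace('+', ' ').split("\\s+");
        for (final String term : terms) {
            if (term.isEmpty()) continue; // an empty query would otherwise produce "<b></b>"
            final int p = result.toLowerCase(Locale.ROOT).indexOf(term);
            if (p < 0) continue; // term does not occur in the text
            result = result.substring(0, p)
                    + "<b>" + result.substring(p, p + term.length()) + "</b>"
                    + result.substring(p + term.length());
        }
        // merge directly adjacent highlighted words into a single bold run
        return result.replace("</b> <b>", " ");
    }

    public static void main(final String[] args) {
        System.out.println(highlight("YaCy Search Engine", "search engine"));
    }
}
```

Like the original, this sketch matches against text that already contains inserted markup, so a later term such as `b` could match inside an earlier `<b>` tag; a more careful version would collect match positions first and insert the tags afterwards.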
|
@ -1,286 +0,0 @@
|
||||
/**
|
||||
* search
|
||||
* Copyright 2012 by Michael Peter Christen, mc@yacy.net, Frankfurt am Main, Germany
|
||||
* First released 30.10.2013 at http://yacy.net
|
||||
*
|
||||
* This library is free software; you can redistribute it and/or
|
||||
* modify it under the terms of the GNU Lesser General Public
|
||||
* License as published by the Free Software Foundation; either
|
||||
* version 2.1 of the License, or (at your option) any later version.
|
||||
*
|
||||
* This library is distributed in the hope that it will be useful,
|
||||
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
|
||||
* Lesser General Public License for more details.
|
||||
*
|
||||
* You should have received a copy of the GNU Lesser General Public License
|
||||
* along with this program in the file lgpl21.txt
|
||||
* If not, see <http://www.gnu.org/licenses/>.
|
||||
*/
|
||||
package net.yacy.http.servlets;
|
||||
|
||||
import java.io.IOException;
|
||||
import java.io.OutputStream;
|
||||
import java.io.OutputStreamWriter;
|
||||
import java.io.Writer;
|
||||
import java.nio.charset.StandardCharsets;
|
||||
import java.time.LocalDate;
|
||||
import java.time.ZoneId;
|
||||
import java.time.format.DateTimeFormatter;
|
||||
import java.util.Date;
|
||||
import java.util.Iterator;
|
||||
import java.util.List;
|
||||
import java.util.Locale;
|
||||
import java.util.Map;
|
||||
|
||||
import javax.servlet.ServletException;
|
||||
import javax.servlet.http.HttpServlet;
|
||||
import javax.servlet.http.HttpServletRequest;
|
||||
import javax.servlet.http.HttpServletResponse;
|
||||
|
||||
import org.apache.solr.common.SolrDocumentList;
|
||||
import org.apache.solr.common.SolrException;
|
||||
import org.apache.solr.common.params.CommonParams;
|
||||
import org.apache.solr.common.params.DisMaxParams;
|
||||
import org.apache.solr.request.SolrQueryRequest;
|
||||
import org.apache.solr.request.SolrRequestInfo;
|
||||
import org.apache.solr.response.QueryResponseWriter;
|
||||
import org.apache.solr.response.ResultContext;
|
||||
import org.apache.solr.response.SolrQueryResponse;
|
||||
import org.apache.solr.util.FastWriter;
|
||||
|
||||
import net.yacy.cora.date.ISO8601Formatter;
|
||||
import net.yacy.cora.federate.solr.Ranking;
|
||||
import net.yacy.cora.federate.solr.connector.EmbeddedSolrConnector;
|
||||
import net.yacy.cora.federate.solr.responsewriter.GSAResponseWriter;
|
||||
import net.yacy.cora.protocol.HeaderFramework;
|
||||
import net.yacy.cora.protocol.RequestHeader;
|
||||
import net.yacy.cora.util.ConcurrentLog;
|
||||
import net.yacy.data.UserDB;
|
||||
import net.yacy.search.Switchboard;
|
||||
import net.yacy.search.query.AccessTracker;
|
||||
import net.yacy.search.query.QueryGoal;
|
||||
import net.yacy.search.query.QueryModifier;
|
||||
import net.yacy.search.query.SearchEvent;
|
||||
import net.yacy.search.schema.CollectionSchema;
|
||||
import net.yacy.server.serverObjects;
|
||||
|
||||
|
||||
/**
|
||||
* This is a GSA result formatter for Solr search results.
|
||||
* The result format is implemented according to
|
||||
* https://developers.google.com/search-appliance/documentation/614/xml_reference
|
||||
*/
|
||||
public class GSAsearchServlet extends HttpServlet {
|
||||
|
||||
private static final long serialVersionUID = 7835985518515673885L;
|
||||
|
||||
/** GSA date formatter (short form of ISO8601 date format) */
|
||||
private static final String PATTERN_GSAFS = "uuuu-MM-dd";
|
||||
|
||||
public static final DateTimeFormatter FORMAT_GSAFS = DateTimeFormatter.ofPattern(PATTERN_GSAFS)
|
||||
.withLocale(Locale.US).withZone(ZoneId.systemDefault());
|
||||
|
||||
private final static GSAResponseWriter responseWriter = new GSAResponseWriter();
|
||||
|
||||
@Override
|
||||
protected void doPost(HttpServletRequest request, HttpServletResponse response)
|
||||
throws ServletException, IOException {
|
||||
doGet(request, response);
|
||||
}
|
||||
|
||||
@Override
|
||||
protected void doGet(HttpServletRequest request, HttpServletResponse response)
|
||||
throws ServletException, IOException {
|
||||
response.setContentType(QueryResponseWriter.CONTENT_TYPE_XML_UTF8);
|
||||
response.setStatus(HttpServletResponse.SC_OK);
|
||||
respond(request, Switchboard.getSwitchboard(), response.getOutputStream());
|
||||
}
|
||||
|
||||
// ------------------------------------------
|
||||
/**
|
||||
* from here copy of old htroot/gsa/gsasearchresult.java
|
||||
* with modification to use HttpServletRequest instead of (yacy) RequestHeader
|
||||
*/
|
||||
|
||||
private void respond(final HttpServletRequest header, final Switchboard sb, final OutputStream out) {
|
||||
|
||||
// remember the peer contact for peer statistics
|
||||
String clientip = RequestHeader.client(header);
|
||||
if (clientip == null) clientip = "<unknown>"; // read an artificial header addendum
|
||||
String userAgent = header.getHeader(HeaderFramework.USER_AGENT);
|
||||
if (userAgent == null) userAgent = "<unknown>";
|
||||
sb.peers.peerActions.setUserAgent(clientip, userAgent);
|
||||
|
||||
// --- handled by Servlet securityHandler
|
||||
// check if user is allowed to search (can be switched in /ConfigPortal_p.html)
|
||||
boolean authenticated = header.isUserInRole(UserDB.AccessRight.ADMIN_RIGHT.toString()); //sb.adminAuthenticated(header) >= 2;
|
||||
// final boolean searchAllowed = authenticated || sb.getConfigBool(SwitchboardConstants.PUBLIC_SEARCHPAGE, true);
|
||||
// if (!searchAllowed) return null;
|
||||
|
||||
// create post
|
||||
serverObjects post = new serverObjects();
|
||||
post.put(CommonParams.Q, ""); post.put("num", "0");
|
||||
// convert servletrequest parameter to old style serverObjects map
|
||||
Map<String, String[]> map = header.getParameterMap();
|
||||
Iterator<Map.Entry<String, String[]>> it = map.entrySet().iterator();
|
||||
while (it.hasNext()) {
|
||||
Map.Entry<String, String[]> param = it.next();
|
||||
post.put(param.getKey(), param.getValue()); // hint: post.put uses String[] for String value anyways
|
||||
}
|
||||
|
||||
ConcurrentLog.info("GSA Query", post.toString());
|
||||
sb.intermissionAllThreads(3000); // tell all threads to do nothing for a specific time
|
||||
|
||||
// rename post fields according to result style
|
||||
//post.put(CommonParams.Q, post.remove("q")); // same as solr
|
||||
//post.put(CommonParams.START, post.remove("start")); // same as solr
|
||||
//post.put(, post.remove("client"));//required, example: myfrontend
|
||||
//post.put(, post.remove("output"));//required, example: xml,xml_no_dtd
|
||||
String originalQuery = post.get(CommonParams.Q, "");
|
||||
post.put("originalQuery", originalQuery);
|
||||
|
||||
// get a solr query string
|
||||
QueryGoal qg = new QueryGoal(originalQuery);
|
||||
List<String> solrFQ = qg.collectionTextFilterQuery(false);
|
||||
StringBuilder solrQ = qg.collectionTextQuery();
|
||||
post.put("defType", "edismax");
|
||||
for (String fq: solrFQ) post.add(CommonParams.FQ, fq);
|
||||
post.put(CommonParams.Q, solrQ.toString());
|
||||
post.put(CommonParams.ROWS, post.remove("num"));
|
||||
post.put(CommonParams.ROWS, Math.min(post.getInt(CommonParams.ROWS, 10), (authenticated) ? 100000000 : 100));
|
||||
|
||||
// set ranking
|
||||
final Ranking ranking = sb.index.fulltext().getDefaultConfiguration().getRanking(0);
|
||||
final String qf = ranking.getQueryFields();
|
||||
if (!qf.isEmpty()) post.put(DisMaxParams.QF, qf);
|
||||
if (post.containsKey(CommonParams.SORT)) {
|
||||
// if a gsa-style sort attribute is given, use this to set the solr sort attribute
|
||||
GSAResponseWriter.Sort sort = new GSAResponseWriter.Sort(post.get(CommonParams.SORT, ""));
|
||||
String sorts = sort.toSolr();
|
||||
if (sorts == null) {
|
||||
post.remove(CommonParams.SORT);
|
||||
} else {
|
||||
post.put(CommonParams.SORT, sorts);
|
||||
}
|
||||
} else {
|
||||
// if no such sort attribute is given, use the ranking as configured for YaCy
|
||||
String fq = ranking.getFilterQuery();
|
||||
String bq = ranking.getBoostQuery();
|
||||
String bf = ranking.getBoostFunction();
|
||||
if (fq.length() > 0) post.put(CommonParams.FQ, fq);
|
||||
if (bq.length() > 0) post.put(DisMaxParams.BQ, bq);
|
||||
if (bf.length() > 0) post.put("boost", bf); // a boost function extension, see http://wiki.apache.org/solr/ExtendedDisMax#bf_.28Boost_Function.2C_additive.29
|
||||
}
|
||||
String daterange[] = post.remove("daterange");
|
||||
if (daterange != null) {
|
||||
String origfq = post.get(CommonParams.FQ);
|
||||
String datefq = "";
|
||||
for (String dr: daterange) {
|
||||
String from_to[] = dr.endsWith("..") ? new String[]{dr.substring(0, dr.length() - 2), ""} : dr.startsWith("..") ? new String[]{"", dr.substring(2)} : dr.split("\\.\\.");
|
||||
if (from_to.length != 2) continue;
|
||||
Date from = this.parseGSAFS(from_to[0]);
|
||||
if (from == null) from = new Date(0);
|
||||
Date to = this.parseGSAFS(from_to[1]);
|
||||
if (to == null) to = new Date();
|
||||
to.setTime(to.getTime() + 24L * 60L * 60L * 1000L); // we add a day because the day is inclusive
|
||||
String z = CollectionSchema.last_modified.getSolrFieldName() + ":[" + ISO8601Formatter.FORMATTER.format(from) + " TO " + ISO8601Formatter.FORMATTER.format(to) + "]";
|
||||
datefq = datefq.length() == 0 ? z : datefq + " OR " + z; // append; the previous clauses must be kept
|
||||
}
|
||||
if (datefq.length() > 0) post.put(CommonParams.FQ, origfq == null || origfq.length() == 0 ? datefq : "(" + origfq + ") AND (" + datefq + ")");
|
||||
}
|
||||
post.put(CommonParams.FL,
|
||||
CollectionSchema.content_type.getSolrFieldName() + ',' +
|
||||
CollectionSchema.id.getSolrFieldName() + ',' +
|
||||
CollectionSchema.sku.getSolrFieldName() + ',' +
|
||||
CollectionSchema.title.getSolrFieldName() + ',' +
|
||||
CollectionSchema.description_txt.getSolrFieldName() + ',' +
|
||||
CollectionSchema.load_date_dt.getSolrFieldName() + ',' +
|
||||
CollectionSchema.last_modified.getSolrFieldName() + ',' +
|
||||
CollectionSchema.size_i.getSolrFieldName());
|
||||
post.put("hl", "true");
|
||||
post.put("hl.q", originalQuery);
|
||||
post.put("hl.fl", CollectionSchema.description_txt.getSolrFieldName() + "," + CollectionSchema.h4_txt.getSolrFieldName() + "," + CollectionSchema.h3_txt.getSolrFieldName() + "," + CollectionSchema.h2_txt.getSolrFieldName() + "," + CollectionSchema.h1_txt.getSolrFieldName() + "," + CollectionSchema.text_t.getSolrFieldName());
|
||||
post.put("hl.alternateField", CollectionSchema.description_txt.getSolrFieldName());
|
||||
post.put("hl.simple.pre", "<b>");
|
||||
post.put("hl.simple.post", "</b>");
|
||||
post.put("hl.fragsize", Integer.toString(SearchEvent.SNIPPET_MAX_LENGTH));
|
||||
|
||||
//String[] access = post.remove("access");
|
||||
//String[] entqr = post.remove("entqr");
|
||||
|
||||
// add sites operator
|
||||
String[] site = post.remove("site"); // example: col1|col2
|
||||
if (site != null && site.length > 0 && site[0].length() > 0) {
|
||||
String origfq = post.get(CommonParams.FQ);
|
||||
String sitefq = QueryModifier.parseCollectionExpression(site[0]);
|
||||
post.put(CommonParams.FQ, origfq == null || origfq.length() == 0 ? sitefq : "(" + origfq + ") AND (" + sitefq + ")");
|
||||
}
|
||||
|
||||
// get the embedded connector
|
||||
EmbeddedSolrConnector connector = sb.index.fulltext().getDefaultEmbeddedConnector();
|
||||
if (connector == null) return;
|
||||
|
||||
// do the solr request
|
||||
SolrQueryRequest req = connector.request(post.toSolrParams(null));
|
||||
SolrQueryResponse response = null;
|
||||
Exception e = null;
|
||||
try {response = connector.query(req);} catch (final SolrException ee) {e = ee;}
|
||||
if (response != null) e = response.getException();
|
||||
if (e != null) {
|
||||
ConcurrentLog.logException(e);
|
||||
if (req != null) req.close();
|
||||
SolrRequestInfo.clearRequestInfo();
|
||||
return;
|
||||
}
|
||||
|
||||
// set some context for the writer
|
||||
/*
|
||||
Map<Object,Object> context = req.getContext();
|
||||
context.put("ip", header.get("CLIENTIP", ""));
|
||||
context.put("client", "vsm_frontent");
|
||||
context.put("sort", sort.sort);
|
||||
context.put("site", site == null ? "" : site);
|
||||
context.put("access", access == null ? "p" : access[0]);
|
||||
context.put("entqr", entqr == null ? "3" : entqr[0]);
|
||||
*/
|
||||
|
||||
// write the result directly to the output stream
|
||||
Writer ow = new FastWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8));
|
||||
try {
|
||||
responseWriter.write(ow, req, response);
|
||||
ow.flush();
|
||||
} catch (final IOException e1) {
ConcurrentLog.logException(e1); // do not silently swallow write errors
|
||||
} finally {
|
||||
req.close();
|
||||
SolrRequestInfo.clearRequestInfo();
|
||||
try {ow.close();} catch (final IOException e1) {}
|
||||
}
|
||||
|
||||
// log result
|
||||
Object rv = response.getValues().get("response");
|
||||
int matches = 0;
|
||||
if (rv != null && rv instanceof ResultContext) {
|
||||
matches = ((ResultContext) rv).getDocList().matches();
|
||||
} else if (rv != null && rv instanceof SolrDocumentList) {
|
||||
matches = (int) ((SolrDocumentList) rv).getNumFound();
|
||||
}
|
||||
AccessTracker.addToDump(originalQuery, matches);
|
||||
ConcurrentLog.info("GSA Query", "results: " + matches + ", for query:" + post.toString());
|
||||
}
|
||||
|
||||
/**
|
||||
* Parse GSA date string (short form of ISO8601 date format)
|
||||
* @param datestring a date in the form "yyyy-MM-dd"
|
||||
* @return the parsed date (start of day in the system default time zone), or null when the string cannot be parsed
|
||||
* @see ISO8601Formatter
|
||||
*/
|
||||
public final Date parseGSAFS(final String datestring) {
|
||||
try {
|
||||
return Date
|
||||
.from(LocalDate.parse(datestring, FORMAT_GSAFS).atStartOfDay(ZoneId.systemDefault()).toInstant());
|
||||
} catch (final RuntimeException e) {
|
||||
return null;
|
||||
}
|
||||
}
|
||||
}
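
The `daterange` handling in the removed servlet turns GSA-style ranges such as `2013-10-01..2013-10-30`, `2013-10-01..` or `..2013-10-30` into a Solr filter on the last-modified field, with an inclusive end day and one clause per range joined by `OR`. Below is a minimal sketch of that conversion under some assumptions: the class and method names are hypothetical, dates are written as UTC midnight instead of going through `ISO8601Formatter` and the system time zone, and "today" is passed in so the result is deterministic.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.List;

/** Illustrative stand-in (hypothetical name): GSA "daterange" values to a Solr range filter. */
final class DateRangeSketch {

    /** Same pattern as PATTERN_GSAFS in the servlet: 'MM' is the month, 'mm' would be minutes. */
    private static final DateTimeFormatter DAY = DateTimeFormatter.ofPattern("uuuu-MM-dd");

    /** Parses "yyyy-MM-dd"; returns the fallback for empty or malformed input. */
    static LocalDate parseDay(final String s, final LocalDate fallback) {
        try {
            return LocalDate.parse(s, DAY);
        } catch (final DateTimeParseException e) {
            return fallback;
        }
    }

    static String toFilterQuery(final String field, final String[] ranges, final LocalDate today) {
        final List<String> clauses = new ArrayList<>();
        for (final String dr : ranges) {
            final int sep = dr.indexOf("..");
            if (sep < 0) continue; // not a range expression
            // an open start falls back to the epoch, an open end to today
            final LocalDate from = parseDay(dr.substring(0, sep), LocalDate.of(1970, 1, 1));
            // add one day because the end day is inclusive
            final LocalDate to = parseDay(dr.substring(sep + 2), today).plusDays(1);
            clauses.add(field + ":[" + from + "T00:00:00Z TO " + to + "T00:00:00Z]");
        }
        // every range contributes a clause; none may be overwritten
        return String.join(" OR ", clauses);
    }

    public static void main(final String[] args) {
        System.out.println(toFilterQuery("last_modified",
                new String[] {"2013-10-01..2013-10-30", "2014-01-01.."}, LocalDate.of(2014, 6, 15)));
    }
}
```

Collecting the clauses in a list and joining them once avoids the accumulation mistake that is easy to make with manual string concatenation.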
|