Fixed ranking for add-function queries: this did not work. The option
was removed. All function queries are now boosts (the score is multiplied
according to a function). This is also the recommended way to boost
rankings based on functions, as explained in
http://nolanlawson.com/2012/06/02/comparing-boost-methods-in-solr/
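
The difference between the removed additive mode and the remaining multiplicative mode can be sketched as follows. This is only an illustration, not the YaCy implementation: the class `BoostParamsSketch`, the method `rankingParams`, and the two toy scoring helpers are hypothetical names. The one fact relied on is that edismax adds the value of a `bf` function to the score and multiplies the score by the value of a `boost` function.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: plain maps stand in for the real request objects.
class BoostParamsSketch {

    // Builds edismax ranking parameters the way the code after this commit does:
    // the function is always sent as "boost" (multiplicative), never as "bf" (additive).
    static Map<String, String> rankingParams(String bq, String bf) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("defType", "edismax");
        if (bq != null && !bq.isEmpty()) params.put("bq", bq);
        if (bf != null && !bf.isEmpty()) params.put("boost", bf);
        return params;
    }

    // Toy model of the removed additive mode ("bf"): function value is added to the score.
    static double additive(double score, double functionValue) {
        return score + functionValue;
    }

    // Toy model of the remaining mode ("boost"): score is multiplied by the function value.
    static double multiplicative(double score, double functionValue) {
        return score * functionValue;
    }

    public static void main(String[] args) {
        System.out.println(rankingParams(
                "fuzzy_signature_unique_b:true^100000.0",
                "recip(ms(NOW,last_modified),3.16e-11,1,1)"));
        System.out.println("additive:       " + additive(2.0, 0.5));
        System.out.println("multiplicative: " + multiplicative(2.0, 0.5));
    }
}
```

With an additive function, the effect depends on how large the function value is relative to the base score; with a multiplicative one, the function scales every score by the same kind of factor, which is the behaviour this commit makes the only option.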
pull/1/head
Michael Peter Christen 12 years ago
parent ac5fa9fe48
commit 97775fbebc

@ -960,22 +960,18 @@ search.ranking.solr.collection.boostname.tmp.0=_default
search.ranking.solr.collection.boostfields.tmp.0=text_t^2.0,url_paths_sxt^50.0,title^100.0,synonyms_sxt^1.0 search.ranking.solr.collection.boostfields.tmp.0=text_t^2.0,url_paths_sxt^50.0,title^100.0,synonyms_sxt^1.0
search.ranking.solr.collection.boostquery.tmp.0=fuzzy_signature_unique_b:true^100000.0 search.ranking.solr.collection.boostquery.tmp.0=fuzzy_signature_unique_b:true^100000.0
search.ranking.solr.collection.boostfunction.tmp.0= search.ranking.solr.collection.boostfunction.tmp.0=
search.ranking.solr.collection.boostfunctionmode.tmp.0=add
search.ranking.solr.collection.boostname.tmp.1=_date search.ranking.solr.collection.boostname.tmp.1=_date
search.ranking.solr.collection.boostfields.tmp=text_t^1.0 search.ranking.solr.collection.boostfields.tmp=text_t^1.0
search.ranking.solr.collection.boostquery.tmp.1=fuzzy_signature_unique_b:true^100000.0 search.ranking.solr.collection.boostquery.tmp.1=fuzzy_signature_unique_b:true^100000.0
search.ranking.solr.collection.boostfunction.tmp.1=recip(ms(NOW,last_modified),3.16e-11,1,1) search.ranking.solr.collection.boostfunction.tmp.1=recip(ms(NOW,last_modified),3.16e-11,1,1)
search.ranking.solr.collection.boostfunctionmode.tmp.1=multiply
search.ranking.solr.collection.boostname.tmp.2=_intranet search.ranking.solr.collection.boostname.tmp.2=_intranet
search.ranking.solr.collection.boostfields.tmp.2=text_t^2.0,url_paths_sxt^20.0,title^10000.0,h1_txt^10000.0,h2_txt^1000.0,synonyms_sxt^1.0 search.ranking.solr.collection.boostfields.tmp.2=text_t^2.0,url_paths_sxt^20.0,title^10000.0,h1_txt^10000.0,h2_txt^1000.0,synonyms_sxt^1.0
search.ranking.solr.collection.boostquery.tmp.2=fuzzy_signature_unique_b:true^100000.0 search.ranking.solr.collection.boostquery.tmp.2=fuzzy_signature_unique_b:true^100000.0
search.ranking.solr.collection.boostfunction.tmp.2=pow(url_chars_i,2) search.ranking.solr.collection.boostfunction.tmp.2=pow(url_chars_i,2)
search.ranking.solr.collection.boostfunctionmode.tmp.2=add
search.ranking.solr.collection.boostname.tmp.3=_unused3 search.ranking.solr.collection.boostname.tmp.3=_unused3
search.ranking.solr.collection.boostfields.tmp.3=text_t^1.0 search.ranking.solr.collection.boostfields.tmp.3=text_t^1.0
search.ranking.solr.collection.boostquery.tmp.3=fuzzy_signature_unique_b:true^100000.0 search.ranking.solr.collection.boostquery.tmp.3=fuzzy_signature_unique_b:true^100000.0
search.ranking.solr.collection.boostfunction.tmp.3=div(add(1,references_i),add(url_chars_i,pow(clickdepth_i,3))) search.ranking.solr.collection.boostfunction.tmp.3=div(add(1,references_i),add(url_chars_i,pow(clickdepth_i,3)))
search.ranking.solr.collection.boostfunctionmode.tmp.3=multiply
# the following values are used to identify duplicate content # the following values are used to identify duplicate content
search.ranking.solr.doubledetection.minlength=3 search.ranking.solr.doubledetection.minlength=3

@ -22,20 +22,15 @@
<fieldset> <fieldset>
<input type="hidden" name="profileNr" value="#[profileNr]#" /> <input type="hidden" name="profileNr" value="#[profileNr]#" />
<legend>Boost Function</legend> <legend>Boost Function</legend>
A Boost Function can combine numeric values from the result document to produce a number which is either added or multiplied with the other boost value from the query result. A Boost Function can combine numeric values from the result document to produce a number which is multiplied with the score value from the query result.
To see all available fields, see the <a href="IndexSchema_p.html">YaCy Solr Schema</a> and look for numeric values (these are names with suffix '_i'). To see all available fields, see the <a href="IndexSchema_p.html">YaCy Solr Schema</a> and look for numeric values (these are names with suffix '_i').
To find out which kind of operations are possible, see the <a href="http://wiki.apache.org/solr/FunctionQuery">Solr Function Query</a> documentation. To find out which kind of operations are possible, see the <a href="http://wiki.apache.org/solr/FunctionQuery">Solr Function Query</a> documentation.
Example: to order by date, use "recip(ms(NOW,last_modified),3.16e-11,1,1)", to order by clickdepth, use "div(100,add(clickdepth_i,1))". Example: to order by date, use "recip(rord(last_modified),1,1000,1000)", to order by clickdepth, use "div(100,add(clickdepth_i,1))".
<dl> <dl>
<dt style="width:260px;margin:0;padding:0;height:1.8em;"><label for="bf" id="bf_label">#[modeKey]#</label></dt> <dt style="width:260px;margin:0;padding:0;height:1.8em;"><label for="bf" id="bf_label">boost</label></dt>
<dd style="width:360px;margin:0;padding:0;height:1.8em;float:left;display:inline;" id="bf_dd"> <dd style="width:360px;margin:0;padding:0;height:1.8em;float:left;display:inline;" id="bf_dd">
<input name="bf" id="bf" type="text" align="left" size="100" value="#[bf]#" /> <input name="bf" id="bf" type="text" align="left" size="100" value="#[bf]#" />
</dd> </dd>
<dt style="width:260px;margin:0;padding:0;height:1.8em;"><label for="bq">mode</label></dt>
<dd style="width:360px;margin:0;padding:0;height:1.8em;float:left;display:inline;" id="bf_dd">
<input type="radio" name="mode" id="add" onclick="document.getElementById('bf_label').innerHTML='bf'" value="add" #(add.checked)#:: checked="checked"#(/add.checked)# />add&nbsp;&nbsp;&nbsp;
<input type="radio" name="mode" id="multiply" onclick="document.getElementById('bf_label').innerHTML='boost'" value="multiply" #(multiply.checked)#:: checked="checked"#(/multiply.checked)# />multiply
</dd>
<dt style="width:260px;margin:0;padding:0;height:1.8em;"></dt> <dt style="width:260px;margin:0;padding:0;height:1.8em;"></dt>
<dd style="width:360px;margin:0;padding:0;height:1.8em;float:left;display:inline;"> <dd style="width:360px;margin:0;padding:0;height:1.8em;float:left;display:inline;">
<input type="submit" name="EnterBF" value="Set Boost Function" /> <input type="submit" name="EnterBF" value="Set Boost Function" />

@ -90,22 +90,16 @@ public class RankingSolr_p {
if (post != null && post.containsKey("EnterBF")) { if (post != null && post.containsKey("EnterBF")) {
String bf = post.get("bf"); String bf = post.get("bf");
String mode = post.get("mode");
if (bf != null) { if (bf != null) {
sb.setConfig(SwitchboardConstants.SEARCH_RANKING_SOLR_COLLECTION_BOOSTFUNCTION_ + profileNr, bf); sb.setConfig(SwitchboardConstants.SEARCH_RANKING_SOLR_COLLECTION_BOOSTFUNCTION_ + profileNr, bf);
sb.setConfig(SwitchboardConstants.SEARCH_RANKING_SOLR_COLLECTION_BOOSTFUNCTIONMODE_ + profileNr, mode);
sb.index.fulltext().getDefaultConfiguration().getRanking(profileNr).setBoostFunction(bf); sb.index.fulltext().getDefaultConfiguration().getRanking(profileNr).setBoostFunction(bf);
sb.index.fulltext().getDefaultConfiguration().getRanking(profileNr).setMode(Ranking.BoostFunctionMode.valueOf(mode));
} }
} }
if (post != null && post.containsKey("ResetBF")) { if (post != null && post.containsKey("ResetBF")) {
String bf = ""; //"div(add(1,references_i),pow(add(1,inboundlinkscount_i),1.6))"; String bf = ""; //"div(add(1,references_i),pow(add(1,inboundlinkscount_i),1.6))";
String mode = "add";
if (bf != null) { if (bf != null) {
sb.setConfig(SwitchboardConstants.SEARCH_RANKING_SOLR_COLLECTION_BOOSTFUNCTION_ + profileNr, bf); sb.setConfig(SwitchboardConstants.SEARCH_RANKING_SOLR_COLLECTION_BOOSTFUNCTION_ + profileNr, bf);
sb.setConfig(SwitchboardConstants.SEARCH_RANKING_SOLR_COLLECTION_BOOSTFUNCTIONMODE_ + profileNr, mode);
sb.index.fulltext().getDefaultConfiguration().getRanking(profileNr).setBoostFunction(bf); sb.index.fulltext().getDefaultConfiguration().getRanking(profileNr).setBoostFunction(bf);
sb.index.fulltext().getDefaultConfiguration().getRanking(profileNr).setMode(Ranking.BoostFunctionMode.valueOf(mode));
} }
} }
@ -129,9 +123,6 @@ public class RankingSolr_p {
prop.put("boosts", i); prop.put("boosts", i);
prop.put("bq", ranking.getBoostQuery()); prop.put("bq", ranking.getBoostQuery());
prop.put("bf", ranking.getBoostFunction()); prop.put("bf", ranking.getBoostFunction());
prop.put("modeKey", ranking.getMethod() == Ranking.BoostFunctionMode.add ? "bf" : "boost");
prop.put("add.checked", ranking.getMethod() == Ranking.BoostFunctionMode.add ? 1 : 0);
prop.put("multiply.checked", ranking.getMethod() == Ranking.BoostFunctionMode.add ? 0 : 1);
for (int j = 0; j < 4; j++) { for (int j = 0; j < 4; j++) {
prop.put("profiles_" + j + "_nr", j); prop.put("profiles_" + j + "_nr", j);

@ -109,7 +109,7 @@ public class searchresult {
// get a solr query string // get a solr query string
QueryGoal qg = new QueryGoal(originalQuery, originalQuery); QueryGoal qg = new QueryGoal(originalQuery, originalQuery);
StringBuilder solrQ = qg.collectionQueryString(sb.index.fulltext().getDefaultConfiguration()); StringBuilder solrQ = qg.collectionQueryString(sb.index.fulltext().getDefaultConfiguration(), 0);
post.put("defType", "edismax"); post.put("defType", "edismax");
post.put(CommonParams.Q, solrQ.toString()); post.put(CommonParams.Q, solrQ.toString());
post.put(CommonParams.ROWS, post.remove("num")); post.put(CommonParams.ROWS, post.remove("num"));
@ -130,8 +130,8 @@ public class searchresult {
Ranking ranking = sb.index.fulltext().getDefaultConfiguration().getRanking(0); Ranking ranking = sb.index.fulltext().getDefaultConfiguration().getRanking(0);
String bq = ranking.getBoostQuery(); String bq = ranking.getBoostQuery();
String bf = ranking.getBoostFunction(); String bf = ranking.getBoostFunction();
if (bq.length() > 0) post.put("bq", bq); // a boost query that moves double content to the back if (bq.length() > 0) post.put("bq", bq);
if (bf.length() > 0) post.put(ranking.getMethod() == Ranking.BoostFunctionMode.add ? "bf" : "boost", bf); // a boost function extension, see http://wiki.apache.org/solr/ExtendedDisMax#bf_.28Boost_Function.2C_additive.29 if (bf.length() > 0) post.put("boost", bf); // a boost function extension, see http://wiki.apache.org/solr/ExtendedDisMax#bf_.28Boost_Function.2C_additive.29
} }
post.put(CommonParams.FL, post.put(CommonParams.FL,
CollectionSchema.content_type.getSolrFieldName() + ',' + CollectionSchema.content_type.getSolrFieldName() + ',' +

@ -146,6 +146,9 @@ public class select {
if (post == null) return null; if (post == null) return null;
sb.intermissionAllThreads(3000); // tell all threads to do nothing for a specific time sb.intermissionAllThreads(3000); // tell all threads to do nothing for a specific time
// get the ranking profile id
int profileNr = post.getInt("profileNr", 0);
// rename post fields according to result style // rename post fields according to result style
if (!post.containsKey(CommonParams.Q) && post.containsKey("query")) { if (!post.containsKey(CommonParams.Q) && post.containsKey("query")) {
String querystring = post.get("query", ""); String querystring = post.get("query", "");
@ -154,7 +157,7 @@ public class select {
querystring = modifier.parse(querystring); querystring = modifier.parse(querystring);
modifier.apply(post); modifier.apply(post);
QueryGoal qg = new QueryGoal(querystring, querystring); QueryGoal qg = new QueryGoal(querystring, querystring);
StringBuilder solrQ = qg.collectionQueryString(sb.index.fulltext().getDefaultConfiguration()); StringBuilder solrQ = qg.collectionQueryString(sb.index.fulltext().getDefaultConfiguration(), profileNr);
post.put(CommonParams.Q, solrQ.toString()); // sru patch post.put(CommonParams.Q, solrQ.toString()); // sru patch
} }
String q = post.get(CommonParams.Q, ""); String q = post.get(CommonParams.Q, "");
@ -162,14 +165,14 @@ public class select {
if (!post.containsKey(CommonParams.ROWS)) post.put(CommonParams.ROWS, post.remove("maximumRecords", 10)); // sru patch if (!post.containsKey(CommonParams.ROWS)) post.put(CommonParams.ROWS, post.remove("maximumRecords", 10)); // sru patch
post.put(CommonParams.ROWS, Math.min(post.getInt(CommonParams.ROWS, 10), (authenticated) ? 10000 : 100)); post.put(CommonParams.ROWS, Math.min(post.getInt(CommonParams.ROWS, 10), (authenticated) ? 10000 : 100));
// set default ranking if this is not given in the request // set ranking according to profile number if ranking attributes are not given in the request
if (!post.containsKey("sort")) { if (!post.containsKey("sort") && !post.containsKey("bq") && !post.containsKey("bf") && !post.containsKey("boost")) {
if (!post.containsKey("defType")) post.put("defType", "edismax"); if (!post.containsKey("defType")) post.put("defType", "edismax");
Ranking ranking = sb.index.fulltext().getDefaultConfiguration().getRanking(0); Ranking ranking = sb.index.fulltext().getDefaultConfiguration().getRanking(profileNr);
String bq = ranking.getBoostQuery(); String bq = ranking.getBoostQuery();
String bf = ranking.getBoostFunction(); String bf = ranking.getBoostFunction();
if (!post.containsKey("bq") && bq.length() > 0) post.put("bq", bq); // a boost query that moves double content to the back if (bq.length() > 0) post.put("bq", bq);
if (!(post.containsKey("bf") || post.containsKey("boost")) && bf.length() > 0) post.put(ranking.getMethod() == Ranking.BoostFunctionMode.add ? "bf" : "boost", bf); // a boost function extension, see http://wiki.apache.org/solr/ExtendedDisMax#bf_.28Boost_Function.2C_additive.29 if (bf.length() > 0) post.put("boost", bf); // a boost function extension, see http://wiki.apache.org/solr/ExtendedDisMax#bf_.28Boost_Function.2C_additive.29
} }
// get a response writer for the result // get a response writer for the result

@ -36,13 +36,8 @@ public class Ranking {
private static float quantRate = 0.5f; // to be filled with search.ranking.solr.doubledetection.quantrate private static float quantRate = 0.5f; // to be filled with search.ranking.solr.doubledetection.quantrate
private static int minTokenLen = 3; // to be filled with search.ranking.solr.doubledetection.minlength private static int minTokenLen = 3; // to be filled with search.ranking.solr.doubledetection.minlength
public static enum BoostFunctionMode {
add, multiply;
}
private Map<SchemaDeclaration, Float> fieldBoosts; private Map<SchemaDeclaration, Float> fieldBoosts;
private String name, boostQuery, boostFunction; private String name, boostQuery, boostFunction;
private BoostFunctionMode mode;
public Ranking() { public Ranking() {
super(); super();
@ -50,7 +45,6 @@ public class Ranking {
this.fieldBoosts = new LinkedHashMap<SchemaDeclaration, Float>(); this.fieldBoosts = new LinkedHashMap<SchemaDeclaration, Float>();
this.boostQuery = ""; this.boostQuery = "";
this.boostFunction = ""; this.boostFunction = "";
this.mode = BoostFunctionMode.add;
} }
@ -118,15 +112,6 @@ public class Ranking {
return this.boostFunction; return this.boostFunction;
} }
public void setMode(BoostFunctionMode method) {
this.mode = method;
}
public BoostFunctionMode getMethod() {
return this.mode;
}
/* /*
* duplicate check static methods * duplicate check static methods
*/ */

@ -470,7 +470,6 @@ public final class Switchboard extends serverSwitch {
r.updateBoosts(this.getConfig(SwitchboardConstants.SEARCH_RANKING_SOLR_COLLECTION_BOOSTFIELDS_ + i, "text_t^1.0")); r.updateBoosts(this.getConfig(SwitchboardConstants.SEARCH_RANKING_SOLR_COLLECTION_BOOSTFIELDS_ + i, "text_t^1.0"));
r.setBoostQuery(this.getConfig(SwitchboardConstants.SEARCH_RANKING_SOLR_COLLECTION_BOOSTQUERY_ + i, "")); r.setBoostQuery(this.getConfig(SwitchboardConstants.SEARCH_RANKING_SOLR_COLLECTION_BOOSTQUERY_ + i, ""));
r.setBoostFunction(this.getConfig(SwitchboardConstants.SEARCH_RANKING_SOLR_COLLECTION_BOOSTFUNCTION_ + i, "")); r.setBoostFunction(this.getConfig(SwitchboardConstants.SEARCH_RANKING_SOLR_COLLECTION_BOOSTFUNCTION_ + i, ""));
r.setMode(Ranking.BoostFunctionMode.valueOf(this.getConfig(SwitchboardConstants.SEARCH_RANKING_SOLR_COLLECTION_BOOSTFUNCTIONMODE_ + i, "add")));
} }
// initialize index // initialize index

@ -494,7 +494,6 @@ public final class SwitchboardConstants {
public static final String SEARCH_RANKING_SOLR_COLLECTION_BOOSTFIELDS_ = "search.ranking.solr.collection.boostfields.tmp."; public static final String SEARCH_RANKING_SOLR_COLLECTION_BOOSTFIELDS_ = "search.ranking.solr.collection.boostfields.tmp.";
public static final String SEARCH_RANKING_SOLR_COLLECTION_BOOSTQUERY_ = "search.ranking.solr.collection.boostquery.tmp."; public static final String SEARCH_RANKING_SOLR_COLLECTION_BOOSTQUERY_ = "search.ranking.solr.collection.boostquery.tmp.";
public static final String SEARCH_RANKING_SOLR_COLLECTION_BOOSTFUNCTION_ = "search.ranking.solr.collection.boostfunction.tmp."; public static final String SEARCH_RANKING_SOLR_COLLECTION_BOOSTFUNCTION_ = "search.ranking.solr.collection.boostfunction.tmp.";
public static final String SEARCH_RANKING_SOLR_COLLECTION_BOOSTFUNCTIONMODE_ = "search.ranking.solr.collection.boostfunctionmode.tmp.";
/** /**
* system tray * system tray

@ -195,7 +195,7 @@ public class QueryGoal {
for (final byte[] b: blues) this.include_hashes.remove(b); for (final byte[] b: blues) this.include_hashes.remove(b);
} }
public StringBuilder collectionQueryString(CollectionConfiguration configuration) { public StringBuilder collectionQueryString(CollectionConfiguration configuration, int rankingProfile) {
final StringBuilder q = new StringBuilder(80); final StringBuilder q = new StringBuilder(80);
// parse special requests // parse special requests
@ -222,7 +222,7 @@ public class QueryGoal {
// combine these queries for all relevant fields // combine these queries for all relevant fields
wc = 0; wc = 0;
Float boost; Float boost;
Ranking r = configuration.getRanking(0); Ranking r = configuration.getRanking(rankingProfile);
for (Map.Entry<SchemaDeclaration,Float> entry: r.getBoostMap()) { for (Map.Entry<SchemaDeclaration,Float> entry: r.getBoostMap()) {
SchemaDeclaration field = entry.getKey(); SchemaDeclaration field = entry.getKey();
boost = entry.getValue(); boost = entry.getValue();

@ -384,14 +384,15 @@ public final class QueryParams {
if (this.queryGoal.getIncludeStrings().size() == 0) return null; if (this.queryGoal.getIncludeStrings().size() == 0) return null;
// construct query // construct query
final SolrQuery params = new SolrQuery(); final SolrQuery params = new SolrQuery();
params.setQuery(this.queryGoal.collectionQueryString(this.indexSegment.fulltext().getDefaultConfiguration()).toString()); int rankingProfile = this.ranking.coeff_date == RankingProfile.COEFF_MAX ? 1 : (this.modifier.sitehash != null || this.modifier.sitehost != null) ? 2 : 0;
params.setQuery(this.queryGoal.collectionQueryString(this.indexSegment.fulltext().getDefaultConfiguration(), rankingProfile).toString());
params.setParam("defType", "edismax"); params.setParam("defType", "edismax");
Ranking ranking = indexSegment.fulltext().getDefaultConfiguration().getRanking(0); Ranking ranking = indexSegment.fulltext().getDefaultConfiguration().getRanking(rankingProfile); // for a by-date ranking select different ranking profile
//Ranking ranking = indexSegment.fulltext().getDefaultConfiguration().getRanking(this.ranking.coeff_date == RankingProfile.COEFF_MAX ? 1 : (this.modifier.sitehash != null || this.modifier.sitehost != null) ? 2 : 0); // for a by-date ranking select different ranking profile
String bq = ranking.getBoostQuery(); String bq = ranking.getBoostQuery();
String bf = ranking.getBoostFunction(); String bf = ranking.getBoostFunction();
if (bq.length() > 0) params.setParam("bq", bq); // a boost query that moves double content to the back if (bq.length() > 0) params.setParam("bq", bq);
if (bf.length() > 0) params.setParam(ranking.getMethod() == Ranking.BoostFunctionMode.add ? "bf" : "boost", bf); // a boost function extension, see http://wiki.apache.org/solr/ExtendedDisMax#bf_.28Boost_Function.2C_additive.29 if (bf.length() > 0) params.setParam("boost", bf); // a boost function extension, see http://wiki.apache.org/solr/ExtendedDisMax#bf_.28Boost_Function.2C_additive.29
params.setStart(this.offset); params.setStart(this.offset);
params.setRows(this.itemsPerPage); params.setRows(this.itemsPerPage);
params.setFacet(false); params.setFacet(false);
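
The ranking profile selection introduced in the hunk above can be read as a small decision function. The following is a minimal sketch under assumptions: `RankingProfileSketch` and `selectProfile` are hypothetical names, and the value given to `COEFF_MAX` is a placeholder rather than the real `RankingProfile.COEFF_MAX`. The profile numbers follow the configuration shown earlier in this commit (0 = `_default`, 1 = `_date`, 2 = `_intranet`).

```java
// Hypothetical sketch of the profile choice made in QueryParams.
class RankingProfileSketch {

    // Placeholder value; the real constant lives in RankingProfile.
    static final int COEFF_MAX = 15;

    // Returns 1 for a by-date ranking, 2 for a site-restricted query, 0 otherwise.
    static int selectProfile(int coeffDate, String sitehash, String sitehost) {
        if (coeffDate == COEFF_MAX) return 1;                   // _date profile
        if (sitehash != null || sitehost != null) return 2;     // _intranet profile
        return 0;                                               // _default profile
    }

    public static void main(String[] args) {
        System.out.println(selectProfile(COEFF_MAX, null, null));
        System.out.println(selectProfile(0, null, "example.org"));
        System.out.println(selectProfile(0, null, null));
    }
}
```

The date check comes first, so a by-date ranking selects profile 1 even when the query is also restricted to a site.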
