[
https://issues.apache.org/jira/browse/SOLR-8542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15103260#comment-15103260
]
Diego Ceccarelli edited comment on SOLR-8542 at 1/16/16 5:18 PM:
-----------------------------------------------------------------
Hi Ishan, thanks for pointing out SOLR-8183. I didn't know about it, and it
seems quite related.
We can plug in RankLib by creating a new class that represents the new LTR
model and extends
[ModelMetadata|https://github.com/bloomberg/lucene-solr/blob/trunk-learning-to-rank-plugin/solr/contrib/ltr/src/java/org/apache/solr/ltr/feature/ModelMetadata.java],
for example:
{code:java}
public class RankLibModel extends ModelMetadata {

  Ranker rankLibRanker;
  RankerFactory rankerFactory = new RankerFactory();
  // this constructor is missing, we will need a way to create a datapoint
  DenseDataPoint documentFeatures = new DenseDataPoint();

  public RankLibModel(String name, String type, List<Feature> features,
      String featureStoreName, Collection<Feature> allFeatures,
      NamedParams params) {
    super(name, type, features, featureStoreName, allFeatures, params);
    // the file containing the model is a parameter
    String ranklibModelFile = getParams().getParam("model-file");
    // load the model (method name as in this sketch; check RankerFactory
    // in the RankLib version in use)
    rankLibRanker = rankerFactory.loadModel(ranklibModelFile);
  }

  @Override
  public float score(float[] modelFeatureValuesNormalized) {
    // set the feature vector in the datapoint object
    documentFeatures.setFeatureVector(modelFeatureValuesNormalized);
    // predict the score using the ranklib model; the cast assumes
    // eval returns a double
    return (float) rankLibRanker.eval(documentFeatures);
  }
}
{code}
This code loads a particular RankLib model from the file specified in
the model store configuration.
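The class above notes that {{DenseDataPoint}} has no no-argument constructor. One possible workaround, assuming the RankLib version in use can build a data point from a LETOR-style text line ({{label qid:ID 1:v1 2:v2 ...}}), is to render the feature array as such a line and create a fresh data point for each call. The helper below only does the string formatting; {{RankLibLineFormatter}} and {{toLine}} are hypothetical names, and the placeholder label and query id are assumptions, so treat this as a rough sketch rather than the plugin's approach:

```java
// Hypothetical helper for illustration; not part of RankLib or the LTR plugin.
final class RankLibLineFormatter {

    private RankLibLineFormatter() {}

    /**
     * Renders a feature vector as "label qid:ID 1:v1 2:v2 ...".
     * Feature ids start at 1; array index i becomes feature id i + 1.
     */
    static String toLine(float[] features) {
        // Label and query id are placeholders (an assumption): a model
        // should not need them when it only predicts a score.
        StringBuilder sb = new StringBuilder("0 qid:0");
        for (int i = 0; i < features.length; i++) {
            sb.append(' ').append(i + 1).append(':').append(features[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Three example feature values, chosen arbitrarily.
        System.out.println(toLine(new float[] {1.0f, 199.5f, 0.25f}));
    }
}
```

Building a new data point per call would also avoid sharing one mutable {{documentFeatures}} field across concurrent requests, at the cost of some extra allocation and parsing.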
If you send Solr a model configuration file like this:
{code}
{
  "type":"org.apache.solr.ltr.ranking.RankLibModel",
  "name":"ranklib-GBDT",
  "features":[
    {"name":"isInStock"},
    {"name":"price"},
    {"name":"originalScore"},
    {"name":"productNameMatchQuery"}
  ],
  "params":{
    "model-file":"/data/ranking/ranking-GBDT.txt"
  }
}
{code}
The plugin will create a RankLib model from the file
{{/data/ranking/ranking-GBDT.txt}}, and you'll be able
to use it at ranking time by its name, {{ranklib-GBDT}}, by adding an {{rq}}
param with the {{ltr}} parser to the query:
{code}
http://localhost:8983/solr/techproducts/query?indent=on&q=test&wt=json&rq={!ltr model=ranklib-GBDT reRankDocs=25}
{code}
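The {{rq}} value above contains braces, spaces, and {{=}} signs, so a client that issues the request programmatically will usually need to URL-encode it. A minimal sketch of building such a URL follows, assuming Java 10 or later for {{URLEncoder.encode(String, Charset)}}; {{LtrQueryUrl}} and {{build}} are hypothetical names, not part of Solr or the plugin:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of Solr or the LTR plugin.
final class LtrQueryUrl {

    private LtrQueryUrl() {}

    /** Builds a query URL whose rq parameter reranks the top documents with the named model. */
    static String build(String baseUrl, String userQuery, String modelName, int reRankDocs) {
        // Local-params syntax as used in the example request above.
        String rq = "{!ltr model=" + modelName + " reRankDocs=" + reRankDocs + "}";
        return baseUrl
            + "?indent=on"
            + "&q=" + URLEncoder.encode(userQuery, StandardCharsets.UTF_8)
            + "&wt=json"
            // URLEncoder applies form encoding, so spaces become '+'.
            + "&rq=" + URLEncoder.encode(rq, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(build(
            "http://localhost:8983/solr/techproducts/query", "test", "ranklib-GBDT", 25));
    }
}
```

With the arguments shown in {{main}}, the {{rq}} parameter comes out as {{%7B%21ltr+model%3Dranklib-GBDT+reRankDocs%3D25%7D}}.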
At query time, the features {{isInStock}}, {{price}}, {{originalScore}}, and
{{productNameMatchQuery}} will be computed and passed to the {{score(float[]
modelFeatureValuesNormalized)}} method to get the new predicted score
for each document. If RankLib's licence is compatible, I think we could add
this to the plugin. Any comments?
> Integrate Learning to Rank into Solr
> ------------------------------------
>
> Key: SOLR-8542
> URL: https://issues.apache.org/jira/browse/SOLR-8542
> Project: Solr
> Issue Type: New Feature
> Reporter: Joshua Pantony
> Assignee: Christine Poerschke
> Priority: Minor
> Attachments: README.md, README.md, SOLR-8542-branch_5x.patch,
> SOLR-8542-trunk.patch
>
>
> This is a ticket to integrate learning to rank machine learning models into
> Solr. Solr Learning to Rank (LTR) provides a way for you to extract features
> directly inside Solr for use in training a machine learned model. You can
> then deploy that model to Solr and use it to rerank your top X search
> results. This concept was previously presented by the authors at Lucene/Solr
> Revolution 2015 (
> http://www.slideshare.net/lucidworks/learning-to-rank-in-solr-presented-by-michael-nilsson-diego-ceccarelli-bloomberg-lp
> ).
> The attached code was jointly worked on by Joshua Pantony, Michael Nilsson,
> and Diego Ceccarelli.
> Any chance this could make it into a 5x release? We've also attached
> documentation as a github MD file, but are happy to convert to a desired
> format.
> h3. Test the plugin with solr/example/techproducts in 6 steps
> Solr provides some simple example indexes. To test the plugin with the
> techproducts example, follow these steps:
> h4. 1. compile solr and the examples
> cd solr
> ant dist
> ant example
> h4. 2. run the example
> ./bin/solr -e techproducts
> h4. 3. stop it and install the plugin:
>
> ./bin/solr stop
> mkdir example/techproducts/solr/techproducts/lib
> cp build/contrib/ltr/lucene-ltr-6.0.0-SNAPSHOT.jar example/techproducts/solr/techproducts/lib/
> cp contrib/ltr/example/solrconfig.xml example/techproducts/solr/techproducts/conf/
> h4. 4. run the example again
>
> ./bin/solr -e techproducts
> h4. 5. index some features and a model
> curl -XPUT 'http://localhost:8983/solr/techproducts/schema/fstore' --data-binary "@./contrib/ltr/example/techproducts-features.json" -H 'Content-type:application/json'
> curl -XPUT 'http://localhost:8983/solr/techproducts/schema/mstore' --data-binary "@./contrib/ltr/example/techproducts-model.json" -H 'Content-type:application/json'
> h4. 6. have fun!
> *access the default feature store*
> http://localhost:8983/solr/techproducts/schema/fstore/_DEFAULT_
> *access the model store*
> http://localhost:8983/solr/techproducts/schema/mstore
> *perform a query using the model, and retrieve the features*
> http://localhost:8983/solr/techproducts/query?indent=on&q=test&wt=json&rq={!ltr%20model=svm%20reRankDocs=25%20efi.query=%27test%27}&fl=*,[features],price,score,name&fv=true