florinbabes commented on PR #2030:
URL: https://github.com/apache/solr/pull/2030#issuecomment-1784155535

   > Hi Florin!
   > 
   > Thanks for opening this pull request, with a test and scoped to only 
change the learning-to-rank model store.
   > 
   > I've added a commit with minor tweaks, feel free to revert or amend.
   > 
   > Taking a step back from the code, here's a couple of thoughts or questions:
   > 
   > * Do we need to consider backwards compatibility i.e. existing storage 
behaviour changing and/or how to communicate the change in the upgrade notes?
   > * Might some users prefer the existing behaviour i.e. more storage space 
but also more human readable?
   > * Would consistency between the LTR model store and the LTR feature store 
be helpful?
   > * Might it be possible to support both the existing and the new behaviour?
   >   
   >   * A system property e.g. `-Dsolr.ltr.model-store.indentSize=2` and 
`-Dsolr.ltr.feature-store.indentSize=2` perhaps?
   >   * A configurable somehow, how?
   >   * A parameter that is passed during upload e.g. different models could 
have different indentation?
   >   * An alternative endpoint for uploading?
   >   * Something else?
   > 
   > What do you think?
   
   Hello @cpoerschke,
   Thanks for the fix. 
   
   The fix is backwards compatible with old saved models. When an update 
(add/delete) to the model store is done, the new file will be saved as 
compacted JSON. 
   `curl 
http://localhost:8983/solr/small_models/admin/file\?wt\=json\&_\=1698594868357\&file\=_schema_model-store.json\&contentType\=application%2Fjson%3Bcharset%3Dutf-8
   {
     "initArgs":{},
     "initializedOn":"2023-10-29T15:25:29.481Z",
     "managedList":[{
         "name":"6029760550880411648",
         "class":"org.apache.solr.ltr.model.LinearModel",
         "store":"_DEFAULT_",
         "features":[
           {
             "name":"title",
             "norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},
           {
             "name":"description",
             "norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},
           {
             "name":"keywords",
             "norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},
           {
             "name":"popularity",
             "norm":{
               "class":"org.apache.solr.ltr.norm.MinMaxNormalizer",
               "params":{
                 "min":"0.0",
                 "max":"10.0"}}},
           {
             "name":"text",
             "norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},
           {
             "name":"queryIntentPerson",
             "norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},
           {
             "name":"queryIntentCompany",
             "norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}}],
         "params":{"weights":{
             "title":0.0,
             "description":0.1,
             "keywords":0.2,
             "popularity":0.3,
             "text":0.4,
             "queryIntentPerson":0.1231231,
             "queryIntentCompany":0.12121211}}}]}%
   ➜  ~ curl -XPUT 'http://localhost:8983/solr/small_models/schema/model-store' 
--data-binary "@/tmp/linear-model-new.json" -H 'Content-type:application/json'
   {
     "responseHeader":{
       "status":0,
       "QTime":6
     }
   }%
   ➜  ~ curl 
http://localhost:8983/solr/small_models/admin/file\?wt\=json\&_\=1698594868357\&file\=_schema_model-store.json\&contentType\=application%2Fjson%3Bcharset%3Dutf-8
   
{"initArgs":{},"initializedOn":"2023-10-29T15:44:47.459Z","updatedSinceInit":"2023-10-29T15:45:15.689Z","managedList":[{"name":"6029760550880411648-after-fix","class":"org.apache.solr.ltr.model.LinearModel","store":"_DEFAULT_","features":[{"name":"title","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"description","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"keywords","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"popularity","norm":{"class":"org.apache.solr.ltr.norm.MinMaxNormalizer","params":{"min":"0.0","max":"10.0"}}},{"name":"text","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"queryIntentPerson","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"queryIntentCompany","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}}],"params":{"weights":{"title":0.0,"description":0.1,"keywords":0.2,"popularity":0.3,"text":0.4,"queryIntentPerson":0.1231231
 
,"queryIntentCompany":0.12121211}}},{"name":"6029760550880411648","class":"org.apache.solr.ltr.model.LinearModel","store":"_DEFAULT_","features":[{"name":"title","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"description","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"keywords","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"popularity","norm":{"class":"org.apache.solr.ltr.norm.MinMaxNormalizer","params":{"min":"0.0","max":"10.0"}}},{"name":"text","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"queryIntentPerson","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"queryIntentCompany","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}}],"params":{"weights":{"title":0.0,"description":0.1,"keywords":0.2,"popularity":0.3,"text":0.4,"queryIntentPerson":0.1231231,"queryIntentCompany":0.12121211}}}]}%
   ➜  ~ curl 
http://localhost:8983/solr/small_models/select\?debugQuery\=false\&fl\=id%2C%20score%2C%5Bfeatures%5D\&indent\=true\&q.op\=OR\&q\=\*%3A\*\&rq\=%7B\!ltr%20model%3D6029760550880411648-after-fix%7D\&useParams\=
   {
     "responseHeader":{
       "zkConnected":true,
       "status":0,
       "QTime":4,
       "params":{
         "df":"text",
         "indent":"true",
         "echoParams":"all",
         "fl":"id, score,[features]",
         "q.op":"OR",
         "distrib.singlePass":"true",
         "rows":"10",
         "q":"*:*",
         "shards.tolerant":"true",
         "sow":"true",
         
"shards.preference":"replica.type:PULL,replica.location:local,replica.base:stable:hash",
         "wt":"json",
         "debugQuery":"false",
         "useParams":"",
         "rq":"{!ltr model=6029760550880411648-after-fix}",
         "rid":"null-191"}},
     
"response":{"numFound":2,"start":0,"maxScore":1.0,"numFoundExact":true,"docs":[
         {
           "id":"123",
           "score":3.5116758,
           
"[features]":"title=1.0,description=2.0,keywords=2.0,popularity=3.0,text=4.0,queryIntentPerson=5.0,queryIntentCompany=5.0"},
         {
           "id":"1223",
           "score":3.5116758,
           
"[features]":"title=1.0,description=2.0,keywords=2.0,popularity=3.0,text=4.0,queryIntentPerson=5.0,queryIntentCompany=5.0"}]
     }}`
   
   As for "Might some users prefer the existing behaviour i.e. more storage 
space but also more human readable?", only the file is stored as compacted 
JSON. If model-store endpoint will return a formatted JSON.
   Ex:
   `~ curl 
http://localhost:8983/solr/small_models/admin/file\?wt\=json\&_\=1698594868357\&file\=_schema_model-store.json\&contentType\=application%2Fjson%3Bcharset%3Dutf-8
   
{"initArgs":{},"initializedOn":"2023-10-29T15:44:47.459Z","updatedSinceInit":"2023-10-29T16:06:58.021Z","managedList":[{"name":"6029760550880411648-after-fix","class":"org.apache.solr.ltr.model.LinearModel","store":"_DEFAULT_","features":[{"name":"title","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"description","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"keywords","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"popularity","norm":{"class":"org.apache.solr.ltr.norm.MinMaxNormalizer","params":{"min":"0.0","max":"10.0"}}},{"name":"text","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"queryIntentPerson","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"queryIntentCompany","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}}],"params":{"weights":{"title":0.0,"description":0.1,"keywords":0.2,"popularity":0.3,"text":0.4,"queryIntentPerson":0.1231231
 
,"queryIntentCompany":0.12121211}}},{"name":"6029760550880411648","class":"org.apache.solr.ltr.model.LinearModel","store":"_DEFAULT_","features":[{"name":"title","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"description","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"keywords","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"popularity","norm":{"class":"org.apache.solr.ltr.norm.MinMaxNormalizer","params":{"min":"0.0","max":"10.0"}}},{"name":"text","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"queryIntentPerson","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"queryIntentCompany","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}}],"params":{"weights":{"title":0.0,"description":0.1,"keywords":0.2,"popularity":0.3,"text":0.4,"queryIntentPerson":0.1231231,"queryIntentCompany":0.12121211}}}]}%
   ➜  ~ curl http://localhost:8983/solr/small_models/schema/model-store
   {
     "responseHeader":{
       "status":0,
       "QTime":1
     },
     "models":[{
       "name":"6029760550880411648-after-fix",
       "class":"org.apache.solr.ltr.model.LinearModel",
       "store":"_DEFAULT_",
       "features":[{
         "name":"title",
         "norm":{
           "class":"org.apache.solr.ltr.norm.IdentityNormalizer"
         }
       },{
         "name":"description",
         "norm":{
           "class":"org.apache.solr.ltr.norm.IdentityNormalizer"
         }
       },{
         "name":"keywords",
         "norm":{
           "class":"org.apache.solr.ltr.norm.IdentityNormalizer"
         }
       },{
         "name":"popularity",
         "norm":{
           "class":"org.apache.solr.ltr.norm.MinMaxNormalizer",
           "params":{
             "min":"0.0",
             "max":"10.0"
           }
         }
       },{
         "name":"text",
         "norm":{
           "class":"org.apache.solr.ltr.norm.IdentityNormalizer"
         }
       },{
         "name":"queryIntentPerson",
         "norm":{
           "class":"org.apache.solr.ltr.norm.IdentityNormalizer"
         }
       },{
         "name":"queryIntentCompany",
         "norm":{
           "class":"org.apache.solr.ltr.norm.IdentityNormalizer"
         }
       }],
       "params":{
         "weights":{
           "title":0.0,
           "description":0.1,
           "keywords":0.2,
           "popularity":0.3,
           "text":0.4,
           "queryIntentPerson":0.1231231,
           "queryIntentCompany":0.12121211
         }
       }
     },{
       "name":"6029760550880411648",
       "class":"org.apache.solr.ltr.model.LinearModel",
       "store":"_DEFAULT_",
       "features":[{
         "name":"title",
         "norm":{
           "class":"org.apache.solr.ltr.norm.IdentityNormalizer"
         }
       },{
         "name":"description",
         "norm":{
           "class":"org.apache.solr.ltr.norm.IdentityNormalizer"
         }
       },{
         "name":"keywords",
         "norm":{
           "class":"org.apache.solr.ltr.norm.IdentityNormalizer"
         }
       },{
         "name":"popularity",
         "norm":{
           "class":"org.apache.solr.ltr.norm.MinMaxNormalizer",
           "params":{
             "min":"0.0",
             "max":"10.0"
           }
         }
       },{
         "name":"text",
         "norm":{
           "class":"org.apache.solr.ltr.norm.IdentityNormalizer"
         }
       },{
         "name":"queryIntentPerson",
         "norm":{
           "class":"org.apache.solr.ltr.norm.IdentityNormalizer"
         }
       },{
         "name":"queryIntentCompany",
         "norm":{
           "class":"org.apache.solr.ltr.norm.IdentityNormalizer"
         }
       }],
       "params":{
         "weights":{
           "title":0.0,
           "description":0.1,
           "keywords":0.2,
           "popularity":0.3,
           "text":0.4,
           "queryIntentPerson":0.1231231,
           "queryIntentCompany":0.12121211
         }
       }
     }]
   }%`
   
   Yes, a consistency between LTR model-store and LTR feature-store will be 
helpful. I will do a commit soon. 
   
   I do not think we should keep the both behaviours because this an 
improvement and not a new feature. The users should not be affected because the 
model-store API endpoint will return the same formatted JSON as before. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org

Reply via email to