florinbabes commented on PR #2030: URL: https://github.com/apache/solr/pull/2030#issuecomment-1784155535
> Hi Florin! > > Thanks for opening this pull request, with a test and scoped to only change the learning-to-rank model store. > > I've added a commit with minor tweaks, feel free to revert or amend. > > Taking a step back from the code, here's a couple of thoughts or questions: > > * Do we need to consider backwards compatibility i.e. existing storage behaviour changing and/or how to communicate the change in the upgrade notes? > * Might some users prefer the existing behaviour i.e. more storage space but also more human readable? > * Would consistency between the LTR model store and the LTR feature store be helpful? > * Might it be possible to support both the existing and the new behaviour? > > * A system property e.g. `-Dsolr.ltr.model-store.indentSize=2` and `-Dsolr.ltr.feature-store.indentSize=2` perhaps? > * A configurable somehow, how? > * A parameter that is passed during upload e.g. different models could have different indentation? > * An alternative endpoint for uploading? > * Something else? > > What do you think? Hello @cpoerschke, Thanks for the fix. The fix is backwards compatible with old saved models. When an update (add/delete) to the model store is done, the new file will be saved as compacted JSON. `curl http://localhost:8983/solr/small_models/admin/file\?wt\=json\&_\=1698594868357\&file\=_schema_model-store.json\&contentType\=application%2Fjson%3Bcharset%3Dutf-8 { "initArgs":{}, "initializedOn":"2023-10-29T15:25:29.481Z", "managedList":[{ "name":"6029760550880411648", "class":"org.apache.solr.ltr.model.LinearModel", "store":"_DEFAULT_", "features":[ { "name":"title", "norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}}, { "name":"description", "norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}}, { "name":"keywords", "norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}}, { "name":"popularity", "norm":{ "class":"org.apache.solr.ltr.norm.MinMaxNormalizer", "params":{ "min":"0.0", "max":"10.0"}}}, { "name":"text", "norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}}, { "name":"queryIntentPerson", "norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}}, { "name":"queryIntentCompany", "norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}}], "params":{"weights":{ "title":0.0, "description":0.1, "keywords":0.2, "popularity":0.3, "text":0.4, "queryIntentPerson":0.1231231, "queryIntentCompany":0.12121211}}}]}% ➜ ~ curl -XPUT 'http://localhost:8983/solr/small_models/schema/model-store' --data-binary "@/tmp/linear-model-new.json" -H 'Content-type:application/json' { "responseHeader":{ "status":0, "QTime":6 } }% ➜ ~ curl http://localhost:8983/solr/small_models/admin/file\?wt\=json\&_\=1698594868357\&file\=_schema_model-store.json\&contentType\=application%2Fjson%3Bcharset%3Dutf-8 {"initArgs":{},"initializedOn":"2023-10-29T15:44:47.459Z","updatedSinceInit":"2023-10-29T15:45:15.689Z","managedList":[{"name":"6029760550880411648-after-fix","class":"org.apache.solr.ltr.model.LinearModel","store":"_DEFAULT_","features":[{"name":"title","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"description","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"keywords","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"popularity","norm":{"class":"org.apache.solr.ltr.norm.MinMaxNormalizer","params":{"min":"0.0","max":"10.0"}}},{"name":"text","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"queryIntentPerson","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"queryIntentCompany","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}}],"params":{"weights":{"title":0.0,"description":0.1,"keywords":0.2,"popularity":0.3,"text":0.4,"queryIntentPerson":0.1231231 ,"queryIntentCompany":0.12121211}}},{"name":"6029760550880411648","class":"org.apache.solr.ltr.model.LinearModel","store":"_DEFAULT_","features":[{"name":"title","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"description","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"keywords","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"popularity","norm":{"class":"org.apache.solr.ltr.norm.MinMaxNormalizer","params":{"min":"0.0","max":"10.0"}}},{"name":"text","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"queryIntentPerson","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"queryIntentCompany","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}}],"params":{"weights":{"title":0.0,"description":0.1,"keywords":0.2,"popularity":0.3,"text":0.4,"queryIntentPerson":0.1231231,"queryIntentCompany":0.12121211}}}]}% ➜ ~ curl http://localhost:8983/solr/small_models/select\?debugQuery\=false\&fl\=id%2C%20score%2C%5Bfeatures%5D\&indent\=true\&q.op\=OR\&q\=\*%3A\*\&rq\=%7B\!ltr%20model%3D6029760550880411648-after-fix%7D\&useParams\= { "responseHeader":{ "zkConnected":true, "status":0, "QTime":4, "params":{ "df":"text", "indent":"true", "echoParams":"all", "fl":"id, score,[features]", "q.op":"OR", "distrib.singlePass":"true", "rows":"10", "q":"*:*", "shards.tolerant":"true", "sow":"true", "shards.preference":"replica.type:PULL,replica.location:local,replica.base:stable:hash", "wt":"json", "debugQuery":"false", "useParams":"", "rq":"{!ltr model=6029760550880411648-after-fix}", "rid":"null-191"}}, "response":{"numFound":2,"start":0,"maxScore":1.0,"numFoundExact":true,"docs":[ { "id":"123", "score":3.5116758, "[features]":"title=1.0,description=2.0,keywords=2.0,popularity=3.0,text=4.0,queryIntentPerson=5.0,queryIntentCompany=5.0"}, { "id":"1223", "score":3.5116758, "[features]":"title=1.0,description=2.0,keywords=2.0,popularity=3.0,text=4.0,queryIntentPerson=5.0,queryIntentCompany=5.0"}] }}` As for "Might some users prefer the existing behaviour i.e. more storage space but also more human readable?", only the file is stored as compacted JSON. If model-store endpoint will return a formatted JSON. Ex: `~ curl http://localhost:8983/solr/small_models/admin/file\?wt\=json\&_\=1698594868357\&file\=_schema_model-store.json\&contentType\=application%2Fjson%3Bcharset%3Dutf-8 {"initArgs":{},"initializedOn":"2023-10-29T15:44:47.459Z","updatedSinceInit":"2023-10-29T16:06:58.021Z","managedList":[{"name":"6029760550880411648-after-fix","class":"org.apache.solr.ltr.model.LinearModel","store":"_DEFAULT_","features":[{"name":"title","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"description","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"keywords","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"popularity","norm":{"class":"org.apache.solr.ltr.norm.MinMaxNormalizer","params":{"min":"0.0","max":"10.0"}}},{"name":"text","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"queryIntentPerson","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"queryIntentCompany","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}}],"params":{"weights":{"title":0.0,"description":0.1,"keywords":0.2,"popularity":0.3,"text":0.4,"queryIntentPerson":0.1231231 ,"queryIntentCompany":0.12121211}}},{"name":"6029760550880411648","class":"org.apache.solr.ltr.model.LinearModel","store":"_DEFAULT_","features":[{"name":"title","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"description","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"keywords","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"popularity","norm":{"class":"org.apache.solr.ltr.norm.MinMaxNormalizer","params":{"min":"0.0","max":"10.0"}}},{"name":"text","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"queryIntentPerson","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}},{"name":"queryIntentCompany","norm":{"class":"org.apache.solr.ltr.norm.IdentityNormalizer"}}],"params":{"weights":{"title":0.0,"description":0.1,"keywords":0.2,"popularity":0.3,"text":0.4,"queryIntentPerson":0.1231231,"queryIntentCompany":0.12121211}}}]}% ➜ ~ curl http://localhost:8983/solr/small_models/schema/model-store { "responseHeader":{ "status":0, "QTime":1 }, "models":[{ "name":"6029760550880411648-after-fix", "class":"org.apache.solr.ltr.model.LinearModel", "store":"_DEFAULT_", "features":[{ "name":"title", "norm":{ "class":"org.apache.solr.ltr.norm.IdentityNormalizer" } },{ "name":"description", "norm":{ "class":"org.apache.solr.ltr.norm.IdentityNormalizer" } },{ "name":"keywords", "norm":{ "class":"org.apache.solr.ltr.norm.IdentityNormalizer" } },{ "name":"popularity", "norm":{ "class":"org.apache.solr.ltr.norm.MinMaxNormalizer", "params":{ "min":"0.0", "max":"10.0" } } },{ "name":"text", "norm":{ "class":"org.apache.solr.ltr.norm.IdentityNormalizer" } },{ "name":"queryIntentPerson", "norm":{ "class":"org.apache.solr.ltr.norm.IdentityNormalizer" } },{ "name":"queryIntentCompany", "norm":{ "class":"org.apache.solr.ltr.norm.IdentityNormalizer" } }], "params":{ "weights":{ "title":0.0, "description":0.1, "keywords":0.2, "popularity":0.3, "text":0.4, "queryIntentPerson":0.1231231, "queryIntentCompany":0.12121211 } } },{ "name":"6029760550880411648", "class":"org.apache.solr.ltr.model.LinearModel", "store":"_DEFAULT_", "features":[{ "name":"title", "norm":{ "class":"org.apache.solr.ltr.norm.IdentityNormalizer" } },{ "name":"description", "norm":{ "class":"org.apache.solr.ltr.norm.IdentityNormalizer" } },{ "name":"keywords", "norm":{ "class":"org.apache.solr.ltr.norm.IdentityNormalizer" } },{ "name":"popularity", "norm":{ "class":"org.apache.solr.ltr.norm.MinMaxNormalizer", "params":{ "min":"0.0", "max":"10.0" } } },{ "name":"text", "norm":{ "class":"org.apache.solr.ltr.norm.IdentityNormalizer" } },{ "name":"queryIntentPerson", "norm":{ "class":"org.apache.solr.ltr.norm.IdentityNormalizer" } },{ "name":"queryIntentCompany", "norm":{ "class":"org.apache.solr.ltr.norm.IdentityNormalizer" } }], "params":{ "weights":{ "title":0.0, "description":0.1, "keywords":0.2, "popularity":0.3, "text":0.4, "queryIntentPerson":0.1231231, "queryIntentCompany":0.12121211 } } }] }%` Yes, a consistency between LTR model-store and LTR feature-store will be helpful. I will do a commit soon. I do not think we should keep the both behaviours because this an improvement and not a new feature. The users should not be affected because the model-store API endpoint will return the same formatted JSON as before. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org