Hi Florin and Matthias,

Thanks for sharing this!

Looking into where the JSON indentation in storage comes from -- from code 
reading only -- I think this is the code trail:
 * https://github.com/apache/solr/blob/releases/solr/9.4.0/solr/modules/ltr/src/java/org/apache/solr/ltr/store/rest/ManagedModelStore.java#L46
 * https://github.com/apache/solr/blob/releases/solr/9.4.0/solr/core/src/java/org/apache/solr/rest/ManagedResource.java#L348
 * https://github.com/apache/solr/blob/releases/solr/9.4.0/solr/core/src/java/org/apache/solr/rest/ManagedResource.java#L238
 * https://github.com/apache/solr/blob/releases/solr/9.4.0/solr/core/src/java/org/apache/solr/rest/ManagedResourceStorage.java#L443
 * https://github.com/apache/solr/blob/releases/solr/9.4.0/solr/solrj/src/java/org/apache/solr/common/util/Utils.java#L218-L227
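
To get a rough feel for how much the indentation costs, here is a small Python sketch. It is not Solr code and does not call any of the classes linked above; the model dictionary, its feature names and its weights are all made up for illustration, so the saving it prints will differ from what real models show.

```python
import json

# Hypothetical stand-in for an LTR model; feature names and weights are made up.
model = {
    "class": "org.apache.solr.ltr.model.LinearModel",
    "name": "myModelName",
    "features": [{"name": f"feature{i}"} for i in range(100)],
    "params": {"weights": {f"feature{i}": i / 100 for i in range(100)}},
}


def utf8_size(text):
    """Size of the serialized text in bytes when encoded as UTF-8."""
    return len(text.encode("utf-8"))


# Same data, two serializations: pretty-printed versus no optional whitespace.
indented = json.dumps(model, indent=2)
compact = json.dumps(model, separators=(",", ":"))

saving = 1 - utf8_size(compact) / utf8_size(indented)
print(f"indented: {utf8_size(indented)} bytes")
print(f"compact:  {utf8_size(compact)} bytes")
print(f"saving:   {saving:.0%}")
```

Both strings parse back to the same model, so the only difference is the whitespace that ends up in storage.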

Thinking out loud ...

... how might storing the model in non-JSON format work? I haven't looked into that.

... what are the concerns about DefaultWrapperModel usage? I'm guessing it's 
the split nature (i.e. the wrapper part of the model is in ZK but the resource 
part is on disk) -- is that so?

Looking at the DefaultWrapperModel javadocs
https://solr.apache.org/docs/9_4_0/modules/ltr/org/apache/solr/ltr/model/DefaultWrapperModel.html
for an example configuration, and noticing there the

  "resource": "models/myModel.json"

element made me wonder ...

... what if the resource being wrapped was not external (giving rise to the 
split nature scenario) but inlined in the model itself?

... Yes, something like

  "content": "{ \"class\": \"org.apache.solr.ltr.model.LinearModel\", \"name\": \"myModelName\", \"params\": { ... } }"

would be a bit human-unreadable but would it save enough space?
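
As a way to think about that question, the Python sketch below builds both shapes and compares their sizes once the outer document is indented. Everything here is an assumption for illustration: `content` is only the hypothetical parameter floated above (the javadocs example shows `resource`), the helper names are mine, and the model data is made up. It also assumes that storage would indent the outer JSON but leave the inner string value untouched.

```python
import json

# Hypothetical wrapped model; feature names and weights are made up.
inner_model = {
    "class": "org.apache.solr.ltr.model.LinearModel",
    "name": "myModelName",
    "params": {"weights": {f"feature{i}": 0.5 for i in range(100)}},
}


def wrap_inline(inner, wrapper_name):
    """Wrapper whose hypothetical 'content' param holds the model as a compact JSON string."""
    return {
        "class": "org.apache.solr.ltr.model.DefaultWrapperModel",
        "name": wrapper_name,
        # "content" is NOT an existing parameter; it is the idea being discussed.
        "params": {"content": json.dumps(inner, separators=(",", ":"))},
    }


def wrap_nested(inner, wrapper_name):
    """Same data as plain nested JSON, for comparison only (also hypothetical)."""
    return {
        "class": "org.apache.solr.ltr.model.DefaultWrapperModel",
        "name": wrapper_name,
        "params": {"model": inner},
    }


# Indent the outer document in both cases, as an indenting store would.
inline_stored = json.dumps(wrap_inline(inner_model, "myWrapper"), indent=2)
nested_stored = json.dumps(wrap_nested(inner_model, "myWrapper"), indent=2)

print(f"inline string, outer indented: {len(inline_stored)} bytes")
print(f"nested object, outer indented: {len(nested_stored)} bytes")
```

The inline string pays one extra backslash per quote, but it avoids a newline plus leading spaces per entry, which is where the potential saving would come from.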

And/or could a representation in a non-JSON format be of interest in some use cases?

  "format": "foobar",
  "content": " ... model representation in foobar format goes here ... "

https://github.com/apache/solr/pull/2018 explores the 
AlternativeFormatWrapperModel idea.
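
To make the `format` plus `content` idea a little more concrete, here is a minimal Python sketch of how a wrapper might pick a decoder based on the format name. It is not taken from the pull request and says nothing about how that code works; the `gzip+base64` format name, the `DECODERS` registry and the `unwrap` helper are all hypothetical.

```python
import base64
import gzip
import json


def decode_json(content):
    """Plain JSON text."""
    return json.loads(content)


def decode_gzip_base64(content):
    """Hypothetical format: gzip-compressed JSON, base64-encoded so it fits in a JSON string."""
    return json.loads(gzip.decompress(base64.b64decode(content)).decode("utf-8"))


# Hypothetical registry mapping a "format" value to the function that decodes "content".
DECODERS = {
    "json": decode_json,
    "gzip+base64": decode_gzip_base64,
}


def encode_gzip_base64(model):
    """Inverse of decode_gzip_base64, used when preparing the upload."""
    raw = json.dumps(model, separators=(",", ":")).encode("utf-8")
    return base64.b64encode(gzip.compress(raw)).decode("ascii")


def unwrap(params):
    """Return the wrapped model described by the 'format' and 'content' params."""
    fmt = params.get("format", "json")
    if fmt not in DECODERS:
        raise ValueError(f"unsupported format: {fmt}")
    return DECODERS[fmt](params["content"])


# Made-up model data for a round trip.
model = {"class": "org.apache.solr.ltr.model.LinearModel", "name": "myModelName",
         "params": {"weights": {f"feature{i}": 0.5 for i in range(100)}}}
params = {"format": "gzip+base64", "content": encode_gzip_base64(model)}
print(unwrap(params) == model)
```

Keeping the decoders in a lookup table means an unknown format fails loudly instead of being misread as JSON.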

Hope that helps.

Best wishes,
Christine

From: users@solr.apache.org At: 10/16/23 15:02:23 UTC+1:00
To: users@solr.apache.org
Subject: Re: Zk big files issues and model store

Hi Florin,

What has worked for me is making model deployment a software deployment
task and bundling the model in the JAR I deploy with Solr, as the
DefaultWrapperModel also loads resources from the classpath.

Cheers
Matthias


On Sun, Oct 15, 2023 at 8:54 AM Florin Babes <babesflo...@gmail.com> wrote:

> Hello,
> We reached the limit of ZK for storing LTR models. I want to avoid the
> usage of DefaultWrapperModel for as long as possible because we have
> deployed in a container orchestrator and this implementation can be
> really risky if you cannot guarantee the presence of the model on
> disk all the time.
> So I want to use the managed model feature to upload some bigger
> models, but ZK is dying with OOM. What we noticed is that Solr stores
> the models in an indented JSON file. By saving the models in compact
> JSON our models will be 60% smaller.
> Do you think that we should try to implement this? Could this work and
> allow us to postpone the moment of using DefaultWrapperModel?
> Thanks,
> Florin Babes
>

