Re: Solr as a dedicated data store?

dmitri maziuk Fri, 08 Apr 2022 10:53:34 -0700

On 2022-04-07 11:51 PM, Shawn Heisey wrote:
...

As I understand it, ES offers reindex capability by storing the entire input document into a field in the index. Which means that the index will be a lot bigger than it needs to be, which is going to affect performance. If the field is not indexed, then the performance impact may not be huge, but it will not be zero. And it wouldn't really improve the speed of a full reindex, it just makes it possible to do a reindex without an external data source.
The same thing can be done with Solr, and it is something I would definitely say needs to be part of any index design where Solr will be a primary data store. That capability should be available in Solr, but I do not think it should be enabled by default.

What would be the advantage over dumping the documents into a text file (xml, json) and doing a full re-import? In principle you could dump everything Solr needs into the file and only check if it's all there during the import; that plus the protocol overhead would be the only downside. And deleting the existing index will take a little extra time.
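
To make that concrete, here is a rough sketch of what I have in mind, not a tested import tool. The field names in REQUIRED_FIELDS, the one-document-per-line file layout, and the helper names are all made up for illustration; the only Solr-specific part is posting a JSON array of documents to the collection's /update handler.

```python
import json
import urllib.request

# Hypothetical list of fields the schema cannot do without.
REQUIRED_FIELDS = ("id", "title")


def dump_docs(docs, path):
    """Write one JSON document per line; sorted keys keep diffs small."""
    with open(path, "w", encoding="utf-8") as fh:
        for doc in docs:
            fh.write(json.dumps(doc, sort_keys=True, ensure_ascii=False) + "\n")


def load_docs(path, required=REQUIRED_FIELDS):
    """Yield documents from the dump, failing if a required field is missing."""
    with open(path, encoding="utf-8") as fh:
        for lineno, line in enumerate(fh, start=1):
            doc = json.loads(line)
            missing = [name for name in required if name not in doc]
            if missing:
                raise ValueError(f"line {lineno}: missing fields {missing}")
            yield doc


def post_batch(docs, update_url):
    """POST a list of documents to Solr's JSON update handler."""
    req = urllib.request.Request(
        update_url,
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

A full re-import would then be something like post_batch(list(load_docs("dump.jsonl")), "http://localhost:8983/solr/mycollection/update?commit=true"), where the host and collection name are placeholders and a real run would send the documents in smaller batches.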

The upside is that we can stick the files into git and have versions, it should compress really well, we can clone it to off-site storage, etc.


Dima
