Re: Document serializable representation

Denis Bazhenov Thu, 30 Mar 2017 02:02:45 -0700

We have already done this, many years ago :)

At the moment we have 7 shards. The problem with adding more shards is that 
search becomes less cost-effective (in terms of cluster CPU time per request) 
as you split the index into more shards. Considering that response time is good 
enough and that search nodes account for ~90% of the cluster's hardware budget, 
it’s much more cost-effective to split analysis from IndexWriter than to split 
the index into more shards. More shards would simply require us to put 
disproportionately more hardware into the cluster.


> On Mar 30, 2017, at 18:36, Uwe Schindler <u...@thetaphi.de> wrote:
> 
> What you would better do is to just split your index into multiple shards and 
> have separate IndexWriter instances on different machines. Those can act on 
> their own. This is what Elasticsearch or Solr are doing: They accept the 
> document, decide which shard they should be located and transfer the plain 
> fieldname:value pairs over the network. Each node then creates Lucene 
> IndexableDocuments out of it and passes to their own IndexWriter. 
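
For illustration, the routing step described in the quoted text could look
roughly like the sketch below. This is a minimal sketch under stated
assumptions, not the actual Elasticsearch or Solr code: ShardRouter and
shardFor are hypothetical names, and String.hashCode() stands in for whatever
hash function a real system uses. The shard count of 7 is the one mentioned
above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: picks a shard for a document from its id.
public class ShardRouter {
    private final int numShards;

    public ShardRouter(int numShards) {
        if (numShards <= 0) {
            throw new IllegalArgumentException("numShards must be positive");
        }
        this.numShards = numShards;
    }

    // Assumption: String.hashCode() is a stand-in hash function.
    // floorMod keeps the result in [0, numShards) even for negative hashes.
    public int shardFor(String docId) {
        return Math.floorMod(docId.hashCode(), numShards);
    }

    public static void main(String[] args) {
        ShardRouter router = new ShardRouter(7);

        // Plain fieldname:value pairs, as they would travel over the network
        // before the receiving node builds a Lucene document from them.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("id", "doc-42");
        fields.put("title", "Document serializable representation");

        int shard = router.shardFor(fields.get("id"));
        System.out.println(fields.get("id") + " -> shard " + shard);
    }
}
```

In this sketch the sending side only hashes the id and ships raw field values,
so analysis still happens on the node that owns the IndexWriter.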

---
Denis Bazhenov <dot...@gmail.com>
