It's quite common to hear about the benefits of sharding. Until we hit the I/O bound on our machines, sharding is likely to reduce query time. Furthermore, working on smaller indexes makes each individual search faster on the smaller nodes. But what about the other way around? What if we shard much more than needed? Are we also going to see an increase in query time (due to the overhead of distributing the query and aggregating the results)?
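To make the trade-off concrete, here is a toy model of it. This is only a sketch under loud assumptions: shards are searched fully in parallel, per-shard search time grows linearly with the number of docs in the shard, and distribution plus aggregation cost grows linearly with the number of shards. The function name and both constants (`per_doc_cost_ms`, `per_shard_overhead_ms`) are hypothetical, not measurements from any real engine.

```python
def estimated_query_ms(total_docs, shards,
                       per_doc_cost_ms=1e-6,       # hypothetical: search cost per doc in a shard
                       per_shard_overhead_ms=2.0):  # hypothetical: fan-out + merge cost per shard
    """Toy estimate of query time for a given shard count."""
    docs_per_shard = total_docs / shards
    # Shards are assumed to run in parallel, so one shard's time counts.
    search_ms = docs_per_shard * per_doc_cost_ms
    # Distribution and aggregation overhead is assumed to scale with shard count.
    overhead_ms = shards * per_shard_overhead_ms
    return search_ms + overhead_ms


if __name__ == "__main__":
    for shards in (1, 2, 4, 8, 16, 32):
        print(shards, round(estimated_query_ms(100_000_000, shards), 2))
```

With these made-up constants the estimate falls as shards are added, bottoms out around 8, and then rises again as the per-shard overhead dominates. Real numbers depend on hardware, query mix, and caching, so the shape of the curve is the only point here, not the values.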
Example: 100,000,000 docs across 16 shards. Wouldn't it be more effective to have 4 or 8 shards at most? I suspect that prototyping is the right answer, but are there any well-motivated general recommendations? Any resources to study?

Cheers
--
--------------------------
Benedetti Alessandro
Visiting card - http://about.me/alessandro_benedetti
Blog - http://alexbenedetti.blogspot.co.uk

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience - 1794 England
