Sure. Let's start with the simplest stream expression. This one only targets the person collection.

*Stream Expression:*

search(person,
  q="((((SmartSearchS:"france [$CU] [$PRJ] [$REC] "~100)^4 OR (SmartSearchS:"france [$CU] [$PRJ] [$RECL] "~100)^3 OR (SmartSearchS:"france [$CU] [$PRJ] "~100)^2) OR (((SmartSearchS:(france*))) OR ((SmartSearchS:("france")))^3)) AND ((*:* -StatusSFD:("\*\*\*System Delete\*\*\*")) AND type_level:(parent)))",
  fl="PersonIDDoc,score",
  sort="score desc,PersonIDDoc desc",
  rows="1000")

*Schema*

<field name="PersonIDDoc" type="string" indexed="true" stored="true" docValues="true" />

*No sharding*

1 shard, 45.38 GB, with 64,348,740 docs
Stream expression time: 660 ms

*Sharding*

2 shards, 23 GB each
Stream expression time: 4000 ms

On Wed, 10 May 2023 at 04:45, Joel Bernstein <joels...@gmail.com> wrote:

> Can you share the expressions? Then we can discuss where the sharding comes
> into play.
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Tue, May 9, 2023 at 1:17 PM Sergio García Maroto <marot...@gmail.com>
> wrote:
>
> > Hi,
> >
> > I am working currently on implementing sharding on current Solr Cloud
> > Cluster.
> > Main idea is to be able to scale horizontally.
> >
> > At the moment, without sharding we have all collections sitting on all
> > servers.
> > We have as well pretty heavy streaming expressions returning many ids.
> > Average of 300,000 ids to join.
> >
> > After doing sharding I see a huge increase on CPU and memory usage.
> > Making queries way slower comparing sharding to not sharding.
> >
> > I guess that's expected because the joins need to send data across
> > servers over network.
> >
> > Any thoughts on best practices here. I guess a possible approach is to
> > split shards in more.
> >
> > Regards
> > Sergio