Another way to handle this is to have your indexing code fork out to as many cores as the Solr indexing server has. It's far less work to have the code run itself that many times in parallel, and as long as your SQL queries and the tables they hit are properly indexed, the database shouldn't be a bottleneck. You just need to make sure the indexing server has the resources it needs, and obviously you never index against a query server: a query server is just a copy, tuned for fast reads rather than writes, unlike the indexer.
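Here's a minimal sketch of that fork-per-core idea, not anything from this thread: each worker takes a disjoint modulo slice of the source table and posts batches to Solr's JSON update handler with commits disabled. The host, core name, table, and column names are all hypothetical placeholders, and sqlite3 is just a stand-in for whatever SQL client you actually use.

import json
import multiprocessing
import sqlite3  # stand-in for your real SQL client

import requests

SOLR_UPDATE = "http://indexer:8983/solr/mycore/update"  # placeholder URL
NUM_WORKERS = multiprocessing.cpu_count()  # match the indexing server's cores
BATCH_SIZE = 1000

def post_batch(docs):
    # commit=false: defer the commit until every worker has finished
    resp = requests.post(
        SOLR_UPDATE,
        params={"commit": "false"},
        data=json.dumps(docs),
        headers={"Content-Type": "application/json"},
    )
    resp.raise_for_status()

def index_slice(worker_id):
    # Each worker owns the rows whose id falls in its modulo slice,
    # so no coordination between workers is needed.
    db = sqlite3.connect("source.db")
    cur = db.execute(
        "SELECT id, title, body FROM documents WHERE id % ? = ?",
        (NUM_WORKERS, worker_id),
    )
    batch = []
    for row_id, title, body in cur:
        batch.append({"id": row_id, "title": title, "body": body})
        if len(batch) >= BATCH_SIZE:
            post_batch(batch)
            batch = []
    if batch:
        post_batch(batch)

if __name__ == "__main__":
    with multiprocessing.Pool(NUM_WORKERS) as pool:
        pool.map(index_slice, range(NUM_WORKERS))
    # a single commit is issued separately once all workers are done

The modulo partitioning is what keeps this cheap: every worker runs the same query with different parameters, and the slices never overlap.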
> On Sep 29, 2022, at 2:21 PM, Andy Lester <a...@petdance.com> wrote:
>
>> On Sep 29, 2022, at 4:17 AM, Jan Høydahl <jan....@cominvent.com> wrote:
>>
>> * Index with multiple threads on the client, experiment to find a good
>>   number based on the number of CPUs on receiving side
>
> That may also mean having multiple clients. We went from taking about 8 hours
> to index our entire 42M rows to about 1.5 hours because we ran 10 indexer
> clients at once. Each indexer takes roughly 1/10th of the data and churns
> away. We don't have any of the clients do a commit. After the indexers are
> done, we run one more time through the queue with a commit at the end.
>
> As Jan says, make sure it's not your database that is the bottleneck, and
> experiment with how many clients you want to have going at once.
>
> Andy
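To make Andy's commit-at-the-end step concrete: since none of the clients commit, nothing becomes visible until one final request after the last pass through the queue. A sketch, again with a placeholder URL:

import requests

SOLR_UPDATE = "http://indexer:8983/solr/mycore/update"  # placeholder URL

# After every indexer client has finished (and the queue has been drained
# one last time), a single commit makes all the new documents visible.
requests.get(SOLR_UPDATE, params={"commit": "true"}).raise_for_status()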