On 11/11/2023 12:32, Vince McMahon wrote:
> What is the fastest way to load and index this large and wide CSV file?  It
> is taking too long, 20+ hours, now.

I am assuming here that you are sending the CSV data directly to Solr and letting Solr parse it into documents. If that is incorrect, please fully describe your indexing software.

How many total documents are being indexed in those 20 hours?

How many threads do you have indexing simultaneously? How many CSV lines are you sending in each batch?

When I was maintaining large-ish Solr installs, I was doing the indexing single-threaded and it would do about 1000 docs per second. Indexing with multiple threads is the secret to making Solr index quickly.
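
As a rough sketch of what multi-threaded, batched indexing could look like, here is a small Python example that splits a CSV file into batches and posts them to Solr's update handler from several threads. The URL, collection name, batch size, and thread count are placeholders I made up for illustration, not recommendations, and the helper names (csv_batches, post_to_solr, index_parallel) are hypothetical. It also assumes no quoted field contains an embedded newline, since it splits on lines.

```python
import itertools
import urllib.request
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

# Assumed values for illustration only; adjust to the real installation.
SOLR_UPDATE_URL = "http://localhost:8983/solr/mycollection/update"
BATCH_SIZE = 1000  # CSV data lines per request
WORKERS = 4        # simultaneous indexing threads


def csv_batches(lines, batch_size=BATCH_SIZE):
    """Yield CSV payloads of up to batch_size data lines, repeating the header.

    `lines` is an iterable of lines that keep their newline endings, such as
    an open text file. Assumes no field contains an embedded newline.
    """
    it = iter(lines)
    header = next(it, None)
    if header is None:
        return
    while True:
        chunk = list(itertools.islice(it, batch_size))
        if not chunk:
            return
        yield "".join([header] + chunk)


def post_to_solr(payload, url=SOLR_UPDATE_URL):
    """POST one CSV batch to the update handler and return the HTTP status."""
    req = urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/csv; charset=utf-8"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


def index_parallel(lines, send=post_to_solr, workers=WORKERS,
                   batch_size=BATCH_SIZE):
    """Send batches from `workers` threads, keeping only a few in flight.

    Results come back in completion order, not file order.
    """
    results = []
    pending = set()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for payload in csv_batches(lines, batch_size):
            # Bound the queue so a huge file is not read into memory at once.
            if len(pending) >= workers * 2:
                done, pending = wait(pending, return_when=FIRST_COMPLETED)
                results.extend(f.result() for f in done)
            pending.add(pool.submit(send, payload))
        results.extend(f.result() for f in pending)
    return results
```

Usage would be something like `with open("data.csv", encoding="utf-8") as f: index_parallel(f)`. The sketch does not issue a commit, so the documents only become searchable once a commit happens, either via autoCommit settings or an explicit commit after the last batch.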

Thanks,
Shawn
