On 12/15/23 05:41, Vince McMahon wrote:
Ishan, you are right.  Doing multithreaded Indexing is going much faster.
I found out after the remote machine became unresponsive very quickly; it
crashed.  lol.
FWIW I got better results posting docs in batches from a single thread. The work is in a "private org" on GitLab so I can't post a link to the code, but the basic layout is a DB reader that yields rows and a writer that does requests.post() of a list of JSON docs, with a DB row -> JSON doc transformer in between.

I played with the batch size as well as an async/await queue before leaving it single-threaded with a batch size of 5K docs: larger batches gave no speed advantage in our setup. And it doesn't DDoS the index. ;)
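
A rough sketch of that layout, since I can't share the real code. Everything named here is made up for illustration: row_to_doc(), the field names, index_rows(), and the URL in the usage comment are all placeholders, and the real transformer is obviously more involved. The only assumptions are that the reader is any iterable of dict-like rows and that the index accepts a JSON list of docs via POST.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def batched(items: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Yield consecutive lists of at most `size` items."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def row_to_doc(row: dict) -> dict:
    """Hypothetical DB row -> JSON doc transformer (field names are placeholders)."""
    return {"id": str(row["id"]), "title": row["title"]}


def post_batch(url: str, docs: List[dict]) -> None:
    """Send one batch as a JSON list; raise on a non-2xx response."""
    import requests  # third-party; imported here so the rest runs without it

    resp = requests.post(url, json=docs, timeout=60)
    resp.raise_for_status()


def index_rows(
    rows: Iterable[dict],
    send: Callable[[List[dict]], None],
    batch_size: int = 5000,
) -> int:
    """Single-threaded loop: transform rows, send them batch by batch, return the count."""
    total = 0
    for batch in batched((row_to_doc(r) for r in rows), batch_size):
        send(batch)  # blocks until the index has answered, so no flood of requests
        total += len(batch)
    return total


# Usage (placeholder URL, and read_rows() stands in for your DB reader):
#   index_rows(read_rows(), lambda docs: post_batch("http://host/update", docs))
```

Because each post blocks until the index responds, the writer never has more than one request in flight, which is what keeps it from overwhelming the remote machine.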

Dima
