Re: Merge David and Goliath tables efficiently

2023-06-17 Thread Tomas Vondra
On 6/17/23 23:42, nicolas paris wrote:
>>> My interpretation reading the query plan is: well-sized small
>>> batches of upserts leverage the indexes, while the regular join
>>> chooses the sequential scan, including sorting and hashing, which
>>> takes forever and consumes resources, including disk.
>>

Re: Merge David and Goliath tables efficiently

2023-06-17 Thread nicolas paris
> > My interpretation reading the query plan is: well-sized small
> > batches of upserts leverage the indexes, while the regular join
> > chooses the sequential scan, including sorting and hashing, which
> > takes forever and consumes resources, including disk.
>
> You may be right, but it's hard to tell

Re: Merge David and Goliath tables efficiently

2023-06-17 Thread Tomas Vondra
On 6/17/23 15:48, Nicolas Paris wrote:
> In my use case I have a 2 billion row / 1 TB table. I have daily data to upsert,
> around 2 million rows, with say 50% inserts, based on the primary key, in a freshly
> analyzed table.
>
> I have tested multiple strategies to merge the data, all based on a first stage
> to

Merge David and Goliath tables efficiently

2023-06-17 Thread Nicolas Paris
In my use case I have a 2 billion row / 1 TB table. I have daily data to upsert, around 2 million rows, with say 50% inserts, based on the primary key, in a freshly analyzed table. I have tested multiple strategies to merge the data, all based on a first stage to copy the 2M dataset into a staging unlogged / index
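
For readers skimming the thread, here is a rough sketch of what a "staging table plus small batched upserts" merge can look like. It is not the poster's actual code (the message above is cut off before the strategies are listed). It uses SQLite through Python's standard library as a stand-in for PostgreSQL so that it runs anywhere, and the table names (goliath, staging), the columns (id, payload), and the batch size are all hypothetical. It also assumes non-negative integer primary keys.

```python
import sqlite3


def build_demo():
    """Create a tiny in-memory stand-in: 10 existing rows, 10 staged rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE goliath (id INTEGER PRIMARY KEY, payload TEXT)")
    conn.execute("CREATE TABLE staging (id INTEGER PRIMARY KEY, payload TEXT)")
    conn.executemany("INSERT INTO goliath VALUES (?, ?)",
                     [(i, "old") for i in range(10)])
    # ids 5..9 already exist (updates), ids 10..14 are new (inserts): 50% inserts
    conn.executemany("INSERT INTO staging VALUES (?, ?)",
                     [(i, "new") for i in range(5, 15)])
    conn.commit()
    return conn


def upsert_in_batches(conn, batch_size):
    """Merge staging into goliath in primary-key-ordered batches.

    Each batch covers a key range (last_id, upper], so both tables can be
    probed through their primary-key index instead of being scanned whole.
    Returns the number of staged rows processed.
    """
    total = 0
    last_id = -1  # assumption: keys are non-negative integers
    while True:
        ids = [row[0] for row in conn.execute(
            "SELECT id FROM staging WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size))]
        if not ids:
            break
        upper = ids[-1]
        conn.execute(
            """
            INSERT INTO goliath (id, payload)
            SELECT id, payload FROM staging WHERE id > ? AND id <= ?
            ON CONFLICT (id) DO UPDATE SET payload = excluded.payload
            """,
            (last_id, upper))
        conn.commit()  # one short transaction per batch
        total += len(ids)
        last_id = upper
    return total
```

Calling upsert_in_batches(build_demo(), 4) processes the ten staged rows in three batches. Whether small batches actually beat a single large join on a real PostgreSQL table depends on the plans the planner picks, which is the question the replies in this thread discuss.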