I think you can also increase the query performance by creating indexes? On Mon, 14 Dec 2020 at 11:36, Rob Sargent <robjsarg...@gmail.com> wrote:
> > > > On Dec 14, 2020, at 4:47 AM, Thomas Kellerer <sham...@gmx.net> wrote: > > > > Karthik Shivashankar schrieb am 14.12.2020 um 12:38: > >> I have a postgres(v9.5) table named customer holding 1 billion rows. > >> It is not partitioned but it has an index against the primary key > >> (integer). I need to keep a very few records (say, about 10k rows) > >> and remove everything else. > >> > >> /insert into customer_backup select * from customer where customer_id > in (<id1>,<id2>,..); / > >> > >> > >> > >> If I go for something like above I'm afraid the insert-select may take > a very long time as when I ran > >> > >> /select count(*) from customer;/ > >> > >> it is taking about 45 minutes to return the count. > > > > Well, you need to compare the time with the same condition you use in > your > > CREATE TABLE .. AS SELECT statement, > > > > e.g.: > > > > select count(*) > > from customer > > where id in (....); > > > > Or: > > > > explain (analyze) > > select * > > from customer > > where id in (....); > > > > > > Regards > > Thomas > > > As for the actually copy of the specific records, I would ‘where exists’ > (even possibly with a temp table of ids) rather than in(id1..id10000) > > > > > > >