select count(*) from <table> is probably not using the index that your insert/select would, so I would not use that as a test for performance.
If customer_backup has an index, the insert-select will be performance-limited by updating that index. If you can do a *create table customer_backup* as *select * from customer where customer_id in (<id1>,<id2>,..); * I expect it to run quite fast as long as customer_id is indexed and analyzed. On Mon, Dec 14, 2020 at 9:37 AM Muhammad Bilal Jamil <mbjami...@gmail.com> wrote: > I think you can also increase the query performance by creating indexes? > > On Mon, 14 Dec 2020 at 11:36, Rob Sargent <robjsarg...@gmail.com> wrote: > >> >> >> > On Dec 14, 2020, at 4:47 AM, Thomas Kellerer <sham...@gmx.net> wrote: >> > >> > Karthik Shivashankar schrieb am 14.12.2020 um 12:38: >> >> I have a postgres(v9.5) table named customer holding 1 billion rows. >> >> It is not partitioned but it has an index against the primary key >> >> (integer). I need to keep a very few records (say, about 10k rows) >> >> and remove everything else. >> >> >> >> /insert into customer_backup select * from customer where customer_id >> in (<id1>,<id2>,..); / >> >> >> >> >> >> >> >> If I go for something like above I'm afraid the insert-select may take >> a very long time as when I ran >> >> >> >> /select count(*) from customer;/ >> >> >> >> it is taking about 45 minutes to return the count. >> > >> > Well, you need to compare the time with the same condition you use in >> your >> > CREATE TABLE .. AS SELECT statement, >> > >> > e.g.: >> > >> > select count(*) >> > from customer >> > where id in (....); >> > >> > Or: >> > >> > explain (analyze) >> > select * >> > from customer >> > where id in (....); >> > >> > >> > Regards >> > Thomas >> > >> As for the actually copy of the specific records, I would ‘where exists’ >> (even possibly with a temp table of ids) rather than in(id1..id10000) >> >> > >> >> >> >>