On 2005-02-25, "Dave Held" <[EMAIL PROTECTED]> wrote:
> A possibility that I would like to pursue is to keep the staging data
> from the previous day, do a COPY TO, import the new data into
> another staging table with a COPY FROM, then export the fresh
> data with another COPY TO. Then, I can write a fast C/C++
> program to do a line-by-line comparison of each record, isolating
> the ones that have changed from the previous day. I can then
> emit those records in a change file that should be relatively small
> and easy to update.
I have an application that does something like this, but rather than use
an external program, I do the comparison in the database itself:

- import data from the external system into a temporary table
- compare the temporary table against the live data (a full outer join is
  a convenient way of doing this - I create an index on the temp table
  first)
- perform insert/update/delete for each record that was added, changed or
  removed

In my case the compare/update is in a pl/pgsql function. My data is only
2-3 million rows, a bit smaller than yours, but I have to update hourly,
not daily, and spend no more than 5-10 minutes on each update (currently
I can do it in 5: 2 to load the data, 3 to do the compare/update).

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if
your joining column's datatypes do not match
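For the archives, the compare step above might look roughly like this. This is only a sketch, not Andrew's actual function: the table names (`staging`, `live`) and columns (`id` primary key, `val` payload) are hypothetical, and `IS DISTINCT FROM` is used so that NULLs compare sanely:

```sql
-- Hypothetical schema: live is the production table, staging holds the
-- freshly COPY'd feed. Index the temp table before joining, as above.
CREATE INDEX staging_id_idx ON staging (id);
ANALYZE staging;

-- Inserts: rows in the feed with no matching live row.
INSERT INTO live (id, val)
SELECT s.id, s.val
FROM staging s LEFT JOIN live l ON l.id = s.id
WHERE l.id IS NULL;

-- Updates: rows present in both but with a different payload.
UPDATE live l
SET val = s.val
FROM staging s
WHERE l.id = s.id
  AND l.val IS DISTINCT FROM s.val;

-- Deletes: live rows that disappeared from the feed.
DELETE FROM live l
WHERE NOT EXISTS (SELECT 1 FROM staging s WHERE s.id = l.id);
```

A single FULL OUTER JOIN of staging against live, as described, lets you classify all three cases in one pass inside a pl/pgsql loop instead; the three separate statements here are just the easiest way to show the idea.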