Re: [GENERAL] COPY from .csv File and Remove Duplicates [RESOLVED]

2011-08-12 Thread Rich Shepard
On Fri, 12 Aug 2011, David Johnston wrote: Thus, you need to replace the "*" in the SELECT with the specific columns that correspond to the columns listed in to INSERT portion of the query. David, Mea culpa! I should have seen this myself. Now the query works and I have about 6K duplicate p

Re: [GENERAL] COPY from .csv File and Remove Duplicates

2011-08-12 Thread David Johnston
INSERT INTO chem_too (lab_nbr, loc_name, sample_date, param, quant, units, qa_qc, easting, northing, remark) SELECT * FROM chemistry Natural Inner join ( SELECT loc_name, sample_date, param, Count(*) as duplicate_count FROM chemistry GROUP BY loc_name, sample_date, param) group

Re: [GENERAL] COPY from .csv File and Remove Duplicates

2011-08-12 Thread Rich Shepard
On Fri, 12 Aug 2011, David Johnston wrote: Select * From table Natural Inner join ( SELECT loc_name, sample_date, param, Count(*) as duplicate_count FROM table Group by loc_name, sample_date, param ) grouped Where duplicate_count > 1; Tried to use the above in an INSERT INTO statement to a c

Re: [GENERAL] COPY from .csv File and Remove Duplicates

2011-08-12 Thread Rich Shepard
On Fri, 12 Aug 2011, David Johnston wrote: Select * From table Natural Inner join ( SELECT loc_name, sample_date, param, Count(*) as duplicate_count FROM table Group by loc_name, sample_date, param ) grouped Where duplicate_count > 1 ; David, Thank you. I was close in my attempts, but not s

Re: [GENERAL] COPY from .csv File and Remove Duplicates

2011-08-12 Thread David Johnston
> A pointer to the appropriate syntax for retrieving the entire row when > count(loc_name, sample_date, param) > 1 would be much appreciated. > > Rich > Select * From table Natural Inner join ( SELECT loc_name, sample_date, param, Count(*) as duplicate_count FROM table Group by loc_name, sampl

Re: [GENERAL] COPY from .csv File and Remove Duplicates

2011-08-12 Thread Rich Shepard
On Thu, 11 Aug 2011, David Johnston wrote: If you have duplicates with matching real keys inserting into a staging table and then moving new records to the final table is your best option (in general it is better to do a two-step with a staging table since you can readily use Postgresql to perfo

Re: [GENERAL] COPY from .csv File and Remove Duplicates

2011-08-11 Thread Craig Ringer
On 12/08/2011 10:32 AM, David Johnston wrote: The general structure for the insert would be: INSERT INTO maintable (cols) SELECT cols FROM staging WHERE staging.idcols NOT IN (SELECT maintable.idcols FROM maintable); There may be more efficient ways to write the query but the idea is the same.

Re: [GENERAL] COPY from .csv File and Remove Duplicates

2011-08-11 Thread David Johnston
>> There is no true key, only an artificial key so I can ensure that rows are >> unique. That's in the main table with the 50K rows. No key column in the >> .csv file. If you have no true key then you have no way to ensure uniqueness. By adding an artificial key two records that are otherwise du

Re: [GENERAL] COPY from .csv File and Remove Duplicates

2011-08-11 Thread Craig Ringer
On 12/08/2011 7:13 AM, Rich Shepard wrote: I have a .csv file of approximately 10k rows to copy into this table. My two questions which have not been answered by reference to my postgres reference book or Google searches are: 1) Will the sequence automatically add the nextval() to each new reco

Re: [GENERAL] COPY from .csv File and Remove Duplicates

2011-08-11 Thread Rich Shepard
On Thu, 11 Aug 2011, David Johnston wrote: If you have duplicates with matching real keys inserting into a staging table and then moving new records to the final table is your best option (in general it is better to do a two-step with a staging table since you can readily use Postgresql to perfo

Re: [GENERAL] COPY from .csv File and Remove Duplicates

2011-08-11 Thread David Johnston
On Aug 11, 2011, at 19:13, Rich Shepard wrote: > A table has a sequence to generate a primary key for inserted records with > NULLs in that column. > > I have a .csv file of approximately 10k rows to copy into this table. My > two questions which have not been answered by reference to my postg

[GENERAL] COPY from .csv File and Remove Duplicates

2011-08-11 Thread Rich Shepard
A table has a sequence to generate a primary key for inserted records with NULLs in that column. I have a .csv file of approximately 10k rows to copy into this table. My two questions which have not been answered by reference to my postgres reference book or Google searches are: 1) Will th