You might want to try pg_bulkload, and have it kick out malformed rows.
It's packaged for RHEL6 and above, plus various other distros.
On 09/04/2018 01:13 PM, Holly Gibons wrote:
I'm using PostgreSQL 9.0 via pgAdmin III
I'm trying to build a PostgreSQL/PostGIS database using Entire country
files dataset <http://geonames.nga.mil/gns/html/namefiles.html> but I'm
getting missing data error
I'm wondering if the copy command is affected by diacritics or I've not
set the database up properly
Created a new database with UTF8 encoding
I built the table schema based on the given format
<http://geonames.nga.mil/gns/html/gis_countryfiles.html> (but using type
text for NM_MODIFY_DATE not varchar, having these as dates didn't make a
difference)
I used large text viewer to open the uncompressed countries.txt file and
copied the top 5 rows into a test file
Using PostgreSQL Copy this test file imported correctly so I know my
schema is correct
|copy my_table(List of columns ) from '\\Path\\To\\TestFile.txt' WITH
delimiter E'\t' csv HEADER; |
However when I tried to ingest the larger countries.txt (2.9GB) file I
get an error "missing data" for column xyz at line 12345 (Last column with
data in that row, NM_MODIFY_DATE)
Using large text viewer again I located this line and copied together with
proceeding and following lines into my test file and tried the copy again
but get the same error
I opened the test file in excel to see if there is columns missing
although not every row has data in each column they do all match
the problem row has UFI -3373955 & UNI 6329083
I don't know if this is relevant but looking at the database properties ,
in pgAdmin, the 'collection' & 'Character type' are both set as
"English_United Kingdom, 1252 " I didn't set this and creating a new DB
the options are "C", "English_United Kingdom, 1252 " or "POSIX"
Could someone suggest what I'm doing wrong?
Thank you
--
Angular momentum makes the world go 'round.