Re: \COPY to accept non UTF-8 chars in CHAR columns

2020-03-28 Thread Andrew Gierth
> "Matthias" == Matthias Apitz writes: Matthias> i.e. 0xc3 is translated to 0xc383 and the 2nd half, the Matthias> 0xbc to 0xc2bc, both translations have nothing to do with Matthias> the original split 0xc3bc, and perhaps in this case it Matthias> would be better to spill out a bl
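The byte values quoted above are consistent with each half of the split sequence being read on its own as ISO-8859-1 and then re-encoded as UTF-8. That reading is an assumption, since the message does not say which tool did the translation. A minimal Python sketch of it, with a hypothetical helper name:

```python
# Assumption: each stray byte is interpreted as ISO-8859-1 (Latin-1)
# and then re-encoded as UTF-8, one byte at a time.
def latin1_byte_to_utf8(byte_value: int) -> bytes:
    """Re-encode a single Latin-1 byte as UTF-8 (hypothetical helper)."""
    return bytes([byte_value]).decode("latin-1").encode("utf-8")

first_half = latin1_byte_to_utf8(0xC3)   # U+00C3 becomes bytes c3 83
second_half = latin1_byte_to_utf8(0xBC)  # U+00BC becomes bytes c2 bc

# Kept together, the two bytes form one valid UTF-8 character, U+00FC.
intact = bytes([0xC3, 0xBC]).decode("utf-8")

print(first_half.hex(), second_half.hex(), intact)  # c383 c2bc ü
```

Once the two bytes are separated, neither half carries enough information to recover the original character.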

Re: \COPY to accept non UTF-8 chars in CHAR columns

2020-03-28 Thread Matthias Apitz
On Saturday, March 28, 2020 at 09:40:30 a.m. +1300, Thomas Munro wrote: > On Sat, Mar 28, 2020 at 4:46 AM Tom Lane wrote: > > Matthias Apitz writes: > > > In short, is there a way to let \COPY accept such broken ISO bytes, just > > > complaining about, but not stopping the insert of the

Re: \COPY to accept non UTF-8 chars in CHAR columns

2020-03-27 Thread Rory Campbell-Lange
On 27/03/20, Andrew Gierth (and...@tao11.riddles.org.uk) wrote: > > "Rory" == Rory Campbell-Lange writes: > > Rory> Or: > > Rory> iconv -f WINDOWS-1252 -t UTF-8 -c < tempfile2 > tempfile3 > > No. That's just a conversion of win1252 to utf8 without regard for any > UTF8 that might alre

Re: \COPY to accept non UTF-8 chars in CHAR columns

2020-03-27 Thread Andrew Gierth
> "Rory" == Rory Campbell-Lange writes: Rory> Or: Rory> iconv -f WINDOWS-1252 -t UTF-8 -c < tempfile2 > tempfile3 No. That's just a conversion of win1252 to utf8 without regard for any UTF8 that might already be present in the input. Any such input will end up double-encoded, requirin
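The double-encoding effect described above can be reproduced without iconv. The sketch below uses Python's `cp1252` codec as a stand-in for `iconv -f WINDOWS-1252 -t UTF-8`, and the sample string is invented for illustration. It is not the thread's own procedure.

```python
# Input that is ALREADY valid UTF-8 (sample value is an assumption).
original = "Müller".encode("utf-8")           # 4d c3 bc 6c 6c 65 72

# A blanket Windows-1252 -> UTF-8 conversion treats the two UTF-8
# bytes of the umlaut as two separate Windows-1252 characters.
double_encoded = original.decode("cp1252").encode("utf-8")

# For this sample the damage is reversible by undoing the extra step.
repaired = double_encoded.decode("utf-8").encode("cp1252")

print(original.hex())
print(double_encoded.hex())
print(repaired == original)
```

The reverse step only works while every intermediate character exists in Windows-1252, so it is a repair for this simple case rather than a general fix.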

Re: \COPY to accept non UTF-8 chars in CHAR columns

2020-03-27 Thread Rory Campbell-Lange
On 27/03/20, Andrew Gierth (and...@tao11.riddles.org.uk) wrote: > > "Thomas" == Thomas Munro writes: > > Thomas> Something like this approach might be useful for fixing the CSV file: > > Thomas> > https://codereview.stackexchange.com/questions/185821/convert-a-mix-of-latin-1-and-utf-8-to-

Re: \COPY to accept non UTF-8 chars in CHAR columns

2020-03-27 Thread Andrew Gierth
> "Thomas" == Thomas Munro writes: Thomas> Something like this approach might be useful for fixing the CSV file: Thomas> https://codereview.stackexchange.com/questions/185821/convert-a-mix-of-latin-1-and-utf-8-to-proper-utf-8 Or: perl -MEncode -pe ' use bytes; sub c { decode("UTF-8",s
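One way to clean a file that mixes Latin-1 and UTF-8 is to leave well-formed UTF-8 sequences alone and re-encode every other high byte as Latin-1. The Python sketch below illustrates that idea only. It is not a transcription of the Perl one-liner quoted above, and the function name `fix_mixed` is hypothetical.

```python
import re

# Well-formed UTF-8 multibyte sequences, tried first, followed by a
# fallback that matches any single byte >= 0x80.
_TOKEN = re.compile(
    rb"[\xC2-\xDF][\x80-\xBF]"
    rb"|\xE0[\xA0-\xBF][\x80-\xBF]"
    rb"|[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}"
    rb"|\xED[\x80-\x9F][\x80-\xBF]"
    rb"|\xF0[\x90-\xBF][\x80-\xBF]{2}"
    rb"|[\xF1-\xF3][\x80-\xBF]{3}"
    rb"|\xF4[\x80-\x8F][\x80-\xBF]{2}"
    rb"|[\x80-\xFF]"
)

def fix_mixed(raw: bytes) -> bytes:
    """Keep valid UTF-8 sequences; treat other high bytes as Latin-1."""
    def repl(match: "re.Match[bytes]") -> bytes:
        chunk = match.group(0)
        if len(chunk) > 1:
            return chunk                      # already valid UTF-8
        return chunk.decode("latin-1").encode("utf-8")
    return _TOKEN.sub(repl, raw)

# Latin-1 "ü" (fc) and UTF-8 "ü" (c3 bc) in the same line.
mixed = b"M\xfcller, M\xc3\xbcller"
print(fix_mixed(mixed).decode("utf-8"))
```

This is a heuristic: two adjacent Latin-1 characters that happen to form a valid UTF-8 sequence are kept as UTF-8, and a UTF-8 sequence that was split apart is converted byte by byte.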

Re: \COPY to accept non UTF-8 chars in CHAR columns

2020-03-27 Thread Thomas Munro
On Sat, Mar 28, 2020 at 4:46 AM Tom Lane wrote: > Matthias Apitz writes: > > In short, is there a way to let \COPY accept such broken ISO bytes, just > > complaining about, but not stopping the insert of the row? > > No. We don't particularly believe in the utility of invalid data. > > If you do

Re: \COPY to accept non UTF-8 chars in CHAR columns

2020-03-27 Thread Tom Lane
Matthias Apitz writes: > In short, is there a way to let \COPY accept such broken ISO bytes, just > complaining about, but not stopping the insert of the row? No. We don't particularly believe in the utility of invalid data. If you don't actually care about what encoding your data is in, you co