Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-24 Thread Arnaud Lesauvage
Tomi NA wrote: 2006/11/23, Arnaud Lesauvage <[EMAIL PROTECTED]>: Arnaud Lesauvage wrote: > Brandon Aiken wrote: >> It also might be a big/little endian problem, although I always thought that was platform specific, not locale specific. >> >> Try the UCS-2-INTERNAL and UCS-4-INTERNAL co

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-23 Thread Tomi NA
2006/11/23, Arnaud Lesauvage <[EMAIL PROTECTED]>: Arnaud Lesauvage wrote: > Brandon Aiken wrote: >> It also might be a big/little endian problem, although I always thought that was platform specific, not locale specific. >> >> Try the UCS-2-INTERNAL and UCS-4-INTERNAL codepages in iconv, w

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-23 Thread Arnaud Lesauvage
Arnaud Lesauvage wrote: Brandon Aiken wrote: It also might be a big/little endian problem, although I always thought that was platform specific, not locale specific. Try the UCS-2-INTERNAL and UCS-4-INTERNAL codepages in iconv, which should use the two-byte or four-byte versions of UCS

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-23 Thread Arnaud Lesauvage
Brandon Aiken wrote: It also might be a big/little endian problem, although I always thought that was platform specific, not locale specific. Try the UCS-2-INTERNAL and UCS-4-INTERNAL codepages in iconv, which should use the two-byte or four-byte versions of UCS encoding using the system's
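The byte-order question raised in this message can be checked from the first bytes of the exported file. The following is a minimal Python sketch, assuming the export starts with a byte-order mark (the thread does not confirm this); `detect_bom_encoding` is a hypothetical helper name, not part of any tool mentioned here.

```python
import codecs

# Checked longest-first: the UTF-32 LE BOM begins with the UTF-16 LE BOM bytes.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom_encoding(head: bytes) -> str:
    """Return the encoding implied by a leading BOM, or "unknown" if none matches."""
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return "unknown"
```

Reading the first four bytes of the file in binary mode and passing them to this function is enough. If the result is "unknown", the file has no BOM and the byte order has to be established some other way.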

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Martijn van Oosterhout
On Wed, Nov 22, 2006 at 01:55:55PM -0500, Brandon Aiken wrote: > Gee, didn't Unicode just so simplify this codepage mess? Remember > when it was just ASCII, EBCDIC, ANSI, and localized codepages? I think that's one reason why Unix has standardised on UTF-8 rather than one of the other Unicode var

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Tomi NA
2006/11/22, Brandon Aiken <[EMAIL PROTECTED]>: Gee, didn't Unicode just so simplify this codepage mess? Remember when it was just ASCII, EBCDIC, ANSI, and localized codepages? Unicode is heaven-sent, compared to 3 or 4 codepages representing any given (obviously non-English) language, and

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Brandon Aiken
[mailto:[EMAIL PROTECTED] On Behalf Of Arnaud Lesauvage Sent: Wednesday, November 22, 2006 12:38 PM To: Arnaud Lesauvage; General Subject: Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem Alvaro Herrera wrote: > Arnaud Lesauvage wrote: >> Alvaro Herrera wrote: >> >Arnaud Lesa

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Magnus Hagander
> > > I thought Win1252 was supposed to be almost the same as Latin1. > > > While I'd expect certain differences, I wouldn't expect it to use > > > 0x00 as data! > > > > > > Maybe you could have DTS export Unicode, which would > presumably be > > > UTF-16, then recode that to something else (

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Bruce Momjian
Arnaud Lesauvage wrote: > > I thought Win1252 was supposed to be almost the same as Latin1. While > > I'd expect certain differences, I wouldn't expect it to use 0x00 as > > data! > > > > Maybe you could have DTS export Unicode, which would presumably be > > UTF-16, then recode that to something
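The "export as UTF-16, then recode" idea quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the export is assumed to be UTF-16 with a leading byte-order mark, and `recode_utf16_to_utf8` and its path arguments are hypothetical names, not something used in this thread.

```python
def recode_utf16_to_utf8(src_path: str, dst_path: str,
                         chunk_chars: int = 1 << 16) -> None:
    """Stream a UTF-16 text file (with BOM) into a UTF-8 file.

    The file is read in chunks so that a large export does not need to
    fit in memory. newline="" leaves line endings exactly as they are.
    """
    with open(src_path, "r", encoding="utf-16", newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        while True:
            chunk = src.read(chunk_chars)
            if not chunk:
                break
            dst.write(chunk)
```

The resulting file contains no BOM and no 0x00 padding bytes, so it is the kind of input that a COPY with client_encoding set to UTF8 would be expected to accept.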

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Arnaud Lesauvage
Alvaro Herrera wrote: Arnaud Lesauvage wrote: Alvaro Herrera wrote: >Arnaud Lesauvage wrote: > >>mydb=# SET client_encoding TO LATIN9; >>SET >>mydb=# COPY statistiques.detailrecherche (log_gid, >>champrecherche, valeurrecherche) FROM >>'E:\\Production\\Temp\\detailrecherche_ansi.csv' CSV

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Alvaro Herrera
Arnaud Lesauvage wrote: > Alvaro Herrera wrote: > >Arnaud Lesauvage wrote: > > > >>mydb=# SET client_encoding TO LATIN9; > >>SET > >>mydb=# COPY statistiques.detailrecherche (log_gid, > >>champrecherche, valeurrecherche) FROM > >>'E:\\Production\\Temp\\detailrecherche_ansi.csv' CSV; > >>ERROR:

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Alvaro Herrera
Arnaud Lesauvage wrote: > mydb=# SET client_encoding TO LATIN9; > SET > mydb=# COPY statistiques.detailrecherche (log_gid, > champrecherche, valeurrecherche) FROM > 'E:\\Production\\Temp\\detailrecherche_ansi.csv' CSV; > ERROR: invalid byte sequence for encoding "LATIN9": 0x00 > HINT: This err
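One common way for a 0x00 byte to appear in a file that is supposed to be single-byte text is that the file is actually UTF-16. Whether that is the case for the file quoted above is not established by this preview. The Python sketch below uses a made-up sample string (not data from the export) to show the difference.

```python
# Hypothetical sample line; not taken from the actual export file.
sample = "log_gid,champrecherche\n"

utf16_bytes = sample.encode("utf-16-le")    # two bytes per ASCII character
latin9_bytes = sample.encode("iso8859-15")  # one byte per character


def count_nul(data: bytes) -> int:
    """Count the 0x00 bytes in a byte string."""
    return data.count(0)


nul_in_utf16 = count_nul(utf16_bytes)
nul_in_latin9 = count_nul(latin9_bytes)
print(nul_in_utf16, nul_in_latin9)
```

For pure ASCII text, every second byte of the UTF-16 form is 0x00, while the LATIN9 form contains none. A quick count of NUL bytes in the first few kilobytes of a file is therefore a cheap hint about which of the two it is.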

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Thomas H.
Or go via MS-Access/Perl and ODBC/DBI perhaps? Yes, I think it would work. The problem is that the DB is too big for this kind of export. Using DTS from MSSQL to export directly to PostgreSQL using psqlODBC Unicode Driver, I exported ~1000 rows per second in a 2-column table with ~20M rows

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Arnaud Lesauvage
Alvaro Herrera wrote: Arnaud Lesauvage wrote: Alvaro Herrera wrote: >Arnaud Lesauvage wrote: >>Tomi NA wrote: I think I'll go this way... No other choice, actually! The MSSQL database is in SQL_Latin1_General_CP1_CI_AS. I don't really understand what this is. It supports th

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Arnaud Lesauvage
Magnus Hagander wrote: > I have done this in Delphi using its built-in UTF8 encoding and > decoding routines. You can get a free copy of Delphi Turbo Explorer > which includes components for MS SQL server and ODBC, so it would be > pretty straightforward to get this working. > > The a

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Tony Caduto
Of course, but it doesn't work !!! Whatever client encoding I choose in PostgreSQL before COPYing, I get the 'invalid byte sequence' error. The furthest I can get is exporting to UNICODE and importing as UTF8. Then COPY only breaks on the euro symbol (otherwise it breaks very early, I think

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Richard Huxton
Arnaud Lesauvage wrote: Richard Huxton wrote: Or go via MS-Access/Perl and ODBC/DBI perhaps? Yes, I think it would work. The problem is that the DB is too big for this kind of export. Using DTS from MSSQL to export directly to PostgreSQL using psqlODBC Unicode Driver, I exported ~1000 ro

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Alvaro Herrera
Arnaud Lesauvage wrote: > Alvaro Herrera wrote: > >Arnaud Lesauvage wrote: > >>Tomi NA wrote: > I think I'll go this way... No other choice, actually! > The MSSQL database is in SQL_Latin1_General_CP1_CI_AS. > I don't really understand what this is. It supports the euro > sym

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Arnaud Lesauvage
Tomi NA wrote: 2006/11/21, Arnaud Lesauvage <[EMAIL PROTECTED]>: Hi list! I already posted this as "COPY FROM encoding error", but I have been doing some more tests since then. I'm trying to export data from MS SQL Server to PostgreSQL. The tables are quite big (>20M rows), so a CSV export

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Arnaud Lesauvage
Alvaro Herrera wrote: Arnaud Lesauvage wrote: Tomi NA wrote: >>I think I'll go this way... No other choice, actually! >>The MSSQL database is in SQL_Latin1_General_CP1_CI_AS. >>I don't really understand what this is. It supports the euro >>symbol, so it is probably not pure LATIN1, right

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Magnus Hagander
> >> I already posted this as "COPY FROM encoding error", but I > have been > >> doing some more tests since then. > >> > >> I'm trying to export data from MS SQL Server to PostgreSQL. > >> The tables are quite big (>20M rows), so a CSV export and a "COPY > >> FROM" import seems to be the only

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Magnus Hagander
> > I have done this in Delphi using its built-in UTF8 encoding and > > decoding routines. You can get a free copy of Delphi > Turbo Explorer > > which includes components for MS SQL server and ODBC, so it > would be > > pretty straightforward to get this working. > > > > The actual meth

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Tomi NA
2006/11/22, Arnaud Lesauvage <[EMAIL PROTECTED]>: Tomi NA wrote: > 2006/11/21, Arnaud Lesauvage <[EMAIL PROTECTED]>: >> Hi list! >> >> I already posted this as "COPY FROM encoding error", but I have >> been doing some more tests since then. >> >> I'm trying to export data from MS SQL Server t

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Alvaro Herrera
Arnaud Lesauvage wrote: > Tomi NA wrote: > >>I think I'll go this way... No other choice, actually! > >>The MSSQL database is in SQL_Latin1_General_CP1_CI_AS. > >>I don't really understand what this is. It supports the euro > >>symbol, so it is probably not pure LATIN1, right? > > > >I suppose

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Arnaud Lesauvage
Richard Huxton wrote: Arnaud Lesauvage wrote: Richard Huxton wrote: Or go via MS-Access/Perl and ODBC/DBI perhaps? Yes, I think it would work. The problem is that the DB is too big for this kind of export. Using DTS from MSSQL to export directly to PostgreSQL using psqlODBC Unicode D

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Tomi NA
2006/11/21, Arnaud Lesauvage <[EMAIL PROTECTED]>: Hi list! I already posted this as "COPY FROM encoding error", but I have been doing some more tests since then. I'm trying to export data from MS SQL Server to PostgreSQL. The tables are quite big (>20M rows), so a CSV export and a "COPY FROM"

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Arnaud Lesauvage
Tomi NA wrote: I think I'll go this way... No other choice, actually! The MSSQL database is in SQL_Latin1_General_CP1_CI_AS. I don't really understand what this is. It supports the euro symbol, so it is probably not pure LATIN1, right? I suppose you'd have to look at the latin1 codepage ch
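The question about the euro symbol and "pure LATIN1" can be checked without a codepage chart. The Python sketch below uses Python's own codec names (`cp1252`, `iso8859-15`, `latin-1`), which correspond to what PostgreSQL calls WIN1252, LATIN9 and LATIN1. It says nothing about how SQL Server itself stores the data.

```python
euro = "\u20ac"  # the euro sign

cp1252_bytes = euro.encode("cp1252")      # Windows-1252 (WIN1252)
latin9_bytes = euro.encode("iso8859-15")  # ISO 8859-15 (LATIN9)
utf8_bytes = euro.encode("utf-8")

# ISO 8859-1 (LATIN1) has no euro sign at all, so encoding fails.
try:
    euro.encode("latin-1")
    latin1_has_euro = True
except UnicodeEncodeError:
    latin1_has_euro = False

print(cp1252_bytes.hex(), latin9_bytes.hex(), utf8_bytes.hex(), latin1_has_euro)
```

The euro sign is byte 0x80 in Windows-1252 and 0xA4 in ISO 8859-15, and it does not exist in ISO 8859-1. Data that contains it therefore cannot be pure LATIN1, and the same byte decodes differently depending on which of the other two encodings is assumed.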

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Arnaud Lesauvage
Richard Huxton wrote: Arnaud Lesauvage wrote: Hi list! I already posted this as "COPY FROM encoding error", but I have been doing some more tests since then. I'm trying to export data from MS SQL Server to PostgreSQL. The tables are quite big (>20M rows), so a CSV export and a "COPY FROM

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-21 Thread Richard Huxton
Tony Caduto wrote: Arnaud Lesauvage wrote: I then try to import into PostgreSQL. The furthest I can get is when using the UNICODE export, and importing it using a client_encoding set to UTF8 (I tried WIN1252, LATIN9, LATIN1, ...). The copy then stops with an error: ERROR: invalid byte seque

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-21 Thread Tony Caduto
Arnaud Lesauvage wrote: I then try to import into PostgreSQL. The furthest I can get is when using the UNICODE export, and importing it using a client_encoding set to UTF8 (I tried WIN1252, LATIN9, LATIN1, ...). The copy then stops with an error: ERROR: invalid byte sequence for encoding "UT