I got this going with a dump/reload. Beware however, 8.0b5 and the JDBC driver don't seem to work with UNICODE encoding for the database. You have to use b4 or HEAD.
Hunter > From: Kris Jurka <[EMAIL PROTECTED]> > Date: Tue, 30 Nov 2004 01:34:46 -0500 (EST) > To: Hunter Hillegas <[EMAIL PROTECTED]> > Cc: PostgreSQL <[EMAIL PROTECTED]>, Postgre JDBC List > <[EMAIL PROTECTED]> > Subject: Re: [JDBC] Invalid Character Data Problem > > > > On Fri, 26 Nov 2004, Hunter Hillegas wrote: > >> When I SELECT from a certain table, I see this JDBC exception: >> >> "Invalid character data was found. This is most likely caused by stored >> data containing characters that are invalid for the character set the >> database was created in. The most common example of this is storing 8bit >> data in a SQL_ASCII database." >> >> The database is indeed of type SQL_ASCII. The table stores mailing list data >> and has about 400,000 rows. >> >> Looking at the data via psql, I see that some of the rows have strange >> characters in them, such as question marks where I would not expect them, >> etc... >> >> What are my options? Is there a way to identify the 'bad' records, or the >> ones causing trouble? >> > > To really solve this problem you need to have a correctly encoded > database. This will involve a dump and restore process and possibly > recoding your data. This is straightforward if you know what > encoding your data is, although it will cause some downtime. > > To detect the bad data you can try various SELECTs with the JDBC driver > and see what errors out. The function below will determine if a > particular field has data with the high bit set which is something the > database really doesn't know what to do with. > > SELECT pkcolumn, hashighbit(columna), hashighbit(columnb) FROM mytable; > > Kris Jurka > > CREATE OR REPLACE FUNCTION hashighbit(text) RETURNS boolean AS ' > DECLARE > i int; > BEGIN > i := LENGTH($1); > WHILE i > 0 LOOP > IF ascii(substring($1, i, 1)) >= 128 THEN > RETURN true; > END IF; > i := i-1; > END LOOP; > RETURN false; > END; > ' LANGUAGE 'plpgsql'; > > > ---------------------------(end of broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org ---------------------------(end of broadcast)--------------------------- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]