I am using PostgreSQL 8.3.14 on our reporting system. There are scripts that 
collect data from many databases across the firm into this database. Recently I 
added tables from a particular database whose encoding is UTF-8. My dump 
procedure runs:

\encoding ISO-8859-8
\copy ( SELECT ... ) to file

This fails at a certain row, because that row contains Arabic text which 
cannot be mapped into ISO-8859-8 (which is 8-bit Hebrew).

This is expected behavior, but I was wondering why, when I tested the same 
setup manually, it all worked well.

Turns out that when I did it manually, I did not specify the output encoding. I 
did the \copy straight. So the file was in UTF-8.

But this puzzles me, because I then took the file, ran psql and \copy <table> 
from file. And it worked. I tried it again now, and I can see the row with its 
Arabic content, even though it is not in the database encoding.

I checked \encoding. It replies 
ISO_8859_8
but it then happily gives me the Arabic row when I select it.

What's happening here? Why does the database accept input in the wrong encoding, 
and why doesn't it complain when I then try to select that input?


Secondly, suppose I want to get pure ISO-8859-8 output for now, and replace 
every incompatible character within the select statement with '*' or whatever. 
Is there any function that will help me detect such characters? Can I tell the 
psql conversion function to ignore bad characters?
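
To make the second question concrete, here is a rough sketch of the behavior 
I am after, done outside the database in Python on text taken from a UTF-8 
dump. This is only an illustration and not something PostgreSQL provides: the 
function names (can_encode, find_unmappable, to_iso_8859_8) are made up for 
this example, and it assumes the text has already been decoded from UTF-8.

```python
# Sketch only: detect and replace characters that have no ISO-8859-8 mapping.
# All function names here are hypothetical helpers, not part of psql.

CODEC = "iso8859_8"  # Python's name for ISO-8859-8 (8-bit Hebrew)


def can_encode(ch, codec=CODEC):
    """Return True if the single character ch can be encoded in codec."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def find_unmappable(text, codec=CODEC):
    """Return the sorted list of distinct characters that codec cannot represent."""
    return sorted({ch for ch in text if not can_encode(ch, codec)})


def to_iso_8859_8(text, placeholder="*"):
    """Return text with every unmappable character replaced by placeholder."""
    return "".join(ch if can_encode(ch) else placeholder for ch in text)


if __name__ == "__main__":
    # Hebrew word followed by an Arabic word, written as escapes for clarity.
    row = "abc \u05e9\u05dc\u05d5\u05dd \u0633\u0644\u0627\u0645"
    cleaned = to_iso_8859_8(row)
    print(find_unmappable(row))
    print(cleaned.encode(CODEC))  # encodes without raising an error
```

What I would like is the equivalent of to_iso_8859_8 applied inside the select 
statement, so that the \copy itself never sees an unmappable character.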

Thank you,
Herouth
