Hello,

    I posted a message a few days ago about an HTML page being retrieved
with weird chars (through ICS's HttpCli). As JP rightly suggested in his
reply to my message, the page was indeed UTF-8 encoded. But the question
remains (I am currently building a converter for the weird chars as they
appear on the captured page ... [yes, very dumb on my part]): how can I get
the retrieved characters as UTF-8? I mean, UTF-8 can use more than 1 Byte
per char, and in the TStringStream I'm using to retrieve the data from the
HttpCli I get mixed-type chars.
    All the letters (a..z, A..Z, 0..9 and some other chars) are being
retrieved as 1 ASCII Byte, except for some weird chars that come in some
other format using more than 1 Byte (by more than 1 Byte I don't mean exactly
2 Bytes, I mean 2 or 3 Bytes depending on the case). Below are some example
strings taken directly from my application:

    What I get:
   a história do município de .. estrela do agronegócio â?oprêmio é
acima de tudo o reconhecimento do jornalismo, com foco no cidadão, que
estamos fazendo. Ã? o resultado de um trabalho feito dentro de uma empresa
pública de comunicaçãoâ?o

    What I was supposed to get:
    a história do município de .. estrela do agronegócio "prêmio é acima de
tudo o reconhecimento do jornalismo, com foco no cidadão, que estamos
fazendo. É o resultado de um trabalho feito dentro de uma empresa pública de
comunicação"

    Note: The weird chars can come as 2 or 3 Bytes. The char " comes as 3
Bytes (â?o). On the other hand, the char É comes as 2 Bytes (Ã?).
    Note 2: The texts are in Brazilian Portuguese.
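
    To make the byte counts in the notes above concrete, here is a small
sketch in Python (not Delphi, and not part of my application; it is only an
illustration). It assumes the bytes in the stream are valid UTF-8 and that
whatever displays them treats each byte as a Windows-1252 character. That
code page is my guess; the function names are made up for this example.

```python
# Illustration only: how UTF-8 bytes look when misread as a single-byte code page.
# Assumption: the display side interprets the bytes as Windows-1252 (cp1252).

def utf8_bytes(ch: str) -> bytes:
    """Return the UTF-8 encoding of a character."""
    return ch.encode("utf-8")

def misread_as_cp1252(raw: bytes) -> str:
    """Decode bytes one-per-char as cp1252; undefined byte values become U+FFFD."""
    return raw.decode("cp1252", errors="replace")

def repair(raw: bytes) -> str:
    """Decode the same bytes as UTF-8, which is what the page actually uses."""
    return raw.decode("utf-8")

if __name__ == "__main__":
    # 'a' (ASCII), U+00C9 (E with acute), U+201C (left double quotation mark)
    for ch in ("a", "\u00c9", "\u201c"):
        raw = utf8_bytes(ch)
        print(repr(ch), len(raw), "byte(s)", raw.hex(),
              "misread:", repr(misread_as_cp1252(raw)),
              "repaired:", repr(repair(raw)))
```

    If this assumption holds, the stream is plain UTF-8 throughout (ASCII
chars are simply the 1-Byte case of UTF-8), so the whole buffer could be
decoded in one step instead of converting weird chars one by one. In Delphi
the rough equivalent would presumably be UTF8Decode or Utf8ToAnsi applied to
the stream's DataString, though I have not verified that here.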

    The question is: is the problem in the TStringStream, which for some
reason is returning some chars as ASCII and others as UTF-8? Or is the
problem that I missed some property of THttpCli, making the retrieved page
look so strange? Or does the problem lie somewhere else, far beyond my
little knowledge?

    Please help! :'(

Best regards,

Marcelo Grossi 
