Hi all,

    If anyone runs into the problem I had, the following native Delphi 
function solves it:

    unit System;
    type UTF8String = type string;
    function Utf8ToAnsi(const S: UTF8String): string;

    Darn, it was so simple!!! (BTW, if you happen to see a weird char in the 
resulting String, check the Font you are using to display it...)
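
For anyone who wants to see it applied, here is a minimal sketch, assuming the
whole response body was received into a stream and really is UTF-8.
StreamUtf8ToAnsi is a hypothetical helper name (it is not part of ICS or the
RTL), and the sample bytes are made up for illustration:

```pascal
program Utf8Demo;
{$APPTYPE CONSOLE}

uses
  Classes;

// Hypothetical helper: copies the raw bytes held in Stream into a
// UTF8String and decodes them to the system ANSI code page.
// Assumes the stream contains a complete, valid UTF-8 document.
function StreamUtf8ToAnsi(Stream: TStream): string;
var
  Raw: UTF8String;
  Len: Integer;
begin
  Len := Integer(Stream.Size);
  SetLength(Raw, Len);
  Stream.Position := 0;
  if Len > 0 then
    Stream.ReadBuffer(Raw[1], Len);
  Result := Utf8ToAnsi(Raw);
end;

const
  // "café" encoded as UTF-8: the é takes two bytes ($C3 $A9).
  Sample: array[0..4] of Byte = ($63, $61, $66, $C3, $A9);
var
  Body: TMemoryStream;
begin
  Body := TMemoryStream.Create;
  try
    // Stand-in for the stream a HTTP component would have filled.
    Body.WriteBuffer(Sample, SizeOf(Sample));
    WriteLn(StreamUtf8ToAnsi(Body));
  finally
    Body.Free;
  end;
end.
```

Characters that have no equivalent in the active ANSI code page cannot survive
this conversion, so it only suits text that fits that code page.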

Cheers,

Marcelo Grossi

----- Original Message ----- 
From: "Marcelo Grossi" <[EMAIL PROTECTED]>
To: "ICS support mailing" <twsocket@elists.org>
Sent: Friday, July 21, 2006 11:22 AM
Subject: Re: [twsocket] HttpCli UTF-8 Coding Issue


More precisely (http://en.wikipedia.org/wiki/UTF-8):

Code point range - Bytes   - Binary representation
--------------------------------------------------------------------
000000-00007F    - 1 byte  - 0xxxxxxx
    ASCII equivalence range
000080-0007FF    - 2 bytes - 110xxxxx 10xxxxxx
    Latin letters with diacritics, plus the Greek, Cyrillic, Armenian,
    Hebrew, Arabic, Syriac and Thaana alphabets
000800-00FFFF    - 3 bytes - 1110xxxx 10xxxxxx 10xxxxxx
    Rest of the Basic Multilingual Plane, which contains virtually all
    characters in common use
010000-10FFFF    - 4 bytes - 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    Other planes of Unicode (the rest)
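
The bit patterns in the table above can be turned into a small check on the
first byte of a sequence. This is only a sketch: Utf8SequenceLength is a
hypothetical name, and it classifies by bit pattern alone, so it does not
reject overlong encodings or lead bytes that would encode values beyond 10FFFF.

```pascal
// Hypothetical helper: returns how many bytes the UTF-8 sequence that
// starts with LeadByte occupies, or 0 if LeadByte cannot start a
// sequence (a 10xxxxxx continuation byte or an unused bit pattern).
function Utf8SequenceLength(LeadByte: Byte): Integer;
begin
  if (LeadByte and $80) = $00 then
    Result := 1          // 0xxxxxxx: ASCII
  else if (LeadByte and $E0) = $C0 then
    Result := 2          // 110xxxxx
  else if (LeadByte and $F0) = $E0 then
    Result := 3          // 1110xxxx
  else if (LeadByte and $F8) = $F0 then
    Result := 4          // 11110xxx
  else
    Result := 0;         // continuation byte or invalid lead byte
end;
```

For example, Utf8SequenceLength($41) gives 1, Utf8SequenceLength($C3) gives 2,
and Utf8SequenceLength($80) gives 0 because $80 is a continuation byte. This is
why counting one byte per character goes wrong as soon as the text leaves the
ASCII range.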

Thanks a bunch, but I really can't find anything in that Jedi ... does their
online help system even work?

Marcelo Grossi

----- Original Message ----- 
From: "Robert Chafer" <[EMAIL PROTECTED]>
To: "ICS support mailing" <twsocket@elists.org>
Sent: Friday, July 21, 2006 10:45 AM
Subject: Re: [twsocket] HttpCli UTF-8 Coding Issue



The first 128 codes of UTF-8 (the 7-bit values) are plain ASCII; it uses
bytes from the top 128 values, in multi-byte sequences, to represent all the
other Unicode characters.  Take a look at the JEDI library; it has converters.

On Fri, 21 Jul 2006 10:25:17 -0300, you wrote:

>  Thank you all for your answers,
>
>     I found the error. It was, as most of you have probably realized by
>  now, me! : ) I read the UTF-8 spec on Wikipedia and it says right there:
>  "uses up to 4 bytes per character depending on the character ...". Dunno
>  how I missed that ..
>      So, what I have to do now is find a UTF-8 to ASCII converter (by
>  approximation, of course) or build one (which I was already doing).
>  Anyway, thanks to all of you folks who took the time to answer me!
>
>  Really appreciate it!
>
>  Marcelo Grossi
>
>  ----- Original Message ----- 
>  From: "Francois PIETTE" <[EMAIL PROTECTED]>
>  To: "ICS support mailing" <twsocket@elists.org>
>  Sent: Friday, July 21, 2006 4:44 AM
>  Subject: Re: [twsocket] HttpCli UTF-8 Coding Issue
>
>
>  >> With HTTP component, you always get the data exactly as the server
>  >> sent it. HTTP component does not do any processing on the data itself.
>  >> It is stored as is in the stream you provide for storage.
>
>  >    Then how come Mozilla Firefox doesn't have this weird char problem?
>
>  Firefox is much more than an HTTP component. It has an engine which
>  interprets the document AND the header sent by the server.
>
>  > I just used a TMemoryStream instead of my old TStringStream, debugged
>  > the contents of the Buffer and it is as buggy as it was.
>
>  How do you know it is buggy? I'm sure the problem is that you don't
>  interpret the data as it is encoded. There are many, many ways to
>  represent characters: not only the encoding used (one byte, two bytes,
>  multiple bytes, a varying number of bytes) but also the character set
>  (the mapping between a given code and the character "image").
>
>  >    How come the server is sending me something and the browser
>  > something else?
>
>  The browser doesn't send anything. The browser interprets what the
>  server sent.
>  It may happen that the server doesn't send your program the same thing
>  it sends to the browser. Why? Because an HTTP request is composed of a
>  URL but also a header carrying many kinds of information that the client
>  gives to help the server send the correct content.
>
>  Use a sniffer to compare the request the browser sends (pay attention to
>  the header lines) and what the server returns. Build the same request
>  with the HTTP component and verify that the server sends the exact same
>  content (it will for sure if the request is the same in all details).
>
>
>  > Because I truly don't believe that Mozilla Firefox is parsing
>  > that kind of data. It doesn't even respect the same number of bytes per
>  > char ...). I don't get it.. Me stupid!!! 8/
>
>  I'm sure the browser parses the data and the header to show you the
>  correct page.
>
>  Contribute to the SSL Effort. Visit http://www.overbyte.be/eng/ssl.html
>  --
>  [EMAIL PROTECTED]
>  http://www.overbyte.be
>
>
>  -- 
>  To unsubscribe or change your settings for TWSocket mailing list
>  please goto http://www.elists.org/mailman/listinfo/twsocket
>  Visit our website at http://www.overbyte.be
--

Rob Chafer
Silverfrost