More precisely (http://en.wikipedia.org/wiki/UTF-8):

Code point range (hex) - n Bytes - Binary representation (info)
***************************************************************
000000-00007F - 1 byte  - 0xxxxxxx (ASCII equivalence range)
000080-0007FF - 2 bytes - 110xxxxx 10xxxxxx (Latin letters with diacritics, plus Greek, Cyrillic, Armenian, Hebrew, Arabic, Syriac and Thaana alphabets)
000800-00FFFF - 3 bytes - 1110xxxx 10xxxxxx 10xxxxxx (rest of the Basic Multilingual Plane, which contains virtually all characters in common use)
010000-10FFFF - 4 bytes - 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx (other planes of Unicode, i.e. the rest)
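
To make the table concrete, here is a rough Python sketch that encodes a single code point by hand, following exactly those four ranges. The function names (utf8_encode, sequence_length) are made up for illustration and are not part of ICS or JEDI; the language's built-in encoder already does this, so the sketch is only useful for seeing where the bits go.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one Unicode code point by hand, following the four ranges."""
    # Surrogates (D800-DFFF) and values above 10FFFF are not encodable.
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not an encodable code point: %r" % (cp,))
    if cp <= 0x7F:          # 0xxxxxxx
        return bytes([cp])
    if cp <= 0x7FF:         # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6),
                      0x80 | (cp & 0x3F)])
    if cp <= 0xFFFF:        # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


def sequence_length(lead: int) -> int:
    """Return how many bytes the sequence starting with this lead byte has."""
    if lead < 0x80:
        return 1            # 0xxxxxxx
    if 0xC0 <= lead < 0xE0:
        return 2            # 110xxxxx
    if 0xE0 <= lead < 0xF0:
        return 3            # 1110xxxx
    if 0xF0 <= lead < 0xF8:
        return 4            # 11110xxx
    raise ValueError("continuation byte or invalid lead byte")


if __name__ == "__main__":
    for ch in "A\u00e9\u20ac":
        print(hex(ord(ch)), utf8_encode(ord(ch)).hex(), ch.encode("utf-8").hex())
```

The lead byte alone tells you how many bytes to read, which is why a buffer that looks "buggy" byte by byte is usually just multi-byte sequences being read one byte at a time.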

Thanks a bunch, but I really can't find anything in that Jedi ... does their 
online help system even work?

Marcelo Grossi

----- Original Message ----- 
From: "Robert Chafer" <[EMAIL PROTECTED]>
To: "ICS support mailing" <twsocket@elists.org>
Sent: Friday, July 21, 2006 10:45 AM
Subject: Re: [twsocket] HttpCli UTF-8 Coding Issue



The 7-bit range of UTF-8 (the first 128 code points) is plain ASCII; byte
values 128-255 are used in multi-byte sequences to represent all the other
Unicode characters.  Take a look at the JEDI library; they have converters.
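
For what it's worth, a UTF-8 to ASCII conversion "by approximation" can be sketched in a few lines of Python. This is not the JEDI converter and says nothing about its API; it is only one possible approach, assuming it is acceptable to drop diacritics and to replace anything without an ASCII look-alike by "?". The name utf8_to_ascii is hypothetical.

```python
import unicodedata


def utf8_to_ascii(data: bytes) -> str:
    """Approximate UTF-8 encoded bytes with plain ASCII text."""
    # Interpret the raw bytes as UTF-8 first; raises on malformed input.
    text = data.decode("utf-8")
    # NFKD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Whatever is still outside ASCII becomes "?" (assumption: that is acceptable).
    return stripped.encode("ascii", errors="replace").decode("ascii")


if __name__ == "__main__":
    print(utf8_to_ascii("Olá, São Paulo".encode("utf-8")))
```

Characters that have no decomposition, such as the euro sign, end up as "?" with this approach, so a real converter would probably need a hand-made mapping table on top.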

On Fri, 21 Jul 2006 10:25:17 -0300, you wrote:

>  Thank you all for your answers,
>
>     I found out the error. It was, as probably most of you realized so far,
>  me! : ) I read the UTF-8 specs on Wiki and it says clearly to my face: "uses
>  up to 4 bytes per character depending on the character ...". Dunno how I
>  missed that ..
>      So, what I have to do now is find a UTF-8 to ASCII converter (by
>  approximation of course) or build one (which I was already doing). Anyways,
>  thanks to all of you folks that took some time to answer me!
>
>  Really appreciate it!
>
>  Marcelo Grossi
>
>  ----- Original Message ----- 
>  From: "Francois PIETTE" <[EMAIL PROTECTED]>
>  To: "ICS support mailing" <twsocket@elists.org>
>  Sent: Friday, July 21, 2006 4:44 AM
>  Subject: Re: [twsocket] HttpCli UTF-8 Coding Issue
>
>
>  >> With HTTP component, you always get the data exactly as the server sent
>  >> it. HTTP component does not do any processing on the data itself. It is
>  >> stored as is in the stream you provide for storage.
>
>  >    Then how come Mozilla Firefox doesn't have this weird char problem?
>
>  Firefox is much more than an HTTP component. It has an engine which
>  interprets the document AND the header sent by the server.
>
>  > I just used a TMemoryStream instead of using my old TStringStream,
>  > debugged the contents of the Buffer and it is as buggy as it was.
>
>  How do you know it is buggy? I'm sure the problem is that you don't
>  interpret the data as it is encoded. There are many many ways to represent
>  characters. Not only speaking about the code used (one byte, two bytes,
>  multiple bytes, varying number of bytes) but also character sets (mapping
>  between a given code and the character "image").
>
>  >    How come the server is sending me something and the browser something
>  > else?
>
>  The browser doesn't send anything. The browser interprets what the server
>  sent.
>  It may happen that the server doesn't send the same thing to your program
>  as it sends to the browser. Why? Because an HTTP request is composed of a
>  URL but also a header with many kinds of information the client gives to
>  help the server send the correct content.
>
>  Use a sniffer to compare the request the browser sends (pay attention to
>  the header lines) and what the server returns. Build the same request with
>  the HTTP component and verify that the server sends the exact same content
>  (it will for sure if the request is the same in all details).
>
>
>  > Because I truly don't believe that Mozilla Firefox is parsing
>  > that kind of data. It doesn't even respect the same amount of bytes per
>  > char ...). I don't get it.. Me stupid!!! 8/
>
>  I'm sure the browser parses the data and the header to show you the
>  correct page.
>
>  Contribute to the SSL Effort. Visit http://www.overbyte.be/eng/ssl.html
>  --
>  [EMAIL PROTECTED]
>  http://www.overbyte.be
>
>
>  -- 
>  To unsubscribe or change your settings for TWSocket mailing list
>  please goto http://www.elists.org/mailman/listinfo/twsocket
>  Visit our website at http://www.overbyte.be
--

Rob Chafer
Silverfrost
