More precisely (http://en.wikipedia.org/wiki/UTF-8):
UTF8 Range    - n Bytes - Binary Representation (Info)
********************************************
000000-00007F - 1 Byte  - 0xxxxxxx (ASCII equivalence range)
000080-0007FF - 2 Bytes - 110xxxxx 10xxxxxx (Latin letters with diacritics + Greek, Cyrillic, Armenian, Hebrew, Arabic, Syriac and Thaana alphabets)
000800-00FFFF - 3 Bytes - 1110xxxx 10xxxxxx 10xxxxxx (rest of the Basic Multilingual Plane, which contains virtually all characters in common use)
010000-10FFFF - 4 Bytes - 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx (Other planes of Unicode ... the rest)

Thanks a bunch, but I really can't find anything in that Jedi ... their online help system even work?

Marcelo Grossi

----- Original Message -----
From: "Robert Chafer" <[EMAIL PROTECTED]>
To: "ICS support mailing" <twsocket@elists.org>
Sent: Friday, July 21, 2006 10:45 AM
Subject: Re: [twsocket] HttpCli UTF-8 Coding Issue

The 7-bit byte values of UTF-8 (the first 128) are ASCII; it uses the top 128
byte values, those with the high bit set, to represent all the other Unicode
characters. Take a look at the JEDI library; they have converters.

On Fri, 21 Jul 2006 10:25:17 -0300, you wrote:

> Thank you all for your answers,
>
> I found out the error. It was, as probably most of you realized so far,
> me! : ) I read the UTF-8 specs on Wiki and it says clearly to my face: "uses
> up to 4 bytes per character depending on the character ...". Dunno how I
> missed that ..
> So, what I have to do now is find a UTF-8 to ASCII converter (by
> approximation of course) or build one (which I was already doing). Anyways,
> thanks to all of you folks that took some time to answer me!
>
> Really appreciate it!
>
> Marcelo Grossi
>
> ----- Original Message -----
> From: "Francois PIETTE" <[EMAIL PROTECTED]>
> To: "ICS support mailing" <twsocket@elists.org>
> Sent: Friday, July 21, 2006 4:44 AM
> Subject: Re: [twsocket] HttpCli UTF-8 Coding Issue
>
> >> With HTTP component, you always get the data exactly as the server sent
> >> it. HTTP component does not do any processing on the data itself.
> >> It is stored as is in the stream you provide for storage.
> >
> > Then how come Mozilla Firefox doesn't have this weird char problem?
>
> Firefox is much more than a HTTP component. It has an engine which interprets
> the document AND the header sent by the server.
>
> > I just used a TMemoryStream instead of using my old TStringStream, debugged
> > the contents of the Buffer and it is as buggy as it was.
>
> How do you know it is buggy? I'm sure the problem is that you don't
> interpret the data as it is encoded. There are many, many ways to represent
> characters. Not only speaking about the code used (one byte, two bytes,
> multiple bytes, varying number of bytes) but also character sets (mapping
> between a given code and the character "image").
>
> > How come the server is sending me something and the browser something
> > else?
>
> The browser doesn't send anything. The browser interprets what the server
> sent.
> It may happen that the server doesn't send the same thing to your program
> as it sends to the browser. Why? Because a HTTP request is composed of an
> URL but also a header with many kinds of information the client gives to help
> the server send the correct content.
>
> Use a sniffer to compare the request the browser sends (pay attention to the
> header lines) and what the server returns. Build the same request with the
> HTTP component and verify that the server sends the exact same content (it
> will for sure if the request is the same in all details).
>
> > Because I truly don't believe that Mozilla Firefox is parsing
> > that kind of data. It even doesn't respect the same amount of bytes per char
> > ...). I don't get it.. Me stupid!!! 8/
>
> I'm sure the browser parses the data and the header to show you the correct
> page.
>
> Contribute to the SSL Effort.
> Visit http://www.overbyte.be/eng/ssl.html
> --
> [EMAIL PROTECTED]
> http://www.overbyte.be
>
> --
> To unsubscribe or change your settings for TWSocket mailing list
> please goto http://www.elists.org/mailman/listinfo/twsocket
> Visit our website at http://www.overbyte.be

--
Rob Chafer
Silverfrost
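
For anyone finding this thread later, here is a rough sketch of the two ideas discussed in it: reading the sequence length off a UTF-8 lead byte (the bit patterns in the table at the top of this message) and converting UTF-8 to ASCII "by approximation". It is written in Python only to keep it short. The function names utf8_sequence_length and utf8_to_ascii_approx and the "?" placeholder are made up for this illustration; they are not part of ICS or the JEDI library, and dropping accents via Unicode decomposition is just one possible way to approximate.

```python
import unicodedata


def utf8_sequence_length(lead_byte: int) -> int:
    """Return the sequence length (1-4) announced by a UTF-8 lead byte.

    Hypothetical helper: it only checks the bit patterns from the table
    (0xxxxxxx, 110xxxxx, 1110xxxx, 11110xxx); it does not reject
    overlong or out-of-range sequences.
    """
    if lead_byte & 0b1000_0000 == 0b0000_0000:
        return 1  # 0xxxxxxx: plain ASCII
    if lead_byte & 0b1110_0000 == 0b1100_0000:
        return 2  # 110xxxxx
    if lead_byte & 0b1111_0000 == 0b1110_0000:
        return 3  # 1110xxxx
    if lead_byte & 0b1111_1000 == 0b1111_0000:
        return 4  # 11110xxx
    raise ValueError("continuation byte (10xxxxxx) or invalid lead byte")


def utf8_to_ascii_approx(raw: bytes, placeholder: str = "?") -> str:
    """Decode UTF-8 bytes and approximate the text in 7-bit ASCII.

    Assumed approach: decompose accented letters (NFKD), drop the
    combining marks, and replace whatever is still non-ASCII with
    the placeholder.
    """
    text = raw.decode("utf-8")
    decomposed = unicodedata.normalize("NFKD", text)
    out = []
    for ch in decomposed:
        if unicodedata.combining(ch):
            continue  # drop the accent, keep the base letter
        out.append(ch if ord(ch) < 128 else placeholder)
    return "".join(out)


if __name__ == "__main__":
    data = b"caf\xc3\xa9"  # "cafe" with an acute accent, as UTF-8 bytes
    print(utf8_sequence_length(data[3]))
    print(utf8_to_ascii_approx(data))
```

The key point for the HttpCli case is that the bytes in the stream have to be decoded as UTF-8 first; only then does it make sense to map characters down to ASCII.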