Robert Chafer wrote:
> the first 128 code points (7 bits) of UTF-8 are ASCII; it uses the
> top 128 byte values, in multi-byte sequences, to represent all the
> other Unicode characters. Take a look at the JEDI library; they have
> converters.
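To make the point above concrete, here is a minimal sketch of how the first byte of a UTF-8 sequence tells you how many bytes the character occupies. It is written in Python purely for illustration (the thread itself concerns Delphi), the function name `utf8_sequence_length` is made up for this example, and it only looks at the bit pattern of the lead byte; it does not validate overlong or otherwise illegal sequences.

```python
def utf8_sequence_length(lead_byte: int) -> int:
    """Return the length in bytes of a UTF-8 sequence, judged by its first byte.

    Hypothetical helper for illustration; only the lead-byte bit pattern is
    checked, so illegal values such as 0xC0 are not rejected.
    """
    if lead_byte < 0x80:            # 0xxxxxxx: plain ASCII, one byte
        return 1
    if lead_byte >> 5 == 0b110:     # 110xxxxx: start of a two-byte sequence
        return 2
    if lead_byte >> 4 == 0b1110:    # 1110xxxx: start of a three-byte sequence
        return 3
    if lead_byte >> 3 == 0b11110:   # 11110xxx: start of a four-byte sequence
        return 4
    # 10xxxxxx is a continuation byte; anything else is not a valid lead byte
    raise ValueError("not a UTF-8 lead byte: 0x%02X" % lead_byte)


# One sample character for each sequence length
for ch in ("A", "\u00e9", "\u20ac", "\U0001F600"):
    encoded = ch.encode("utf-8")
    print(ascii(ch), len(encoded), utf8_sequence_length(encoded[0]))
```

Every byte of a multi-byte sequence has its high bit set, which is why ASCII text passes through UTF-8 unchanged while accented characters show up as two or more "weird" bytes when the data is read as single-byte text.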
This easy-to-understand article may help as well:
http://www.joelonsoftware.com/articles/Unicode.html

---
Arno Garrels [TeamICS]
http://www.overbyte.be/eng/overbyte/teamics.html

>
> On Fri, 21 Jul 2006 10:25:17 -0300, you wrote:
>
>> Thank you all for your answers,
>>
>> I found out the error. It was, as probably most of you realized
>> so far, me! : ) I read the UTF-8 specs on Wiki and it says clearly
>> to my face: "uses up to 4 bytes per character depending on the
>> character ...". Dunno how I missed that ..
>> So, what I have to do now is find a UTF-8 to ASCII converter (by
>> approximation of course) or build one (which I was already doing).
>> Anyways, thanks to all of you folks that took some time to answer
>> me!
>>
>> Really appreciate it!
>>
>> Marcelo Grossi
>>
>> ----- Original Message -----
>> From: "Francois PIETTE" <[EMAIL PROTECTED]>
>> To: "ICS support mailing" <twsocket@elists.org>
>> Sent: Friday, July 21, 2006 4:44 AM
>> Subject: Re: [twsocket] HttpCli UTF-8 Coding Issue
>>
>>
>> >> With HTTP component, you always get the data exactly as the
>> >> server sent it. HTTP component does not do any processing on the
>> >> data itself. It is stored as is in the stream you provide for
>> >> storage.
>>
>> > Then how come Mozilla Firefox doesn't have this weird char
>> > problem?
>>
>> Firefox is much more than a HTTP component. It has an engine which
>> interprets the document AND the header sent by the server.
>>
>> > I just used a TMemoryStream instead of using my old TStringStream,
>> > debugged the contents of the Buffer and it is as buggy as it was.
>>
>> How do you know it is buggy ? I'm sure the problem is that you don't
>> interpret the data as it is encoded. There are many many ways to
>> represent characters. Not only speaking about the code used (one
>> byte, two bytes, multiple bytes, varying number of bytes) but also
>> character sets (mapping between a given code and the character
>> "image").
>>
>> > How come the server is sending me something and the browser
>> > something else?
>>
>> The browser doesn't send anything. The browser interprets what the
>> server sent.
>> It may happen that the server doesn't send the same thing to your
>> program as it sends to the browser. Why ? Because an HTTP request
>> is composed of a URL but also a header with many kinds of
>> information the client gives to help the server send the correct
>> content.
>>
>> Use a sniffer to compare the request the browser sends (pay
>> attention to the header lines) and what the server returns. Build
>> the same request with the HTTP component and verify that the server
>> sends the exact same content (it will for sure if the request is the
>> same in all details).
>>
>>
>> > Because I truly don't believe that Mozilla Firefox is parsing
>> > that kind of data. It doesn't even respect the same amount of
>> > bytes per char ...). I don't get it.. Me stupid!!! 8/
>>
>> I'm sure the browser parses the data and the header to show you the
>> correct page.
>>
>> Contribute to the SSL Effort. Visit
>> http://www.overbyte.be/eng/ssl.html
>> --
>> [EMAIL PROTECTED]
>> http://www.overbyte.be
>>
>>
>> --
>> To unsubscribe or change your settings for TWSocket mailing list
>> please go to http://www.elists.org/mailman/listinfo/twsocket
>> Visit our website at http://www.overbyte.be
> --
>
> Rob Chafer
> Silverfrost
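
As a footnote to the thread, the two steps discussed above (interpret the bytes according to the charset the server declares, then approximate the text in ASCII) might be sketched as follows. This is Python used only for illustration, not the Delphi/ICS code the thread is about. The helper names `charset_from_content_type` and `to_ascii_approx`, the sample header value, the sample word, and the ISO-8859-1 fallback are all assumptions made for this example.

```python
import unicodedata
from email.message import Message


def charset_from_content_type(header_value: str, default: str = "iso-8859-1") -> str:
    """Extract the charset parameter from a Content-Type header value.

    Hypothetical helper; the fallback charset is an assumption for this sketch.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset() or default


def to_ascii_approx(text: str, placeholder: str = "?") -> str:
    """Approximate text in ASCII: strip accents, replace what cannot be mapped."""
    # NFKD splits accented letters into a base letter plus combining marks
    decomposed = unicodedata.normalize("NFKD", text)
    out = []
    for ch in decomposed:
        if unicodedata.combining(ch):
            continue  # drop the accent, keep the base letter
        out.append(ch if ord(ch) < 128 else placeholder)
    return "".join(out)


# Stand-in for the raw bytes an HTTP component stores in the stream
raw = "Informa\u00e7\u00e3o".encode("utf-8")
charset = charset_from_content_type("text/html; charset=UTF-8")
text = raw.decode(charset)      # interpret the bytes as the server encoded them
print(to_ascii_approx(text))    # prints "Informacao"
```

Decoding first and approximating second matters: stripping bytes above 127 from the raw stream would delete whole characters instead of mapping them to their closest ASCII letter.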