Actually, I think Arno is correct, but it's a bit more complex than that: entity conversion depends strictly on the local character set. Each character set *may* map a given character code differently (as Arno just discovered for the "cent" character between CP-1252 and CP-1251). There is no "universal" conversion, because the entities represent semantically equivalent characters in differing representations, not specific character codes.
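To make the point concrete, here is a minimal sketch in Python (not the ICS Delphi code). It decodes the same 8-bit value under two code pages and only then picks an entity, keyed by the resulting Unicode character. The function name byte_to_html and the two-entry table are hypothetical, chosen purely for illustration.

```python
# Entity names keyed by Unicode code point, not by 8-bit byte value.
# Small illustrative subset; a real table would be much larger.
ENTITY_BY_CODEPOINT = {
    0x00A2: "cent",  # CENT SIGN
    0x00A9: "copy",  # COPYRIGHT SIGN
}


def byte_to_html(byte_value: int, codepage: str) -> str:
    """Decode one 8-bit character with `codepage`, then return the entity
    that matches the resulting Unicode character (hypothetical helper)."""
    char = bytes([byte_value]).decode(codepage)
    name = ENTITY_BY_CODEPOINT.get(ord(char))
    if name is not None:
        return f"&{name};"
    # No named entity in this small table: use a numeric character reference.
    return f"&#{ord(char)};"


# The same byte yields different characters, hence different entities.
print(byte_to_html(162, "cp1252"))  # &cent;   (cent sign)
print(byte_to_html(162, "cp1251"))  # &#1118;  (Cyrillic small letter short U)
```

Keying the table on the decoded character rather than on the raw byte is what keeps a single table usable across code pages, at the cost of a decoding step for every character.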
For this reason, the best solution is usually to use Unicode (UTF-8) in HTML output. If you specify UTF-8 as the content character set in the HTML header, then the only characters you need to encode as entities are the metacharacters (the ampersand and the left and right angle brackets) and, if you want it to stay visible in the source, the non-breaking space. As for the HttpSrv.TextToHtmlText() method, it should take the content character set into consideration. However, if the mappings differ too much, maintaining many different tables may not be practical.

	dZ.

On Oct 9, 2008, at 05:09, Arno Garrels wrote:

> Francois Piette wrote:
>>> Or am I missing something?
>>
>> I think so. Using "html entities" make sure the correct character is
>> represented whatever the character set or character code is used by
>> the browser.
>
> That's correct, but the server maps the wrong HTML entities if it
> doesn't run in a locale that uses CP 1252!
>
> For example:
> Currently char #162 is hard coded to represent the cent sign:
> HTML Entity: 'cent' , { #162 cent sign }
>
> In windows-1251 however #162 maps to the small kyrillic letter U
> (short).

--
DZ-Jay [TeamICS]
http://www.overbyte.be/eng/overbyte/teamics.html