On Wed, Oct 01, 2003 at 02:46:14PM -0400, Gerard Samuel wrote:
:
: When I say that I don't know what characters I'm expecting,
: I'm not talking about normal HTML entities, like &amp; &nbsp; &lt;
: I'm talking about Chinese/Japanese/Korean/Taiwanese alphabet, numbers
: (even punctuation if applicable).
: Maybe I'm thinking too hard, but trying to take Far East languages'
: alphanumeric characters into account
: seems like overkill.
: Feel free to correct me.

Okay, I will. :-)

There are two issues: input and output.

HTML character references address the problem of displaying certain
characters in a web browser. This is an output issue.

When you get CJKV data, you are most likely getting it in some
encoding. Each of these languages has its own set of encodings. For
example, if you get Japanese text, it will be encoded in JIS
(ISO-2022-JP), Shift-JIS, EUC-JP, or a Unicode encoding such as UTF-8.
You *have* to determine the type of data and its encoding. This is an
input issue.

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
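
To make the input/output split above concrete, here is a rough sketch in Python (chosen only for illustration; in PHP, the mbstring functions mb_detect_encoding() and mb_convert_encoding() cover similar ground). The candidate list, its ordering, and the helper names decode_input and render_output are assumptions of this sketch, not something from the thread. Trial decoding is a heuristic: some byte sequences are valid in more than one encoding, so a declared charset from the sender is more reliable whenever one is available.

```python
import html
from typing import Tuple

# Candidate encodings, tried in order (an assumed list for Japanese input).
# ISO-2022-JP goes first because it is 7-bit: its bytes would also pass as
# valid UTF-8 and decode to garbage. A side effect is that plain ASCII input
# is reported as "iso2022_jp", which is harmless since the text is the same.
CANDIDATES = ["iso2022_jp", "utf-8", "euc_jp", "shift_jis"]


def decode_input(raw: bytes, candidates=CANDIDATES) -> Tuple[str, str]:
    """Input side: return (text, encoding) for the first candidate
    that decodes the bytes without error."""
    for name in candidates:
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings matched")


def render_output(text: str) -> str:
    """Output side: escape HTML-special characters (&, <, >, quotes).
    CJK characters pass through untouched; the page's declared charset
    handles those."""
    return html.escape(text)


if __name__ == "__main__":
    raw = "日本語 <b>".encode("euc_jp")   # pretend this arrived from a form
    text, encoding = decode_input(raw)
    print(encoding, render_output(text))
```

The point of keeping the two functions separate is the same as in the reply: decoding happens once, at the boundary where bytes come in, and escaping happens once, at the boundary where text goes out to the browser.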