dman wrote:
> On Thu, May 23, 2002 at 06:23:40PM -0700, Daniel Quinlan wrote:
> | Jason Baker <[EMAIL PROTECTED]> writes:
>
> | > denoting it.  I don't read/speak Korean, so I have no idea what
> | > exactly it is, but the characters are: 광고
> | >
> | > (hope that comes through)
>
> It came through just fine, though I can't display it in my console.  I
> just found out that gvim can't display it either with my fontset.  It
> does handle UTF-8 well, though; and I double-checked the UTF-8
> decoding myself.  (read the UTF-8 RFC some time.  It's really short
> and kinda cool)
>
> | As best I can tell, your string was "SPACE ea b4 91 ea b3 a0" which
> | bears zero resemblance to any of the above, so I hope yours got
> | corrupted on the way here.  (Dude, don't send unquoted binary!)
>
> He didn't send unquoted binary.  I looked at the raw message myself,
> it was UTF-8 data transferred as Quoted-Printable.  There's only 2
> characters there (not counting any potential whitespace).
>
> Once you decode the UTF-8 into Unicode, you get ad11 ace0
> (16-bit chars, not 8-bit).
>
> For checking it in SA the various ISO encodings of Korean should be
> handled as well as UTF-8.
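The decoding dman describes can be checked by hand. This is a minimal Python sketch, assuming the Quoted-Printable text on the wire was `=EA=B4=91=EA=B3=A0` (inferred from the hex bytes quoted in the thread, not copied from the raw message). `decode_3byte` is a hypothetical helper added only to show the bit arithmetic from the UTF-8 RFC.

```python
import quopri

# Assumed Quoted-Printable form of the six bytes quoted in the thread.
qp_text = b"=EA=B4=91=EA=B3=A0"

# Step 1: undo the Quoted-Printable transfer encoding.
raw = quopri.decodestring(qp_text)
print(" ".join(f"{b:02x}" for b in raw))   # ea b4 91 ea b3 a0

# Step 2: decode the UTF-8 bytes into Unicode code points.
text = raw.decode("utf-8")
code_points = [format(ord(ch), "04x") for ch in text]
print(len(text), code_points)              # 2 ['ad11', 'ace0']


def decode_3byte(b0: int, b1: int, b2: int) -> int:
    """Decode one 3-byte UTF-8 sequence (1110xxxx 10xxxxxx 10xxxxxx)."""
    return ((b0 & 0x0F) << 12) | ((b1 & 0x3F) << 6) | (b2 & 0x3F)


print(format(decode_3byte(0xEA, 0xB4, 0x91), "04x"))   # ad11
print(format(decode_3byte(0xEA, 0xB3, 0xA0), "04x"))   # ace0

# Reading the same bytes as a single-byte charset yields six unrelated
# characters, one possible reason a reader would see garbage instead.
print(len(raw.decode("latin-1")))          # 6
```

The six bytes are two 3-byte sequences, which is why there are only two characters once the UTF-8 is decoded.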
If Craig would work on the email parser I posted to the dev list instead
of MIME-tools, this would be simpler: it decodes all character sets (even
ones embedded in headers) to UTF-8, which makes detecting alternate
character set stuff far easier.

Matt.

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
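To illustrate the idea of normalizing every character set to UTF-8 before matching, here is a rough Python sketch. It is not the parser Matt posted and not how MIME-tools or SpamAssassin work. `header_to_utf8` is a hypothetical helper, and the EUC-KR sample header is constructed here for the example rather than taken from any real message.

```python
import base64
from email.header import decode_header


def header_to_utf8(raw_header: str) -> bytes:
    """Decode RFC 2047 encoded-words in a header value and return UTF-8."""
    parts = []
    for chunk, charset in decode_header(raw_header):
        if isinstance(chunk, bytes):
            # Undecodable bytes become U+FFFD instead of raising.
            chunk = chunk.decode(charset or "ascii", errors="replace")
        parts.append(chunk)
    return "".join(parts).encode("utf-8")


# Build a sample encoded-word carrying U+AD11 U+ACE0 in EUC-KR (assumed
# here as a typical legacy Korean charset).
payload = "\uad11\uace0".encode("euc-kr")
encoded_word = "=?EUC-KR?B?" + base64.b64encode(payload).decode("ascii") + "?="

normalized = header_to_utf8(encoded_word)
print(" ".join(f"{b:02x}" for b in normalized))   # ea b4 91 ea b3 a0
```

Once the header is normalized, one UTF-8 pattern matches the string whether it arrived as EUC-KR or as UTF-8, so a rule does not need a variant per encoding.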