dman wrote:
> On Thu, May 23, 2002 at 06:23:40PM -0700, Daniel Quinlan wrote:
> | Jason Baker <[EMAIL PROTECTED]> writes:
> 
> | > denoting it.  I don't read/speak Korean, so I have no idea what
> | > exactly it is, but the characters are: 광고
> | > 
> | > (hope that comes through)
> 
> It came through just fine, though I can't display it in my console.  I
> just found out that gvim can't display it either with my fontset.  It
> does handle UTF-8 well, though; and I double-checked the UTF-8
> decoding myself.  (read the UTF-8 RFC some time.  It's really short
> and kinda cool)
> 
> | As best I can tell, your string was "SPACE ea b4 91 ea b3 a0" which
> | bears zero resemblance to any of the above, so I hope yours got
> | corrupted on the way here.  (Dude, don't send unquoted binary!)
> 
> He didn't send unquoted binary.  I looked at the raw message myself,
> it was UTF-8 data transferred as Quoted-Printable.  There's only 2
> characters there (not counting any potential whitespace).
> 
> Once you decode the UTF-8 into Unicode, you get  ad11 ace0
> (16-bit chars, not 8-bit).
> 
> For checking it in SA the various ISO encodings of Korean should be
> handled as well as UTF-8.
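
The decoding described above can be reproduced with a short Python
sketch. The Quoted-Printable string is an assumption reconstructed from
the bytes quoted in the thread (ea b4 91 ea b3 a0), not a copy of the
raw message, and the variable names are illustrative only.

```python
import quopri

# Assumed wire form: the six bytes quoted in the thread, written as
# Quoted-Printable. This is a reconstruction, not the original message body.
raw_qp = b"=EA=B4=91=EA=B3=A0"

# Step 1: undo the Quoted-Printable transfer encoding to get raw bytes.
utf8_bytes = quopri.decodestring(raw_qp)

# Step 2: interpret those bytes as UTF-8 to get Unicode characters.
text = utf8_bytes.decode("utf-8")

# Step 3: list each character's code point as 4-digit hex.
code_points = [format(ord(ch), "04x") for ch in text]

print(" ".join(format(b, "02x") for b in utf8_bytes))
print(len(text), code_points)
```

Six bytes on the wire become two characters, which matches the count
given above.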

If Craig would work on the email parser I posted to the dev list instead 
of MIME-tools: it decodes all character sets (even ones embedded in 
headers) to UTF-8, which makes detecting alternate character sets 
far easier.
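
To illustrate the general idea of normalizing everything to UTF-8
(this is not the parser mentioned above, only a rough sketch using
Python's standard library), the helper names `encoded_word` and
`header_to_utf8` below are hypothetical. The same two characters are
wrapped as RFC 2047 encoded words in two different charsets and both
come out as identical UTF-8 bytes, so a single check can cover both.

```python
import base64
from email.header import decode_header


def encoded_word(text: str, charset: str) -> str:
    """Build an RFC 2047 base64 encoded word (hypothetical test helper)."""
    payload = base64.b64encode(text.encode(charset)).decode("ascii")
    return f"=?{charset}?B?{payload}?="


def header_to_utf8(raw_header: str) -> bytes:
    """Decode every charset-tagged chunk of a header and re-encode as UTF-8."""
    parts = []
    for chunk, charset in decode_header(raw_header):
        if isinstance(chunk, bytes):
            # Untagged byte chunks are treated as ASCII (an assumption).
            chunk = chunk.decode(charset or "ascii", errors="replace")
        parts.append(chunk)
    return "".join(parts).encode("utf-8")


sample = "\uad11\uace0"  # the two characters discussed in this thread
from_euc_kr = header_to_utf8(encoded_word(sample, "euc-kr"))
from_utf8 = header_to_utf8(encoded_word(sample, "utf-8"))

# Both inputs normalize to the same byte sequence.
print(from_euc_kr == from_utf8)
```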

Matt.


_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
