T.B.:
> Hello everyone,
> I have a question: I want to implement an attachment filter
> (mime_header_checks) that filters special Unicode "format characters".
> Examples:
> 0x202E (right-to-left override)
> 0x202B (right-to-left embedding)
> 0x202D (left-to-right override)
> 0x202A (left-to-right embedding)
> 
> Complete list here (page 4):
> http://www.unicode.org/charts/PDF/U2000.pdf
> 
> The reason is explained here:
> http://www.h-online.com/security/news/item/Backwards-Unicode-names-hides-malware-and-viruses-1242114.html
> 
> Any suggestions on how to do that?

According to RFC 2183/2184, content-disposition names containing
non-ASCII content must be encoded as ASCII strings. 

This means you may need to handle content-disposition names that
violate RFC 2183/2184, in addition to the correctly encoded forms
for UTF-8, UTF-16, and so on. I am not sure that regular expressions
are the right tool for this job.
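
To illustrate why decoding has to come before matching, here is a minimal
sketch in Python. It is not a Postfix feature and would have to run outside
mime_header_checks, for example in a content filter. The function names
(decode_name, has_format_chars, suspicious_attachment_names) are hypothetical.
It assumes that rejecting every character in Unicode general category "Cf"
(Format) is acceptable; that category includes U+202A, U+202B, U+202D and
U+202E, but also others such as the soft hyphen and zero-width joiners.

```python
import unicodedata
from email import message_from_string
from email.errors import HeaderParseError
from email.header import decode_header, make_header


def decode_name(name: str) -> str:
    """Decode RFC 2047 encoded-words that some senders put in a filename.

    This form violates RFC 2183/2184 but occurs in practice. If decoding
    fails, the raw name is returned unchanged.
    """
    try:
        return str(make_header(decode_header(name)))
    except (HeaderParseError, LookupError, UnicodeError):
        return name


def has_format_chars(name: str) -> bool:
    """Return True if any character is in Unicode category Cf (Format)."""
    return any(unicodedata.category(ch) == "Cf" for ch in name)


def suspicious_attachment_names(raw_message: str) -> list:
    """Return decoded attachment names that contain format characters."""
    msg = message_from_string(raw_message)
    hits = []
    for part in msg.walk():
        # get_filename() already collapses the RFC 2231 form
        # (filename*=charset'lang'percent-encoded-value).
        name = part.get_filename()
        if not name:
            continue
        name = decode_name(name)
        if has_format_chars(name):
            hits.append(name)
    return hits


def sample_message(disposition_params: str) -> str:
    """Build a tiny single-part test message (illustration only)."""
    return (
        "MIME-Version: 1.0\n"
        "Content-Type: application/octet-stream\n"
        "Content-Disposition: attachment; " + disposition_params + "\n"
        "\n"
        "body\n"
    )


if __name__ == "__main__":
    # U+202E is E2 80 AE in UTF-8, percent-encoded per RFC 2231.
    raw = sample_message("filename*=UTF-8''photo%E2%80%AEgpj.exe")
    print([ascii(n) for n in suspicious_attachment_names(raw)])
```

The same character never appears literally in either encoded header, which
is why a pattern that looks for the raw bytes of U+202E would miss both forms.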

        Wietse
