On 1/24/11 8:07 PM, Mark Martinec wrote:
Jeroen Geilman wrote:
Urgh. Which RFC are you reading?
I quote:
Systems MUST NOT define mailboxes in such a way as to require the use
in SMTP of non-ASCII characters
True (tell that to the generators of malicious mail, or to merely
incompetent sending software).
This does not prevent illegal data from appearing on the wire at the
receiving MTA.
Receiving such data must not cause the MTA or the SQL database to
malfunction.
I see; I thought you meant that database queries should process such a
lookup key as if it were valid.
If the SMTP side doesn't need to, it stands to reason a database lookup
doesn't either.
But I understand this could create the aforementioned problems if the
database output could not be guaranteed to be ASCII.
There is also an initiative to allow UTF-8 characters to appear in SMTP
(RFC 5336 and related documents). Malformed UTF-8 could easily
appear there, despite being prohibited. If an SQL database declared
its e-mail address field with a UTF-8 data type, a lookup could
abort when given such invalid data.
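
To make the failure mode concrete, here is a minimal sketch of checking a
lookup key before it reaches a UTF-8 typed column. The helper names
(utf8_lookup_key, lookup) and the dict standing in for the SQL table are
hypothetical, chosen only for illustration; this is not how any
particular MTA implements its lookups.

```python
from typing import Dict, Optional


def utf8_lookup_key(raw: bytes) -> Optional[str]:
    """Return the address as text if it is well-formed UTF-8, else None."""
    try:
        # Strict decoding rejects truncated or otherwise malformed sequences.
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return None


def lookup(raw: bytes, table: Dict[str, str]) -> Optional[str]:
    """Look up an address; a malformed key is treated as 'not found'.

    The dict is a stand-in (an assumption) for the real SQL table.
    """
    key = utf8_lookup_key(raw)
    if key is None:
        return None  # never hand invalid bytes to the database
    return table.get(key)
```

With a check like this, malformed input produces an ordinary "not found"
result instead of an aborted query.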
How would this be processed? If anything outside of US-ASCII is
encountered, is the data assumed to be UTF-8?
That would be possible, of course, since ASCII maps unchanged onto
UTF-8's single-byte range (0x00-0x7F).
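
A rough sketch of that rule, assuming a three-way outcome (the function
name classify_address and the labels it returns are made up for this
example): pure 7-bit data is ASCII, anything else is accepted only if it
decodes as strict UTF-8.

```python
def classify_address(raw: bytes) -> str:
    """Classify raw address bytes as 'ascii', 'utf-8' or 'invalid'."""
    # Every byte below 0x80 is plain US-ASCII, which is also valid UTF-8.
    if all(b < 0x80 for b in raw):
        return "ascii"
    try:
        # High-bit bytes are only acceptable as well-formed UTF-8 sequences.
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return "invalid"
    return "utf-8"
```

A successful decode only shows that the bytes are well-formed UTF-8, not
that the sender actually intended UTF-8.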
--
J.