[Note: I started writing this e-mail message before receiving the reply
 you sent privately, so this message doesn't address the comments in
 that reply.]

Having downloaded tex4ht source, I see that \email magic is handled in
part by \Link[mailto:#1]{}{}.

In my previous message, I wrote that the closing ‘"’ appears where the
first space is.  Seeing that ‘#1’ usage makes me wonder whether a more
accurate characterization would be ‘after the first comma’ rather than
‘where the first space is’; but I've just now tried removing the spaces
and retaining the commas, and in that case it produces correct HTML
(though still with an invalid mailto URI, of course):

  <a 
  href="mailto:\protect \T1\textbraceleft 
Kim.Marriott,Peter.Moulder,Nathan.Hurst\protect \T1\textbraceright 
@infotech.monash.edu.au" >

So ‘where the first space is’ does seem to be the correct
characterization.

If instead I wrap the \email argument in a pair of braces while
retaining the spaces after the commas:

  \email{{\{Kim.Marriott, Peter.Moulder, [EMAIL PROTECTED]

then again we get correct HTML (with an invalid mailto URI):

  <a 
  href="mailto:{\protect \T1\textbraceleft Kim.Marriott, Peter.Moulder, 
Nathan.Hurst\protect \T1\textbraceright @infotech.monash.edu.au}" >

My LaTeX skills are not good enough to know how tex4ht should be changed to,
say, give an error for the original construct (the one that lacked the
protective brace wrapping), let alone to behave the same as with the extra
braces.  (The latter would be nice given that a standard pdflatex run on the
file appears to produce correct results without the extra brace wrapping; but
aborting with an error message would also be reasonable behaviour.)


The other half of the \email problem is that the produced mailto URIs
have insufficient quoting.  According to the relevant RFC
(http://tools.ietf.org/html/rfc2368):

  all URL reserved characters in [the destination part of the URL] must
  be encoded: in particular, parentheses, commas, and the percent sign
  ("%"), which commonly occur in the "mailbox" syntax.

The RFC also gives the example of mailto:addr1%2C%20addr2 to represent a
destination of addr1, addr2.
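
To make the quoting concrete, here is a minimal sketch in Python (not
tex4ht code, and not a proposal for how tex4ht should implement it).
The function name mailto_uri is hypothetical, and leaving "@" unencoded
is my assumption, based on the RFC's own examples such as
mailto:chris@example.com.  Everything else outside letters, digits and
"_.-~" gets percent-encoded:

```python
from urllib.parse import quote

def mailto_uri(addresses):
    """Build a mailto URI from a list of addresses (hypothetical helper)."""
    # Join the addresses the way they would appear in a To: header.
    destination = ", ".join(addresses)
    # Percent-encode commas, spaces, braces, "%", etc.
    # Assumption: "@" is left as-is, as in the RFC 2368 examples.
    return "mailto:" + quote(destination, safe="@")

print(mailto_uri(["addr1", "addr2"]))  # mailto:addr1%2C%20addr2
```

This is stricter than necessary if one takes RFC1738's view of which
characters are safe, but it reproduces the RFC2368 example exactly.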

See §2.2 of RFC1738 (http://tools.ietf.org/html/rfc1738) for "URL
reserved characters".  Note, however, that its second-last paragraph
isn't an entirely accurate summary of the rest of the section; in
particular, it's not necessary to escape [EMAIL PROTECTED]  One might also
note that RFC1738 says it's always safe to leave ASCII comma unencoded,
despite its encoding in the example I cited above (previous paragraph).

pjrm.

