On 2008-08-07, Moritz Barsnick <[EMAIL PROTECTED]> wrote:
> I use the attached lines in my ~/.mailcap to use w3m as an HTML viewer.
> I also have another entry (earlier within mailcap, not attached here)
> which tests for a display and starts w3m in a separate xterm.
>
> One rule "dump"s the rendered HTML page in the mail view, which is very
> often sufficient. If I want to open it in a browsable way (with the
> requested hyperlinks), I press "v" (<view-attachments>) and press
> return (<view-attach>) on the HTML attachment.
>
> Let me capture this thread and ask a question: How can mutt tell w3m
> which charset to use? In the attached mailcap, I force ISO-8859-1, but
> that's just a hack. In true honesty, I get some ISO and some UTF-8
> coded HTML attachments. Assuming the attachment is correctly
> MIME-denoted in the header (which is mostly the case), how can I pass
> this on to w3m? Any ideas?
>
> text/html; w3m -T text/html -I ISO-8859-1 -o frame=0 -o meta_refresh=0 -o auto_image=0 %s; needsterminal; \
>         description=HTML Text; nametemplate=%s.html
> text/html; w3m -T text/html -I ISO-8859-1 -o frame=0 -o meta_refresh=0 -o auto_image=0 -dump %s; copiousoutput; \
>         description=HTML Text; nametemplate=%s.html

The last time I tried to do this, I couldn't get w3m to properly display
some character sets, so I gave up trying with w3m as a browser and wrote
a script around "w3m -dump". From my mailcap file:

    text/html; w3m %s; nametemplate=%s.html
    text/html; html2text %{charset} %s; \
            nametemplate=%s.html; \
            copiousoutput

The core of the html2text script is this:

    charset="$1"

    # Implement the equivalent of mutt's charset-hook for these:
    #
    #   charset-hook ^us-ascii$ windows-1252
    #   charset-hook ^iso-8859-1$ windows-1252
    #
    case $charset in
        us-ascii)       charset=windows-1252;;
        US-ASCII)       charset=windows-1252;;
        iso-8859-1)     charset=windows-1252;;
        ISO-8859-1)     charset=windows-1252;;
        ks_c_5601-1987) charset=gb2312;;
        KS_C_5601-1987) charset=gb2312;;
    esac

    file="$2"

    # $w3m_args is left unquoted on purpose so it splits into separate options.
    w3m_args="-dump -T text/html -o frame=0 -o meta_refresh=0 -o auto_image=0 -I $charset -O $charset"

    w3m $w3m_args "$file" | iconv -c -f "$charset" -t ISO-8859-1//TRANSLIT

Note that rather than have w3m perform the charset conversion, I told
w3m to leave the charset unchanged (-I and -O are set to the same value)
and instead let iconv do the conversion. This is admittedly a hack, but
it has worked well for displaying HTML in mutt's pager. The //TRANSLIT
suffix is important for me to be able to display M$ characters in my
environment, and as I recall, w3m doesn't understand //TRANSLIT.

Regards,
Gary
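
P.S. The case statement above lists each charset in both lower and upper case, so a mixed-case spelling such as "Iso-8859-1" would slip through. As a rough sketch only, and not what the script actually does, the mapping could be folded into a small function that lowercases the name before matching. The function name normalize_charset is hypothetical; the mappings themselves are the ones from the script.

```shell
#!/bin/sh
# Hypothetical helper, not part of the original html2text script.
# Maps a MIME charset name to the name handed to w3m and iconv.
normalize_charset() {
    # Lowercase a copy of the name so one pattern covers every spelling.
    cs=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case $cs in
        us-ascii|iso-8859-1) printf '%s\n' windows-1252 ;;
        ks_c_5601-1987)      printf '%s\n' gb2312 ;;
        # Anything else is passed through exactly as given.
        *)                   printf '%s\n' "$1" ;;
    esac
}
```

In the script this would replace the case block with something like charset=$(normalize_charset "$1"); the rest stays the same.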