Hi all,

I have been using mutt for some time on a VT510 terminal (similar to 
https://en.wikipedia.org/wiki/VT520), and enjoying it. I have a display_filter 
set to reduce the emoji and other exotic characters coming through to a 
readable subset that my display can handle.

The display does not support UTF-8, so I had LC_ALL="en_US.ISO8859-1" in my 
~/.profile. This worked well for mutt.

However, today I received an email with the string "Don=E2=80=99t know when I 
will be there next.". This should display as something like "Don't know when I 
will be there next.". In my mutt terminal, it displayed:
> "Don???t know when I will be there next".
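
For reference, =E2=80=99 is quoted-printable for the bytes E2 80 99, which is 
the UTF-8 encoding of U+2019 (right single quotation mark). A minimal sketch to 
confirm this in a POSIX shell; the variable name smart_quote is only 
illustrative and not part of the original filter.sh:

```shell
# The three bytes that the quoted-printable sequence =E2=80=99 stands for,
# written as octal escapes because POSIX printf does not guarantee \x.
smart_quote=$(printf '\342\200\231')

# Dump the bytes in hex to check them against E2 80 99.
printf '%s' "$smart_quote" | od -An -tx1
```

U+2019 has no equivalent in ISO8859-1, so a conversion to that character set 
has to substitute something for it.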

Thinking this was odd, I dove into my filter.sh script, and discovered that no 
amount of hacking would enable me to filter out the '=E2=80=99' before display: 
there seemed to be some amount of parsing before my filter got hold of it. All 
that I could match on was '???', despite being able to edit the content of the 
mail itself and see the string '=E2=80=99'. My filter line of significance is:
> output=`echo "$output" | sed "s/[’‘]/$(echo "27" | xxd -p -r)/g"`
This replaces 'smart quotes' with their ASCII equivalents.
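
A byte-oriented variant of the same substitution might look like the sketch 
below. It assumes the filter actually receives UTF-8 bytes on stdin, and 
ascii_quotes is a hypothetical helper name, not something taken from the 
original filter.sh:

```shell
# Hypothetical helper: replace UTF-8 curly single quotes with an ASCII
# apostrophe. LC_ALL=C makes sed compare raw bytes, so the result does not
# depend on the locale the filter happens to run in.
ascii_quotes() {
    left=$(printf '\342\200\230')    # U+2018 left single quotation mark
    right=$(printf '\342\200\231')   # U+2019 right single quotation mark
    LC_ALL=C sed "s/$left/'/g; s/$right/'/g"
}

# Example: feed the helper a string containing U+2019.
printf 'Don\342\200\231t know when I will be there next.\n' | ascii_quotes
```

This only helps if the bytes reach the filter intact. If they have already 
been replaced with '???' before the filter runs, there is nothing left to 
match.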

I then discovered that all was well if I started mutt with 
LC_ALL="en_US.UTF-8". Assuming that the filter script simply needed the right 
character support, I added 'export LC_ALL="en_US.UTF-8"' to the top of the 
script, but this made no difference. After some experimenting, it seemed that 
the only way to get mutt to support the filtering of my problematic string was 
to start mutt itself with the required character encoding (UTF-8).

Is this correct and best practice, or have I missed something here? My 
installation is currently working by using the 'export LC_ALL="en_US.UTF-8"' 
line in my ~/.profile. However, this feels like bad practice, and if I've learnt 
one thing about character encoding, it is that incorrect configurations can 
bite you at surprising times.

Is anyone able to shed any light on how best to support UTF-8 in filtering when 
the terminal can only display something like ISO8859-1?

Many thanks,

-- 
Joel Buckley
