On 6 Apr 2019, at 7:27, Randy Bush wrote:
i receive an email
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:52.0)
Gecko/20100101 PostboxApp/6.1.13
MIME-Version: 1.0
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 8bit
Content-Language: en-US
the text has funny space characters that i see if i save the text to
disk and look at it with less
<A0>0.<A0><A0> flo....: 2.31 2018.11.03
<A0><A0><A0><A0><A0><A0><A0><A0><A0><A0> 1.<A0><A0> CLIMATE ACTION
<A0><A0><A0><A0><A0><A0><A0><A0><A0><A0><A0><A0><A0> * (N)ew (M)odify (D)elete..: N
<A0><A0><A0><A0><A0><A0><A0><A0><A0><A0> 2. * NAME OF CLOUD: cumulus
i presume the sender is thunderbird and they have created the text with
some sort of windows encoding on a mac?
Apparently. I've seen this in mail recently as well and suspect it may
be a new default in Postbox or TBird or both to use windows-1252 instead
of UTF-8. Windows-1252 is a close relative of ISO 8859-1 (Latin-1) and
0xA0 is a non-breaking space in both.
how can i save the content as vanilla ascii text?
Well, technically you can't represent a non-breaking space in ASCII
because no such character is defined in ASCII, so there's no way to
convert text with 0xA0 characters into "vanilla ascii" text.
You can convert 0xA0 to 0x20 (ASCII space) by piping the text through:
tr '\240' ' '
However, doing the reverse could create a mess for anything that
understands the difference between a regular space and a non-breaking
space.
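
If you'd rather do the same replacement in a script, here is a rough
Python sketch. It assumes the saved file holds the raw windows-1252
bytes of the message body; the function name to_plain_ascii is made up
for illustration. Like the tr command, it only maps the non-breaking
space to a regular space. Unlike tr, it raises an error if any other
non-ASCII character is left over, rather than silently passing it
through.

```python
def to_plain_ascii(raw: bytes, source_encoding: str = "cp1252") -> str:
    """Decode raw bytes, map NBSP to a regular space, and check for pure ASCII.

    Hypothetical helper: assumes the input really is windows-1252.
    """
    text = raw.decode(source_encoding)
    # 0xA0 decodes to U+00A0 (NO-BREAK SPACE); swap it for U+0020.
    text = text.replace("\u00a0", " ")
    # Raises UnicodeEncodeError if any other non-ASCII character remains.
    text.encode("ascii")
    return text


# Sample bytes modelled on the first line shown by less in the question.
sample = b"\xa00.\xa0\xa0 flo....: 2.31 2018.11.03\n"
print(to_plain_ascii(sample), end="")
```

Whether raising on leftover non-ASCII is what you want depends on the
mail; curly quotes, for instance, would need their own replacements.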
Tools which support POSIX character classes and locales should treat
0xA0 as whitespace (class '[:space:]') if LC_ALL or LC_CTYPE is a
Latin-1 locale, e.g. en_US.ISO8859-1.
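
A loose analogue in Python, offered as a sketch rather than as a
demonstration of POSIX locale behavior: byte-oriented methods only
recognize ASCII whitespace, while string methods and the \s regex class
treat U+00A0 as whitespace once the bytes have been decoded. The helper
name collapse_whitespace is hypothetical, and Latin-1 decoding is an
assumption that works here because 0xA0 means the same thing in Latin-1
and windows-1252.

```python
import re


def collapse_whitespace(raw: bytes, encoding: str = "latin-1") -> str:
    """Decode, then squeeze every whitespace run (NBSP included) to one space."""
    text = raw.decode(encoding)
    # For str patterns, \s matches Unicode whitespace, which includes U+00A0.
    return re.sub(r"\s+", " ", text).strip()


# Sample bytes modelled on a line from the question.
line = b"\xa0\xa0 2. * NAME OF CLOUD: cumulus"

# bytes.isspace() only knows ASCII whitespace, so 0xA0 does not count.
print(line[:1].isspace())
# After decoding, str.isspace() goes by Unicode properties instead.
print(line.decode("latin-1")[0].isspace())
print(collapse_whitespace(line))
```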
--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Available For Hire: https://linkedin.com/in/billcole
_______________________________________________
mailmate mailing list
mailmate@lists.freron.com
https://lists.freron.com/listinfo/mailmate