Hi Graeme,

The text I wrote is seen by you as:
> Here's a sentence ending in two spaces.  And a second sentence after
> which I'll put two non-breaking spaces.??Perhaps that shows a difference
> when you view it.

Software can be written to cope with different conventions on
interpreting byte values as characters.  Originally, much was ASCII or
EBCDIC.  Today, the world has moved to the UTF-8 encoding of Unicode for
interchange except where there's a legacy requirement or specialist
niche.

Software can pick up what interpretation you desire from your ‘locale’
defined by environment variables.  There's a handy locale(1) program to
print out the settings seen by that process.

    $ locale
    LANG=en_GB.utf8
    LC_CTYPE="en_GB.utf8"
    LC_NUMERIC="en_GB.utf8"
    LC_TIME="en_GB.utf8"
    LC_COLLATE="en_GB.utf8"
    LC_MONETARY="en_GB.utf8"
    LC_MESSAGES="en_GB.utf8"
    LC_PAPER="en_GB.utf8"
    LC_NAME="en_GB.utf8"
    LC_ADDRESS="en_GB.utf8"
    LC_TELEPHONE="en_GB.utf8"
    LC_MEASUREMENT="en_GB.utf8"
    LC_IDENTIFICATION="en_GB.utf8"
    LC_ALL=
    $

The output is subtly precise, e.g. those double-quotes are
significant.  The man page explains more.

Given environment variables are particular to a process, you could have
one program think it should use Brazilian Portuguese for error messages
but stick with Great British English for formatting money.
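As a rough sketch of that, assuming a glibc-style locale(1) and using
pt_BR.utf8 purely as an example name (it needn't be installed for the
settings to be printed, though locale may grumble on stderr), this sets
two categories for a single command only:

```shell
# Override two categories for one process; the parent shell is untouched.
# LC_ALL and LC_TIME are unset inside the subshell so they don't mask
# the per-category settings being demonstrated.
out=$(
    unset LC_ALL LC_TIME
    LANG=en_GB.utf8 LC_MESSAGES=pt_BR.utf8 LC_MONETARY=en_GB.utf8 \
        locale 2>/dev/null
)
printf '%s\n' "$out" | grep -E '^LC_(TIME|MONETARY|MESSAGES)='
```

I'd expect LC_MESSAGES and LC_MONETARY to come back without
double-quotes, as they were set explicitly, and LC_TIME to be quoted
because its value is only implied by LANG.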

My guess is your Thunderbird is in a locale which can't translate the
non-breaking space, Unicode U+00A0, in Terry's email and so substitutes
a ‘?’ to show something went awry.  Or it's been told to use
a particular locale in its settings.

What's the output of ‘locale’ for you in a terminal window?

> > $ printf '\u00a342\n'
> > £42
>
> Yes a pound sign.

I don't know of a character-set encoding which has £ but not
non-breaking space.  ISO 8859-1 has both.  So perhaps your terminal is
happy with Unicode and it's Thunderbird which has been told in some way
to not use Unicode?
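If you want to check the ‘ISO 8859-1 has both’ claim yourself, here's
a small sketch assuming iconv(1) and od(1) are installed.  The octal
escapes are the UTF-8 bytes for £, U+00A3, and non-breaking space,
U+00A0, so it doesn't rely on your shell's printf knowing \u.

```shell
# £ and non-breaking space are the byte pairs C2 A3 and C2 A0 in UTF-8.
utf8_hex=$(printf '\302\243\302\240' | od -An -tx1 | tr -d ' \n')

# Convert to ISO 8859-1; each character should shrink to a single byte.
latin1_hex=$(printf '\302\243\302\240' |
    iconv -f UTF-8 -t ISO-8859-1 | od -An -tx1 | tr -d ' \n')

echo "UTF-8:      $utf8_hex"
echo "ISO 8859-1: $latin1_hex"
```

Both characters survive the conversion, each as one byte, which is why
seeing £ but not the non-breaking space points away from the encoding
itself.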

One more terminal test; this one uses Unicode characters outside of
ISO 8859-1.  What do you see when you run this dc(1) command, compared
with what I see?  I see a tick and a cross.

    $ dc -e 16i5469636B7320E29C9320E29C970AP
    Ticks ✓ ✗
    $
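In case dc isn't installed, here's an equivalent sketch with printf.
As I understand it, the ‘16i’ tells dc to read hexadecimal and ‘P’
writes the number out as raw bytes, so the long number is simply the
UTF-8 encoding of the text.  The same bytes as octal escapes:

```shell
# E2 9C 93 is the tick, U+2713; E2 9C 97 is the cross, U+2717.
printf 'Ticks \342\234\223 \342\234\227\n'

# Dump the same bytes as hex to compare with the digits given to dc.
hex=$(printf 'Ticks \342\234\223 \342\234\227\n' | od -An -tx1 | tr -d ' \n')
echo "$hex"
```

The hex dump should match the digits in the dc command, ignoring case.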

-- 
Cheers, Ralph.

-- 
  Next meeting: Online, Jitsi, Tuesday, 2025-01-07 20:00
  Check to whom you are replying
  Meetings, mailing list, IRC, ...  https://dorset.lug.org.uk
  New thread, don't hijack:  mailto:dorset@mailman.lug.org.uk
