Re: Lies in education [was Re: The "loop and a half"]

Peter J. Holzer Fri, 13 Oct 2017 03:13:24 -0700

On 2017-10-13 05:28, Gregory Ewing <greg.ew...@canterbury.ac.nz> wrote:
> Grant Edwards wrote:
>> On 2017-10-13, Stefan Ram <r...@zedat.fu-berlin.de> wrote:
>>>      1 byte
>>>
>>>      addressable unit of data storage large enough to hold
>>>      any member of the basic character set of the execution
>>>      environment
>>>
>>>    ISO C standard
>
> Hmmm. So an architecture with memory addressed in octets
> and Unicode as the basic character set would have a
> char of 8 bits and a byte of 32 bits?


No, because a char is also "large enough to store any member of the
basic execution character set" (§6.2.5). A "byte" is just the amount of
storage a "char" occupies:

| The sizeof operator yields the size (in bytes) of its operand
[...]
| When applied to an operand that has type char, unsigned char, or signed
| char, (or a qualified version thereof) the result is 1. 
    (§6.5.3.4)
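
A minimal sketch of what those two quotes imply, assuming a C11 compiler.
The helper name bits_in_bytes is made up for illustration; only sizeof,
CHAR_BIT and static_assert come from the standard.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* CHAR_BIT: number of bits in a byte */
#include <stddef.h>   /* size_t */

/* sizeof counts in bytes, and a byte is the storage a char occupies,
   so these hold on every conforming implementation. */
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");
static_assert(sizeof(signed char) == 1, "same for signed char");
static_assert(sizeof(unsigned char) == 1, "same for unsigned char");

/* The standard only guarantees a lower bound on the width of a byte. */
static_assert(CHAR_BIT >= 8, "a byte has at least 8 bits");

/* Hypothetical helper: width in bits of an object that is n bytes long. */
size_t bits_in_bytes(size_t n)
{
    return n * CHAR_BIT;
}
```

On a machine with 32-bit bytes, bits_in_bytes(sizeof(char)) would be 32,
and sizeof(char) would still be 1.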

So if a C implementation used Unicode as the basic character set, a byte
would have to be at least 21 bits (the highest code point, U+10FFFF,
needs 21 bits), a char the same, and the size of every other type
would have to be a multiple of that. On any modern architecture that
would be rounded up to 32 bits. (I am quite certain that there was at
least one computer with a 21-bit word size, but I can't find it: lots of
18-bit and 24-bit machines, but nothing in between.)
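
The arithmetic can be checked with a few lines of C. This is only a
sketch: both function names are invented here, and rounding up to a power
of two is my assumption about what a "modern architecture" would do, not
something the standard requires.

```c
#include <assert.h>

/* Number of bits needed to represent every value in 0..max_value.
   Returns 0 for max_value == 0. */
unsigned bits_needed(unsigned long max_value)
{
    unsigned bits = 0;
    while (max_value != 0) {
        bits++;
        max_value >>= 1;
    }
    return bits;
}

/* Smallest power of two that is >= bits (assumed rounding rule). */
unsigned round_up_pow2(unsigned bits)
{
    unsigned width = 1;
    while (width < bits)
        width *= 2;
    return width;
}

/* Restates the numbers from the paragraph above as assertions. */
void check_unicode_widths(void)
{
    assert(bits_needed(0x10FFFFUL) == 21);               /* all of Unicode */
    assert(round_up_pow2(bits_needed(0x10FFFFUL)) == 32);
}
```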

An implementation could also choose the BMP (Basic Multilingual Plane,
U+0000 to U+FFFF) as the basic character set and the rest of Unicode as
the extended character set. That would result in a 16-bit byte and char
(and most likely UTF-16 as the multibyte character representation).
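
To illustrate what "multibyte" would mean on such an implementation, here
is a sketch of UTF-16 encoding. The function name utf16_encode is
hypothetical. I use uint_least16_t as a stand-in for the 16-bit char of
that imaginary implementation, since the sketch has to compile on
ordinary machines too.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode code point as UTF-16.
   Writes 1 or 2 code units to out and returns how many were written.
   Precondition: cp is a valid scalar value (not a surrogate, <= U+10FFFF). */
size_t utf16_encode(uint_least32_t cp, uint_least16_t out[2])
{
    assert(cp <= 0x10FFFFUL);
    assert(cp < 0xD800UL || cp > 0xDFFFUL);

    if (cp < 0x10000UL) {
        /* Inside the BMP: one code unit, i.e. one 16-bit "byte". */
        out[0] = (uint_least16_t)cp;
        return 1;
    }

    /* Outside the BMP: split the remaining 20 bits into a surrogate pair. */
    cp -= 0x10000UL;
    out[0] = (uint_least16_t)(0xD800UL + (cp >> 10));    /* high surrogate */
    out[1] = (uint_least16_t)(0xDC00UL + (cp & 0x3FFUL)); /* low surrogate */
    return 2;
}
```

So U+00E9 would fit in a single char there, while U+1F600 would need two
(0xD83D 0xDE00).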


> Not only does "byte" not always mean "8 bits", but
> "char" isn't always short for "character"...

True. A character often occupies more space than a char, and you can
store non-character data in a char.
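
Both halves of that can be shown in a short sketch. It assumes the usual
8-bit char and UTF-8 as the multibyte encoding; the names e_acute,
char_count and round_trip_through_chars are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* U+00E9 (e with acute accent) in UTF-8, written as hex escapes so the
   example does not depend on the encoding of the source file. */
static const char e_acute[] = "\xC3\xA9";

/* Number of chars (not characters) before the terminating null. */
size_t char_count(const char *s)
{
    return strlen(s);
}

/* Non-character data in chars: copy the object representation of an int
   into an array of unsigned char and back again. */
int round_trip_through_chars(int value)
{
    unsigned char raw[sizeof value];
    int copy;

    memcpy(raw, &value, sizeof value);
    memcpy(&copy, raw, sizeof copy);
    return copy;
}

/* One character, but two chars plus the terminator. */
void check_char_vs_character(void)
{
    assert(char_count(e_acute) == 2);
    assert(sizeof e_acute == 3);
}
```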

        hp


-- 
   _  | Peter J. Holzer    | The curse of electronic word processing:
|_|_) |                    | you keep filing away at your text until
| |   | h...@hjp.at         | the parts of the sentence no longer
__/   | http://www.hjp.at/ | fits together. -- Ralph Babel
