Re: Byte swapping support

David Brown Thu, 14 Sep 2017 01:59:23 -0700

On 12/09/17 20:56, Michael Meissner wrote:
> On Tue, Sep 12, 2017 at 05:26:29PM +0200, David Brown wrote:
>> On 12/09/17 16:15, [email protected] wrote:
>>>
>>>> On Sep 12, 2017, at 5:32 AM, Jürg Billeter 
>>>> <[email protected]> wrote:
>>>>
>>>> Hi,
>>>>
>>>> To support applications that assume big-endian memory layout on little-
>>>> endian systems, I'm considering adding support for reversing the
>>>> storage order to GCC. In contrast to the existing scalar storage order
>>>> support for structs, the goal is to reverse the storage order for all
>>>> memory operations to achieve maximum compatibility with the behavior on
>>>> big-endian systems, as far as observable by the application.
>>>
>>> I've done this in the past by C++ type magic. As a general setting
>>> it doesn't make sense that I can see. As an attribute applied to a
>>> particular data item, it does. But I'm not sure why you'd put this in
>>> the compiler when programmers can do it easily enough by defining a "big
>>> endian int32" class, etc.
>>>
>>
>> Some people use the compiler for C rather than C++ ...
>>
>> If someone wants to improve on the endianness support in gcc, I can
>> think of a few ideas that /I/ think might be useful.  I have no idea how
>> difficult they might be to put in practice, and can't say if they would
>> be of interest to others.
>>
>> First, I would like to see endianness given as a named address space,
>> rather than as a type attribute.  A key point here is that named address
>> spaces are effectively qualifiers, like "const" and "volatile" - and you
>> can then use them in pointers:
> 
> When I gave the talk at the 2009 GCC summit on the named address support, I
> thought that it could be used to add endianness support.  In fact at one time, I
> had a trial PowerPC compiler that added endianness support.  Unfortunately, that
> was in 2009, and I lost the directory of the work.  I tried again a few years
> ago, but I didn't get far enough into it to get a working compiler before being
> pulled back into work.
> 
> Back when I worked at Cygnus Solutions, we used to get requests to add endian
> support every so often, but nobody wanted to pay the cost that we were then
> quoting to add the support.  Now that named address support is in, it could be
> done better.
> 
> I suspect however, you want to do this at the higher tree level, adding in the
> endianness bits in a separate area from the named address support.  Or perhaps,
> growing the named address support, and adding several standard named addresses.
> 
> The paper where I talked about the named address support was from the 2009 GCC
> summit.  You can download the proceedings of the 2009 summit from here (my
> paper is pages 67-74):
> https://en.wikipedia.org/wiki/GCC_Summit


It's nice to know I am not the only one who thinks named address spaces
are a logical way to go for endianness support.

I fully appreciate, of course, that even if named address spaces are
nicer for users, a balance must be found for the practicality of the
implementation.  If today's type attribute is easier to implement and
maintain, then that is entirely understandable.  This is not a big issue
as far as I am concerned - it does not make sense to implement it unless
someone has the time and inclination, or someone is willing to pay for
the time.

> 
>> big_endian uint32_t be_buffer[20];
>> little_endian uint32_t le_buffer[20];
>>
>> void copy_buffers(const big_endian uint32_t * src,
>>                      little_endian uint32_t * dest)
>> {
>>      for (int i = 0; i < 20; i++) {
>>              dest[i] = src[i];       // Swaps endianness on copy
>>      }
>> }
>>
>> That would also let you use them for scalar types, not just structs, and
>> you could use typedefs:
>>
>>      typedef big_endian uint32_t be_uint32_t;
>>
>>
>> Secondly, I would add more endian types.  As well as big_endian and
>> little_endian, I would add native_endian and reverse_endian.  These
>> could let you write slightly clearer definitions sometimes.  And ideally
>> I would like mixed endian with big-endian 16-bit ordering and
>> little-endian ordering for bigger types (i.e., 0x87654321 would be
>> stored 0x43, 0x21, 0x87, 0x65).  That order matches some protocols, such
>> as Modbus.
> 
> It depends, you can add so many different combinations, that in the end you
> don't add the support you want because of the 53 other variants.
> 

I'd stick to the basic four, and perhaps add two more ("PDP endian", and
a sort of reverse PDP endian).  I can't see there being scope for so
very many different endiannesses, but I suppose one could mix in bit
endianness to the pile.
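
For illustration, a minimal C sketch of the byte layouts under discussion:
big-endian, little-endian, the Modbus-style mixed order described in the
quoted text (low 16-bit word first, each word big-endian), and PDP order
(high 16-bit word first, each word little-endian).  The function names are
hypothetical; nothing here is an existing GCC feature.

```c
#include <stdint.h>

/* Big-endian: most significant byte first. */
void store_be32(uint8_t out[4], uint32_t v)
{
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
}

/* Little-endian: least significant byte first. */
void store_le32(uint8_t out[4], uint32_t v)
{
    out[0] = (uint8_t)v;
    out[1] = (uint8_t)(v >> 8);
    out[2] = (uint8_t)(v >> 16);
    out[3] = (uint8_t)(v >> 24);
}

/* Modbus-style mixed order: low 16-bit word first, each word big-endian. */
void store_mixed32(uint8_t out[4], uint32_t v)
{
    out[0] = (uint8_t)(v >> 8);
    out[1] = (uint8_t)v;
    out[2] = (uint8_t)(v >> 24);
    out[3] = (uint8_t)(v >> 16);
}

/* PDP order: high 16-bit word first, each word little-endian. */
void store_pdp32(uint8_t out[4], uint32_t v)
{
    out[0] = (uint8_t)(v >> 16);
    out[1] = (uint8_t)(v >> 24);
    out[2] = (uint8_t)v;
    out[3] = (uint8_t)(v >> 8);
}
```

With 0x87654321 as input, store_mixed32 gives 0x43, 0x21, 0x87, 0x65 (the
order quoted above) and store_pdp32 gives 0x65, 0x87, 0x21, 0x43.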

> Note, if you use it in named addresses, you are currently limited to 15 new
> keywords for adding named address support.  This can be grown, but you should
> know about the limit ahead of time.  Of course if you add it in a parallel,
> machine independent version, then you don't have to worry about the existing
> limits.

I am sure that if gcc started using more named address spaces as a way
of extending the compiler (an alternative to attributes), then the limit
here would be raised.

> 
> As I write this, another usage for named addresses occurs to me -- and that is
> restricting addressing for memory mapped I/O regions that don't allow certain
> types of accesses.
> 

That is certainly a possibility.  Today, memory mapped IO is usually
done using "volatile" accesses - and possibly "const volatile" for
read-only registers.  For some targets, however, that is not enough.
There are some that use different types of instructions for accessing
registers, and you may also want synchronisation instructions to enforce
ordering on the accesses (like the PowerPC's marvellously named "EIEIO"
instruction).  Another example would be cpu special registers.
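
As a rough sketch of the conventional idiom, assuming a made-up UART-like
device: the register layout, the status bit and the function name below are
all hypothetical, and a real driver would take them from the device's
datasheet.

```c
#include <stdint.h>

/* Hypothetical register block, used only for illustration. */
struct uart_regs {
    volatile uint32_t data;          /* read/write data register */
    const volatile uint32_t status;  /* read-only status register */
};

#define UART_STATUS_TX_READY 0x1u    /* hypothetical "transmitter ready" bit */

/*
 * Write one byte if the transmitter reports ready.
 * Returns 1 if the byte was written, 0 otherwise.
 * "volatile" makes the compiler perform each access exactly as written,
 * but it does not emit any barrier or synchronisation instruction.
 */
int uart_try_send(struct uart_regs *uart, uint8_t byte)
{
    if ((uart->status & UART_STATUS_TX_READY) == 0)
        return 0;
    uart->data = byte;
    return 1;
}
```

On real hardware the pointer would refer to the device's fixed address; for
trying the code out, an ordinary struct object can stand in for it.  Any
ordering instruction the target needs still has to be added by hand, which is
the gap a dedicated address space could in principle close.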

>> Third, I'd like to be able to attach the attribute to particular
>> variables and to scalars, not just struct and union types.
>>
>> Fourth, I would like type-punning through unions with different ordering
>> to be allowed.  I'd like to be able to define:
>>
>> union U {
>>     __attribute__((scalar_storage_order("big-endian"))) uint16_t
>>              protocol[32];
>>     __attribute__((scalar_storage_order("little-endian"))) struct {
>>      uint32_t id;
>>      uint16_t command;
>>      uint16_t param1;
>>      ...
>>     };
>> };
> 
> This definitely requires support at the higher levels of the compiler.
> 
>> and then I could access data in little ordering in the structures, then
>> in 16-bit big-endian lumps via the "protocol" array.
> 
> One of the things you have to do is be prepared to do a full sweep of your
> backend to make sure you only use the named address memory functions and don't
> use the traditional functions that pass 0 for the named address.
> 

I know nothing about the implementation in the compiler here, so if you
say this is harder to achieve, I believe you.  I know the kind of source
code I think it would be nice to write, and I know the kind of assembly
code I would expect to get out on a number of processors, but I have
very little idea about what should happen in the middle!
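
For comparison, here is a sketch of what the copy_buffers example quoted
earlier looks like in plain C today, with the swap spelled out by hand.  The
names swap32 and copy_swapped are hypothetical, and the element count is
passed in rather than fixed at 20.

```c
#include <stddef.h>
#include <stdint.h>

/* Reverse the byte order of a 32-bit value using shifts and masks. */
uint32_t swap32(uint32_t v)
{
    return (v >> 24)
         | ((v >> 8) & 0x0000FF00u)
         | ((v << 8) & 0x00FF0000u)
         | (v << 24);
}

/* Copy n 32-bit words from src to dest, reversing the bytes of each word. */
void copy_swapped(const uint32_t *src, uint32_t *dest, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dest[i] = swap32(src[i]);
}
```

GCC also provides __builtin_bswap32 for the same operation.  The point of the
qualifier-style syntax is that the plain assignment dest[i] = src[i] would
carry the conversion, so a missing or doubled swap could not slip in
unnoticed.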

mvh.,

David
