Zachary Turner wrote:

> I guess the same reason people would want any asm functions in C
> source code.  Sometimes it's just the best way to express something.
> Like in the example I mentioned, I could write 4 different functions
> in assembly, one for each size suffix, wrap them all up in a separate
> assembly language file but IMHO it's more readable, quicker to code,
> and more expressive to use a template switch like I've done.  C++ is
> built on the philosophy of giving you enough rope to hang yourself
> with.
> 
> I don't think there's a better way to express the selection of an
> instruction based on operand size than through a naked template
> specialization.
> 
> Using a .s file is more difficult to port across different compilers.
> Many compilers provide support for naked functions and it's easy to
> just use a #ifdef to check which compiler you're running on and define
> the appropriate naked declaration string.
> 
> Besides, it's supported for embedded architectures, it's frustrating
> because it feels like back in the days of a 386SX's where the
> processors had working FPUs on them but they were switched off "just
> because".  All the investment has already been done to add support for
> naked functions, so I think people should be "permitted" to use it,
> even if other people feel like they should be using something else.

I still don't get it.  A gcc asm version of this is

-------------------------------------------------------------------------
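#include <stdint.h>

template<typename T> intptr_t scas(T *a, T val, int len);

template<> intptr_t scas<uint8_t>(uint8_t *a, uint8_t val, int len)
{
  intptr_t result;
  // rep scasb decrements the count register and reads the buffer, so
  // the count is an in/out operand and memory and flags are clobbered.
  __asm__ ("rep scasb"
           : "=D"(result), "+c"(len)
           : "a"(val), "D"(a)
           : "memory", "cc");
  return result;
}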

template<typename T>
int find_first_nonzero_scas(T* x, int cnt)
{
    intptr_t  result = 0;
    result = scas<T>(x, 0, cnt);
    result -= reinterpret_cast<intptr_t>(x);
    result /= sizeof(T);
    return --result;
}
-------------------------------------------------------------------------

which, when instantiated, generates

int find_first_nonzero_scas<unsigned char>(unsigned char*, int):
        movq    %rdi, %rdx
        xorl    %eax, %eax
        movl    %esi, %ecx
        notq    %rdx
        rep scasb
        leaq    (%rdx,%rdi), %rax
        ret

How is this not better in every way?
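
The quoted message talks about one function per size suffix, so here is
a rough sketch of how the same extended-asm approach might cover all four
operand sizes. None of the names below come from the code above: the
DEFINE_SCAS macro, the const-pointer signature, and find_first_nonzero
are assumptions made for illustration, and the plain C++ loop is only a
stand-in so the sketch still builds on targets other than x86-64 with a
gcc-compatible compiler. It also re-checks the element where the scan
stopped, since the pointer alone cannot tell "last element was nonzero"
from "nothing found".

```cpp
#include <cstddef>
#include <cstdint>

#if defined(__GNUC__) && defined(__x86_64__)
// Primary template is only declared; each operand size gets a specialization.
template<typename T> const T *scas(const T *p, T val, std::size_t n);

// Hypothetical helper macro: INSN is the scas variant for sizeof(T).
// rep on scas repeats while elements equal val; %rdi and %rcx are updated.
#define DEFINE_SCAS(T, INSN)                                            \
  template<> inline const T *scas<T>(const T *p, T val, std::size_t n)  \
  {                                                                     \
    __asm__ ("rep " INSN                                                \
             : "+D"(p), "+c"(n)                                         \
             : "a"(val)                                                 \
             : "memory", "cc");                                         \
    return p; /* one past the last element compared */                  \
  }

DEFINE_SCAS(std::uint8_t,  "scasb")
DEFINE_SCAS(std::uint16_t, "scasw")
DEFINE_SCAS(std::uint32_t, "scasl")
DEFINE_SCAS(std::uint64_t, "scasq")
#undef DEFINE_SCAS
#else
// Portable stand-in (an assumption, not part of the original example).
template<typename T> const T *scas(const T *p, T val, std::size_t n)
{
  while (n--)
    if (*p++ != val)
      break;
  return p; // one past the last element compared
}
#endif

// Returns the index of the first nonzero element, or -1 if there is none.
template<typename T>
std::ptrdiff_t find_first_nonzero(const T *x, std::size_t cnt)
{
  if (cnt == 0)
    return -1;
  const T *stop = scas<T>(x, T(0), cnt);
  std::ptrdiff_t idx = (stop - x) - 1;
  // The scan ends one past either the first mismatch or the final element,
  // so look at that element to decide which case this is.
  return x[idx] != 0 ? idx : -1;
}
```

Calling find_first_nonzero on a uint16_t or uint32_t array picks the
scasw or scasl specialization by ordinary template argument deduction. A
type with no specialization fails at link time on the asm path, which is
arguably the behaviour you want here.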

I can understand that you want something compatible with your source.  But
you said "I don't think anyone has ever presented a good example of where
[naked asms are] really really useful on x86 architectures."

Baffled,
Andrew.
