Zachary Turner wrote:
> I guess the same reason people would want any asm functions in C
> source code. Sometimes it's just the best way to express something.
> Like in the example I mentioned, I could write 4 different functions
> in assembly, one for each size suffix, wrap them all up in a separate
> assembly language file but IMHO it's more readable, quicker to code,
> and more expressive to use a template switch like I've done. C++ is
> built on the philosophy of giving you enough rope to hang yourself
> with.
>
> I don't think there's a better way to express the selection of an
> instruction based on operand size than through a naked template
> specialization.
>
> Using a .s file is more difficult to port across different compilers.
> Many compilers provide support for naked functions and it's easy to
> just use a #ifdef to check which compiler you're running on and define
> the appropriate naked declaration string.
>
> Besides, it's supported for embedded architectures, it's frustrating
> because it feels like back in the days of a 386SX's where the
> processors had working FPUs on them but they were switched off "just
> because". All the investment has already been done to add support for
> naked functions, so I think people should be "permitted" to use it,
> even if other people feel like they should be using something else.

I still don't get it. A gcc asm version of this is

-------------------------------------------------------------------------
template<typename T> intptr_t scas(T *a, T val, int len);

template<> intptr_t scas<uint8_t>(uint8_t *a, uint8_t val, int len)
{
  intptr_t result;
  __asm__ ("rep scasb" : "=D"(result): "a"(val), "D"(a), "c"(len));
  return result;
}

template<typename T> int find_first_nonzero_scas(T* x, int cnt)
{
  intptr_t result = 0;
  result = scas<T>(x, 0, cnt);
  result -= reinterpret_cast<intptr_t>(x);
  result /= sizeof(T);
  return --result;
}
-------------------------------------------------------------------------

which, when instantiated, generates

int find_first_nonzero_scas<unsigned char>(unsigned char*, int):
        movq    %rdi, %rdx
        xorl    %eax, %eax
        movl    %esi, %ecx
        notq    %rdx
        rep scasb
        leaq    (%rdx,%rdi), %rax
        ret

How is this not better in every way?

I can understand that you want something compatible with your source.
But you said "I don't think anyone has ever presented a good example
of where [naked asms are] really really useful on x86 architectures."

Baffled,
Andrew.
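
For illustration only, here is a sketch of how the inline-asm approach above might be extended to the other operand sizes. It is not code from this thread. The names scan_equal and find_first_nonzero are hypothetical. The read-write constraints ("+D", "+c") and the "cc"/"memory" clobbers are additions that the snippet above does not have, and returning cnt when every element is zero is a design choice of this sketch. The plain loop is an assumed fallback for targets that are not GCC-compatible x86-64, where only uint8_t, uint16_t and uint32_t have asm overloads.

```cpp
#include <cstddef>
#include <cstdint>

#if defined(__GNUC__) && defined(__x86_64__)
// One overload per operand size. Each advances p while *p == val, examining
// at most n elements, and returns a pointer one past the last element examined.
// REPE SCAS modifies both RDI and RCX, so both are read-write operands.
inline const std::uint8_t* scan_equal(const std::uint8_t* p, std::uint8_t val, std::size_t n) {
    __asm__ volatile("repe scasb" : "+D"(p), "+c"(n) : "a"(val) : "cc", "memory");
    return p;
}
inline const std::uint16_t* scan_equal(const std::uint16_t* p, std::uint16_t val, std::size_t n) {
    __asm__ volatile("repe scasw" : "+D"(p), "+c"(n) : "a"(val) : "cc", "memory");
    return p;
}
inline const std::uint32_t* scan_equal(const std::uint32_t* p, std::uint32_t val, std::size_t n) {
    __asm__ volatile("repe scasl" : "+D"(p), "+c"(n) : "a"(val) : "cc", "memory");
    return p;
}
#else
// Assumed portable fallback with the same contract as the asm overloads.
template <typename T>
const T* scan_equal(const T* p, T val, std::size_t n) {
    while (n-- && *p++ == val) {}
    return p;
}
#endif

// Index of the first nonzero element, or cnt if all cnt elements are zero.
template <typename T>
std::size_t find_first_nonzero(const T* x, std::size_t cnt) {
    if (cnt == 0) return 0;
    const T* end = scan_equal(x, T(0), cnt);
    // end[-1] is the last element examined; it is zero only if the scan ran out.
    return end[-1] != 0 ? static_cast<std::size_t>(end - x) - 1 : cnt;
}
```

Overload resolution picks the instruction from the element type, so the compiler can still inline the call and allocate registers around it, which a naked function would prevent.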