On Fri, Jun 12, 2009 at 11:32 AM, Paolo Bonzini<paolo.bonz...@gmail.com> wrote:
>> This is one example, but it illustrates a general concept that I think
>> is really useful and I personally have used numerous times for lots of
>> other instructions than SCAS. If there is a way to achieve this
>> without using a naked function then please advise.
>
> Keeping the __asm syntax, I'd be surprised if this did not work:
>
> template<typename T>
> int find_first_nonzero_scas(T* x, int cnt)
> {
>   int result = 0;
>   __asm {
>     xor eax, eax
>     mov edi, x
>     mov ecx, cnt
>   }
>   if (sizeof (T) == 1)
>     __asm { rep scasb; mov result, edi }
>   if (sizeof (T) == 2)
>     __asm { rep scasw; mov result, edi }
>   if (sizeof (T) == 4)
>     __asm { rep scasl; mov result, edi }
>   result -= reinterpret_cast<int>(x);
>   result /= sizeof(T);
>   return --result;
> }
>
> Paolo
>

Sorry about the asm syntax. I haven't used inline assembly in gcc yet, so I haven't looked at its syntax. I was just about to start porting some code over to gcc when I started looking into the naked issue.

That being said, what you suggest will indeed work, and it will be optimized to be as efficient as the template method. It's what I'll probably end up doing as a fallback. But it's very ugly, and there are a couple of cases where I have much more inline assembly than in this particular example, so I would have to litter segments of code like that all throughout the function. I suppose I could wrap it in a macro for readability, but it's nicer if it's just integrated with C++ like everything else. It's supported for many other platforms; it just seems a little odd to explicitly not support it on the most common platform.
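
For reference, here is a rough sketch of how the sizeof-dispatch fallback might look in gcc's extended asm syntax. It is only an illustration under some assumptions: gcc or clang targeting x86 with the default AT&T dialect, a flat memory model, and the direction flag clear (as the ABI requires). The name find_first_nonzero is hypothetical, and unlike the quoted version it spells the prefix as repe and returns cnt when every element is zero. On other targets it falls back to a plain loop.

```cpp
#include <cstddef>

// Hypothetical helper (not from the thread): index of the first nonzero
// element in x[0..cnt), or cnt if every element is zero.
template <typename T>
std::size_t find_first_nonzero(const T* x, std::size_t cnt)
{
    static_assert(sizeof(T) == 1 || sizeof(T) == 2 || sizeof(T) == 4,
                  "only 1-, 2- and 4-byte element types are handled");
    if (cnt == 0)
        return 0;

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    const T* p = x;       // goes in (e/r)di, advanced by SCAS
    std::size_t n = cnt;  // goes in (e/r)cx, decremented by the REPE prefix

    // sizeof(T) is a compile-time constant, so the optimizer keeps only
    // one of these branches in each instantiation.
    if (sizeof(T) == 1)
        __asm__ __volatile__("repe scasb"
                             : "+D"(p), "+c"(n) : "a"(0) : "cc", "memory");
    else if (sizeof(T) == 2)
        __asm__ __volatile__("repe scasw"
                             : "+D"(p), "+c"(n) : "a"(0) : "cc", "memory");
    else
        __asm__ __volatile__("repe scasl"
                             : "+D"(p), "+c"(n) : "a"(0) : "cc", "memory");

    // SCAS leaves the pointer one element past the last one it compared.
    std::size_t last = static_cast<std::size_t>(p - x) - 1;
    return x[last] != 0 ? last : cnt;
#else
    // Portable fallback for compilers or targets without x86 extended asm.
    for (std::size_t i = 0; i < cnt; ++i)
        if (x[i] != 0)
            return i;
    return cnt;
#endif
}
```

The three asm statements differ only in the instruction suffix, which is exactly the repetition a macro could hide. The operand constraints ("+D", "+c", "a") do the register setup that the __asm version performs by hand with mov and xor.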