On Tue, 21 Apr 2015, Bruce Evans wrote:
On Mon, 20 Apr 2015, Konstantin Belousov wrote:
On Mon, Apr 13, 2015 at 04:04:45PM -0400, Jung-uk Kim wrote:
Please try the attached patch.
...
- __asm __volatile("xorl %k0,%k0;popcntq %1,%0"
- : "=&r" (result) : "rm" (elem));
...
+ __asm __volatile("xorl %k0, %k0; popcntq %1, %0"
+ : "=r" (count) : "m" (pc_map[field]));
...
Yes, this worked for me the same way as for you, the argument is taken
directly from memory, without temporary spill. Is this due to a silly
inliner? Whatever the reason is, I think a comment should be added
noting the subtlety.
Otherwise, looks fine.
Erm, this looks silly. It apparently works by making things too complicated
for the compiler to "optimize" (where one of the optimizations actually
gives pessimal spills). Its main changes are:
...
It works better to change the constraint to "r":
It's even sillier than that. The problem is not limited to this function.
clang seems to prefer memory whenever you use the "rm" constraint. The
silliest case is when you have a chain of simple asm functions. Say the
original popcntq (without the xorl):
return (popcntq(popcntq(popcntq(popcntq(popcntq(x))))));
gcc compiles this to 5 sequential popcntq instructions, but clang
spills the results of the first 4.
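A minimal sketch of such a chain, for anyone who wants to look at the
generated code. The wrapper name popcntq_rm() and the function
popcnt_chain() are made up for illustration and are not taken from
cpufunc.h; the non-amd64 fallback is only there so the file builds
elsewhere.

```c
#include <stdint.h>

/*
 * Hypothetical cpufunc.h-style wrapper (name invented for this sketch).
 * The input uses the "rm" constraint, so the compiler is free to pass
 * the operand either in a register or as a memory reference.
 */
static inline uint64_t
popcntq_rm(uint64_t mask)
{
	uint64_t result;

#if defined(__x86_64__)
	__asm __volatile("popcntq %1,%0" : "=r" (result) : "rm" (mask));
#else
	/* Fallback so the sketch also compiles on other architectures. */
	result = (uint64_t)__builtin_popcountll(mask);
#endif
	return (result);
}

/* Five chained calls, as in the expression above. */
uint64_t
popcnt_chain(uint64_t x)
{
	return (popcntq_rm(popcntq_rm(popcntq_rm(popcntq_rm(popcntq_rm(x))))));
}
```

Building this with something like "cc -O2 -S" under each compiler and
reading the assembly for popcnt_chain() should show whether the
intermediate results stay in registers or go through the stack.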
This is an old bug. clang does this on FreeBSD[9-11]. cc does this
on FreeBSD[10-11] (not on FreeBSD-9, since cc = gcc there).
Asms should always use "rm" if "m" works. Ones in cpufunc.h always
do except for lidt(), lldt() and ltr(). These 3 are fixed in my version.
So cpufunc.h almost always asks for the pessimization.
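For comparison, a sketch of the same kind of wrapper with the input
constraint narrowed to "r", which is the workaround mentioned earlier
in this thread. The names popcntq_r() and popcnt_chain_r() are
hypothetical, and this is not a proposed change to cpufunc.h; it only
shows what the compiler is given when the memory alternative is taken
away.

```c
#include <stdint.h>

/*
 * Hypothetical variant (name invented for this sketch): the input
 * constraint is "r" only, so the operand must be in a register and
 * the compiler cannot choose a memory reference for it.
 */
static inline uint64_t
popcntq_r(uint64_t mask)
{
	uint64_t result;

#if defined(__x86_64__)
	__asm __volatile("popcntq %1,%0" : "=r" (result) : "r" (mask));
#else
	/* Fallback so the sketch also compiles on other architectures. */
	result = (uint64_t)__builtin_popcountll(mask);
#endif
	return (result);
}

/* A short chain; each result feeds the next call directly. */
uint64_t
popcnt_chain_r(uint64_t x)
{
	return (popcntq_r(popcntq_r(popcntq_r(x))));
}
```

The cost of "r" is that a value that really does live in memory has to
be loaded into a register first, which is why "rm" is the natural thing
to ask for when the compiler chooses sensibly.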
Bruce