19.11.2014 3:36, Neil Horman wrote:
> On Tue, Nov 18, 2014 at 05:52:27PM +0000, Bruce Richardson wrote:
>> On Tue, Nov 18, 2014 at 12:46:19PM -0500, Neil Horman wrote:
>>> On Tue, Nov 18, 2014 at 11:13:17PM +0600, Yerden Zhumabekov wrote:
>>>> Everybody's up for the second option? :)
>>>>
>>> Crud, you're right, I didn't think about the header inclusion issue. Is it
>>> worth adding the jump to enable the dynamic hash selection?
>>> Neil
>> Maybe for cases where SSE4.2 is not currently available, i.e. for generic
>> builds.
>> For builds where we have hardware support confirmed at compile time, just use
>> the function from the header file.
>> Does that make sense?
>>
> I'm not certain of that, as I don't think anything can be 'confirmed' at
> compile time. I.e. just because you have sse42 at compile time doesn't
> guarantee you have it at run time with a DSO. If you have these as macros,
> you need to enable sse42 whereever you include the file so that the
> intrinsic works properly.
>
> an alternate option would be to not use the intrinsic, and craft some explicit
> __asm__ statement that executes the right sse42 instructions. That way the
> asm is directly emitted, without requiring the -msse42 flag at all, and it
> will just work in all the files that call it.
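
To make the asm idea concrete, here is a rough sketch of how it might look. This is not code from the patch: the names crc32c_sw_u32, crc32c_hw_u32, crc32c_u32 and the CRC32_HAVE_X86_ASM guard are hypothetical, and I am assuming a GCC-compatible compiler that provides __builtin_cpu_supports(). The instruction is emitted by the assembler directly, so no SSE4.2 compiler flag should be needed in the including file, and the run-time check covers the DSO case.

```c
#include <stdint.h>

/* Hypothetical guard: take the inline-asm path only on GCC-compatible x86. */
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define CRC32_HAVE_X86_ASM 1
#else
#define CRC32_HAVE_X86_ASM 0
#endif

/* Portable fallback: bitwise CRC32C (reflected polynomial 0x82F63B78) over
 * one 32-bit word, with no pre/post inversion, like the crc32 instruction. */
static uint32_t crc32c_sw_u32(uint32_t init, uint32_t data)
{
    uint32_t crc = init ^ data;
    int i;

    for (i = 0; i < 32; i++)
        crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    return crc;
}

#if CRC32_HAVE_X86_ASM
/* Explicit asm instead of the _mm_crc32_u32() intrinsic, so nmmintrin.h
 * and the SSE4.2 compiler flag are not required. AT&T operand order:
 * source first, then the accumulator that is updated in place. */
static inline uint32_t crc32c_hw_u32(uint32_t init, uint32_t data)
{
    __asm__("crc32l %1, %0" : "+r"(init) : "rm"(data));
    return init;
}
#endif

/* Run-time selection: use the instruction only if the CPU reports SSE4.2. */
static uint32_t crc32c_u32(uint32_t init, uint32_t data)
{
#if CRC32_HAVE_X86_ASM
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse4.2"))
        return crc32c_hw_u32(init, data);
#endif
    return crc32c_sw_u32(init, data);
}
```

A caller would just use crc32c_u32(init_val, key_word). A real implementation would probably do the CPU check once at init time and cache the result (e.g. in a function pointer) rather than test it on every call. A 64-bit variant (crc32q) would additionally have to be limited to x86_64 builds, since that form of the instruction does not exist in 32-bit mode.
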

Thanks for the discussion. To summarize it, here are my suggestions for 'v5':

1) replace the intrinsics with asm code and stop including nmmintrin.h;
2) detect the arch (EM64T flag) at run time, because crc32 with a 64-bit
   operand doesn't work on 32-bit x86;
3) separate the function prototypes (leaving them in the header) from the
   function bodies, and add the new source file to SRCS in the Makefile.

-- 
Sincerely,

Yerden Zhumabekov
State Technical Service
Astana, KZ