Re: speed of double-precision divide

2010-01-24 Thread Tim Prince
Steve White wrote: I was under the misconception that each of these SSE operations was meant to be accomplished in a single clock cycle (although I knew there are various other issues.) Current CPU architectures permit an SSE scalar or parallel multiply and add instruction to be issued on eac

Re: speed of double-precision divide

2010-01-24 Thread Richard Guenther
On Sun, Jan 24, 2010 at 10:32 PM, Steve White wrote: > Richard, > > Could you provide us with a good reference for the latencies and other > speed issues of SSE operations?  What I've found is scattered and hard > to compare. > > Frankly, I was under the misconception that each of these SSE operat

Re: speed of double-precision divide

2010-01-24 Thread Steve White
Richard, Could you provide us with a good reference for the latencies and other speed issues of SSE operations? What I've found is scattered and hard to compare. Frankly, I was under the misconception that each of these SSE operations was meant to be accomplished in a single clock cycle (although

Re: speed of double-precision divide

2010-01-23 Thread Richard Guenther
On Sat, Jan 23, 2010 at 6:33 PM, Steve White wrote: > Hi, Andrew! > > Thanks for the suggestion, but it didn't make any difference for me. > Neither the speed nor the assembler was significantly altered. > > Which version of gcc did you use?  Mine is 4.4.1. > > I threw everything at it: >        g

Re: speed of double-precision divide

2010-01-23 Thread Steve White
Hi, Andrew! Thanks for the suggestion, but it didn't make any difference for me. Neither the speed nor the assembler was significantly altered. Which version of gcc did you use? Mine is 4.4.1. I threw everything at it: gcc -std=c99 -Wall -pedantic -O3 -ffast-math -mmmx -msse -msse2 -mf

Re: speed of double-precision divide

2010-01-23 Thread Richard Guenther
On Sat, Jan 23, 2010 at 5:47 PM, Steve White wrote: > Hi, > > I recently revised some speed tests of basic CPU operations. > There were a few surprises, but one was that a test of double-precision > divide was a factor of ten slower when compiled with gcc than with the > Intel compiler icc. > > T

Re: speed of double-precision divide

2010-01-23 Thread Andrew Pinski
On Sat, Jan 23, 2010 at 8:47 AM, Steve White wrote: > gcc has this (gcc -std=c99 -O3 -msse2 -mfpmath=sse -lm -S dt.c) > icc has this (icc -Wall -w2 -fast -c dt.c) icc's -fast is equivalent to gcc's -ffast-math option, which you did not supply, so you are comparing apples to oranges. Note supplying -ff