Hi
When calculating the cosine/sine, gcc generates a call to a complicated
routine that takes several thousand instructions to execute.
Suppose the value is stored in some XMM register, say xmm0, and the
result should be in another XMM register, say xmm1.
Why doesn't it generate:
movsd
Hi,
This question is not appropriate for this mailing list.
Please take any further discussions to the gcc-help mailing list.
On Sat, 2013-05-11 at 11:15 +0200, jacob navia wrote:
> Hi
>
> When calculating the cosine/sine, gcc generates a call to a complicated
> routine that takes several thousand
On Sat, 11 May 2013, jacob navia wrote:
Hi
When calculating the cosine/sine, gcc generates a call to a complicated routine
that takes several thousand instructions to execute.
Suppose the value is stored in some XMM register, say xmm0, and the result
should be in another XMM register, say xmm1.
Le 11/05/13 11:20, Oleg Endo a écrit :
Hi,
This question is not appropriate for this mailing list.
Please take any further discussions to the gcc-help mailing list.
On Sat, 2013-05-11 at 11:15 +0200, jacob navia wrote:
Hi
When calculating the cosine/sine, gcc generates a call to a complicated
ro
Le 11/05/13 11:30, Marc Glisse a écrit :
On Sat, 11 May 2013, jacob navia wrote:
Hi
When calculating the cosine/sine, gcc generates a call to a complicated
routine that takes several thousand instructions to execute.
Suppose the value is stored in some XMM register, say xmm0, and the
result sh
On 5/11/2013 5:42 AM, jacob navia wrote:
1) The fsin instruction is ONE instruction! The sin routine is (at
least) a thousand instructions!
Even if the fsin instruction itself is "slow", it should be a thousand
times faster than the complicated routine gcc calls.
2) The FPU is at 64 bits ma
On Sat, May 11, 2013 at 09:34:37AM -0400, Robert Dewar wrote:
> On 5/11/2013 5:42 AM, jacob navia wrote:
>
> >1) The fsin instruction is ONE instruction! The sin routine is (at
> >least) a thousand instructions!
> > Even if the fsin instruction itself is "slow", it should be a thousand
> >times fas
As for 1), the only way is to measure it. Compile the following and we
will see who is right.
Right, probably you should have done that before posting
anything! (I leave the experiment up to you!)
cat "
#include <math.h>
int main(){ int i;
double x=0;
double ret=0;
double f;
for(i=0;i<1000;i++){
On 5/11/2013 10:46 AM, Robert Dewar wrote:
As for 1), the only way is to measure it. Compile the following and we
will see who is right.
Right, probably you should have done that before posting
anything! (I leave the experiment up to you!)
And of course this experiment says nothing about accuracy!
Le 11/05/13 16:01, Ondřej Bílka a écrit :
As for 1), the only way is to measure it. Compile the following and we
will see who is right.
cat "
#include <math.h>
int main(){
  int i;
  double x=0;
  double ret=0;
  double f;
  for(i=0;i<1000;i++){
    ret+=sin(x);
    x+=0.3;
  }
  return ret;
}
" > sin.c
On 5/11/2013 11:20 AM, jacob navia wrote:
OK I did a similar thing. I just compiled sin(argc) in main.
The results prove that you were right. The single fsin instruction
takes longer than several HUNDRED instructions (calls, jumps,
table lookups, what have you)
Gone are the times when an fsin woul
On 05/11/2013 11:25 AM, Robert Dewar wrote:
On 5/11/2013 11:20 AM, jacob navia wrote:
OK I did a similar thing. I just compiled sin(argc) in main.
The results prove that you were right. The single fsin instruction
takes longer than several HUNDRED instructions (calls, jumps,
table lookup what ha
Another interesting use-case for OpenACC and OpenMP is mixing both
standard annotations for the same loop:
// Compute matrix multiplication.
#pragma omp parallel for default(none) shared(A,B,C,size)
#pragma acc kernels pcopyin(A[0:size][0:size],B[0:size][0:size]) \
pcopyout(C[0:size][0:size])
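The preview cuts off before the loop body. For illustration, the two annotations might sit on a complete loop nest as sketched below. The function name matmul, the VLA parameter layout, and the loop body are assumptions, not taken from the message. Whether a given compiler accepts both directive sets enabled at the same time is not something the preview establishes; built without OpenMP or OpenACC support, both pragmas are simply ignored and the loop runs serially.

```c
#include <assert.h>

/* Hypothetical sketch: C = A * B for square size x size matrices. */
void matmul(int size, double A[size][size], double B[size][size],
            double C[size][size])
{
    assert(size > 0);

    // Compute matrix multiplication.
    #pragma omp parallel for default(none) shared(A,B,C,size)
    #pragma acc kernels pcopyin(A[0:size][0:size],B[0:size][0:size]) \
                        pcopyout(C[0:size][0:size])
    for (int i = 0; i < size; i++) {
        for (int j = 0; j < size; j++) {
            double sum = 0.0;  /* private to each iteration of the i loop */
            for (int k = 0; k < size; k++)
                sum += A[i][k] * B[k][j];
            C[i][j] = sum;
        }
    }
}
```

The loop indices and the accumulator are declared inside the loops so that they are private per thread, which is what allows default(none) to list only A, B, C and size as shared.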
Snapshot gcc-4.7-20130511 is now available on
ftp://gcc.gnu.org/pub/gcc/snapshots/4.7-20130511/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.
This snapshot has been generated from the GCC 4.7 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches
Hi,
I'm considering adding a named address space to our private gcc port.
This address space is accessed using special instructions with a very
limited addressing mode, "[index*8 + imm]": it only supports an
index scaled by 8 bytes (64 bits) plus an immediate. The issue here is
that there is no base register.