Copying out makes a 50% difference to the speed at 16384x16384 (it's faster with copying), but no difference at 10000x10000 or 20000x20000.
That's weird.

Bill.

On 18 May, 17:36, Martin Albrecht <[EMAIL PROTECTED]> wrote:
> Hi,
>
> first, I recorded the different speed-ups in a small table for an overview in
> the attachment (I think we've come a long way :-)). To disable the copying out
> one needs to edit
>
> /* we copy the matrix first since it is only constant memory
> overhead and improves data locality, if you remove it make sure
> there are no speed regressions */
> /* C = _mzd_mul_m4rm_impl(C, A, B, 0, TRUE); */
> packedmatrix *Cbar = mzd_init(C->nrows, C->ncols);
> Cbar = _mzd_mul_m4rm_impl(Cbar, A, B, 0, FALSE);
> mzd_copy(C, Cbar);
> mzd_free(Cbar);
> return C;
>
> in strassen.c to
>
> /* we copy the matrix first since it is only constant memory
> overhead and improves data locality, if you remove it make sure
> there are no speed regressions */
> C = _mzd_mul_m4rm_impl(C, A, B, 0, TRUE);
> return C;
>
> This disables the copying out.
>
> Martin
>
> PS: If I find some time later today I'll make some changes such that SSE2 can
> be used more often, i.e. align each row at 16-byte borders if HAVE_SSE2 is
> used.
>
> --
> name: Martin Albrecht
> _pgp:http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x8EF0DC99
> _www:http://www.informatik.uni-bremen.de/~malb
> _jab: [EMAIL PROTECTED]
>
> timings.html (2K)
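To time both variants without editing strassen.c between runs, the choice could be made by a flag instead. Below is a rough, self-contained sketch of that idea. It is a toy, not M4RI code: toy_matrix, toy_init, toy_free, toy_get, toy_set, toy_addmul_impl and toy_mul are all made-up names, the multiplication is a naive row-XOR loop rather than M4RM, and I am assuming the last argument of _mzd_mul_m4rm_impl means "clear C first", which the toy mimics with a memset. Dimensions are assumed to be nonzero.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy type: dense GF(2) matrix, 64 entries per word.
   This is NOT M4RI's packedmatrix. */
typedef struct {
    size_t nrows, ncols, width;  /* width = 64-bit words per row */
    uint64_t *data;              /* nrows * width words, row-major */
} toy_matrix;

/* Allocate a zeroed nrows x ncols matrix (dimensions assumed nonzero). */
toy_matrix *toy_init(size_t nrows, size_t ncols) {
    toy_matrix *M = malloc(sizeof *M);
    if (M == NULL) abort();
    M->nrows = nrows;
    M->ncols = ncols;
    M->width = (ncols + 63) / 64;
    M->data = calloc(nrows * M->width, sizeof *M->data);
    if (M->data == NULL) abort();
    return M;
}

void toy_free(toy_matrix *M) {
    free(M->data);
    free(M);
}

/* Read entry (i, j) as 0 or 1. */
int toy_get(const toy_matrix *M, size_t i, size_t j) {
    return (int)((M->data[i * M->width + j / 64] >> (j % 64)) & 1u);
}

/* Set entry (i, j) to 1. */
void toy_set(toy_matrix *M, size_t i, size_t j) {
    M->data[i * M->width + j / 64] |= (uint64_t)1 << (j % 64);
}

/* C += A * B over GF(2): for every set entry A[i][k],
   XOR row k of B into row i of C. */
void toy_addmul_impl(toy_matrix *C, const toy_matrix *A, const toy_matrix *B) {
    for (size_t i = 0; i < A->nrows; i++)
        for (size_t k = 0; k < A->ncols; k++)
            if (toy_get(A, i, k))
                for (size_t w = 0; w < C->width; w++)
                    C->data[i * C->width + w] ^= B->data[k * B->width + w];
}

/* C = A * B. With copy_out != 0 the product is computed into a fresh
   buffer and copied back; otherwise C is cleared and written directly. */
toy_matrix *toy_mul(toy_matrix *C, const toy_matrix *A, const toy_matrix *B,
                    int copy_out) {
    assert(A->ncols == B->nrows);
    assert(C->nrows == A->nrows && C->ncols == B->ncols);
    size_t bytes = C->nrows * C->width * sizeof *C->data;
    if (copy_out) {
        toy_matrix *Cbar = toy_init(C->nrows, C->ncols);
        toy_addmul_impl(Cbar, A, B);
        memcpy(C->data, Cbar->data, bytes);
        toy_free(Cbar);
    } else {
        memset(C->data, 0, bytes);
        toy_addmul_impl(C, A, B);
    }
    return C;
}
```

Calling toy_mul(C, A, B, 1) and toy_mul(C, A, B, 0) from the same timing loop at each size would give both columns of the comparison from a single build. Whether a toy like this shows the same 16384x16384 effect as the real code is a separate question.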