Hello folks,

by accident I built my 3.4.2.alpha0 build as an SSE2-only build, so I had a chance to play with it a little and check for performance regressions. Here are some basic benchmarks:
SSE2 vs. SSE3:

 * measurable difference for ZZ determinant (~10% slower with SSE2 for 300x300, 400x400, etc.)
 * huge difference (factor *4* slower with SSE2) for RDF matrix-matrix multiply (2000x2000, 3000x3000, 4000x4000)
 * tiniest difference for ZZ matrix-matrix multiply (~1-2% *faster* with SSE2 than with SSE3)

Dan Drake reported in IRC that some of the doctests in the matrix directory ran 7% faster with SSE2 instead of SSE3. This rather perplexing result might be due to the SSE2-only Hammer ATLAS having a significantly smaller footprint in the cache (it certainly contains way fewer SSE instructions), which would result in better cache locality and hence faster code for the matrix directory doctests.

Overall these are some interesting developments. While the RDF matrix-matrix multiply results did not surprise me one bit, the small slowdown or tiny speedup for operations over ZZ (where some code uses multimodular arithmetic and hence ATLAS) is a little puzzling.

Thoughts?

Cheers,

Michael
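
P.S. For anyone who wants to repeat comparisons like the ones above on their own builds, here is a minimal sketch of a timing helper. It is not the script used for the numbers in this mail. It assumes Python 3, and the names `best_of`, `percent_slower` and `naive_matmul` are made up for illustration. The pure-Python multiply is only a stand-in workload so the sketch runs anywhere; it does not go through ATLAS.

```python
import time


def best_of(func, repeats=5):
    """Return the smallest wall-clock time in seconds over `repeats` calls of func()."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    # The minimum is the run least disturbed by other activity on the machine.
    return min(timings)


def percent_slower(t_candidate, t_baseline):
    """Percentage by which the candidate is slower than the baseline (negative means faster)."""
    return 100.0 * (t_candidate - t_baseline) / t_baseline


def naive_matmul(a, b):
    """Stand-in workload: plain Python matrix product of two lists of rows (no ATLAS involved)."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


if __name__ == "__main__":
    n = 40
    a = [[(i * j) % 7 for j in range(n)] for i in range(n)]
    t = best_of(lambda: naive_matmul(a, a))
    print("best of 5 runs: %.6f s" % t)
```

Inside Sage the stand-in would presumably be swapped for the real operations, e.g. `A = random_matrix(RDF, 2000)` followed by `best_of(lambda: A*A)`, or `M = random_matrix(ZZ, 300)` followed by `best_of(lambda: M.det())`, run once on each build. Feeding the two results to `percent_slower` then gives figures in the same form as above (a factor of 4 slower shows up as 300).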