On 23/11/2014 19:09, Nathann Cohen wrote:
> Hello!
>
> > What about likwid https://code.google.com/p/likwid ? It is free.
> > Has anybody used it to measure the performance of Cython code?
>
> Never tried VTune, nor likwid.
>
> > What is the size of what you are sorting? If it is small enough to
> > fit in the caches, and better still in the L1 cache, you can possibly
> > gain something with your modification. Otherwise it is certainly
> > memory-bound and you cannot do much...
>
> I sort many small arrays. Several kB, not more.
>
> > You have to measure the bandwidth of your program. VTune does this,
> > possibly likwid too.
>
> Oh. Do you think that it can be used from outside the program, i.e. on
> a running Sage session?
With VTune it is possible: you start and stop the collection with a call to the library. (Do not imagine I am paid by Intel :-) I am not.) With likwid, I don't know.

But OK, it seems your objects can stay in the L1 cache. In that case, it is possible to improve performance by doing what you propose.

I recently had a problem like yours: an ODE solver needs to solve linear systems, and I got a slight improvement by replacing the call to LAPACK with my own routine, probably because the LAPACK routine I called (which is called in many places in the program) imposes a "far" branch at each call, and for small systems (size 5) that is a real penalty.

But what about the quicksort? Are you sure the implementation cannot degenerate? It is well known that all the efficiency can be lost if the "key" used for the partition (the pivot) is badly chosen. What about replacing the quicksort with another method (the tree-based one?)
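To illustrate the point about degeneration, here is a minimal sketch in plain Python, not the Sage/Cython code under discussion. It counts comparisons for two pivot rules on already sorted input. The names quicksort, first_element and median_of_three are hypothetical helpers written for this example, and the sort is deliberately not in place, to keep the idea visible rather than to be fast.

```python
def quicksort(items, choose_pivot):
    """Return (sorted_list, comparison_count) for the given pivot rule."""
    comparisons = 0

    def sort(seq):
        nonlocal comparisons
        if len(seq) <= 1:
            return seq
        pivot = choose_pivot(seq)
        smaller, equal, larger = [], [], []
        for x in seq:
            comparisons += 1  # one three-way comparison against the pivot
            if x < pivot:
                smaller.append(x)
            elif x > pivot:
                larger.append(x)
            else:
                equal.append(x)
        return sort(smaller) + equal + sort(larger)

    return sort(list(items)), comparisons


def first_element(seq):
    # Naive rule: degenerates on sorted (or reverse-sorted) input.
    return seq[0]


def median_of_three(seq):
    # Median of the first, middle and last elements.
    return sorted((seq[0], seq[len(seq) // 2], seq[-1]))[1]


data = list(range(200))  # already sorted: worst case for first_element
_, naive = quicksort(data, first_element)
_, robust = quicksort(data, median_of_three)
print("first element pivot:", naive, "comparisons")
print("median-of-three pivot:", robust, "comparisons")
```

On sorted input the first-element rule splits off only one element per level, so the work grows quadratically with the array length, while median-of-three keeps the partitions balanced here. Median-of-three is only one common safeguard and can still be defeated by adversarial input.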
t.d.
> Nathann
