> code (a) : for (int i = 0; i< 1000000; i++) c[i] = a[i] * b[i];
>
> code (b) : for (int i = 0; i< 1000; i++) for(int j = 0; j < b[i]; j++)
> c[i] += a[i];
>
> code (c) : for (int i = 0; i< 1000; i++) c[i] = HW_MUL(a[i], b[i]);
>
> I'm sure that code (b) will execute much longer that code (a) inside
> qemu (sure that different that in real platform), and I'd like to
> compute executing time for code (c) in some way.
> So, how can I trap time information/calculation inside qemu?

You can't. As I've said before, any performance measurements you make
inside qemu are completely meaningless.

You may be able to say that executing 1000 iterations takes longer than
10 iterations of the same loop, but you didn't need a simulator to tell
you that. Also, there's a good chance qemu will take almost exactly the
same time to execute them, because the execution time will be dominated
by the translation time for the first iteration.

You cannot compare the cost of (say) add vs. multiply, or of two
different multiply instructions. The timings for the individual
instructions (or any particular sequence of instructions) bear no
relation whatsoever to the timings for the same sequence on real
hardware.

I'm repeating myself now, so I intend this to be my last response to
this thread.

Paul
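
For illustration only, here is a minimal sketch of the kind of timing harness the question seems to have in mind, assuming 32-bit unsigned elements. The names mul_direct, mul_by_add and time_call are hypothetical and do not come from the thread. Run under qemu, the value it returns covers translation plus host execution of the translated code, so, per the reply above, it says nothing about the cost of the same instructions on real hardware.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Style (a) from the question: one multiply per element. */
static void mul_direct(const uint32_t *a, const uint32_t *b,
                       uint32_t *c, size_t n)
{
    for (size_t i = 0; i < n; i++)
        c[i] = a[i] * b[i];
}

/* Style (b) from the question: multiply by repeated addition.
 * Unlike the quoted loop, c[i] is zeroed first so that the result
 * does not depend on the previous contents of c. */
static void mul_by_add(const uint32_t *a, const uint32_t *b,
                       uint32_t *c, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        c[i] = 0;
        for (uint32_t j = 0; j < b[i]; j++)
            c[i] += a[i];
    }
}

typedef void (*mul_fn)(const uint32_t *, const uint32_t *,
                       uint32_t *, size_t);

/* Hypothetical helper: processor time, in seconds, consumed by one
 * call to fn, as reported by the standard clock() function. Inside
 * qemu this includes the one-off translation of the loop body. */
static double time_call(mul_fn fn, const uint32_t *a, const uint32_t *b,
                        uint32_t *c, size_t n)
{
    clock_t start = clock();
    fn(a, b, c, n);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_call(mul_direct, a, b, c, n) and time_call(mul_by_add, a, b, c, n) on the same inputs yields two numbers that can be compared with each other on one host, but neither can be mapped to cycle counts on the target.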