Thanks for the confirmation. I'll look into it and discuss potential
solutions.
BTW, just curious, is there any particular reason for putting the code for
fetch in a .hh, instead of a .cc file?
Thanks!
Runjie
On Fri, Oct 26, 2012 at 3:42 PM, Nilay Vaish wrote:
I now understand the problem that you are trying to elucidate. I just
checked fetch_impl.hh. If you look at line 889, it is doing exactly
what you have suggested. It might be that there is something wrong with
this code and it is not behaving as expected. You might want to take a
deeper di
Sorry for the confusion.
The numbers 60, 65 and 70 were part of the tick number at which each cycle
started. I removed some digits from the tick count to make each line shorter.
The complete trace looks like this:
33922322296000: system.switch_cpus.fetch: Running stage.
33922322296000: system.switch_cpus
On Wed, 24 Oct 2012, Runjie Zhang wrote:
Hi, Nilay
I agree with you that to fetch from the icache every cycle, the hit latency
doesn't have to be zero.
Here is a snapshot from the exec trace (I deleted some detail to
make it clearer). Icache hit latency is 1 cycle and fetch width is 4.
(Ticks)
...60..: fetch: Running stage.
...60..: ...
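
To illustrate the point that a nonzero hit latency does not by itself
prevent fetching every cycle, here is a toy Python model. It is not gem5
code: the function name, the all-hits assumption and the two issue policies
are assumptions made only for illustration. The "pipelined" policy issues a
new icache request every cycle; the "blocking" policy waits until no request
was outstanding at the start of the cycle.

```python
def lines_fetched(cycles, hit_latency, pipelined):
    """Count icache lines delivered to fetch within `cycles` cycles.

    Toy model (hypothetical, not gem5): every access hits and returns
    its data `hit_latency` cycles after it is issued.
    """
    in_flight = []  # completion cycle of each outstanding request
    delivered = 0
    for now in range(cycles):
        # Was any request still outstanding when this cycle began?
        busy = bool(in_flight)
        # Data for requests completing in this cycle reaches fetch.
        delivered += sum(1 for done in in_flight if done == now)
        in_flight = [done for done in in_flight if done > now]
        # Pipelined: always issue. Blocking: issue only if idle at cycle start.
        if pipelined or not busy:
            in_flight.append(now + hit_latency)
    return delivered


if __name__ == "__main__":
    for policy in (True, False):
        print("pipelined" if policy else "blocking",
              lines_fetched(100, 1, policy))
```

In this sketch, with a 1-cycle hit latency the pipelined policy delivers a
line in every cycle after the first, while the blocking policy delivers one
only every other cycle, so the issue policy rather than the latency itself
limits throughput.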
On Mon, 22 Oct 2012, Runjie Zhang wrote:
Greetings,
I tried to write stressmarks in X86 assembly so that the simulated IPC of
O3CPU can hit N for an N-way out-of-order core. However, no matter how I
modify the assembly, the IPC could never reach 4 for a 4-way OoO core.
According to the execut
How about *not* pushing cache latencies into the queue? Though I am not
quite sure whether this is correct.
Regards,
Mahmood
On Mon, Oct 22, 2012 at 10:58 PM, Runjie Zhang wrote:
> Greetings,
>
> I tried to write stressmarks in X86 assembly so that the simulated IPC
> of O3CPU can hit N f
Greetings,
I tried to write stressmarks in X86 assembly so that the simulated IPC of
O3CPU can hit N for an N-way out-of-order core. However, no matter how I
modify the assembly, the IPC could never reach 4 for a 4-way OoO core.
According to the execution trace, icache stall was the trouble ma