How about *not* pushing the cache latencies into the queue? Though I am not
quite sure whether that is correct.

Regards,
Mahmood



On Mon, Oct 22, 2012 at 10:58 PM, Runjie Zhang <rz...@virginia.edu> wrote:

> Greetings,
>
>   I tried to write stressmarks in X86 assembly so that the simulated IPC
> of the O3CPU can hit N for an N-way out-of-order core. However, no matter
> how I modify the assembly, the IPC never reaches 4 for a 4-way OoO core.
>
>   According to the execution trace, icache stalls were the troublemaker.
> In my case, even if the whole program fits in the icache, the fetch unit
> still stalls for a few cycles between groups of 32 instructions, each
> fetched over 8 cycles (I assume 32 X86 ADD instructions fill one cache
> line?). With the Gem5 memory system (no Ruby), this stall is 2 cycles.
> With Ruby memory, it is 3 cycles.
>
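
The numbers quoted above already bound the achievable IPC. A minimal
sketch of that arithmetic in Python, assuming the stall occurs exactly once
per group of 32 instructions and that nothing else limits throughput
(effective_ipc is a hypothetical helper, not part of gem5):

```python
def effective_ipc(insts_per_group, fetch_cycles, stall_cycles):
    """Average IPC when every group of instructions pays a fixed fetch stall."""
    return insts_per_group / (fetch_cycles + stall_cycles)

# Figures from the message: 32 instructions fetched over 8 cycles,
# followed by a 2-cycle stall (classic memory) or a 3-cycle stall (Ruby).
classic = effective_ipc(32, 8, 2)
ruby = effective_ipc(32, 8, 3)

print(f"classic memory: {classic:.2f}")  # 3.20
print(f"Ruby memory:    {ruby:.2f}")     # 2.91
```

Under these assumptions, a 4-way core reaches an IPC of 4 only if the
per-line stall is zero.
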
>   So my questions are:
>
>   1. Since Gem5 does not accept a zero hit latency, is there a way to
> access icache every cycle without any stall? Let's assume there are no
> icache misses.
>
>   2. The icache hit latencies for both the Ruby memory and Gem5 memory
> cases were 2 cycles, so why did the Ruby case experience an extra cycle
> of stall?
>
>   I was running full-system Gem5 (changeset 9305:ac608464be80) with the
> X86 ISA and a single detailed CPU. For Ruby, I used the MOESI_hammer
> protocol.
>
>
> Thanks!
>
> Runjie Zhang
> University of Virginia
>
> _______________________________________________
> gem5-users mailing list
> gem5-users@gem5.org
> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
>