How about *not* pushing cache latencies into the queue? I am not sure whether this is correct, though.
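As a rough sanity check on the numbers in the quoted message below, here is a small sketch of the IPC ceiling that a periodic fetch stall would impose. It assumes the stall occurs exactly once per 32-instruction group and that nothing else limits throughput; the function name `ipc_upper_bound` and this simplified model are my own illustration, not anything taken from gem5.

```python
def ipc_upper_bound(insts_per_group, fetch_width, stall_cycles):
    """Best-case IPC when each group of instructions costs extra stall cycles.

    Assumed model (not gem5's actual fetch logic): the group is fetched at
    full width, then the fetch unit idles for `stall_cycles` before the next
    group starts.
    """
    fetch_cycles = insts_per_group / fetch_width   # 32 / 4 = 8 cycles
    return insts_per_group / (fetch_cycles + stall_cycles)


if __name__ == "__main__":
    # Stall lengths as reported in the quoted message.
    for name, stall in [("classic memory", 2), ("Ruby", 3)]:
        print(f"{name}: IPC <= {ipc_upper_bound(32, 4, stall):.2f}")
```

Under these assumptions the ceiling is 3.20 for the classic memory system and about 2.91 for Ruby, so an IPC of 4 is only reachable if the stall goes to zero.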
Regards,
Mahmood

On Mon, Oct 22, 2012 at 10:58 PM, Runjie Zhang <rz...@virginia.edu> wrote:
> Greetings,
>
> I tried to write stressmarks in X86 assembly so that the simulated IPC
> or O3CPU can hit N for a N-way out-of-order core. However, no matter how I
> modify the assembly, the IPC could never reach 4 for a 4-way OoO core.
>
> According to the execution trace, icache stall was the trouble maker. In
> my case, even if the whole program fits in icache, the fetch unit still
> stalls for a few cycles between fetching 32 instructions over 8 cycles(I
> assume 32 X86 ADD instructions fill one cache line?). With Gem5 memory
> system (no Ruby), this latency is 2 cycles. With Ruby memory, this latency
> is 3 cycles.
>
> So my questions are:
>
> 1. Since Gem5 does not accept a zero hit latency, is there a way to
> access icache every cycle without any stall? Let's assume there are no
> icache misses.
>
> 2. The icache hit latencies for both Ruby memory and Gem5 memory cores
> were 2 cycles, why the Ruby case experienced an extra cycle stall?
>
> I was running Full System Gem5(changeset: 9305:ac608464be80) with X86
> ISA and single detailed CPU. For Ruby, I used MOESI_hammer protocol.
>
>
> Thanks!
>
> Runjie Zhang
> University of Virginia
>
> _______________________________________________
> gem5-users mailing list
> gem5-users@gem5.org
> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
>