On Wed, 24 Oct 2012, Runjie Zhang wrote:
Hi, Nilay
I agree with you that, to fetch from the icache every cycle, the hit
latency doesn't have to be zero.
Here is a snapshot from the exec trace (I deleted some details to
make it clearer). The icache hit latency is 1 cycle and the fetch width is 4.
(Ticks)
...60..: ....fetch: Running stage.
...60..: ....fetch: Attempting to fetch from [tid:0]
...60..: ....fetch: [tid:0]: Adding instructions to queue to decode.
...60..: ....fetch: [tid:0]: Instruction PC 0x400ab7 (0) created [sn:5050].
...60..: ....fetch: [tid:0]: Instruction PC 0x400ab9 (0) created [sn:5051].
...60..: ....fetch: [tid:0]: Instruction PC 0x400abb (0) created [sn:5052].
...60..: ....fetch: [tid:0]: Instruction PC 0x400abd (0) created [sn:5053].
...60..: ....fetch: [tid:0]: Done fetching, reached fetch bandwidth for this cycle.
What happened on cycles 61-64? Shouldn't the fetch unit try to create
four instructions every cycle?
...65..: ....fetch: Running stage.
...65..: ....fetch: Attempting to fetch from [tid:0]
...65..: ....fetch: [tid:0]: Adding instructions to queue to decode.
...65..: ....fetch: [tid:0]: Issuing a pipelined I-cache access, starting at PC (0x400abf=>0x400ac7).(0=>1).
...65..: ....fetch: [tid:0] Fetching cache line 0x400ac0 for addr 0x400ac0
...65..: ....fetch: Fetch: Doing instruction read.
...65..: ....fetch: [tid:0]: Doing Icache access.
What happened in the in-between cycles?
...70..: ....fetch: [tid:0] Waking up from cache miss.
...70..: ....fetch: Running stage.
...70..: ....fetch: Attempting to fetch from [tid:0]
...70..: ....fetch: [tid:0]: Icache miss is complete.
...70..: ....fetch: [tid:0]: Adding instructions to queue to decode.
...70..: ....fetch: [tid:0]: Instruction PC 0x400abf (0) created [sn:5054].
...70..: ....fetch: [tid:0]: Instruction PC 0x400ac1 (0) created [sn:5055].
...70..: ....fetch: [tid:0]: Instruction PC 0x400ac3 (0) created [sn:5056].
...70..: ....fetch: [tid:0]: Instruction PC 0x400ac5 (0) created [sn:5057].
...70..: ....fetch: [tid:0]: Done fetching, reached fetch bandwidth for this cycle.
On entering cycle 65, the previous cache line had been consumed, so
the fetch unit launched a pipelined icache access. However, this
access has a latency of 1 cycle, so the fetch unit needs to wait until
cycle 70 to start fetching again. This creates a one-cycle stall. If I
understand correctly, this latency could be hidden if the pipelined
icache access were launched one cycle earlier (in cycle 60). Can I
configure that in gem5?
One cycle earlier would mean cycle 64, not cycle 60. You have
completely removed the trace for the in-between cycles, which is required
for understanding what was going on in the fetch unit during those cycles.
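One way to check whether the fetch unit was really idle in the in-between
cycles is to scan the full, unedited trace for gaps. The Python sketch
below assumes each trace line has the form "<tick>: <object name>:
<message>" and that the clock period in ticks is known. The object name
system.cpu.fetch and the tick values in the sample are hypothetical and
are not taken from the trace in this thread.

```python
import re

# Assumed line layout: "<tick>: <object name>: <message>"
LINE_RE = re.compile(r"^\s*(\d+):\s*(\S+):\s*(.*)$")


def fetch_gaps(lines, ticks_per_cycle):
    """Return (prev_tick, tick, idle_cycles) for each gap in fetch output."""
    ticks = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # e.g. a wrapped continuation line
        tick, name = int(match.group(1)), match.group(2)
        # Record each tick once, the first time fetch prints anything.
        if name.endswith("fetch") and (not ticks or ticks[-1] != tick):
            ticks.append(tick)
    gaps = []
    for prev, cur in zip(ticks, ticks[1:]):
        idle = (cur - prev) // ticks_per_cycle - 1
        if idle > 0:
            gaps.append((prev, cur, idle))
    return gaps


if __name__ == "__main__":
    # Hypothetical sample with made-up ticks and a 500-tick clock period.
    sample = [
        "6000: system.cpu.fetch: Running stage.",
        "6000: system.cpu.fetch: [tid:0]: Done fetching, reached fetch bandwidth",
        "for this cycle.",
        "6500: system.cpu.fetch: Running stage.",
        "8000: system.cpu.fetch: Running stage.",
    ]
    print(fetch_gaps(sample, ticks_per_cycle=500))
```

For the sample, the only gap reported is between ticks 6500 and 8000,
which is two idle cycles. An empty result on a real trace would suggest
that the apparent gaps are just consecutive cycles.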
I am not sure whether the Fetch flag is enough to study this
phenomenon. If not, please tell me which other flags I should use.
BTW, the O3CPUALL debug flag does not seem to work. I got the error
"invalid debug flag 'O3CPUALL'".
It is not working because you are using the wrong flag. The correct flag
is O3CPUAll.
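The two spellings differ only in letter case, so a small lookup can catch
this kind of mistake before a run. The Python helper below is a
hypothetical sketch and not part of gem5. Its list of known flags contains
only the two names mentioned in this thread.

```python
def suggest_flag(name, known_flags):
    """Return the known flag that matches name ignoring case, else None."""
    by_lower = {flag.lower(): flag for flag in known_flags}
    return by_lower.get(name.lower())


# Only the flags named in this thread; a real list would be much longer.
KNOWN_FLAGS = ["Fetch", "O3CPUAll"]

if __name__ == "__main__":
    print(suggest_flag("O3CPUALL", KNOWN_FLAGS))
```

Here suggest_flag("O3CPUALL", KNOWN_FLAGS) returns "O3CPUAll", and a name
with no match in any letter case returns None.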
--
Nilay