Hi Arthur Perais,

Many thanks for the explanation.

I have kept all the delay parameters at 1, except renameToIEWDelay=2.
It looks like there is already a fetch queue between fetch and decode. It was
added to decouple the fetch and decode stages:
http://reviews.gem5.org/r/2297/diff/1/

Based on the above, I added a 32-entry decode queue between decode and rename.
To decouple decode and rename, I modified decode_impl.hh so that decode blocks
only if the decode queue is full. Without this change, decode stalls in the
very next cycle after rename blocks. Instructions coming from fetch are stored
in the decode queue first, and if rename is not blocked they are transferred to
the rename stage. Any comments on this approach? I have attached the patch
(apologies, the patch is not formatted properly).
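
To make the intended behaviour concrete, here is a minimal Python sketch of the
queue policy described above. It is a toy model, not the gem5 code and not the
attached patch: the class name, the method names and the rename_width parameter
are made up for illustration, and it assumes rename drains the queue before
decode refills it within a cycle.

```python
from collections import deque


class DecodeQueueModel:
    """Toy bounded FIFO between decode and rename (hypothetical, not gem5)."""

    def __init__(self, capacity=32, rename_width=4):
        self.capacity = capacity          # 32 entries, as in the mail
        self.rename_width = rename_width  # assumed rename bandwidth per cycle
        self.queue = deque()

    def decode_must_block(self):
        # Decode stalls only when the queue has no free entry left.
        return len(self.queue) >= self.capacity

    def tick(self, from_fetch, rename_blocked):
        """Advance one cycle; return (sent_to_rename, accepted_from_fetch)."""
        sent = []
        # Rename drains the oldest entries first, unless it is blocked.
        if not rename_blocked:
            while self.queue and len(sent) < self.rename_width:
                sent.append(self.queue.popleft())
        # Decode then fills whatever space is left in the queue.
        free = self.capacity - len(self.queue)
        accepted = list(from_fetch[:free])
        self.queue.extend(accepted)
        return sent, accepted
```

In this model a blocked rename stage does not stall decode immediately; decode
only stalls once decode_must_block() becomes true, i.e. after the queue has
absorbed as many instructions as it has entries.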

Could you please give me your feedback on the questions below?

[Question 1]
With this change, I see the rename stage blocking very frequently, and because
of this the decode queue fills up very quickly.
Digging further, I found that much of this is due to serializing stalls, which
in turn are caused by instructions such as stxr, stlxr, wsr, etc.
(store conditionals, serialize-after and serialize-before; see rename_impl.hh,
line 653).
system.cpu.numCycles                    80631180
system.cpu.rename.IdleCycles             3847642
system.cpu.rename.BlockCycles           25363049
system.cpu.rename.serializeStallCycles  28221811
system.cpu.rename.RunCycles             10325847
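
To put these counters in proportion, the short Python sketch below divides each
rename counter by numCycles. The numbers are copied from the stats above; the
dictionary keys and the function name are only illustrative. By this
calculation serializing stalls take roughly 35% of all cycles and blocking
about 31%. The four rename counters listed here do not add up to numCycles, so
the remaining cycles presumably fall into rename categories not shown in this
mail.

```python
# Counters copied from the stats listed in this mail.
stats = {
    "numCycles": 80631180,
    "rename.IdleCycles": 3847642,
    "rename.BlockCycles": 25363049,
    "rename.serializeStallCycles": 28221811,
    "rename.RunCycles": 10325847,
}


def cycle_shares(stats):
    """Return each rename counter as a fraction of numCycles."""
    total = stats["numCycles"]
    return {
        name: count / total
        for name, count in stats.items()
        if name != "numCycles"
    }


shares = cycle_shares(stats)
for name, share in shares.items():
    # Print each counter as a percentage of all simulated cycles.
    print(f"{name:30s} {share:6.1%}")
```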

However, I did not find any of these instructions in the objdump of the dgemm
benchmark, which I am running on gem5 in full-system mode.
Are they coming from Linux? Any comments on modifying these instructions so
that they do not produce serializing stalls?
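
For reference, the objdump check can be sketched as the small Python helper
below. It assumes the usual tab-separated "address: encoding mnemonic operands"
layout of objdump -d output. The function name is made up, and the mnemonic set
simply repeats the names as written in this mail; the disassembler may print
some of them differently.

```python
from collections import Counter

# Mnemonics named in this mail; an assumption, extend or correct as needed.
SUSPECT_MNEMONICS = {"stxr", "stlxr", "wsr"}


def count_mnemonics(disassembly, wanted=SUSPECT_MNEMONICS):
    """Count occurrences of the wanted mnemonics in objdump -d text."""
    counts = Counter()
    for line in disassembly.splitlines():
        fields = line.split("\t")
        # Instruction lines look like "<addr>:\t<encoding>\t<mnemonic>...".
        # Symbol labels and blank lines have no tabs and are skipped.
        if len(fields) < 3 or not fields[0].strip().endswith(":"):
            continue
        parts = fields[2].split()
        if parts and parts[0].lower() in wanted:
            counts[parts[0].lower()] += 1
    return counts
```

The same helper could be run over a disassembly of the kernel image instead of
the benchmark binary, which would show whether these instructions are present
in the Linux code.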


[Question 2]
In inst_queue.hh, line 79, just before class InstructionQueue, there is this
comment:
@todo: Make IQ able to handle multiple FU pools.

Suppose we have 1 integer unit, 1 floating-point unit and 1 load-store unit.
Does this mean that the IQ can handle only one of them per cycle?
But from the gem5 stats, it seems gem5 is executing multiple instructions per
cycle,
e.g. system.cpu.iq.issued_per_cycle::2  8315985  10.41%  91.55%.
Am I missing something? Could you please explain what "Make IQ able to
handle multiple FU pools." means?

Thanks in advance for your time and patience.

-- 
with regards,
Virendra Kumar Pathak

Attachment: 0001-adding-decodeQueue-between-decode-and-rename.patch
Description: Binary data

