If you could post it for review it would be a lot easier to
understand since the email seems to have stripped all indenting.


Thanks, 

Ali 

On 13.07.2012 12:47, Dibakar Gope wrote: 

> Hi Nilay,
>
> Sorry for the late response; I didn't check my email since last
> night :).
> 
> Anyway, the checkviolations part that we are talking about takes care
> of not having any CMP violation of coherence, but it does not re-execute
> a load (one not at the front of the commit queue) and the younger insts
> following it upon receiving a snoop invalidation request. So, in my
> understanding, it does not enforce the strict load-load ordering of a
> stronger model. I therefore added a couple of lines to checkSnoop; see
> the changes below.
>
> (1) The first if clause, checking the "// If there are no loads in the
> LSQ we don't care" condition, was wrong, I guess, in the existing code;
> with the "if (load_idx == loadTail)" clause it was actually checking
> "If there are no loads in the LSQ we don't care". So, with an additional
> if clause, I make sure that if the snoop hits the front of the load
> queue, nothing needs to be done.
>
> (2) Further, I added a clause towards the end of checkSnoop() with the
> needsSC condition: if the snoop hits an executed load that is not at the
> front of the queue, the load is re-executed using ReExec (hopefully
> ReExec squashes that load and all the younger insts and re-fetches them,
> as I understood from Ali's response).
> 
> The other changes that I made to maintain SC add a few more constraints
> on the load queue to ensure store-load ordering, i.e. a load in the load
> queue cannot retire from the ROB until the committed store instructions
> before it in program order are exposed to the memory system. As a
> result, a load can still receive snoop invalidates and be re-executed if
> needed. I can post my changes to enforce SC for review.
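
The store-load constraint described above can be illustrated with a small standalone sketch. This is not gem5 code: `ToyStore` and `loadMayRetire` are hypothetical names, and the sketch assumes each instruction carries a monotonically increasing sequence number and that a store counts as "exposed to the memory system" once a `writtenBack` flag is set.

```cpp
#include <cstdint>
#include <deque>

// Hypothetical stand-in for a committed store waiting in a store buffer.
struct ToyStore {
    std::uint64_t seqNum;  // program-order sequence number
    bool writtenBack;      // true once the store is visible to memory
};

// Returns true if the load with sequence number loadSeqNum may retire,
// i.e. no older store is still waiting to be written back.
inline bool
loadMayRetire(const std::deque<ToyStore> &storeBuffer,
              std::uint64_t loadSeqNum)
{
    for (const ToyStore &st : storeBuffer) {
        if (st.seqNum < loadSeqNum && !st.writtenBack)
            return false;  // an older store is not yet exposed
    }
    return true;
}
```

While such a load is held back it stays in the load queue, so a snoop invalidate can still reach it, which is the effect the paragraph above relies on.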
> 
> template <class Impl>
> void
> LSQUnit<Impl>::checkSnoop(PacketPtr pkt)
> {
>     int load_idx = loadHead;
>
>     if (!cacheBlockMask) {
>         assert(dcachePort);
>         Addr bs = dcachePort->peerBlockSize();
>
>         // Make sure we actually got a size
>         assert(bs != 0);
>
>         cacheBlockMask = ~(bs - 1);
>     }
>
>     // If there are no loads in the LSQ we don't care
>     if (load_idx == loadTail) {
>         DPRINTF(LSQUnit, "loadHead: %d, loadTail:%d\n",
>                 loadHead, loadTail);
>         //assert(0);
>         return;
>     }
>
>     // If this is the only load in the LSQ we don't care
>     if (loadTail == (load_idx + 1)) {
>         DPRINTF(LSQUnit, "loadHead: %d, loadTail:%d\n",
>                 loadHead, loadTail);
>         //assert(0);
>         return;
>     }
>     incrLdIdx(load_idx);
>     DPRINTF(LSQUnit, "Got snoop for address %#x\n", pkt->getAddr());
>     Addr invalidate_addr = pkt->getAddr() & cacheBlockMask;
>     while (load_idx != loadTail) {
>         DynInstPtr ld_inst = loadQueue[load_idx];
>
>         if (!ld_inst->effAddrValid || ld_inst->uncacheable()) {
>             incrLdIdx(load_idx);
>             continue;
>         }
>
>         Addr load_addr = ld_inst->physEffAddr & cacheBlockMask;
>         DPRINTF(LSQUnit, "-- inst [sn:%lli] load_addr: %#x to pktAddr:%#x\n",
>                 ld_inst->seqNum, load_addr, invalidate_addr);
>
>         if (load_addr == invalidate_addr) {
>             if (ld_inst->possibleLoadViolation) {
>                 DPRINTF(LSQUnit, "Conflicting load at addr %#x [sn:%lli]\n",
>                         ld_inst->physEffAddr, pkt->getAddr(), ld_inst->seqNum);
>
>                 // Mark the load for re-execution
>                 ld_inst->fault = new ReExec;
>             } else {
>                 // If a older load checks this and it's true
>                 // then we might have missed the snoop
>                 // in which case we need to invalidate to be sure
>                 ld_inst->hitExternalSnoop = true;
>
>                 if (needsSC == true) {
>                     ld_inst->fault = new ReExec;
>                 }
>             }
>         }
>         incrLdIdx(load_idx);
>     }
>     return;
> }
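
For readers who want to experiment with the control flow outside the simulator, here is a minimal standalone sketch of the same walk over the load queue. It is not gem5 code: `ToyLoad`, `ToyLoadQueue`, and `snoopInvalidate` are hypothetical names, the queue is modeled as a simple circular buffer, and the sketch assumes the SC case, in which every matching load other than the oldest one is marked for re-execution.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one load-queue entry; not a gem5 type.
struct ToyLoad {
    std::uint64_t physAddr = 0;
    bool addrValid = false;    // effective address has been computed
    bool needsReExec = false;  // set when a snoop forces re-execution
};

// Toy circular load queue: head is the oldest load, tail is one past
// the youngest load.
struct ToyLoadQueue {
    std::vector<ToyLoad> entries;
    std::size_t head = 0;
    std::size_t tail = 0;

    std::size_t next(std::size_t idx) const
    {
        return (idx + 1) % entries.size();
    }
};

// Marks every load younger than the head whose cache block matches the
// snooped address, and returns how many loads were marked.
inline int
snoopInvalidate(ToyLoadQueue &q, std::uint64_t snoopAddr,
                std::uint64_t blockSize)
{
    // The block size must be a non-zero power of two for the mask to work.
    assert(blockSize != 0 && (blockSize & (blockSize - 1)) == 0);
    const std::uint64_t mask = ~(blockSize - 1);

    if (q.head == q.tail)
        return 0;  // no loads in the queue

    int marked = 0;
    // Start one past the head: the oldest load is never re-executed.
    for (std::size_t i = q.next(q.head); i != q.tail; i = q.next(i)) {
        ToyLoad &ld = q.entries[i];
        if (!ld.addrValid)
            continue;  // address not known yet, nothing to compare
        if ((ld.physAddr & mask) == (snoopAddr & mask)) {
            ld.needsReExec = true;
            ++marked;
        }
    }
    return marked;
}
```

When the queue holds a single load, the loop body never runs, which corresponds to the "only load in the LSQ" early return in the patch above.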
> 
> On 07/12/12, Nilay Vaish wrote:
>
>> Dibakar, any progress on this front?
>>
>> On Wed, 27 Jun 2012, Ali Saidi wrote:
>>
>>> Hi Dibakar,
>>>
>>> I'm not saying that I believe this is correct for x86. It seems like
>>> x86 does require more ordering than is currently provided by the lsq.
>>> Hopefully someone with more x86 experience could chime in and confirm
>>> that. The faulting mechanism needs an overhaul in the o3 cpu. There
>>> shouldn't be any fundamental difference.
>>>
>>> Thanks,
>>> Ali
>>>
>>> On 27.06.2012 18:08, Dibakar Gope wrote:
>>>
>>>> Hi Ali, from this thread,
>>>> http://www.mail-archive.com/gem5-dev@gem5.org/msg00782.html [3], I get
>>>> the idea that a snoop invalidate will make a younger load and the
>>>> younger instructions following it re-execute only if an older load in
>>>> program order to the same cache block sees an updated value. But I am
>>>> still not sure whether it obeys the load-load ordering of a
>>>> consistency model stronger than ARM's. Suppose, for example,
>>> 
>>>> C0      C1
>>>> St A    Ld C
>>>> St B    Ld A
> 
> _______________________________________________
> gem5-users mailing list
> gem5-users@gem5.org
> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users




Links:
------
[1] mailto:gem5-users@gem5.org
[2] http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
[3] http://www.mail-archive.com/gem5-dev@gem5.org/msg00782.html
[4] mailto:gem5-users@gem5.org
[5] http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
[6] mailto:gem5-users@gem5.org
[7] http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
[8] mailto:gem5-users@gem5.org
[9] http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users