Sure, I will do that. Let me see how I can make a diff file with all the changes (changes also need to be made to obey the store-load ordering of a stronger model!) and post it for review.
Thanks,
dibakar

On 07/13/12, Ali Saidi wrote:
> If you could post it for review it would be a lot easier to understand,
> since the email seems to have stripped all indenting.
>
> Thanks,
> Ali
>
> On 13.07.2012 12:47, Dibakar Gope wrote:
> > Hi Nilay,
> >
> > Sorry for the late response, I didn't check my emails since last night
> > :). Anyway, the checkViolations part that we are talking about takes
> > care of avoiding any CMP coherence violation, but it does not
> > re-execute a load (one that is not at the front of the commit queue)
> > and the following younger insts upon receiving a snoop invalidation
> > request. So, in my understanding, it does not enforce the strict
> > load-load ordering of a stronger model. So I added a couple of lines
> > in checkSnoop; see the changes below.
> >
> > (1) The first if clause in the existing code was wrong, I guess: with
> >     the "if (load_idx == loadTail)" clause it was actually checking
> >     the "// If there are no loads in the LSQ we don't care" condition.
> >     So with an additional if clause, I make sure that if the snoop
> >     hits the front of the load queue, then nothing needs to be done.
> >
> > (2) Further, I added a clause towards the end of checkSnoop() with the
> >     needsSC condition: if the snoop hits an executed load that is not
> >     at the front of the queue, the load is re-executed using ReExec
> >     (hopefully ReExec squashes all the younger insts including that
> >     one and re-fetches, as I understood from Ali's response).
> >
> > The other change that I made to maintain SC is to add a few more
> > constraints on the load queue to ensure store-load ordering, i.e. a
> > load in the load queue cannot retire from the ROB until the committed
> > store instructions before it in program order are exposed to the
> > memory system. As a result, a load can still receive snoop invalidates
> > and be re-executed if needed. I can post my changes to enforce SC for
> > review.
> >
> > template <class Impl>
> > void
> > LSQUnit<Impl>::checkSnoop(PacketPtr pkt)
> > {
> >     int load_idx = loadHead;
> >
> >     if (!cacheBlockMask) {
> >         assert(dcachePort);
> >         Addr bs = dcachePort->peerBlockSize();
> >
> >         // Make sure we actually got a size
> >         assert(bs != 0);
> >
> >         cacheBlockMask = ~(bs - 1);
> >     }
> >
> >     // If there are no loads in the LSQ we don't care
> >     if (load_idx == loadTail) {
> >         DPRINTF(LSQUnit, "loadHead: %d, loadTail:%d\n",
> >                 loadHead, loadTail);
> >         //assert(0);
> >         return;
> >     }
> >
> >     // If this is the only load in the LSQ we don't care
> >     if (loadTail == (load_idx + 1)) {
> >         DPRINTF(LSQUnit, "loadHead: %d, loadTail:%d\n",
> >                 loadHead, loadTail);
> >         //assert(0);
> >         return;
> >     }
> >
> >     incrLdIdx(load_idx);
> >
> >     DPRINTF(LSQUnit, "Got snoop for address %#x\n", pkt->getAddr());
> >     Addr invalidate_addr = pkt->getAddr() & cacheBlockMask;
> >
> >     while (load_idx != loadTail) {
> >         DynInstPtr ld_inst = loadQueue[load_idx];
> >
> >         if (!ld_inst->effAddrValid || ld_inst->uncacheable()) {
> >             incrLdIdx(load_idx);
> >             continue;
> >         }
> >
> >         Addr load_addr = ld_inst->physEffAddr & cacheBlockMask;
> >         DPRINTF(LSQUnit,
> >                 "-- inst [sn:%lli] load_addr: %#x to pktAddr:%#x\n",
> >                 ld_inst->seqNum, load_addr, invalidate_addr);
> >
> >         if (load_addr == invalidate_addr) {
> >             if (ld_inst->possibleLoadViolation) {
> >                 DPRINTF(LSQUnit,
> >                         "Conflicting load at addr %#x [sn:%lli]\n",
> >                         ld_inst->physEffAddr, ld_inst->seqNum);
> >
> >                 // Mark the load for re-execution
> >                 ld_inst->fault = new ReExec;
> >             } else {
> >                 // If an older load checks this and it's true
> >                 // then we might have missed the snoop
> >                 // in which case we need to invalidate to be sure
> >                 ld_inst->hitExternalSnoop = true;
> >
> >                 // Under SC, also re-execute the snooped load
> >                 if (needsSC) {
> >                     ld_inst->fault = new ReExec;
> >                 }
> >             }
> >         }
> >         incrLdIdx(load_idx);
> >     }
> >     return;
> > }
> >
> > On 07/12/12, Nilay Vaish wrote:
> > > Dibakar, any progress on this front?
> > >
> > > On Wed, 27 Jun 2012, Ali Saidi wrote:
> > > > Hi Dibakar,
> > > >
> > > > I'm not saying that I believe this is correct for x86. It seems
> > > > like x86 does require more ordering than is currently provided by
> > > > the lsq. Hopefully someone with more x86 experience could chime in
> > > > and confirm that.
> > > > The faulting mechanism needs an overhaul in the o3 cpu.
> > > >
> > > > There shouldn't be any fundamental difference.
> > > >
> > > > Thanks,
> > > > Ali
> > > >
> > > > On 27.06.2012 18:08, Dibakar Gope wrote:
> > > > > Hi Ali,
> > > > >
> > > > > From this thread,
> > > > > http://www.mail-archive.com/gem5-dev@gem5.org/msg00782.html,
> > > > > I get the idea that a snoop invalidate will make a younger load
> > > > > and the younger instructions following it re-execute only if an
> > > > > older load in program order to the same cache block sees an
> > > > > updated value. But I am still not sure whether it obeys the
> > > > > load-load ordering of a consistency model stronger than ARM's.
> > > > > Suppose, for example:
> > > > >
> > > > >     C0      C1
> > > > >     St A    Ld C
> > > > >     St B    Ld A
> > > > >
> > > > > In the above scenario, if the memory order becomes
> > > > > Ld A -> St A -> St B -> Ld C and C1 receives an invalidation for
> > > > > cache block A before Ld A makes it to the front of the commit
> > > > > queue, the checkViolations() code still won't squash Ld A and
> > > > > any younger instructions to maintain strong consistency.
> > > > >
> > > > > My other doubt is whether we can use the squashDueToMemOrder()
> > > > > squash mechanism instead of a ReExec fault if I want to squash
> > > > > Ld A and the younger instructions and re-fetch them in the above
> > > > > scenario. ReExec waits for the faulted instruction to reach the
> > > > > front of the commit queue; is there any other fundamental
> > > > > difference between ReExec and squashDueToMemOrder() besides
> > > > > this?
> > > > >
> > > > > Thanks,
> > > > > --Dibakar
> > > > >
> > > > > On 06/25/12, Ali Saidi wrote:
> > > > > > ARM just requires load-load ordering (which is stronger than
> > > > > > alpha). x86 to my knowledge requires all stores in the system
> > > > > > to be visible in the same order.
> > > > > >
> > > > > > Ali
> > > > > >
> > > > > > On Jun 22, 2012, at 11:50 PM, Nilay wrote:
> > > > > > > What's the difference between ARM's load-load ordering and
> > > > > > > TSO? I am guessing in ARM not all instructions are flushed
> > > > > > > from the pipe, but only those that are affected by the
> > > > > > > snoop.
> > > > > > > My understanding is that the O3 CPU flushes the entire
> > > > > > > pipeline when it sees that an instruction needs to execute
> > > > > > > again. Since instructions commit in order, any load that
> > > > > > > gets squashed would mean that all subsequent loads are
> > > > > > > squashed as well.
> > > > > > >
> > > > > > > --
> > > > > > > Nilay
> > > > > > >
> > > > > > > On Fri, June 22, 2012 8:47 am, Ali Saidi wrote:
> > > > > > > > Hi Dibakar,
> > > > > > > >
> > > > > > > > I'd have to think carefully about it, but you may be
> > > > > > > > right about TSO. I'd hope that someone who is more
> > > > > > > > familiar with x86 could respond.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Ali
> > > > > > > >
> > > > > > > > On 22.06.2012 07:46, Dibakar Gope wrote:
> > > > > > > > > Hi Ali,
> > > > > > > > >
> > > > > > > > > Thanks for the response. OK, I got the point. I
> > > > > > > > > thought that since the O3 attempts to support TSO for
> > > > > > > > > x86, it inherently enforces/covers the regular
> > > > > > > > > load-load ordering present in any stronger consistency
> > > > > > > > > model. But if it is in line with ARM's requirements,
> > > > > > > > > then does it not violate x86 and TSO's conventional
> > > > > > > > > load-load ordering?
> > > > > > > > >
> > > > > > > > > thanks,
> > > > > > > > > Dibakar

_______________________________________________
gem5-users mailing list
gem5-users@gem5.org
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
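
For readers following the thread, the snoop check discussed in these messages can be sketched outside gem5. This is a minimal, hypothetical model, not the gem5 code and not the patch under review: LoadEntry, LoadQueue, the reExec flag (standing in for "fault = new ReExec") and the checkSnoop signature used here are all assumptions made for illustration. It only shows which loads get marked when a snooped invalidation arrives; squashing and re-fetching are not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified load-queue entry (not gem5's DynInst).
struct LoadEntry {
    std::uint64_t physAddr = 0;
    bool addrValid = false;             // effective address is known
    bool uncacheable = false;
    bool possibleLoadViolation = false;
    bool hitExternalSnoop = false;
    bool reExec = false;                // stands in for "fault = new ReExec"
};

// Hypothetical circular load queue. Entries in [head, tail) are in
// flight, oldest first; head == tail means the queue is empty.
struct LoadQueue {
    std::vector<LoadEntry> entries;
    std::size_t head = 0;
    std::size_t tail = 0;

    std::size_t next(std::size_t idx) const
    {
        return (idx + 1) % entries.size();
    }
};

// Apply a snooped invalidation to every load younger than the head.
// Returns the number of loads marked for re-execution by this call.
int checkSnoop(LoadQueue &lq, std::uint64_t snoopAddr,
               std::uint64_t blockSize, bool needsSC)
{
    // The block size must be a non-zero power of two for the mask to work.
    assert(blockSize != 0 && (blockSize & (blockSize - 1)) == 0);
    const std::uint64_t mask = ~(blockSize - 1);
    const std::uint64_t invalidateAddr = snoopAddr & mask;

    // No loads in the queue: nothing to check.
    if (lq.head == lq.tail)
        return 0;

    int marked = 0;
    // Start at the load after the head. If the head is the only load,
    // the loop body never runs.
    for (std::size_t idx = lq.next(lq.head); idx != lq.tail;
         idx = lq.next(idx)) {
        LoadEntry &ld = lq.entries[idx];

        if (!ld.addrValid || ld.uncacheable)
            continue;
        if ((ld.physAddr & mask) != invalidateAddr)
            continue;

        if (ld.possibleLoadViolation) {
            ld.reExec = true;
            ++marked;
        } else {
            // Remember the snoop so an older load can detect it later.
            ld.hitExternalSnoop = true;
            // Stronger model: re-execute any younger load that was hit.
            if (needsSC) {
                ld.reExec = true;
                ++marked;
            }
        }
    }
    return marked;
}
```

With needsSC false, a matching younger load that has no possible violation is only flagged with hitExternalSnoop; with needsSC true it is also marked for re-execution. The sketch walks the queue with modular index arithmetic, so it needs no separate "only one load" check.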