If you could post it for review it would be a lot easier to understand since the email seems to have stripped all indenting.
Thanks,
Ali

On 13.07.2012 12:47, Dibakar Gope wrote:

> Hi Nilay,
>
> Sorry for the late response, I didn't check my email since last night :).
>
> Anyway, the checkviolations part that we are talking about makes sure that coherence is not violated in a CMP, but it does not re-execute a load (one that is not at the front of the commit queue) and the younger insts following it upon receiving a snoop invalidation request. So in my understanding it does not enforce the strict load-load ordering of a stronger model. So I added a couple of lines in checkSnoop; see the changes below.
>
> (1) I think the first if clause in the existing code was wrong: with the "if (load_idx == loadTail)" clause it was actually checking "If there are no loads in the LSQ we don't care". So with an additional if clause, I make sure that if the snoop hits the front of the load queue, nothing needs to be done.
>
> (2) Further, I added a clause towards the end of checkSnoop() that checks the needsSC condition: if the snoop hits an executed load that is not at the front of the queue, the load is re-executed using ReExec (hopefully ReExec squashes that load and all the younger insts and re-fetches them, as I understood from Ali's response).
>
> The other change I made to maintain SC is to add a few more constraints on the load queue to ensure store-load ordering, i.e. a load in the load queue cannot retire from the ROB until the committed store instructions before it in program order are exposed to the memory system. As a result, a load can still receive snoop invalidates and be re-executed if needed. I can post my changes to enforce SC for review.
>
> template <class Impl>
> void
> LSQUnit<Impl>::checkSnoop(PacketPtr pkt)
> {
>     int load_idx = loadHead;
>
>     if (!cacheBlockMask) {
>         assert(dcachePort);
>         Addr bs = dcachePort->peerBlockSize();
>
>         // Make sure we actually got a size
>         assert(bs != 0);
>
>         cacheBlockMask = ~(bs - 1);
>     }
>
>     // If there are no loads in the LSQ we don't care
>     if (load_idx == loadTail) {
>         DPRINTF(LSQUnit, "loadHead: %d, loadTail:%d\n", loadHead, loadTail);
>         //assert(0);
>         return;
>     }
>
>     // If this is the only load in the LSQ we don't care
>     if (loadTail == (load_idx + 1)) {
>         DPRINTF(LSQUnit, "loadHead: %d, loadTail:%d\n", loadHead, loadTail);
>         //assert(0);
>         return;
>     }
>     incrLdIdx(load_idx);
>     DPRINTF(LSQUnit, "Got snoop for address %#x\n", pkt->getAddr());
>     Addr invalidate_addr = pkt->getAddr() & cacheBlockMask;
>     while (load_idx != loadTail) {
>         DynInstPtr ld_inst = loadQueue[load_idx];
>
>         if (!ld_inst->effAddrValid || ld_inst->uncacheable()) {
>             incrLdIdx(load_idx);
>             continue;
>         }
>
>         Addr load_addr = ld_inst->physEffAddr & cacheBlockMask;
>         DPRINTF(LSQUnit, "-- inst [sn:%lli] load_addr: %#x to pktAddr:%#x\n",
>                 ld_inst->seqNum, load_addr, invalidate_addr);
>
>         if (load_addr == invalidate_addr) {
>             if (ld_inst->possibleLoadViolation) {
>                 DPRINTF(LSQUnit, "Conflicting load at addr %#x [sn:%lli]\n",
>                         ld_inst->physEffAddr, pkt->getAddr(), ld_inst->seqNum);
>
>                 // Mark the load for re-execution
>                 ld_inst->fault = new ReExec;
>             } else {
>                 // If a older load checks this and it's true
>                 // then we might have missed the snoop
>                 // in which case we need to invalidate to be sure
>                 ld_inst->hitExternalSnoop = true;
>
>                 if (needsSC == true) {
>                     ld_inst->fault = new ReExec;
>                 }
>             }
>         }
>         incrLdIdx(load_idx);
>     }
>     return;
> }
>
> On 07/12/12, Nilay Vaish wrote:
>
>> Dibakar, any progress on this front?
>>
>> On Wed, 27 Jun 2012, Ali Saidi wrote:
>>
>>> Hi Dibakar, I'm not saying that I believe this is correct for x86. It seems like x86 does require more ordering than is currently provided by the lsq.
>>> Hopefully someone with more x86 experience could chime in and confirm that. The faulting mechanism needs an overhaul in the o3 cpu. There shouldn't be any fundamental difference.
>>>
>>> Thanks,
>>> Ali
>>>
>>> On 27.06.2012 18:08, Dibakar Gope wrote:
>>>
>>>> Hi Ali, from this thread, http://www.mail-archive.com/gem5-dev@gem5.org/msg00782.html [3], I get the idea that a snoop invalidate makes a younger load and the younger instructions following it re-execute only if an older load in program order to the same cache block sees an updated value. But I am still not sure whether it obeys the load-load ordering of a consistency model stronger than ARM's. Suppose, for example,
>>>>
>>>> C0      C1
>>>> St A    Ld C
>>>> St B    Ld A
>
> _______________________________________________
> gem5-users mailing list
> gem5-users@gem5.org
> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users

Links:
------
[1] mailto:gem5-users@gem5.org
[2] http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
[3] http://www.mail-archive.com/gem5-dev@gem5.org/msg00782.html
[4] mailto:gem5-users@gem5.org
[5] http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
[6] mailto:gem5-users@gem5.org
[7] http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
[8] mailto:gem5-users@gem5.org
[9] http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
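
For readers who want to experiment with the snoop check described in this thread outside of gem5, here is a minimal standalone sketch of the same decision logic. It is not gem5 code: the Load struct, its fields, and this free-standing checkSnoop function are hypothetical stand-ins, the load queue is modeled as a plain vector ordered oldest first instead of a circular buffer, and a boolean reExecute flag stands in for "fault = new ReExec". The block size is passed in as a parameter and is assumed to be a power of two.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified load queue entry (not gem5's DynInst).
struct Load {
    uint64_t seqNum;            // program-order sequence number
    uint64_t physAddr;          // physical effective address
    bool addrValid;             // address computed and cacheable
    bool possibleLoadViolation; // set elsewhere by the violation check
    bool hitExternalSnoop;      // snoop matched but no re-execution forced
    bool reExecute;             // stands in for "fault = new ReExec"
};

// loads[0] is the oldest load (the queue head). Returns the sequence
// numbers of the loads that were marked for re-execution.
std::vector<uint64_t> checkSnoop(std::vector<Load>& loads, uint64_t snoopAddr,
                                 uint64_t blockSize, bool needsSC)
{
    std::vector<uint64_t> marked;

    // Assumption: blockSize is a non-zero power of two.
    assert(blockSize != 0 && (blockSize & (blockSize - 1)) == 0);
    const uint64_t mask = ~(blockSize - 1);
    const uint64_t invalidateAddr = snoopAddr & mask;

    // Start at index 1: with zero or one load in the queue nothing is
    // done, and the load at the head is never marked by a snoop.
    for (std::size_t i = 1; i < loads.size(); ++i) {
        Load& ld = loads[i];

        // Skip loads without a usable address.
        if (!ld.addrValid)
            continue;

        // Only loads to the snooped cache block are affected.
        if ((ld.physAddr & mask) != invalidateAddr)
            continue;

        if (ld.possibleLoadViolation) {
            // Already flagged as a possible violation: re-execute.
            ld.reExecute = true;
            marked.push_back(ld.seqNum);
        } else {
            // Remember the snoop so an older load can react to it later.
            ld.hitExternalSnoop = true;

            // Stricter model: re-execute on any matching snoop.
            if (needsSC) {
                ld.reExecute = true;
                marked.push_back(ld.seqNum);
            }
        }
    }
    return marked;
}
```

With needsSC set to false, only loads already flagged with possibleLoadViolation are marked; with needsSC set to true, every non-head load to the snooped block is marked, which is the stricter behaviour proposed above. Whether that is sufficient for x86 ordering is exactly the open question in the thread, so this sketch should not be read as settling it.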