On Mon, Sep 23, 2013 at 05:25:19PM +0200, Stephane Eranian wrote:
> > Its not just a broken threshold. When a PEBS event happens it can re-arm
> > itself but only if you program a RESET value !0. We don't do that, so
> > each counter should only ever fire once.
> >
> > We must do this because PEBS is broken on NHM+ in that the
> > pebs_record::status is a direct copy of the overflow status field at
> > time of the assist and if you use the RESET thing nothing will clear the
> > status bits and you cannot demux the PEBS events back to the event that
> > generated them.
> >
> Trying to understand this problem better. You are saying that in case you
> are sampling multiple PEBS events there is a problem if you allow more
> than one record per PEBS buffer because the overflow status is not reset
> properly.

That is what I wrote; but I'm not entirely sure that's correct. I think it
will reset the overflow bits once it does an actual reset after the PEBS
assist triggers, but see below.

> For instance, if first record is caused by counter 0, ovfl_status=0x1,
> then counter is reset. Then, if counter 1 is the cause of the next record,
> then that record has the ovfl_status=0x3 instead of ovfl_status=0x2? Is
> that what you are saying?
>
> If so then yes, I agree this is a serious bug and we need to have Intel fix
> it.

But there's still the case where with 2 counters you can get:

  cnt0 overflows; sets status |= 1 << 0, arms PEBS0 assist
  cnt1 overflows; sets status |= 1 << 1, arms PEBS1 assist

  PEBS0 ready to trigger
  PEBS1 ready to trigger

  Cnt1 event -> PEBS1 trigger, writes entry with status := 0x03
  Cnt0 event -> PEBS0 trigger, writes entry with status := 0x03

At which point you'll have 2 events with the same status overflow bits in
'reverse' order. If we'd set RESET, the second entry would have
status := 0x01, which would be unambiguous again. But we'd still not know
where to place the 0x03 entry.

With more PEBSn counters enabled and a threshold > 1 the chance of having
such scenarios is greatly increased.

The threshold := 1 case tries to avoid these cases by getting them out as
fast as possible and hopefully avoiding the second trigger.
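
For illustration only, the two-counter interleaving described in this mail
can be put into a small toy model in C. None of this is kernel or hardware
code: struct toy_pmu, toy_overflow(), toy_assist() and toy_run() are
made-up names, and the clear_on_assist flag merely encodes the guess that
an actual RESET clears the counter's overflow bit after its assist has
written a record.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_REC 4

/* Toy model only; hypothetical names, not the real PMU or perf code. */
struct toy_pmu {
	uint64_t status;		/* models the overflow status bits */
	int clear_on_assist;		/* assumption: RESET clears the bit */
	uint64_t rec[TOY_MAX_REC];	/* models pebs_record::status per entry */
	int nr;				/* number of records written */
};

/* Counter 'cnt' overflows: set its status bit (and arm its assist). */
static void toy_overflow(struct toy_pmu *p, int cnt)
{
	p->status |= 1ULL << cnt;
}

/* PEBS assist for 'cnt' triggers: the record gets a copy of status. */
static void toy_assist(struct toy_pmu *p, int cnt)
{
	assert(p->nr < TOY_MAX_REC);
	p->rec[p->nr++] = p->status;
	if (p->clear_on_assist)
		p->status &= ~(1ULL << cnt);
}

/* Replay the sequence: both counters overflow, PEBS1 fires before PEBS0. */
static void toy_run(struct toy_pmu *p, int clear_on_assist)
{
	p->status = 0;
	p->nr = 0;
	p->clear_on_assist = clear_on_assist;

	toy_overflow(p, 0);
	toy_overflow(p, 1);
	toy_assist(p, 1);
	toy_assist(p, 0);
}
```

Under these assumptions toy_run(&p, 0) leaves rec[] = { 0x03, 0x03 }, and
toy_run(&p, 1) leaves rec[] = { 0x03, 0x01 }: the second entry becomes
unambiguous, the first 0x03 entry still cannot be attributed to a counter.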