On Tue, Oct 25, 2016 at 02:33:55PM +0200, Morten Br?rup wrote: > Comments at the end. > > Med venlig hilsen / kind regards > - Morten Br?rup > > > -----Original Message----- > > From: Bruce Richardson [mailto:bruce.richardson at intel.com] > > Sent: Tuesday, October 25, 2016 2:20 PM > > To: Morten Br?rup > > Cc: Adrien Mazarguil; Wiles, Keith; dev at dpdk.org; Olivier Matz; Oleg > > Kuporosov > > Subject: Re: [dpdk-dev] mbuf changes > > > > On Tue, Oct 25, 2016 at 02:16:29PM +0200, Morten Br?rup wrote: > > > Comments inline. > > > > > > > -----Original Message----- > > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce > > > > Richardson > > > > Sent: Tuesday, October 25, 2016 1:14 PM > > > > To: Adrien Mazarguil > > > > Cc: Morten Br?rup; Wiles, Keith; dev at dpdk.org; Olivier Matz; Oleg > > > > Kuporosov > > > > Subject: Re: [dpdk-dev] mbuf changes > > > > > > > > On Tue, Oct 25, 2016 at 01:04:44PM +0200, Adrien Mazarguil wrote: > > > > > On Tue, Oct 25, 2016 at 12:11:04PM +0200, Morten Br?rup wrote: > > > > > > Comments inline. > > > > > > > > > > > > Med venlig hilsen / kind regards > > > > > > - Morten Br?rup > > > > > > > > > > > > > > > > > > > -----Original Message----- > > > > > > > From: Adrien Mazarguil [mailto:adrien.mazarguil at 6wind.com] > > > > > > > Sent: Tuesday, October 25, 2016 11:39 AM > > > > > > > To: Bruce Richardson > > > > > > > Cc: Wiles, Keith; Morten Br?rup; dev at dpdk.org; Olivier Matz; > > > > > > > Oleg Kuporosov > > > > > > > Subject: Re: [dpdk-dev] mbuf changes > > > > > > > > > > > > > > On Mon, Oct 24, 2016 at 05:25:38PM +0100, Bruce Richardson > > wrote: > > > > > > > > On Mon, Oct 24, 2016 at 04:11:33PM +0000, Wiles, Keith > > wrote: > > > > > > > [...] > > > > > > > > > > On Oct 24, 2016, at 10:49 AM, Morten Br?rup > > > > > > > <mb at smartsharesystems.com> wrote: > > > > > > > [...] > > > > > > > > > > > One other point I'll mention is that we need to have a > > > > > > > > discussion on how/where to add in a timestamp value into > > the > > > > > > > > mbuf. Personally, I think it can be in a union with the > > > > sequence > > > > > > > > number value, but I also suspect that 32-bits of a > > timestamp > > > > > > > > is not going to be enough for > > > > > > > many. > > > > > > > > > > > > > > > > Thoughts? > > > > > > > > > > > > > > If we consider that timestamp representation should use > > > > nanosecond > > > > > > > granularity, a 32-bit value may likely wrap around too > > quickly > > > > > > > to be useful. We can also assume that applications requesting > > > > > > > timestamps may care more about latency than throughput, Oleg > > > > found > > > > > > > that using the second cache line for this purpose had a > > > > noticeable impact [1]. > > > > > > > > > > > > > > [1] http://dpdk.org/ml/archives/dev/2016-October/049237.html > > > > > > > > > > > > I agree with Oleg about the latency vs. throughput importance > > > > > > for > > > > such applications. > > > > > > > > > > > > If you need high resolution timestamps, consider them to be > > > > generated by the NIC RX driver, possibly by the hardware itself > > > > (http://w3new.napatech.com/features/time-precision/hardware-time- > > > > stamp), so the timestamp belongs in the first cache line. And I am > > > > proposing that it should have the highest possible accuracy, which > > > > makes the value hardware dependent. > > > > > > > > > > > > Furthermore, I am arguing that we leave it up to the > > application > > > > > > to > > > > keep track of the slowly moving bits (i.e. counting whole seconds, > > > > hours and calendar date) out of band, so we don't use precious > > space > > > > in the mbuf. The application doesn't need the NIC RX driver's fast > > > > path to capture which date (or even which second) a packet was > > > > received. Yes, it adds complexity to the application, but we can't > > > > set aside 64 bit for a generic timestamp. Or as a weird tradeoff: > > > > Put the fast moving 32 bit in the first cache line and the slow > > > > moving 32 bit in the second cache line, as a placeholder for the > > application to fill out if needed. > > > > Yes, it means that the application needs to check the time and > > > > update its variable holding the slow moving time once every second > > > > or so; but that should be doable without significant effort. > > > > > > > > > > That's a good point, however without a 64 bit value, elapsed time > > > > > between two arbitrary mbufs cannot be measured reliably due to > > not > > > > > enough context, one way or another the low resolution value is > > > > > also > > > > needed. > > > > > > > > > > Obviously latency-sensitive applications are unlikely to perform > > > > > lengthy buffering and require this but I'm not sure about all the > > > > > possible use-cases. Considering many NICs expose 64 bit > > timestaps, > > > > > I suggest we do not truncate them. > > > > > > > > > > I'm not a fan of the weird tradeoff either, PMDs will be tempted > > > > > to fill the extra 32 bits whenever they can and negate the > > > > > performance improvement of the first cache line. > > > > > > > > I would tend to agree, and I don't really see any convenient way to > > > > avoid putting in a 64-bit field for the timestamp in cache-line 0. > > > > If we are ok with having this overlap/partially overlap with > > > > sequence number, it will use up an extra 4B of storage in that > > cacheline. > > > > > > I agree about the lack of convenience! And Adrien certainly has a > > point about PMD temptations. > > > > > > However, I still don't think that a NICs ability to date-stamp a > > packet is sufficient reason to put a date-stamp in cache line 0 of the > > mbuf. Storing only the fast moving 32 bit in cache line 0 seems like a > > good compromise to me. > > > > > > Maybe you can find just one more byte, so it can hold 17 minutes with > > > nanosecond resolution. (I'm joking!) > > > > > > Please don't sacrifice the sequence number for the seconds/hours/days > > part a timestamp. Maybe it could be configurable to use a 32 bit or 64 > > bit timestamp. > > > > > Do you see both timestamp and sequence numbers being used together? I > > would have thought that apps would either use one or the other? > > However, your suggestion is workable in any case, to allow the sequence > > number to overlap just the high 32 bits of the timestamp, rather than > > the low. > > In our case, I can foresee sequence numbers used for packet processing and > timestamps for timing analysis (and possibly for packet capturing, when being > used). For timing analysis, we don?t need long durations, e.g. 4 seconds with > 32 bit nanosecond resolution suffices. And for packet capturing we are > perfectly capable of adding the slowly moving 32 bit of the timestamp to our > output data stream without fetching it from the mbuf. >
For the 32-bit timestamp case, it might be useful to have a right-shift value passed in to the ethdev driver. If we assume a NIC with nanosecond resolution, (or TSC value with resolution of that order of magnitude), then the app can choose to have 1 ns resolution with 4 second wraparound, or alternatively 4ns resolution with 16 second wraparound, or even microsecond resolution with wrap around of over an hour. The cost is obviously just a shift op in the driver code per packet - hopefully with multiple packets done at a time using vector operations. /Bruce