Hi Chuanyu, > -----Original Message----- > From: Chuanyu Xue <chuanyu....@uconn.edu> > Sent: Friday, December 22, 2023 11:04 AM > To: Su, Simei <simei...@intel.com> > Cc: Xing, Beilei <beilei.x...@intel.com>; chuanyu....@uconn.edu; > dev@dpdk.org; Zhang, Qi Z <qi.z.zh...@intel.com>; Lu, Wenzhuo > <wenzhuo...@intel.com> > Subject: RE: [PATCH] net/e1000: support launchtime feature > > Hi Simei, > Thank you so much for your review. > > >> > >> /* QAV Tx mode control register */ > >> #define E1000_I210_TQAVCTRL 0x3570 > >> +#define E1000_I210_LAUNCH_OS0 0x3578 > > > >What does this register mean? > > > > "LAUNCH_OS0" is defined as LaunchOffset register, which sets the base time > for launchtime. Based on i210 datasheet V3.7 Sec 7.2.2.2.3, the actual launch > time is computed as 32 * (LaunchOffset + LaunchTime). In this context, the > register is used to set the LaunchOffset later as 0.
OK, got it. Thanks for your explanation. > > >> > >> + if (igb_tx_timestamp_dynflag > 0) { > >> + tqavctrl = E1000_READ_REG(hw, E1000_I210_TQAVCTRL); > >> + tqavctrl |= E1000_TQAVCTRL_MODE; > >> + tqavctrl |= E1000_TQAVCTRL_FETCH_ARB; /* Fetch the queue most > >> empty, no Round Robin*/ > >> + tqavctrl |= E1000_TQAVCTRL_LAUNCH_TIMER_ENABLE; /* Enable > >> launch time */ > > > > In kernel driver, "E1000_TQAVCTRL_DATATRANTIM (BIT(9))" and > > "E1000_TQAVCTRL_FETCHTIME_DELTA (0xFFFF << 16)" are set, does it have > > some other intention here? > > "E1000_TQAVCTRL_DATATRANTIM" is same as > "E1000_TQAVCTRL_LAUNCH_TIMER_ENABLE" Yes, these two values are the same. > > "E1000_TQAVCTRL_FETCHTIME_DELTA" maximizes the data fetch time. > If "E1000_TQAVCTRL_FETCH_ARB" is set, there is no need to set this field, > because the arbitrary fetching prioritizes the most empty queue, regardless of > the fetch time. (referring Sec 7.2.7.5) > > I have also tested aligning with the kernel driver settings using (0xFFFF << > 16) > and omitting 'E1000_TQAVCTRL_FETCH_ARB', the launchtime feature also > worked as expected. However, the arbitrary fetch mode seems more suitable > as DPDK lacks an interface to set fetch delay, unlike in the kernel which can > be > configured (e.g., through 'Delta' in ETF Qdisc). Any suggestions here? Yes, it doesn't have an interface to set delay in DPDK. I agree with your approach. > > >> +static int > >> +eth_igb_read_clock(__rte_unused struct rte_eth_dev *dev, uint64_t > >> +*clock) { > >> + uint64_t systime_cycles; > >> + struct e1000_adapter *adapter = dev->data->dev_private; > >> + > >> + systime_cycles = igb_read_systime_cyclecounter(dev); > >> + uint64_t ns = rte_timecounter_update(&adapter->systime_tc, > >> systime_cycles); > > > >Do you also run "ptp timesync" when testing this launchtime feature? > > > > I used `rte_eth_timesync_enable` function during the test. I am not familiar > with the `ptp timesync` in DPDK. If you are referring to something else, could > you please guide me on how to test it? Do you use your own application or DPDK application to test this launchtime feature, for example, dpdk testpmd? > > >> > >> +/* Macro to compensate latency in launch time offloading*/ > >> +#define E1000_I210_LT_LATENCY 0x41F9 > > > >What does this value depend on? > > > > Through my test, I observed a constant latency between the launchtime and > the actual Tx time measured by the `rte_eth_timesync_read_tx_timestamp` > function. > I didn't find a description of this latency in the datasheet. > > In my test, the latency appears to be relative to the data rate, and > independent > from the packet size and throughput. The latency slightly changed in different > experiments, but in each experiment, it remained constant for all the Tx > packets. OK, got it. > I also tested this latency consistently on two different NICs (I210 GE-1T-X1, > I210 X1-V2). > > Here are some measurement results (in ns): > > +-----------+---------------+---------------+---------------+---------------+---------------+ > | Data Rate | Measurement 1 | Measurement 2 | Measurement 3 | > | Measurement 4 | Measurement 5 | > +-----------+---------------+---------------+---------------+---------------+---------------+ > | 10M | 14400 | 14544 | 14384 | > 14896 | 14336 | > +-----------+---------------+---------------+---------------+---------------+---------------+ > | 100M | 31016 | 31016 | 31000 | > 31000 | 31048 | > +-----------+---------------+---------------+---------------+---------------+---------------+ > | 1G | 16880 | 16880 | 16880 | 16880 > | 16880 | > +-----------+---------------+---------------+---------------+---------------+---------------+ > > Any suggestions here? Is it supposed to be embedded directly here or left to > the application level to compensate? I can fix it accordingly. I think it can be put here directly just as you do. Thanks, Simei > > - Chuanyu