Thursday, September 6, 2018 12:33 PM, Tom Barbette: Subject: RE: MLX5 should define the timestamp field in the doc > >It's true that it is a little bit a distortion of the original purpose. Here I >want to query the time from the device (ie, the device's current clock). Maybe >a new function in the API would be more suited? CCing Thomas Mojalon for that >part of the discussion.
yes, we cannot use the current API for that. > >I guess there is a case to query the device's timestamp to make our own >precise time computations. yes this is a valid use case. it will enable to implement ptp daemon for the clocks sync on top of DPDK. > >I also just saw that patch from two years ago that did not made it to the main >branch : http://mails.dpdk.org/archives/dev/2016-October/048810.html , I guess >it's because it is approximative in the time computation instead of a real >synchronization? But now timestamp is in the rte_mbuf, so it could also >technically go in. > i need to refresh my memory about this one (too long ago). anyway, for your case there is a way to go, just need an app to sync the clocks. i can help w/ the reviews and guidance on mlx5/ethdev if you wish to push such support upstream. >Tom From: Tom Barbette <barbe...@kth.se> Sent: Thursday, September 6, 2018 12:33 PM To: Shahaf Shuler <shah...@mellanox.com>; dev@dpdk.org; Alex Rosenbaum <al...@mellanox.com> Cc: Yongseok Koh <ys...@mellanox.com>; john.mcnam...@intel.com; marko.kovace...@intel.com; Thomas Monjalon <tho...@monjalon.net> Subject: RE: MLX5 should define the timestamp field in the doc It's true that it is a little bit a distortion of the original purpose. Here I want to query the time from the device (ie, the device's current clock). Maybe a new function in the API would be more suited? CCing Thomas Mojalon for that part of the discussion. I guess there is a case to query the device's timestamp to make our own precise time computations. I also just saw that patch from two years ago that did not made it to the main branch : http://mails.dpdk.org/archives/dev/2016-October/048810.html<https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmails.dpdk.org%2Farchives%2Fdev%2F2016-October%2F048810.html&data=02%7C01%7Cshahafs%40mellanox.com%7Cc69ee4cf172f44d0918508d613dbd051%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636718232173908110&sdata=X3INk2WYSUHKPKRomoJMxpMQA2LZxZCqde4KQ%2BrKmho%3D&reserved=0> , I guess it's because it is approximative in the time computation instead of a real synchronization? But now timestamp is in the rte_mbuf, so it could also technically go in. Tom ________________________________ De : Shahaf Shuler <shah...@mellanox.com<mailto:shah...@mellanox.com>> Envoyé : jeudi 6 septembre 2018 11:07 À : Tom Barbette; dev@dpdk.org<mailto:dev@dpdk.org>; Alex Rosenbaum Cc : Yongseok Koh; john.mcnam...@intel.com<mailto:john.mcnam...@intel.com>; marko.kovace...@intel.com<mailto:marko.kovace...@intel.com> Objet : RE: MLX5 should define the timestamp field in the doc Wednesday, September 5, 2018 12:00 PM, Tom Barbette: >Actually I managed this patch to implement support for >rte_eth_timesync_read_time. I am not fully familiar w/ this API, but it looks like the timespec returned from this call is expected to be in real time values (i.e. seconds and nano seconds), at least this is what I see on the ptpclient example on the DPDK tree. > >Please tell me potential modifications, and if I shall submit it again as a >"normal" patch to dev ? > >--- [...] > } > > /** >+ * Get device current time >+ * >+ * @param dev >+ * Pointer to Ethernet device structure. >+ * >+ * @param[out] time >+ * Time output value. >+ * >+ * @return >+ * 0 if the time has correctly been set >+ */ >+int >+mlx5_timesync_read_time(struct rte_eth_dev *dev, struct timespec *time) >+{ >+ struct priv *priv = dev->data->dev_private; >+ struct ibv_values_ex values; >+ int err = 0; >+ >+ values.comp_mask = IBV_VALUES_MASK_RAW_CLOCK; >+ if ((err = mlx5_glue->query_rt_values_ex(priv->ctx, &values)) != 0) { The use of this function will not bring you the outcome the API defines. see the man page of ibv_query_rt_values_ex: struct ibv_values_ex { uint32_t comp_mask; /* Compatibility mask that defines the query/queried fields [in/out] struct timespec raw_clock; /* HW raw clock */ }; enum ibv_values_mask { IBV_VALUES_MASK_RAW_CLOCK = 1 << 0, /* HW raw clock */ }; The output is the HW raw clock (just like you have in the mbuf). In order it to work the application needs to understand the PTP coefficients for the raw->real time conversion. this can be done, just need some more work. do you have a ptp daemon implemented to calc the coefficients? >+ DRV_LOG(WARNING, "Could not query time !"); >+ return err; >+ } >+ >+ *time = values.raw_clock; >+ return 0; >+} >+ >+ >+/** > * Get supported packet types. > * > * @param dev >diff --git a/drivers/net/mlx5/mlx5_glue.c b/drivers/net/mlx5/mlx5_glue.c >index c7965e5..3c72f5b 100644 >--- a/drivers/net/mlx5/mlx5_glue.c >+++ b/drivers/net/mlx5/mlx5_glue.c >@@ -84,6 +84,13 @@ mlx5_glue_query_device_ex(struct ibv_context *context, > } > > static int >+mlx5_glue_query_rt_values_ex(struct ibv_context *context, >+ struct ibv_values_ex* values) >+{ >+ return ibv_query_rt_values_ex(context, values); >+} >+ >+static int > mlx5_glue_query_port(struct ibv_context *context, uint8_t port_num, > struct ibv_port_attr *port_attr) > { >@@ -354,6 +361,7 @@ const struct mlx5_glue *mlx5_glue = &(const struct >mlx5_glue){ > .close_device = mlx5_glue_close_device, > .query_device = mlx5_glue_query_device, > .query_device_ex = mlx5_glue_query_device_ex, >+ .query_rt_values_ex = mlx5_glue_query_rt_values_ex, > .query_port = mlx5_glue_query_port, > .create_comp_channel = mlx5_glue_create_comp_channel, > .destroy_comp_channel = mlx5_glue_destroy_comp_channel, >diff --git a/drivers/net/mlx5/mlx5_glue.h b/drivers/net/mlx5/mlx5_glue.h >index e584d36..0582e95 100644 >--- a/drivers/net/mlx5/mlx5_glue.h >+++ b/drivers/net/mlx5/mlx5_glue.h >@@ -54,6 +54,8 @@ struct mlx5_glue { > int (*query_device_ex)(struct ibv_context *context, > const struct ibv_query_device_ex_input *input, > struct ibv_device_attr_ex *attr); >+ int (*query_rt_values_ex)(struct ibv_context *context, >+ struct ibv_values_ex *values); > int (*query_port)(struct ibv_context *context, uint8_t port_num, > struct ibv_port_attr *port_attr); > struct ibv_comp_channel *(*create_comp_channel) >-- >2.7.4 > > >________________________________________ >De : Shahaf Shuler <shah...@mellanox.com<mailto:shah...@mellanox.com>> >Envoyé : mercredi 5 septembre 2018 10:18 >À : Tom Barbette; dev@dpdk.org<mailto:dev@dpdk.org>; Alex Rosenbaum >Cc : Yongseok Koh; john.mcnam...@intel.com<mailto:john.mcnam...@intel.com>; >marko.kovace...@intel.com<mailto:marko.kovace...@intel.com> >Objet : RE: MLX5 should define the timestamp field in the doc > >Thanks for the details. > >The use case is clear. We will take it internally to see when we can support >it. >AFAIK we cannot read the internal time from userspace. > >Adding also AlexR to comment > >From: Tom Barbette <barbe...@kth.se<mailto:barbe...@kth.se>> >Sent: Wednesday, September 5, 2018 10:11 AM >To: Shahaf Shuler <shah...@mellanox.com<mailto:shah...@mellanox.com>>; >dev@dpdk.org<mailto:dev@dpdk.org> >Cc: Yongseok Koh <ys...@mellanox.com<mailto:ys...@mellanox.com>>; >john.mcnam...@intel.com<mailto:john.mcnam...@intel.com>; >marko.kovace...@intel.com<mailto:marko.kovace...@intel.com> >Subject: RE: MLX5 should define the timestamp field in the doc > >Thanks for your answer Shahaf ! > >We're trying to measure the latency of packets going through various service >chains inside individual "server". Eg. we can see that on Server 1, the >latency for the service chain handling HTTP packets is ~800ns (+ max and mins, >tail latency, etc). What we do now is to timestamp packets right after they >are received, and compute the difference with the timestamp just before they >are sent. Over a cluster this shows us where the latency is happening. > >We would like this "box" latency to include the time spent in queues, and for >that the hardware timestamp seems fit-for-purpose as it would timestamp the >packets before the software queues. Moreover, as we use batching, we lose a >lot of precision as we timestamp a whole batch at once. > >I'm pretty sure this use case is of interest for many others. Tail latency is >of the essence nowadays, and finding where packets get delayed precisely is >important. > >Instead of converting the timestamp to real time, in this very use case it >seems the Mellanox card could actually be our unique source of time, we just >need to be able to convert ticks to seconds. > >Any chance we can run an equivalent of mlx5_read_internal_timer >(https://elixir.bootlin.com/linux/v4.18.5/source/drivers/net/ethernet/mellanox/mlx5/core/main.c#L623) > from userspace ? Are these registers also mapped, or can be done so with a >few changes? With only that we can actually derive the frequency and the >offset easily. > >Tom