According to Intel Developer's Manual: "The RDTSC instruction is not a serializing instruction. It does not necessarily wait until all previous instructions have been executed before reading the counter. Simi- larly, subsequent instructions may begin execution before the read operation is performed. If software requires RDTSC to be executed only after all previous instruc- tions have completed locally, it can either use RDTSCP (if the processor supports that instruction) or execute the sequence LFENCE;RDTSC."
So add a rte_rdtsc_precise function that do a memory barrier before rdtsc to synchronize operations and ensure that the TSC read is done at the expected place. Signed-off-by: Didier Pallard <didier.pallard at 6wind.com> --- Call rte_mb() and rte_rdtsc() rather than duplicating rte_rdtsc function. Use r/w memory barrier instead of lfence to serialize both load and stores. lib/librte_eal/common/include/rte_cycles.h | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/lib/librte_eal/common/include/rte_cycles.h b/lib/librte_eal/common/include/rte_cycles.h index cc6fe71..e91edf8 100644 --- a/lib/librte_eal/common/include/rte_cycles.h +++ b/lib/librte_eal/common/include/rte_cycles.h @@ -76,6 +76,7 @@ extern "C" { #include <stdint.h> #include <rte_debug.h> +#include <rte_atomic.h> #ifdef RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT /** Global switch to use VMWARE mapping of TSC instead of RDTSC */ @@ -128,6 +129,19 @@ rte_rdtsc(void) } /** + * Read the TSC register precisely where function is called. + * + * @return + * The TSC for this lcore. + */ +static inline uint64_t +rte_rdtsc_precise(void) +{ + rte_mb(); + return(rte_rdtsc()); +} + +/** * Get the measured frequency of the RDTSC counter * * @return -- 1.7.10.4