According to the Intel Developer's Manual:
"The RDTSC instruction is not a serializing instruction. It does not necessarily wait
until all previous instructions have been executed before reading the counter.
Similarly, subsequent instructions may begin execution before the read operation is
performed. If software requires RDTSC to be executed only after all previous
instructions have completed locally, it can either use RDTSCP (if the processor
supports that instruction) or execute the sequence LFENCE;RDTSC."
So add an rte_rdtsc_precise() function that issues a memory barrier before rdtsc to
synchronize operations and ensure that the TSC is read at the expected place.
Use a full read/write memory barrier (rte_mb) instead of lfence to serialize both
loads and stores.
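
For illustration only, the pattern can be sketched outside DPDK with compiler
intrinsics. This assumes GCC or Clang on x86; _mm_mfence() is used as a stand-in
for rte_mb(), and the names tsc_read(), tsc_read_precise() and tsc_measure() are
hypothetical helpers that are not part of this patch:

```c
#include <stdint.h>
#include <x86intrin.h>	/* __rdtsc(), _mm_mfence(); GCC/Clang on x86 only */

/* Plain TSC read: the CPU may reorder it with surrounding instructions. */
static inline uint64_t
tsc_read(void)
{
	return __rdtsc();
}

/*
 * Fenced TSC read, mirroring the shape of rte_rdtsc_precise():
 * a full memory fence first, then the TSC read.
 * Assumption: _mm_mfence() stands in for rte_mb() here.
 */
static inline uint64_t
tsc_read_precise(void)
{
	_mm_mfence();
	return __rdtsc();
}

/*
 * Hypothetical helper: number of TSC cycles spent in fn(arg),
 * bracketed by two fenced reads.
 */
static inline uint64_t
tsc_measure(void (*fn)(void *), void *arg)
{
	uint64_t start = tsc_read_precise();

	fn(arg);
	return tsc_read_precise() - start;
}
```

A caller timing a short code path would bracket it with two tsc_read_precise()
calls, as tsc_measure() does, so that pending loads and stores are ordered by
the fence before each timestamp is taken.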
Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Reviewed-by: François-Frédéric Ozog <ff@ozog.com>
Reviewed-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
#include <stdint.h>
#include <rte_debug.h>
+#include <rte_atomic.h>
#ifdef RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT
/** Global switch to use VMWARE mapping of TSC instead of RDTSC */
return tsc.tsc_64;
}
+/**
+ * Read the TSC register precisely where the function is called.
+ *
+ * @return
+ * The TSC for this lcore.
+ */
+static inline uint64_t
+rte_rdtsc_precise(void)
+{
+ rte_mb();
+ return rte_rdtsc();
+}
+
/**
* Get the measured frequency of the RDTSC counter
*