Hi Jerin
On 11/2/2017 4:57 PM, Jia He Wrote:
Hi, Jerin
please see my performance test below
On 11/2/2017 3:04 AM, Jerin Jacob Wrote:
[...]
Should it be like this instead?
+#else
+ *old_head = __atomic_load_n(&r->cons.head, __ATOMIC_ACQUIRE);
+ const uint32_t prod_tail = __atomic_load_n(&r->prod.tail,
__ATOMIC_ACQUIRE);
It would be nice to see how much overhead this adds, i.e. the back-to-back
__ATOMIC_ACQUIRE loads.
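For reference, the two back-to-back acquire loads in question could be isolated roughly as below. This is only a sketch: struct ring_sketch and entries_available() are hypothetical names made up for illustration, not the real rte_ring layout or API. Only the __atomic_load_n builtin and __ATOMIC_ACQUIRE are real (GCC/Clang).

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for the ring's head/tail pairs. */
struct headtail_sketch {
    uint32_t head;
    uint32_t tail;
};

struct ring_sketch {
    struct headtail_sketch prod;
    struct headtail_sketch cons;
};

/*
 * Consumer side: read cons.head and then prod.tail, both with acquire
 * ordering, and return how many entries are available to dequeue.
 * Unsigned subtraction keeps the result correct across index wrap-around.
 */
static uint32_t entries_available(struct ring_sketch *r, uint32_t *old_head)
{
    *old_head = __atomic_load_n(&r->cons.head, __ATOMIC_ACQUIRE);
    const uint32_t prod_tail = __atomic_load_n(&r->prod.tail,
                                               __ATOMIC_ACQUIRE);
    return prod_tail - *old_head;
}
```

Timing a loop around a helper like this, with and without the acquire ordering, would be one way to measure the overhead being asked about.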
I cannot run ring_perf_autotest on our server because something is
wrong with the PMU counter.
Every call to rte_rdtsc returns 0, with or without the ko module you
provided. I am still
investigating the reason.
Hi Jerin
As for the root cause of the rte_rdtsc issue, it might be that the PMU
counter frequency is too low
on our arm64 server ("Amberwing" from Qualcomm):
[586990.057779] arch_timer_get_cntfrq()=20000000
That is only 20 MHz instead of 100/200 MHz, and CNTFRQ_EL0 is not even writable
in kernel space.
Maybe the code in ring_perf_autotest needs to be changed?
e.g.
printf("SC empty dequeue: %.2F\n",
(double)(sc_end-sc_start) / iterations);
printf("MC empty dequeue: %.2F\n",
(double)(mc_end-mc_start) / iterations);
Otherwise the printed value is always 0: with such a slow counter the time
difference is smaller than iterations, so the integer division truncates to 0.
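To illustrate the truncation, here is a small standalone sketch. The helper names and the numbers used with them are made up for illustration and are not taken from ring_perf_autotest.

```c
#include <stdint.h>

/*
 * Hypothetical helper: average ticks per iteration using integer
 * division. The result truncates toward zero, so it is 0 whenever
 * the tick difference is smaller than the iteration count.
 */
static uint64_t avg_ticks_int(uint64_t start, uint64_t end,
                              uint64_t iterations)
{
    return (end - start) / iterations;
}

/*
 * Same average, but converted to double before dividing, so the
 * fractional part survives and can be printed with "%.2F".
 */
static double avg_ticks_double(uint64_t start, uint64_t end,
                               uint64_t iterations)
{
    return (double)(end - start) / (double)iterations;
}
```

For example, with an assumed difference of 4,000,000 ticks over 1 << 24 iterations, avg_ticks_int() returns 0 while avg_ticks_double() returns roughly 0.24.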
--
Cheers,
Jia