Hi Jerin

On 11/2/2017 4:57 PM, Jia He Wrote:

Hi, Jerin
please see my performance test below
On 11/2/2017 3:04 AM, Jerin Jacob Wrote:
[...]
Should it be like this instead?

+#else
+        *old_head = __atomic_load_n(&r->cons.head, __ATOMIC_ACQUIRE);
+        const uint32_t prod_tail = __atomic_load_n(&r->prod.tail,
+                                        __ATOMIC_ACQUIRE);
It would be nice to see how much overhead it gives, i.e. back-to-back
__ATOMIC_ACQUIRE.

I can NOT test ring_perf_autotest on our server because something is wrong with the PMU counter. All return values of rte_rdtsc are 0, with and without the ko module you provided. I am still
investigating the reason.


Hi Jerin

As for the root cause of the rte_rdtsc issue, it might be that the PMU counter frequency is too low on our arm64 server ("Amberwing" from Qualcomm):

[586990.057779] arch_timer_get_cntfrq()=20000000

Only 20 MHz instead of 100/200 MHz, and CNTFRQ_EL0 is not even writable in kernel space.
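
To put that frequency in perspective, here is a minimal sketch of the tick-length arithmetic. The helper name ns_per_tick is hypothetical and not part of DPDK; it only converts a counter frequency in Hz to nanoseconds per tick.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a DPDK API): nanoseconds represented by a
 * single counter tick when the counter runs at `hz` ticks per second.
 */
static double ns_per_tick(uint64_t hz)
{
    return 1e9 / (double)hz;
}
```

With the value reported above, ns_per_tick(20000000) is 50 ns, versus 10 ns at 100 MHz and 5 ns at 200 MHz. An operation that completes in well under one tick may therefore not advance the counter at all between two reads.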

Maybe the code in ring_perf_autotest needs to be changed?

e.g.

    printf("SC empty dequeue: %.2F\n",
            (double)(sc_end-sc_start) / iterations);
    printf("MC empty dequeue: %.2F\n",
            (double)(mc_end-mc_start) / iterations);

Otherwise the result is always 0, because the time difference is smaller than iterations and the integer division truncates to 0.
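
A minimal sketch of the difference, assuming the timestamps are uint64_t tick counts (the helper names below are hypothetical and are not taken from the test code):

```c
#include <stdint.h>

/*
 * Hypothetical helper: average ticks per iteration using integer
 * division. The quotient truncates toward zero, so it is 0 whenever
 * (end - start) < iterations.
 */
static uint64_t avg_ticks_int(uint64_t start, uint64_t end,
                              uint64_t iterations)
{
    return (end - start) / iterations;
}

/*
 * Hypothetical helper: the same average computed in floating point,
 * which keeps the fractional part (print it with "%.2F" or "%.2f").
 */
static double avg_ticks_double(uint64_t start, uint64_t end,
                               uint64_t iterations)
{
    return (double)(end - start) / (double)iterations;
}
```

For example, with a difference of 8 ticks over 16 iterations, avg_ticks_int() returns 0 while avg_ticks_double() returns 0.5.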


--
Cheers,
Jia
