On Wed, Apr 22, 2020 at 9:04 PM <jer...@marvell.com> wrote:
> This patch set contains
> ~~~~~~~~~~~~~~~~~~~~~~~~
>
> # A native implementation of a Common Trace Format (CTF)[1] based tracer
> # Public API to create the trace points
> # Tracepoints added to the eal, ethdev, mempool, eventdev and cryptodev
> libraries for tracing support
> # A unit test case
> # A performance test case to measure the trace overhead (see the
> "eal/trace: add trace performance test cases" patch)
> # Programmer's guide for trace support (see the "doc: add trace
> library guide" patch)
>
> # Tested OS:
> ~~~~~~~~~~~
> - Linux
> - FreeBSD
>
> # Tested open source CTF trace viewers
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> - Babeltrace
> - Tracecompass
>
> # Trace overhead comparison with LTTng
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> Trace overhead data on x86 [2]:
> # 236 cycles (>100ns) with LTTng
> # 18 cycles (7ns) with the native DPDK CTF emitter (see the
> "eal/trace: add trace performance test cases" patch)
>
> Trace overhead data on arm64:
> # 312 cycles to 1100 cycles with LTTng, depending on the class of arm64 CPU
> # 11 cycles to 13 cycles with the native DPDK CTF emitter, depending on the
> class of arm64 CPU
>
> The difference between 18 cycles (on x86) and 11 cycles (on arm64) is due
> to the rdtsc() overhead on x86. It seems rdtsc takes around 15 cycles on x86.
>
> More details:
> ~~~~~~~~~~~~~
>
> # The native DPDK CTF trace support does not depend on any
> third-party library.
> The generated output file is compatible with LTTng, as both use the
> CTF trace format.
>
> The performance gain comes from:
> 1) exploiting the DPDK worker thread usage model to avoid atomics and use
> per-core variables
> 2) using hugepages
> 3) avoiding a lot of function pointers in the fast path, etc.
> 4) avoiding unaligned stores on arm64, etc.
>
> Features:
> ~~~~~~~~~
> - No specific limit on the number of events. Events are string-based,
> like rte_log, for pattern matching.
> - Dynamic enable/disable support.
> - Instrumentation overhead is ~1 cycle, i.e. the cost of adding the code
> without using the trace feature.
> - Timestamp support for all the events using DPDK rte_rdtsc.
> - No dependency on another library. Clean-room native implementation of CTF.
>
> Functional test case:
> a) echo "trace_autotest" | sudo ./build/app/test/dpdk-test  -c 0x3 --trace=.*
>
> The above command emits the following trace events
> <code>
>         uint8_t i;
>
>         rte_trace_lib_eal_generic_void();
>         rte_trace_lib_eal_generic_u64(0x10000000000000);
>         rte_trace_lib_eal_generic_u32(0x10000000);
>         rte_trace_lib_eal_generic_u16(0xffee);
>         rte_trace_lib_eal_generic_u8(0xc);
>         rte_trace_lib_eal_generic_i64(-1234);
>         rte_trace_lib_eal_generic_i32(-1234567);
>         rte_trace_lib_eal_generic_i16(12);
>         rte_trace_lib_eal_generic_i8(-3);
>         rte_trace_lib_eal_generic_string("my string");
>         rte_trace_lib_eal_generic_function(__func__);
>
> </code>
>
> Install the babeltrace package on Linux and point babeltrace at the
> generated trace directory. By default, the trace files are created under
> <user>/dpdk-traces/time_stamp/
>
> example:
> # babeltrace /root/dpdk-traces/rte-2020-02-15-PM-02-56-51 | more
>
> [13:27:36.138468807] (+?.?????????) lib.eal.generic.void: { cpu_id =0, name = "dpdk-test" }, { }
> [13:27:36.138468851] (+0.000000044) lib.eal.generic.u64: { cpu_id = 0, name = "dpdk-test" }, { in = 4503599627370496 }
> [13:27:36.138468860] (+0.000000009) lib.eal.generic.u32: { cpu_id = 0, name = "dpdk-test" }, { in = 268435456 }
> [13:27:36.138468934] (+0.000000074) lib.eal.generic.u16: { cpu_id = 0, name = "dpdk-test" }, { in = 65518 }
> [13:27:36.138468949] (+0.000000015) lib.eal.generic.u8: { cpu_id = 0, name = "dpdk-test" }, { in = 12 }
> [13:27:36.138468956] (+0.000000007) lib.eal.generic.i64: { cpu_id = 0, name = "dpdk-test" }, { in = -1234 }
> [13:27:36.138468963] (+0.000000007) lib.eal.generic.i32: { cpu_id = 0, name = "dpdk-test" }, { in = -1234567 }
> [13:27:36.138469024] (+0.000000061) lib.eal.generic.i16: { cpu_id = 0, name = "dpdk-test" }, { in = 12 }
> [13:27:36.138469044] (+0.000000020) lib.eal.generic.i8: { cpu_id = 0, name = "dpdk-test" }, { in = -3 }
> [13:27:36.138469051] (+0.000000007) lib.eal.generic.string: { cpu_id = 0, name = "dpdk-test" }, { str = "my string" }
> [13:27:36.138469203] (+0.000000152) lib.eal.generic.func: { cpu_id = 0, name = "dpdk-test" }, { func = "test_trace_points" }
>
> # There is a GUI-based trace viewer available on Windows, Linux and Mac.
> It is called Tracecompass (https://www.eclipse.org/tracecompass/).
>
> An example screenshot and histogram of the above DPDK trace in
> Tracecompass:
>
> https://github.com/jerinjacobk/share/blob/master/dpdk_trace.JPG

This series is quite big and did not get a lot of comments/reviews,
especially on the tracepoints added to important subsystems.

- fixed some typos, some missed renames in commit logs and
intermediate patches, some nits on coding style and reworded comments.
- added the unit tests in MAINTAINERS.
- moved the trace documentation a bit earlier in the documentation
index (with EAL and core libraries).
- had a go with test-null.sh with traces off/on (and enabling fp
traces too), but did not do a good check on the performance impact.
- the series compiles on all my targets (and I did a per-patch check).

This new framework is promising but still experimental; what is
important now is getting feedback.


Thanks a lot for this work.

For the series:
Acked-by: David Marchand <david.march...@redhat.com>

Series applied.

-- 
David Marchand
