trace: implement trace register API

Jerin Jacob Mon, 23 Mar 2020 11:41:10 -0700

On Mon, Mar 23, 2020 at 10:14 PM Mattias Rönnblom
<mattias.ronnb...@ericsson.com> wrote:
>
> On 2020-03-23 16:08, Jerin Jacob wrote:
> > On Mon, Mar 23, 2020 at 8:13 PM Mattias Rönnblom
> > <mattias.ronnb...@ericsson.com> wrote:
> >> On 2020-03-23 14:37, Jerin Jacob wrote:
> >>>>> +     }
> >>>>> +
> >>>>> +     /* Initialize the trace point */
> >>>>> +     if (rte_strscpy(tp->name, name, TRACE_POINT_NAME_SIZE) < 0) {
> >>>>> +             trace_err("name is too long");
> >>>>> +             rte_errno = E2BIG;
> >>>>> +             goto free;
> >>>>> +     }
> >>>>> +
> >>>>> +     /* Copy the field data for future use */
> >>>>> +     if (rte_strscpy(tp->ctf_field, field, TRACE_CTF_FIELD_SIZE) < 0) {
> >>>>> +             trace_err("CTF field size is too long");
> >>>>> +             rte_errno = E2BIG;
> >>>>> +             goto free;
> >>>>> +     }
> >>>>> +
> >>>>> +     /* Clear field memory for the next event */
> >>>>> +     memset(field, 0, TRACE_CTF_FIELD_SIZE);
> >>>>> +
> >>>>> +     /* Form the trace handle */
> >>>>> +     *handle = sz;
> >>>>> +     *handle |= trace.nb_trace_points << __RTE_TRACE_FIELD_ID_SHIFT;
> >>>>> +     *handle |= (uint64_t)level << __RTE_TRACE_FIELD_LEVEL_SHIFT;
> >>>> If *handle would be a struct, you could use a bitfield instead, and much
> >>>> simplify this code.
> >>> I thought that initially, Two reasons why I did not do that
> >>> 1) The flags have been used in fastpath, I prefer to work with flags
> >>> in fastpath so that
> >> Is it really that obvious that flags are faster than bitfield
> >> operations? I think most modern architectures have machine instructions
> >> for bitfield manipulation.
> > Add x86 maintainers.
> >
> > There were comments in ml about bitfield inefficiency usage with x86.
> >
> > https://protect2.fireeye.com/v1/url?k=2bd2d3ad-7706d931-2bd29336-8631fc8bdea5-8a1bf17ed26f6ce6&q=1&e=0c620ac5-c028-44d9-a4e8-e04057940075&u=http%3A%2F%2Fpatches.dpdk.org%2Fpatch%2F16482%2F
> >
> > Search for: Bitfileds are efficient on Octeon. What's about other CPUs
> > you have in
> > mind? x86 is not as efficient.
>
>
> I thought both ARM and x86 had bitfield access instructions, but it
> looks like I was wrong about x86. x86_64 GCC seems to convert bitfield
> read to 'shr' and 'and', just like an open-coded bitfield. Bitfield
> write requires more instructions.


Yes. ARM64 has bitfield access instructions. considering x86,
it is better to avoid bitfields.

See below,

>
>
> > Thoughts from x86 folks.
> >
> >>> there is no performance impact using bitfields from the compiler _if any_.
> >>> 2) In some of the places, I can simply operate on APIs like
> >>> __atomic_and_fetch() with flags.
> >> I think you may still use such atomic operations. Just convert the
> >> struct to a uint64_t, which will essentially be a no-operation, and fire
> >> away.
> > Not sure, We think about the atomic "and" and fetch here.
> > That memcpy may translate additional load/store based on the compiler
> > optimization level.(say compiled with -O0)
>
>
> I would be surprised if that happened on anything but -O0. At least
> modern GCC on ARM and x86_64 don't seem to add any loads or stores.
>
>
> I assume you are not suggesting we should optimize for -O0.

No. I was just mentining that, we can not assume  the code generation
with -O0. Anyway considering the above point, lets use flags.

Re: [dpdk-dev] [PATCH v1 03/32] eal/trace: implement trace register API

Reply via email to