On Mon, Mar 23, 2020 at 10:14 PM Mattias Rönnblom
<mattias.ronnb...@ericsson.com> wrote:
>
> On 2020-03-23 16:08, Jerin Jacob wrote:
> > On Mon, Mar 23, 2020 at 8:13 PM Mattias Rönnblom
> > <mattias.ronnb...@ericsson.com> wrote:
> >> On 2020-03-23 14:37, Jerin Jacob wrote:
> >>>>> +	}
> >>>>> +
> >>>>> +	/* Initialize the trace point */
> >>>>> +	if (rte_strscpy(tp->name, name, TRACE_POINT_NAME_SIZE) < 0) {
> >>>>> +		trace_err("name is too long");
> >>>>> +		rte_errno = E2BIG;
> >>>>> +		goto free;
> >>>>> +	}
> >>>>> +
> >>>>> +	/* Copy the field data for future use */
> >>>>> +	if (rte_strscpy(tp->ctf_field, field, TRACE_CTF_FIELD_SIZE) < 0) {
> >>>>> +		trace_err("CTF field size is too long");
> >>>>> +		rte_errno = E2BIG;
> >>>>> +		goto free;
> >>>>> +	}
> >>>>> +
> >>>>> +	/* Clear field memory for the next event */
> >>>>> +	memset(field, 0, TRACE_CTF_FIELD_SIZE);
> >>>>> +
> >>>>> +	/* Form the trace handle */
> >>>>> +	*handle = sz;
> >>>>> +	*handle |= trace.nb_trace_points << __RTE_TRACE_FIELD_ID_SHIFT;
> >>>>> +	*handle |= (uint64_t)level << __RTE_TRACE_FIELD_LEVEL_SHIFT;
> >>>> If *handle would be a struct, you could use a bitfield instead, and much
> >>>> simplify this code.
> >>> I thought that initially, Two reasons why I did not do that
> >>> 1) The flags have been used in fastpath, I prefer to work with flags
> >>> in fastpath so that
> >> Is it really that obvious that flags are faster than bitfield
> >> operations? I think most modern architectures have machine instructions
> >> for bitfield manipulation.
> > Add x86 maintainers.
> >
> > There were comments in ml about bitfield inefficiency usage with x86.
> >
> > https://protect2.fireeye.com/v1/url?k=2bd2d3ad-7706d931-2bd29336-8631fc8bdea5-8a1bf17ed26f6ce6&q=1&e=0c620ac5-c028-44d9-a4e8-e04057940075&u=http%3A%2F%2Fpatches.dpdk.org%2Fpatch%2F16482%2F
> >
> > Search for: Bitfileds are efficient on Octeon. What's about other CPUs
> > you have in
> > mind? x86 is not as efficient.
> >
>
> I thought both ARM and x86 had bitfield access instructions, but it
> looks like I was wrong about x86. x86_64 GCC seems to convert bitfield
> read to 'shr' and 'and', just like an open-coded bitfield. Bitfield
> write requires more instructions.

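For reference, the two styles being compared can be sketched side by side. This is only an illustration: the DEMO_* shifts and masks and the demo_handle layout are hypothetical and are not the values behind __RTE_TRACE_FIELD_ID_SHIFT or __RTE_TRACE_FIELD_LEVEL_SHIFT. Bitfields on uint64_t are a GCC/Clang extension, and their layout in memory is implementation-defined, so the sketch only compares field values, not raw representations.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout, chosen only for this sketch (not the DPDK values). */
#define DEMO_SIZE_MASK   UINT64_C(0xffff)
#define DEMO_ID_SHIFT    16
#define DEMO_ID_MASK     UINT64_C(0xffff)
#define DEMO_LEVEL_SHIFT 32
#define DEMO_LEVEL_MASK  UINT64_C(0xff)

/* Open-coded variant: the handle is a plain 64-bit word built with OR and shift. */
static uint64_t
demo_handle_pack(uint64_t sz, uint64_t id, uint64_t level)
{
	uint64_t handle;

	/* Each value must fit in its field, or it would spill into a neighbour. */
	assert(sz <= DEMO_SIZE_MASK);
	assert(id <= DEMO_ID_MASK);
	assert(level <= DEMO_LEVEL_MASK);

	handle = sz;
	handle |= id << DEMO_ID_SHIFT;
	handle |= level << DEMO_LEVEL_SHIFT;
	return handle;
}

/* Open-coded read: one shift and one mask per field. */
static uint64_t
demo_handle_id(uint64_t handle)
{
	return (handle >> DEMO_ID_SHIFT) & DEMO_ID_MASK;
}

static uint64_t
demo_handle_level(uint64_t handle)
{
	return (handle >> DEMO_LEVEL_SHIFT) & DEMO_LEVEL_MASK;
}

/* Bitfield variant: the compiler generates the shifts and masks. */
struct demo_handle {
	uint64_t sz : 16;
	uint64_t id : 16;
	uint64_t level : 8;
	uint64_t unused : 24;
};

static uint64_t
demo_bitfield_id(const struct demo_handle *h)
{
	return h->id;
}
```

Compiling both readers with optimization enabled and comparing the generated assembly (for example with gcc -O2 -S) is one way to check the read and write cost on a given target.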
Yes. ARM64 has bitfield access instructions. Considering x86, it is
better to avoid bitfields. See below.

> >
> > Thoughts from x86 folks.
> >
> >>> there is no performance impact using bitfields from the compiler _if any_.
> >>> 2) In some of the places, I can simply operate on APIs like
> >>> __atomic_and_fetch() with flags.
> >> I think you may still use such atomic operations. Just convert the
> >> struct to a uint64_t, which will essentially be a no-operation, and fire
> >> away.
> > Not sure, We think about the atomic "and" and fetch here.
> > That memcpy may translate additional load/store based on the compiler
> > optimization level.(say compiled with -O0)
> >
> I would be surprised if that happened on anything but -O0. At least
> modern GCC on ARM and x86_64 don't seem to add any loads or stores.
>
>
> I assume you are not suggesting we should optimize for -O0.

No. I was just mentioning that we cannot assume anything about the code
generation with -O0.
Anyway, considering the above point, let's use flags.
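
To make the second point concrete, here is a minimal sketch of both options discussed above: the atomic "and" applied directly to a flags word, and the struct-to-uint64_t conversion through memcpy(). The names demo_flags_clear, demo_handle_bits, demo_handle_to_u64 and demo_handle_from_u64 are hypothetical and exist only for this illustration. __atomic_and_fetch() is the GCC/Clang builtin, and the memory order used here is an assumption, not something taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Flags variant: the handle is already a uint64_t, so the atomic builtin
 * applies directly. Clears the bits in 'mask' and returns the new value.
 */
static uint64_t
demo_flags_clear(uint64_t *handle, uint64_t mask)
{
	assert(handle != NULL);
	return __atomic_and_fetch(handle, ~mask, __ATOMIC_RELEASE);
}

/* Bitfield variant (hypothetical layout; uint64_t bitfields are an extension). */
struct demo_handle_bits {
	uint64_t sz : 16;
	uint64_t id : 16;
	uint64_t level : 8;
	uint64_t unused : 24;
};

_Static_assert(sizeof(struct demo_handle_bits) == sizeof(uint64_t),
	       "the bitfield struct must be exactly 64 bits wide");

/*
 * Convert the struct to a raw word before using atomic builtins on it.
 * Optimizing compilers usually reduce this memcpy() to a register move;
 * at -O0 it may stay a real call with extra loads and stores.
 */
static uint64_t
demo_handle_to_u64(const struct demo_handle_bits *h)
{
	uint64_t raw;

	memcpy(&raw, h, sizeof(raw));
	return raw;
}

static struct demo_handle_bits
demo_handle_from_u64(uint64_t raw)
{
	struct demo_handle_bits h;

	memcpy(&h, &raw, sizeof(h));
	return h;
}
```

With flags, a mask is known at compile time. With the bitfield struct, the mask for an atomic operation would itself have to be derived through the same conversion, since the bit positions are chosen by the compiler.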