On Fri, Sep 20, 2024 at 12:57 PM Mattias Rönnblom <mattias.ronnb...@ericsson.com> wrote:
>
> This patch set represents an attempt to improve and extend the RTE
> bitops API, in particular for functions that operate on individual
> bits.
>
> All new functionality is exposed to the user as generic selection
> macros, delegating the actual work to private (__-marked) static
> inline functions. Public functions (e.g., rte_bit_set32()) would just
> be bloating the API. Such generic selection macros will here be
> referred to as "functions", although technically they are not.
>
> The legacy <rte_bitops.h> rte_bit_relaxed_*() functions are replaced
> with two new families:
>
> rte_bit_[test|set|clear|assign|flip]() which provide no memory
> ordering or atomicity guarantees, but do provide the best
> performance. The performance degradation resulting from the use of
> volatile (e.g., forcing loads and stores to actually occur and in the
> number specified) and atomic (e.g., LOCK-prefixed instructions on x86)
> may be significant. rte_bit_[test|set|clear|assign|flip]() may be
> used with volatile word pointers, in which case they guarantee
> that the program-level accesses actually occur.
>
> rte_bit_atomic_*() which provide atomic bit-level operations,
> including the possibility of specifying memory ordering constraints
> (or the lack thereof).
>
> The atomic functions take non-_Atomic pointers, to be flexible, just
> like the GCC builtins and default <rte_stdatomic.h>. The issue with
> _Atomic APIs is that it may well be the case that the user wants to
> perform both non-atomic and atomic operations on the same word.
>
> Having _Atomic-marked addresses would complicate supporting atomic
> bit-level operations in the bitset API (proposed in a different RFC
> patchset), and potentially in other APIs depending on RTE bitops for
> atomic bit-level ops. Either one needs two bitset variants, one
> _Atomic bitset and one non-atomic one, or the bitset code needs to
> cast the non-_Atomic pointer to an _Atomic one. Having a separate
> _Atomic bitset would be bloat and also prevent the user from both, in
> some situations, doing atomic operations against a bit set, while in
> other situations (e.g., at times when MT safety is not a concern)
> operating on the same objects in a non-atomic manner.
>
> Unlike rte_bit_relaxed_*(), individual bits are represented by bool,
> not uint32_t or uint64_t. The author found the use of such large types
> confusing, and also failed to see any performance benefits.
>
> A set of functions rte_bit_*_assign() are added, to assign a
> particular boolean value to a particular bit.
>
> All new functions have properly documented semantics.
>
> All new functions operate on both 32 and 64-bit words, with type
> checking.
>
> _Generic allows the user code to be a little more compact. Having a
> type-generic atomic test/set/clear/assign bit API also seems
> consistent with the "core" (word-size) atomics API, which is generic
> (both GCC builtins and <rte_stdatomic.h> are).
>
> The _Generic versions avoid having explicit unsigned long versions of
> all functions. If you have an unsigned long, it's safe to use the
> generic version (e.g., rte_bit_set()) and _Generic will pick the right
> function, provided long is either 32 or 64 bit on your platform (which
> it is on all DPDK-supported ABIs).
>
> The generic rte_bit_set() is a macro, and not a function, but
> nevertheless has been given a lower-case name. That's how C11 does it
> (for atomics, and other _Generic), and <rte_stdatomic.h>. Its address
> can't be taken, but it does not evaluate its parameters more than
> once.
>
> C++ doesn't support generic selection. In C++ translation units the
> _Generic macros are replaced with overloaded functions, implemented by
> means of a huge, complicated C macro mess.
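For readers unfamiliar with the generic-selection approach described in the quoted cover letter, here is a minimal sketch in C. It is not the <rte_bitops.h> implementation: every demo_* name is hypothetical and chosen only for illustration, and the atomic variant assumes the GCC/Clang __atomic builtins are available. It shows private per-width static inline helpers, a _Generic front end that picks one by pointer type, bits reported as bool, and an atomic operation applied to a plain (non-_Atomic) word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-width helpers: plain, non-atomic accesses. */
static inline void demo_bit_set32(uint32_t *addr, unsigned int nr)
{
	*addr |= UINT32_C(1) << nr;
}

static inline void demo_bit_set64(uint64_t *addr, unsigned int nr)
{
	*addr |= UINT64_C(1) << nr;
}

static inline bool demo_bit_test32(const uint32_t *addr, unsigned int nr)
{
	return (*addr >> nr) & 1;
}

static inline bool demo_bit_test64(const uint64_t *addr, unsigned int nr)
{
	return (*addr >> nr) & 1;
}

/*
 * Hypothetical atomic helpers. They take ordinary pointers, so the same
 * word can also be accessed non-atomically. Assumes GCC/Clang builtins.
 */
static inline void demo_bit_atomic_set32(uint32_t *addr, unsigned int nr,
					 int memory_order)
{
	__atomic_fetch_or(addr, UINT32_C(1) << nr, memory_order);
}

static inline void demo_bit_atomic_set64(uint64_t *addr, unsigned int nr,
					 int memory_order)
{
	__atomic_fetch_or(addr, UINT64_C(1) << nr, memory_order);
}

/*
 * Type-generic front ends. The controlling expression of _Generic is
 * not evaluated, so each argument is evaluated exactly once, in the
 * call to the selected helper.
 */
#define demo_bit_set(addr, nr) \
	_Generic((addr), \
		uint32_t *: demo_bit_set32, \
		uint64_t *: demo_bit_set64)(addr, nr)

#define demo_bit_test(addr, nr) \
	_Generic((addr), \
		uint32_t *: demo_bit_test32, \
		const uint32_t *: demo_bit_test32, \
		uint64_t *: demo_bit_test64, \
		const uint64_t *: demo_bit_test64)(addr, nr)

#define demo_bit_atomic_set(addr, nr, memory_order) \
	_Generic((addr), \
		uint32_t *: demo_bit_atomic_set32, \
		uint64_t *: demo_bit_atomic_set64)(addr, nr, memory_order)

/* Mixes plain and atomic operations on the same words. */
static inline bool demo_usage(void)
{
	uint32_t w32 = 0;
	uint64_t w64 = 0;

	demo_bit_set(&w32, 3);                          /* plain store */
	demo_bit_atomic_set(&w32, 5, __ATOMIC_RELAXED); /* atomic RMW */
	demo_bit_set(&w64, 40);                         /* 64-bit helper */

	return demo_bit_test(&w32, 3) && demo_bit_test(&w32, 5) &&
	       demo_bit_test(&w64, 40) && !demo_bit_test(&w64, 0);
}
```

Passing a pointer to any other word type (for example uint16_t *) matches no association and fails at compile time, which is one way to get the type checking the cover letter mentions.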
>
> Mattias Rönnblom (7):
>   buildtools/chkincs: relax C linkage requirement
>   dpdk: use C linkage only where appropriate
>   eal: extend bit manipulation functionality
>   eal: add unit tests for bit operations
>   eal: add atomic bit operations
>   eal: add unit tests for atomic bit access functions
>   eal: extend bitops to handle volatile pointers
>
>  app/test/packet_burst_generator.h | 8 +-
>  app/test/test_bitops.c | 416 +++++++++-
>  app/test/virtual_pmd.h | 4 +-
>  buildtools/chkincs/chkextern.py | 88 ++
>  buildtools/chkincs/meson.build | 21 +-
>  doc/guides/rel_notes/release_24_11.rst | 17 +
>  drivers/bus/auxiliary/bus_auxiliary_driver.h | 8 +-
>  drivers/bus/cdx/bus_cdx_driver.h | 8 +-
>  drivers/bus/dpaa/include/fsl_qman.h | 8 +-
>  drivers/bus/fslmc/bus_fslmc_driver.h | 8 +-
>  drivers/bus/pci/bus_pci_driver.h | 8 +-
>  drivers/bus/pci/rte_bus_pci.h | 8 +-
>  drivers/bus/platform/bus_platform_driver.h | 8 +-
>  drivers/bus/vdev/bus_vdev_driver.h | 8 +-
>  drivers/bus/vmbus/bus_vmbus_driver.h | 8 +-
>  drivers/bus/vmbus/rte_bus_vmbus.h | 8 +-
>  drivers/dma/cnxk/cnxk_dma_event_dp.h | 8 +-
>  drivers/dma/ioat/ioat_hw_defs.h | 4 +-
>  drivers/event/dlb2/rte_pmd_dlb2.h | 8 +-
>  drivers/mempool/dpaa2/rte_dpaa2_mempool.h | 6 +-
>  drivers/net/avp/rte_avp_fifo.h | 8 +-
>  drivers/net/bonding/rte_eth_bond.h | 4 +-
>  drivers/net/i40e/rte_pmd_i40e.h | 8 +-
>  drivers/net/mlx5/mlx5_trace.h | 8 +-
>  drivers/net/ring/rte_eth_ring.h | 4 +-
>  drivers/net/vhost/rte_eth_vhost.h | 8 +-
>  drivers/raw/ifpga/afu_pmd_core.h | 8 +-
>  drivers/raw/ifpga/afu_pmd_he_hssi.h | 6 +-
>  drivers/raw/ifpga/afu_pmd_he_lpbk.h | 6 +-
>  drivers/raw/ifpga/afu_pmd_he_mem.h | 6 +-
>  drivers/raw/ifpga/afu_pmd_n3000.h | 6 +-
>  drivers/raw/ifpga/rte_pmd_afu.h | 4 +-
>  drivers/raw/ifpga/rte_pmd_ifpga.h | 4 +-
>  examples/ethtool/lib/rte_ethtool.h | 8 +-
>  examples/qos_sched/main.h | 4 +-
>  examples/vm_power_manager/channel_manager.h | 8 +-
>  lib/acl/rte_acl_osdep.h | 8 -
>  lib/bbdev/rte_bbdev.h | 8 +-
>  lib/bbdev/rte_bbdev_op.h | 8 +-
>  lib/bbdev/rte_bbdev_pmd.h | 8 +-
>  lib/bpf/bpf_def.h | 9 -
>  lib/compressdev/rte_comp.h | 4 +-
>  lib/compressdev/rte_compressdev.h | 6 +-
>  lib/compressdev/rte_compressdev_internal.h | 8 +-
>  lib/compressdev/rte_compressdev_pmd.h | 8 +-
>  lib/cryptodev/cryptodev_pmd.h | 8 +-
>  lib/cryptodev/cryptodev_trace.h | 8 +-
>  lib/cryptodev/rte_crypto.h | 8 +-
>  lib/cryptodev/rte_crypto_asym.h | 8 -
>  lib/cryptodev/rte_crypto_sym.h | 8 +-
>  lib/cryptodev/rte_cryptodev.h | 8 +-
>  lib/cryptodev/rte_cryptodev_trace_fp.h | 4 +-
>  lib/dispatcher/rte_dispatcher.h | 8 +-
>  lib/dmadev/rte_dmadev.h | 8 +
>  lib/eal/arm/include/rte_atomic_32.h | 4 +-
>  lib/eal/arm/include/rte_atomic_64.h | 8 +-
>  lib/eal/arm/include/rte_byteorder.h | 8 +-
>  lib/eal/arm/include/rte_cpuflags_32.h | 8 -
>  lib/eal/arm/include/rte_cpuflags_64.h | 8 -
>  lib/eal/arm/include/rte_cycles_32.h | 4 +-
>  lib/eal/arm/include/rte_cycles_64.h | 4 +-
>  lib/eal/arm/include/rte_io.h | 8 -
>  lib/eal/arm/include/rte_io_64.h | 8 +-
>  lib/eal/arm/include/rte_memcpy_32.h | 8 +-
>  lib/eal/arm/include/rte_memcpy_64.h | 23 +-
>  lib/eal/arm/include/rte_pause.h | 8 -
>  lib/eal/arm/include/rte_pause_32.h | 6 +-
>  lib/eal/arm/include/rte_pause_64.h | 8 +-
>  lib/eal/arm/include/rte_power_intrinsics.h | 8 -
>  lib/eal/arm/include/rte_prefetch_32.h | 8 +-
>  lib/eal/arm/include/rte_prefetch_64.h | 8 +-
>  lib/eal/arm/include/rte_rwlock.h | 4 +-
>  lib/eal/arm/include/rte_spinlock.h | 6 +-
>  lib/eal/freebsd/include/rte_os.h | 8 -
>  lib/eal/include/bus_driver.h | 8 +-
>  lib/eal/include/dev_driver.h | 8 -
>  lib/eal/include/eal_trace_internal.h | 8 +-
>  lib/eal/include/generic/rte_atomic.h | 8 +
>  lib/eal/include/generic/rte_byteorder.h | 8 +
>  lib/eal/include/generic/rte_cpuflags.h | 8 +
>  lib/eal/include/generic/rte_cycles.h | 8 +
>  lib/eal/include/generic/rte_io.h | 8 +
>  lib/eal/include/generic/rte_memcpy.h | 8 +
>  lib/eal/include/generic/rte_pause.h | 8 +
>  .../include/generic/rte_power_intrinsics.h | 8 +
>  lib/eal/include/generic/rte_prefetch.h | 8 +
>  lib/eal/include/generic/rte_rwlock.h | 8 +-
>  lib/eal/include/generic/rte_spinlock.h | 8 +
>  lib/eal/include/generic/rte_vect.h | 8 +
>  lib/eal/include/rte_alarm.h | 4 +-
>  lib/eal/include/rte_bitmap.h | 8 +-
>  lib/eal/include/rte_bitops.h | 768 +++++++++++++++++-
>  lib/eal/include/rte_branch_prediction.h | 8 -
>  lib/eal/include/rte_bus.h | 8 +-
>  lib/eal/include/rte_class.h | 4 +-
>  lib/eal/include/rte_common.h | 8 +-
>  lib/eal/include/rte_compat.h | 8 -
>  lib/eal/include/rte_dev.h | 8 +-
>  lib/eal/include/rte_devargs.h | 8 +-
>  lib/eal/include/rte_eal_trace.h | 4 +-
>  lib/eal/include/rte_errno.h | 4 +-
>  lib/eal/include/rte_fbarray.h | 8 +-
>  lib/eal/include/rte_keepalive.h | 6 +-
>  lib/eal/include/rte_mcslock.h | 8 +-
>  lib/eal/include/rte_memory.h | 8 +-
>  lib/eal/include/rte_pci_dev_feature_defs.h | 8 -
>  lib/eal/include/rte_pci_dev_features.h | 8 -
>  lib/eal/include/rte_per_lcore.h | 8 -
>  lib/eal/include/rte_pflock.h | 8 +-
>  lib/eal/include/rte_random.h | 4 +-
>  lib/eal/include/rte_seqcount.h | 8 +-
>  lib/eal/include/rte_seqlock.h | 8 +-
>  lib/eal/include/rte_service.h | 8 +-
>  lib/eal/include/rte_service_component.h | 4 +-
>  lib/eal/include/rte_stdatomic.h | 5 +-
>  lib/eal/include/rte_string_fns.h | 17 +-
>  lib/eal/include/rte_tailq.h | 6 +-
>  lib/eal/include/rte_ticketlock.h | 8 +-
>  lib/eal/include/rte_time.h | 6 +-
>  lib/eal/include/rte_trace.h | 8 +-
>  lib/eal/include/rte_trace_point.h | 8 +-
>  lib/eal/include/rte_trace_point_register.h | 8 +-
>  lib/eal/include/rte_uuid.h | 8 +-
>  lib/eal/include/rte_version.h | 6 +-
>  lib/eal/include/rte_vfio.h | 8 +-
>  lib/eal/linux/include/rte_os.h | 8 -
>  lib/eal/loongarch/include/rte_atomic.h | 6 +-
>  lib/eal/loongarch/include/rte_byteorder.h | 4 +-
>  lib/eal/loongarch/include/rte_cpuflags.h | 8 -
>  lib/eal/loongarch/include/rte_cycles.h | 4 +-
>  lib/eal/loongarch/include/rte_io.h | 8 -
>  lib/eal/loongarch/include/rte_memcpy.h | 4 +-
>  lib/eal/loongarch/include/rte_pause.h | 8 +-
>  .../loongarch/include/rte_power_intrinsics.h | 8 -
>  lib/eal/loongarch/include/rte_prefetch.h | 8 +-
>  lib/eal/loongarch/include/rte_rwlock.h | 4 +-
>  lib/eal/loongarch/include/rte_spinlock.h | 6 +-
>  lib/eal/ppc/include/rte_atomic.h | 6 +-
>  lib/eal/ppc/include/rte_byteorder.h | 6 +-
>  lib/eal/ppc/include/rte_cpuflags.h | 8 -
>  lib/eal/ppc/include/rte_cycles.h | 8 +-
>  lib/eal/ppc/include/rte_io.h | 8 -
>  lib/eal/ppc/include/rte_memcpy.h | 4 +-
>  lib/eal/ppc/include/rte_pause.h | 8 +-
>  lib/eal/ppc/include/rte_power_intrinsics.h | 8 -
>  lib/eal/ppc/include/rte_prefetch.h | 8 +-
>  lib/eal/ppc/include/rte_rwlock.h | 4 +-
>  lib/eal/ppc/include/rte_spinlock.h | 8 +-
>  lib/eal/riscv/include/rte_atomic.h | 8 +-
>  lib/eal/riscv/include/rte_byteorder.h | 8 +-
>  lib/eal/riscv/include/rte_cpuflags.h | 8 -
>  lib/eal/riscv/include/rte_cycles.h | 4 +-
>  lib/eal/riscv/include/rte_io.h | 8 -
>  lib/eal/riscv/include/rte_memcpy.h | 4 +-
>  lib/eal/riscv/include/rte_pause.h | 8 +-
>  lib/eal/riscv/include/rte_power_intrinsics.h | 8 -
>  lib/eal/riscv/include/rte_prefetch.h | 8 +-
>  lib/eal/riscv/include/rte_rwlock.h | 4 +-
>  lib/eal/riscv/include/rte_spinlock.h | 6 +-
>  lib/eal/windows/include/pthread.h | 6 +-
>  lib/eal/windows/include/regex.h | 8 +-
>  lib/eal/windows/include/rte_os.h | 8 -
>  lib/eal/windows/include/rte_windows.h | 8 -
>  lib/eal/x86/include/rte_atomic.h | 25 +-
>  lib/eal/x86/include/rte_byteorder.h | 16 +-
>  lib/eal/x86/include/rte_cpuflags.h | 8 -
>  lib/eal/x86/include/rte_cycles.h | 8 +-
>  lib/eal/x86/include/rte_io.h | 8 +-
>  lib/eal/x86/include/rte_pause.h | 7 +-
>  lib/eal/x86/include/rte_power_intrinsics.h | 8 -
>  lib/eal/x86/include/rte_prefetch.h | 8 +-
>  lib/eal/x86/include/rte_rwlock.h | 6 +-
>  lib/eal/x86/include/rte_spinlock.h | 9 +-
>  lib/ethdev/ethdev_driver.h | 8 +-
>  lib/ethdev/ethdev_pci.h | 8 +-
>  lib/ethdev/ethdev_trace.h | 8 +-
>  lib/ethdev/ethdev_vdev.h | 8 +-
>  lib/ethdev/rte_cman.h | 8 -
>  lib/ethdev/rte_dev_info.h | 8 -
>  lib/ethdev/rte_eth_ctrl.h | 8 -
>  lib/ethdev/rte_ethdev.h | 8 +-
>  lib/ethdev/rte_ethdev_trace_fp.h | 4 +-
>  lib/eventdev/event_timer_adapter_pmd.h | 8 -
>  lib/eventdev/eventdev_pmd.h | 8 +-
>  lib/eventdev/eventdev_pmd_pci.h | 8 +-
>  lib/eventdev/eventdev_pmd_vdev.h | 8 +-
>  lib/eventdev/eventdev_trace.h | 8 +-
>  lib/eventdev/rte_event_crypto_adapter.h | 8 +-
>  lib/eventdev/rte_event_eth_rx_adapter.h | 8 +-
>  lib/eventdev/rte_event_eth_tx_adapter.h | 8 +-
>  lib/eventdev/rte_event_ring.h | 8 +-
>  lib/eventdev/rte_event_timer_adapter.h | 8 +-
>  lib/eventdev/rte_eventdev.h | 8 +-
>  lib/eventdev/rte_eventdev_trace_fp.h | 4 +-
>  lib/graph/rte_graph_model_mcore_dispatch.h | 8 +-
>  lib/graph/rte_graph_worker.h | 6 +-
>  lib/gso/rte_gso.h | 6 +-
>  lib/hash/rte_fbk_hash.h | 8 +-
>  lib/hash/rte_hash_crc.h | 8 +-
>  lib/hash/rte_jhash.h | 8 +-
>  lib/hash/rte_thash.h | 8 +-
>  lib/hash/rte_thash_gfni.h | 8 +-
>  lib/ip_frag/rte_ip_frag.h | 8 +-
>  lib/ipsec/rte_ipsec.h | 8 +-
>  lib/log/rte_log.h | 8 +-
>  lib/lpm/rte_lpm.h | 8 +-
>  lib/member/rte_member.h | 8 +-
>  lib/member/rte_member_sketch.h | 6 +-
>  lib/member/rte_member_sketch_avx512.h | 8 +-
>  lib/member/rte_member_x86.h | 4 +-
>  lib/member/rte_xxh64_avx512.h | 6 +-
>  lib/mempool/mempool_trace.h | 8 +-
>  lib/mempool/rte_mempool_trace_fp.h | 4 +-
>  lib/meter/rte_meter.h | 8 +-
>  lib/mldev/mldev_utils.h | 8 +-
>  lib/mldev/rte_mldev_core.h | 8 -
>  lib/mldev/rte_mldev_pmd.h | 8 +-
>  lib/net/rte_dtls.h | 8 -
>  lib/net/rte_ecpri.h | 8 -
>  lib/net/rte_esp.h | 8 -
>  lib/net/rte_ether.h | 8 +-
>  lib/net/rte_geneve.h | 8 -
>  lib/net/rte_gre.h | 8 -
>  lib/net/rte_gtp.h | 8 -
>  lib/net/rte_higig.h | 8 -
>  lib/net/rte_ib.h | 8 -
>  lib/net/rte_icmp.h | 8 -
>  lib/net/rte_l2tpv2.h | 8 -
>  lib/net/rte_macsec.h | 8 -
>  lib/net/rte_mpls.h | 8 -
>  lib/net/rte_net.h | 8 +-
>  lib/net/rte_pdcp_hdr.h | 8 -
>  lib/net/rte_ppp.h | 8 -
>  lib/net/rte_sctp.h | 8 -
>  lib/net/rte_tcp.h | 8 -
>  lib/net/rte_tls.h | 8 -
>  lib/net/rte_udp.h | 8 -
>  lib/net/rte_vxlan.h | 10 -
>  lib/node/rte_node_eth_api.h | 8 +-
>  lib/node/rte_node_ip4_api.h | 8 +-
>  lib/node/rte_node_ip6_api.h | 6 +-
>  lib/node/rte_node_udp4_input_api.h | 8 +-
>  lib/pci/rte_pci.h | 8 +-
>  lib/pdcp/rte_pdcp.h | 8 +-
>  lib/pipeline/rte_pipeline.h | 8 +-
>  lib/pipeline/rte_port_in_action.h | 8 +-
>  lib/pipeline/rte_swx_ctl.h | 8 +-
>  lib/pipeline/rte_swx_extern.h | 8 -
>  lib/pipeline/rte_swx_ipsec.h | 8 +-
>  lib/pipeline/rte_swx_pipeline.h | 8 +-
>  lib/pipeline/rte_swx_pipeline_spec.h | 8 +-
>  lib/pipeline/rte_table_action.h | 8 +-
>  lib/port/rte_port.h | 8 -
>  lib/port/rte_port_ethdev.h | 8 +-
>  lib/port/rte_port_eventdev.h | 8 +-
>  lib/port/rte_port_fd.h | 8 +-
>  lib/port/rte_port_frag.h | 8 +-
>  lib/port/rte_port_ras.h | 8 +-
>  lib/port/rte_port_ring.h | 8 +-
>  lib/port/rte_port_sched.h | 8 +-
>  lib/port/rte_port_source_sink.h | 8 +-
>  lib/port/rte_port_sym_crypto.h | 8 +-
>  lib/port/rte_swx_port.h | 8 -
>  lib/port/rte_swx_port_ethdev.h | 8 +-
>  lib/port/rte_swx_port_fd.h | 8 +-
>  lib/port/rte_swx_port_ring.h | 8 +-
>  lib/port/rte_swx_port_source_sink.h | 8 +-
>  lib/rawdev/rte_rawdev.h | 6 +-
>  lib/rawdev/rte_rawdev_pmd.h | 8 +-
>  lib/rcu/rte_rcu_qsbr.h | 8 +-
>  lib/regexdev/rte_regexdev.h | 8 +-
>  lib/ring/rte_ring.h | 6 +-
>  lib/ring/rte_ring_core.h | 8 -
>  lib/ring/rte_ring_elem.h | 8 +-
>  lib/ring/rte_ring_hts.h | 4 +-
>  lib/ring/rte_ring_peek.h | 4 +-
>  lib/ring/rte_ring_peek_zc.h | 4 +-
>  lib/ring/rte_ring_rts.h | 4 +-
>  lib/sched/rte_approx.h | 8 +-
>  lib/sched/rte_pie.h | 8 +-
>  lib/sched/rte_red.h | 8 +-
>  lib/sched/rte_sched.h | 8 +-
>  lib/sched/rte_sched_common.h | 6 +-
>  lib/security/rte_security.h | 8 +-
>  lib/security/rte_security_driver.h | 6 +-
>  lib/stack/rte_stack.h | 8 +-
>  lib/table/rte_lru.h | 8 -
>  lib/table/rte_lru_arm64.h | 8 +-
>  lib/table/rte_lru_x86.h | 8 -
>  lib/table/rte_swx_hash_func.h | 8 -
>  lib/table/rte_swx_keycmp.h | 8 +-
>  lib/table/rte_swx_table.h | 8 -
>  lib/table/rte_swx_table_em.h | 8 +-
>  lib/table/rte_swx_table_learner.h | 8 +-
>  lib/table/rte_swx_table_selector.h | 8 +-
>  lib/table/rte_swx_table_wm.h | 8 +-
>  lib/table/rte_table.h | 8 -
>  lib/table/rte_table_acl.h | 8 +-
>  lib/table/rte_table_array.h | 8 +-
>  lib/table/rte_table_hash.h | 8 +-
>  lib/table/rte_table_hash_cuckoo.h | 8 +-
>  lib/table/rte_table_hash_func.h | 24 +-
>  lib/table/rte_table_lpm.h | 8 +-
>  lib/table/rte_table_lpm_ipv6.h | 8 +-
>  lib/table/rte_table_stub.h | 8 +-
>  lib/telemetry/rte_telemetry.h | 8 +-
>  lib/vhost/rte_vdpa.h | 8 +-
>  lib/vhost/rte_vhost.h | 8 +-
>  lib/vhost/rte_vhost_async.h | 8 +-
>  lib/vhost/rte_vhost_crypto.h | 4 +-
>  lib/vhost/vdpa_driver.h | 8 +-
>  311 files changed, 2257 insertions(+), 1362 deletions(-)
>  create mode 100755 buildtools/chkincs/chkextern.py

There are still unresolved comments on the first patch of the series.
However, I preferred to postpone this subject so that we can get the
headers cleanup and the new bitops API in rc1.

I skipped this first patch and dropped the check added by 1ee492bdc4ff
("buildtools/chkincs: check missing C++ guards").

Thanks for the big cleanup on DPDK headers, let's finish the work on
the headers check in rc2.

Series applied.

-- 
David Marchand