On Thu, Mar 13, 2025 at 03:41:49PM +0800, Kuan-Wei Chiu wrote: > On Thu, Mar 13, 2025 at 12:29:13AM +0800, Kuan-Wei Chiu wrote: > > On Wed, Mar 12, 2025 at 11:51:12AM -0400, Yury Norov wrote: > > > On Tue, Mar 11, 2025 at 03:24:14PM -0700, H. Peter Anvin wrote: > > > > On March 11, 2025 3:01:30 PM PDT, Yury Norov <yury.no...@gmail.com> > > > > wrote: > > > > >On Sun, Mar 09, 2025 at 11:48:26PM +0800, Kuan-Wei Chiu wrote: > > > > >> On Fri, Mar 07, 2025 at 12:07:02PM -0800, H. Peter Anvin wrote: > > > > >> > On March 7, 2025 11:53:10 AM PST, David Laight > > > > >> > <david.laight.li...@gmail.com> wrote: > > > > >> > >On Fri, 07 Mar 2025 11:30:35 -0800 > > > > >> > >"H. Peter Anvin" <h...@zytor.com> wrote: > > > > >> > > > > > > >> > >> On March 7, 2025 10:49:56 AM PST, Andrew Cooper > > > > >> > >> <andrew.coop...@citrix.com> wrote: > > > > >> > >> >> (int)true most definitely is guaranteed to be 1. > > > > >> > >> > > > > > >> > >> >That's not technically correct any more. > > > > >> > >> > > > > > >> > >> >GCC has introduced hardened bools that intentionally have bit > > > > >> > >> >patterns > > > > >> > >> >other than 0 and 1. > > > > >> > >> > > > > > >> > >> >https://gcc.gnu.org/gcc-14/changes.html > > > > >> > >> > > > > > >> > >> >~Andrew > > > > >> > >> > > > > >> > >> Bit patterns in memory maybe (not that I can see the Linux > > > > >> > >> kernel using them) but > > > > >> > >> for compiler-generated conversations that's still a given, or > > > > >> > >> the manager isn't C > > > > >> > >> or anything even remotely like it. > > > > >> > >> > > > > >> > > > > > > >> > >The whole idea of 'bool' is pretty much broken by design. > > > > >> > >The underlying problem is that values other than 'true' and > > > > >> > >'false' can > > > > >> > >always get into 'bool' variables. > > > > >> > > > > > > >> > >Once that has happened it is all fubar. > > > > >> > > > > > > >> > >Trying to sanitise a value with (say): > > > > >> > >int f(bool v) > > > > >> > >{ > > > > >> > > return (int)v & 1; > > > > >> > >} > > > > >> > >just doesn't work (see https://www.godbolt.org/z/MEndP3q9j) > > > > >> > > > > > > >> > >I really don't see how using (say) 0xaa and 0x55 helps. > > > > >> > >What happens if the value is wrong? a trap or exception?, good > > > > >> > >luck recovering > > > > >> > >from that. > > > > >> > > > > > > >> > > David > > > > >> > > > > > >> > Did you just discover GIGO? > > > > >> > > > > >> Thanks for all the suggestions. > > > > >> > > > > >> I don't have a strong opinion on the naming or return type. I'm > > > > >> still a > > > > >> bit confused about whether I can assume that casting bool to int > > > > >> always > > > > >> results in 0 or 1. > > > > >> > > > > >> If that's the case, since most people prefer bool over int as the > > > > >> return type and some are against introducing u1, my current plan is > > > > >> to > > > > >> use the following in the next version: > > > > >> > > > > >> bool parity_odd(u64 val); > > > > >> > > > > >> This keeps the bool return type, renames the function for better > > > > >> clarity, and avoids extra maintenance burden by having just one > > > > >> function. > > > > >> > > > > >> If I can't assume that casting bool to int always results in 0 or 1, > > > > >> would it be acceptable to keep the return type as int? > > > > >> > > > > >> Would this work for everyone? > > > > > > > > > >Alright, it's clearly a split opinion. So what I would do myself in > > > > >such case is to look at existing code and see what people who really > > > > >need parity invent in their drivers: > > > > > > > > > > bool parity_odd > > > > >static inline int parity8(u8 val) - - > > > > >static u8 calc_parity(u8 val) - - > > > > >static int odd_parity(u8 c) - + > > > > >static int saa711x_odd_parity - + > > > > >static int max3100_do_parity - - > > > > >static inline int parity(unsigned x) - - > > > > >static int bit_parity(u32 pkt) - - > > > > >static int oa_tc6_get_parity(u32 p) - - > > > > >static u32 parity32(__le32 data) - - > > > > >static u32 parity(u32 sample) - - > > > > >static int get_parity(int number, - - > > > > > int size) > > > > >static bool i2cr_check_parity32(u32 v, + - > > > > > bool parity) > > > > >static bool i2cr_check_parity64(u64 v) + - > > > > >static int sw_parity(__u64 t) - - > > > > >static bool parity(u64 value) + - > > > > > > > > > >Now you can refer to that table say that int parity(uXX) is what > > > > >people want to see in their drivers. > > > > > > > > > >Whichever interface you choose, please discuss it's pros and cons. > > > > >What bloat-o-meter says for each option? What's maintenance burden? > > > > >Perf test? Look at generated code? > > > > > > > > > >I personally for a macro returning boolean, something like I > > > > >proposed at the very beginning. > > > > > > > > > >Thanks, > > > > >Yury > > > > > > > > Also, please at least provide a way for an arch to opt in to using the > > > > builtins, which seem to produce as good results or better at least on > > > > some architectures like x86 and probably with CPU options that imply > > > > fast popcnt is available. > > > > > > Yeah. And because linux/bitops.h already includes asm/bitops.h > > > the simplest way would be wrapping generic implementation with > > > the #ifndef parity, similarly to how we handle find_next_bit case. > > > > > > So: > > > 1. Kuan-Wei, please don't invent something like ARCH_HAS_PARITY; > > > 2. This may, and probably should, be a separate follow-up series, > > > likely created by corresponding arch experts. > > > > > I saw discussions in the previous email thread about both > > __builtin_parity and x86-specific implementations. However, from the > > discussion, I learned that before considering any optimization, we > > should first ask: which driver or subsystem actually cares about parity > > efficiency? If someone does, I can help with a micro-benchmark to > > provide performance numbers, but I don't have enough domain knowledge > > to identify hot paths where parity efficiency matters. > > > IMHO, > > If parity is never used in any hot path and we don't care about parity: > > Then benchmarking its performance seems meaningless. In this case, a > function with a u64 argument would suffice, and we might not even need > a macro to optimize for different types—especially since the macro > requires special hacks to avoid compiler warnings. Also, I don't think > code size matters here. If it does, we should first consider making > parity a non-inline function in a .c file rather than an inline > function/macro in a header. > > If parity is used in a hot path: > > We need different handling for different type sizes. As previously > discussed, x86 assembly might use different instructions for u8 and > u16. This may sound stubborn, but I want to ask again: should we > consider using parity8/16/32/64 interfaces? Like in the i3c driver > example, if we only have a single parity macro that selects an > implementation based on type size, users must explicitly cast types. > If future users also need parity in a hot path, they might not be aware > of this requirement and end up generating suboptimal code. Since we > care about efficiency and generated code, why not follow hweight() and > provide separate implementations for different sizes? > It seems no one will reply to my two emails. So, I have summarized different interface approaches. If there is a next version, I will send it after the merge window closes.
Interface 1: Single Function Description: bool parity_odd(u64) Pros: Minimal maintenance cost Cons: Difficult to integrate with architecture-specific implementations due to the inability to optimize for different argument sizes Opinions: Jiri supports this approach Interface 2: Single Macro Description: parity_odd() macro Pros: Allows type-specific implementation Cons: Requires hacks to avoid warnings; users may need explicit casting; potential sub-optimal code on 32-bit x86 Opinions: Yury supports this approach Interface 3: Multiple Functions Description: bool parity_odd8/16/32/64() Pros: No need for explicit casting; easy to integrate architecture-specific optimizations; except for parity8(), all functions are one-liners with no significant code duplication Cons: More functions may increase maintenance burden Opinions: Only I support this approach Regards, Kuan-Wei