On Tue, 8 Aug 2023, Jeff Law wrote:

> That was my thinking at one time.  Then we started looking at the distros and
> found enough crc implementations in there to change my mind about the overall
> utility.

The ones I'm familiar with are all table-based, which looks impossible to
pattern-match, and they are already fairly efficient compared to the
bitwise loop in CoreMark.
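
To make the contrast concrete, here is a minimal sketch of the two shapes,
assuming the common reflected CRC-32 polynomial 0xEDB88320. The function
and table names are mine, for illustration only, and are not taken from
any particular distro package:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; reflected CRC-32, polynomial 0xEDB88320. */
static uint32_t crc_table[256];

/* Precompute, for each byte value, the effect of eight bitwise steps. */
void crc32_init_table(void)
{
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t c = i;
        for (int k = 0; k < 8; k++)
            c = (c >> 1) ^ (0xEDB88320u & (0u - (c & 1u)));
        crc_table[i] = c;
    }
}

/* Bitwise loop: eight shift/xor steps per input byte. */
uint32_t crc32_bitwise(const uint8_t *p, size_t n)
{
    uint32_t c = 0xFFFFFFFFu;
    for (size_t i = 0; i < n; i++) {
        c ^= p[i];
        for (int k = 0; k < 8; k++)
            c = (c >> 1) ^ (0xEDB88320u & (0u - (c & 1u)));
    }
    return ~c;
}

/* Table-based form: one lookup per input byte, no inner bit loop. */
uint32_t crc32_bytewise_table(const uint8_t *p, size_t n)
{
    uint32_t c = 0xFFFFFFFFu;
    for (size_t i = 0; i < n; i++)
        c = crc_table[(c ^ p[i]) & 0xFFu] ^ (c >> 8);
    return ~c;
}
```

After crc32_init_table(), both functions return the same value for any
input. The table form has no inner bit loop left for a compiler to
recognize, which is why I doubt it can be pattern-matched.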

> If we need to do something to make it more useful, we're certainly open to
> that.

So... just provide a library? Library code is easier to develop and audit,
it can be released independently, and people can use it with their compiler
of choice. Not everything needs to be in libgcc.

> > - they overlap multiple CLMUL chains to make the loop throughput-bound
> >    rather than latency-bound. The typical unroll factor is about 4x-8x.
> We do have the ability to build longer chains.  We actually use that in the
> coremark benchmark where the underlying primitives are 8-bit CRCs that are
> composed into 16/32 bit CRCs.

I'm talking about factoring a long chain into multiple independent chains
for latency hiding.
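
A rough portable sketch of what I mean, again assuming reflected CRC-32
with polynomial 0xEDB88320. All names here are hypothetical. The buffer is
split into four lanes whose states do not depend on each other, so their
updates can overlap in the pipeline, and the partial CRCs are merged at
the end using crc(A||B) = shift(crc(A), len(B)) ^ crc(B). Production code
would use CLMUL folding and precomputed constants for the shift; the
byte-at-a-time shift below only demonstrates the dependency structure:

```c
#include <stddef.h>
#include <stdint.h>

/* One byte of reflected CRC-32 (polynomial 0xEDB88320) on a raw state. */
static uint32_t crc32_step(uint32_t state, uint8_t byte)
{
    state ^= byte;
    for (int k = 0; k < 8; k++)
        state = (state >> 1) ^ (0xEDB88320u & (0u - (state & 1u)));
    return state;
}

/* Reference: one serial dependency chain over the whole buffer. */
uint32_t crc32_serial(const uint8_t *p, size_t n)
{
    uint32_t s = 0xFFFFFFFFu;
    for (size_t i = 0; i < n; i++)
        s = crc32_step(s, p[i]);
    return ~s;
}

/* Advance a finished CRC past n bytes by feeding n zero bytes.
   O(n) here for clarity; real code multiplies by a precomputed constant. */
static uint32_t crc32_shift(uint32_t crc, size_t n)
{
    while (n--)
        crc = crc32_step(crc, 0);
    return crc;
}

/* Four independent chains, merged afterwards. */
uint32_t crc32_4chains(const uint8_t *p, size_t n)
{
    size_t q = n / 4;                       /* bytes per lane */
    const uint8_t *p0 = p, *p1 = p + q, *p2 = p + 2 * q, *p3 = p + 3 * q;
    uint32_t s0 = 0xFFFFFFFFu, s1 = 0xFFFFFFFFu;
    uint32_t s2 = 0xFFFFFFFFu, s3 = 0xFFFFFFFFu;

    /* No lane reads another lane's state inside this loop. */
    for (size_t i = 0; i < q; i++) {
        s0 = crc32_step(s0, p0[i]);
        s1 = crc32_step(s1, p1[i]);
        s2 = crc32_step(s2, p2[i]);
        s3 = crc32_step(s3, p3[i]);
    }

    /* Merge lane CRCs left to right. */
    uint32_t crc = ~s0;
    crc = crc32_shift(crc, q) ^ ~s1;
    crc = crc32_shift(crc, q) ^ ~s2;
    crc = crc32_shift(crc, q) ^ ~s3;

    /* Handle the n % 4 leftover bytes serially. */
    uint32_t s = ~crc;
    for (size_t i = 4 * q; i < n; i++)
        s = crc32_step(s, p[i]);
    return ~s;
}
```

crc32_4chains() returns the same value as crc32_serial() for any input.
Getting there requires restructuring the computation across the whole
block, which a per-element builtin does not express.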

> > Hence, I am concerned that proposed __builtin_crc is not useful for FOSS
> > that actually needs high-performance CRC (the Linux kernel, compression
> > and image libraries).
> > 
> > I think this crosses the line of "cheating in benchmarks" and not something
> > we should do in GCC.
> Certainly not the intention.  The intention is to provide a useful builtin_crc

Useful to whom? The Linux kernel? zlib, bzip2, xz-utils? ffmpeg?
These consumers need high-performance blockwise CRC; offering them
a latency-bound elementwise CRC primitive is a disservice. And what
should they use as a fallback when __builtin_crc is unavailable?

> while at the same time putting one side of the infrastructure we need for
> automatic detection of CRC loops and turning them into table lookups or
> CLMULs.
> 
> With that in mind I'm certain Mariam & I would love feedback on a builtin API
> that would be more useful.

I think offering a conventional library for CRC has substantial advantages.

Alexander
