On 31/10/2024 22:01, Pádraig Brady wrote:
On 31/10/2024 18:35, Sam Russell wrote:
  > `cksum -a crc32` could be added I suppose to select the current implementation in gnulib

Both versions are CRC32, though. And the iSCSI/SCTP version is CRC32-C,
which uses a totally different polynomial, not just a bit-reversed one
like gzip's.
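
To make the polynomial difference concrete, here is a minimal bitwise sketch.
It is not the gnulib code, and the function and macro names are made up for
illustration. It runs the same reflected (LSB-first) CRC-32 loop with two
constants: 0xEDB88320, the bit-reversed form of 0x04C11DB7 as used by gzip,
and 0x82F63B78, the bit-reversed Castagnoli polynomial used by CRC32-C.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative names, not taken from gnulib or coreutils.  */
#define CRC32B_POLY 0xEDB88320u  /* 0x04C11DB7 bit-reversed (gzip, V.42) */
#define CRC32C_POLY 0x82F63B78u  /* 0x1EDC6F41 bit-reversed (iSCSI, SCTP) */

/* Bitwise reflected CRC-32 over LEN bytes of DATA.
   The register starts at all ones and is inverted at the end.  */
uint32_t
crc32_reflected (uint32_t poly, void const *data, size_t len)
{
  unsigned char const *p = data;
  uint32_t crc = 0xFFFFFFFFu;

  while (len--)
    {
      crc ^= *p++;
      for (int bit = 0; bit < 8; bit++)
        crc = (crc & 1u) ? (crc >> 1) ^ poly : crc >> 1;
    }

  return crc ^ 0xFFFFFFFFu;
}
```

For the input "123456789", calling this with CRC32B_POLY should give
0xCBF43926 and with CRC32C_POLY 0xE3069283, the usual published check
values for the two variants.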

Right. So if exposing what's in gnulib now through coreutils,
we had better distinguish which crc32 we're referring to.
I see 'crc32b' is commonly used for the current gnulib implementation.
So I can add `cksum -a crc32b` very easily now to coreutils.
I'll send a patch tomorrow for that.

coreutils `cksum -a crc32b` implementation attached.
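
For comparison, here is a rough sketch of how the existing default `crc` sum
differs from crc32b, written from my reading of the POSIX cksum definition
rather than copied from src/cksum.c (the function names are hypothetical).
The default sum processes bits MSB first with polynomial 0x04C11DB7, starts
from 0, and then also feeds the input length through the CRC, least
significant byte first, before complementing. crc32b has no length step.

```c
#include <stddef.h>
#include <stdint.h>

#define POSIX_CRC_POLY 0x04C11DB7u

/* Feed one byte through an MSB-first CRC-32 register.  */
uint32_t
posix_crc_byte (uint32_t crc, unsigned char byte)
{
  crc ^= (uint32_t) byte << 24;
  for (int bit = 0; bit < 8; bit++)
    crc = (crc & 0x80000000u) ? (crc << 1) ^ POSIX_CRC_POLY : crc << 1;
  return crc;
}

/* Feed LEN bytes of DATA through the register, starting from CRC.  */
uint32_t
posix_crc_data (uint32_t crc, void const *data, size_t len)
{
  unsigned char const *p = data;
  while (len--)
    crc = posix_crc_byte (crc, *p++);
  return crc;
}

/* POSIX-style cksum: CRC of the data, then of the length
   (least significant byte first), then complemented.  */
uint32_t
posix_cksum (void const *data, size_t len)
{
  uint32_t crc = posix_crc_data (0, data, len);
  for (size_t n = len; n != 0; n >>= 8)
    crc = posix_crc_byte (crc, n & 0xFF);
  return ~crc & 0xFFFFFFFFu;
}
```

For empty input this returns 0xFFFFFFFF (4294967295), which matches the
expected `crc` value for the empty file in tests/cksum/cksum-base64.pl,
where crc32b gives 0.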

cheers,
Pádraig
From dd125eb1bf985d982098f2ea7a2c564a6b0beca6 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= <p...@draigbrady.com>
Date: Fri, 1 Nov 2024 15:11:55 +0000
Subject: [PATCH] cksum: add support for --algorithm=crc32b

  $ echo -n '123456789' | cksum --raw -a crc32b | basenc --base16
  CBF43926

* bootstrap.conf: Explicitly depend on the crc module.
* doc/coreutils.texi (cksum): Add "crc32b" as an argument to -a.
* src/cksum.c (crc32b_sum_stream): A new function similar to
crc_sum_stream, but which does not include the length in
the CRC calculation.
* src/cksum.h: Add crc32b_sum_stream prototype.
* src/digest.c: Add "crc32b" as an argument to -a.
* tests/cksum/cksum.sh: Refactor to test both crc and crc32b.
* tests/cksum/cksum-a.sh: Add "crc32b" case.
* tests/cksum/cksum-base64.pl: Likewise.
* tests/misc/read-errors.sh: Likewise.
* NEWS: Mention the new feature.
---
 NEWS                        |  3 +++
 bootstrap.conf              |  1 +
 doc/coreutils.texi          |  8 +++++---
 src/cksum.c                 | 38 ++++++++++++++++++++++++++++++++++++
 src/cksum.h                 |  3 +++
 src/digest.c                | 18 +++++++++++------
 tests/cksum/cksum-a.sh      |  1 +
 tests/cksum/cksum-base64.pl |  5 +++--
 tests/cksum/cksum.sh        | 39 +++++++++++++++++++------------------
 tests/misc/read-errors.sh   |  1 +
 10 files changed, 87 insertions(+), 30 deletions(-)

diff --git a/NEWS b/NEWS
index 75263d027..b43f03e7f 100644
--- a/NEWS
+++ b/NEWS
@@ -39,6 +39,9 @@ GNU coreutils NEWS                                    -*- outline -*-
 
 ** New Features
 
+  cksum -a now supports the "crc32b" option, which calculates the CRC
+  of the input as defined by ITU V.42, as used in gzip for example.
+
   ls now supports the --sort=name option,
   to explicitly select the default operation of sorting by file name.
 
diff --git a/bootstrap.conf b/bootstrap.conf
index 451d32404..be9b8b7fd 100644
--- a/bootstrap.conf
+++ b/bootstrap.conf
@@ -63,6 +63,7 @@ gnulib_modules="
   config-h
   configmake
   copy-file-range
+  crc
   crypto/md5
   crypto/sha1
   crypto/sha256
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 60a07b2ef..4673ba814 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -4051,7 +4051,8 @@ and the file name unless no arguments were given.
 The 32-bit CRC used is based on the polynomial used
 for CRC error checking in the ISO/IEC 8802-3:1996 standard (Ethernet).
 Similar output formats are used for the other legacy checksums
-selectable with @option{--algorithm=sysv} or @option{--algorithm=bsd},
+selectable with @option{--algorithm=crc32b}, and
+@option{--algorithm=sysv} or @option{--algorithm=bsd}
 detailed at @ref{sum invocation}.
 
 @item Tagged output format
@@ -4100,6 +4101,7 @@ Supported legacy checksums (which are not supported by @option{--check}):
 @samp{sysv}      equivalent to @command{sum -s}
 @samp{bsd}       equivalent to @command{sum -r}
 @samp{crc}       equivalent to @command{cksum} (the default)
+@samp{crc32b}    only available through @command{cksum}
 @end example
 
 Supported more modern digest algorithms are:
@@ -4151,7 +4153,7 @@ as the length is automatically determined when checking.
 Print only the unencoded raw binary digest for a single input.
 Do not output the file name or anything else.
 Use network byte order (big endian) where applicable:
-for @samp{bsd}, @samp{crc}, and @samp{sysv}.
+for @samp{bsd}, @samp{crc}, @samp{crc32b}, and @samp{sysv}.
 This option works only with a single input.
 Unlike other output formats, @command{cksum} provides no way to
 @option{--check} a @option{--raw} checksum.
@@ -4225,7 +4227,7 @@ a checksum inconsistent with the associated file, or if no valid
 line is found, @command{cksum} exits with nonzero status.  Otherwise,
 it exits successfully.
 The @command{cksum} command does not support @option{--check}
-with the older @samp{sysv}, @samp{bsd}, or @samp{crc} algorithms.
+with the older @samp{sysv}, @samp{bsd}, @samp{crc} or @samp{crc32b} algorithms.
 
 @item --ignore-missing
 @opindex --ignore-missing
diff --git a/src/cksum.c b/src/cksum.c
index a97ffc26f..a977bf296 100644
--- a/src/cksum.c
+++ b/src/cksum.c
@@ -135,6 +135,7 @@ main (void)
 #else /* !CRCTAB */
 
 # include "cksum.h"
+# include "crc.h"
 
 /* Number of bytes to read at once.  */
 # define BUFLEN (1 << 16)
@@ -242,6 +243,43 @@ crc_sum_stream (FILE *stream, void *resstream, uintmax_t *length)
   return 0;
 }
 
+/* Calculate the crc32b checksum and length in bytes of stream STREAM.
+   Return -1 on error, 0 on success.  */
+
+int
+crc32b_sum_stream (FILE *stream, void *resstream, uintmax_t *reslen)
+{
+  uint32_t buf[BUFLEN / sizeof (uint32_t)];
+  uint32_t crc = 0;
+  uintmax_t len = 0;
+  size_t bytes_read;
+
+  if (!stream || !resstream || !reslen)
+    return -1;
+
+  while ((bytes_read = fread (buf, 1, BUFLEN, stream)) > 0)
+    {
+      if (len + bytes_read < len)
+        {
+          errno = EOVERFLOW;
+          return -1;
+        }
+      len += bytes_read;
+
+      crc = crc32_update (crc, (char const *)buf, bytes_read);
+
+      if (feof (stream))
+        break;
+    }
+
+  unsigned int crc_out = crc;
+  memcpy (resstream, &crc_out, sizeof crc_out);
+
+  *reslen = len;
+
+  return ferror (stream) ? -1 : 0;
+}
+
 /* Print the checksum and size to stdout.
    If ARGS is true, also print the FILE name.  */
 
diff --git a/src/cksum.h b/src/cksum.h
index 58e9310b9..f8b279973 100644
--- a/src/cksum.h
+++ b/src/cksum.h
@@ -6,6 +6,9 @@ extern bool cksum_debug;
 extern int
 crc_sum_stream (FILE *stream, void *resstream, uintmax_t *length);
 
+extern int
+crc32b_sum_stream (FILE *stream, void *resstream, uintmax_t *length);
+
 extern void
 output_crc (char const *file, int binary_file, void const *digest, bool raw,
             bool tagged, unsigned char delim, bool args, uintmax_t length)
diff --git a/src/digest.c b/src/digest.c
index 37910ede6..399a97a56 100644
--- a/src/digest.c
+++ b/src/digest.c
@@ -290,6 +290,7 @@ enum Algorithm
   bsd,
   sysv,
   crc,
+  crc32b,
   md5,
   sha1,
   sha224,
@@ -302,24 +303,24 @@ enum Algorithm
 
 static char const *const algorithm_args[] =
 {
-  "bsd", "sysv", "crc", "md5", "sha1", "sha224",
+  "bsd", "sysv", "crc", "crc32b", "md5", "sha1", "sha224",
   "sha256", "sha384", "sha512", "blake2b", "sm3", nullptr
 };
 static enum Algorithm const algorithm_types[] =
 {
-  bsd, sysv, crc, md5, sha1, sha224,
+  bsd, sysv, crc, crc32b, md5, sha1, sha224,
   sha256, sha384, sha512, blake2b, sm3,
 };
 ARGMATCH_VERIFY (algorithm_args, algorithm_types);
 
 static char const *const algorithm_tags[] =
 {
-  "BSD", "SYSV", "CRC", "MD5", "SHA1", "SHA224",
+  "BSD", "SYSV", "CRC", "CRC32B", "MD5", "SHA1", "SHA224",
   "SHA256", "SHA384", "SHA512", "BLAKE2b", "SM3", nullptr
 };
 static int const algorithm_bits[] =
 {
-  16, 16, 32, 128, 160, 224,
+  16, 16, 32, 32, 128, 160, 224,
   256, 384, 512, 512, 256, 0
 };
 
@@ -333,6 +334,7 @@ static sumfn cksumfns[]=
   bsd_sum_stream,
   sysv_sum_stream,
   crc_sum_stream,
+  crc32b_sum_stream,
   md5_sum_stream,
   sha1_sum_stream,
   sha224_sum_stream,
@@ -347,6 +349,7 @@ static digest_output_fn cksum_output_fns[]=
   output_bsd,
   output_sysv,
   output_crc,
+  output_crc,
   output_file,
   output_file,
   output_file,
@@ -530,6 +533,7 @@ DIGEST determines the digest algorithm and default output format:\n\
   sysv      (equivalent to sum -s)\n\
   bsd       (equivalent to sum -r)\n\
   crc       (equivalent to cksum)\n\
+  crc32b    (only available through cksum)\n\
   md5       (equivalent to md5sum)\n\
   sha1      (equivalent to sha1sum)\n\
   sha224    (equivalent to sha224sum)\n\
@@ -790,7 +794,7 @@ split_3 (char *s, size_t s_len,
       ptrdiff_t algo_tag = algorithm_from_tag (s + i);
       if (algo_tag >= 0)
         {
-          if (algo_tag <= crc)
+          if (algo_tag <= crc32b)
             return false;  /* We don't support checking these older formats.  */
           cksum_algorithm = algo_tag;
         }
@@ -1506,9 +1510,11 @@ main (int argc, char **argv)
     case bsd:
     case sysv:
     case crc:
+    case crc32b:
         if (do_check && algorithm_specified)
           error (EXIT_FAILURE, 0,
-                 _("--check is not supported with --algorithm={bsd,sysv,crc}"));
+                 _("--check is not supported with "
+                   "--algorithm={bsd,sysv,crc,crc32b}"));
         break;
     default:
         break;
diff --git a/tests/cksum/cksum-a.sh b/tests/cksum/cksum-a.sh
index a8da17bee..da3d034ad 100755
--- a/tests/cksum/cksum-a.sh
+++ b/tests/cksum/cksum-a.sh
@@ -44,6 +44,7 @@ while read algo prog mode; do
       bsd) ;;
       sysv) ;;
       crc) ;;
+      crc32b) ;;
       *) cksum --check --algorithm=$algo out-c || fail=1 ;;
     esac
 
diff --git a/tests/cksum/cksum-base64.pl b/tests/cksum/cksum-base64.pl
index a037a1628..88e8cad33 100755
--- a/tests/cksum/cksum-base64.pl
+++ b/tests/cksum/cksum-base64.pl
@@ -29,6 +29,7 @@ my @pairs =
    ['sysv', "0 0 f"],
    ['bsd', "00000     0 f"],
    ['crc', "4294967295 0 f"],
+   ['crc32b', "0 0 f"],
    ['md5', "1B2M2Y8AsgTpgAmY7PhCfg=="],
    ['sha1', "2jmj7l5rSw0yVb/vlWAYkK/YBwk="],
    ['sha224', "0UoCjCo6K8lHYQK7KII0xBWisB+CjqYqxbPkLw=="],
@@ -43,7 +44,7 @@ my @pairs =
 # Use the hard-coded "f" as file name.
 sub fmt ($$) {
   my ($h, $v) = @_;
-  $h !~ m{^(sysv|bsd|crc)$} and $v = uc($h). " (f) = $v";
+  $h !~ m{^(sysv|bsd|crc|crc32b)$} and $v = uc($h). " (f) = $v";
   # BLAKE2b is inconsistent:
   $v =~ s{BLAKE2B}{BLAKE2b};
   return "$v"
@@ -59,7 +60,7 @@ my @Tests =
    (map {my ($h,$v)= @$_; my $o=fmt $h,$v;
          ["chk-".$h, "--check --strict", {IN=>$o},
           {AUX=>{f=>''}}, {OUT=>"f: OK\n"}]}
-      grep { $_->[0] !~ m{^(sysv|bsd|crc)$} } @pairs),
+      grep { $_->[0] !~ m{^(sysv|bsd|crc|crc32b)$} } @pairs),
 
    # For digests ending in "=", ensure --check fails if any "=" is removed.
    (map {my ($h,$v)= @$_; my $o=fmt $h,$v;
diff --git a/tests/cksum/cksum.sh b/tests/cksum/cksum.sh
index 62a4dd424..3665a070d 100755
--- a/tests/cksum/cksum.sh
+++ b/tests/cksum/cksum.sh
@@ -19,56 +19,57 @@
 . "${srcdir=.}/tests/init.sh"; path_prepend_ ./src
 print_ver_ cksum printf
 
+
 returns_ 1 cksum missing || fail=1
 
+# Pass in expected crc and crc32b for file "in"
+# Sets fail=1 upon failure
+crc_check() {
+  for crct in crc crc32b; do
+    cksum -a $crct in > out || fail=1
+    case "$crct" in crc) crce="$1";; crc32b) crce="$2";; esac
+    size=$(stat -c %s in) || framework_failure_
+    printf '%s\n' "$crce $size in" > exp || framework_failure_
+    compare exp out || fail=1
+  done
+}
+
+
+# Check complete range of bytes
 {
   for offset in $(seq -1 6); do
     env printf $(env printf '\\%03o' $(seq 0 $offset));
     env printf $(env printf '\\%03o' $(seq 0 255));
   done
 } > in || framework_failure_
-
-cksum in > out || fail=1
-printf '%s\n' '4097727897 2077 in' > exp || framework_failure_
-compare exp out || fail=1
+crc_check 4097727897 559400337
 
 # Make sure crc is correct for files larger than 128 bytes (4 fold pclmul)
 {
   env printf $(env printf '\\%03o' $(seq 0 130));
 } > in || framework_failure_
-
-cksum in > out || fail=1
-printf '%s\n' '3800919234 131 in' > exp || framework_failure_
-compare exp out || fail=1
+crc_check 3800919234 3739179551
 
 # Make sure crc is correct for files larger than 32 bytes
 # but <128 bytes (1 fold pclmul)
 {
   env printf $(env printf '\\%03o' $(seq 0 64));
 } > in || framework_failure_
-
-cksum in > out || fail=1
-printf '%s\n' '796287823 65 in' > exp || framework_failure_
-compare exp out || fail=1
+crc_check 796287823 1086353368
 
 # Make sure crc is still handled correctly when next 65k buffer is read
 # (>32 bytes more than 65k)
 {
   seq 1 12780
 } > in || framework_failure_
-
-cksum in > out || fail=1
-printf '%s\n' '3720986905 65574 in' > exp || framework_failure_
-compare exp out || fail=1
+crc_check 3720986905 388883562
 
 # Make sure crc is still handled correctly when next 65k buffer is read
 # (>=128 bytes more than 65k)
 {
   seq 1 12795
 } > in || framework_failure_
+crc_check 4278270357 2796628507
 
-cksum in > out || fail=1
-printf '%s\n' '4278270357 65664 in' > exp || framework_failure_
-compare exp out || fail=1
 
 Exit $fail
diff --git a/tests/misc/read-errors.sh b/tests/misc/read-errors.sh
index 3f1e0c42c..ae9184c27 100755
--- a/tests/misc/read-errors.sh
+++ b/tests/misc/read-errors.sh
@@ -27,6 +27,7 @@ cat .
 cksum -a blake2b .
 cksum -a bsd .
 cksum -a crc .
+cksum -a crc32b .
 cksum -a md5 .
 cksum -a sha1 .
 cksum -a sha224 .
-- 
2.47.0
