On 29/10/17 00:28, Assaf Gordon wrote:
> Hello,
>
> On 2017-10-24 04:09 AM, Bruno Haible wrote:
>>>> Indeed, we don't have many crypto experts on the gnulib mailing
>>>> lists. Therefore, I too would like the crypto experts on the
>>>> libgcrypt or openssl mailing lists to evaluate the SM3 algorithm
>>>> and your code first, before we can accept it in gnulib.
>>>
>>> libgcrypt has merged SM3 hash function. See
>>> https://dev.gnupg.org/rC4423bf3cc4432b9bfe801ff74cb05e6f0dd3eccd
>
> I see SM3 has been added to gnulib and libgcrypt,
> but I would suggest we wait for further evaluation before adding it to
> coreutils.
>
> libgcrypt specifically says: "Thorough understanding of
> applied cryptography is required to use Libgcrypt."
> and Werner Koch explicitly acknowledged that they are willing to include
> known weak algorithms in libgcrypt because non-experts should never use
> it directly:
> https://lists.gnupg.org/pipermail/gcrypt-devel/2017-October/004287.html
>
> I'm assuming the same applies to gnulib - but not to coreutils.
>
>
> I won't claim to have any understanding of cryptography or hashing,
> but a cursory look at the code raises three red flags:
>
> First,
> It is *very* similar to sha256.
> Not just the function structure, but almost line-for-line identical.
> It could almost be said that SM3 is just a modified sha256 with slightly
> different primitives.
> (try "diff -u lib/sm3.c lib/sha256.c" to see what I mean).
>
> sha256 was published in 2001, and while it has not yet been compromised,
> I would naively think it is not wise to offer a new hashing algorithm in
> 2017 that's based on sha256.
> I think it is already generally recommended to move away from sha-256
> to at least sha-512 (or sha-3).
>
>
> Second,
> All SHA-2 hash functions use initialization values that
> are based on the cube roots of the first 64 primes.
> In lib/sha256.c it is the "sha256_round_constants" variable.
> In lib/sm3.c these initialization values are the same 32bit values,
> rotated:
>
>   /* SM3 round constants */
>   #define T(j) sm3_round_constants[j]
>   static const uint32_t sm3_round_constants[64] = {
>     0x79cc4519UL, 0xf3988a32UL, 0xe7311465UL, 0xce6228cbUL,
>     0x9cc45197UL, 0x3988a32fUL, 0x7311465eUL, 0xe6228cbcUL,
>     0xcc451979UL, 0x988a32f3UL, 0x311465e7UL, 0x6228cbceUL,
>     0xc451979cUL, 0x88a32f39UL, 0x11465e73UL, 0x228cbce6UL,
>     0x9d8a7a87UL, 0x3b14f50fUL, 0x7629ea1eUL, 0xec53d43cUL,
>     0xd8a7a879UL, 0xb14f50f3UL, 0x629ea1e7UL, 0xc53d43ceUL,
>     0x8a7a879dUL, 0x14f50f3bUL, 0x29ea1e76UL, 0x53d43cecUL,
>     0xa7a879d8UL, 0x4f50f3b1UL, 0x9ea1e762UL, 0x3d43cec5UL,
>     0x7a879d8aUL, 0xf50f3b14UL, 0xea1e7629UL, 0xd43cec53UL,
>     0xa879d8a7UL, 0x50f3b14fUL, 0xa1e7629eUL, 0x43cec53dUL,
>     0x879d8a7aUL, 0x0f3b14f5UL, 0x1e7629eaUL, 0x3cec53d4UL,
>     0x79d8a7a8UL, 0xf3b14f50UL, 0xe7629ea1UL, 0xcec53d43UL,
>     0x9d8a7a87UL, 0x3b14f50fUL, 0x7629ea1eUL, 0xec53d43cUL,
>     0xd8a7a879UL, 0xb14f50f3UL, 0x629ea1e7UL, 0xc53d43ceUL,
>     0x8a7a879dUL, 0x14f50f3bUL, 0x29ea1e76UL, 0x53d43cecUL,
>     0xa7a879d8UL, 0x4f50f3b1UL, 0x9ea1e762UL, 0x3d43cec5UL,
>
>
> Third,
> The 64 rounds ("Compression function") in sha256 seem to
> rotate the internal state variables equally (a..h)
> in the R macro definition [1] and usage [2].
>
> [1] https://opengrok.housegordon.com/source/xref/gnulib/lib/sha256.c#484
> [2] https://opengrok.housegordon.com/source/xref/gnulib/lib/sha256.c#504
>
> In sm3, it seems from a cursory look that the first 128 bits
> and the last 128 bits of internal states are kept mostly separated
> (variable a..d and e..h) based on the R macro [3] and usage [4].
>
> [3] https://opengrok.housegordon.com/source/xref/gnulib/lib/sm3.c#377
> [4] https://opengrok.housegordon.com/source/xref/gnulib/lib/sm3.c#413
>
>
> Again, I'm not a cryptography expert, and perhaps this is secure.
> But it does look a bit suspicious.
>
> ---
>
> It's one thing to add it to a library, where it can not be accidentally
> used by non-expert users, but it's a different thing to provide an
> executable that will be shipped by default on all future GNU/Linux
> machines.
>
> I would humbly suggest we wait until other crypto experts evaluate
> 'sm3', and decide to offer it in their programs (preferably openssl /
> libressl) as a general-purpose program.

Thanks for looking at this, Assaf.

I do agree that a general command released to everyone
has a higher bar than routines being available in a library.

As I mentioned previously, inclusion in openssl should be
a minimum prerequisite, and even then making sm3sum generally
available should be carefully considered.

cheers,
Pádraig