Hiya, Are we all set with this one now?
I won't put it on the Dec17 IESG telechat (as I've already got a bunch of stuff on that one) but will tee it up for Jan7 if nobody tells me otherwise. Thanks, S. On 20/11/15 12:29, Simon Josefsson wrote: > Executive Summary: I have submitted > https://tools.ietf.org/html/draft-josefsson-scrypt-kdf-04 > that (hopefully) resolve the gen-art and secdir feedback. > > Paul Kyzivat <pkyzi...@alum.mit.edu> writes: > >> This draft is on the right track but has open issues, described in the >> review. > > Hello Paul. I am sorry for the delay in answering your thorough (much > appreciated!) review. Your review was emailed to the wrong > @tools.ietf.org address, but I can't blame that for my delay since I was > notified of your review through the secdir review. > >> (Issue-1): Intended status >> >> The intended status of this document is Informational. I question why >> it is not a normative document. As best I can tell, this is the formal >> specification of the algorithm. Those that use it would presumably >> want to claim conformance to it. The introduction describes this as an >> alternative to other KDF functions, including only one with an RFC >> reference - RFC2898. That one is also informational, but it is a >> restatement of an algorithm specified elsewhere, so that RFC can be >> viewed as an informational supplement to the actual definition. The >> same is not true of this document. >> >> (Of course changing this to a normative document would require >> significant changes, including adding 2119 language. And it probably >> could then not be handled as an AD-sponsored document.) > > As I understand it, the IETF tradition is that descriptions of crypto > algorithms that were invented elsewhere are not published as Standards > Tracks documents. So this is in line with that tradition. > >> (Issue-2): Type mismatch on call to scryptROMix >> >> The scryptROMix function calls scryptBlockMix with an octet vector of >> length 128*r octets. But the definition of scryptBlockMix specifies >> that the input argument should be a vector of 2*r 64-octet blocks. >> >> Clearly these don't match. One way to make them match would be to >> divide the single 128*r octet vector into two 64-octet vectors, and >> then to treat r as 2 inside of scryptBlockMix. I don't know if that is >> the intent. > > I believe this is a presentation issue. The intention is indeed that > conversion between these formats is transparent. In performant > implementations, B will refers to the same memory area. The document > was confusing here, and I believe the document was further problematic > since it described the inputs as an array of elements, rather than a > concatenation of octets. It now reads: > > Input: > B[0] || B[1] || ... || B[2 * r - 1] > Input octet string (of size 128 * r octets), > treated as 2 * r 64-octet blocks. > > Output: > B'[0] || B'[1] || ... || B'[2 * r - 1] > Output octet string. > > I hope this improves this aspect. > >> (Issue-3): Definition of Integerify >> >> The Integerify function is not clearly defined. The following is given >> in the document: >> >> j = Integerify (X) mod N >> where Integerify (B[0] ... B[2 * r - 1]) is defined >> as the result of interpreting B[2 * r - 1] as a >> little-endian integer. >> >> I can make no sense of this definition of Integerify. The description >> implies that B must be an array containing elements up to index >> 2*r-1. But the definition of B is "Input octet vector of length 128 * >> r octets". Taking the definition literally, B[2*r-1] must be an octet, >> and 2*r must be less than 128. That seems like nonsense to me. >> >> I found the following in the [SCRYPT] paper: >> >> "We expect that for reasons of performance and simplicity, >> implementors will restrict N to being a power of 2, in which case >> the function Integerify can be replaced by reading the first (or >> last) machine-length word from a k-bit block." >> >> Simply reading a machine-length word ignores the differences between >> little-endian and big-endian machines, and machines with different >> word sizes. Conveniently, [SALSA20SPEC] defines a littleendian >> function that yields a 32-bit integer from four bytes. That should be >> sufficient bits for computing "j". So Integerify(X) could be defined >> as: >> >> littleendian(X[0],X[1],X[2],X[3]) >> >> or >> >> littleendian(X[128*r-4],X[128*r-3],X[128*r-2],X[128*r-1]) >> >> (I don't think it matters which, as long as everyone does it the same way.) >> >> In any case, the language is ambiguous and needs to be clarified. > > Right. I have changed this into: > > j = Integerify (X) mod N > where Integerify (X) is defined as the result of > interpreting the last four octets of X as a little- > endian integer, i.e.: > littleendian(X[128*r-4], X[128*r-3], > X[128*r-2], X[128*r-1]) > >> ------------- >> Minor issues: >> >> (Issue-4): Identifiers reused for different meanings >> >> In scrypt, "B" is an array of "p" vectors, each of which is 128*r >> octets. In scryptROMix, "B" is a single vector of 128*r octets. In >> scryptBlockMix, "B" is a vector of 2*r 64-octet blocks. > > I'm not sure that changing variable names here is a good idea. For any > performant implementation, these variables will refer to the same memory > area and thus a consistent variable name helps to signal that. It is > just the interpretation of that memory area that is different in these > functions. > >> In both scrypt and scryptROMix "r" is the same block size >> parameter. But in scryptROMix it is only used in the (broken) >> definition of Integerify. > > It is used in the variable descriptions too, to indicate length of the > variables. > >> In scryptBlockMix "r" is (apparently, if I have figured things out) >> always 2, and unrelated to the other "r". > > No, r is the same throughout. > >> The document would be clearer if distinct identifiers were used for >> each unique concept. > > I believe the identifier refers to unique concepts. For B, what differs > is how that memory area is interpreted in each algorithm description. > > Can you think of some way to make this more clear, if the changes I've > made now aren't sufficient? I believe some changes I have made already > make this somewhat clearer though. > >> For those identifiers whose value is intended to be constant and >> common across all the functions (such as "N"), it would be better to >> define them once, globally. >> >> (Issue-5): Confusing/misleading names/definitions of identifiers >> >> The "Block size" parameter ("r") does not denote the size of a >> block. It is a factor in the size of blocks, varying from function to >> function. Exactly what concept it denotes, and how one would choose >> it, isn't clear to me. >> >> The definition of the "N" parameter (CPU/Memory cost parameter) isn't >> especially clear. It appears that increasing N increases the cost both >> of CPU and memory. But the "p" (parallelization) parameter acts as a >> multiplier on N, also increasing the cost. It is far from clear how >> one would choose appropriate values for N and p. For a given value of >> N*p, is it better for N to be large, or p to be large? >> >> I suggest that more thought be given to what these things mean in the >> context of this application, and then choose identifier names and >> descriptions accordingly. It may be better to refactor these some >> other way. > > This came from the secdir review as well -- I have added a section > "Scrypt parameters" to discuss this. > > 2. Scrypt Parameters > > The scrypt function takes several parameters. The passphrase P is > typically a human-chosen password. The salt is normally uniquely and > randomly generated [RFC4086]. The parameter r ("blockSize") specify > the block size. The CPU/Memory cost parameter N ("costParameter") > must be larger than 1, a power of 2 and less than 2^(128 * r / 8). > The parallelization parameter p ("parallelizationParameter"), a > positive integer less than or equal to ((2^32-1) * 32) / (128 * r). > The intended output length dkLen in octets of the derived key > ("keyLength"); a positive integer less than or equal to (2^32 - 1) * > 32. > > Users of scrypt can tune the parameters N, r, and p according to the > amount of memory and computing power available, the latency-bandwidth > product of the memory subsystem, and the amount of parallelism > desired. At the current time, taking r=8 and p=1 appears to yield > good results, but as memory latency and CPU parallelism increase it > is likely that the optimum values for both r and p will increase. > Note also that since the computations of SMix are independent, a > large value of p can be used to increase the computational cost of > scrypt without increasing the memory usage; so we can expect scrypt > to remain useful even if the growth rates of CPU power and memory > capacity diverge. > >> The ASN.1 in section 6 assigns names to several of these >> identifiers. It would be helpful to readers if the names used in >> defining the algorithms were also the names used here. > > I've added these names to the "Scrypt Parameters" section above. > >> (Issue-6): Dubious stability of references >> >> I looked for prior discussion of this draft, and found some on the >> saag mailing list regarding the references. >> >> The definition of the Salsa20 hash function in >> http://cr.yp.to/snuffle/spec.pdf seems clear enough, but is the >> document reference stable? It might be safer to replicate the >> definition in this document (in an appendix) with attribution. It >> doesn't appear that there is any copyright in the referenced >> document. > > I have included the brief C snippet that defines the function in the > section, which hopefully is sufficient clear for implementers (together > with the already included test vector) to transpose it into something > working in any language. This allowed me to move those two references > to informative ones. > >> I'll also note that call to this hash function in scryptBlockMix is to >> "Salsa", not Salso20. It ought to be consistent with the definition. > > This is explained in the section: > > Below, Salsa(T) corresponds to the Salsa20/8 Core function applied to > the octet vector T. > >> ------------------------ >> Nits/editorial comments: >> >> (Issue-7): IdNits reported errors >> >> IdNits reports: >> >> -- Obsolete informational reference (is this intentional?): RFC 5208 >> (Obsoleted by RFC 5958) >> >> This is triggered by the following in section 6: >> >> "To be usable in PKCS#8 [RFC5208] and Asymmetric Key Packages >> [RFC5958] the following extension of the PBES2-KDFs type is needed." >> >> Since RFC5958 is referenced, and it obsoletes RFC5208, what is the >> reason for referencing both? > > The reason was that a lot of protocols refers to 'PKCS#8' and RFC 5208 > is the IETF-canonical reference for that term, and a way to illustrate > that the scrypt ASN.1 schema is compliant with "old" PKCS#8. RFC 5958 > obsoletes RFC 5208 but the identifier 'PKCS#8' was lost in the process. > > I'm fine with dropping 5208 as a reference, but I do believe it provides > slightly more value to the reader. > >> (There is also an error about 2119, but it is bogus, triggered by the >> use of OPTIONAL in the ASN.1.) >> >> This completes my review. > > Thank you! > > /Simon > _______________________________________________ Gen-art mailing list Gen-art@ietf.org https://www.ietf.org/mailman/listinfo/gen-art