Hi Richard,

A BitStream is certainly beyond what I initially had in mind, but it
could be useful to play with. Is it available somewhere under a
friendly open source license? I might try it in my generator (the
tests are already passing).

Your example is almost complete, but if I understand it correctly
you're adding the size of the checksum instead of the checksum itself.
Nonetheless, I think its readability is superior, and I guess it is
also better in terms of performance and memory.

It could also be useful for the Base58Encoder I'm experimenting with.
Encoding is "simple" but requires a really large integer; I'm stuck
at the decoding part now. After that comes a Base32 encoder (to
implement a Bech32 encoder as well).

So I might use it along my road of experimentation with these matters,
unless I diverge and abandon it, as normally happens. :/

Regarding this part:
> This reminds me of a lesson I learned many years
> ago: STRINGS ARE WRONG. (Thank you, designers of
> Burroughs Extended Algol!) When trees aren't the
> answer, streams often are.

Can you provide more context on this? I wouldn't mind if it is in a
separate thread or a blog post of your own.

Thanks in advance.

Esteban A. Maringolo


2018-03-05 21:55 GMT-03:00 Richard O'Keefe <rao...@gmail.com>:
> I note that the specification in question does not
> deal with arbitrary bit strings but with "entropies"
> that are 128 to 256 bits long and a multiple of 32 bits.
> 4 to 8 bits are copied from the front to the end.
> (So selecting *this* bit field can be done by taking
> the first byte of a ByteArray.) This makes the sequence
> 132 to 264 bits. This is then chopped into 11 bit
> subsequences. These are not arbitrary subsequences and
> they are not taken in arbitrary order. They are a stream.
>
> My own Smalltalk library includes BitInputStream and
> BitOutputStream, wrapping byte streams. So we could
> do something like
>
>   ent := aByteArray size * 8.      "BIP-39 ENT"
>   cs := ent // 32.                 "BIP-39 CS"
>   foo := ByteArray new: (ent + cs + 7) // 8.   "round up to whole bytes"
>   o := BitOutputStream on: foo writeStream.
>   i := BitInputStream on: aByteArray readStream.
>   1 to: ent do: [:x | o nextPut: i next].
>   i reset.
>   o nextUnsigned: cs put: (i nextUnsigned: cs).
>   i close.
>   o close.
>   i := BitInputStream on: foo readStream.
>   ans := (1 to: (ent + cs) // 11) collect: [:x |
>      WordList at: 1 + (i nextUnsigned: 11)].
>
> Stare at this for a bit, and you realise that you don't
> actually need the working byte array foo.
>
> ByteArray
>   methods for: 'bitcoin'
>     mnemonic
>       |ent cs n i t|
>       ent := self size * 8.
>       (ent between: 128 and: 256)
>         ifFalse: [self error: 'wrong size for BIP-39'].
>       cs := ent // 32.
>       n := (ent + cs) // 11.
>       i := BitInputStream on: (ReadStream on: self).
>       t := i nextUnsigned: cs.
>       i reset.
>       ^(1 to: n) collect: [:index |
>         WordList at: 1 + (index = n
>           ifTrue:  [((i nextUnsigned: 11 - cs) bitShift: cs) bitOr: t]
>           ifFalse: [i nextUnsigned: 11])]
>
> My BitInputStream and BitOutputStream classes are,
> um, not really mature. They aren't *completely*
> naive, but they could be a lot better, and in
> particular, BitInputStream>>nextUnsigned: and
> BitOutputStream>>nextUnsigned:put: are definitely
> suboptimal. I put this out there just to suggest
> that there is a completely different way of thinking
> about the problem. (Actually, this isn't *entirely*
> unlike using Erlang bit syntax.)
>
> Bit*Streams are useful enough to justify primitive
> support. (Which my classes don't have yet. I did
> say they are not mature...)
>
> This reminds me of a lesson I learned many years
> ago: STRINGS ARE WRONG. (Thank you, designers of
> Burroughs Extended Algol!) When trees aren't the
> answer, streams often are.
>
>
> On 6 March 2018 at 07:21, Esteban A. Maringolo <emaring...@gmail.com> wrote:
>>
>> 2018-03-05 14:02 GMT-03:00 Stephane Ducasse <stepharo.s...@gmail.com>:
>> > On Sun, Mar 4, 2018 at 9:43 PM, Esteban A. Maringolo
>> > <emaring...@gmail.com> wrote:
>> >> 2018-03-04 17:15 GMT-03:00 Sven Van Caekenberghe <s...@stfx.eu>:
>> >>> Bits are actually numbered from right to left (seen from how they are
>> >>> printed).
>> >>
>> >> I understand bit operations; I used them extensively with IP addresses
>> >> eons ago.
>> >>
>> >> But if a spec says: "Take the first n bits from the hash", it means
>> >> the first significant bits.
>> >> So in 2r100101111 the first 3 bits are "100" and not "111".
>> >
>> > naive question: why?
>>
>> Because it says so.
>> "A checksum is generated by taking the first ENT / 32 bits of its
>> SHA256 hash. This checksum is appended to the end of the initial
>> entropy.
>> Next, these concatenated bits are split into groups of 11 bits, each
>> encoding a number from 0-2047, serving as an index into a wordlist.
>> Finally, we convert these numbers into words and use the joined words
>> as a mnemonic sentence." [1].
>>
>> > To me it looks like a lousy specification.
>>
>> It might be, I can't really tell.
>>
>> But such manipulation could be useful if you are knitting together
>> different parts of a binary packet whose boundaries are not at byte
>> level, but at bit level instead. So you can take "these 5 bits,
>> concatenate with this other 7, add 13 zero bits, then 1 followed by
>> the payload". I'm assuming a hypothetical case here though; my use
>> case was fulfilled already.
>>
>> Regards!
>>
>> --
>> Esteban A. Maringolo
>>
>> [1]
>> https://github.com/bitcoin/bips/blob/master/bip-0039.mediawiki#generating-the-mnemonic
>>
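
For reference, the spec passage quoted above (checksum taken from the
SHA256 hash, appended to the entropy, then split into 11-bit groups) can
also be sketched without any bit stream class, using one large integer.
This is only a rough sketch under stated assumptions, not code from the
thread: the variable names are made up, it uses nothing beyond core
Integer and ByteArray messages, and firstHashByte is an assumed input
that you would obtain from whatever SHA-256 implementation your image
provides.

```smalltalk
"Sketch only; all names here are hypothetical."
| entropy firstHashByte ent cs bits checksum n indices |
entropy := ByteArray new: 16.        "128 zero bits, for illustration"
firstHashByte := 16r37.              "ASSUMED first byte of SHA-256(entropy);
                                      compute the real digest in your image"
ent := entropy size * 8.             "BIP-39 ENT"
cs := ent // 32.                     "BIP-39 CS, 4 to 8 bits"

"Fold the bytes into one big-endian integer."
bits := entropy inject: 0 into: [:acc :byte | (acc bitShift: 8) bitOr: byte].

"Keep the most significant cs bits of the first hash byte
 (a negative bitShift: shifts right)."
checksum := firstHashByte bitShift: cs - 8.

"Append the checksum to the end of the entropy."
bits := (bits bitShift: cs) bitOr: checksum.

"Cut into n groups of 11 bits, most significant group first."
n := (ent + cs) // 11.
indices := (1 to: n) collect: [:k |
    (bits bitShift: (k - n) * 11) bitAnd: 2047].
```

Each element of indices is zero-based, so the word lookup would be
WordList at: 1 + index, as in the quoted examples. With all-zero
entropy every index is 0 except the last, which equals the checksum
(3 when the assumed hash byte is 16r37).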