Hi Richard,

A BitStream is certainly beyond what I initially had in mind.

But it could be useful to play with. Is it available somewhere
under a friendly open source license?

I might try using it in my generator (the tests are already
passing). Your example is almost complete, but if I understand it
correctly you're adding the size of the checksum instead of the
checksum itself. Nonetheless I think its readability is superior, and
I guess it is also better in terms of performance and memory.
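
For reference, BIP-39 takes the checksum from the SHA-256 hash of the
entropy, not from the entropy bytes themselves. A minimal sketch of the
index computation with plain integer arithmetic, assuming Pharo, where
SHA256 class>>hashMessage: answers a ByteArray and ByteArray>>asInteger
reads the bytes big-endian. The variable names are only illustrative,
and this is not the BitInputStream approach from your message.

```smalltalk
"Sketch only: assumes Pharo's SHA256 class>>hashMessage: and
 ByteArray>>asInteger (big-endian). Names are illustrative."
| entropy ent cs hash checksum bits n indexes |
entropy := ByteArray new: 16.             "example input: 128 zero bits"
ent := entropy size * 8.                  "BIP-39 ENT"
cs := ent // 32.                          "BIP-39 CS, 4 to 8 bits"
hash := SHA256 hashMessage: entropy.
"cs never exceeds 8, so the checksum is the top cs bits of the first hash byte."
checksum := hash first bitShift: cs - 8.
"Append the checksum to the entropy, seen as one large integer."
bits := (entropy asInteger bitShift: cs) bitOr: checksum.
n := (ent + cs) // 11.
"Take the 11-bit groups from the most significant end downwards."
indexes := (1 to: n) collect: [:k |
    (bits bitShift: (k - n) * 11) bitAnd: 2047].
```

For 16 zero bytes this gives eleven 0s followed by 3, which matches the
"abandon ... about" test vector once each index is looked up with
WordList at: index + 1.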

It could also be useful for the Base58Encoder I'm experimenting with.
Encoding is "simple" but requires the use of a really large integer;
I'm stuck at the decoding part now.
After that comes a Base32 encoder (to implement a Bech32 encoder as well).
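
To illustrate the large-integer step in the encoding direction, here is
a minimal sketch. It assumes the Bitcoin Base58 alphabet and plain Pharo
collection protocol (inject:into:, findFirst:, new:withAll:); the
variable names are illustrative and not taken from any existing
Base58Encoder.

```smalltalk
"Sketch only: Base58 encoding of a ByteArray via one large integer."
| alphabet bytes value first zeros digits encoded |
alphabet := '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'.
bytes := #[0 97].                         "example input"
"Read the whole byte array as one big-endian integer."
value := bytes inject: 0 into: [:acc :byte | acc * 256 + byte].
"Each leading zero byte is written as a leading $1."
first := bytes findFirst: [:byte | byte ~= 0].
zeros := first = 0 ifTrue: [bytes size] ifFalse: [first - 1].
"Peel off base-58 digits, least significant first."
digits := OrderedCollection new.
[value > 0] whileTrue: [
    digits addFirst: (alphabet at: value \\ 58 + 1).
    value := value // 58].
encoded := (String new: zeros withAll: $1) , (String withAll: digits).
```

With the example input #[0 97] this leaves '12g' in encoded. Decoding
would run the arithmetic the other way: multiply by 58 and add each
character's index, then split the integer back into bytes.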

So I might use it as I keep experimenting with these matters,
unless I get sidetracked and abandon it, as normally happens. :/

Regarding this part:
> This reminds me of a lesson I learned many years
> ago: STRINGS ARE WRONG.  (Thank you, designers of
> Burroughs Extended Algol!)  When trees aren't the
> answer, streams often are.

Can you provide more context on this? I wouldn't mind if it were in a
separate thread or a blog post of your own.

Thanks in advance.

Esteban A. Maringolo

2018-03-05 21:55 GMT-03:00 Richard O'Keefe <rao...@gmail.com>:
> I note that the specification in question does not
> deal with arbitrary bit strings but with "entropies"
> that are 128 to 256 bits long and a multiple of 32 bits.
> 4 to 8 bits are copied from the front to the end.
> (So selecting *this* bit field can be done by taking
> the first byte of a ByteArray.)  This makes the sequence
> 132 to 264 bits.  This is then chopped into 11 bit
> subsequences.  These are not arbitrary subsequences and
> they are not taken in arbitrary order.  They are a stream.
>
> My own Smalltalk library includes BitInputStream and
> BitOutputStream, wrapping byte streams.  So we could
> do something like
>
>    ent := aByteArray size * 8.  "BIP-39 ENT"
>    cs  := ent // 32.            "BIP-39 CS"
>    foo := ByteArray new: (ent + cs) // 8.
>    o   := BitOutputStream on: foo writeStream.
>    i   := BitInputStream on: aByteArray readStream.
>    1 to: ent do: [:x | o nextPut: i next].
>    i reset.
>    o nextUnsigned: cs put: (i nextUnsigned: cs).
>    i close.
>    o close.
>    i := BitInputStream on: foo readStream.
>    ans := (1 to: (ent + cs) // 11) collect: [:x |
>       WordList at: 1 + (i nextUnsigned: 11)].
>
> Stare at this for a bit, and you realise that you don't
> actually need the working byte array foo.
> ByteArray
>   methods for: 'bitcoin'
>     mnemonic
>       |ent cs n i t|
>       ent := self size * 8.
>       (ent between: 128 and: 256)
>         ifFalse: [self error: 'wrong size for BIP-39'].
>       cs  := ent // 32.
>       n   := (ent + cs) // 11.
>       i   := BitInputStream on: (ReadStream on: self).
>       t   := i nextUnsigned: cs.
>       i reset.
>       ^(1 to: n) collect: [:index |
>         WordList at: 1 + (index = n
>           ifTrue:  [((i nextUnsigned: 11 - cs) bitShift: cs) bitOr: t]
>           ifFalse: [i nextUnsigned: 11])]
>
> My BitInputStream and BitOutputStream classes are,
> um, not really mature.  They aren't *completely*
> naive, but they could be a lot better, and in
> particular, BitInputStream>>nextUnsigned: and
> BitOutputStream>>nextUnsigned:put: are definitely
> suboptimal.  I put this out there just to suggest
> that there is a completely different way of thinking
> about the problem.  (Actually, this isn't *entirely*
> unlike using Erlang bit syntax.)
>
> Bit*Streams are useful enough to justify primitive
> support.  (Which my classes don't have yet.  I did
> say they are not mature...)
>
> This reminds me of a lesson I learned many years
> ago: STRINGS ARE WRONG.  (Thank you, designers of
> Burroughs Extended Algol!)  When trees aren't the
> answer, streams often are.
>
>
>
>
> On 6 March 2018 at 07:21, Esteban A. Maringolo <emaring...@gmail.com> wrote:
>>
>> 2018-03-05 14:02 GMT-03:00 Stephane Ducasse <stepharo.s...@gmail.com>:
>> > On Sun, Mar 4, 2018 at 9:43 PM, Esteban A. Maringolo
>> > <emaring...@gmail.com> wrote:
>> >> 2018-03-04 17:15 GMT-03:00 Sven Van Caekenberghe <s...@stfx.eu>:
>> >>> Bits are actually numbered from right to left (seen from how they are
>> >>> printed).
>> >>
>> >> I understand bit operations; I used them extensively with IP
>> >> addresses eons ago.
>> >>
>> >> But if a spec says: "Take the first n bits from the hash", it means
>> >> the first significant bits.
>> >> so in 2r100101111 the first 3 bits are "100" and not "111".
>> >
>> > naive question: why?
>>
>> Because it says so.
>> "A checksum is generated by taking the first ENT / 32 bits of its
>> SHA256 hash. This checksum is appended to the end of the initial
>> entropy.
>> Next, these concatenated bits are split into groups of 11 bits, each
>> encoding a number from 0-2047, serving as an index into a wordlist.
>> Finally, we convert these numbers into words and use the joined words
>> as a mnemonic sentence." [1].
>>
>> > To me it looks like a lousy specification.
>>
>> It might be, I can't really tell.
>>
>> But such manipulation could be useful if you are knitting different
>> parts of a binary packet whose boundaries are not at byte level, but
>> bit instead. So you can take "these 5 bits, concatenate with this
>> other 7, add 13 zero bits, then 1 followed by the payload". I'm
>> assuming a non real case here though, my use case was fulfilled
>> already.
>>
>> Regards!
>>
>> --
>> Esteban A. Maringolo
>>
>> [1]
>> https://github.com/bitcoin/bips/blob/master/bip-0039.mediawiki#generating-the-mnemonic
>>
>
