I love doing binary codecs for Raku[1]! How you approach this really
depends on what formats and protocols you want to create Raku modules for.
The first thing you need to be able to do is test if your codec is
correct. It is notoriously easy to make a tiny mistake in a protocol
implementation and (especially for binary protocols) miss it entirely
because it only happens in certain edge cases.
If the format or protocol in question is open and has one or more public
test suites, you're in good shape. Raku gives you a lot of power for
refactoring tests into something very clean, and I've had good success
doing this
with several formats.
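For instance, a data-driven test file stays tiny even as the vector
list grows. Here's a rough sketch using the standard Test module;
decode-thing is just a placeholder for whatever your module exports,
and the hex vectors happen to be standard CBOR encodings of the sort
RFC 8949's appendix lists:

    use Test;
    # use My::Codec;   # hypothetical module exporting decode-thing

    # Known-good vectors: hex bytes => expected decoded value
    my @vectors =
        '00'   => 0,      # unsigned 0
        '1864' => 100,    # unsigned 100
        '6161' => 'a',    # one-character text string
    ;

    for @vectors -> $vector {
        my $buf = Buf.new($vector.key.comb(2).map({ :16($_) }));
        is-deeply decode-thing($buf), $vector.value,
                  "decodes 0x{$vector.key} correctly";
    }

    done-testing;

Adding a new vector is then just one more line in the table.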
If there is no public test suite, but you can find RFCs or other
detailed specs, you can often bootstrap a bespoke test suite from the
examples in the spec documents. Failing that, sometimes you can find
sites (even Wikipedia, for the most common formats) that have
known-correct examples to start with, or have published reverse
engineering of files or captured data.
If the format is truly proprietary, you'll be getting lots of reverse
engineering practice of your own. 😉
Now that you have some way of testing correctness, you'll want to be
able to diagnose the incorrect bits. Make sure you have some way of
presenting easily readable text expansions of the binary format, because
just comparing raw buffer contents can be rather tedious (though I admit
to having found bugs in a public test suite by spending so much time
staring at the buffers I could tell they'd messed up a translation in a
way that made the test always pass). If the format or protocol has an
official text translation/diagnostic/debug format -- CBOR, BSON,
Protobuf, etc. all have these -- so much the better; support that
format as soon as practical.
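Even without an official diagnostic notation, a ten-line hex dump
helper pays for itself very quickly when you're comparing expected and
actual buffers side by side. A minimal sketch (the names here are my
own invention, not from any particular module):

    sub hex-dump(Blob $buf) {
        # 16 bytes per row: offset, hex bytes, printable-ASCII gutter
        for $buf.list.rotor(16, :partial).kv -> $row, @bytes {
            my $hex   = @bytes.fmt('%02x', ' ');
            my $ascii = @bytes.map({ 32 <= $_ <= 126 ?? .chr !! '.' }).join;
            say sprintf('%08x  %-47s  %s', $row * 16, $hex, $ascii);
        }
    }

    hex-dump 't/data/sample.bin'.IO.slurp(:bin);   # path is just an example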
Once you get down to the nitty-gritty of writing the codec, I find it is
very important to make it work before making it fast. There is a lot of
room for tuning Raku code, but it is WAY easier to get things going in
the right direction by starting off with idiomatic Raku -- given/when,
treating the data buffer as if it were a normal Array (Positional,
really), and so on.
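Purely by way of illustration -- the format here is invented for the
example -- a first cut at a decoder for a simple tag-length-value
layout needs nothing fancier than an index and a given/when:

    # Imaginary TLV format: a tag byte (0x01 = big-endian unsigned int,
    # 0x02 = UTF-8 string), a length byte, then that many payload bytes.
    sub decode-tlv(Blob $buf) {
        my $pos = 0;
        my @values;
        while $pos < $buf.elems {
            my $tag     = $buf[$pos++];
            my $len     = $buf[$pos++];
            my $payload = $buf.subbuf($pos, $len);
            $pos += $len;
            given $tag {
                when 0x01 {
                    # Accumulate payload bytes as a big-endian unsigned int
                    @values.push: $payload.list.reduce({ $^a +< 8 + $^b }) // 0
                }
                when 0x02 { @values.push: $payload.decode('utf-8') }
                default   { die "Unknown tag $tag at offset {$pos - $len - 2}" }
            }
        }
        @values
    }

It won't win any benchmarks, but it's short enough to be obviously
correct, and that's the point at this stage.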
Make sure that every protocol feature you add makes some previously
failing tests pass, and (I find, at least) write the encoding and
decoding bits at the same time, so you can check that data round-trips
successfully. For the love of all that is good, don't implement any
obscure features before the core features are rock solid and pass the
test suite with nary a hiccup.
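The round-trip half of that can be as simple as this sketch, where
encode-thing and decode-thing again stand in for whatever your module
actually exports:

    use Test;
    # use My::Codec;   # hypothetical: exports encode-thing / decode-thing

    my @samples = 0, 1, -1, 2**32, '', 'hello', [1, 2, 'mixed'];

    for @samples -> $value {
        is-deeply decode-thing(encode-thing($value)), $value,
                  "round-trips {$value.raku}";
    }

    done-testing;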
After that, when you think you're ready to optimize, write performance
/tests/ first. Make sure you test with data that both exercises your
codec in a typical manner and pokes at all the odd corners. You're
looking for things that seem weirdly slow; this usually indicates a
thinko like copying the entire buffer each time you read a byte from
it, or some such.
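A performance test here doesn't need to be fancy, either; timing a loop
over representative data with now gets you most of the way
(decode-thing and the data file are placeholders again):

    # use My::Codec;   # hypothetical module exporting decode-thing

    my Blob $typical = 't/data/typical.bin'.IO.slurp(:bin);

    my $start   = now;
    decode-thing($typical) for ^1_000;
    my $elapsed = now - $start;

    say sprintf('1000 typical decodes: %.3fs (%.1f µs each)',
                $elapsed, $elapsed / 1_000 * 1_000_000);

(The classic way to hit the thinko above, for what it's worth, is a
per-byte $buf .= subbuf(1) somewhere in a read helper; keeping the
buffer intact and advancing an integer index instead turns an
accidentally quadratic decode back into a linear one.)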
Once you've got the obvious performance kinks worked out, come by and
ask again, and we can give you further advice from there. Or heck, just
come visit us on IRC (#raku on Libera.Chat), and we'll be happy to
help. (Do stick around for a while though, because traffic varies
strongly by time of day and day of week.)
Best Regards,
Geoff (japhb)
[1] I'm a bit of a nut for it, really. In the distant past, I wrapped
C libraries to get the job done, but more recently I've written the
codecs as plain Raku code (and sometimes NQP, the language that Rakudo
is written in).
I've written some of the binary format codecs for Raku:
* https://github.com/japhb/CBOR-Simple
* https://github.com/japhb/BSON-Simple
* https://github.com/japhb/Terminal-ANSIParser
* https://github.com/japhb/TinyFloats
Modified or tuned others:
* https://github.com/samuraisam/p6-pb/commits?author=japhb
* https://github.com/japhb/serializer-perf
* (Lots of stuff spread across various Cro repositories:
  https://github.com/croservices)
Added a spec extension for an existing standardized format (CBOR):
* https://github.com/japhb/cbor-specs/blob/main/capture.md
And I think I forgot a few things. 😁