I love doing binary codecs for Raku[1]!  How you approach this really depends on what formats and protocols you want to create Raku modules for.

The first thing you need to be able to do is test if your codec is correct.  It is notoriously easy to make a tiny mistake in a protocol implementation and (especially for binary protocols) miss it entirely because it only happens in certain edge cases.

If the format or protocol in question is open and has one or more public test suites, you're in good shape.  Raku gives a lot of power for refactoring tests to be very clean, and I've had good success doing this with several formats.

If there is no public test suite, but you can find RFCs or other detailed specs, you can often bootstrap a bespoke test suite from the examples in the spec documents.  Failing that, sometimes you can find sites (even Wikipedia, for the most common formats) that have known-correct examples to start with, or have published reverse engineering of files or captured data.
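
As a rough sketch of bootstrapping tests from spec examples, here is a table-driven test for CBOR unsigned integers, using a few of the example encodings listed in Appendix A of RFC 8949.  The `encode-uint` routine is a hypothetical stand-in written just for this illustration (it is not taken from any existing module), and it only handles values below 2**32.

```raku
use Test;

# Hypothetical stand-in encoder: CBOR major type 0 (unsigned integers), below 2**32 only.
sub encode-uint(UInt $n --> Blob) {
    if    $n < 24    { Blob.new($n) }
    elsif $n < 2**8  { Blob.new(0x18, $n) }
    elsif $n < 2**16 { Blob.new(0x19, $n +> 8, $n +& 0xff) }
    elsif $n < 2**32 {
        Blob.new(0x1a, ($n +> 24) +& 0xff, ($n +> 16) +& 0xff,
                       ($n +>  8) +& 0xff,  $n        +& 0xff)
    }
    else { die "This sketch only handles values below 2**32" }
}

# Value => expected hex encoding, copied from the spec's example table.
my @vectors = 0 => '00', 10 => '0a', 23 => '17', 24 => '1818',
              100 => '1864', 1000 => '1903e8', 1000000 => '1a000f4240';

for @vectors -> (:key($value), :value($hex)) {
    my $got = encode-uint($value).list.map(*.fmt('%02x')).join;
    is $got, $hex, "encode $value";
}

done-testing;
```

Keeping the vectors in a plain list like this makes it cheap to paste in more examples as you work through the spec.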

If the format is truly proprietary, you'll be getting lots of reverse engineering practice of your own. 😉

Now that you have some way of testing correctness, you'll want to be able to diagnose the incorrect bits.  Make sure you have some way of presenting easily readable text expansions of the binary format, because comparing raw buffer contents is tedious.  (I admit to having found bugs in a public test suite only because I'd spent so much time staring at the buffers that I could tell they'd messed up a translation in a way that made the test always pass.)  If the format or protocol has an official text translation/diagnostic/debug format -- CBOR, BSON, Protobuf, etc. all have these -- so much the better; you should support that format as soon as practical.
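
The simplest text expansion is a plain hex dump, which is usually the first diagnostic worth having.  A minimal sketch follows; `hexdump` and its `:width` parameter are names made up for this example, and a real codec would want something format-aware on top of it.

```raku
# Hypothetical helper: show offset, hex bytes, and printable ASCII for each row.
sub hexdump(Blob $buf, Int :$width = 16 --> Str) {
    my @lines;
    for $buf.list.rotor($width, :partial).kv -> $row, @bytes {
        my $hex   = @bytes.map(*.fmt('%02x')).join(' ');
        # Replace anything outside printable ASCII with a dot
        my $ascii = @bytes.map({ 0x20 <= $_ <= 0x7e ?? .chr !! '.' }).join;
        # Pad short final rows so the ASCII column stays aligned
        my $pad   = ' ' x ($width * 3 - 1 - $hex.chars);
        @lines.push: sprintf('%04x  %s%s  |%s|', $row * $width, $hex, $pad, $ascii);
    }
    @lines.join("\n")
}

say hexdump(Blob.new(0x83, 0x01, 0x02, 0x03, |'Raku'.encode('utf-8').list));
```

Printing the expected and actual buffers this way, one above the other, makes off-by-one and byte-order mistakes much easier to spot than eyeballing the raw `Blob` output.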

Once you get down to the nitty-gritty of writing the codec, I find it is very important to make it work before making it fast.  There is a lot of room for tuning Raku code, but it is WAY easier to get things going in the right direction by starting off with idiomatic Raku -- given/when, treating the data buffer as if it were a normal Array (Positional really), and so on.
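
To give a flavor of that idiomatic starting point, here is a small decoder for a toy format invented purely for this example (it is not any real protocol): a type byte, followed by an optional payload.  The sub name `decode-value` and the type codes are assumptions for illustration only.

```raku
# Toy format (hypothetical):
#   0x00          -> False
#   0x01          -> True
#   0x02 nn       -> unsigned 8-bit integer
#   0x03 len ...  -> UTF-8 string of `len` bytes
sub decode-value(Blob $buf, Int $pos is rw) {
    # Index the buffer like any other Positional; $pos tracks the read offset
    given $buf[$pos++] {
        when 0x00 { False }
        when 0x01 { True }
        when 0x02 { $buf[$pos++] }
        when 0x03 {
            my $len = $buf[$pos++];
            my $str = $buf.subbuf($pos, $len).decode('utf-8');
            $pos += $len;
            $str
        }
        default { die "Unknown type byte $_ at offset { $pos - 1 }" }
    }
}

my $buf = Blob.new(0x03, 0x02, 0x68, 0x69, 0x02, 0x2a, 0x01);
my Int $pos = 0;
say decode-value($buf, $pos) while $pos < $buf.elems;
```

None of this is tuned, and that is the point: it reads almost like the spec table, so mistakes are easy to see.  Speed comes later.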

Make sure that every protocol feature you add makes new tests pass, and (I find at least) that you write the encoding and decoding bits at the same time, so you can check that you can round-trip data successfully.  For the love of all that is good, don't implement any obscure features before the core features are rock solid and pass the test suite with nary a hiccup.
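
A round-trip check can be as small as the sketch below.  The `encode-str`/`decode-str` pair is a hypothetical toy codec (one length byte followed by UTF-8 bytes) that exists only to make the example self-contained; substitute your own encoder and decoder.

```raku
use Test;

# Hypothetical toy codec: one length byte, then that many UTF-8 bytes.
sub encode-str(Str $s --> Blob) {
    my $bytes = $s.encode('utf-8');
    die "String too long for a one-byte length" if $bytes.elems > 255;
    Blob.new($bytes.elems, |$bytes.list)
}

sub decode-str(Blob $b --> Str) {
    $b.subbuf(1, $b[0]).decode('utf-8')
}

# Include edge cases: empty input, multi-byte characters, maximum length.
for '', 'a', 'héllo', 'x' x 255 -> $sample {
    is decode-str(encode-str($sample)), $sample,
       "round-trip of { $sample.chars }-character string";
}

done-testing;
```

Round-tripping only proves the two halves agree with each other, so it complements the known-correct examples rather than replacing them.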

After that, when you think you're ready to optimize, write performance /tests/ first.  Make sure you test with data that will both use your codec in a typical manner, and also test out all the odd corners.  You're looking for things that seem weirdly slow; this usually indicates a thinko like copying the entire buffer each time you read a byte from it, or somesuch.
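
Here is a sketch of the kind of quick timing harness I mean.  Both `sum-by-index` and `sum-by-copy` are hypothetical stand-ins for a decoder's inner loop, and `time-it` is a helper invented for this example; the second stand-in deliberately makes the copy-the-buffer mistake so you can see how it scales as the input grows.

```raku
# Stand-in for a decoder that reads bytes by index (no copying).
sub sum-by-index(Blob $buf) {
    my $total = 0;
    $total += $buf[$_] for ^$buf.elems;
    $total
}

# Stand-in for the thinko: re-slicing the buffer after every byte read.
sub sum-by-copy(Blob $buf) {
    my $total = 0;
    my $rest  = $buf;
    while $rest.elems {
        $total += $rest[0];
        $rest = $rest.subbuf(1);   # copies all remaining bytes each time
    }
    $total
}

# Hypothetical helper: best wall-clock time, in seconds, over a few runs.
sub time-it(&code, Int :$runs = 3) {
    my @times = (^$runs).map: { my $start = now; code(); (now - $start).Num };
    @times.min
}

for 1_000, 4_000, 16_000 -> $size {
    my @bytes = (^256).roll($size);
    my $buf   = Blob.new(@bytes);
    say sprintf('%6d bytes: by-index %.4fs, by-copy %.4fs',
                $size, time-it({ sum-by-index($buf) }), time-it({ sum-by-copy($buf) }));
}
```

Timing several input sizes matters more than any single number: if quadrupling the input makes one variant much more than four times slower, something is probably copying data it shouldn't.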

Once you've got the obvious performance kinks worked out, come by and ask again, and we can give you further advice from there.  Or heck, just come visit us on IRC (#raku at Libera.chat), and we'll be happy to help.  (Do stick around for a while though, because traffic varies strongly by time of day and day of week.)

Best Regards,


Geoff (japhb)


[1]  I'm a bit of a nut for it, really.  In the distant past, I wrapped C libraries to get the job done, but more recently I've done them as plain Raku code (and sometimes NQP, the language that Rakudo is written in).

I've written some of the binary format codecs for Raku:

 * https://github.com/japhb/CBOR-Simple
 * https://github.com/japhb/BSON-Simple
 * https://github.com/japhb/Terminal-ANSIParser
 * https://github.com/japhb/TinyFloats

Modified or tuned others:

 * https://github.com/samuraisam/p6-pb/commits?author=japhb
 * https://github.com/japhb/serializer-perf
 * (Lots of stuff spread across various Cro
   <https://github.com/croservices> repositories)

Added a spec extension for an existing standardized format (CBOR):

 * https://github.com/japhb/cbor-specs/blob/main/capture.md

And I think I forgot a few things.  😁

