I love doing binary codecs for Raku[1]! How you approach this really
depends on what formats and protocols you want to create Raku modules for.
The first thing you need to be able to do is test if your codec is
correct. It is notoriously easy to make a tiny mistake in a protocol
implementation and (especially for binary protocols) miss it entirely
because it only happens in certain edge cases.
If the format or protocol in question is open and has one or more public
test suites, you're in good shape. Raku gives you a lot of power for
refactoring tests into something very clean, and I've had good success
doing this
with several formats.
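For instance, a data-driven test file stays tiny even as the vector
list grows. Here's a rough sketch using the standard Test module;
decode-thing is just a placeholder for whatever your module exports,
and the hex vectors happen to be standard CBOR encodings of the sort
RFC 8949's appendix lists:

    use Test;
    # use My::Codec;   # hypothetical module exporting decode-thing

    # Known-good vectors: hex bytes => expected decoded value
    my @vectors =
        '00'   => 0,      # unsigned 0
        '1864' => 100,    # unsigned 100
        '6161' => 'a',    # one-character text string
    ;

    for @vectors -> $vector {
        my $buf = Buf.new($vector.key.comb(2).map({ :16($_) }));
        is-deeply decode-thing($buf), $vector.value,
                  "decodes 0x{$vector.key} correctly";
    }

    done-testing;

Adding a new vector is then just one more line in the table.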
If there is no public test suite, but you can find RFCs or other
detailed specs, you can often bootstrap a bespoke test suite from the
examples in the spec documents. Failing that, sometimes you can find
sites (even Wikipedia, for the most common formats) that have
known-correct examples to start with, or have published reverse
engineering of files or captured data.
If the format is truly proprietary, you'll be getting lots of reverse
engineering practice of your own. 😉
Now that you have some way of testing correctness, you'll want to be
able to diagnose the incorrect bits. Make sure you have some way of
presenting easily readable text expansions of the binary format, because
just comparing raw buffer contents can be rather tedious (though I admit
to having found bugs in a public test suite by spending so much time
staring at the buffers I could tell they'd messed up a translation in a
way that made the test always pass). If the format or protocol has an
official text translation/diagnostic/debug format -- CBOR, BSON,
Protobuf, etc. all have these -- so much the better; support that
format as soon as practical.
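Even without an official diagnostic notation, a ten-line hex dump
helper pays for itself very quickly when you're comparing expected and
actual buffers side by side. A minimal sketch (the names here are my
own invention, not from any particular module):

    sub hex-dump(Blob $buf) {
        # 16 bytes per row: offset, hex bytes, printable-ASCII gutter
        for $buf.list.rotor(16, :partial).kv -> $row, @bytes {
            my $hex   = @bytes.fmt('%02x', ' ');
            my $ascii = @bytes.map({ 32 <= $_ <= 126 ?? .chr !! '.' }).join;
            say sprintf('%08x  %-47s  %s', $row * 16, $hex, $ascii);
        }
    }

    hex-dump 't/data/sample.bin'.IO.slurp(:bin);   # path is just an example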
Once you get down to the nitty-gritty of writing the codec, I find it is
very important to make it work before making it fast. There is a lot of
room for tuning Raku code, but it is WAY easier to get things going in
the right direction by starting off with idiomatic Raku -- given/when,
treating the data buffer as if it were a normal Array (Positional,
really), and so on.
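Purely by way of illustration -- the format here is invented for the
example -- a first cut at a decoder for a simple tag-length-value
layout needs nothing fancier than an index and a given/when:

    # Imaginary TLV format: a tag byte (0x01 = big-endian unsigned int,
    # 0x02 = UTF-8 string), a length byte, then that many payload bytes.
    sub decode-tlv(Blob $buf) {
        my $pos = 0;
        my @values;
        while $pos < $buf.elems {
            my $tag     = $buf[$pos++];
            my $len     = $buf[$pos++];
            my $payload = $buf.subbuf($pos, $len);
            $pos += $len;
            given $tag {
                when 0x01 {
                    # Accumulate payload bytes as a big-endian unsigned int
                    @values.push: $payload.list.reduce({ $^a +< 8 + $^b }) // 0
                }
                when 0x02 { @values.push: $payload.decode('utf-8') }
                default   { die "Unknown tag $tag at offset {$pos - $len - 2}" }
            }
        }
        @values
    }

It won't win any benchmarks, but it's short enough to be obviously
correct, and that's the point at this stage.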
Make sure that every protocol feature you add makes some previously
failing tests pass, and (I find, at least) write the encoding and
decoding bits at the same time, so you can check that data round-trips
successfully. For the love of all that is good, don't implement any
obscure features before the core features are rock solid and pass the
test suite with nary a hiccup.
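The round-trip half of that can be as simple as this sketch, where
encode-thing and decode-thing again stand in for whatever your module
actually exports:

    use Test;
    # use My::Codec;   # hypothetical: exports encode-thing / decode-thing

    my @samples = 0, 1, -1, 2**32, '', 'hello', [1, 2, 'mixed'];

    for @samples -> $value {
        is-deeply decode-thing(encode-thing($value)), $value,
                  "round-trips {$value.raku}";
    }

    done-testing;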
After that, when you think you're ready to optimize, write performance
/tests/ first. Make sure you test with data that both exercises your
codec in a typical manner and pokes at all the odd corners. You're
looking for things that seem weirdly slow; this usually indicates a
thinko like copying the entire buffer each time you read a byte from
it, or some such.
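A performance test here doesn't need to be fancy, either; timing a loop
over representative data with now gets you most of the way
(decode-thing and the data file are placeholders again):

    # use My::Codec;   # hypothetical module exporting decode-thing

    my Blob $typical = 't/data/typical.bin'.IO.slurp(:bin);

    my $start   = now;
    decode-thing($typical) for ^1_000;
    my $elapsed = now - $start;

    say sprintf('1000 typical decodes: %.3fs (%.1f µs each)',
                $elapsed, $elapsed / 1_000 * 1_000_000);

(The classic way to hit the thinko above, for what it's worth, is a
per-byte $buf .= subbuf(1) somewhere in a read helper; keeping the
buffer intact and advancing an integer index instead turns an
accidentally quadratic decode back into a linear one.)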
Once you've got the obvious performance kinks worked out, come by and
ask again, and we can give you further advice from there. Or heck, just
come visit us on IRC (#raku on Libera.Chat), and we'll be happy to
help. (Do stick around for a while though, because traffic varies
strongly by time of day and day of week.)
Best Regards,
Geoff (japhb)
[1] I'm a bit of a nut for it, really. In the distant past, I wrapped
C libraries to get the job done, but more recently I've written the
codecs as plain Raku code (and sometimes NQP, the language that Rakudo
is written in).
I've written some of the binary format codecs for Raku:
* https://github.com/japhb/CBOR-Simple
* https://github.com/japhb/BSON-Simple
* https://github.com/japhb/Terminal-ANSIParser
* https://github.com/japhb/TinyFloats
Modified or tuned others:
* https://github.com/samuraisam/p6-pb/commits?author=japhb
* https://github.com/japhb/serializer-perf
* (Lots of stuff spread across various Cro repositories:
  https://github.com/croservices)
Added a spec extension for an existing standardized format (CBOR):
* https://github.com/japhb/cbor-specs/blob/main/capture.md
And I think I forgot a few things. 😁