Re: [capnproto] Need help making pycapnp/capnproto work across python and extension boundaries

Kenton Varda Sun, 04 Jun 2017 15:29:22 -0700

Hi Vitaly,

You can direct Cap'n Proto to write bytes to an arbitrary target by
creating a custom subclass of kj::OutputStream that does whatever you
need and passing it to capnp::writeMessage().


You can also use MessageBuilder::getSegmentsForOutput() to get direct
pointers to the message content without any copies. You can construct a
SegmentArrayMessageReader from these segments elsewhere to read them.
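
As a rough illustration of the segment-based route, the sketch below copies
a list of segments into a caller-supplied byte buffer using the standard
Cap'n Proto stream framing (a segment table followed by the segment
contents), which is the layout messageToFlatArray produces. It is a minimal
sketch under stated assumptions, not library code: `Segment`, `framedSize`
and `frameInto` are hypothetical names, and `Segment` stands in for the
kj::ArrayPtr<const capnp::word> entries that getSegmentsForOutput() returns
so that the example compiles without Cap'n Proto installed.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for one message segment: a pointer plus a length
// counted in 8-byte words (what kj::ArrayPtr<const capnp::word> describes).
struct Segment {
    const void* data;
    std::size_t words;
};

constexpr std::size_t kWordBytes = 8;

// Bytes needed for the framed message: segment table plus segment contents.
std::size_t framedSize(const std::vector<Segment>& segments) {
    // Table: (segment count - 1), then one size per segment, 4 bytes each,
    // rounded up to a whole word.
    std::size_t tableBytes = 4 * (1 + segments.size());
    tableBytes = (tableBytes + kWordBytes - 1) / kWordBytes * kWordBytes;
    std::size_t total = tableBytes;
    for (const Segment& s : segments) total += s.words * kWordBytes;
    return total;
}

// Store a 32-bit value in little-endian byte order.
static void putU32LE(unsigned char* out, std::uint32_t v) {
    out[0] = static_cast<unsigned char>(v & 0xff);
    out[1] = static_cast<unsigned char>((v >> 8) & 0xff);
    out[2] = static_cast<unsigned char>((v >> 16) & 0xff);
    out[3] = static_cast<unsigned char>((v >> 24) & 0xff);
}

// Write the framed message into `out`, which needs no particular alignment.
// Returns the number of bytes written; throws if the buffer is too small.
std::size_t frameInto(const std::vector<Segment>& segments,
                      unsigned char* out, std::size_t capacity) {
    if (segments.empty()) throw std::invalid_argument("message has no segments");
    const std::size_t needed = framedSize(segments);
    if (capacity < needed) throw std::length_error("output buffer too small");

    unsigned char* p = out;
    putU32LE(p, static_cast<std::uint32_t>(segments.size() - 1));
    p += 4;
    for (const Segment& s : segments) {
        putU32LE(p, static_cast<std::uint32_t>(s.words));
        p += 4;
    }
    // An even segment count leaves the table 4 bytes short of a word boundary.
    if (segments.size() % 2 == 0) {
        putU32LE(p, 0);
        p += 4;
    }
    // One copy per segment, straight into the caller's buffer.
    for (const Segment& s : segments) {
        if (s.words != 0) std::memcpy(p, s.data, s.words * kWordBytes);
        p += s.words * kWordBytes;
    }
    return needed;
}
```

With real Cap'n Proto types, the segment list would come from
getSegmentsForOutput(), and the destination could be memory owned by the
Python side, so the only copy is the one into that buffer.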

It sounds like the limitations here are on the Python side, which I don't
know very much about.

-Kenton

On Mon, May 8, 2017 at 4:38 PM, vitaly numenta <
[email protected]> wrote:

>> pycapnp builders are actually wrapping a capnp::DynamicStruct::Builder
>> under the hood, which is easy to cast back and forth to your native builder
>> type. You just need pycapnp to give you access somehow.
>
>
> Dear Kenton, regarding the above: we're working with your earlier
> suggestion to pass byte buffers across Python and C++ extension
> environments. We believe that this results in a more robust and portable
> implementation, since we have no control over which version of pycapnp the
> user desires to use, including which version of capnproto that pycapnp
> includes, and the compiler toolchain that built that pycapnp's capnproto
> .so on the user's machine, which build flags, etc. versus the build of
> capnproto in our own binary wheel containing our python extension.
>
> To this end, we often need to convert between capnproto readers/builders
> and flat array encodings (from messageToFlatArray) encapsulated as Python
> byte strings. Since our machine learning models may be huge (GBs), the
> multiple levels of copying are prohibitively expensive in memory (and
> possibly in time), so it's important to eliminate as many levels of
> copying as possible. Presently, pycapnp only exposes `to_bytes`, a method
> that extracts data bytes from a builder via `capnp::messageToFlatArray`
> and then copies them to a Python byte string. Unfortunately, capnproto
> doesn't provide `capnp::messageToFlatArray` for readers, so when a reader
> is involved, yet another level of copying is needed to convert the reader
> to a builder before applying `capnp::messageToFlatArray`.
>
> I believe that the problem is not unique to our extension, and anyone
> attempting to implement this type of binding would run into this issue,
> especially if they are cognizant of the memory resource and performance
> implications.
>
> Ideally, I think it would be great to be able to use something like
> `capnp::messageToFlatArray` on readers as well as builders and also have it
> copy the output efficiently to a user-provided byte-aligned buffer instead
> of returning `kj::Array<capnp::word>`. This way, several levels of copying
> would be eliminated, and instantaneous memory utilization would be cut
> several-fold.
>
> --
> You received this message because you are subscribed to the Google Groups
> "Cap'n Proto" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> Visit this group at https://groups.google.com/group/capnproto.
>

