Hey all,

Wanted to share a project we've been working on at Query.Farm: vgi-rpc, an open-source RPC framework built on Apache Arrow IPC.

It started with a dream: can I build services using Arrow without being locked into the gRPC ecosystem? Last April, Matt Topol and I were having dinner after PyData Charlottesville, and he said something that really stuck with me: "Just use Arrow." So I did exactly that with vgi-rpc. There is no serialization format other than Arrow (no protobuf, no msgpack, just Arrow).

The elevator pitch: define RPC services as plain Python `Protocol` classes, and get typed proxies with full IDE autocompletion. No `.proto` files, no codegen, no protobuf dependency. Your type annotations *are* the schema; the framework infers Arrow schemas from them automatically.

Because it's all Arrow, this RPC framework can work in any language that supports Arrow IPC, but I chose to get started with Python.

vgi-rpc takes a different approach compared to Flight. It's aimed at people who want typed, ergonomic RPC without the complexity or lock-in of gRPC, but with all the benefits of Arrow. And being Arrow, it's really useful if your RPC responses are variable-length or, as I say, "stream."

- **Custom methods** - Flight gives you a fixed API (`do_get`, `do_put`, `do_action`). vgi-rpc lets you define whatever methods you want, with whatever signatures you want. Your proxy has real method names and type-checked arguments.
- **Transport flexibility** - Flight is gRPC-only, and as Arrow developers we've discussed the complexity and restrictions that brings. vgi-rpc runs over in-process pipes, subprocess stdin/stdout, Unix domain sockets, shared memory, or HTTP. Same service code, different transport. Pipes are great for testing; shared memory gives you zero-copy between co-located processes. Shared memory still needs a bit more work in PyArrow to reach the memory bandwidth of your machine; that depends on these PRs (https://github.com/apache/arrow/pull/49262, https://github.com/apache/arrow/pull/49286). On my MacBook Air M3 I got it up to 29 GB/s.
- **Shared memory transport** - `ShmPipeTransport` lets two processes share Arrow record batches by passing a pointer over a pipe instead of serializing the whole thing. Flight has no equivalent right now. If you're running a pipeline of worker processes on the same machine, this is a significant win: it almost completely eliminates the overhead of being in an external process.
- **No gRPC dependency** - Flight pulls in gRPC and protobuf (including C++ compilation). vgi-rpc's core dependency is just PyArrow. The HTTP transport uses Falcon/httpx, which are pure Python. When vgi-rpc gets ported to other languages, it will depend only on the Arrow implementation in that language.
- **Simpler streaming** - vgi-rpc supports producer streams (server pushes batches) and lockstep exchange (request-response ping-pong). It deliberately skips full-duplex concurrent streaming. Lockstep is easier to reason about and covers most use cases.

**Where Flight still wins:**

- Multi-language ecosystem (if you need Java/Go/C++ clients), though I have vgi-rpc implementations in C++, TypeScript, Swift, and Go in progress. Stay tuned.
- Concurrent bidirectional streaming (gRPC-style full duplex)

**Other things worth mentioning:**

- Transparent large batch externalization to S3/GCS. Batches above a threshold get offloaded to cloud storage automatically, so you can host behind servers with request size limits (looking at Cloudflare Workers, AWS Lambda, and Google Cloud Run).
- Pluggable auth (JWT, API key, whatever) on the HTTP transport
- Built-in introspection endpoint for service discovery
- OpenTelemetry instrumentation
- Strict mypy typing throughout
- Python 3.13+, Apache 2.0 licensed

Docs: https://vgi-rpc-python.query.farm/
PyPI: https://pypi.org/project/vgi-rpc/
GitHub: https://github.com/Query-farm/vgi-rpc-python

Happy to answer any questions or hear feedback. There is more to the VGI story; this is just the bottom layer of what's coming.
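
To make the "type annotations *are* the schema" idea concrete, here is a minimal sketch that uses only the Python standard library. It is not vgi-rpc's actual implementation or API: the `Calculator` protocol, the `infer_schemas` helper, and the `ARROW_TYPE_NAMES` mapping are hypothetical names made up for illustration, and Arrow types are represented as plain strings rather than PyArrow objects so the example has no dependencies.

```python
import inspect
from typing import Protocol, get_type_hints

# Hypothetical mapping from Python annotations to Arrow type names.
# Plain strings keep the sketch dependency-free; a real implementation
# would presumably produce pyarrow.DataType objects instead.
ARROW_TYPE_NAMES = {
    int: "int64",
    float: "float64",
    str: "utf8",
    bool: "bool",
    bytes: "binary",
}


class Calculator(Protocol):
    """Hypothetical service definition, written as a plain Protocol."""

    def add(self, a: float, b: float) -> float: ...

    def greet(self, name: str, excited: bool) -> str: ...


def infer_schemas(protocol: type) -> dict[str, dict[str, object]]:
    """Return {method: {"params": {arg: arrow_type}, "result": arrow_type}}."""
    schemas: dict[str, dict[str, object]] = {}
    for name, func in inspect.getmembers(protocol, inspect.isfunction):
        if name.startswith("_"):
            continue  # skip dunder/private helpers such as __init__
        hints = get_type_hints(func)  # unannotated `self` is not included
        result = hints.pop("return", None)
        schemas[name] = {
            "params": {arg: ARROW_TYPE_NAMES[tp] for arg, tp in hints.items()},
            "result": ARROW_TYPE_NAMES.get(result),
        }
    return schemas


if __name__ == "__main__":
    for method, schema in infer_schemas(Calculator).items():
        print(method, schema)
```

The point of the sketch is only that everything needed to describe a method's request and response layout is already present in the signature, so no separate `.proto` file or codegen step is required.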

Cheers,
Rusty
https://query.farm
