After many of my own attempts, research, digging through source code, and
speaking with folks in various channels in the community, I'm starting to
wonder if *anyone* has *ever* successfully gotten Avro to work with Python
Functions.

(I don't just mean ingesting a byte array and decoding it with fastavro. I
mean actually using the built-in schema support that Pulsar Python
Functions are intended to provide, which is the whole point of combining
built-in Avro internals with multi-language support.) This capability is a
core part of the community offering for Python, and since we've
standardized on Avro internals, I'm concerned that we may have a gap in our
ability to support this combination of technologies. Such a gap can hurt
adoption in organizations that have already standardized on Avro and are
heavily invested in both Python and Java (for example, across different
teams).

I've raised this question in various places and groups for almost 3 years
now, and I'm starting to wonder whether *anybody* has actually done it.

I've seen examples of using Python producers and consumers with Avro, but
that case is different because those interfaces allow the schema to be
specified explicitly. It's not clear from the source code how (or whether)
this can currently be done with the Python Functions API.

If there's a feature gap here, then we need to decide whether addressing it
is a priority. This is becoming increasingly important as the Python user
base grows significantly, but I'd like to hear thoughts from others,
especially since Lari recently asked whether we should be considering wider
changes to the Function API internals.

Devin G. Bost
