> Cross-language support for PubSub is not yet implemented but it can be
> done similarly to ReadFromKafka. There are still some limitations regarding
> the coders, i.e. only coders which are available in both the Java and the
> Python SDK (standard coders) can be used.
>
Yeah, I was just looking through the source and noticed a few things right
off the bat:
- expansion service needs to be passed as an arg to each external xform
- why not make this part of the pipeline options? does it really
need to vary from transform to transform?
- explicit coders need to be passed to each external xform for each item
to be serialized, key and value coders provided separately
- in python we get auto-detection of coders based on type hints or
data type, including compound data types (e.g. Tuple[int, str, Dict[str,
float]])
- in python we also have a fallback to the pickle coder for complex
types without builtin coders. is the pickle coder supported by java?
- is there a way to express compound java coders as a string?
- why not pass the results in and out of the java xform using
bytestrings, and then use python-based coders in python?
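To make that last idea concrete, here's a rough sketch in plain Python of what I mean — this is not Beam's actual coder API, and `coder_for` / `STANDARD_CODERS` are made-up names. The point is that type-hint-based coder selection with a pickle fallback only works on the Python side, so a Java xform would have to treat the payloads as opaque bytes:

```python
# Hypothetical sketch: resolve an encode/decode pair from a type hint,
# with a pickle fallback for compound types, then round-trip a value
# through raw bytes the way a bytestring-passthrough xform might.
import pickle
import struct
from typing import Any, Callable, Tuple

# "Standard" coders for simple types (illustrative only).
STANDARD_CODERS = {
    int: (lambda v: struct.pack(">q", v), lambda b: struct.unpack(">q", b)[0]),
    str: (lambda v: v.encode("utf-8"), lambda b: b.decode("utf-8")),
    bytes: (lambda v: v, lambda b: b),
}


def coder_for(typehint: Any) -> Tuple[Callable, Callable]:
    """Pick a standard coder if one exists, else fall back to pickle."""
    if typehint in STANDARD_CODERS:
        return STANDARD_CODERS[typehint]
    # Pickle handles arbitrary compound types, but only on the Python
    # side -- a Java SDK harness could not decode these payloads.
    return (pickle.dumps, pickle.loads)


encode, decode = coder_for(dict)  # no standard coder -> pickle fallback
payload = encode({"scores": [1.5, 2.0]})
assert isinstance(payload, bytes)  # opaque bytes cross the language boundary
assert decode(payload) == {"scores": [1.5, 2.0]}
```

If the Java side only ever sees `bytes`, the cross-language coder problem reduces to agreeing on a single bytes coder, and all the interesting encoding stays in Python on both ends of the xform.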
> As of now the user experience is a bit rough, but we will be improving that
> very soon. Happy to help out if you want to contribute a cross-language
> ReadFromPubSub.
>
We're pretty locked into Flink, so adopting Kafka or PubSub is going to be
a requirement, which means we're going the external transform route either
way. I'd love to hear more about A) what the other limitations of
external transforms are, and B) what you have planned to improve the UX.
I'm sure we can find something to contribute!
-chad