Flink needs to know upfront which types it deals with in order to set up
the serialization stack between operators.
As such, generally speaking, you will have to use some generic container
for transmitting data (e.g., a String or a Jackson ObjectNode) and
either work on it directly or
map it to a specific type /within the scope of a single
function/ based on some custom logic.
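A minimal sketch of that generic-container approach (the class name and the "eventType" field are made up for illustration; this assumes Jackson on the classpath, which Flink itself ships with):

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

// Hypothetical helper: parse each incoming String lazily and inspect
// only the fields you care about, with no compile-time schema per event.
public class EventInspector {

    private static final ObjectMapper MAPPER = new ObjectMapper();

    // Returns true if the JSON document contains the given top-level field.
    public static boolean hasField(String json, String field) {
        try {
            JsonNode node = MAPPER.readTree(json);
            return node.has(field);
        } catch (Exception e) {
            return false; // not valid JSON
        }
    }

    // Inside a Flink job, this logic would live in a single function, e.g.:
    //
    //   DataStream<String> raw = ...;
    //   DataStream<ObjectNode> parsed = raw
    //       .map(s -> (ObjectNode) MAPPER.readTree(s))
    //       .returns(ObjectNode.class)
    //       .filter(n -> n.has("eventType"));
    //
    // ObjectNode is one concrete type covering all 300+ event shapes, so
    // Flink needs no per-event-type TypeInformation at compile time.
}
```

The trade-off is that ObjectNode falls back to generic (Kryo) serialization rather than Flink's specialized serializers, which is usually acceptable when schemas are only known at runtime.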
There may be other approaches, but we'd have to know more about the
specific use case and requirements to help you there (e.g., what does
/your user/ interact with?).
My understanding is that you have a single source for all these
events, and you now want users to define pipelines that each process a
specific subset of them?
On 1/28/2021 5:44 AM, Devin Bost wrote:
I want to know whether it's possible in Flink to parse strings into a
dynamic JSON object without having to know the primitive type
details at compile time.
We have over 300 event types to process, and I need a way to load the
types at runtime. I only need to know if certain fields exist on the
incoming objects, and the object schemas are all different except for
certain fields.
Every example I can find shows Flink users specifying the full type
information at compile time, but there's no way this will scale.
It's possible for us to lookup primitive type details at runtime from
JSON, but I'll still need a way to process that JSON in Flink to
extract the metadata if it's required. So, that brings me back to the
original issue.
How can I do this in Flink?
--
Devin G. Bost