Hi,

One thing you can do is read the record using Avro, keeping `Result` as
`bytes`, and then in a subsequent mapping function change the record type
and deserialize the result. In the DataStream API:

source.map(new MapFunction<RecordWithBytes, RecordWithDeserializedResult>() { ... })
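
For the second step, here is a rough sketch of what such a mapping function
could look like. The class names (DeserializeResult, EnrichedRecord) are just
placeholders, and it assumes the outer record arrives as an Avro GenericRecord,
that you know the embedded Result schema when building the job, and that
Jackson is acceptable for the JSON case:

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.DecoderFactory;
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;

import com.fasterxml.jackson.databind.ObjectMapper;

import java.nio.ByteBuffer;

public class DeserializeResult
        extends RichMapFunction<GenericRecord, DeserializeResult.EnrichedRecord> {

    /** Placeholder output type: the outer fields plus the decoded Result payload. */
    public static class EnrichedRecord {
        public String name;
        public String id;
        public Object result; // GenericRecord or JsonNode, depending on ResultEncoding
    }

    // Avro schema of the embedded "Result" payload, assumed known at build time.
    private static final String RESULT_SCHEMA_JSON = "..."; // paste the inner schema here

    private transient GenericDatumReader<GenericRecord> avroReader;
    private transient ObjectMapper jsonMapper;

    @Override
    public void open(Configuration parameters) {
        avroReader = new GenericDatumReader<>(new Schema.Parser().parse(RESULT_SCHEMA_JSON));
        jsonMapper = new ObjectMapper();
    }

    @Override
    public EnrichedRecord map(GenericRecord outer) throws Exception {
        // Avro "bytes" fields come back as a ByteBuffer on a GenericRecord.
        ByteBuffer buf = (ByteBuffer) outer.get("Result");
        byte[] resultBytes = new byte[buf.remaining()];
        buf.get(resultBytes);

        EnrichedRecord out = new EnrichedRecord();
        out.name = outer.get("Name").toString();
        out.id = outer.get("ID").toString();

        String encoding = outer.get("ResultEncoding").toString();
        if ("application/avro".equals(encoding)) {
            out.result = avroReader.read(
                    null, DecoderFactory.get().binaryDecoder(resultBytes, null));
        } else if ("application/json".equals(encoding)) {
            out.result = jsonMapper.readTree(resultBytes);
        } else {
            throw new IllegalArgumentException("Unknown Result encoding: " + encoding);
        }
        return out;
    }
}

You would read the outer records as GenericRecord (for example with
AvroDeserializationSchema.forGeneric(outerSchema) on the Kafka source) and then
apply the function with source.map(new DeserializeResult()). From there you can
extract whatever fields you need from the decoded result and aggregate as usual.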

Best,
Piotrek

Wed, 14 Apr 2021 at 03:17 Sumeet Malhotra <sumeet.malho...@gmail.com>
wrote:

> Hi,
>
> I'm reading data from Kafka, which is Avro encoded and has the following
> general schema:
>
> {
>   "name": "SomeName",
>   "doc": "Avro schema with variable embedded encodings",
>   "type": "record",
>   "fields": [
>     {
>       "name": "Name",
>       "doc": "My name",
>       "type": "string"
>     },
>     {
>       "name": "ID",
>       "doc": "My ID",
>       "type": "string"
>     },
>     {
>       "name": "Result",
>       "doc": "Result data, could be encoded differently",
>       "type": "bytes"
>     },
>     {
>       "name": "ResultEncoding",
>       "doc": "Result encoding media type (e.g. application/avro,
> application/json)",
>       "type": "string"
>     }
>   ]
> }
>
> Basically, the "Result" field is bytes whose interpretation depends on the
> "ResultEncoding" field, i.e. either Avro or JSON. The "Result" byte
> stream also has its own well-defined schema.
>
> My use case involves extracting/aggregating data from within the embedded
> "Result" field. What would be the best approach to perform this runtime
> decoding and extraction of fields from the embedded byte data? Would
> user-defined functions help in this case?
>
> Thanks in advance!
> Sumeet
>
>
