Hi,
One thing you can do is read this record using Avro, keeping `Result` as
`bytes`, and then, in a subsequent mapping function, change the record type
and deserialize the result according to `ResultEncoding`. In the DataStream
API:
source.map(new MapFunction<record_with_bytes,
record_with_deserialized_result>() { ... })
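
To make the idea a bit more concrete, here is a minimal sketch of the
decoding step in plain Java, without any Flink or Avro dependencies so that
it runs on its own. The class names (RawRecord, DecodedRecord,
ResultDecodingMapper) and the register() helper are hypothetical and only for
illustration; the per-encoding decoders are passed in as functions, since the
actual Avro and JSON decoding depends on the schema of your embedded data.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical: the outer record as read by Avro, with "Result" still raw bytes.
final class RawRecord {
    final String name;
    final String id;
    final byte[] result;
    final String resultEncoding;

    RawRecord(String name, String id, byte[] result, String resultEncoding) {
        this.name = name;
        this.id = id;
        this.result = result;
        this.resultEncoding = resultEncoding;
    }
}

// Hypothetical: the same record after the embedded "Result" has been decoded.
final class DecodedRecord {
    final String name;
    final String id;
    final Object result;

    DecodedRecord(String name, String id, Object result) {
        this.name = name;
        this.id = id;
        this.result = result;
    }
}

// Picks a decoder based on ResultEncoding and applies it to the Result bytes.
final class ResultDecodingMapper implements Function<RawRecord, DecodedRecord> {
    private final Map<String, Function<byte[], Object>> decoders = new HashMap<>();

    // Registers a decoder for one media type, e.g. "application/json".
    ResultDecodingMapper register(String mediaType, Function<byte[], Object> decoder) {
        decoders.put(mediaType, decoder);
        return this;
    }

    @Override
    public DecodedRecord apply(RawRecord in) {
        Function<byte[], Object> decoder = decoders.get(in.resultEncoding);
        if (decoder == null) {
            // Fail loudly on encodings that no decoder was registered for.
            throw new IllegalArgumentException(
                "Unsupported ResultEncoding: " + in.resultEncoding);
        }
        return new DecodedRecord(in.name, in.id, decoder.apply(in.result));
    }
}
```

In a Flink job, the body of apply() would go into MapFunction.map() instead,
and the decoders would need to be serializable (or created in open() of a
RichMapFunction), since Flink ships the function to the task managers.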
Best,
Piotrek
On Wed, 14 Apr 2021 at 03:17, Sumeet Malhotra <[email protected]>
wrote:
> Hi,
>
> I'm reading data from Kafka, which is Avro encoded and has the following
> general schema:
>
> {
> "name": "SomeName",
> "doc": "Avro schema with variable embedded encodings",
> "type": "record",
> "fields": [
> {
> "name": "Name",
> "doc": "My name",
> "type": "string"
> },
> {
> "name": "ID",
> "doc": "My ID",
> "type": "string"
> },
> {
> "name": "Result",
> "doc": "Result data, could be encoded differently",
> "type": "bytes"
> },
> {
> "name": "ResultEncoding",
> "doc": "Result encoding media type (e.g. application/avro, application/json)",
> "type": "string"
> }
> ]
> }
>
> Basically, the "Result" field is bytes whose interpretation depends upon
> the "ResultEncoding" field i.e. either avro or json. The "Result" byte
> stream has its own well defined schema also.
>
> My use case involves extracting/aggregating data from within the embedded
> "Result" field. What would be the best approach to perform this runtime
> decoding and extraction of fields from the embedded byte data? Would user
> defined functions help in this case?
>
> Thanks in advance!
> Sumeet
>
>