Hi Sebastian,

You can check out the logic yourself by looking at

https://github.com/apache/flink/blob/master/flink-formats/flink-json/src/main/java/org/apache/flink/formats/json/debezium/DebeziumJsonDeserializationSchema.java

and

https://github.com/apache/flink/blob/master/flink-formats/flink-json/src/main/java/org/apache/flink/formats/json/JsonRowDataDeserializationSchema.java

So your use case should actually work. Could you help investigate what is going wrong? In any case, we should open an issue for it, since it looks like a bug.
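
As a starting point for narrowing this down, here is a minimal Python sketch of the lenient, by-name behaviour described in the question: keys not in the declared row are ignored, missing keys become null, and only invalid JSON raises an error. This is not Flink's code and is not meant to mirror the linked classes. The function name `extract_objectives` and the sample record are hypothetical, and the record is simplified (a real Debezium message wraps the row in an envelope). Running a captured raw message through something like this can help tell a genuinely corrupt message apart from a schema-matching problem.

```python
import json


def extract_objectives(raw_message, wanted=("threshold", "threshold_period")):
    """Parse one JSON record and pick the wanted keys from `objectives` by name.

    Returns None when `objectives` is absent or is not a JSON object.
    Missing keys become None; extra keys (such as `items`) are ignored.
    Raises json.JSONDecodeError (a ValueError) when the message is not valid JSON.
    """
    record = json.loads(raw_message)
    if not isinstance(record, dict):
        return None
    objectives = record.get("objectives")
    if not isinstance(objectives, dict):
        return None
    return {key: objectives.get(key) for key in wanted}


# Hypothetical sample record modelled on the snippet in the original question;
# `threshold_period` is deliberately left out to show the missing-key case.
sample = """
{"objectives": {"items": [{"id": 1, "label": "test 1", "size": 1000.0}],
                "threshold": 10.0,
                "max_ms": 30000.0}}
"""

print(extract_objectives(sample))
```

If a captured message raises an error here as well, the message itself is malformed (for example, a missing comma between fields) and the problem is not in the table definition.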

Regards,
Timo

On 12.03.21 21:10, Magri, Sebastian wrote:
I verified that it's still accepted by the connector, but it's no longer in the documentation.

It doesn't seem to help in my case.

Thanks,
Sebastian
------------------------------------------------------------------------
*From:* Magri, Sebastian <sebastian.ma...@radancy.com>
*Sent:* Friday, March 12, 2021 18:50
*To:* Timo Walther <twal...@apache.org>; ro...@apache.org <ro...@apache.org>
*Cc:* user <user@flink.apache.org>
*Subject:* Re: [Flink SQL] Leniency of JSON parsing
Hi Roman!

Seems like that option is no longer available.

Best Regards,
Sebastian
------------------------------------------------------------------------
*From:* Roman Khachatryan <ro...@apache.org>
*Sent:* Friday, March 12, 2021 16:59
*To:* Magri, Sebastian <sebastian.ma...@radancy.com>; Timo Walther <twal...@apache.org>
*Cc:* user <user@flink.apache.org>
*Subject:* Re: [Flink SQL] Leniency of JSON parsing
Hi Sebastian,

Did you try setting 'debezium-json.map-null-key.mode' to 'DROP' [1]?

I'm also pulling in Timo who might know better.

[1]
https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/connectors/formats/debezium.html#debezium-json-map-null-key-mode

Regards,
Roman



On Fri, Mar 12, 2021 at 2:42 PM Magri, Sebastian
<sebastian.ma...@radancy.com> wrote:

I'm trying to extract data from a Debezium CDC source in which one of the
backing tables has a nested JSON field with an open schema, like this:


"objectives": {
     "items": [
         {
             "id": 1,
             "label": "test 1",
             "size": 1000.0
         },
         {
             "id": 2,
             "label": "test 2",
             "size": 500.0
         }
     ],
     "threshold": 10.0,
     "threshold_period": "hourly",
     "max_ms": 30000.0
}


Any of these fields can be missing at any time, and there can also be 
additional, different fields. It is guaranteed that a field will have the same 
data type for all occurrences.

For now, I only need the `threshold` and `threshold_period` fields, for which
I'm using a column definition like the following:


CREATE TABLE probes (
    `objectives` ROW(`threshold` FLOAT, `threshold_period` STRING)
    ...
) WITH (
    ...
    'format' = 'debezium-json',
    'debezium-json.schema-include' = 'true',
    'debezium-json.ignore-parse-errors' = 'true'
)


However, I keep getting `NULL` values in my `objectives` column, or corrupt JSON
message exceptions when I disable the `ignore-parse-errors` option.

Does the JSON have to match the declared schema of the field exactly, or is parsing lenient?

Is there any option or syntactic detail I'm missing?

Best Regards,
