Re: JSON to Parquet

2020-08-27 Thread Averell
Hi Dawid, Thanks for the suggestion. So, basically I'll need to use the JSON connector to get the JSON strings into Rows, and then go from Rows to Parquet records using the Parquet connector? I have never tried the Table API before and have only been using the Streaming API. Will follow your suggestion no

Re: JSON to Parquet

2020-08-25 Thread Dawid Wysakowicz
Hi Averell, If you can describe the JSON schema, I'd suggest looking into the SQL API. (And I think you do need to define your schema upfront. If I am not mistaken, Parquet must know the common schema.) Then you could do something like: CREATE TABLE json (     // define the schema of your json data ) WIT
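
The preview above is cut off, so the rest of the original statement is not known. As a rough illustration only of the general approach it describes (declare the JSON schema up front, then write to a Parquet table), a Flink SQL sketch might look like the following. Everything specific here is an assumption and not taken from the thread: the table names `json_source` and `parquet_sink`, all column names, the Kafka source with its topic and broker address, and the output path. It also assumes the Kafka, JSON, and Parquet format dependencies are on the classpath.

```sql
-- Hypothetical source table: names, columns, topic and broker are assumptions.
-- Nested JSON objects are declared up front as ROW types.
CREATE TABLE json_source (
  id      STRING,
  amount  DOUBLE,
  payload ROW<user_id STRING, country STRING>
) WITH (
  'connector' = 'kafka',
  'topic' = 'events',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 'json-to-parquet',
  'scan.startup.mode' = 'earliest-offset',
  'format' = 'json'
);

-- Hypothetical sink table: the path is a placeholder.
-- The nested fields are flattened into plain columns here.
CREATE TABLE parquet_sink (
  id      STRING,
  amount  DOUBLE,
  user_id STRING,
  country STRING
) WITH (
  'connector' = 'filesystem',
  'path' = 'file:///tmp/parquet-out',
  'format' = 'parquet'
);

-- Small modifications or enrichment can go into the SELECT.
INSERT INTO parquet_sink
SELECT id, amount, payload.user_id, payload.country
FROM json_source;
```

In a streaming job the filesystem sink finalizes Parquet files on checkpoints, so checkpointing would need to be enabled for output files to become visible.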

JSON to Parquet

2020-08-20 Thread Averell
Hello, I have a stream in which each message is a JSON string with quite a complex schema (multiple fields, multiple nested layers), and I need to write it into Parquet files after some slight modifications/enrichment. I wonder what options are available for me to do that. I'm thinking of JSON -> A