Hi,
I have a bunch of CSV data files that I need to store in Parquet format. I looked at the basic documentation for ParquetIO, and ParquetIO.sink() can be used to achieve this. However, it depends on an Avro schema. How do I infer or generate an Avro schema from CSV data? Does Beam have an API for this?

I tried the Kite SDK's CSVUtil / JsonUtil but had no luck generating an Avro schema. My CSV data files have headers, and quite a few of the header fields are hyphenated, which Kite's CSVUtil does not accept.

I think converting the CSV documents to JSON documents would be redundant effort. Any suggestions on how to infer an Avro schema from CSV data or from a JSON schema would be helpful.

Thanks,
Sri
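
To make the hyphenated-header problem concrete, here is a minimal sketch of one possible workaround: build the Avro schema JSON directly from the CSV header line, replacing characters that Avro names do not allow. The class and method names (CsvHeaderToAvroSchema, sanitize, schemaJson) are hypothetical and not part of Beam, Avro, or Kite. The sketch assumes a plain comma delimiter with no quoted commas in the header, and it types every column as a nullable string rather than inferring types from the data.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not an API from Beam, Avro, or Kite.
final class CsvHeaderToAvroSchema {

    // Avro names must match [A-Za-z_][A-Za-z0-9_]*, so every other
    // character (hyphen, space, dot, quote) is mapped to an underscore.
    static String sanitize(String header) {
        String name = header.trim().replaceAll("[^A-Za-z0-9_]", "_");
        if (name.isEmpty() || Character.isDigit(name.charAt(0))) {
            name = "_" + name; // names may not be empty or start with a digit
        }
        return name;
    }

    // Builds an Avro record schema, as a JSON string, from a CSV header line.
    // Assumption: simple comma-separated header, all columns nullable strings.
    static String schemaJson(String recordName, String headerLine) {
        Set<String> seen = new HashSet<>();
        List<String> fields = new ArrayList<>();
        for (String column : headerLine.split(",", -1)) {
            String name = sanitize(column);
            // Disambiguate headers that collide after sanitizing (a-b vs a_b).
            String unique = name;
            for (int i = 2; !seen.add(unique); i++) {
                unique = name + "_" + i;
            }
            fields.add("{\"name\":\"" + unique
                    + "\",\"type\":[\"null\",\"string\"],\"default\":null}");
        }
        return "{\"type\":\"record\",\"name\":\"" + sanitize(recordName)
                + "\",\"fields\":[" + String.join(",", fields) + "]}";
    }
}
```

The resulting string could then be parsed with Avro's `new Schema.Parser().parse(json)` and the schema passed to ParquetIO.sink(). Column types other than string would have to be inferred separately by sampling the data rows, which this sketch does not attempt.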