Re: [DISCUSS] FLIP-358: flink-avro enhancement and cleanup

2023-09-06 Thread Becket Qin
Hi Stephen, I don't think you should compare the DataType with the AvroSchema directly. They are for different purposes and sometimes cannot be mapped in both directions. As of now, the following conversions are needed in Flink format: 1. Avro Schema -> Flink Table Schema (DataType). This is requ

Re: [DISCUSS] FLIP-358: flink-avro enhancement and cleanup

2023-09-06 Thread Becket Qin
Hi Jing, Thanks for the explanation.
> Since SourceFunction is already deprecated and we are working on SinkFunction deprecation for 1.19, I would suggest directly marking InputFormat and OutputFormat as deprecated. Because, once we mark them as public in one release, users might start to use

Re: [DISCUSS] FLIP-358: flink-avro enhancement and cleanup

2023-09-06 Thread Jing Ge
Hi Becket, Thanks for the clarification.
> StreamFormatAdapter is internal and it requires a StreamFormat implementation for Avro files which does not exist either.
I thought the cases 1-6 described in the FLIP meant there is a StreamFormat implementation for Avro. That was my fault. I didn'

Re: [DISCUSS] FLIP-358: flink-avro enhancement and cleanup

2023-09-06 Thread 吴 stephen
Hi Becket, I notice that a new config will be introduced to the Avro format so that users can provide their own schema. Since users can provide their own schema, should the Avro format offer a validation utility that checks whether the provided schema is compatible with the table columns? I’m modifying the Avro-Confulen

Re: [DISCUSS] FLIP-358: flink-avro enhancement and cleanup

2023-09-05 Thread Becket Qin
Hi Jing, Thanks for the comments.
> 1. "For the batch cases, currently the BulkFormat for DataStream is missing" - true, and there is another option to leverage StreamFormatAdapter[1]
StreamFormatAdapter is internal and it requires a StreamFormat implementation for Avro files which does not e

Re: [DISCUSS] FLIP-358: flink-avro enhancement and cleanup

2023-08-31 Thread Jing Ge
Hi Becket, It is a very useful proposal, thanks for driving it. +1. I'd like to ask some questions to make sure I understand your thoughts correctly: 1. "For the batch cases, currently the BulkFormat for DataStream is missing" - true, and there is another option to leverage StreamFormatAdapter[1]

Re: [DISCUSS] FLIP-358: flink-avro enhancement and cleanup

2023-08-31 Thread Becket Qin
Hi Ryan, thanks for the reply. Verifying the component with the schemas you have would be super helpful. I think enum is actually a type that is generally useful. Although it is not a part of ANSI SQL, MySQL and some other databases have this type. BTW, ENUM_STRING proposed in this FLIP is actual

Re: [DISCUSS] FLIP-358: flink-avro enhancement and cleanup

2023-08-31 Thread Ryan Skraba
Hey -- I have a certain knowledge of Avro, and I'd be willing to help out with some of these enhancements, writing tests and reviewing. I have a *lot* of Avro schemas available for validation! The FLIP looks pretty good and covers the possible cases pretty rigorously. I wasn't aware of some of th

[DISCUSS] FLIP-358: flink-avro enhancement and cleanup

2023-08-28 Thread Becket Qin
Hi folks, I would like to start the discussion about FLIP-358[1], which proposes to clean up and enhance the Avro support in Flink. More specifically, it proposes to: 1. Make it clear which APIs in the flink-avro components are public. 2. Fix a few buggy cases in flink-avro. 3. Add more supported Av