Hi,

Just in case it could be useful, we are working on a Flink-Kudu integration [1]. It is still a work in progress, but we had to implement an InputFormat to read from Kudu tables, so the code may be useful for you [2].
Best

[1] https://github.com/rubencasado/Flink-Kudu
[2] https://github.com/rubencasado/Flink-Kudu/blob/master/src/main/java/es/accenture/flink/Sources/KuduInputFormat.java

On 19/1/17 6:03, "Pawan Manishka Gunarathna" <pawan.manis...@gmail.com> wrote:

Hi,

When implementing the InputFormat interface, if the input-split part is already handled by our data analytics server's APIs, can we go directly to the second phase that you described earlier? Since our data source has a database-table architecture, I am thinking of following Flink's JDBCInputFormat. Can you provide some information on how JDBCInputFormat execution happens?

Thanks,
Pawan

On Mon, Jan 16, 2017 at 3:37 PM, Pawan Manishka Gunarathna <pawan.manis...@gmail.com> wrote:

> Hi Fabian,
> Thanks for providing that information.
>
> On Mon, Jan 16, 2017 at 2:36 PM, Fabian Hueske <fhue...@gmail.com> wrote:
>
>> Hi Pawan,
>>
>> this sounds like you need to implement a custom InputFormat [1].
>> An InputFormat is basically executed in two phases. In the first phase it
>> generates InputSplits. An InputSplit references a chunk of data that
>> needs to be read. Hence, InputSplits define how the input data is split
>> to be read in parallel. In the second phase, multiple InputFormats are
>> started and request InputSplits from an InputSplitProvider. Each instance
>> of the InputFormat processes one InputSplit at a time.
>>
>> It is hard to give general advice on implementing InputFormats because
>> this very much depends on the data source and data format to read from.
>>
>> I'd suggest to have a look at other InputFormats.
>> Best,
>> Fabian
>>
>> [1]
>> https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java
>>
>> 2017-01-16 6:18 GMT+01:00 Pawan Manishka Gunarathna <pawan.manis...@gmail.com>:
>>
>> > Hi,
>> >
>> > we have a data analytics server that has analytics data tables. So I
>> > need to write a custom *Java* implementation to read data from that
>> > data source and do *batch* processing using Apache Flink. Basically
>> > it's like a new client connector for Flink.
>> >
>> > It would be great if you could provide some guidance for my requirement.
>> >
>> > Thanks,
>> > Pawan
>
> --
> *Pawan Gunaratne*
> *Mob: +94 770373556*

--
*Pawan Gunaratne*
*Mob: +94 770373556*
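The two-phase execution Fabian describes in the thread can be illustrated with a small, self-contained sketch (plain Java, no Flink dependency; the class and method names below are illustrative, not Flink's actual API): phase one partitions the input into splits, each referencing a chunk of data; phase two has parallel readers pull splits from a shared provider and process one split at a time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Illustrative sketch of the two-phase InputFormat pattern (not Flink's API). */
public class SplitSketch {

    /** Phase 1 output: a split references a chunk of the input (here, a row-id range). */
    record RangeSplit(long startInclusive, long endExclusive) {}

    /** Phase 1: partition [0, totalRows) into roughly equal splits, one per desired reader. */
    static List<RangeSplit> createSplits(long totalRows, int parallelism) {
        List<RangeSplit> splits = new ArrayList<>();
        long chunk = (totalRows + parallelism - 1) / parallelism; // ceiling division
        for (long start = 0; start < totalRows; start += chunk) {
            splits.add(new RangeSplit(start, Math.min(start + chunk, totalRows)));
        }
        return splits;
    }

    /** Phase 2: each reader pulls splits from a shared provider until none remain. */
    static long readAll(Queue<RangeSplit> provider) {
        long rowsRead = 0;
        RangeSplit split;
        while ((split = provider.poll()) != null) {
            // In a real InputFormat, this is where open(split) / nextRecord() would run.
            rowsRead += split.endExclusive() - split.startInclusive();
        }
        return rowsRead;
    }

    public static void main(String[] args) {
        // 10 rows split across 3 readers -> 3 splits covering all 10 rows.
        Queue<RangeSplit> provider = new ConcurrentLinkedQueue<>(createSplits(10, 3));
        System.out.println("splits=" + provider.size() + " rows=" + readAll(provider));
    }
}
```

In Flink itself, `createInputSplits(minNumSplits)` plays the role of `createSplits` here, and the runtime's InputSplitProvider plays the role of the shared queue; the sketch only shows the shape of the contract.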
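On the JDBCInputFormat question raised in the thread: conceptually, a JDBC source is parallelized by binding a different parameter set to the same query for each split (e.g. `SELECT ... WHERE id BETWEEN ? AND ?`), so each parallel instance opens its own connection and iterates only its ResultSet. A minimal sketch of that parameter generation in plain Java (the class and method names are illustrative, not Flink's API):

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Illustrative sketch: deriving per-split bind values for a JDBC-style
 * input format's query "SELECT ... WHERE id BETWEEN ? AND ?" (not Flink's API).
 */
public class JdbcSplitParams {

    /** One pair of bind values per split: {lowerBound, upperBound}, both inclusive. */
    static List<long[]> betweenParameters(long min, long max, long batchSize) {
        List<long[]> params = new ArrayList<>();
        for (long lower = min; lower <= max; lower += batchSize) {
            params.add(new long[] {lower, Math.min(lower + batchSize - 1, max)});
        }
        return params;
    }

    public static void main(String[] args) {
        // Rows with ids 1..100, 25 ids per split -> 4 splits.
        for (long[] p : betweenParameters(1, 100, 25)) {
            System.out.println("WHERE id BETWEEN " + p[0] + " AND " + p[1]);
        }
    }
}
```

Each parameter pair corresponds to one InputSplit; a reader binds its pair to the PreparedStatement before executing, which is what lets several instances scan disjoint ranges of the same table in parallel.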