Bumpety-bump.

I would be in favour of removing this:

 - It can be implemented as a MapFunction parser after a TextInputFormat
 - Additions, changes, and fixes that happen on TextInputFormat are not reflected in SimpleTweetInputFormat
 - SimpleTweetInputFormat overrides nextRecord(), which is not something DelimitedInputFormats are normally supposed to do, I think
 - The Tweet POJO has a very strange naming scheme
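
To make the first point concrete, here is a minimal sketch of the "read lines, then map" shape. It is plain Java with no Flink dependency so that it runs on its own. TweetMapSketch, SimpleTweet, and the regex-based field extraction are hypothetical stand-ins, not the existing Tweet POJO or parser, and a real job would presumably use a proper JSON library. In Flink, the body of parse() would sit inside a MapFunction applied to the result of ExecutionEnvironment#readTextFile(...).

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical sketch: one JSON object per text line, parsed in a map step.
final class TweetMapSketch {

    // Hypothetical, trimmed-down stand-in for the Tweet POJO.
    static final class SimpleTweet {
        final long id;
        final String text;

        SimpleTweet(long id, String text) {
            this.id = id;
            this.text = text;
        }
    }

    // Naive patterns (assumption): they take the first "id" / "text" field
    // found in the line and do not handle escaped quotes or nested objects.
    private static final Pattern ID = Pattern.compile("\"id\"\\s*:\\s*(\\d+)");
    private static final Pattern TEXT = Pattern.compile("\"text\"\\s*:\\s*\"([^\"]*)\"");

    // What the MapFunction body would do: turn one line into one record.
    static SimpleTweet parse(String line) {
        Matcher id = ID.matcher(line);
        Matcher text = TEXT.matcher(line);
        if (!id.find() || !text.find()) {
            throw new IllegalArgumentException("Not a tweet line: " + line);
        }
        return new SimpleTweet(Long.parseLong(id.group(1)), text.group(1));
    }

    // Stand-in for "text input followed by map": apply parse() to every line.
    static List<SimpleTweet> parseAll(List<String> lines) {
        return lines.stream().map(TweetMapSketch::parse).collect(Collectors.toList());
    }
}
```

With this shape, any fix to the line-reading side benefits the tweet parsing automatically, since the parsing is just a function over strings.
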
Best,
Aljoscha

> On 7. Jun 2017, at 11:15, Chesnay Schepler <ches...@apache.org> wrote:
>
> Hello,
>
> I'm proposing to remove the Twitter-InputFormat in FLINK-6710
> <https://issues.apache.org/jira/browse/FLINK-6710>, with an open PR you can
> find here <https://github.com/apache/flink/pull/3984>.
> The PR currently has a +1 from Robert, but Timo raised some concerns saying
> that it is useful for prototyping and
> advised me to start a discussion on the ML.
>
> This format is a DelimitedInputFormat that reads JSON objects and turns them
> into a custom tweet class.
> I believe this format doesn't provide much value to Flink; there's nothing
> interesting about it as an InputFormat,
> as it is purely an exercise in manually converting a JSON object into a POJO.
> This is apparent since you could just as well use
> ExecutionEnvironment#readTextFile(...) and throw the parsing logic
> into a subsequent MapFunction.
>
> In the PR I suggested to replace this with a JsonInputFormat, but this was a
> misguided attempt at getting Timo to agree
> to the removal. This format has the same problem outlined above, as it could
> be effectively implemented with a one-liner map function.
>
> So the question now is whether we want to keep it, remove it, or replace it
> with something more general.
>
> Regards,
> Chesnay