Hello,
As a new Flink user, I was wondering whether there are any existing
approaches or practices for reading delimited file formats such as CSV or
TSV into DataSets of POJOs. My current approach can be illustrated with a
contrived example:

    // Simulating a TSV file DataSet
    DataSet<String> tsvRatings = env.fromElements("category-1\t10");

    // Mapping each line to a POJO
    DataSet<Rating> ratings = tsvRatings.map(line -> {
        String[] elements = line.split("\t");
        return new Rating(elements[0], Integer.parseInt(elements[1]));
    });

While such a mapping could be implemented in a more general form, I'm keen
to avoid reinventing the wheel. Are there already good ways of doing this
in Flink?
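
To make "a more general form" concrete, here is a rough sketch of the kind of helper I would otherwise end up writing myself. It uses plain Java only; `DelimitedLineMapper` is a hypothetical name and not an existing Flink class, and the `Rating` fields shown are assumptions standing in for my real POJO.

```java
import java.util.function.Function;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Flink): splits a delimited line and
// passes the resulting fields to a caller-supplied factory.
final class DelimitedLineMapper<T> {
    private final String delimiter;
    private final int expectedFields;
    private final Function<String[], T> factory;

    DelimitedLineMapper(String delimiter, int expectedFields,
                        Function<String[], T> factory) {
        this.delimiter = delimiter;
        this.expectedFields = expectedFields;
        this.factory = factory;
    }

    T map(String line) {
        // Pattern.quote treats the delimiter literally (e.g. "|" or ".");
        // a limit of -1 keeps trailing empty fields instead of dropping them.
        String[] fields = line.split(Pattern.quote(delimiter), -1);
        if (fields.length != expectedFields) {
            throw new IllegalArgumentException(
                "Expected " + expectedFields + " fields but got "
                + fields.length + ": " + line);
        }
        return factory.apply(fields);
    }
}

// Stand-in for the Rating POJO from the example; field names are assumed.
final class Rating {
    final String category;
    final int score;

    Rating(String category, int score) {
        this.category = category;
        this.score = score;
    }
}

final class DelimitedLineMapperDemo {
    public static void main(String[] args) {
        DelimitedLineMapper<Rating> tsv = new DelimitedLineMapper<>(
            "\t", 2, f -> new Rating(f[0], Integer.parseInt(f[1])));
        Rating rating = tsv.map("category-1\t10");
        System.out.println(rating.category + " -> " + rating.score);
    }
}
```

To use something like this inside a Flink job, I assume it would also need to implement Flink's `MapFunction` and be serializable, which is exactly the sort of plumbing I would rather not duplicate if it already exists.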
Thanks - Elliot.