Strategies for reading structured file formats as POJO DataSets

Elliot West Thu, 05 Mar 2015 01:20:32 -0800

Hello,

As a new Flink user, I wondered if there are any existing approaches or
practices for reading file formats such as CSV, TSV, etc. as DataSets of
POJOs? My current approach can be illustrated with a contrived example:


// Simulating a TSV file DataSet
DataSet<String> tsvRatings = env.fromElements("category-1\t10");

// Mapping to a POJO
DataSet<Rating> ratings = tsvRatings.map(line -> {
  String[] elements = line.split("\t");
  return new Rating(elements[0], Integer.parseInt(elements[1]));
});


While such a mapping could be implemented in a more general form, I'm keen
to avoid wheel reinvention and therefore wonder if there are already good
ways of doing this?
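
To make "a more general form" concrete, here is a minimal sketch in plain Java with no Flink dependency. The class DelimitedLineMapper, its constructor arguments, and the Rating fields category and value are hypothetical names chosen only for illustration; none of them come from Flink.

```java
import java.util.function.Function;
import java.util.regex.Pattern;

// Assumed shape of the Rating POJO; the real class may differ.
final class Rating {
    final String category;
    final int value;

    Rating(String category, int value) {
        this.category = category;
        this.value = value;
    }
}

// Hypothetical helper: splits a line on a literal delimiter and passes
// the resulting fields to a caller-supplied factory.
final class DelimitedLineMapper<T> {
    private final Pattern delimiter;
    private final Function<String[], T> factory;

    DelimitedLineMapper(String delimiter, Function<String[], T> factory) {
        // Quote the delimiter so characters such as '|' are not treated as regex syntax.
        this.delimiter = Pattern.compile(Pattern.quote(delimiter));
        this.factory = factory;
    }

    T map(String line) {
        // A limit of -1 keeps trailing empty fields instead of dropping them.
        return factory.apply(delimiter.split(line, -1));
    }
}
```

The same helper would then cover TSV with new DelimitedLineMapper<>("\t", f -> new Rating(f[0], Integer.parseInt(f[1]))) and CSV by passing "," instead. Used inside a Flink map function, the helper and its factory would also need to be serializable, which this sketch does not address, and it makes no attempt to handle quoted or escaped delimiters.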

Thanks - Elliot.
