[ https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14624390#comment-14624390 ]
ASF GitHub Bot commented on FLINK-1520:
---------------------------------------

Github user andralungu commented on the pull request:

    https://github.com/apache/flink/pull/847#issuecomment-120856171

Hi,

I just had a closer look at this PR and it made me seriously question the utility of a `Graph.fromCSV` method.

Why? First of all, because it's more limited than the regular `env.readCsvFile()`: it does not allow POJOs, and supporting them would be a bit tedious. We would need methods for 2 to n fields, depending on the number of attributes in the POJO.

Second, and I am speaking strictly as a user here, I would rather write:

    private static DataSet<Edge<Long, Double>> getEdgesDataSet(ExecutionEnvironment env) {
        if (fileOutput) {
            return env.readCsvFile(edgeInputPath)
                    .ignoreComments("#")
                    .fieldDelimiter("\t")
                    .lineDelimiter("\n")
                    .types(Long.class, Long.class, Double.class)
                    .map(new Tuple3ToEdgeMap<Long, Double>());
        } else {
            return CommunityDetectionData.getDefaultEdgeDataSet(env);
        }
    }

than...

    private static Graph<Long, Long, Double> getGraph(ExecutionEnvironment env) {
        Graph<Long, Long, Double> graph;
        if (!fileOutput) {
            DataSet<Edge<Long, Double>> edges = CommunityDetectionData.getDefaultEdgeDataSet(env);
            graph = Graph.fromDataSet(edges, new MapFunction<Long, Long>() {
                public Long map(Long label) {
                    return label;
                }
            }, env);
        } else {
            graph = Graph.fromCsvReader(edgeInputPath, new MapFunction<Long, Long>() {
                public Long map(Long label) {
                    return label;
                }
            }, env).ignoreCommentsEdges("#")
                    .fieldDelimiterEdges("\t")
                    .lineDelimiterEdges("\n")
                    .typesEdges(Long.class, Double.class)
                    .typesVertices(Long.class, Long.class);
        }
        return graph;
    }

Maybe it's just a matter of preference, but I believe it's at least worth a discussion. On the other hand, the utility of such a method should have been questioned back in its early Jira days, so I guess that's my mistake.

I would like to hear your thoughts on this.

Thanks!

> Read edges and vertices from CSV files
> --------------------------------------
>
>                 Key: FLINK-1520
>                 URL: https://issues.apache.org/jira/browse/FLINK-1520
>             Project: Flink
>          Issue Type: New Feature
>          Components: Gelly
>            Reporter: Vasia Kalavri
>            Assignee: Shivani Ghatge
>            Priority: Minor
>              Labels: easyfix, newbie
>
> Add methods to create Vertex and Edge Datasets directly from CSV file inputs.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)