[ https://issues.apache.org/jira/browse/FLINK-1820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14514087#comment-14514087 ]
ASF GitHub Bot commented on FLINK-1820:
---------------------------------------

Github user fhueske commented on a diff in the pull request:

    https://github.com/apache/flink/pull/566#discussion_r29143622

    --- Diff: flink-java/src/test/java/org/apache/flink/api/java/io/CsvInputFormatTest.java ---
    @@ -353,6 +354,99 @@ public void testIntegerFieldsl() throws IOException {
                 assertEquals(Integer.valueOf(888), result.f2);
                 assertEquals(Integer.valueOf(999), result.f3);
                 assertEquals(Integer.valueOf(000), result.f4);
    +
    +            result = format.nextRecord(result);
    +            assertNull(result);
    +            assertTrue(format.reachedEnd());
    +        }
    +        catch (Exception ex) {
    +            fail("Test failed due to a " + ex.getClass().getName() + ": " + ex.getMessage());
    +        }
    +    }
    +
    +    @Test
    +    public void testEmptyFields() throws IOException {
    +        try {
    +            final String fileContent = "|0|0|0|0\n" +
    +                    "1||1|1|1|\n" +
    +                    "2|2| |2|2|\n" +
    +                    "3 |3|3| |3|\n" +
    +                    "4|4|4|4| |\n";
    +            final FileInputSplit split = createTempFile(fileContent);
    +
    +            final TupleTypeInfo<Tuple5<Short, Integer, Long, Float, Double>> typeInfo =
    +                    TupleTypeInfo.getBasicTupleTypeInfo(Short.class, Integer.class, Long.class, Float.class, Double.class);
    +            final CsvInputFormat<Tuple5<Short, Integer, Long, Float, Double>> format = new CsvInputFormat<Tuple5<Short, Integer, Long, Float, Double>>(PATH, typeInfo);
    +
    +            format.setFieldDelimiter("|");
    +
    +            format.configure(new Configuration());
    +            format.open(split);
    +
    +            Tuple5<Short, Integer, Long, Float, Double> result = new Tuple5<Short, Integer, Long, Float, Double>();
    +
    +            try {
    +                result = format.nextRecord(result);
    +                fail("Empty String Parse Exception was not thrown! (ShortParser)");
    +            } catch (ParseException e) {}
    +            try {
    +                result = format.nextRecord(result);
    +                fail("Empty String Parse Exception was not thrown! (IntegerParser)");
    +            } catch (ParseException e) {}
    +            try {
    +                result = format.nextRecord(result);
    +                fail("Empty String Parse Exception was not thrown! (LongParser)");
    +            } catch (ParseException e) {}
    +            try {
    +                result = format.nextRecord(result);
    --- End diff --

    Doesn't this call fail because of the trailing whitespace in the `short` field?


> Bug in DoubleParser and FloatParser - empty String is not casted to 0
> ---------------------------------------------------------------------
>
>                 Key: FLINK-1820
>                 URL: https://issues.apache.org/jira/browse/FLINK-1820
>             Project: Flink
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.8.0, 0.9, 0.8.1
>            Reporter: Felix Neutatz
>            Assignee: Felix Neutatz
>            Priority: Critical
>             Fix For: 0.9
>
>
> Hi,
> I found the bug, when I wanted to read a csv file, which had a line like:
> "||\n"
> If I treat it as a Tuple2<Long,Long>, I get as expected a tuple (0L,0L).
> But if I want to read it into a Double-Tuple or a Float-Tuple, I get the
> following error:
> java.lang.AssertionError: Test failed due to a
> org.apache.flink.api.common.io.ParseException: Line could not be parsed: '||'
> ParserError NUMERIC_VALUE_FORMAT_ERROR
> This error can be solved by adding an additional condition for empty strings
> in the FloatParser / DoubleParser.
> We definitely need the CSVReader to be able to read "empty values".
> I can fix it like described if there are no better ideas :)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
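
The review question about the fourth test record can be checked with a small stand-alone sketch. It does not use Flink's field parsers; as a stated assumption it substitutes the plain JDK methods (`Short.parseShort`, `Integer.parseInt`, `Long.parseLong`, `Float.parseFloat`, `Double.parseDouble`), whose whitespace handling may differ from Flink's `ShortParser`, `FloatParser` and the rest. The class `EmptyFieldCheck` and its methods are hypothetical names introduced only for this illustration. For each record it reports which column is the first one that fails to parse.

```java
final class EmptyFieldCheck {

    /** Parses one field with the JDK parser for its column; throws NumberFormatException on failure. */
    static void parseStrict(int column, String field) {
        switch (column) {
            case 0: Short.parseShort(field); break;
            case 1: Integer.parseInt(field); break;
            case 2: Long.parseLong(field); break;
            case 3: Float.parseFloat(field); break;
            case 4: Double.parseDouble(field); break;
            default: throw new IllegalArgumentException("unexpected column " + column);
        }
    }

    /** Returns the index of the first of the five columns that fails to parse, or -1 if all parse. */
    static int firstFailingField(String line) {
        // The -1 limit keeps empty trailing fields instead of dropping them.
        String[] fields = line.split("\\|", -1);
        for (int i = 0; i < 5; i++) {
            if (i >= fields.length) {
                return i; // a missing column counts as a failure
            }
            try {
                parseStrict(i, fields[i]);
            } catch (NumberFormatException e) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // The five records from testEmptyFields() in the diff above.
        String[] lines = {"|0|0|0|0", "1||1|1|1|", "2|2| |2|2|", "3 |3|3| |3|", "4|4|4|4| |"};
        for (String line : lines) {
            System.out.println("'" + line + "' -> first failing column: " + firstFailingField(line));
        }
    }
}
```

Under these assumptions the fourth record, "3 |3|3| |3|", already fails in column 0 (the `short` field with the trailing space) and never reaches the float column, which is the concern raised in the comment. Removing the space from the first field moves the failure to column 3.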