[ 
https://issues.apache.org/jira/browse/FLINK-1820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14514087#comment-14514087
 ] 

ASF GitHub Bot commented on FLINK-1820:
---------------------------------------

Github user fhueske commented on a diff in the pull request:

    https://github.com/apache/flink/pull/566#discussion_r29143622
  
    --- Diff: flink-java/src/test/java/org/apache/flink/api/java/io/CsvInputFormatTest.java ---
    @@ -353,6 +354,99 @@ public void testIntegerFieldsl() throws IOException {
                        assertEquals(Integer.valueOf(888), result.f2);
                        assertEquals(Integer.valueOf(999), result.f3);
                        assertEquals(Integer.valueOf(000), result.f4);
    +
    +                   result = format.nextRecord(result);
    +                   assertNull(result);
    +                   assertTrue(format.reachedEnd());
    +           }
    +           catch (Exception ex) {
    +                   fail("Test failed due to a " + ex.getClass().getName() + ": " + ex.getMessage());
    +           }
    +   }
    +
    +   @Test
    +   public void testEmptyFields() throws IOException {
    +           try {
    +                   final String fileContent = "|0|0|0|0\n" +
    +                           "1||1|1|1|\n" +
    +                           "2|2| |2|2|\n" +
    +                           "3 |3|3|  |3|\n" +
    +                           "4|4|4|4| |\n";
    +                   final FileInputSplit split = createTempFile(fileContent);
    +
    +                   final TupleTypeInfo<Tuple5<Short, Integer, Long, Float, Double>> typeInfo =
    +                           TupleTypeInfo.getBasicTupleTypeInfo(Short.class, Integer.class, Long.class, Float.class, Double.class);
    +                   final CsvInputFormat<Tuple5<Short, Integer, Long, Float, Double>> format = new CsvInputFormat<Tuple5<Short, Integer, Long, Float, Double>>(PATH, typeInfo);
    +
    +                   format.setFieldDelimiter("|");
    +
    +                   format.configure(new Configuration());
    +                   format.open(split);
    +
    +                   Tuple5<Short, Integer, Long, Float, Double> result = new Tuple5<Short, Integer, Long, Float, Double>();
    +
    +                   try {
    +                           result = format.nextRecord(result);
    +                           fail("Empty String Parse Exception was not thrown! (ShortParser)");
    +                   } catch (ParseException e) {}
    +                   try {
    +                           result = format.nextRecord(result);
    +                           fail("Empty String Parse Exception was not thrown! (IntegerParser)");
    +                   } catch (ParseException e) {}
    +                   try {
    +                           result = format.nextRecord(result);
    +                           fail("Empty String Parse Exception was not thrown! (LongParser)");
    +                   } catch (ParseException e) {}
    +                   try {
    +                           result = format.nextRecord(result);
    --- End diff --
    
    Doesn't this call fail because of the trailing whitespace in the `short` field?
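
To make the concern concrete: the fourth input line is `3 |3|3|  |3|`, so the first field read is `3 ` (with a trailing space), ahead of the blank float field that this call is presumably meant to exercise. The sketch below uses the JDK's own number parsing as a stand-in, which is an assumption: Flink's `ShortParser` and `FloatParser` may handle whitespace differently. The class and method names are hypothetical. It only shows how a failure in an earlier field can mask the field under test.

```java
public class TrailingWhitespaceSketch {

    // Returns the index of the first field that fails to parse, or -1 if none fails.
    // Field types follow the test's Tuple5<Short, Integer, Long, Float, Double>.
    // JDK parsing is used as a stand-in for Flink's field parsers (assumption).
    public static int firstFailingField(String line) {
        String[] fields = line.split("\\|", -1); // -1 keeps trailing empty fields
        int count = Math.min(5, fields.length);
        for (int i = 0; i < count; i++) {
            try {
                switch (i) {
                    case 0: Short.parseShort(fields[i]); break;
                    case 1: Integer.parseInt(fields[i]); break;
                    case 2: Long.parseLong(fields[i]); break;
                    case 3: Float.parseFloat(fields[i]); break;
                    default: Double.parseDouble(fields[i]); break;
                }
            } catch (NumberFormatException e) {
                return i; // parsing stops at the first bad field
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Line from the test: the short field "3 " fails first.
        System.out.println(firstFailingField("3 |3|3|  |3|")); // prints 0
        // Without the trailing space, the blank float field is the one that fails.
        System.out.println(firstFailingField("3|3|3|  |3|"));  // prints 3
    }
}
```

If Flink's parsers behave like this stand-in, dropping the space after `3` in the test data would let the call reach the float field.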


> Bug in DoubleParser and FloatParser - empty String is not casted to 0
> ---------------------------------------------------------------------
>
>                 Key: FLINK-1820
>                 URL: https://issues.apache.org/jira/browse/FLINK-1820
>             Project: Flink
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.8.0, 0.9, 0.8.1
>            Reporter: Felix Neutatz
>            Assignee: Felix Neutatz
>            Priority: Critical
>             Fix For: 0.9
>
>
> Hi,
> I found this bug when reading a CSV file that contained a line like:
> "||\n"
> If I read it as a Tuple2<Long,Long>, I get the tuple (0L,0L), as expected.
> But if I read it into a Double tuple or a Float tuple, I get the 
> following error:
> java.lang.AssertionError: Test failed due to a 
> org.apache.flink.api.common.io.ParseException: Line could not be parsed: '||'
> ParserError NUMERIC_VALUE_FORMAT_ERROR 
> This error can be fixed by adding a condition for empty strings 
> to the FloatParser / DoubleParser.
> We definitely need the CSVReader to be able to read "empty values".
> I can fix it as described if there are no better ideas :)
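
The fix proposed in the report, an extra condition for empty strings, could look roughly like the sketch below. It is not Flink's `DoubleParser`; `parseDoubleOrZero` is a hypothetical helper built on the JDK's `Double.parseDouble`, and it assumes the field has already been extracted as a `String`. Note that the test in the diff above expects a `ParseException` for empty fields instead, so this shows only the behavior the report asks for.

```java
public class EmptyFieldSketch {

    // Hypothetical helper: treats an empty field as 0.0, matching what the
    // report describes for Long fields. Non-empty input is parsed as usual.
    public static double parseDoubleOrZero(String field) {
        if (field.isEmpty()) {
            return 0.0; // the "additional condition for empty strings"
        }
        return Double.parseDouble(field);
    }

    public static void main(String[] args) {
        System.out.println(parseDoubleOrZero(""));    // prints 0.0
        System.out.println(parseDoubleOrZero("1.5")); // prints 1.5
    }
}
```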



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
