[ 
https://issues.apache.org/jira/browse/FLINK-2230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603905#comment-14603905
 ] 

ASF GitHub Bot commented on FLINK-2230:
---------------------------------------

Github user Shiti commented on the pull request:

    https://github.com/apache/flink/pull/867#issuecomment-115936916
  
    @StephanEwen, apologies, I didn't notice the earlier message in JIRA. 
Something is wrong with my Gmail settings, and most of the messages from JIRA 
and the mailing list went into Spam.
    
    Kindly excuse my limited understanding of this framework and the 
intention/drivers behind the decisions made. 
    
    Going through the mailing list and the ticket, I realized that although 
there may be some valid cases of missing data types, it is not desirable to 
change the `TupleTypeInfo` and the whole Tuple/Case Class serialization 
code base to support null, and we should identify an alternative approach to 
handle this.
    
    From my limited understanding, the recommended way of working with missing 
values is to use `(Option[Int], Option[Int])` instead of `(Int, Int)` when we 
know there can be missing values in the data. Is that correct?
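
    A minimal sketch of what I mean, using plain Scala collections rather than 
a Flink `DataSet` so that it stands alone. The object name, sample rows, and 
helper functions are made up for illustration:

```scala
object OptionTupleSketch {
  // Hypothetical records: None marks a missing value.
  val rows: Seq[(Option[Int], Option[Int])] = Seq(
    (Some(1), Some(2)),
    (None, Some(3)),
    (Some(4), None)
  )

  // Sum the two fields only when both are present.
  def sumBoth(row: (Option[Int], Option[Int])): Option[Int] =
    for (a <- row._1; b <- row._2) yield a + b

  // Alternative policy: treat a missing field as 0.
  def sumWithDefault(row: (Option[Int], Option[Int])): Int =
    row._1.getOrElse(0) + row._2.getOrElse(0)

  def main(args: Array[String]): Unit = {
    println(rows.map(sumBoth))        // List(Some(3), None, None)
    println(rows.map(sumWithDefault)) // List(3, 3, 4)
  }
}
```

    Every function that touches a field has to decide how to treat `None`, 
which is where the extra verbosity comes from.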
    
    If that is correct, I have a few questions:
    
    1. Doesn't this push the handling of missing data into the application 
code (which may be good or bad) and make the application code more verbose?
    2. Wouldn't an `Option[Int]` take more space in memory (and when 
serialized) than a plain `Int`?
    3. If Flink does not support null values except in the Table API, won’t 
there be an inconsistency when users try to convert a `Table` to a 
`DataSet[Tuple]`? 
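
    To make question 2 concrete, here is a rough sketch. It assumes a naive 
encoding (a one-byte presence flag followed by the value, if any); I have not 
checked that this is how Flink actually serializes `Option`, and the object 
and method names are hypothetical:

```scala
import java.io.{ByteArrayOutputStream, DataOutputStream}

object OptionSizeSketch {
  // Serialized size in bytes of a bare Int.
  def sizeOfInt(value: Int): Int = {
    val bytes = new ByteArrayOutputStream()
    val out = new DataOutputStream(bytes)
    out.writeInt(value)
    out.flush()
    bytes.size()
  }

  // Assumed encoding: presence flag first, then the value if present.
  def sizeOfOptionInt(value: Option[Int]): Int = {
    val bytes = new ByteArrayOutputStream()
    val out = new DataOutputStream(bytes)
    out.writeBoolean(value.isDefined)
    value.foreach(v => out.writeInt(v))
    out.flush()
    bytes.size()
  }

  def main(args: Array[String]): Unit = {
    println(sizeOfInt(42))             // 4
    println(sizeOfOptionInt(Some(42))) // 5
    println(sizeOfOptionInt(None))     // 1
  }
}
```

    Under that assumption, each present value costs one extra byte, while a 
missing value costs less than a plain `Int` would.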
    
    One alternative approach I can think of is introducing another TypeInfo 
that supports null values (say `TupleTypeInfoWithNull`), so users can choose 
to use it when they know or suspect that the data may contain nulls.
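
    As a rough illustration of how a null-aware serializer behind such a 
TypeInfo might work, the sketch below writes a one-byte null mask ahead of 
the fields. It is standalone Scala and does not use any Flink API; 
`NullMaskSketch`, the `Row` alias, and the mask layout are all assumptions of 
mine:

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, DataInputStream, DataOutputStream}

object NullMaskSketch {
  // Hypothetical row type: two nullable boxed integers.
  type Row = (java.lang.Integer, java.lang.Integer)

  def serialize(row: Row): Array[Byte] = {
    val bytes = new ByteArrayOutputStream()
    val out = new DataOutputStream(bytes)
    // Bit i of the mask is set when field i is null.
    val mask = (if (row._1 == null) 1 else 0) | (if (row._2 == null) 2 else 0)
    out.writeByte(mask)
    // Only non-null fields are written after the mask.
    if (row._1 != null) out.writeInt(row._1.intValue)
    if (row._2 != null) out.writeInt(row._2.intValue)
    out.flush()
    bytes.toByteArray
  }

  def deserialize(data: Array[Byte]): Row = {
    val in = new DataInputStream(new ByteArrayInputStream(data))
    val mask = in.readByte()
    val first: java.lang.Integer =
      if ((mask & 1) != 0) null else Int.box(in.readInt())
    val second: java.lang.Integer =
      if ((mask & 2) != 0) null else Int.box(in.readInt())
    (first, second)
  }
}
```

    With this layout, a row with no nulls costs one byte more than the two 
plain `Int` fields, and users who do not need nulls would not pay for it 
because they would keep using the existing `TupleTypeInfo`.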


> Add Support for Null-Values in TupleSerializer
> ----------------------------------------------
>
>                 Key: FLINK-2230
>                 URL: https://issues.apache.org/jira/browse/FLINK-2230
>             Project: Flink
>          Issue Type: Sub-task
>            Reporter: Shiti Saxena
>            Assignee: Shiti Saxena
>            Priority: Minor
>



