For a similar problem, where we wanted to preserve and track null entries,
we loaded the CSV as a DataSet[Array[Object]] and then transformed it into
a DataSet[Row] using a custom RowSerializer
(https://gist.github.com/Shiti/d0572c089cc08654019c) that handles null.

The Table API (which supports null) can then be used on the resulting
DataSet[Row].
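
As a rough illustration of the underlying idea (keep the empty field
visible instead of dropping the record), here is a minimal plain-Scala
sketch. It does not use the RowSerializer from the gist or any Flink API.
The names WebClickOpt, NullAwareCsv, toOptInt and parse are hypothetical,
and modelling the missing value as Option[Int] is an assumption made for
this example only.

```scala
// Hypothetical record type: same columns as the WebClick case class in the
// question, but sales is Option[Int] so that an empty field can be kept.
case class WebClickOpt(clickDate: Long, clickTime: Long, sales: Option[Int],
                       item: Int, page: Int, user: Int)

object NullAwareCsv {
  // An empty (or whitespace-only) field becomes None; anything else must
  // parse as an Int, otherwise toInt throws NumberFormatException.
  def toOptInt(field: String): Option[Int] = {
    val t = field.trim
    if (t.isEmpty) None else Some(t.toInt)
  }

  // split with limit -1 keeps empty fields, including trailing ones,
  // so "a||b" yields three fields rather than two.
  def parse(line: String): WebClickOpt = {
    val f = line.split("\\|", -1)
    require(f.length == 6, s"expected 6 fields, got ${f.length}")
    WebClickOpt(f(0).toLong, f(1).toLong, toOptInt(f(2)),
                f(3).toInt, f(4).toInt, f(5).toInt)
  }

  def main(args: Array[String]): Unit = {
    val rows = Seq(
      "37794|24669||16705|23|54810",  // third column is empty
      "37794|24669|7|16705|23|54810"  // third column is present
    ).map(parse)

    // Stand-in for the "_sales == null" filter from the question.
    val missingSales = rows.filter(_.sales.isEmpty)
    println(missingSales)
  }
}
```

In a Flink job, a function like parse could presumably be applied to the
lines returned by env.readTextFile(...); whether Option fields are handled
well by a given Flink version is something to verify separately.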


On Fri, Oct 23, 2015 at 7:38 PM, Maximilian Michels <m...@apache.org> wrote:

> Hi Philip,
>
> How about making the empty field of type String? Then you can read the CSV
> into a DataSet and treat the empty string as a null value. Not very nice
> but a workaround. As of now, Flink deliberately doesn't support null values.
>
> Regards,
> Max
>
>
> On Thu, Oct 22, 2015 at 4:30 PM, Philip Lee <philjj...@gmail.com> wrote:
>
>> Hi,
>>
>> I am trying to load a dataset in which some values are null, using
>> readCsvFile().
>>
>> // e.g  _date|_click|_sales|_item|_web_page|_user
>>
>> import org.apache.flink.api.scala._
>>
>> case class WebClick(_click_date: Long, _click_time: Long, _sales: Int,
>>                     _item: Int, _page: Int, _user: Int)
>>
>> private def getWebClickDataSet(env: ExecutionEnvironment): DataSet[WebClick] = {
>>   env.readCsvFile[WebClick](
>>     webClickPath,
>>     fieldDelimiter = "|",
>>     includedFields = Array(0, 1, 2, 3, 4, 5)
>>     // , lenient = true
>>   )
>> }
>>
>>
>> I know there is an option to ignore malformed values, but I have to
>> read the dataset even when it contains null values.
>>
>> For example, a record whose third column is null looks like this:
>> 37794|24669||16705|23|54810
>> I need to read the null value as well, because I have to apply a
>> filter or where function on it ( _sales == null ).
>>
>> Do you have any detailed suggestions on how to do this?
>>
>> Thanks,
>> Philip
>>
>
>
