For a similar problem where we wanted to preserve and track null entries,
we load the CSV as a DataSet[Array[Object]] and then transform it into
DataSet[Row] using a custom RowSerializer
(https://gist.github.com/Shiti/d0572c089cc08654019c) which handles null.
The Table API (which supports null) can
s
>> well. In Java, TupleSerializer is responsible for, well, Tuples.
>>
>> On Tue, 16 Jun 2015 at 06:25 Shiti Saxena wrote:
>>
>>> Hi,
>>>
>>> Can I work on the issue with TupleSerializer or is someone working on it?
>>>
>>> On Mon, J
).head.productElement(0)
> assertEquals(total, 702)
> }
>
> it would have to be modified in a similar way to the PojoSerializer and
> RowSerializer. You could either leave the tests as they are now in your pull
> request or also modify the TupleSerializer. Both seem fine to me.
Row(2)
val amount = if (entry._1 < 100) null else entry._1
row.setField(0, amount)
row.setField(1, entry._2)
row
}
val total =
rowDataSet.toTable.select('id.sum).collect().head.productElement(0)
assertEquals(total, 702)
}
On Sun, Jun
"a"), (234, "b"),
> (345, "c"), (null, "d")).toTable
>
> I used Integer instead of Int because Scala will complain that null is not
> a valid value for Int otherwise.
>
> Cheers,
> Aljoscha
>
>
> On Sun, 14 Jun 2015 at 19:34 Aljoscha
taSet[Row] is then converted into
Table. Should I use the same approach for the test case?
Thanks,
Shiti
On Sun, Jun 14, 2015 at 4:10 PM, Shiti Saxena wrote:
> I'll do the fix
>
> On Sun, Jun 14, 2015 at 12:42 AM, Aljoscha Krettek
> wrote:
>
>> I merged you
> explanations.
>>
>> On Thu, 11 Jun 2015 at 09:33 Till Rohrmann wrote:
>>
>>> Hi Shiti,
>>>
>>> here is the issue [1].
>>>
>>> Cheers,
>>> Till
>>>
>>> [1] https://issues.apache.org/jira/browse/FLINK-2203
at the RowSerializer does not support
> null-values. I think we can add support for this, I will open a Jira issue.
>
> Another problem I then see is that the aggregations cannot properly deal
> with null-values. This would need separate support.
>
> Regards,
> Aljoscha
>
> On Thu,
Hi,
In our project, we are using the Flink Table API and are facing the
following issues:
We load data from a CSV file and create a DataSet[Row]. The CSV file can
also have invalid entries in some of the fields which we replace with null
when building the DataSet[Row].
This DataSet[Row] is later