Hi everyone,

I'm seeing a lot of null-value-related pull requests lately, like these:

https://github.com/apache/flink/pull/780
https://github.com/apache/flink/pull/831
https://github.com/apache/flink/pull/834

It used to be the case that null values were simply not supported by Flink.
Recently, Flink has started to support null values in some components. Now
I'm wondering what the current state of null values in Flink is. While
ignoring null values might be a good way to keep your programs from
crashing, null values are
generally a bad way of signaling empty values for which better strategies
are available. My intuition would be that it is a bit evil to support them
in DataSets.

Just to give an idea of what null values could cause in Flink:
DataSet.count() returns the number of all elements in a DataSet (null or
not), while #834 would ignore null values and aggregate the DataSet without
them, so a count and an aggregate over the same DataSet could disagree
about how many elements there are.
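
To make the mismatch concrete, here is a minimal sketch in plain Java. It
does not use the Flink API at all; the class and method names are made up
for illustration, and a java.util.List stands in for a DataSet. It only
assumes the two behaviors described above: a count that includes nulls and
an aggregation that skips them.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical illustration only: a List stands in for a DataSet.
public class NullAggregationSketch {

    // Counts every element, null or not (the count() behavior described above).
    public static long countAll(List<Integer> values) {
        return values.size();
    }

    // Sums only the non-null elements (the null-skipping behavior described for #834).
    public static long sumIgnoringNulls(List<Integer> values) {
        return values.stream()
                .filter(Objects::nonNull)
                .mapToLong(Integer::longValue)
                .sum();
    }

    // Counts only the non-null elements, i.e. what the sum was actually computed over.
    public static long countNonNull(List<Integer> values) {
        return values.stream().filter(Objects::nonNull).count();
    }

    public static void main(String[] args) {
        // Arrays.asList permits null elements (List.of would throw).
        List<Integer> values = Arrays.asList(1, null, 3, null);

        long count = countAll(values);
        long sum = sumIgnoringNulls(values);

        System.out.println("count (with nulls)    = " + count);
        System.out.println("sum (nulls skipped)   = " + sum);
        // Dividing the null-skipping sum by the null-including count
        // gives a different mean than dividing by the non-null count.
        System.out.println("mean using count      = " + (double) sum / count);
        System.out.println("mean over non-null    = " + (double) sum / countNonNull(values));
    }
}
```

With the sample input the two "means" differ (1.0 versus 2.0), which is the
kind of silent inconsistency I am worried about.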

Best,
Max
