Re: The null in Flink

Stephan Ewen Wed, 17 Jun 2015 17:43:45 -0700

Hi!

I think we actually have two discussions here, both of them important:

--------------------------------------------------------------
1) Null values in the Programming Language APIs
--------------------------------------------------------------

Fields in composite types may simply be null pointers.

In object types:
  - primitive members are naturally non-nullable
  - all other members are nullable

=> If you want to avoid the overhead of nullability, go with primitive
types.
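
As a minimal sketch of the distinction above (the class and field names are
hypothetical, not taken from Flink): primitive members always carry a value,
while reference-typed members may be null and need a check.

```java
// Hypothetical object type for illustration only; not a Flink class.
class SensorReading {
    public int count;          // primitive: can never be null, defaults to 0
    public double value;       // primitive: can never be null, defaults to 0.0
    public Integer boxedCount; // reference type: may be null
    public String label;       // reference type: may be null
}

class NullabilityDemo {
    // Returns the label length, treating a null label as empty.
    static int labelLength(SensorReading r) {
        return r.label == null ? 0 : r.label.length();
    }
}
```

A freshly constructed SensorReading has count == 0 but boxedCount == null,
which is the nullability overhead the primitive fields avoid.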

In Tuples and derived types (Scala case classes):
  - Fields are non-nullable.

=> The reason here is that we initially decided to keep tuples as a very
fast data type. Because tuples cannot hold primitives in Java/Scala, we
would not have a way to make fast non-nullable fields. The performance of
nullable fields affects the key-operations, especially on normalized keys.
We can work around that with some effort, but have not done it so far.

=> In Scala, the Option type is a natural way of elegantly working around
that.
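
The same idea can be sketched in Java terms, using java.util.Optional as a
stand-in for Scala's Option. This is an assumption for illustration only: the
Pair class below is hypothetical and is not Flink's Tuple2, and nothing here
says how Flink would serialize such a field. The point is just that the field
itself is never null; a missing value is encoded explicitly.

```java
import java.util.Objects;
import java.util.Optional;

// Hypothetical two-field tuple, not Flink's Tuple2: both fields reject null.
final class Pair<A, B> {
    final A first;
    final B second;

    Pair(A first, B second) {
        this.first = Objects.requireNonNull(first, "first");
        this.second = Objects.requireNonNull(second, "second");
    }
}

class OptionalFieldDemo {
    // A missing name is encoded as Optional.empty(), so the field is never null.
    static Pair<Integer, Optional<String>> record(int id, String nameOrNull) {
        return new Pair<>(id, Optional.ofNullable(nameOrNull));
    }
}
```

Calling record(1, null) yields a pair whose second field is an empty
Optional rather than a null pointer.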

--------------------------------------------------------------
2) Null values in the high-level (logical) APIs
--------------------------------------------------------------

This is mainly what Ted was referring to, if I understood him correctly.

Here, we need to figure out what form of semantic null values to support in
the Table API and, later, in SQL.

Besides deciding what semantics to follow here in the logical APIs, we need
to decide what these values convert to/from when switching between
logical/physical APIs.

On Mon, Jun 15, 2015 at 10:07 AM, Ted Dunning <ted.dunn...@gmail.com> wrote:

> On Mon, Jun 15, 2015 at 8:45 AM, Maximilian Michels <m...@apache.org>
> wrote:
>
> > Just to give an idea what null values could cause in Flink:
> > DataSet.count() returns the number of elements of all values in a
> > Dataset (null or not) while #834 would ignore null values and aggregate
> > the DataSet without them.
> >
>
> Compare R's na.action.
>
> http://www.ats.ucla.edu/stat/r/faq/missing.htm
>
