To paraphrase a great poet, "mo' templates, mo' problems". I agree that some theoretical benefits may be reaped in exchange for significantly higher code complexity and likely lower productivity for both developers and users of the library. We would need to see a pragmatic argument for why the whole library should be made much more complex in exchange for compile-time benefits in a small portion of the code.
Probably the biggest issue I see with this is the combinatorial explosion
of generated code. For example, let's consider the array function

Take(T, Integer) -> T

(for example, numpy.ndarray.take). If you introduce nullable types, rather
than generating one variant for each type T and integer type, you need 4:

Take(T, Int) -> T
Take(T, NullableInt) -> NullableT (indices have nulls) or T (indices have no nulls)
Take(NullableT, Int) -> NullableT
Take(NullableT, NullableInt) -> NullableT

If you add to this the fact that any nullable index type may not have any
nulls, you actually have more than 4 branches of logic to consider.

In Java, this would be less of a concern, because all functions are
effectively virtual (dynamic dispatch overhead is something the JIT largely
takes care of), but in C++ using virtual functions to make the arrays more
"dynamic" (i.e. using NullableT or T in the same code path) would not yield
acceptable performance.

thanks,
Wes

On Fri, Feb 26, 2016 at 6:01 AM, Daniel Robinson <danrobinson...@gmail.com> wrote:
> In C++ at least, I think making Nullable<T> a template (see here:
> https://github.com/danrobinson/arrow-demo/blob/master/types.h) would make
> it easier to both define arbitrary classes and to write parsers that take
> advantage of template specialization.
>
> Re: 2 vs 3 code paths: handling the null-skipping case can be a single line
> of code at the start of the function (which could be reduced to a macro):
>
> if (arr.null_count() == 0) return ALGORITHM_NAME(arr.child_array());
>
> And it seems like good practice anyway to separate the null_count=0 code
> into a separate function.
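
To make the variant explosion Wes describes above concrete, here is a
minimal C++ sketch. The Array<T> and NullableArray<T> aliases (built on
std::vector and std::optional) are hypothetical stand-ins, not the actual
Arrow memory layout or API; the point is only that each parameter that may
be nullable doubles the number of overloads the templates must generate,
before even considering a null_count == 0 fast path.

    #include <cstdint>
    #include <iostream>
    #include <optional>
    #include <string>
    #include <vector>

    // Hypothetical stand-ins: a non-nullable array, and a nullable array
    // whose null slots are represented by std::nullopt.
    template <typename T>
    using Array = std::vector<T>;

    template <typename T>
    using NullableArray = std::vector<std::optional<T>>;

    // Variant 1: Take(T, Int) -> T
    template <typename T>
    Array<T> Take(const Array<T>& values, const Array<int64_t>& indices) {
      Array<T> out;
      for (int64_t i : indices) out.push_back(values[i]);
      return out;
    }

    // Variant 2: Take(T, NullableInt) -> NullableT
    // (a null index yields a null output slot)
    template <typename T>
    NullableArray<T> Take(const Array<T>& values,
                          const NullableArray<int64_t>& indices) {
      NullableArray<T> out;
      for (const auto& i : indices)
        out.push_back(i ? std::optional<T>(values[*i]) : std::nullopt);
      return out;
    }

    // Variant 3: Take(NullableT, Int) -> NullableT
    template <typename T>
    NullableArray<T> Take(const NullableArray<T>& values,
                          const Array<int64_t>& indices) {
      NullableArray<T> out;
      for (int64_t i : indices) out.push_back(values[i]);
      return out;
    }

    // Variant 4: Take(NullableT, NullableInt) -> NullableT
    template <typename T>
    NullableArray<T> Take(const NullableArray<T>& values,
                          const NullableArray<int64_t>& indices) {
      NullableArray<T> out;
      for (const auto& i : indices)
        out.push_back(i ? values[*i] : std::nullopt);
      return out;
    }

    int main() {
      Array<double> values = {1.5, 2.5, 3.5};
      NullableArray<int64_t> indices = {2, std::nullopt, 0};
      auto taken = Take(values, indices);  // resolves to variant 2
      for (const auto& v : taken)
        std::cout << (v ? std::to_string(*v) : "null") << " ";
      std::cout << "\n";  // prints: 3.500000 null 1.500000
    }

With, say, a dozen element types and a couple of index widths, these four
overloads already multiply into dozens of instantiations for a single
kernel; that code-size and complexity cost is what is being weighed against
the compile-time null handling the templated design would buy.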