On Fri, 22 Feb 2019 at 10:38, Stephen Connolly <
stephen.alan.conno...@gmail.com> wrote:

>
>
> On Fri, 22 Feb 2019 at 10:16, Stephen Connolly <
> stephen.alan.conno...@gmail.com> wrote:
>
>> On Thu, 21 Feb 2019 at 18:29, Frank Grimes <frankgrime...@yahoo.com>
>> wrote:
>>
>>> Hi,
>>>
>>> I've recently started to evaluate Flink and have found it odd that its
>>> Tuple types, while Serializable, don't implement java.lang.Comparable.
>>> This means that I either need to provide an KeySelector for many
>>> operations or subtype the Tuple types and provide my own implementation of
>>> compareTo for each.
>>>
>>> Is there a technical reason why this was omitted?
>>>
>>
>> A Tuple1<T0> would need to have an implementation of the comparable
>> contract in order to be comparable. The only way I can see of writing that
>> would be if we had:
>>
>> Tuple1<T0 extends Comparable<? super T0>>
>>
>> Similarly you'd have
>>
>> Tuple25<T0 extends Comparable<? super T0>, T1 extends Comparable<? super
>> T1>, T2 extends Comparable<? super T2>, ..., T24 extends Comparable<? super
>> T24>>
>>
>> Which is a lot less screen friendly than say
>>
>> Tuple25<T0, T1, T2, ..., T24>
>>
>> So from a purely code readability PoV you would need to at least replace
>> the  Comparable<? super T> with Comparable<T> with the immediate reduction
>> of utility... or perhaps you would remove the generics entirely and go for
>> just Comparable and have the implementation just follow the guarantee...
>> except javac will error out for the erasure.
>>
>> Now jOOQ just ignores the comparability contract on the generic types,
>> e.g.
>> https://github.com/jOOQ/jOOL/blob/master/jOOL/src/main/java/org/jooq/lambda/tuple/Tuple1.java#L35
>>
>> The result is that when asked to compare you get a cast that will work or
>> fail:
>> https://github.com/jOOQ/jOOL/blob/master/jOOL/src/main/java/org/jooq/lambda/tuple/Tuples.java#L31
>>
>> So jOOQ is following the "fail at runtime" mode... you can make tuples of
>> non-comparable objects and only at runtime if you try to compare them will
>> it blow up.
>>
>> Flink is following the "fail at compile time" mode... if you try to
>> compare a tuple, you need to provide a way to compare them.
>>
>> Flink does a lot of work - from what I can see - to fail early and fast.
>> For example, if type inference fails on the dataflow plan, it will error
>> out fast. I would hate to have a Tuple3<A,B,Y> where A and B are both
>> Comparable and Y is not... my tests happened to have a small subset of A
>> and B values that never resulted in a comparison of Y values... so the
>> tests didn't fail... but now on the production environment... at runtime...
>> 1 month later in the middle of my vacation I get a call because the
>> topology is stuck and failing on (a,b,y1).compareTo((a,b,y2))
>>
>> So my analysis would say it is not a technical reason but rather a
>> principal reason of "fail fast"
>>
>
> Idea: you could fake comparable for things that are not comparable.
>
> We know the two objects are serializable, so we could just do a byte by
> byte comparison of their serialized representations if they do not
> implement Comparable... would be slower, would probably not be the
> comparison result that people expect... but it would be consistent and not
> error out.
>
> public int compare(Serializable a,Serializable b) {
>   byte[] b1;
>   try (ByteArrayOutputStream bos = new ByteArrayOutputStream();
> ObjectOutputStream oos = new ObjectOutputStream(bos)) {
>     oos.writeObject(a);
>     b1 = bos.toByteArray();
>   }
>   byte[] b2;
>   try (ByteArrayOutputStream bos = new ByteArrayOutputStream();
> ObjectOutputStream oos = new ObjectOutputStream(bos)) {
>     oos.writeObject(a);
>     b2 = bos.toByteArray();
>   }
>   for (int i = 0; i < Math.min(b1.length,b2.length); i++) {
>     int r = Byte.compare(b1[i],b2[i]);
>     if (r != 0) {
>       return r;
>     }
>   }
>   return Integer.compare(b1.length,b2.length);
> }
>

Hmmm there is nothing stopping you from having the above as the basis for a
utility universal comparison method that you would just pass by method
handle... or if you want the jOOQ behaviour just go with

public class TupleComparator implements Comparator<Tuple> {
  ...
  // do the class cast here and let it blow up at runtime
}


>
> is an example of the kind of algorithm. The idea being that all you really
> want is a consistent comparison because they do not implement Comparable...
> if somebody had a useful comparison contract for objects of that type then
> presumably they would already have implemented the comparability contract.
>
> Another approach that could work would be to use Tuple as a builder for
> ComparableTuple by giving each one a comparable method
>
> public class Tuple2<T0,T1> ... {
>   ...
>   public ComparableTuple2<T0,T1> comparable(Comparator<? super T0> c0,
> Comparator<? super T1> c1) {
>     ...
>   }
>   ...
> }
>
> public class ComparableTuple2<T0,T1> extends ComparableTuple {
>   // notice no bounds on the generic type arguments
>   ...
>   // naïve storage (reduces GC pressure as the builder is not wasted)
>   // would need analysis to determine if better to allow the builder
> allocation
>   // to be elided by JVM and store the values directly as f0 and f1
>   private final Tuple2<T0,T1> tuple;
>   private final Comparator<? super T0> c0;
>   private final Comparator<? super T1>> c1;
>
>   ...
> }
>
> That way you get ComparableTuple2 can ignore the type bounds on its
> generic arguments because the instantiation path guarantees a comparison
> can be made.
>
> ComparableTuple2<User,Page> t =
> Tuple2.of(a,b).comparable(User::byLastName,Page::byLastAcccessed)
>
> That seems very nice and readable...
>

Thinking more, this could have negative implications on the savepoint data
sets and savepoint restore on different dataflows


>
>
>>
>
>>
>>
>>>
>>> For example, the JOOQ/JOOL Tuple types all implement Comparable:
>>>
>>> https://github.com/jOOQ/jOOL/blob/master/jOOL-java-8/src/main/java/org/jooq/lambda/tuple/Tuple2.java#L39
>>>
>>> As an aside, I tried replacing usage of Flink's Tuple types with the
>>> JOOL ones but they caused a StackOverflowError similar to this one:
>>> https://issues.apache.org/jira/browse/FLINK-3922
>>>
>>> Thanks,
>>>
>>> Frank Grimes
>>>
>>>

Reply via email to