Hi Greg!

Sounds very interesting.

Do you have a hunch which "virtual" Tuple methods are being used that become
less JIT-able? In many cases, tuples are used only through field accesses
(like "value.f1") in the user functions.

I have to dig into the serializers to see whether they could suffer from
that. The "getField(pos)" method, for example, should always have many
overrides (though only a few would be loaded at any time, because one
usually does not use all Tuple classes at the same time).
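
To make the distinction concrete, here is a rough sketch of the two access
styles. The classes MiniTuple, MiniTuple1 and MiniTuple2 are hypothetical
stand-ins written for this mail, not Flink's actual Tuple classes, and the
sketch only shows where a virtual call is involved; it does not measure
anything.

```java
// Hypothetical stand-ins for a Tuple hierarchy (assumption, not Flink code).
abstract class MiniTuple {
    // Virtual method with one override per tuple arity.
    abstract Object getField(int pos);
}

class MiniTuple1 extends MiniTuple {
    long f0;

    MiniTuple1(long f0) {
        this.f0 = f0;
    }

    @Override
    Object getField(int pos) {
        if (pos == 0) {
            return f0;
        }
        throw new IndexOutOfBoundsException(String.valueOf(pos));
    }
}

class MiniTuple2 extends MiniTuple {
    long f0;
    long f1;

    MiniTuple2(long f0, long f1) {
        this.f0 = f0;
        this.f1 = f1;
    }

    @Override
    Object getField(int pos) {
        switch (pos) {
            case 0: return f0;
            case 1: return f1;
            default: throw new IndexOutOfBoundsException(String.valueOf(pos));
        }
    }
}

class TupleAccessSketch {
    // Style typical of user functions: a plain field read, no method dispatch.
    static long sumDirect(MiniTuple2[] tuples) {
        long sum = 0;
        for (MiniTuple2 t : tuples) {
            sum += t.f1;
        }
        return sum;
    }

    // Style a generic component might use: a virtual call whose target
    // depends on the runtime class of each element.
    static long sumViaGetField(MiniTuple[] tuples, int pos) {
        long sum = 0;
        for (MiniTuple t : tuples) {
            sum += (Long) t.getField(pos);
        }
        return sum;
    }

    public static void main(String[] args) {
        MiniTuple2[] pairs = { new MiniTuple2(1L, 2L), new MiniTuple2(3L, 4L) };
        MiniTuple[] mixed = { new MiniTuple1(5L), new MiniTuple2(7L, 8L) };
        System.out.println(sumDirect(pairs));
        System.out.println(sumViaGetField(pairs, 1));
        System.out.println(sumViaGetField(mixed, 0));
    }
}
```

In the second method the call site sees more than one receiver class as soon
as different tuple classes flow through it, which is the kind of place I
would look at first.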

Greetings,
Stephan


On Fri, Mar 4, 2016 at 11:37 PM, Greg Hogan <c...@greghogan.com> wrote:

> I am noticing what looks like the same drop-off in performance when
> introducing TupleN subclasses as expressed in "Understanding the JIT and
> tuning the implementation" [1].
>
> I start my single-node cluster, run an algorithm which relies purely on
> Tuples, and measure the runtime. I execute a separate jar which executes
> essentially the same algorithm but using Gelly's Edge (which subclasses
> Tuple3 but does not add any extra fields) and now both the Tuple and Edge
> algorithms take twice as long.
>
> Has this been previously discussed? If not I can work up a demonstration.
>
> [1] https://flink.apache.org/news/2015/09/16/off-heap-memory.html
>
> Greg
>
