On Thu, Feb 9, 2017 at 3:50 PM, pat.chormai <pat.chor...@gmail.com> wrote:

> Hi Greg,
>
> Thanks for your feedback. I would like to answer your questions here.
>
> Q: Do I understand correctly that the generated code is only dependent on
> the
> length of the sort key?
>
> A: Yes, you're right. The generated code is mainly dependent on the length
> of the sort key. However, I'm not sure whether we can reuse the generated
> code for 2 different types of key even they have the same length.
>

The only polymorphic callsite is NormalizedKeySorter's call to
TypeComparator.compareSerialized(DataInputView, DataInputView).


> Q: How do you propose computing indexEntriesPerSegment? Is the improved
> performance not dependent on this value being a power-of-2?
>
> A: This improvement doesn't change the way indexEntriesPerSegment is
> computed. What we did here is replacing indexEntriesPerSegment at
> expressions that indexEntriesPerSegment is a divisor with its value [1]. As
> a result, it does not require value of indexEntriesPerSegment to be an
> exponent of 2. More information related to this technique can be found at
> [2].
>

The example for dividing by 5 requires 2 multiples, 6 shifts, and 6
adds/subtracts. Then another multiple and subtract to get the remainder.
FLINK-3722 uses one or two adds/subtracts and a skewed branch prediction.


> Q: Are you able to compare with FLINK-3722?
>
> A: We're planning to do that as well, however, it might take some time due
> to exams.
>
> Also, could you please elaborate more why DividedByConstant and
> UsingBitwiseOperators use only half of sort memory when
> indexEntriesPerSegment is power of 2?
>

If the key size is one more than a power-of-2 then it looks like those
implementations are padding out to the next higher power-of-2 (e.g., key
size of 9 bytes needs 18 bytes but will consume 32 bytes).

[1]
> https://github.com/heytitle/flink-sorter-performance-
> evaluation/blob/master/src/main/java/org/evaluation/sorter/individual/
> optimization/DividedByConstant.java#L353
> [2]
> https://blogs.msdn.microsoft.com/devdev/2005/12/12/integer-
> division-by-constants/
>
>
>
> --
> View this message in context: http://apache-flink-mailing-
> list-archive.1008284.n3.nabble.com/FLINK-5734-Code-Generation-for-
> NormalizedKeySorter-tp15804p15860.html
> Sent from the Apache Flink Mailing List archive. mailing list archive at
> Nabble.com.
>

Reply via email to