On Thu, Feb 9, 2017 at 3:50 PM, pat.chormai <pat.chor...@gmail.com> wrote:
> Hi Greg, > > Thanks for your feedback. I would like to answer your questions here. > > Q: Do I understand correctly that the generated code is only dependent on > the > length of the sort key? > > A: Yes, you're right. The generated code is mainly dependent on the length > of the sort key. However, I'm not sure whether we can reuse the generated > code for 2 different types of key even they have the same length. > The only polymorphic callsite is NormalizedKeySorter's call to TypeComparator.compareSerialized(DataInputView, DataInputView). > Q: How do you propose computing indexEntriesPerSegment? Is the improved > performance not dependent on this value being a power-of-2? > > A: This improvement doesn't change the way indexEntriesPerSegment is > computed. What we did here is replacing indexEntriesPerSegment at > expressions that indexEntriesPerSegment is a divisor with its value [1]. As > a result, it does not require value of indexEntriesPerSegment to be an > exponent of 2. More information related to this technique can be found at > [2]. > The example for dividing by 5 requires 2 multiples, 6 shifts, and 6 adds/subtracts. Then another multiple and subtract to get the remainder. FLINK-3722 uses one or two adds/subtracts and a skewed branch prediction. > Q: Are you able to compare with FLINK-3722? > > A: We're planning to do that as well, however, it might take some time due > to exams. > > Also, could you please elaborate more why DividedByConstant and > UsingBitwiseOperators use only half of sort memory when > indexEntriesPerSegment is power of 2? > If the key size is one more than a power-of-2 then it looks like those implementations are padding out to the next higher power-of-2 (e.g., key size of 9 bytes needs 18 bytes but will consume 32 bytes). [1] > https://github.com/heytitle/flink-sorter-performance- > evaluation/blob/master/src/main/java/org/evaluation/sorter/individual/ > optimization/DividedByConstant.java#L353 > [2] > https://blogs.msdn.microsoft.com/devdev/2005/12/12/integer- > division-by-constants/ > > > > -- > View this message in context: http://apache-flink-mailing- > list-archive.1008284.n3.nabble.com/FLINK-5734-Code-Generation-for- > NormalizedKeySorter-tp15804p15860.html > Sent from the Apache Flink Mailing List archive. mailing list archive at > Nabble.com. >