Hi All, Hope you all are keeping well in these COVID times.
I am using KLL Sketch for my project where I need to compute approx percentile over a stream. I have gone through the paper Streaming Quantiles Algorithms with Small Space and Update Time (arxiv.org) <https://arxiv.org/pdf/1907.00236.pdf>. There in the last section in appendix Figure 11, it has been mentioned that a significant *amount of time is spend in sorting* after sketch becomes full and, also the figure shows with sort the update time goes to *50ns( Opposed to 20ns without sorting)*. I was wondering if there has been an effort to improve on this using vectorized instruction (leveragin AVX2, AVX512 instructions) even if it works just for integers & floats. There are Sorts like Bitonic Sort which can reduce sorting time with AVX instructions for smaller arrays, as they use sorting networks. My Question to team is : Q1. Some guidance/pointers on any KLL Sketch vectorization effort in this direction if any? (Specially to reduce the sort time, even if it works just for integer & float sketches) I tried to optimize sketch further for integer streams. It was evident in profiling that 50% of time is spent in sorting. Sorting suffered mostly because of branch misprediction. Q2. Do you think any such vectorization effort will be better without lazy compaction? As the level boundaries will be fixed, which might help while loading/storing data into vector registers. -- Regards, Gourav