Hi Gian,

Thanks for bringing this up. IMO, for the long run, and given how much code will have to change, it makes more sense to rely on the JDK-based API from JEP 370 and do this work once, as opposed to over multiple iterations. FYI, I do not think it is that far away; there seems to be good momentum around it. This does not exclude, or mean we should not use, the Memory API for other things like sketches et al. In fact, I think even for a project like DataSketches it makes more sense to move to the newer API offered by the JDK rather than do it yourself.
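For reference, this is roughly what off-heap access looks like under JEP 370. A minimal sketch, adapted from the examples in the JEP itself; the API is still incubating (jdk.incubator.foreign in the JDK 14 early-access builds), so names and signatures may change:

    import java.lang.invoke.VarHandle;
    import java.nio.ByteOrder;
    import jdk.incubator.foreign.MemoryAddress;
    import jdk.incubator.foreign.MemoryHandles;
    import jdk.incubator.foreign.MemorySegment;

    public class ForeignMemoryExample {
        public static void main(String[] args) {
            // VarHandle that views memory as ints, with bounds and liveness checks.
            VarHandle intHandle =
                MemoryHandles.varHandle(int.class, ByteOrder.nativeOrder());
            // Allocate 100 bytes off-heap; try-with-resources frees them
            // deterministically when the segment closes. No Unsafe, no cleaner hacks.
            try (MemorySegment segment = MemorySegment.allocateNative(100)) {
                MemoryAddress base = segment.baseAddress();
                for (int i = 0; i < 25; i++) {
                    // Checked int write at byte offset i * 4.
                    intHandle.set(base.addOffset(i * 4L), i);
                }
            }
        }
    }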
On Tue, Feb 4, 2020 at 10:12 PM Gian Merlino <g...@apache.org> wrote:

> Hey Druids,
>
> There has generally been a lot of talk about moving away from ByteBuffer
> and towards the DataSketches Memory package (
> https://datasketches.apache.org/docs/Memory/MemoryPackage.html) or even
> using Unsafe directly. Much of that discussion happened on
> https://github.com/apache/druid/issues/3892.
>
> Recently a patch was merged that added datasketches-memory as a dependency
> of druid-processing: https://github.com/apache/druid/pull/9308. The reason
> was partially due to better performance and partially due to nicer API
> (both reasons mentioned in #3892 as well).
>
> JEP 370 is a potential long term solution but it seems a while away from
> being ready: https://openjdk.java.net/jeps/370
>
> I wanted to bring the larger discussion back up and see what people think
> is a good path forward.
>
> My suggestion is that we migrate the VectorAggregator interface to use
> Memory, but keep BufferAggregator the way it is. That way, as we build out
> support for vectorization (right now, only timeseries/groupby support it,
> and only a few aggregators, but we should be building this out) we'll be
> doing it with a nicer and potentially faster API. But we won't need to go
> back and redo a bunch of old code, since we'll keep the non-vectorized code
> paths the way they are. (And hopefully, one day, delete them all outright.)
>
> Gian
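To make the trade-off concrete: below is a rough sketch of the same long-sum aggregation step written against ByteBuffer and against DataSketches WritableMemory. The method shapes are hypothetical, for illustration only, not the actual BufferAggregator/VectorAggregator signatures; the point is the long-addressed, stateless accessor style that Memory gives us:

    import java.nio.ByteBuffer;
    import org.apache.datasketches.memory.WritableMemory;

    class LongSumSketch {
        // Today's BufferAggregator-style body: absolute int-indexed ByteBuffer access.
        static void aggregate(ByteBuffer buf, int position, long value) {
            buf.putLong(position, buf.getLong(position) + value);
        }

        // A Memory-based body: long byte offsets, no position/limit state to
        // manage, and the library is free to go through Unsafe internally.
        static void aggregate(WritableMemory mem, long position, long value) {
            mem.putLong(position, mem.getLong(position) + value);
        }
    }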