I wonder if Memory itself could be that layer?

On Wed, Feb 5, 2020 at 10:03 AM Jad Naous <jad.na...@imply.io> wrote:

> We could build an abstraction layer on top of the memory interface provided
> by DataSketches. When the JDK gets the new stuff, we can just change the
> implementation of the abstraction.
>
> On Wed, Feb 5, 2020 at 9:43 AM Gian Merlino <g...@apache.org> wrote:
>
> > The thing that worries me about JEP 370 is that if historical Java user
> > migration patterns hold up, we will need to support Java 11 for a while
> > (probably another 2–3 years), and we would therefore need to wait that
> long
> > to use JEP 370. It seems like a long time and until then we would be
> stuck
> > with a pretty inferior API.
> >
> > I also would prefer not having to rewrite code a bunch of times, but
> that's
> > why I suggested starting by using Memory for the VectorAggregator
> interface
> > and stuff that interacts with it. There isn't that much code there yet
> > (only a few aggregators implement VectorAggregator). So we will need to
> > write most of it for the first time, and since it is fresh code, I think
> > it'd be nice to use the best API currently available in Java 8 / 11. From
> > what I see that is Memory.
> >
> > On Wed, Feb 5, 2020 at 9:21 AM Slim Bouguerra <bs...@apache.org> wrote:
> >
> > > Hi Gian,
> > > Thanks for bringing this up.
> > > IMO for the long run and looking at how much code will have to change,
> it
> > > makes more sense to rely on JDK based API JEP 370 and have this work
> done
> > > ONCE as oppose to multiple iteration. FYI i do not think it is far
> away,
> > > seems like there is a good momentum around it.
> > > This does not exclude or means we should not use Memory API for other
> > stuff
> > > like sketches et al, in fact i think even for project like Sketches it
> > > makes more sense to move to newer API offered by the JDK rather that do
> > it
> > > your self.
> > >
> > >
> > > On Tue, Feb 4, 2020 at 10:12 PM Gian Merlino <g...@apache.org> wrote:
> > >
> > > > Hey Druids,
> > > >
> > > > There has generally been a lot of talk about moving away from
> > ByteBuffer
> > > > and towards the DataSketches Memory package (
> > > > https://datasketches.apache.org/docs/Memory/MemoryPackage.html) or
> > even
> > > > using Unsafe directly. Much of that discussion happened on
> > > > https://github.com/apache/druid/issues/3892.
> > > >
> > > > Recently a patch was merged that added datasketches-memory as a
> > > dependency
> > > > of druid-processing: https://github.com/apache/druid/pull/9308. The
> > > reason
> > > > was partially due to better performance and partially due to nicer
> API
> > > > (both reasons mentioned in #3892 as well).
> > > >
> > > > JEP 370 is a potential long term solution but it seems a while away
> > from
> > > > being ready: https://openjdk.java.net/jeps/370
> > > >
> > > > I wanted to bring the larger discussion back up and see what people
> > think
> > > > is a good path forward.
> > > >
> > > > My suggestion is that we migrate the VectorAggregator interface to
> use
> > > > Memory, but keep BufferAggregator the way it is. That way, as we
> build
> > > out
> > > > support for vectorization (right now, only timeseries/groupby support
> > it,
> > > > and only a few aggregators, but we should be building this out) we'll
> > be
> > > > doing it with a nicer and potentially faster API. But we won't need
> to
> > go
> > > > back and redo a bunch of old code, since we'll keep the
> non-vectorized
> > > code
> > > > paths the way they are. (And hopefully, one day, delete them all
> > > outright.)
> > > >
> > > > Gian
> > > >
> > >
> >
>

Reply via email to