Thanks a lot, everyone, for your comments. Sorry, I meant to ask about
adding transaction/update/append functionality to the Dataset API, but it
seems that would duplicate work already done in Apache Iceberg. The only
problem with Iceberg and Delta Lake is that they are heavily tied to the JVM
ecosystem, which makes them difficult to integrate with backends that have
C++-based storage interfaces.

On Sat, Sep 10, 2022 at 1:39 AM Weston Pace <weston.p...@gmail.com> wrote:

> I'd agree with Micah.  I'm also not aware of anyone working on this.
> The docs clarify a bit more on the details[1].  I think we'd need a
> bit more thinking around an "update/append" workflow too.
>
> That being said, updates, transactions, and appends are something that
> the Iceberg project has thought a lot about.  Rather than reinvent the
> wheel I think it'd be interesting to see if Acero could be used on the
> read path of an Iceberg workflow.  I have not really planned out what
> that would look like in great detail and, at a minimum, you'd maybe
> want some kind of Iceberg -> Substrait planner.
>
> [1]
> https://arrow.apache.org/docs/python/dataset.html#a-note-on-transactions-acid-guarantees
>
> On Fri, Sep 9, 2022 at 12:06 PM Micah Kornfield <emkornfi...@gmail.com>
> wrote:
> >
> > I would think any transaction concerns would live at the peripheries?
> > e.g. the Datasets?  Or at least that is where compatibility would have
> > to be built first.
> >
> > On Fri, Sep 9, 2022 at 12:01 PM Sasha Krassovsky
> > <krassovskysa...@gmail.com> wrote:
> >
> > > Hi Jayjeet,
> > > Transactions are currently out of scope for Acero - Acero is only
> > > meant to be a query execution engine. That said, it can definitely be
> > > used as a component for building a full database engine, which could
> > > implement its own locking of rows while Acero executes on them. You
> > > could also check out DuckDB, which can operate on Arrow data and also
> > > supports transactions.
> > >
> > > Sasha
> > >
> > > > On Sep 9, 2022, at 11:54, Jayjeet Chakraborty
> > > > <jayjeetchakrabort...@gmail.com> wrote:
> > > >
> > > > Hi Arrow Community,
> > > >
> > > > Since Acero is developing very fast into a full-fledged compute
> > > > engine, are there plans to add transaction semantics to Acero, so
> > > > that it can also be used as a database layer over already supported
> > > > storage backends? What I am referring to is a Delta Lake/Iceberg kind
> > > > of interface over Acero in C++. Thanks.
> > > >
> > > >
> > > > --
> > > > *Jayjeet Chakraborty*
> > > > CS PhD student
> > > > UC Santa Cruz
> > > > California, USA
> > >
>


-- 
*Jayjeet Chakraborty*
CS PhD student
UC Santa Cruz
California, USA
