Thanks a lot, everyone, for your comments. Sorry, I meant to ask about adding transaction/update/append functionality to the Dataset API, but it seems that would duplicate work already done in Apache Iceberg. The only problem with Iceberg and Delta Lake is that they are heavily tied to the JVM ecosystem, which makes them difficult to integrate with backends that have C++-based storage interfaces.

On Sat, Sep 10, 2022 at 1:39 AM Weston Pace <weston.p...@gmail.com> wrote:

> I'd agree with Micah. I'm also not aware of anyone working on this.
> The docs clarify a bit more on the details[1]. I think we'd need a
> bit more thinking around an "update/append" workflow too.
>
> That being said, updates, transactions, and appends are something that
> the Iceberg project has thought a lot about. Rather than reinvent the
> wheel I think it'd be interesting to see if Acero could be used on the
> read path of an Iceberg workflow. I have not really planned out what
> that would look like in great detail and, at a minimum, you'd maybe
> want some kind of Iceberg -> Substrait planner.
>
> [1]
> https://arrow.apache.org/docs/python/dataset.html#a-note-on-transactions-acid-guarantees
>
> On Fri, Sep 9, 2022 at 12:06 PM Micah Kornfield <emkornfi...@gmail.com>
> wrote:
> >
> > I would think any transaction concerns would live at the peripheries?
> > e.g. the Datasets? Or at least that is where compatibility would have
> > to be built first.
> >
> > On Fri, Sep 9, 2022 at 12:01 PM Sasha Krassovsky
> > <krassovskysa...@gmail.com> wrote:
> >
> > > Hi Jayjeet,
> > > Transactions are currently out of scope for Acero - Acero is only
> > > meant to be a query execution engine. That said, it can definitely
> > > be used as a component for building a full database engine, which
> > > could implement its own locking of rows while Acero executes on
> > > them. You could also check out DuckDB, which can operate on Arrow
> > > data and also supports transactions.
> > >
> > > Sasha
> > >
> > > > On Sep 9, 2022, at 11:54, Jayjeet Chakraborty
> > > > <jayjeetchakrabort...@gmail.com> wrote:
> > > >
> > > > Hi Arrow Community,
> > > >
> > > > Since Acero is developing very fast into a full-fledged compute
> > > > engine, are there plans to add transaction semantics to Acero, so
> > > > that it can also be used as a database layer over already
> > > > supported storage backends? What I am referring to is like a
> > > > Delta Lake/Iceberg kind of interface over Acero in C++. Thanks.
> > > >
> > > > --
> > > > *Jayjeet Chakraborty*
> > > > CS PhD student
> > > > UC Santa Cruz
> > > > California, USA

--
*Jayjeet Chakraborty*
CS PhD student
UC Santa Cruz
California, USA