Re: [Discuss] Deprecating Plasma

2022-09-22 Thread Sutou Kouhei
+1 In "[Discuss] Deprecating Plasma" on Thu, 22 Sep 2022 17:38:27 +0200, Antoine Pitrou wrote: > > Hello, > > The Plasma object store (*) hasn't received significant maintenance > since at least 2020. The original authors have stopped contributing to > the Arrow community and instead fork

Re: [VOTE] Adopt ADBC database client connectivity specification

2022-09-22 Thread Sutou Kouhei
+1 In "[VOTE] Adopt ADBC database client connectivity specification" on Wed, 21 Sep 2022 11:40:11 -0400, "David Li" wrote: > Hello, > > We have been discussing [1] standard interfaces for Arrow-based database > access and have been working on implementations of the proposed interfaces >

Re: [VOTE] Adopt ADBC database client connectivity specification

2022-09-22 Thread Kun Liu
+1 (non-binding) On Fri, Sep 23, 2022 at 03:40, Gavin Ray wrote: > Ah yeah that's true, good point > > > > On Thu, Sep 22, 2022 at 2:38 PM David Li wrote: > > > I suppose the separator would have to be known to the client somehow > > (perhaps as metadata) - you'd have the same problem in the opposite > > dire

Re: [VOTE] Adopt ADBC database client connectivity specification

2022-09-22 Thread Gavin Ray
Ah yeah that's true, good point On Thu, Sep 22, 2022 at 2:38 PM David Li wrote: > I suppose the separator would have to be known to the client somehow > (perhaps as metadata) - you'd have the same problem in the opposite > direction if the result were a list right? You wouldn't be able to > co

Re: [VOTE] Adopt ADBC database client connectivity specification

2022-09-22 Thread David Li
I suppose the separator would have to be known to the client somehow (perhaps as metadata) - you'd have the same problem in the opposite direction if the result were a list right? You wouldn't be able to concatenate the parts together without knowing a safe separator to use. On Thu, Sep 22, 202
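The separator concern raised in this exchange can be sketched in a few lines of plain Python. This is only an illustration: join_path, split_path, the sample identifiers, and the choice of "\x1f" as an alternative separator are all hypothetical and are not part of the ADBC proposal.

```python
# Hypothetical helpers (not ADBC APIs): flatten a namespace hierarchy into
# one dotted string and parse it back out.

def join_path(parts, sep="."):
    """Concatenate namespace parts into a single name."""
    return sep.join(parts)


def split_path(name, sep="."):
    """Recover the namespace parts from a single name."""
    return name.split(sep)


# An identifier that itself contains a dot (assumed to be legal in some
# data source) does not survive the round trip.
parts = ["my.catalog", "schema"]
round_tripped = split_path(join_path(parts))

# With a separator the client knows cannot occur in identifiers (assumed
# here to be the ASCII unit separator), the round trip is lossless.
safe_sep = "\x1f"
safe_round_trip = split_path(join_path(parts, safe_sep), safe_sep)
```

Here round_tripped comes back with three parts instead of two, which is why the separator would have to be communicated to the client (for example as metadata) or the hierarchy kept as a list.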

Re: [VOTE] Adopt ADBC database client connectivity specification

2022-09-22 Thread Gavin Ray
Wait, what happens if a datasource's spec allows dots as valid identifiers? On Thu, Sep 22, 2022 at 2:22 PM Gavin Ray wrote: > Ah okay, yeah that's a reasonable angle too haha > > > On Thu, Sep 22, 2022 at 1:59 PM David Li wrote: > >> Frankly it was from a "not drastically refactoring things" p

Re: [VOTE] Adopt ADBC database client connectivity specification

2022-09-22 Thread Gavin Ray
Ah okay, yeah that's a reasonable angle too haha On Thu, Sep 22, 2022 at 1:59 PM David Li wrote: > Frankly it was from a "not drastically refactoring things" perspective :) > > At least for Arrow: list[utf8] is effectively a utf8 array with an extra > array of offsets, so there's relatively lit

Re: [VOTE] Adopt ADBC database client connectivity specification

2022-09-22 Thread David Li
Frankly it was from a "not drastically refactoring things" perspective :) At least for Arrow: list[utf8] is effectively a utf8 array with an extra array of offsets, so there's relatively little overhead. (In particular, there's not an extra allocation per array; there's just an overall allocatio
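As a rough sketch of the layout described above, the following plain Python models a list[utf8] column as one flat array of strings plus one array of offsets. It is a simplification and an assumption-laden illustration, not the pyarrow API: in Arrow the utf8 child array has its own offsets and data buffers, and encode_list_column and decode_row are hypothetical names.

```python
# Simplified model of a list[utf8] column: every row's strings live in one
# shared "values" array, and "offsets" marks where each row starts and ends.

def encode_list_column(rows):
    """Flatten a list of string lists into (offsets, values)."""
    offsets = [0]
    values = []
    for row in rows:
        values.extend(row)
        offsets.append(len(values))
    return offsets, values


def decode_row(offsets, values, i):
    """Recover row i by slicing the shared values array."""
    return values[offsets[i]:offsets[i + 1]]


# Hypothetical namespace paths of different depths, including an empty one.
rows = [["catalog", "schema"], ["catalog", "db", "schema"], []]
offsets, values = encode_list_column(rows)
```

No per-row container is allocated in this model; rows of different depth share the same two arrays, which is the sense in which the overhead stays small.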

Re: [VOTE] Adopt ADBC database client connectivity specification

2022-09-22 Thread Gavin Ray
I suppose you're thinking from a memory/performance perspective, right? Allocating a dot character is a lot better than allocating multiple arrays. Yeah, I don't see why not -- this could even be a library internal where the fact that it's dotted is an implementation detail. Then in the Java implement

Re: [VOTE] Adopt ADBC database client connectivity specification

2022-09-22 Thread David Li
Ah, interesting… A self-recursive schema wouldn't work in Arrow's schema system, so it'd have to be the latter solution. Or, would it work to have a dotted name in the schema name column? Would parsing that back out (for applications that want to work with the full hierarchy) be too much troubl

Re: [VOTE] Adopt ADBC database client connectivity specification

2022-09-22 Thread Gavin Ray
Antoine, I can't comment on the Go code (not qualified) but to me, the "verification" test examples look like a mixture between JDBC and Java FlightSQL driver usage, and seem solid. There was one reservation I had about the ability to handle datasource namespacing that I brought up early on in the

[Discuss] Deprecating Plasma

2022-09-22 Thread Antoine Pitrou
Hello, The Plasma object store (*) hasn't received significant maintenance since at least 2020. The original authors have stopped contributing to the Arrow community and instead forked their own code for internal use inside another project (https://github.com/ray-project/ray/tree/master/src

Re: compressed feather v2 "slicing from the middle"

2022-09-22 Thread John Muehlhausen
If I'm understanding the below correctly, it seems that the file format supports finding an arbitrary compressed buffer without decompressing anything else. Correct? -John /// -- /// A Buffer represents a single contiguous memor

Re: compressed feather v2 "slicing from the middle"

2022-09-22 Thread John Muehlhausen
Regarding tab=feather.read_table(fname, memory_map=True)
Uncompressed: low-cost setup and len(tab), data is read when sections of the map are "paged in" by the OS
Compressed (desired):
* low-cost setup
* read the length of the "table" without decompressing anything ( len(tab) )
* low-co

Re: [PROPOSAL] Serve stable and development versions of Arrow Cookbooks

2022-09-22 Thread Raúl Cumplido
> Please call it "main" or "master", not "dev". > Sure, just to confirm I said dev because it is using the latest nightlies version but the branch is called main on the cookbooks repo and I don't plan to change that.

Re: [PROPOSAL] Serve stable and development versions of Arrow Cookbooks

2022-09-22 Thread Antoine Pitrou
Hello, On 22/09/2022 at 15:27, Raúl Cumplido wrote: I want to get feedback on the following proposal: Stable will be hosted as today at https://arrow.apache.org/cookbook and the development version at https://arrow.apache.org/cookbook/dev. In order to automate the process my initial idea

Re: unclear compilation errors with util::optional

2022-09-22 Thread Yaron Gvili
Updating with master fixed it, thanks. Yaron. From: Antoine Pitrou Sent: Thursday, September 22, 2022 4:28 AM To: dev@arrow.apache.org Subject: Re: unclear compilation errors with util::optional Hi Yaron, On git master we recently moved to C++17 and therefore

[PROPOSAL] Serve stable and development versions of Arrow Cookbooks

2022-09-22 Thread Raúl Cumplido
Hi, I have started to do some work on the Arrow Cookbooks to allow us to build them for both the stable and development versions of Arrow. The rationale for doing this is that today we have to wait until the new version of Arrow is released in order to create Cookbooks for the new functionality.

Re: [VOTE] Adopt ADBC database client connectivity specification

2022-09-22 Thread Antoine Pitrou
Hello, I would urge people to review the proposed ADBC APIs, especially the Go and Java APIs which probably benefitted from less feedback than the C one. Regards Antoine. On 21/09/2022 at 17:40, David Li wrote: Hello, We have been discussing [1] standard interfaces for Arrow-based dat

Re: unclear compilation errors with util::optional

2022-09-22 Thread Antoine Pitrou
Hi Yaron, On git master we recently moved to C++17 and therefore removed compatibility backports such as arrow::util::optional. Now you should just use std::optional. So be sure to rebase your work on master and fix any reference to those compatibility backports in your code. Regards A

unclear compilation errors with util::optional

2022-09-22 Thread Yaron Gvili
Hi, In a PR I'm working on [1], I get compilation errors in CI jobs that I don't see the reason for. I'd appreciate help with this. For example, one job's [2] compilation complains about the util::optional symbol not being declared (this happens in other jobs too). This is unclear for a couple