Re: [VOTE][RUST] Release Apache Arrow Rust Object Store 0.11.2 RC1

2024-12-23 Thread Will Jones
+1 (binding). Verified on M2 Mac. Thanks, Andrew! On Fri, Dec 20, 2024 at 2:37 PM L. C. Hsieh wrote: > +1 (binding) > > Verified on M3 Mac. > > Thanks Andrew. > > On Fri, Dec 20, 2024 at 1:32 PM Andrew Lamb wrote: > > > > Hi, > > > > I would like to propose a release of Apache Arrow Rust Object

Re: [VOTE][RUST] Release Apache Arrow Rust 54.0.0 RC1

2024-12-19 Thread Will Jones
+1 (binding). Verified on M2 Mac. Thanks, Andrew! On Thu, Dec 19, 2024 at 8:51 AM L. C. Hsieh wrote: > +1 (binding) > > Verified on M3 Mac. > > Thanks Andrew. > > On Thu, Dec 19, 2024 at 6:47 AM Andrew Lamb wrote: > > > > Hi, > > > > I would like to propose a release of Apache Arrow Rust Imple

Re: [VOTE][RUST] Release Apache Arrow Rust Object Store 0.11.1 RC1

2024-10-17 Thread Will Jones
+1 (binding). Verified on M2 Mac. Thanks for making this release. On Wed, Oct 16, 2024 at 3:36 AM Raphael Taylor-Davies wrote: > +1 (binding) > > Verified on x86_64 GNU/Linux > > On 15/10/2024 22:40, L. C. Hsieh wrote: > > +1 (binding) > > > > Verified on M3 Mac. > > > > Thanks Andrew. > > > >

Re: [VOTE][RUST] Release Apache Arrow Rust 53.1.0 RC1

2024-10-03 Thread Will Jones
+1 Verified on M2 Mac. Thanks, Andrew. On Thu, Oct 3, 2024 at 2:27 AM Raúl Cumplido wrote: > +1 > > I've run the verification script on Ubuntu 24.04 successfully. > > $ dev/release/verify-release-candidate.sh 53.1.0 1 > ... > Release candidate looks good! > > Thanks, > Raúl > > El jue, 3 oct 20

Re: [DISCUSS] Variant Spec Location

2024-08-16 Thread Will Jones
beneficial for in-memory analytics, which are most relevant to Arrow. I'll be creating a separate thread on the Arrow ML about this soon. Best, Will Jones [1] https://github.com/datafusion-contrib/datafusion-functions-variant/issues On Thu, Aug 15, 2024 at 7:39 PM Gang Wu wrote: >

Re: [VOTE][RUST] Release Apache Arrow Rust Object Store 0.11.0 RC2

2024-08-15 Thread Will Jones
+1 (binding) Verified on M2 Mac. On Tue, Aug 13, 2024 at 8:53 AM Raphael Taylor-Davies wrote: > +1 (binding) > > Verified on x86_64 GNU/Linux > > On 13/08/2024 15:44, L. C. Hsieh wrote: > > +1 (binding) > > > > Verified on M3 Mac. > > > > Thanks Andrew. > > > > On Tue, Aug 13, 2024 at 3:35 AM A

Re: [VOTE][RUST] Release Apache Arrow Rust 52.2.0 RC1

2024-07-26 Thread Will Jones
+1 (binding) Verified on M2 Mac. Thanks Andrew! On Wed, Jul 24, 2024 at 5:13 PM L. C. Hsieh wrote: > +1 (binding) > > Verified on M3 Mac. > > Thanks Andrew. > > On Wed, Jul 24, 2024 at 4:31 PM Andrew Lamb wrote: > > > > Hi, > > > > I would like to propose a release of Apache Arrow Rust Impleme

Re: [VOTE][RUST] Release Apache Arrow Rust 52.0.0 RC1

2024-06-04 Thread Will Jones
+1 (binding) Verified on M1 Mac. On Tue, Jun 4, 2024 at 5:46 AM Patrick Horan wrote: > +1 > > Verified on M1 Mac. > > Thank you. > > Paddy > > On Tue, Jun 4, 2024, at 5:46 AM, Andrew Lamb wrote: > > +1 (binding) > > > > Verified on M3 mac. > > > > Thank you - this one is a good one. > > > > And

Re: [VOTE] Move Arrow DataFusion Subproject to new Top Level Apache Project

2024-03-03 Thread Will Jones
ice of "Vice President, Apache DataFusion" be > > >>> and hereby is created, the person holding such office to > > >>> serve at the direction of the Board of Directors as the chair > > >>> of the Apache DataFusion Project, and to have primary responsib

Re: [VOTE][RUST][DataFusion] Release Apache Arrow DataFusion 34.0.0 RC3

2023-12-15 Thread Will Jones
+1 (binding). Verified on x86_64 Ubuntu 22.04. On Fri, Dec 15, 2023 at 1:13 AM Jean-Baptiste Onofré wrote: > +1 (non binding) > > Regards > JB > > On Thu, Dec 14, 2023 at 9:52 PM Andy Grove wrote: > > > > Hi, > > > > I would like to propose a release of Apache Arrow DataFusion > Implementation,

Re: decimal64

2023-11-09 Thread Will Jones
eally care about the user experience being good. Best, Will Jones [1] https://github.com/apache/arrow/issues/36648 [2] https://github.com/apache/arrow-rs/issues/4472 [3] https://github.com/apache/arrow-datafusion/issues/7923 On Thu, Nov 9, 2023 at 9:39 AM Curt Hagenlocher wrote: > It certainly

Re: [DISCUSS][Format] C data interface for Utf8View

2023-11-07 Thread Will Jones
I agree with the approach originally proposed by Ben. It seems like the most straightforward way to implement within the current protocol. On Sun, Oct 29, 2023 at 4:59 PM Dewey Dunnington wrote: > In the absence of a general solution to the C data interface omitting > buffer sizes, I think the o

Re: [VOTE][RUST][DataFusion] Release Apache Arrow DataFusion 33.0.0 RC1

2023-11-06 Thread Will Jones
a sufficiently fundamental bug to warrant special > concern, but happy to defer to others. > > Kind Regards, > > Raphael > > [1]: https://docs.rs/arrow/latest/arrow/#serde-compatibility > > On 7 November 2023 03:20:59 GMT, Will Jones > wrote: > >Hello, > >

Re: [VOTE][RUST][DataFusion] Release Apache Arrow DataFusion 33.0.0 RC1

2023-11-06 Thread Will Jones
expose users to. Not sure what the precedent here is, but I think we should consider either (a) seeing if we can release and upgrade Arrow to include the fix, or else (b) calling out the regression as a known bug so downstream projects can include the path in their applications. Best, Will

Re: [VOTE][RUST] Release Apache Arrow Rust 48.0.0 RC2

2023-10-19 Thread Will Jones
+1 Verified on M1 Mac. Thanks Raphael! On Wed, Oct 18, 2023 at 1:30 PM Andrew Lamb wrote: > +1 (binding) -- thank you Raphael > > Verified on x86 Mac > > Hint for anyone else verifying, this is RC*2* (RC1 hit an issue[1]) > > Andrew > > [1]: https://github.com/apache/arrow-rs/pull/4950 > > On

Re: Language-specific discussion (with C# example)

2023-10-17 Thread Will Jones
e, but are useful for when you want to post the issue to get attention to a discussion and are looking for a targeted audience. Best, Will Jones [1] https://github.com/apache/arrow-rs#arrow-rust-community On Tue, Oct 17, 2023 at 3:20 PM Curt Hagenlocher wrote: > I'm curious what other (su

Re: [DISCUSS] Arrow PyCapsule Protocol

2023-10-11 Thread Will Jones
s not "cross-language". [1] [1] https://arrow.apache.org/docs/format/Changing.html On Tue, Oct 10, 2023 at 10:29 AM Will Jones wrote: > Hello Arrow devs, > > We are winding down discussion and review. I have created a rendered > version of the proposed protocol: [1] > >

Re: [DISCUSS] Arrow PyCapsule Protocol

2023-10-10 Thread Will Jones
Hello Arrow devs, We are winding down discussion and review. I have created a rendered version of the proposed protocol: [1] Feel free to add feedback in the PR [2] or on this thread. Best, Will Jones [1] http://crossbow.voltrondata.com/pr_docs/37797/format/CDataInterface

Re: [VOTE][RUST][DataFusion] Release Apache Arrow DataFusion 32.0.0 RC1

2023-10-09 Thread Will Jones
+1 (binding) Verified on M1 Mac. On Mon, Oct 9, 2023 at 7:12 AM Andrew Lamb wrote: > +1 (binding) > Verified on x86 mac > > Thanks Andy > > On Sun, Oct 8, 2023 at 1:22 PM Andy Grove wrote: > > > Hi, > > > > I would like to propose a release of Apache Arrow DataFusion > > Implementation, > > ver

Re: [VOTE][RUST] Release Apache Arrow Rust Object Store 0.7.1 RC1

2023-09-27 Thread Will Jones
+1 (binding) Verified on M1 Mac. On Tue, Sep 26, 2023 at 11:29 AM Andrew Lamb wrote: > +1 (binding) > > Verified on mac x86_64 > > Looks like a good release to me -- thank you Raphael > > Andrew > > On Tue, Sep 26, 2023 at 12:05 PM Raphael Taylor-Davies > wrote: > > > Hi, > > > > I would like

Re: [VOTE] Release Apace Arrow nanoarrow 0.3.0 - RC0

2023-09-27 Thread Will Jones
+1 (binding) Verified with Conda on MacOS M1. On Tue, Sep 26, 2023 at 6:49 PM Jacob Wujciak-Jens wrote: > +1 (non-binding) > > full verification with conda arrow 13.0.0 R 4.3 on pop_os 23.04, cmake > 3.27, gcc 11 > > On Wed, Sep 27, 2023 at 1:26 AM Bryce Mecum wrote: > > > +1 (non-binding) > >

[DISCUSS] Arrow PyCapsule Protocol

2023-09-22 Thread Will Jones
eedback in the PR [2]. Thanks for your attention, Will Jones [1] https://github.com/apache/arrow/issues/34031 [2] https://github.com/apache/arrow/pull/37797 [3] https://github.com/apache/arrow-nanoarrow/blob/c4816261dc34f5f898b1658359c25b867b1079cd/python/src/nanoarrow/lib.py#L21-L35

Re: [LAST CALL][DISCUSS] Unsigned integers in Utf8View

2023-09-19 Thread Will Jones
ets anyways, so maybe it's not much more trouble to convert to signed integers on the way. Best, Will Jones [1] https://github.com/facebookincubator/velox/blob/db8edec395288527a7464d17ab86b36b970eb270/velox/type/StringView.h#L46-L78 On Tue, Sep 19, 2023 at 8:26 AM Benjamin Kietzman wrote

Re: [VOTE][RUST] Release Apache Arrow Rust 47.0.0 RC1

2023-09-19 Thread Will Jones
+1 (binding). Verified on Mac M1. Thanks for managing this release, Raphael! On Tue, Sep 19, 2023 at 6:21 AM Raphael Taylor-Davies wrote: > This time with the links.. > > I would like to propose a release of Apache Arrow Rust Implementation, > version 47.0.0. > > This release candidate is base

Re: [VOTE][RUST][DataFusion] Release Apache Arrow DataFusion 31.0.0 RC1

2023-09-09 Thread Will Jones
+1 (binding). Verified on x86_64 Ubuntu Linux. Thanks, Andy. On Fri, Sep 8, 2023 at 2:16 PM Chao Sun wrote: > +1 (non-binding). Thanks Andy! > > On Fri, Sep 8, 2023 at 12:18 PM Andrew Lamb wrote: > > > > +1 (binding) > > > > Thanks again Andy for keeping the release train moving forward > > >

Re: [Python][Discuss] PyArrow Dataset as a Python protocol

2023-09-01 Thread Will Jones
o/substrait-python > [2]https://pypi.org/project/substrait/ > > > On Thu, Aug 31, 2023 at 10:05 PM Will Jones > wrote: > > > Hello Arrow devs, > > > > We discussed this further in the Arrow community call on 2023-08-30 [1], > > and concluded we should create

Re: [Python][Discuss] PyArrow Dataset as a Python protocol

2023-08-31 Thread Will Jones
creating a PyCapsule based protocol for arrays, schemas, and streams. That is tracked here [3]. Hopefully that isn't too ambitious :) Best, Will Jones [1] https://docs.google.com/document/d/1xrji8fc6_24TVmKiHJB4ECX1Zy2sy2eRbBjpVJMnPmk/edit [2] https://github.com/apache/arrow/issues/37504 [3]

Re: ADBC support for the Rust ecosystem

2023-08-23 Thread Will Jones
ishing the Rust ADBC bindings, but I have a few other projects I need to get done first. The project I was hoping to use it for was delayed. If you are interested in taking over the PRs, even just part of it, I'd be happy to give PR reviews. Best, Will Jones On Wed, Aug 23, 2023 at 5:54 AM

Re: [VOTE] Apache Arrow ADBC (API) 1.1.0

2023-08-14 Thread Will Jones
the packages. (So I will not merge the > linked PR until after we release ADBC 0.6.0.) > > This vote will be open for at least 72 hours. > > [ ] +1 Adopt the ADBC 1.1.0 specification > [ ] 0 > [ ] -1 Do not adopt the specification because... > > Thanks to Sutou Kouhei, M

Re: Do we need CODEOWNERS ?

2023-07-04 Thread Will Jones
I haven't had as much time to review the Parquet PRs, so I'll remove myself from the CODEOWNERS for that. I've found that I have a much easier time keeping up with PR reviews in projects that are smaller, even if there are proportionally fewer maintainers. I think that's the piece that appealed to

Re: [Python][Discuss] PyArrow Dataset as a Python protocol

2023-07-03 Thread Will Jones
ers as Substrait expression or PyArrow ones, since I doubt they'll want to lose support for integrating with older PyArrow versions. I've removed filters from the protocol for now, with the intention of bringing them back as soon as we can get Substrait support. I think we can do this in the

Re: [Python][Discuss] PyArrow Dataset as a Python protocol

2023-06-28 Thread Will Jones
e protocol initially, and keep the existing integrations in DuckDB, Polars, and Datafusion "secret" or "not officially supported" for the time being. At the very least, documenting the pattern to get an Arrow C stream will be a step forward. Best, Will Jones On Wed, Jun 28, 20

Re: [Python][Discuss] PyArrow Dataset as a Python protocol

2023-06-28 Thread Will Jones
s always present to "shove as > much > > compute as you can" I think pyarrow datasets seem to have found a balance > > between the two that users like. > > > > So I would argue that this protocol may never become a general-purpose > > unmaterialized

Re: [Python][Discuss] PyArrow Dataset as a Python protocol

2023-06-23 Thread Will Jones
g the differences. The main ones I can think of are that: (1) Datasets are "pruneable": one can select a subset of columns and apply a filter on rows to avoid IO and (2) they are splittable and serializable, so that fragments can be distributed amongst processes / workers. Best, Will Jones

[Python][Discuss] PyArrow Dataset as a Python protocol

2023-06-21 Thread Will Jones
s advantage of newer innovations in the Arrow ecosystem (such as the PyCapsule for C Data Interface or Substrait for passing expressions / scan plans). I am tracking such future improvements in [5]. Best, Will Jones [1] https://duckdb.org/2021/12/03/duck-arrow.html [2] https://github.com/apa

Re: [VOTE] Release Apache Arrow nanoarrow 0.2.0 - RC1

2023-06-19 Thread Will Jones
Thanks for fixing that issue. I can now successfully verify the release on M1 Mac with Conda. My vote: +1 (binding) On Mon, Jun 19, 2023 at 12:10 PM Dewey Dunnington wrote: > My vote is +1 (non-binding). Verified on MacOS M1 (both Homebrew and > Conda). > > On Mon, Jun 19, 2023 at 3:58 PM Dewey

Re: [VOTE][RUST] Release Apache Arrow Rust 42.0.0 RC1

2023-06-18 Thread Will Jones
+1, verified on MacOS M1. Thanks Andrew! On Sun, Jun 18, 2023 at 7:02 AM Wayne Xia wrote: > +1, verified on amd64 linux, thanks! > > > vin jake : > > > +1 (binding) > > > > Verified on M1 macbook. > > > > Thanks Andrew. > > > > On Sat, Jun 17, 2023, 02:40 Andrew Lamb wrote: > > > > > Hi, > >

Re: [VOTE] Release Apache Arrow nanoarrow 0.2.0 - RC0

2023-06-18 Thread Will Jones
Hello, I attempted to verify on M1 MacOS within a conda environment. But sadly encountered some issues that I don't think are nanoarrow's fault: * The gnupg from conda segfaults on MacOS. The homebrew one works fine. * I got a segfault on this test: BitmapTest.BitmapTestCountSetSingleByte (SEGFAUL

Re: [DISCUSS][Format] Draft implementation of string view array format

2023-06-15 Thread Will Jones
believe any current implementation actually does this > or > > takes advantage of this in any meaningful way. > > > > --Matt > > > > On Thu, Jun 15, 2023 at 1:00 PM Will Jones > > wrote: > > > > > Hi Ben, > > > > > > It's

Re: [DISCUSS][Format] Draft implementation of string view array format

2023-06-15 Thread Will Jones
nce. Question: to be able to write buffer only once and reference in multiple arrays, does that require a change to the IPC format? Or is sharing buffers within the same batch already allowed in the IPC format? Best, Will Jones On Thu, Jun 15, 2023 at 9:03 AM Benjamin Kietzman wrote: > He

Re: [DISCUSS][Format] Starting the draft implementation of the ArrayView array format

2023-06-13 Thread Will Jones
DB. Found discussion today in a talk from Mark Raasveldt [1]. That does improve the case for adding this type in my eyes. Best, Will Jones [1] https://youtu.be/bZOvAKGkzpQ?t=1570 On Tue, Jun 6, 2023 at 7:40 PM Weston Pace wrote: > > This implies that each canonical alternative layout

Re: [DISCUSS] JSON Canonical Extension Type

2023-06-07 Thread Will Jones
ng an extension type class / struct within Arrow implementations that is specific to JSON; I don't think there's any hard rule that there has to be a 1-1 correspondence between extension types in the format and the concrete data structures in libraries. Best, Will Jones [1] https:/

Re: [VOTE][RUST][DataFusion] Release Apache Arrow DataFusion 26.0.0 RC1

2023-06-05 Thread Will Jones
+1 (binding). Verified on Ubuntu 22 x86_64. Thanks, Andy! On Sat, Jun 3, 2023 at 9:41 AM vin jake wrote: > +1 (non-binding) > > verified on my M1 mac book. > > Thanks Andy. > > Andy Grove 于 2023年6月3日周六 23:20写道: > > > Hi, > > > > I would like to propose a release of Apache Arrow DataFusion > > I

Re: [VOTE][RUST] Release Apache Arrow Rust 41.0.0 RC1

2023-06-05 Thread Will Jones
+1 (binding). Verified on Ubuntu 22 x86_64. Thanks, Raphael! On Fri, Jun 2, 2023 at 12:47 PM Andrew Lamb wrote: > +1 (binding) > Verified on x86_64 mac > > The content of this release looks very good 👌 > > Thank you Raphael > > Andrew > > On Fri, Jun 2, 2023 at 2:59 PM L. C. Hsieh wrote: > > >

Re: [VOTE][RUST] Release Apache Arrow Rust Object Store 0.6.1 RC1

2023-06-05 Thread Will Jones
+1 (binding), verified on M1 MacOS. Thanks Raphael! On Fri, Jun 2, 2023 at 11:56 AM L. C. Hsieh wrote: > +1 (binding) > > Verified on M1 Mac. > > Thanks Raphael. > > On Fri, Jun 2, 2023 at 11:38 AM Andrew Lamb wrote: > > > > +1 (binding) > > > > I verified the signature and ran the verification

Re: [DISCUSS] Acero's ScanNode and Row Indexing across Scans

2023-06-02 Thread Will Jones
te files and then create > > the plan. And, if there are many deleted rows, this can be costly. > > > > On Thu, Jun 1, 2023 at 7:13 PM Will Jones > wrote: > > > >> That's a good point, Gang. To perform deletes, we definitely need the > row > >>

Re: [DISCUSS] Acero's ScanNode and Row Indexing across Scans

2023-06-01 Thread Will Jones
ant to implement a streaming > merge-based > > anti-join because I believe delete files are ordered by row_index and so > a > > streaming approach is likely to be much more performant. > > > > On Mon, May 29, 2023 at 4:01 PM Will Jones > > wrote: > > > > >

Re: [DISCUSS] Acero's ScanNode and Row Indexing across Scans

2023-05-29 Thread Will Jones
tern to what the Rust community has done may be wise. It's possible that row_index is easy to implement while the mask will take time, in which case row_index makes sense as an interim solution. Best, Will Jones [1] https://arrow.apache.org/blog/2022/12/26/querying-parquet-with-millisecon

Re: New datatype: Huge integers & decimals

2023-05-23 Thread Will Jones
manipulated as a float16, which IIUC would be invalid. If anyone has any advice from our work thus far on extension types, I'd welcome your input. Best, Will Jones [1] https://cloud.google.com/blog/products/ai-machine-learning/bfloat16-the-secret-to-high-performance-on-cloud-tpus [2] http

Re: [DISCUSS][Format] Starting the draft implementation of the ArrayView array format

2023-05-22 Thread Will Jones
do we have that it will become used outside of Velox? 2. We already have three list types: list, large list (64-bit offsets), and fixed size list. Do we think we will only want a view version of the 32-bit offset variable length list? Or are we potentially talking about view variants f

Re: [DISCUSS][Format] Starting the draft implementation of the ArrayView array format

2023-05-21 Thread Will Jones
n, May 21, 2023 at 3:07 PM Will Jones wrote: > Hello, > > I think Sasha brings up a good point, that the advantages of this format > seem to be primarily about query processing. Other encodings like REE and > dictionary have space-saving advantages that justify them simply

Re: [DISCUSS][Format] Starting the draft implementation of the ArrayView array format

2023-05-21 Thread Will Jones
-bit offset variable length list? Or are we potentially talking about view variants for all three? Best, Will Jones On Sun, May 21, 2023 at 2:19 PM Felipe Oliveira Carvalho < felipe...@gmail.com> wrote: > The benefit of having a memory format that’s friendly to non-deterministic >

Re: [DISCUSS][Format] Starting the draft implementation of the ArrayView array format

2023-05-20 Thread Will Jones
resulting in binary bloat and long compile times, and I worry this would > worsen this situation whilst not really providing compelling advantages > for the vast majority of workloads that don't interact with Velox. > Whilst I can definitely see that the ListView representation is pr

Re: [VOTE][RUST] Release Apache Arrow Rust 40.0.0 RC1

2023-05-20 Thread Will Jones
+1 (binding) Verified on Ubuntu 22.04. Thanks Raphael! On Fri, May 19, 2023 at 10:05 AM L. C. Hsieh wrote: > +1 (binding) > > Verified on M1 Mac. > > Thanks Raphael > > On Fri, May 19, 2023 at 6:37 AM Andrew Lamb wrote: > > > > +1 (binding) > > > > Verified on mac osx x86_64 > > > > Thank you

Re: [VOTE][RUST] Release Apache Arrow Rust Object Store 0.6.0 RC1

2023-05-20 Thread Will Jones
+1 (binding) Verified on Ubuntu 22.04. Thanks Raphael! On Thu, May 18, 2023 at 8:57 AM L. C. Hsieh wrote: > +1 (binding) > > Verified on M1 Mac. > > Thanks Raphael. > > On Thu, May 18, 2023 at 3:31 AM Andrew Lamb wrote: > > > > +1 (binding) > > > > I ran the release verification script (Mac x

Re: [DISCUSS] Interest in a 12.0.1 patch?

2023-05-18 Thread Will Jones
writer and is often passed pandas data, I would appreciate a patch release. Best, Will Jones [1] https://github.com/apache/arrow/issues?q=is%3Aopen+is%3Aissue+milestone%3A12.0.1 On Thu, May 18, 2023 at 10:18 AM Ian Cook wrote: > There is also a major issue with the 12.0.0 R package that has

Re: [DISCUSS][Format] Draft implementation of string view array format

2023-05-16 Thread Will Jones
rings like the hostname, path, and a list column of query parameters, I'd like for those latter columns to be views into the URI buffers, rather than full copies. However, I've never touched the IPC read code paths, so it's quite possible I'm overlooking something obvious. Best

Re: Reusing RecordBatch objects and their memory space

2023-05-12 Thread Will Jones
Hello, I'm not sure if there are easy ways to avoid calling the destructors. However, I would point out memory space reuse is handled through memory pools; if you have one enabled it shouldn't be handing memory back to the OS between each iteration. Best, Will Jones On Fri, May 12,

Re: [DISCUSS][Format] Starting the draft implementation of the ArrayView array format

2023-05-11 Thread Will Jones
then we might as well ask Velox to export this type as ListArray and save the rest of the ecosystem some work. Best, Will Jones On Thu, May 11, 2023 at 12:32 PM Felipe Oliveira Carvalho < felipe...@gmail.com> wrote: > Initial reason for ListView arrays in Arrow is zero-copy compa

Re: [VOTE] Release Apache Arrow ADBC 0.4.0 - RC0

2023-05-10 Thread Will Jones
+1 (binding) Verified on Ubuntu 22 with USE_CONDA=1 dev/release/verify-release-candidate.sh 0.4.0 0 On Wed, May 10, 2023 at 2:27 PM Matt Topol wrote: > Using a manjaro linux image (in honor of the issues we found for Arrow v12 > rc) I ran: > USE_CONDA=1 ./dev/release/verify-release-candidate.sh

Re: [VOTE][RUST][DataFusion] Release Apache Arrow DataFusion 24.0.0 RC1

2023-05-06 Thread Will Jones
+1 (binding) Verified on x86_64 Ubuntu 22. Thanks Andy! On Sat, May 6, 2023 at 1:28 PM L. C. Hsieh wrote: > +1 (binding) > > Verified on M1 Mac. > > Thanks Andy. > > On Sat, May 6, 2023 at 6:26 AM Andy Grove wrote: > > > > Hi, > > > > I would like to propose a release of Apache Arrow DataFusio

Re: [VOTE][RUST] Release Apache Arrow Rust 39.0.0 RC1

2023-05-06 Thread Will Jones
+1 (binding) Verified on x86_64 Ubuntu 22 Thanks Raphael! On Fri, May 5, 2023 at 9:57 AM Andrew Lamb wrote: > +1 (binding) > > Verified on x86_64 mac > > Thanks Raphael > > On Fri, May 5, 2023 at 12:36 PM L. C. Hsieh wrote: > > > +1 (binding) > > > > Verified on M1 Mac. > > > > Thanks Raphael

Re: [ANNOUNCE] New Arrow PMC member: Matt Topol

2023-05-03 Thread Will Jones
Congrats and welcome Matt. Thank you for all your contributions thus far. On Wed, May 3, 2023 at 10:44 AM vin jake wrote: > Congratulations, Matt! > > Felipe Oliveira Carvalho 于 2023年5月4日周四 01:42写道: > > > Congratulations, Matt! > > > > On Wed, 3 May 2023 at 14:37 Andrew Lamb wrote: > > > > > T

Re: [VOTE] Release Apache Arrow 12.0.0 - RC0

2023-04-29 Thread Will Jones
>> and extension (if any). > >>>> > >>>> BTW, it seems that we should remove a Perl dependency from > >>>> > https://github.com/apache/arrow/blob/main/cpp/build-support/run-test.sh > >>>> ... > >>>> > >>>> >

Re: [WEBSITE] [DISCUSS] Arrow-Site blog post

2023-04-28 Thread Will Jones
Thanks for highlighting this, Matt. I have added some comments in the document. On Fri, Apr 28, 2023 at 2:34 PM Ian Cook wrote: > Hi Matt, > > I reviewed it and left a few very minor comments. Looks great to me. > > Do any PMC members wish to chime in? If not, it seems OK to give it 72 > hours

Re: [VOTE] Release Apache Arrow 12.0.0 - RC0

2023-04-28 Thread Will Jones
/build-support/run-test.sh > >>> ... > >>> > >>> > >>> I want to reproduce this problem on my environment. Could > >>> you share your environment information? Did you use Manjaro > >>> Linux this too? > >>> > >>> >

Re: [VOTE] Release Apache Arrow 12.0.0 - RC0

2023-04-27 Thread Will Jones
> > /usr/lib/cmake/llvm/LLVM-Config.cmake(131): if(NOT idx LESS 0 ) > > > /usr/lib/cmake/llvm/LLVM-Config.cmake(132): if(TARGET LLVMX86CodeGen ) > > > /usr/lib/cmake/llvm/LLVM-Config.cmake(134): else() > > > /usr/lib/cmake/llvm/LLVM-Config.cmake(135): if(TARGET LLV

Re: [VOTE] Formalize how to change format

2023-04-26 Thread Will Jones
+1. Thanks Kou. On Wed, Apr 26, 2023 at 10:27 AM Matt Topol wrote: > +1 (Non-binding) > > On Wed, Apr 26, 2023 at 5:16 AM Joris Van den Bossche < > jorisvandenboss...@gmail.com> wrote: > > > +1 > > > > On Wed, 26 Apr 2023 at 04:18, Sutou Kouhei wrote: > > > > > > Hi, > > > > > > I've added one

Re: [DISCUSS][Format] Starting the draft implementation of the ArrayView array format

2023-04-25 Thread Will Jones
entations. > >> > >> I believe one significant benefit is that take (and by proxy, filter) > and > >> sort are O(# of items) with the proposed format and O(# of bytes) with > the > >> current format. Jorge did some profiling to this effect in [1]. > >
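The complexity claim quoted in this snippet can be illustrated with a small sketch in plain Python. This is not an Arrow implementation: the layouts are simplified, and `take_offsets`, `take_views`, and the sample data are hypothetical names chosen for illustration. With an offset-based layout, a take has to copy the bytes of every selected string into a new contiguous buffer; with a view-based layout, only the fixed-size (offset, length) pairs are copied and the data buffer is left untouched.

```python
def take_offsets(offsets, data, indices):
    """Offset-based layout: work grows with the number of bytes selected."""
    new_offsets, out = [0], bytearray()
    for i in indices:
        out += data[offsets[i]:offsets[i + 1]]  # copies every byte of string i
        new_offsets.append(len(out))
    return new_offsets, bytes(out)


def take_views(views, indices):
    """View-based layout: work grows with the number of items selected."""
    return [views[i] for i in indices]  # copies only (offset, length) pairs


# Three strings, "apple", "fig", "banana", stored in one data buffer.
data = b"applefigbanana"
offsets = [0, 5, 8, 14]            # offset-based layout
views = [(0, 5), (5, 3), (8, 6)]   # simplified view layout: (offset, length)

taken_offsets, taken_data = take_offsets(offsets, data, [2, 0])
taken_views = take_views(views, [2, 0])
```

The trade-off is that the view result still points into the original buffer, so the strings are no longer guaranteed to be contiguous or in order.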

Re: [DISCUSS][Format] Starting the draft implementation of the ArrayView array format

2023-04-25 Thread Will Jones
to an ArrayView, but expensive to go the other way. Best, Will Jones On Tue, Apr 25, 2023 at 3:00 PM Felipe Oliveira Carvalho < felipe...@gmail.com> wrote: > Hi folks, > > I would like to start a public discussion on the inclusion of a new array > format to Arrow — array-view arr

Re: [VOTE] Release Apache Arrow 12.0.0 - RC0

2023-04-24 Thread Will Jones
I'm seeing failing Pandas tests in PyArrow when verifying with USE_CONDA=1 dev/release/verify-release-candidate.sh 12.0.0 0 pyarrow/tests/test_extension_type.py::test_extension_to_pandas_storage_type[registered_period_type0] - NotImplementedError: extension> No one else is getting that? On Sun

Re: [VOTE][RUST][DataFusion] Release Apache Arrow DataFusion 23.0.0 RC1

2023-04-21 Thread Will Jones
+1 (binding) Verified on Ubuntu 22.04 On Fri, Apr 21, 2023 at 4:27 PM L. C. Hsieh wrote: > +1 (binding) > > Verified on Intel Mac. > > Thanks Andy. > > On Fri, Apr 21, 2023 at 1:40 PM Andy Grove wrote: > > > > Hi, > > > > I would like to propose a release of Apache Arrow DataFusion > Implement

Re: [VOTE][RUST] Release Apache Arrow Rust 38.0.0 RC1

2023-04-21 Thread Will Jones
+1 (binding) Verified on Ubuntu 22.04. On Fri, Apr 21, 2023 at 12:10 PM Andrew Lamb wrote: > +1 (binding) > > Verified on x86 mac > > Thank you Raphael > > On Fri, Apr 21, 2023 at 2:00 PM L. C. Hsieh wrote: > > > +1 (binding) > > > > Verified on M1 Mac. > > > > Thanks Raphael. > > > > On Fri,

Re: [VOTE][RUST] Release Apache Arrow Rust Object Store 0.5.6 RC2

2023-03-31 Thread Will Jones
+1 Verified on M1 MacOS. On Fri, Mar 31, 2023 at 4:29 AM Raphael Taylor-Davies wrote: > Hi, > > I would like to propose a release of Apache Arrow Rust Object > Store Implementation, version 0.5.6. > > This release candidate is based on commit: > 234b7847ecb737e96df3f4623df7b330b34b3d1b [1] > >

Re: Proposal: add a bot to close PRs that haven't been updated in 30 days

2023-03-31 Thread Will Jones
e PRs if nothing happened > (no push, no comment, ..) on such a PR after an additional period of > time). > > Also good to know: contributors apparently can't re-open PRs if it was > closed by someone else, so we have to be careful with messages like > "feel free to r

Re: Proposal: add a bot to close PRs that haven't been updated in 30 days

2023-03-30 Thread Will Jones
22Component%3A+Parquet%22+draft%3Afalse On Thu, Mar 30, 2023 at 1:37 PM Anja wrote: > Using those labels is a clever idea! > > Would there be a benefit to pinging reviewers for PRs that have been > "awaiting X review" for more than 30 days? > > On Thu, 30 Mar 2023 at

Re: Proposal: add a bot to close PRs that haven't been updated in 30 days

2023-03-30 Thread Will Jones
", "awaiting changes" and "awaiting change review" to know > whether is stale due to the contributor or the reviewer. > > El jue, 30 mar 2023, 20:08, Will Jones escribió: > > > First, to clarify: we are discussing for the monorepo only, not for Rust

Re: OpenTelemetry + Arrow

2023-03-30 Thread Will Jones
nd I appreciate that this blog provides advice for some of the trickier parts of working with complex Arrow schemas. I think this will also provide a good concrete use case for us to think about improving the ecosystem's support for nested data. Best, Will Jones On Thu, Mar 30, 2023 at 10:56

Re: Proposal: add a bot to close PRs that haven't been updated in 30 days

2023-03-30 Thread Will Jones
First, to clarify: we are discussing for the monorepo only, not for Rust / Julia / etc.? This is a big project, so best to be specific which subprojects you are addressing. I am +0.5 on this. 30 days seems like an appropriate window for this project. If the PR was stale because the contributor had

Re: [RUST] Somewhat regular sync today

2023-03-30 Thread Will Jones
Hi Andrew, This is great information. When all three are recorded, perhaps these could be shared in a blog post on the Arrow website? On Thu, Mar 30, 2023 at 10:20 AM Andrew Lamb wrote: > Here are the recording and slides from today: > Recording: [1] > Slides: [2] > > I plan to present (and rec

Re: Plasma will be removed in Arrow 12.0.0

2023-03-29 Thread Will Jones
Thanks for the feedback on the benchmark. By switching from Unix domain socket to TCP and reducing the batch size to under 5MB I was able to get nearly 5Gbps throughput. I think Unix domain sockets are just slower on Macs. Updated that repo [1] [1] https://github.com/wjones127/arrow-ipc-bench/tree

Re: zero-copy Take?

2023-03-28 Thread Will Jones
Hi John, Arrays have a `Slice()` method that allows getting zero-copy slices of the array given an offset and a length. If you had a set of ranges it wouldn't be too hard to write a function that creates a new chunked array made up of these slices. Of course, there are likely cases where the ov
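A minimal sketch of the idea in this reply, using only Python's standard library and not the Arrow API: `memoryview` slices stand in for the results of `Slice()`, a plain list stands in for the chunked array, and `slices_from_ranges` is a hypothetical helper name. Each chunk references the original buffer instead of copying it.

```python
from typing import List, Tuple


def slices_from_ranges(buf: memoryview, ranges: List[Tuple[int, int]]) -> List[memoryview]:
    """Return one zero-copy view per (offset, length) range."""
    chunks = []
    for offset, length in ranges:
        if offset < 0 or length < 0 or offset + length > len(buf):
            raise ValueError(f"range ({offset}, {length}) is out of bounds")
        chunks.append(buf[offset:offset + length])  # a view, not a copy
    return chunks


data = bytearray(b"abcdefghij")
chunks = slices_from_ranges(memoryview(data), [(0, 3), (5, 2)])

# A write to the source buffer is visible through the chunks,
# which shows that no bytes were copied.
data[0] = ord("z")
```

The cost of this approach is per-chunk bookkeeping, so it pays off mainly when the ranges are few and long.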

Re: Arrow/C++14

2023-03-24 Thread Will Jones
Yes, we make sure the data is compatible over time, or at least detect that data has new features. Our format versioning is explained here [1]. You can see various additions here [2]. The only one that's newer than 9.0.0 is the Run-end encoded arrays, but those aren't in widespread use yet. [1] ht

Re: Arrow/C++14

2023-03-24 Thread Will Jones
That should be Apache Arrow 9.0.0. It was in Arrow 10 that we mandated C++ 17. [1][2] [1] https://arrow.apache.org/release/10.0.0.html [2] https://issues.apache.org/jira/browse/ARROW-17545 On Fri, Mar 24, 2023 at 12:27 PM Arkadiy Vertleyb (BLOOMBERG/ 120 PARK) < avertl...@bloomberg.net> wrote: >

Re: [VOTE][RUST][DataFusion] Release DataFusion Python Bindings 20.0.0 RC2

2023-03-17 Thread Will Jones
+1 (binding) Verified on Ubuntu 22.04. On Fri, Mar 17, 2023 at 9:44 AM L. C. Hsieh wrote: > +1 (binding) > > Verified on M1 Mac. > > Thanks Andy. > > On Fri, Mar 17, 2023 at 8:01 AM Andy Grove wrote: > > > > Hi, > > > > I would like to propose a release of Apache Arrow DataFusion Python > > Bi

Re: [VOTE] Release Apache Arrow ADBC 0.3.0 - RC1

2023-03-17 Thread Will Jones
+1 I successfully ran on Mac OS 12.6 with USE_CONDA=1 TEST_APT=0 TEST_YUM=0 ./dev/release/verify-release-candidate.sh 0.3.0 1 And also on Ubuntu 22.04 with: USE_CONDA=1 ./dev/release/verify-release-candidate.sh 0.3.0 1 On Fri, Mar 17, 2023 at 12:09 PM Matt Topol wrote: > +1 (non-binding) > >

Plasma will be removed in Arrow 12.0.0

2023-03-15 Thread Will Jones
ourse works across languages. I wrote a little more about the shared memory case in a Stack Overflow answer [3]. If you have migrated off of Plasma and want to share with other users what you moved to, please do so in this thread. Best, Will Jones [1] https://github.com/apache/arrow/issues/332

Re: [Rust][MaybeNotJustRust] PR titles and generating change logs

2023-03-14 Thread Will Jones
Thanks for sharing this script! > I noticed that some contributors are already prefixing PR titles with > "feat:", "feature:", "fix:", "docs:", etc. I plan on updating the changelog > generator to recognize these prefixes as well, to help automate my job. I believe most people are doing this out

Re: [ADBC][Rust] Proposal for Rust ADBC API

2023-03-09 Thread Will Jones
n to release Rust ADBC library independently from other libraries, with its own (semantic) versioning. I think these two can be considered independently; if we find a strong reason to keep the Rust versions the same as other libraries, I'd still hope it would be fine to have the Rust API be

Re: [EXTERNAL] Re: Field class in Java vs C#

2023-03-09 Thread Will Jones
inish implementing the C Data Interface [1] and C Stream Interface [2]. And for both approaches, I think we need to implement Union and Map types for GetInfo, which currently aren't implemented in Arrow C#. Best, Will Jones [1] https://github.com/apache/arrow/issues/33856 [2] https://github.com/ap

[ADBC][Rust] Proposal for Rust ADBC API

2023-03-01 Thread Will Jones
t the API standard. Best, Will Jones [1] https://github.com/apache/arrow-adbc/pull/478 [2] https://arrow.apache.org/adbc/0.2.0/format/specification.html [3] https://github.com/apache/arrow-adbc/pull/446

Re: [VOTE][RUST] Release Apache Arrow Rust Object Store 0.5.5 RC1

2023-03-01 Thread Will Jones
+1 (non-binding). Verified on MacOS aarch64. On Mon, Feb 27, 2023 at 12:53 PM Andrew Lamb wrote: > +1 (binding) > > Verified on mac x86 -- the release train is quite impressive this month > > Thank you Raphael > > > p.s I get one local failure, tracked with [1] , but I don't think it is > serio

Re: [VOTE] Release Apache Arrow nanoarrow 0.1.0 - RC1

2023-03-01 Thread Will Jones
+1 (non-binding). Verified with Conda on MacOS aarch64. Note: I needed to add gtest to the conda environment. Otherwise it went smoothly. [1] [1] https://github.com/apache/arrow-nanoarrow/pull/138 On Wed, Mar 1, 2023 at 9:04 AM Dewey Dunnington wrote: > Hello, > > I would like to propose the f

Re: [DISCUSS] Flight RPC/Flight SQL/ADBC enhancements

2023-02-14 Thread Will Jones
ll need to be created? Best, Will Jones On Tue, Feb 14, 2023 at 12:58 PM David Li wrote: > Hello, > > I would like to submit some Flight RPC and Flight SQL enhancements for > discussion. They cover the following: > > - Executing 'queries' in a retryable, nonblockin

Re: [VOTE] Release Apache Arrow ADBC 0.2.0 - RC1

2023-02-09 Thread Will Jones
n to it yet. Of course, contributions are also welcome. > > It was also brought to my attention that other ODBC wrappers exist (also: > ConnectorX, IIRC) which could be evaluated in lieu of Turbodbc for this > purpose [1]. If you have experience with any of them, that would also be > inte

Re: [C++] Parquet and Arrow overlap

2023-02-02 Thread Will Jones
> > It would be a dilemma for the parquet-cpp contributors if none of the > > > Apache Arrow community or Apache Parquet community recognizes their > work. > > > > > > Does the parquet rust implementation have a similar issue? > > > > > >

[C++] Parquet and Arrow overlap

2023-02-01 Thread Will Jones
the contributions to Arrow C++ Parquet being actively reviewed for potential new committers? Best, Will Jones [1] https://lists.apache.org/thread/76wzx2lsbwjl363bg066g8kdsocd03rw [2] https://lists.apache.org/thread/dkh6vjomcfyjlvoy83qdk9j5jgxk7n4j [3] https://github.com/apache/parquet-cpp

Re: [Monorepo] Add labels breaking-change and critical-fix

2023-01-25 Thread Will Jones
> I would also suggest a "bugfix" or "backport candidate" label if we want > to make it easier to cherry-pick changes for bugfix releases. > > Regards > > Antoine. > > > On 06/01/2023 at 17:57, Will Jones wrote: > > Hello Arrow devs, > > >

Re: [Monorepo] Add labels breaking-change and critical-fix

2023-01-17 Thread Will Jones
't as critical. Would it be too > > nuanced to say that: > > > > * A crash, given valid input, is critical > > * A crash, given invalid input, is not critical > > > > > > > > On Sat, Jan 14, 2023, 8:12 AM Antoine Pitrou wrote: > > >

Re: [ANNOUNCE] Apache Arrow ADBC 0.1.0 Released

2023-01-16 Thread Will Jones
her thing I'd be curious about is if we can generalize this subset > of SQL/Substrait to drivers for other 'storage layers' like Apache Iceberg > and Apache Hudi. > > On Mon, Jan 16, 2023, at 17:53, Will Jones wrote: > >> > >> You could do something like what
