[DISCUSS] Python Wheel Size

2022-10-03 Thread Rusty Conover
Hi Arrow Team, I'm using Apache Arrow with AWS Lambda Functions. The primary motivation is AWS Athena's user-defined functions[1]. Those functions process and return Arrow IPC segments. * The published Python wheels for Apache Arrow include almost every feature of Arrow. (Gandiva, Plasma, Fligh

[DISCUSS] Apache Iceberg / Apache Hudi support in Arrow

2022-10-03 Thread Rusty Conover
Hi Arrow Team, Arrow is fantastic for manipulating the Parquet file format. There is an increasing desire to have the ability to update, delete and insert the rows stored in Parquet files, but without rewriting the Parquet files in their entirety. It is not uncommon to have gigabytes/petabytes o

[Discuss] [Python] ParquetWriter exception handling

2022-11-12 Thread Rusty Conover
Hi All, I frequently write large (10+ GB) Parquet files to S3 using the ParquetWriter class in Python. These files are written using the S3 multipart upload functionality provided by the underlying S3FileSystem implementation. I call S3FileSystem.create_output_stream() and pass that to the Parqu

[DISCUSS] Acero's ScanNode and Row Indexing across Scans

2023-05-29 Thread Rusty Conover
Hi Arrow Team, I wanted to suggest an improvement regarding Acero's Scan node. Currently, it provides useful information such as __fragment_index, __batch_index, __filename, and __last_in_fragment. However, it would be beneficial to have an additional column that returns an overall "row index" fro
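
To make the idea of an overall row index concrete, here is a minimal pure-Python sketch. It does not use Acero or any Arrow API; the column name `__row_index` and the helper `with_row_index` are hypothetical names chosen for illustration, and the batches are modelled as plain tuples rather than real record batches.

```python
def with_row_index(batches):
    """Yield one dict per row, tagged with a running overall row index.

    `batches` is an iterable of (fragment_index, batch_index, rows) tuples,
    assumed to arrive in scan order. `__row_index` is a hypothetical column
    name, not an existing Acero field.
    """
    offset = 0  # number of rows seen in all earlier batches
    for fragment_index, batch_index, rows in batches:
        for position, value in enumerate(rows):
            yield {
                "__fragment_index": fragment_index,
                "__batch_index": batch_index,
                "__row_index": offset + position,
                "value": value,
            }
        # Advance the offset only after the whole batch has been emitted.
        offset += len(rows)
```

The sketch assumes batches are consumed in a fixed order; a parallel scan would need the batch sizes up front (or an ordering step) before the offsets could be assigned.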

Re: [DISCUSS][C++] Can we require CMake 3.16+ since 13.0.0?

2023-06-17 Thread Rusty Conover
Does this imply increasing the required glibc version for the manylinux wheels? Rusty On Thu, Jun 15, 2023 at 5:19 PM Sutou Kouhei wrote: > > Hi, > > We require CMake 3.5+ now because Ubuntu 18.04 ships 3.5. > We dropped support for Ubuntu 18.04 because it reached EOL. > > Can we require CMake 3

Re: [ANNOUNCE] New Arrow PMC member: Rok Mihevc

2025-04-05 Thread Rusty Conover
Congratulations Rok On Sat, Mar 22, 2025 at 08:52 Joris Van den Bossche < jorisvandenboss...@gmail.com> wrote: > Congrats Rok, and thanks for all your contributions! > > On Fri, 21 Mar 2025 at 11:19, Nic Crane wrote: > > > > Congrats Rok! > > > > On Thu, 20 Mar 2025, 12:33 Weston Pace, wrote: >

Re: Arrow Flight Endpoint Location URLs

2025-03-26 Thread Rusty Conover
Hi Jacob and Matt, I appreciate the opportunity to review the document. It helped clarify some things for me, but I’d like to propose an approach that is slightly different while still aligning with the overall goals. That said, I know you all have far more experience designing and implementing id

Arrow Flight Endpoint Location URLs

2025-03-25 Thread Rusty Conover
Hi Arrow Friends, I remember speaking with Matt T in Brussels about Arrow Flight endpoint locations that may not have their content conveyed via gRPC. Potentially an endpoint location could use regular HTTPS. I believe Matt said that the discussion about it was never brought to a conclusion. Was

Re: Request for comments on adding new IPC option 'ensure_memory_alignment'

2025-03-27 Thread Rusty Conover
Hi, This seems like a sensible approach and an improvement to developer/user experience. Rusty

Re: Arrow Flight Endpoint Location URLs

2025-03-28 Thread Rusty Conover
On Thu, Mar 27, 2025 at 1:56 PM Antoine Pitrou wrote: > > Indeed, it doesn't sound like a terrific use of Arrow maintainer time... > Especially as there's a growing feeling that Flight was not very well > designed, and should perhaps be slowly obsoleted in favor of more > focussed initiative (suc

Re: [DISCUSSION] Extending Arrow Flight Location URIs

2025-04-21 Thread Rusty Conover
Looks good to me Matt. Rusty On Mon, Apr 21, 2025 at 19:02 Matt Topol wrote: > Hi all, > > Following on from the previous discussion [1] and on the google doc > [2], I've put up a PR to formally update the documentation in the > Flight.proto file and the Flight.rst in the Arrow Docs [3]. > > Be

Re: [DISCUSS] Arrow Flight Predicate Pushdown

2025-03-28 Thread Rusty Conover
On Thu, Mar 27, 2025 at 12:01 PM David Li wrote: > It's not clear to me why (4) doesn't work. Can you speak more about the > flow of requests here? Putting a ticket inside a FlightDescriptor makes me > think something more complicated is going on. > I made a mistake in my description of method 4

[DISCUSS] Arrow Flight Predicate Pushdown

2025-03-27 Thread Rusty Conover
Hi everyone, I’d like to discuss the possibility of sending filtering predicates to an Arrow Flight server. Currently, it’s unclear what the "proper" approach is for achieving this. The GetFlightInfo method only accepts a FlightDescriptor without additional parameters. If it allowed for extra par
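
As a rough illustration of the problem, one conceivable workaround is to pack the predicate into the opaque command bytes that a FlightDescriptor can carry. The sketch below is pure Python and is only an assumption about how that could look, not the approach proposed in the thread: the JSON layout and the helpers `encode_command`, `decode_command`, and `apply_predicate` are hypothetical, and no Arrow Flight API is called.

```python
import json
import operator

# Comparison operators this sketch understands (an arbitrary, illustrative set).
_OPS = {"==": operator.eq, "<": operator.lt, ">": operator.gt}


def encode_command(table, predicate):
    """Client side: serialize a table name and predicate into command bytes.

    `predicate` is a [column, op, value] triple. The JSON layout is a
    hypothetical convention between one client and one server.
    """
    payload = {"table": table, "predicate": predicate}
    return json.dumps(payload, sort_keys=True).encode("utf-8")


def decode_command(command):
    """Server side: recover the table name and predicate from the bytes."""
    payload = json.loads(command.decode("utf-8"))
    return payload["table"], payload.get("predicate")


def apply_predicate(rows, predicate):
    """Filter a list of row dicts with a [column, op, value] predicate."""
    column, op, value = predicate
    compare = _OPS[op]
    return [row for row in rows if compare(row[column], value)]
```

The drawback of any such scheme is that the encoding is private to one client and server pair, which is presumably why a standard approach is being asked about.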