Re: Query on Memory Leaks with Arrow Flight client

2024-10-25 Thread Susmit Sarkar
Your point is valid. In my use case, the consumer uses an SDK through which we expose an API: the method reads the FlightStream completely with a tunable batch size, so the iterator is left partially traversed only if the process crashes or the user explicitly stops it. def

Re: [VOTE] Add Async C Data Interface

2024-10-25 Thread Ian Cook
Oh ok, thanks Matt, I understand. In that case I am +1 on the proposal but I would like to see notes added to the documentation to make this clearer to readers. I created an issue for this: https://github.com/apache/arrow/issues/44535 Thanks, Ian On Fri, Oct 25, 2024 at 2:54 PM Matt Topol wro

Re: Query on Memory Leaks with Arrow Flight client

2024-10-25 Thread Laurent Goujon
Yes, the first example was returning an iterator to the caller which would interact with the stream after the client is closed (lazy evaluation). The new code should work but may cause a leak if the iterator is not fully traversed. You may want to return a type which has both the Iterator and AutoC
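A minimal sketch of the suggestion above (a return type that is both an Iterator and AutoCloseable), written in plain Java so it can be called from Scala. The class name `CloseableIterator` and its constructor are hypothetical and not part of Arrow; the resources passed in would be whatever the caller must release, for example the stream and the client, in that order.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical helper: an Iterator that owns resources and releases them on close(). */
final class CloseableIterator<T> implements Iterator<T>, AutoCloseable {
    private final Iterator<T> delegate;
    private final AutoCloseable[] resources;
    private boolean closed = false;

    /** Resources are closed in the order given, e.g. stream first, then client. */
    CloseableIterator(Iterator<T> delegate, AutoCloseable... resources) {
        this.delegate = delegate;
        this.resources = resources;
    }

    @Override
    public boolean hasNext() {
        if (closed) {
            return false;
        }
        if (delegate.hasNext()) {
            return true;
        }
        close(); // fully traversed: release the resources eagerly
        return false;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return delegate.next();
    }

    /** Idempotent: closes every resource once, even if one of them fails. */
    @Override
    public void close() {
        if (closed) {
            return;
        }
        closed = true;
        RuntimeException failure = null;
        for (AutoCloseable resource : resources) {
            try {
                resource.close();
            } catch (Exception e) {
                if (failure == null) {
                    failure = new RuntimeException("failed to release resource", e);
                } else {
                    failure.addSuppressed(e);
                }
            }
        }
        if (failure != null) {
            throw failure;
        }
    }

    boolean isClosed() {
        return closed;
    }
}
```

A caller that may stop early can wrap the iterator in try-with-resources (or `scala.util.Using` from Scala) so the resources are released even when the iterator is abandoned midway; a caller that drains it gets the release automatically on the last `hasNext()`.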

Re: [VOTE] Add Async C Data Interface

2024-10-25 Thread Matt Topol
Given the promises of the C Data Interface, it's not viable to retire the non-device versions of the interfaces. But overall, it's better to prefer only adding new things in terms of the DeviceArray structs to avoid consumers having to create duplicate interfaces for both ArrowArray and ArrowDevice

Re: Query on Memory Leaks with Arrow Flight client

2024-10-25 Thread Susmit Sarkar
I believe I figured it out. I changed the logic slightly, and after the changes the issue no longer appears. def fetchDataStream(details: ObjectStoreDetails): Iterator[FlightStream] = { logger.info(s"Fetching data for S3 path: ${details.s3Path}") val ticketStr = buildTicketStr(details) logger.info

Re: Query on Memory Leaks with Arrow Flight client

2024-10-25 Thread Laurent Goujon
Haven't tested the code but isn't `FlightStream` a closeable as well? On Fri, Oct 25, 2024 at 3:40 AM Susmit Sarkar wrote: > Hi Team, > > We are seeing the issue often with Memory Leak: > > *JDK 11* > > "org.apache.arrow" % "arrow-flight" % "17.0.0", > "org.apache.arrow" % "arrow-vector" % "17.0

Re: [VOTE] Add Async C Data Interface

2024-10-25 Thread Ian Cook
Thanks Matt for doing this! I am +0.5 on the current proposal, because (if I understand correctly) it adds ArrowAsyncDeviceStreamHandler but does not add ArrowAsyncStreamHandler. I recognize that the C Device Stream Interface with a DeviceType of CPU is functionally equivalent to the C Stream Inte

Re: [VOTE] Add Async C Data Interface

2024-10-25 Thread Matt Topol
@pitrou I've updated the format PR to add the Experimental tag to the header and the documentation. Thanks! On Fri, Oct 25, 2024, 7:35 AM Antoine Pitrou wrote: > > +1, with the same comments as Felipe and Dewey. > > Just one condition from me: the API should be marked experimental. > > Regard

Re: [VOTE] Add Async C Data Interface

2024-10-25 Thread Antoine Pitrou
+1, with the same comments as Felipe and Dewey. Just one condition from me: the API should be marked experimental. Regards Antoine. On 24/10/2024 at 23:17, Felipe Oliveira Carvalho wrote: +1 from me. I reviewed the PR some time ago and it's not a trivial protocol, but the complexity

Query on Memory Leaks with Arrow Flight client

2024-10-25 Thread Susmit Sarkar
Hi Team, We often see a memory leak: *JDK 11* "org.apache.arrow" % "arrow-flight" % "17.0.0", "org.apache.arrow" % "arrow-vector" % "17.0.0", "org.apache.arrow" % "flight-core" % "17.0.0", 4-10-25 15:25:06.394 [main] ERROR o.apache.arrow.memory.BaseAllocator - Memory was le

Re: [VOTE] Release Apache Arrow 18.0.0 - RC0

2024-10-25 Thread Raúl Cumplido
Hi, As no one is against the currently proposed approach, I plan to close the vote in the next 24 hours. On Fri, 25 Oct 2024 at 9:27, Ruoxi Sun wrote: > > Hi Kou and Raul, > > Thanks for bringing this to the discussion. I would like to put a +1 on > Raul's proposal as this might be

Re: [VOTE] Release Apache Arrow 18.0.0 - RC0

2024-10-25 Thread Antoine Pitrou
I also agree that letting conda-forge carry the patch until 19.0.0 is a reasonable solution. It's much more light-weight than having us issue a new RC just for it, unfortunately. Regards Antoine. On 24/10/2024 at 17:07, Raúl Cumplido wrote: On Thu, 24 Oct 2024 at 0:14, Sutou Kouhei

Re: [VOTE] Release Apache Arrow 18.0.0 - RC0

2024-10-25 Thread Ruoxi Sun
Hi Kou and Raul, Thanks for bringing this to the discussion. I would like to put a +1 on Raul's proposal as this might be the best balance between the availability of conda-forge and our release cost. Another side (maybe dumb) question though: Is it regular/normal/officially-supported in conda-fo

Re: [VOTE] Add Async C Data Interface

2024-10-25 Thread Felipe Oliveira Carvalho
+1 from me. I reviewed the PR some time ago and it's not a trivial protocol, but the complexity seems warranted and necessary. On Thu, Oct 24, 2024 at 6:02 PM Dewey Dunnington wrote: > Thanks Matt for putting this together! > > I was initially concerned about the complexity of the proposal; > h