Re: Extract objects from CompressedOutputStream

2024-10-10 Thread Aldrin
If you think the problem is the context manager, then don't use it. As with file handles, the expectation is that exiting a context manager closes the thing it's managing. But even if you don't close it, the positions should be relevant when you open it again. Also, looking at th
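
The point about context managers can be illustrated without Arrow. The following is a minimal sketch that uses the standard library's `io.BytesIO` as a stand-in for an output stream (an assumption for illustration only, not the pyarrow API). Leaving a `with` block closes the stream, while managing it by hand keeps it open, so a recorded position can still be used.

```python
import io

# With a context manager: the stream is closed when the block exits.
managed = io.BytesIO()
with managed:
    managed.write(b"first object")
closed_after_with = managed.closed

# Without a context manager: the caller decides when to close.
manual = io.BytesIO()
manual.write(b"first object")
second_start = manual.tell()   # position where the second object begins
manual.write(b"second object")

manual.seek(second_start)      # the recorded position is still usable
second = manual.read()
manual.close()
```

Here `closed_after_with` ends up true, and `second` holds only the bytes written after the recorded position.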

Re: [DISCUSS][C++] Store C++ shared_ptr in arrow table

2024-10-10 Thread Aldrin
I'm fairly sure uintptr_t is an integer type for holding a pointer in C++ (docs specifically say "to void" aka `void*`). It should be equivalent to uint64_t on 64-bit systems, but where I agree it is risky is that it is going to be platform dependent and there are likely nuances for certain comp
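
The thread is about C++, but the idea of carrying a pointer around as a plain integer can be sketched in Python with `ctypes`. This is only an analogy under stated assumptions: `ctypes.addressof` returns the buffer's address as an integer, which plays the role a `uintptr_t` value would play, and the buffer name and contents are made up for the example.

```python
import ctypes

# Keep a reference to `buf`: the integer address is only valid while it is alive.
buf = ctypes.create_string_buffer(b"payload")

# The pointer value as a plain integer.
address = ctypes.addressof(buf)

# Pointer width is platform dependent (commonly 64 bits, but not guaranteed).
pointer_bits = ctypes.sizeof(ctypes.c_void_p) * 8

# Rebuild a view of the same memory from the integer alone.
view = (ctypes.c_char * len(buf)).from_address(address)
```

The integer carries no ownership information, so nothing keeps the underlying object alive once only the number is stored.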

Re: Extract objects from CompressedOutputStream

2024-10-10 Thread Robert McLeod
Aldrin, I think my trouble is coming from the fact that a `CompressedOutputStream` closes the `BufferedOutputStream` when it exits its context manager. I don't know if that is required or if it is simply an oversight because no one else has tried to fetch individual objects from a compressed strea
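
One way to picture the problem and a possible workaround, using only the standard library rather than pyarrow: `io.BufferedWriter` also closes the stream it wraps when its context manager exits. The sketch below puts a hypothetical `KeepOpen` adapter in between so the sink survives, and records an offset and length per object. It assumes each object is compressed on its own with `zlib`, which is a design choice of this example and not something the thread states.

```python
import io
import zlib


class KeepOpen(io.RawIOBase):
    """Hypothetical adapter: forwards writes, but close() leaves `inner` open."""

    def __init__(self, inner):
        super().__init__()
        self._inner = inner

    def writable(self):
        return True

    def write(self, data):
        return self._inner.write(data)

    def close(self):
        # Flush the inner stream instead of closing it.
        if not self.closed:
            self._inner.flush()
        super().close()


sink = io.BytesIO()
index = []  # (offset, length) of each compressed object inside `sink`

for payload in (b"alpha" * 100, b"beta" * 100):
    start = sink.tell()
    # BufferedWriter closes its raw stream on exit; KeepOpen absorbs that close.
    with io.BufferedWriter(KeepOpen(sink)) as writer:
        writer.write(zlib.compress(payload))
    index.append((start, sink.tell() - start))


def fetch(i):
    """Read back and decompress the i-th object using the recorded index."""
    offset, length = index[i]
    sink.seek(offset)
    return zlib.decompress(sink.read(length))
```

Because every object is a separate compressed frame, `fetch(1)` can decompress the second object without touching the first.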

Re: [DISCUSS][C++] Store C++ shared_ptr in arrow table

2024-10-10 Thread Jorge Cardoso Leitão
Hi, This use-case seems semantically equivalent to storing Python objects in Arrow for the purpose of putting them in an Arrow table. This can be achieved by some form of pickling or indirection (I recall Polars and others doing one of these). IMO there are different approaches with different t

Re: [DISCUSS][C++] Store C++ shared_ptr in arrow table

2024-10-10 Thread Felipe Oliveira Carvalho
Hi, Yi Cao's request comes from a misunderstanding of where the performance of Arrow comes from. Arrow arrays follow the SoA paradigm [1]. The moment you start thinking about individual objects with an associated ref-count (std::shared_ptr) is the moment you've given up the SoA approach and you a
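
The SoA (structure of arrays) versus per-object layout contrast can be sketched in a few lines. The example below is in Python rather than C++ and is only an illustration: the `Point` class and the column names are hypothetical, and `array.array` stands in for a contiguous column buffer.

```python
from array import array
from dataclasses import dataclass


@dataclass
class Point:
    # Array-of-structs style: one separately allocated object per row.
    x: float
    y: float


aos = [Point(1.0, 10.0), Point(2.0, 20.0), Point(3.0, 30.0)]

# Structure-of-arrays style: one contiguous buffer per field, no per-row objects.
soa = {
    "x": array("d", [1.0, 2.0, 3.0]),
    "y": array("d", [10.0, 20.0, 30.0]),
}

sum_x_aos = sum(p.x for p in aos)  # visits every row object to reach one field
sum_x_soa = sum(soa["x"])          # scans a single contiguous column
```

Both sums give the same result; the difference is that the columnar version never has to follow a reference to an individual row object.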

Re: [DISCUSS][C++] Store C++ shared_ptr in arrow table

2024-10-10 Thread Andrew Bell
On Thu, Oct 10, 2024 at 4:18 PM Felipe Oliveira Carvalho wrote:
>
> Hi,
>
> Yi Cao's request comes from a misunderstanding of where the performance of
> Arrow comes from.
>
> Arrow arrays follow the SoA paradigm [1]. The moment you start thinking about
> individual objects with an associated ref