[C++] Parquet file read from s3

2024-11-28 Thread Surya Kiran Gullapalli
Hello all, I am trying to read a Parquet file (50 MB) from S3, and it is taking much longer than it did with Arrow 12.0.1. I've enabled threads (use_threads=true), set the batch size to 1024*1024, and set the IOThreadPoolCapacity to 32. When I time the parquet read from s3 using boost timer shows cpu usag

Re: [C++] Parquet file read from s3

2024-11-28 Thread Raúl Cumplido
Thanks for raising the issue. Could you share a snippet of the code you are using to read the file? Does the performance decrease also happen with other file sizes, or is it tied to the file size? Thanks, Raúl On Thu, 28 Nov 2024, 13:58, Surya Kiran Gullapalli

Re: [C++] Parquet file read from s3

2024-11-28 Thread Surya Kiran Gullapalli
Thanks for the quick response. When the files are small (less than 10 MB), I don't see a noticeable difference, but beyond that size I do. I'll send a snippet in due course. Surya On Thu, Nov 28, 2024 at 6:37 PM Raúl Cumplido wrote: > Thanks for raising the issue. >
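
The thread compares read times measured with a Boost timer, which reports both wall-clock and CPU time. A minimal standard-library sketch of the same kind of measurement is shown here; the names `Timing` and `time_call` are hypothetical and come neither from the thread nor from Arrow. It assumes a POSIX-like platform, where `std::clock()` reports CPU time for the whole process (on Windows it tracks wall time instead).

```cpp
#include <chrono>
#include <ctime>
#include <utility>

// Hypothetical helper type: wall-clock and process CPU time for one call, in seconds.
struct Timing {
    double wall_seconds;
    double cpu_seconds;
};

// Runs fn() once and reports elapsed wall time and process CPU time.
// On POSIX, std::clock() sums CPU time across all threads of the process,
// so cpu_seconds can exceed wall_seconds while a thread pool is busy.
template <typename Fn>
Timing time_call(Fn&& fn) {
    const auto wall_start = std::chrono::steady_clock::now();
    const std::clock_t cpu_start = std::clock();

    std::forward<Fn>(fn)();  // the work being measured, e.g. the file read

    const std::clock_t cpu_end = std::clock();
    const auto wall_end = std::chrono::steady_clock::now();

    return Timing{
        std::chrono::duration<double>(wall_end - wall_start).count(),
        static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC};
}
```

Wrapping the read in `time_call([&] { /* read the table here */ })` under each Arrow version, and for several file sizes, gives comparable numbers to post. A wall time much larger than the CPU time would suggest the read is mostly waiting on the network rather than decoding, though that is only a first-pass indicator.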

CVE-2024-52338: Apache Arrow R package: Arbitrary code execution when loading a malicious data file

2024-11-28 Thread Dewey Dunnington
Severity: critical

Affected versions:
- Apache Arrow R package 4.0.0 through 16.1.0

Description:
Deserialization of untrusted data in IPC and Parquet readers in the Apache Arrow R package versions 4.0.0 through 16.1.0 allows arbitrary code execution. An application is vulnerable if it reads