Re: [DISCUSS][C++] Raw pointer string views

2023-09-28 Thread Felipe Oliveira Carvalho
My take here is that Ben did an excellent job in hiding the fact that C++ has two variations of the format without leaking the pointer version via the interfaces through which Arrow arrays are communicated to other implementations. As things stand right now, there is no zero-copy transfer of point

Re: [DISCUSS][C++] Raw pointer string views

2023-09-28 Thread Benjamin Kietzman
@Wes > 3. Implement the raw pointer variant as an extension type in C++ / C ABI. @Andrew > 1. Update the standard to allow raw pointers. If adding raw pointers to the C ABI is a satisfactory compromise, then I'd be happy to draft a PR adding it. To me this seems to cover the bases of accommodating ei

Re: [DISCUSS][C++] Raw pointer string views

2023-09-28 Thread Andrew Lamb
> What this PR is creating is an "unofficial" Arrow format, with data types exposed in Arrow C++ that are not part of the Arrow standard, but are exposed as if they were. I agree with Antoine here. It seems a pretty clear-cut story: the C++ implementation doesn't follow the spec and thus we shou

Re: [DISCUSS][C++] Raw pointer string views

2023-09-28 Thread Raphael Taylor-Davies
FWIW Rust wouldn't have issues using raw pointers, I can't speak for other languages though. They would be more expensive to validate, but validation is not going to be cheap regardless. I could definitely see a world where view types use pointers and IPC coerces to/from the large non-view type

Re: [DISCUSS][C++] Raw pointer string views

2023-09-28 Thread Wes McKinney
hi all, I'm just catching up on this thread after having taken a look at the format PRs, the C++ implementation PR, and this e-mail thread. So only my $0.02 from having spent a great deal less time on this project than others. The original motivation I had for bringing up the idea of adding the S

[CROWDSOURCING] 2023 ASF Board Report -- October 11, 2023

2023-09-28 Thread Andrew Lamb
Hello Arrow Community, Please add any comments or board content directly to [1] or reply to this email and I will incorporate your comments. You can see what we currently have at the end of this email. One of the responsibilities of being part of the Apache Software Foundation (ASF) is to regular

Re: [DISCUSS][C++] Raw pointer string views

2023-09-28 Thread Antoine Pitrou
To make things clear, any of the factory functions listed below create a type that maps exactly onto an Arrow columnar layout: https://arrow.apache.org/docs/dev/cpp/api/datatype.html#factory-functions For example, calling `arrow::dictionary` creates a dictionary type that exactly represents

Blog post about new UTF8View

2023-09-28 Thread Andrew Lamb
While we have added support in Arrow for Utf8View Arrays[1], along with the required implementation, I don't think we have written a blog post about it. I think blog posts announcing and describing new features at a higher technical level, with diagrams, are critical to quick and widespread adopti

Re: [DISCUSS][C++] Raw pointer string views

2023-09-28 Thread Antoine Pitrou
Hi Ben, On 27/09/2023 at 23:25, Benjamin Kietzman wrote: @Antoine What this PR is creating is an "unofficial" Arrow format, with data types exposed in Arrow C++ that are not part of the Arrow standard, but are exposed as if they were. We already do this in every implementation of the arr