Re: [Rust] Column names in FFI_ArrowSchema

2021-02-22 Thread Marc Prud'hommeaux
through to the FFI_ArrowSchema without an unreasonably large refactor would be appreciated! –Marc On 2021/02/20 15:19:00, Marc Prud'hommeaux wrote: > > Great! I will start experimenting and see how far I get. > > While we're at it, should we consider putting something in the

Re: [Rust][DataFusion] Inconsistent array ordering with "GROUP BY" SQL

2021-02-20 Thread Marc Prud'hommeaux
re: > > 1. The use of hash-based data structures, as you mention > 2. If you have partitioned data then it is processed on multiple threads > and that can affect ordering as well > > Andy. > > On Sat, Feb 20, 2021 at 7:31 AM Marc Prud'hommeaux > wrote: > >

Re: [Rust] Column names in FFI_ArrowSchema

2021-02-20 Thread Marc Prud'hommeaux
lease bit, as we do in the > DataType string. Think of that function as the "Drop" equivalent for ffi. > > Best, > Jorge > > > On Sat, Feb 20, 2021 at 4:24 AM Marc Prud'hommeaux > wrote: > > > When I export to the C data interface structs with ar

[Rust][DataFusion] Inconsistent array ordering with "GROUP BY" SQL

2021-02-20 Thread Marc Prud'hommeaux
When I group by a column in DataFusion SQL, the order of the results is different every time. For example, "select country from data group by country" against https://github.com/Teradata/kylo/blob/master/samples/sample-data/csv/userdata3.csv might return "Moldova" first one time, and then "Swed
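
A minimal, standard-library-only sketch of why hash-based grouping can return groups in a different order on each run, and how an explicit sort (the analogue of adding ORDER BY to the query) makes the result stable. This is not DataFusion code; the function names group_unordered and group_ordered are hypothetical and exist only for illustration.

```rust
use std::collections::HashMap;

/// Groups values with a hash map, roughly as a hash-based GROUP BY might.
/// HashMap iteration order is unspecified, and with the default randomly
/// seeded hasher it can differ from one run of the program to the next.
fn group_unordered(rows: &[&str]) -> Vec<String> {
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for row in rows {
        *counts.entry(*row).or_insert(0) += 1;
    }
    counts.keys().map(|k| k.to_string()).collect()
}

/// Same grouping followed by an explicit sort, so the output order no
/// longer depends on how the hash map happens to lay out its keys.
fn group_ordered(rows: &[&str]) -> Vec<String> {
    let mut keys = group_unordered(rows);
    keys.sort();
    keys
}
```

Calling group_ordered(&["Sweden", "Moldova", "Sweden"]) always yields Moldova before Sweden, whereas group_unordered makes no promise about which comes first.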

[Rust] Column names in FFI_ArrowSchema

2021-02-19 Thread Marc Prud'hommeaux
When I export to the C data interface structs with array.to_raw(), I'm seeing that FFI_ArrowSchema.name is always null. And looking at ArrowArray::try_new, it appears that FFI_ArrowSchema is only ever created with a format argument; name and metadata are never set to anything. Is there any spec
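
For context, a rough sketch of what carrying a column name through a C-data-interface schema struct could look like. It assumes the field layout published in the Arrow C data interface specification; SchemaSketch, new_schema, schema_name and release_schema are hypothetical names and are not the arrow crate's FFI_ArrowSchema or its API. The strings are allocated on export and freed in the release callback.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::{c_char, c_void};
use std::ptr;

/// Hypothetical mirror of the C data interface schema struct.
#[repr(C)]
struct SchemaSketch {
    format: *const c_char,
    name: *const c_char,
    metadata: *const c_char,
    flags: i64,
    n_children: i64,
    children: *mut *mut SchemaSketch,
    dictionary: *mut SchemaSketch,
    release: Option<unsafe extern "C" fn(*mut SchemaSketch)>,
    private_data: *mut c_void,
}

/// Release callback: frees the strings allocated in new_schema and marks
/// the struct as released by clearing the callback.
unsafe extern "C" fn release_schema(schema: *mut SchemaSketch) {
    if schema.is_null() {
        return;
    }
    unsafe {
        let s = &mut *schema;
        if !s.format.is_null() {
            drop(CString::from_raw(s.format as *mut c_char));
            s.format = ptr::null();
        }
        if !s.name.is_null() {
            drop(CString::from_raw(s.name as *mut c_char));
            s.name = ptr::null();
        }
        s.release = None;
    }
}

/// Builds a childless schema that carries both a format string and a name.
fn new_schema(format: &str, name: &str) -> SchemaSketch {
    SchemaSketch {
        format: CString::new(format).expect("no interior NUL").into_raw(),
        name: CString::new(name).expect("no interior NUL").into_raw(),
        metadata: ptr::null(),
        flags: 0,
        n_children: 0,
        children: ptr::null_mut(),
        dictionary: ptr::null_mut(),
        release: Some(release_schema),
        private_data: ptr::null_mut(),
    }
}

/// Reads the name back, returning None when the pointer is null.
fn schema_name(schema: &SchemaSketch) -> Option<String> {
    if schema.name.is_null() {
        return None;
    }
    Some(unsafe { CStr::from_ptr(schema.name) }.to_string_lossy().into_owned())
}
```

A consumer would read the name while the struct is live and then invoke the release callback exactly once; after that, both string pointers are null.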

Re: [Rust] Datafusion table un-registration

2021-02-08 Thread Marc Prud'hommeaux
haven't looked at that > code lately). > > Thanks, > > Andy. > > > On Mon, Feb 8, 2021 at 8:41 AM Marc Prud'hommeaux > wrote: > > > I have a long-running ExecutionContext that needs to have parquet files > > registered and un-registered over t

[Rust] Datafusion table un-registration

2021-02-08 Thread Marc Prud'hommeaux
I have a long-running ExecutionContext that needs to have parquet files registered and un-registered over time. Registration happens via register_table(), but there doesn't seem to be any corresponding unregister_table() function to later remove the table. Is this something that might be useful
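
To make the request concrete, here is a small sketch of the register/unregister pairing being asked about, written against a plain HashMap. TableRegistry and its methods are hypothetical stand-ins, not DataFusion's ExecutionContext; the only assumption is that tables are keyed by name.

```rust
use std::collections::HashMap;

/// Hypothetical name-to-provider registry; T stands in for whatever
/// object represents a registered table.
struct TableRegistry<T> {
    tables: HashMap<String, T>,
}

impl<T> TableRegistry<T> {
    fn new() -> Self {
        Self { tables: HashMap::new() }
    }

    /// Registers a table, returning the provider it replaced, if any.
    fn register_table(&mut self, name: &str, provider: T) -> Option<T> {
        self.tables.insert(name.to_string(), provider)
    }

    /// Removes a table by name, returning its provider if it was registered.
    fn unregister_table(&mut self, name: &str) -> Option<T> {
        self.tables.remove(name)
    }

    /// Reports whether a table with this name is currently registered.
    fn contains(&self, name: &str) -> bool {
        self.tables.contains_key(name)
    }
}
```

Returning the removed provider lets a long-running caller decide when to drop any resources tied to it.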