GitHub user v1gnesh closed a discussion: Import deeply nested Rust struct/enum 
with custom types

In DuckDB, when I have an ndjson file, I can use `CREATE TABLE t AS SELECT * 
FROM read_ndjson('file.ndjson');`.
I would like to avoid the intermediate ndjson step if I can, since the 
objective is to go from deeply nested Rust structs/enums to the Arrow world as 
transparently as possible.
Currently, I'm serializing `BigStruct` to ndjson `String`, one at a time, and 
then writing out the ndjson file (8x the size of the source file). Then, using 
DuckDB's SQL above, I'm able to automagically get the data types back from 
plaintext (JSON).

It would be very desirable to go directly from Rust data types to the native 
struct data type of DataFusion, DuckDB, etc.
Note that I don't have a `Vec<BigStruct>`; `BigStruct`s are being produced in 
an async Stream.
I understand this could be an ArrayOfStruct to StructOfArray 'problem', but I 
don't have an ArrayOfStruct to begin with, as they are produced in a streaming 
fashion (too many to keep it all in memory).
Writing JSON into DuckDB as in [this 
example](https://github.com/duckdb/duckdb-rs/blob/main/src/types/serde_json.rs) 
does not work for me (it just writes the bytes out as decimal numbers), and I 
lose all type information along the way (`read_ndjson` via the CLI recreates 
all of it, though). Native support for Rust data types is still a work in 
progress there.
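
To illustrate the streaming ArrayOfStruct to StructOfArray conversion described above, here is a minimal sketch using only the Rust standard library. It buffers at most `batch_size` rows at a time and transposes them into per-field columns. `Reading`, `ReadingBatch`, and `batches` are hypothetical names made up for this illustration, a plain `Iterator` stands in for the async Stream, and the `Vec` columns stand in for real Arrow arrays. It is not DataFusion's or DuckDB's API.

```rust
/// Hypothetical stand-in for one streamed record
/// (the real `BigStruct` is far larger and more nested).
#[derive(Debug, Clone, PartialEq)]
struct Reading {
    id: u64,
    label: String,
    value: Option<f64>,
}

/// One columnar batch: a struct of arrays, one `Vec` per field.
#[derive(Debug, Default, PartialEq)]
struct ReadingBatch {
    id: Vec<u64>,
    label: Vec<String>,
    value: Vec<Option<f64>>,
}

impl ReadingBatch {
    fn with_capacity(n: usize) -> Self {
        ReadingBatch {
            id: Vec::with_capacity(n),
            label: Vec::with_capacity(n),
            value: Vec::with_capacity(n),
        }
    }

    /// Moves one row into the columns (the row-to-column transpose).
    fn push(&mut self, row: Reading) {
        self.id.push(row.id);
        self.label.push(row.label);
        self.value.push(row.value);
    }
}

/// Adapter that lazily turns a row iterator into columnar batches,
/// so only one batch of rows is held in memory at a time.
struct Batches<I> {
    rows: I,
    batch_size: usize,
}

impl<I: Iterator<Item = Reading>> Iterator for Batches<I> {
    type Item = ReadingBatch;

    fn next(&mut self) -> Option<ReadingBatch> {
        let mut batch = ReadingBatch::with_capacity(self.batch_size);
        // `by_ref` borrows the source, so rows left over stay for the next call.
        for row in self.rows.by_ref().take(self.batch_size) {
            batch.push(row);
        }
        // An empty batch means the source is exhausted.
        if batch.id.is_empty() {
            None
        } else {
            Some(batch)
        }
    }
}

/// Convenience constructor for the adapter above.
fn batches<I: Iterator<Item = Reading>>(rows: I, batch_size: usize) -> Batches<I> {
    Batches { rows, batch_size }
}
```

Calling `batches(rows, 1024)` would yield one `ReadingBatch` per 1024 rows, each of which could be handed to the engine and then dropped. With a real async Stream the loop would await each row instead of iterating, and the schema would still need to be derived or inferred separately, which is the part this sketch does not address.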

Do you think this is something that is good to have for DataFusion, and if so, 
is it something in the works already?
Are there any examples I can look at?

Oh, and inferred schema would be best. The `BigStruct`s are quite big, and 
conceal a whole lot of variations. It would be a nightmare to write the schema 
for all of them.

Thanks in advance.

GitHub link: https://github.com/apache/datafusion/discussions/7484

----
This is an automatically sent email for [email protected].
To unsubscribe, please send an email to: 
[email protected]

