Re: [DISCUSS] Donation of a Spark native engine based on DataFusion & Arrow

2024-01-11 Thread Andrew Lamb
I am very supportive of this donation. I know of at least one other DataFusion-based project, blaze-rs[1], which has the same design goal, and bringing this project into the ASF may help consolidate these efforts. As Andy said, I believe it was very valuable to have a major consumer project (e.g. Da

Re: [DISCUSS] Donation of a Spark native engine based on DataFusion & Arrow

2024-01-11 Thread Parth Chandra
Full disclosure: I worked on the original value vector implementation that became Apache Arrow and currently work with Chao et al. on the native engine that is being discussed. I believe that integration of DataFusion with Spark will drive both development and user interest in arrow-rs and DataFusi

ADBC: xdbc_data_type and xdbc_sql_data_type

2024-01-11 Thread David Coe
I recently raised "csharp/src/Apache.Arrow/Types/ArrowType: There are different type IDs for values after 21, including Decimal128 and Decimal256, than for Python" (Issue #39568, apache/arrow, github.com) because I have a downstream system that is i

Re: ADBC: xdbc_data_type and xdbc_sql_data_type

2024-01-11 Thread David Li
Those values are inherited from Flight SQL [1] which effectively borrowed types from JDBC/ODBC. xdbc_sql_data_type [2] is defined by an enum [3]. This is the database's type in its SQL dialect, not the Arrow type. Arrow types are always represented in Arrow schemas. (This field is a little cont

Re: ADBC: xdbc_data_type and xdbc_sql_data_type

2024-01-11 Thread Curt Hagenlocher
Interestingly, the description of sql_data_type in FlightSql.proto includes "The value of the SQL DATA TYPE which has the same values as data_type value." On Thu, Jan 11, 2024 at 10:06 AM David Li wrote: > Those values are inherited from Flight SQL [1] which effectively borrowed > types from J

Re: ADBC: xdbc_data_type and xdbc_sql_data_type

2024-01-11 Thread James Duong
(Referring to the CommandXTypeInfo message): The intent of the SQL_DATA_TYPE field was to hold source-specific data type codes rather than the usual external-facing SQL types reported in ODBC and JDBC API calls. A developer writing code for a specific database, but using a general API might be able

Re: [DISCUSS] Donation of a Spark native engine based on DataFusion & Arrow

2024-01-11 Thread L. C. Hsieh
Spark, as a widely used computation engine in industry, has momentum from developers and users. I believe that the integration with DataFusion not only can help drive Spark to the next level of performance with a new native execution engine, but also can attract more developer attention int

Re: [DISCUSS] Donation of a Spark native engine based on DataFusion & Arrow

2024-01-11 Thread Micah Kornfield
It sounds like there is likely enough support for this to move forward. I'd guess the next steps are to work on the donation process/vote. Probably someone more involved with DataFusion should help drive this effort? On Thu, Jan 11, 2024 at 12:55 PM L. C. Hsieh wrote: > Spark as a widely used compu

Re: [DISCUSS] Donation of a Spark native engine based on DataFusion & Arrow

2024-01-11 Thread Albert
As Andrew Lamb mentioned, blaze-rs has similar goals; I'd really be interested to see some comparisons once the donation is made. All in all, I look forward to the new native project for Spark acceleration. On Thu, Jan 11, 2024 at 9:50 PM Andrew Lamb wrote: > I am very supportive of this do