Hello Brendan,

Welcome to the community. In addition to the folks at Dremio, I wanted to make 
you aware of the Python ODBC client library 
https://github.com/blue-yonder/turbodbc which provides a high-performance 
ODBC<->Arrow adapter. It is especially popular with MS SQL Server users as the 
fastest known way to retrieve query results as DataFrames in Python from SQL 
Server, considerably faster than pandas.read_sql or using pyodbc directly.
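
For illustration, here is a minimal sketch of how such a retrieval typically 
looks. The helper name, the DSN and the query are placeholders I made up, and 
it assumes turbodbc was built with Arrow support:

```python
def fetch_query_as_dataframe(dsn, query):
    """Run `query` over ODBC and return the result as a pandas DataFrame.

    `dsn` is the name of an ODBC data source configured on the client
    (hypothetical here; it depends on your local ODBC setup).
    """
    # Imported lazily so this module loads even where turbodbc is missing.
    from turbodbc import connect

    connection = connect(dsn=dsn)
    try:
        cursor = connection.cursor()
        cursor.execute(query)
        # fetchallarrow() returns the whole result set as a pyarrow.Table.
        table = cursor.fetchallarrow()
    finally:
        connection.close()
    # Convert the Arrow table to a pandas DataFrame.
    return table.to_pandas()


# Example call (placeholder DSN and table name, not run here):
# df = fetch_query_as_dataframe("MSSQL", "SELECT * FROM my_table")
```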

Even though it is the fastest known option, I can tell that a lot of CPU time 
is still spent in the ODBC driver "transforming" results so that they match 
the ODBC interface. Here at least, one could possibly get much better 
performance when retrieving large columnar results from SQL Server by going 
through Arrow Flight as an interface instead of being constrained to ODBC, 
which is less efficient for this use case. Currently there is a performance 
difference of 50x between reading the data from a Parquet file and reading the 
same data from a table in SQL Server (simple SELECT, no filtering or the 
like). Since the client CPU is at 100% for nearly the full retrieval time, 
using a more efficient protocol for data transfer could roughly translate into 
a 10x speedup.

Best,
Uwe

On Wed, May 20, 2020, at 12:16 AM, Brendan Niebruegge wrote:
> Hi everyone,
> 
> I wanted to informally introduce myself. My name is Brendan Niebruegge, 
> I'm a Software Engineer in our SQL Server extensibility team here at 
> Microsoft. I am leading an effort to explore how we could integrate 
> Arrow Flight with SQL Server. We think this could be a very interesting 
> integration that would both benefit SQL Server and the Arrow community. 
> We are very early in our thoughts so I thought it best to reach out 
> here and see if you had any thoughts or suggestions for me. What would 
> be the best way to socialize my thoughts to date? I am keen to learn 
> and deepen my knowledge of Arrow as well so please let me know how I 
> can be of help to the community.
> 
> Please feel free to reach out anytime (email:brn...@microsoft.com)
> 
> Thanks,
> Brendan Niebruegge
> 
>
