There are no guidelines.

As long as things are well documented and (if needed) there is a way to
disable or reduce any resource usage that might be excessive, I don't think
we need to restrict anything.
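
For example, resource caps like the upload buffer size could be surfaced as
documented, tunable driver options. A rough sketch in Go of what I have in
mind (the option key, default, and types here are purely illustrative and
not part of the actual driver):

package main

import (
	"fmt"
	"strconv"
)

const (
	// Hypothetical option key; not an actual ADBC/Snowflake driver key.
	optionUploadBufferSize = "adbc.snowflake.statement.ingest_upload_buffer_bytes"
	// Default of roughly 10 MB, in line with the limit mentioned below.
	defaultUploadBufferBytes = 10 << 20
)

// ingestOptions holds the tunables a driver could document and expose.
type ingestOptions struct {
	uploadBufferBytes int64
}

// setOption parses a string option much like an ADBC SetOption call would,
// letting callers raise or lower the cap instead of being stuck with a
// hard-coded limit.
func (o *ingestOptions) setOption(key, value string) error {
	switch key {
	case optionUploadBufferSize:
		n, err := strconv.ParseInt(value, 10, 64)
		if err != nil || n <= 0 {
			return fmt.Errorf("invalid value %q for %s", value, key)
		}
		o.uploadBufferBytes = n
		return nil
	default:
		return fmt.Errorf("unknown option %q", key)
	}
}

func main() {
	opts := ingestOptions{uploadBufferBytes: defaultUploadBufferBytes}
	// A caller with plenty of memory opts into larger upload buffers.
	if err := opts.setOption(optionUploadBufferSize, strconv.Itoa(64<<20)); err != nil {
		fmt.Println(err)
	}
	fmt.Println("upload buffer bytes:", opts.uploadBufferBytes)
}

The specific key naming doesn't matter; the point is that the cap is
documented and adjustable rather than baked in.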

I'm not sure what that reference to the JDBC specification in Snowflake
means. It might be referring to their own JDBC driver implementation, if
that has a similar limit.
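
On the 10 MB question itself: I don't think we need to match that exact
number, but a streaming, size-bounded approach keeps peak memory near one
chunk without writing the whole dataset to disk first. Another rough sketch
(stdlib only; the upload callback, sizes, and names are illustrative, not
gosnowflake's or the driver's actual code):

package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// Roughly the 10 MB chunk size mentioned below; illustrative, not a
// required limit.
const chunkSize = 10 << 20

// uploadChunked reads from r and hands off at most chunkSize bytes at a
// time, so peak memory stays near one chunk regardless of payload size.
// The upload callback stands in for whatever staging/PUT call the driver
// would actually make.
func uploadChunked(r io.Reader, upload func(chunk []byte) error) error {
	buf := make([]byte, chunkSize)
	for {
		n, err := io.ReadFull(r, buf)
		if n > 0 {
			if upErr := upload(buf[:n]); upErr != nil {
				return upErr
			}
		}
		if errors.Is(err, io.EOF) || errors.Is(err, io.ErrUnexpectedEOF) {
			return nil // stream exhausted
		}
		if err != nil {
			return err
		}
	}
}

func main() {
	payload := strings.NewReader(strings.Repeat("x", 25<<20)) // 25 MiB
	chunks := 0
	err := uploadChunked(payload, func(chunk []byte) error {
		chunks++
		fmt.Printf("chunk %d: %d bytes\n", chunks, len(chunk))
		return nil // replace with the real network upload
	})
	if err != nil {
		fmt.Println("upload failed:", err)
	}
}

The same idea applies to the storage question: rotating fixed-size chunks
bounds disk usage the same way it bounds memory.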

On Fri, Dec 8, 2023, at 10:09, Joel Lubinitsky wrote:
> I am working on some improvements to bulk ingestion for the Snowflake ADBC
> driver[1] and have been investigating existing implementations in related
> libraries.
>
> The current driver implementation defers to the gosnowflake library
> to handle this. In Snowflake's implementation, uploads are buffered into
> chunks no more than 10 MB in size before sending across the network. They
> claim this limit comes from the JDBC specification[2]. I wasn't able to
> find documentation for this limit, but it got me thinking about potential
> assumptions consumers of ADBC might have regarding resources that a driver
> would utilize. Should our implementation respect this 10 MB limit? If not,
> is there any specific limit we should target?
>
> Similarly, are there any expectations regarding storage usage? The
> snowflake-connector-python write_pandas() implementation uses a different
> approach, saving the dataframe to parquet files in a temp directory and
> then uploading them[3]. We likely don't want to save all data to disk
> before uploading given the size of data this API is intended to handle, but
> even a chunked implementation could produce large files.
>
> Is there a set of guidelines on limits for memory, storage, or other
> resource usage for ADBC drivers? If not, should there be?
>
> Thanks,
> Joel Lubinitsky
>
> [1] https://github.com/apache/arrow-adbc/issues/1327
> [2]
> https://github.com/snowflakedb/gosnowflake/blob/master/bind_uploader.go#L21
> [3]
> https://github.com/snowflakedb/snowflake-connector-python/blob/main/src/snowflake/connector/pandas_tools.py#L168
