timsaucer commented on issue #513:
URL:
https://github.com/apache/datafusion-python/issues/513#issuecomment-3361133274
I've been using Claude to assist me in trying to understand the conventions,
so take this with a grain of salt.
- DuckDB: `result = duckdb.sql("SELECT * FROM df", df=pandas_df)` (but we've
seen they can also pull directly from scope)
- Spark: `spark.sql("SELECT * FROM users WHERE age > :age", age=25)`
- Daft: `daft.sql("SELECT * FROM df", catalog={"df": df_customer})`
- Pandas: `query = "SELECT * FROM df_customer"` (direct injection, it looks
like)
- PostgreSQL: `cursor.execute("SELECT * FROM users WHERE age > %s", (25,))`
or, with named parameters, `cursor.execute("SELECT * FROM users WHERE age >
%(age)s", {"age": 25})`
One potential problem with the proposed `ctx.sql("select c_custkey, c_name
from {df}", df=df_customer)` is that it gets messy if the user also uses
f-string replacement. For example, suppose they wrote `ctx.sql(f"select
{key_of_interest}, c_name from {df}", df=df_customer)`. I expect this would
go very poorly: because of the `f" "` prefix, Python would try to format `df`
into the string itself before `ctx.sql` ever sees the `{df}` placeholder.
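To make the f-string concern concrete, here is a minimal sketch. `FakeDataFrame` and `fake_sql` are hypothetical stand-ins I made up for illustration (they are not datafusion APIs); `fake_sql` simply returns the query text it was handed, so we can see what a `ctx.sql`-like function would actually receive.

```python
class FakeDataFrame:
    """Hypothetical stand-in for a DataFrame; not a datafusion class."""

    def __str__(self) -> str:
        return "<FakeDataFrame object>"


def fake_sql(query: str, **named) -> str:
    """Hypothetical stand-in for ctx.sql: returns the text it received."""
    return query


df_customer = FakeDataFrame()
df = df_customer  # assume a variable named `df` happens to be in scope
key_of_interest = "c_custkey"

# Plain string: the callee receives the `{df}` placeholder untouched
# and could resolve it from the keyword argument.
plain = fake_sql("select c_custkey, c_name from {df}", df=df_customer)

# f-string: Python fills in every `{...}` before the call is made,
# so the placeholder is already gone by the time the callee runs.
formatted = fake_sql(
    f"select {key_of_interest}, c_name from {df}", df=df_customer
)

print(plain)
print(formatted)
```

In the f-string case the callee has no placeholder left to bind `df_customer` to. If no variable named `df` were in scope, the call would instead fail with a `NameError` before reaching the callee at all.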
I'm a bit torn on the PostgreSQL approach. On the one hand, `datafusion`
upstream tries to stick closely to PostgreSQL. On the other hand, I find the
non-named parameters nasty. They remind me of old `printf` statements where
you had to closely watch your parameter ordering.
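The ordering concern can be sketched with Python's built-in `sqlite3` module. I am using it only because it ships with Python and supports both styles; it is not psycopg or Spark, and its placeholders are `?` and `:name` rather than `%s` and `%(name)s`. The table and values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("ann", 31), ("bob", 19), ("cy", 47)],
)

# Positional parameters: correctness depends on the tuple order.
positional = conn.execute(
    "SELECT name FROM users WHERE age > ? AND age < ? ORDER BY name",
    (25, 40),
).fetchall()

# Same query with the two values accidentally swapped: no error is raised,
# the query just asks for age > 40 AND age < 25.
swapped = conn.execute(
    "SELECT name FROM users WHERE age > ? AND age < ? ORDER BY name",
    (40, 25),
).fetchall()

# Named parameters: the mapping order does not matter.
named = conn.execute(
    "SELECT name FROM users WHERE age > :min_age AND age < :max_age ORDER BY name",
    {"max_age": 40, "min_age": 25},
).fetchall()

conn.close()
```

Here `positional` and `named` return the same rows, while `swapped` silently returns nothing, which is the kind of mistake named parameters make harder to commit.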
From this preliminary look, it doesn't appear that there is a strong
consensus on the approach.
If I had to pick one of these, I would probably lean towards the Spark
approach. I think the f-string replacement concern is a very valid one, and
the `{df}` style would just lead to headaches down the road for our users.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]