[ https://issues.apache.org/jira/browse/SPARK-53440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Malthe Borch updated SPARK-53440:
---------------------------------
Description:
The [transform function|https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.functions.transform.html] takes a Python _function_ argument (unary or binary), which is then converted into a Catalyst expression using introspection.

In some situations it is preferable to express such a function in SQL, e.g. `x -> x + 1`, since no introspection is needed, only simple parsing. It should be possible to specify the function as a string, or via some other construct that ultimately lets the user provide it as a string.

Note that it _is_ already possible to achieve this with `call_function`:

{code:python}
from pyspark.sql.functions import array, call_function, expr, lit

column = call_function("transform", array(lit(1), lit(2), lit(3)), expr("x -> x + 1"))
{code}

was:
The [transform function|https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.functions.transform.html] takes a Python _function_ argument (unary or binary), which is then converted into a Catalyst expression using introspection.

In some situations it is preferable to express such a function in SQL, e.g. `x -> x + 1`, since no introspection is needed, only simple parsing. It should be possible to specify the function as a string, or via some other construct that ultimately lets the user provide it as a string.
> Transform function to accept SQL-defined function
> -------------------------------------------------
>
>                 Key: SPARK-53440
>                 URL: https://issues.apache.org/jira/browse/SPARK-53440
>             Project: Spark
>          Issue Type: Wish
>          Components: PySpark
>    Affects Versions: 4.0.0
>            Reporter: Malthe Borch
>            Priority: Major
>
> The [transform function|https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.functions.transform.html] takes a Python _function_ argument (unary or binary), which is then converted into a Catalyst expression using introspection.
>
> In some situations it is preferable to express such a function in SQL, e.g. `x -> x + 1`, since no introspection is needed, only simple parsing. It should be possible to specify the function as a string, or via some other construct that ultimately lets the user provide it as a string.
>
> Note that it _is_ already possible to achieve this with `call_function`:
>
> {code:python}
> from pyspark.sql.functions import array, call_function, expr, lit
>
> column = call_function("transform", array(lit(1), lit(2), lit(3)), expr("x -> x + 1"))
> {code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)