[jira] [Created] (SPARK-50752) Introduce configs for Python UDF execution

Jungtaek Lim (Jira) Tue, 07 Jan 2025 00:37:05 -0800

Jungtaek Lim created SPARK-50752:
------------------------------------

             Summary: Introduce configs for Python UDF execution
                 Key: SPARK-50752
                 URL: https://issues.apache.org/jira/browse/SPARK-50752
             Project: Spark
          Issue Type: Improvement
          Components: PySpark, SQL
    Affects Versions: 4.0.0
            Reporter: Jungtaek Lim



Unlike Pandas UDFs, Python UDFs have no configurations for performance tuning.
This does not mean that input/output is not batched for Python UDFs; it means
that the batch size is hard-coded.
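
To illustrate what making the batch size configurable would mean, here is a
minimal sketch in plain Python. It is not Spark's implementation: the function
name batched_rows and its batch_size parameter are hypothetical and only show
a row stream being grouped into batches whose size is passed in by the caller
rather than fixed in the code.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched_rows(rows: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Group a stream of rows into lists of at most batch_size items.

    Hypothetical helper: batch_size stands in for a value that would be read
    from a configuration instead of being hard-coded.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    it = iter(rows)
    while True:
        # Pull up to batch_size rows; an empty list means the input is exhausted.
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch
```

A larger batch size means fewer round trips between the executor and the
Python worker, at the cost of more memory held per batch.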

Several configurations that are available for Pandas UDFs are mostly relevant
to Python UDFs as well:
 * batch size (executor <-> python worker)
 * buffer size to write to channel
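
For reference, the sketch below lists the existing Pandas UDF settings that, to
my understanding, correspond to the two items above, and renders them as
spark-submit style arguments. The values are illustrative only, the helper
to_submit_args is hypothetical, and this issue does not name the equivalent
Python UDF configurations, so none are shown.

```python
from typing import Dict, List

# Existing Pandas UDF settings (values here are illustrative, not recommendations).
PANDAS_UDF_CONFS: Dict[str, str] = {
    # Batch size: max records per Arrow batch sent to the Python worker.
    "spark.sql.execution.arrow.maxRecordsPerBatch": "10000",
    # Buffer size, in bytes, used when writing to the Python worker channel.
    "spark.sql.execution.pandas.udf.buffer.size": "65536",
}


def to_submit_args(confs: Dict[str, str]) -> List[str]:
    """Render a config mapping as a flat list of --conf key=value arguments."""
    args: List[str] = []
    for key, value in sorted(confs.items()):
        args += ["--conf", f"{key}={value}"]
    return args
```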



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

