L. C. Hsieh created SPARK-51769:
-----------------------------------

             Summary: Add maxRecordsPerOutputBatch to limit the number of record of Arrow output batch
                 Key: SPARK-51769
                 URL: https://issues.apache.org/jira/browse/SPARK-51769
             Project: Spark
          Issue Type: Improvement
          Components: PySpark
    Affects Versions: 4.1.0
            Reporter: L. C. Hsieh

When implementing a columnar-based operator for Spark that takes its input from an Arrow-based evaluation operator, the number of records in each output batch of that evaluation operator is currently unbounded. For such a columnar-based operator, we sometimes want to cap the number of records in each input batch. At present there appears to be no existing way to limit the batch size in rows.
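
To illustrate what a row-based cap on output batches means, here is a minimal sketch in plain Python. It is not Spark's implementation: the `split_batch` helper, the `max_records` parameter, and the dict-of-lists stand-in for an Arrow record batch are all hypothetical names chosen for this example, and rejecting a non-positive limit is an assumption rather than the behavior proposed in the issue.

```python
from typing import Dict, Iterator, List

# Hypothetical stand-in for an Arrow record batch: column name -> column values.
Batch = Dict[str, List]


def split_batch(batch: Batch, max_records: int) -> Iterator[Batch]:
    """Yield slices of `batch` holding at most `max_records` rows each."""
    if max_records <= 0:
        # Assumption for this sketch: a non-positive limit is an error.
        raise ValueError("max_records must be positive")
    # All columns are assumed to have the same length.
    num_rows = len(next(iter(batch.values()))) if batch else 0
    for start in range(0, num_rows, max_records):
        end = start + max_records
        # Slice every column over the same row range to keep rows aligned.
        yield {name: values[start:end] for name, values in batch.items()}


batch = {"id": list(range(10)), "value": [i * 0.5 for i in range(10)]}
sizes = [len(chunk["id"]) for chunk in split_batch(batch, max_records=4)]
print(sizes)  # [4, 4, 2]
```

With a limit of 4, a 10-row batch reaches the downstream operator as three batches of 4, 4, and 2 rows, so the consumer never sees more than the configured number of rows at once.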