Yicong-Huang opened a new pull request, #54605: URL: https://github.com/apache/spark/pull/54605
### What changes were proposed in this pull request?

Add `*` separator to enforce keyword-only arguments in all serializer `__init__` methods in `pyspark.sql.pandas.serializers`, and convert all call sites in `worker.py` from positional to keyword arguments.

Affected classes:
- `ArrowStreamGroupUDFSerializer`
- `ArrowStreamPandasSerializer`
- `ArrowStreamPandasUDFSerializer`
- `ArrowStreamArrowUDFSerializer`
- `ArrowBatchUDFSerializer`
- `ArrowStreamPandasUDTFSerializer`
- `ArrowStreamAggPandasUDFSerializer`
- `GroupPandasUDFSerializer`
- `ApplyInPandasWithStateSerializer`
- `TransformWithStateInPandasSerializer`
- `TransformWithStateInPandasInitStateSerializer`

### Why are the changes needed?

As noted by @zhengruifeng in https://github.com/apache/spark/pull/54568#discussion_r2875751193, serializer constructors accept too many positional arguments, making call sites error-prone and hard to read. Enforcing keyword-only arguments prevents positional mistakes and improves readability.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing tests.

### Was this patch authored or co-authored using generative AI tooling?

No.
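
### Illustration: keyword-only `__init__`

For readers unfamiliar with the `*` separator described under "What changes were proposed", here is a minimal sketch of the pattern. `ExampleSerializer` and its parameter names are hypothetical stand-ins chosen for illustration. They are not the actual signatures in `pyspark.sql.pandas.serializers`.

```python
class ExampleSerializer:
    """Hypothetical stand-in for a serializer; not a real pyspark class."""

    # Every parameter after the bare `*` must be passed by keyword.
    def __init__(self, *, timezone, safecheck, assign_cols_by_name=False):
        self.timezone = timezone
        self.safecheck = safecheck
        self.assign_cols_by_name = assign_cols_by_name


# Keyword call site: each argument is named, so order cannot be mixed up.
ser = ExampleSerializer(timezone="UTC", safecheck=True)

# A positional call is rejected with a TypeError instead of silently
# binding values to the wrong parameters.
try:
    ExampleSerializer("UTC", True)
except TypeError as exc:
    positional_error = exc
else:
    positional_error = None
```

With this pattern, a call site that swaps two arguments of the same type can no longer bind them silently to the wrong parameters, which is the class of mistake the PR description refers to.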
