Yeah, I think it's good to fix them. Please go ahead with opening a JIRA and filing a PR.
On Sat, 12 Jul 2025 at 18:32, Kyungjun Lee <kyungjunlee...@gmail.com> wrote:

> Hi all,
>
> I'm a developer trying to make my first contribution to Apache Spark, and
> while exploring the codebase I came across something I was curious about.
>
> In several places, such as this test case:
>
> https://github.com/apache/spark/pull/51225/files#diff-b494cf5a64997153f883507917a63dfb17a8c26624cfef72601e111c0800a9e8R295
>
> I noticed the use of:
>
> super(ArrowPythonUDFLegacyTests, cls).setUpClass()
>
> instead of the more modern and concise:
>
> super().setUpClass()
>
> As far as I understand, the explicit form (`super(Class, cls)`) was
> commonly used to maintain Python 2 compatibility. Since Spark now requires
> Python 3, I was wondering whether it's okay to start updating these to use
> `super()`.
>
> From what I know, if a test file has already been refactored to be Python
> 3-only, using `super()` should be fine.
> However, I'm not sure if the test environment or codebase still
> intentionally keeps the older style for consistency or other reasons.
>
> I tried looking for past discussions on this but couldn't find any - maybe
> I missed something.
> Sorry if this is a naive question - I'm still learning and very excited to
> contribute!
>
> Best regards,
> Kyungjun Lee
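
For reference, here is a minimal sketch of why the two forms should be interchangeable on Python 3. The class names below are hypothetical and are not taken from the Spark codebase; the only assumption is an ordinary unittest-style class hierarchy like the one in the linked test.

```python
import unittest


class BaseTests(unittest.TestCase):
    # Hypothetical base class standing in for a shared test fixture.
    @classmethod
    def setUpClass(cls):
        super().setUpClass()
        cls.calls = ["base"]


class LegacyStyleTests(BaseTests):
    @classmethod
    def setUpClass(cls):
        # Explicit two-argument form, needed on Python 2.
        super(LegacyStyleTests, cls).setUpClass()
        cls.calls = cls.calls + ["legacy"]


class ModernStyleTests(BaseTests):
    @classmethod
    def setUpClass(cls):
        # Zero-argument form (Python 3 only); resolves to the same parent call.
        super().setUpClass()
        cls.calls = cls.calls + ["modern"]


# Call the fixtures directly to show both reach BaseTests.setUpClass.
LegacyStyleTests.setUpClass()
ModernStyleTests.setUpClass()
```

In both subclasses the parent's setUpClass runs first, so the change is purely syntactic inside a normally defined class body.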