Re: [Question] Use of `super(Class, cls)` in Spark codebase

Hyukjin Kwon Sun, 13 Jul 2025 16:04:41 -0700

Yeah, I think it's good to fix them. Please go ahead with opening a JIRA
and filing a PR.


On Sat, 12 Jul 2025 at 18:32, Kyungjun Lee <kyungjunlee...@gmail.com> wrote:

> Hi all,
>
> I'm a developer trying to make my first contribution to Apache Spark, and
> while exploring the codebase I came across something I was curious about.
>
> In several places, such as this test case:
>
> https://github.com/apache/spark/pull/51225/files#diff-b494cf5a64997153f883507917a63dfb17a8c26624cfef72601e111c0800a9e8R295
>
> I noticed the use of:
>
>     super(ArrowPythonUDFLegacyTests, cls).setUpClass()
>
> instead of the more modern and concise:
>
>     super().setUpClass()
>
> As far as I understand, the explicit form (`super(Class, cls)`) was
> commonly used to maintain Python 2 compatibility. Since Spark now requires
> Python 3, I was wondering whether it's okay to start updating these to use
> `super()`.
>
> From what I know, if a test file has already been refactored to be Python
> 3-only, using `super()` should be fine. However, I’m not sure whether the
> test environment or codebase still intentionally keeps the older style for
> consistency or other reasons.
>
> I tried looking for past discussions on this but couldn’t find any — maybe
> I missed something.
> Sorry if this is a naive question — I’m still learning and very excited to
> contribute!
>
> Best regards,
> Kyungjun Lee
>
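
For reference, a minimal sketch of why the two spellings discussed above are interchangeable on Python 3. The class names (`BaseTests`, `LegacyStyleTests`, `ModernStyleTests`) and the `calls` list are hypothetical and only stand in for a Spark test hierarchy; they are not taken from the Spark codebase.

```python
import unittest


class BaseTests(unittest.TestCase):
    # Hypothetical base class standing in for a shared test fixture.
    # `calls` is one list shared by all subclasses, used only to record
    # which class triggered the base setUpClass.
    calls = []

    @classmethod
    def setUpClass(cls):
        super().setUpClass()
        cls.calls.append(("BaseTests.setUpClass", cls.__name__))


class LegacyStyleTests(BaseTests):
    @classmethod
    def setUpClass(cls):
        # Explicit two-argument form: required on Python 2, still valid on 3.
        super(LegacyStyleTests, cls).setUpClass()


class ModernStyleTests(BaseTests):
    @classmethod
    def setUpClass(cls):
        # Zero-argument form (Python 3 only): the enclosing class and the
        # first argument (cls here) are filled in automatically.
        super().setUpClass()


# Call the hooks directly instead of going through a test runner.
LegacyStyleTests.setUpClass()
ModernStyleTests.setUpClass()
```

Both subclasses reach `BaseTests.setUpClass` with `cls` bound to the subclass, so the rewrite should be behavior-preserving. One caveat: zero-argument `super()` only works inside a function defined in a class body, so any call site that lives outside one would need to keep the explicit form.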
