On 7/21/20 9:40 PM, Holden Karau wrote:
> Yeah I think this could be a great project now that we're only Python
> 3.5+. One potential is making this an Outreachy project to get more
> folks from different backgrounds involved in Spark.

I am honestly not sure that's really the case. At the moment I maintain
an almost complete set of annotations for the project. These could be
ported in a single step with relatively little effort.

As for further maintenance, it will have to be done alongside codebase
changes to keep things in sync, so if outreach means low-hanging fruit,
this is unlikely to serve that purpose.

Additionally, there are at least two considerations:

* At some point (in general when things are heavy in generics, which
  is the case here), annotations become somewhat painful to write.
* Ideally, API design has to be linked (to a reasonable extent) with
  annotation design. Not every signature can be annotated in a
  meaningful way, which is already a problem with some chunks of
  Spark code.

> On Tue, Jul 21, 2020 at 12:33 PM Driesprong, Fokko
> <fo...@driesprong.frl> wrote:
>
> Since we've recently dropped support for Python <=3.5
> <https://github.com/apache/spark/pull/28957>, I think it would be
> nice to add support for type annotations. Having this in the main
> repository allows us to do type checking using MyPy
> <http://mypy-lang.org/> in the CI itself.
>
> This is now handled by the Stub file:
> https://www.python.org/dev/peps/pep-0484/#stub-files
> However I think it is nicer to integrate the types with the code
> itself to keep everything in sync, and make it easier for the people
> who work on the codebase itself. A first step would be to move the
> stubs into the codebase. First step would be to cover the public
> API which is the most important one. Having the types with the
> code itself makes it much easier to understand. For example, if
> you can supply a str or column here:
> https://github.com/apache/spark/pull/29122/files#diff-f5295f69bfbdbf6e161aed54057ea36dR2486
>
> One of the implications would be that future PR's on Python should
> cover annotations on the public API's. Curious what the rest of
> the community thinks.
>
> Cheers, Fokko
>
> On Tue, Jul 21, 2020 at 20:04, zero323 <mszymkiew...@gmail.com> wrote:
>
> Given a discussion related to SPARK-32320 PR
> <https://github.com/apache/spark/pull/29122> I'd like to resurrect
> this thread. Is there any interest in migrating annotations to the
> main repository?
>
> --
> Twitter: https://twitter.com/holdenkarau
> Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau

--
Best regards,
Maciej Szymkiewicz

Web: https://zero323.net
Keybase: https://keybase.io/zero323
Gigs: https://www.codementor.io/@zero323
PGP: A30CEF0C31A501EC
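
To make two points from this thread concrete (a function that accepts
either a str or a column, and signatures that get heavy once generics
are involved), here is a minimal sketch of inline annotations. It is
not taken from the Spark codebase or from the existing stubs: Column,
ColumnOrName, to_column and Dataset are all hypothetical stand-ins
chosen only for illustration.

```python
from typing import Callable, Generic, Iterable, List, TypeVar, Union

T = TypeVar("T")
U = TypeVar("U")


class Column:
    """Hypothetical stand-in for a column expression (not pyspark.sql.Column)."""

    def __init__(self, name: str) -> None:
        self.name = name


# Hypothetical alias: the union documents that both forms are accepted.
ColumnOrName = Union[Column, str]


def to_column(col: ColumnOrName) -> Column:
    """Normalize a column name or a Column into a Column."""
    return col if isinstance(col, Column) else Column(col)


class Dataset(Generic[T]):
    """Hypothetical generic container showing how generics surface in signatures."""

    def __init__(self, items: Iterable[T]) -> None:
        self._items: List[T] = list(items)

    def map(self, f: Callable[[T], U]) -> "Dataset[U]":
        # The element type changes from T to U, so two type variables are needed.
        return Dataset(f(x) for x in self._items)

    def collect(self) -> List[T]:
        return list(self._items)
```

With annotations like these in the source, a checker such as mypy can
flag a call like to_column(42) without a separate stub file. The map
signature also hints at why generics-heavy code gets tedious to
annotate: every transformation has to thread its type variables
through by hand.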