On 7/21/20 9:40 PM, Holden Karau wrote:
> Yeah I think this could be a great project now that we're only Python
> 3.5+. One potential is making this an Outreachy project to get more
> folks from different backgrounds involved in Spark.

I am honestly not sure if that's really the case.

At the moment I maintain an almost complete set of annotations for the
project. These could be ported in a single step with relatively little
effort.

As for further maintenance: this will have to be done alongside
codebase changes to keep things in sync, so if outreach means
low-hanging fruit, it is unlikely to serve this purpose.

Additionally, there are at least two considerations:

  * At some point (in general when things are heavy in generics, which
    is the case here), annotations become somewhat painful to write.
  * In the ideal case, API design has to be linked (to a reasonable
    extent) with annotation design: not every signature can be
    annotated in a meaningful way, which is already a problem with some
    chunks of Spark code.
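
To make both points concrete, here is a minimal sketch. It is not taken
from the Spark codebase or from the existing stubs: Dataset, Column,
map_values, head and upper are hypothetical stand-ins chosen only for
illustration. It shows how a generic container quickly needs several
type variables and an annotated `self`, and how a signature whose
return type depends on an argument forces the use of overloads.

```python
from typing import (
    Callable, Generic, List, Optional, Tuple, TypeVar, Union, overload,
)

T = TypeVar("T")
U = TypeVar("U")
K = TypeVar("K")
V = TypeVar("V")


class Dataset(Generic[T]):
    """Hypothetical stand-in for a generic, RDD-like container."""

    def __init__(self, items: List[T]) -> None:
        self._items = list(items)

    def map(self, f: Callable[[T], U]) -> "Dataset[U]":
        return Dataset([f(x) for x in self._items])

    # Only meaningful when T is a (key, value) pair. Expressing that
    # requires annotating `self`, which is where things get verbose.
    def map_values(
        self: "Dataset[Tuple[K, V]]", f: Callable[[V], U]
    ) -> "Dataset[Tuple[K, U]]":
        return Dataset([(k, f(v)) for k, v in self._items])

    # The return type depends on whether `n` is given, so a single
    # signature cannot describe it precisely; overloads are needed.
    @overload
    def head(self) -> T: ...
    @overload
    def head(self, n: int) -> List[T]: ...
    def head(self, n: Optional[int] = None) -> Union[T, List[T]]:
        if n is None:
            return self._items[0]
        return self._items[:n]

    def collect(self) -> List[T]:
        return list(self._items)


class Column:
    """Hypothetical stand-in for a column expression."""

    def __init__(self, name: str) -> None:
        self.name = name


def upper(col: Union[str, Column]) -> Column:
    """Accepts either a column name or a Column object."""
    name = col if isinstance(col, str) else col.name
    return Column(f"upper({name})")
```

With an API like head(), where the shape of the result depends on an
argument, the annotation effort grows with every such special case,
which is why the API and the annotations are easier to design together.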

>
> On Tue, Jul 21, 2020 at 12:33 PM Driesprong, Fokko
> <fo...@driesprong.frl> wrote:
>
>     Since we've recently dropped support for Python <=3.5
>     <https://github.com/apache/spark/pull/28957>, I think it would be
>     nice to add support for type annotations. Having this in the main
>     repository allows us to do type checking using MyPy
>     <http://mypy-lang.org/> in the CI itself.
>
>     This is now handled by the Stub
>     file: https://www.python.org/dev/peps/pep-0484/#stub-files However
>     I think it is nicer to integrate the types with the code itself to
>     keep everything in sync, and make it easier for the people who
>     work on the codebase itself. A first step would be to move the
>     stubs into the codebase, starting with the public API, which is
>     the most important part. Having the types with the
>     code itself makes it much easier to understand. For example, if
>     you can supply a str or column
>     here:
>     https://github.com/apache/spark/pull/29122/files#diff-f5295f69bfbdbf6e161aed54057ea36dR2486
>
>     One of the implications would be that future PRs on Python should
>     cover annotations on the public APIs. Curious what the rest of
>     the community thinks.
>
>     Cheers, Fokko
>
>     Op di 21 jul. 2020 om 20:04 schreef zero323
>     <mszymkiew...@gmail.com <mailto:mszymkiew...@gmail.com>>:
>
>         Given a discussion related to SPARK-32320 PR
>         <https://github.com/apache/spark/pull/29122> I'd like to
>         resurrect this thread. Is there any interest in migrating
>         annotations to the main repository?
>
>
> -- 
> Twitter: https://twitter.com/holdenkarau
> Books (Learning Spark, High Performance Spark,
> etc.): https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau

-- 
Best regards,
Maciej Szymkiewicz

Web: https://zero323.net
Keybase: https://keybase.io/zero323
Gigs: https://www.codementor.io/@zero323
PGP: A30CEF0C31A501EC
