Hi, I'm a Python developer (and data scientist), and I contributed to Debian[1][2] last year as part of Google Summer of Code[3]. Having used Lucene, Kafka and Spark in the past, I wanted to work on at least one of them this summer. Since Spark, unlike the others, offers a Python API[4], I felt I could genuinely contribute to the project. I haven't raised any PRs in the Spark project yet, but since GSoC doesn't begin until May, I have time to familiarise myself with the codebase and with my GSoC tasks beforehand.
I would appreciate it if someone could mentor me by helping me pick a Python feature that I could add over the course of GSoC, or any bugs from the JIRA page that I could fix. In the meantime, are there any Python issues I can get started with?

[1] https://pypistats.org/packages/debdialer
[2] https://salsa.debian.org/comfortablydumb-guest/Hello-from-the-Debian-side
[3] https://summerofcode.withgoogle.com/archive/2018/projects/5682274280407040/
[4] https://projects.apache.org/project.html?spark

Thanks,
Vishal Gupta