In the past few weeks we've seen multiple packages published on PyPI compromised. Thankfully none of them are PySpark dependencies, but it suggests we might want to consider pinning our dependencies.
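For concreteness, here is one shape a pinned extra could take. This is only a sketch; the package names and versions below are illustrative, not PySpark's actual pins.

```python
# Sketch of an optional "pinned" extra for setup.py.
# NOTE: the packages and versions here are illustrative placeholders,
# not PySpark's real dependency pins.
EXTRAS_REQUIRE = {
    # "pip install pyspark[pinned]" would install exact, known versions
    "pinned": [
        "py4j==0.10.9.7",
        "pandas==2.2.2",
        "numpy==1.26.4",
    ],
}

# In setup.py this would be wired up as:
#   setup(..., extras_require=EXTRAS_REQUIRE)
# Alternatively, the same pins could ship as a constraints.txt that users
# opt into with: pip install pyspark -c constraints.txt
```

Either route keeps the default install unpinned, so only users who opt in take on the stricter version set.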
The downside of pinning all dependencies is a higher likelihood of conflicts if folks keep running old versions of PySpark for a long time. One possibility would be to make the pinned versions optional (e.g., pyspark[pinned]), or to publish a separate constraints file that people could optionally use with pip's -c flag.

I'm wondering: do other folks share my concern here?

Twitter: https://twitter.com/holdenkarau
Fight Health Insurance: https://www.fighthealthinsurance.com/
Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
YouTube Live Streams: https://www.youtube.com/user/holdenkarau
Pronouns: she/her
