+1 from my side

Sounds good. It will help both users and contributors improve test
coverage.
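
For concreteness, here is a rough sketch of the two pieces the proposal
below describes: a test base class that shares one Spark session across
tests, and a DataFrame/schema equality helper. The names SparkTestBase,
create_session, and assert_df_equal are hypothetical and are not taken
from the SPIP; the actual API may well look different. It assumes only
the standard unittest module plus PySpark's SparkSession builder.

```python
import unittest
from collections import Counter


class SparkTestBase(unittest.TestCase):
    """Hypothetical base class: one Spark session shared by all tests in a class."""

    spark = None

    @classmethod
    def create_session(cls):
        # Imported lazily so this module still loads where pyspark is absent.
        from pyspark.sql import SparkSession

        return (
            SparkSession.builder.master("local[1]")
            .appName(cls.__name__)
            .getOrCreate()
        )

    @classmethod
    def setUpClass(cls):
        # Runs once per test class, so every test method reuses cls.spark.
        cls.spark = cls.create_session()

    @classmethod
    def tearDownClass(cls):
        cls.spark.stop()
        cls.spark = None


def assert_df_equal(actual, expected):
    """Hypothetical helper: same schema and same rows, ignoring row order."""
    if actual.schema != expected.schema:
        raise AssertionError(
            f"schema mismatch: {actual.schema} != {expected.schema}"
        )
    # Counter compares the rows as a multiset. Rows must be hashable,
    # so this simple sketch does not cover array or map columns.
    actual_rows = Counter(actual.collect())
    expected_rows = Counter(expected.collect())
    if actual_rows != expected_rows:
        raise AssertionError(
            f"row mismatch: {actual_rows} != {expected_rows}"
        )
```

A test would subclass SparkTestBase, build its input and expected
DataFrames with self.spark.createDataFrame(...), and pass them to
assert_df_equal.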

On Wed, Jun 14, 2023 at 8:27 AM Hyukjin Kwon <gurwls...@apache.org> wrote:

> Yeah, I have been thinking about this too, and Holden did some work here
> that this SPIP will reuse. I support this.
>
> On Wed, 14 Jun 2023 at 08:10, Amanda Liu <amanda....@databricks.com.invalid>
> wrote:
>
>> Hi all,
>>
>> I'd like to start a discussion about implementing an official PySpark
>> test framework. Currently there is no official one, only various
>> open-source repos and blog posts.
>>
>> Many of these open-source resources are very popular, which demonstrates
>> user demand for PySpark testing capabilities. spark-testing-base
>> <https://github.com/holdenk/spark-testing-base> has 1.4k stars, and
>> chispa <https://github.com/MrPowers/chispa> has 532k downloads/month.
>> However, it can be confusing for users to piece together disparate
>> resources to write their own PySpark tests (see The Elephant in the
>> Room: How to Write PySpark Tests
>> <https://towardsdatascience.com/the-elephant-in-the-room-how-to-write-pyspark-unit-tests-a5073acabc34>
>> ).
>>
>> We can streamline and simplify the testing process by incorporating test
>> features, such as a PySpark Test Base class (which allows tests to share
>> Spark sessions) and test util functions (for example, asserting dataframe
>> and schema equality).
>>
>> Please see the SPIP document attached:
>> https://docs.google.com/document/d/1OkyBn3JbEHkkQgSQ45Lq82esXjr9rm2Vj7Ih_4zycRc/edit#heading=h.f5f0u2riv07v
>> And the JIRA ticket: https://issues.apache.org/jira/browse/SPARK-44042
>>
>> I would appreciate it if you could share your thoughts on this proposal.
>>
>> Thank you!
>> Amanda Liu
>>
>
