Hi Imran,

My understanding is that doctests and unittests are orthogonal - doctests
are used to make sure docstring examples are correct and are not meant to
replace unittests.
Functionalities are covered by unit tests to ensure correctness and
doctests are used to test the docstring, not the functionalities itself.

There are issues with doctests, for example, we cannot test arrow related
functions in doctest because of pyarrow is optional dependency, but I think
that's a separate issue.

Does this make sense?

Li

On Wed, Aug 29, 2018 at 6:35 PM Imran Rashid <iras...@cloudera.com.invalid>
wrote:

> Hi,
>
> I'd like to propose that we move away from such heavy reliance on doctests
> in python, and move towards more traditional unit tests.  The main reason
> is that its hard to share test code in doc tests.  For example, I was just
> looking at
>
> https://github.com/apache/spark/commit/82c18c240a6913a917df3b55cc5e22649561c4dd
>  and wondering if we had any tests for some of the pyspark changes.
> SparkSession.createDataFrame has doctests, but those are just run with one
> standard spark configuration, which does not enable arrow.  Its hard to
> easily reuse that test, just with another spark context with a different
> conf.  Similarly I've wondered about reusing test cases but with
> local-cluster instead of local mode.  I feel like they also discourage
> writing a test which tries to get more exhaustive coverage on corner cases.
>
> I'm not saying we should stop using doctests -- I see why they're nice.  I
> just think they should really only be when you want that code snippet in
> the doc anyway, so you might as well test it.
>
> Admittedly, I'm not really a python-developer, so I could be totally wrong
> about the right way to author doctests -- pushback welcome!
>
> Thoughts?
>
> thanks,
> Imran
>

Reply via email to