wengh commented on code in PR #50684:
URL: https://github.com/apache/spark/pull/50684#discussion_r2071054016


##########
python/docs/source/user_guide/sql/python_data_source.rst:
##########
@@ -356,17 +356,28 @@ For library that are used inside a method, it must be imported inside the method
         from pyspark import TaskContext
         context = TaskContext.get()
 
+Mutating State
+~~~~~~~~~~~~~~
+Some methods such as DataSourceReader.read() and DataSourceReader.partitions() must be stateless. Changes to the object state made in these methods are not guaranteed to be visible or invisible to future invocations.
+
+Other methods such as DataSource.schema() and DataSourceStreamReader.latestOffset() can be stateful. Changes to the object state made in these methods are visible to future invocations.
+
+Refer to the documentation of each method for more details.

Review Comment:
   The per-method documentation doesn't actually mention whether each method can change state :(
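
   For context on why the distinction matters, here is a minimal sketch that does not use pyspark. It assumes the reader object is copied before `read()` runs somewhere else; `copy.deepcopy` stands in for that serialization round-trip, and `CounterReader` is a hypothetical class, not part of the API.

```python
import copy


class CounterReader:
    """Hypothetical stand-in for a DataSourceReader (not a pyspark class)."""

    def __init__(self):
        self.calls = 0

    def read(self, partition):
        # The mutation lands on whichever copy of the object runs read().
        self.calls += 1
        yield (partition, self.calls)


driver_reader = CounterReader()

# Assumption: a deep copy approximates shipping the reader to an executor.
executor_reader = copy.deepcopy(driver_reader)
rows = list(executor_reader.read(0))

# The original object never sees the change made on the copy.
driver_calls_after = driver_reader.calls
executor_calls_after = executor_reader.calls
```

   In this sketch the copy's counter advances while the original's does not, which is the kind of behavior a "must be stateless" note in each method's docstring would warn about.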



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

