bogao007 commented on code in PR #50349:
URL: https://github.com/apache/spark/pull/50349#discussion_r2011036886


##########
python/pyspark/sql/pandas/types.py:
##########
@@ -1424,6 +1424,12 @@ def _to_numpy_type(type: DataType) -> Optional["np.dtype"]:
         return np.dtype("float32")
     elif type == DoubleType():
         return np.dtype("float64")
+    elif type == TimestampType():

Review Comment:
   @HyukjinKwon It seems [spark_type_to_pandas_dtype](https://github.com/apache/spark/blob/b2290444e9c1430c18efb5c8de1dce264034dd4d/python/pyspark/pandas/typedef/typehints.py#L296-L297) uses `datetime64[ns]` instead of `datetime64[us]`, so it would still hit the same error: Spark only supports microsecond precision when [converting from Arrow](https://github.com/apache/spark/blob/b2290444e9c1430c18efb5c8de1dce264034dd4d/sql/api/src/main/scala/org/apache/spark/sql/util/ArrowUtils.scala#L90). We do have [_to_corrected_pandas_type](https://github.com/apache/spark/blob/b2290444e9c1430c18efb5c8de1dce264034dd4d/python/pyspark/sql/pandas/types.py#L748-L777) in the same file to reuse, but it also maps timestamps to nanosecond precision and would fail in this case. Any suggestions on reusing it while also fixing the issue?
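
   For concreteness, a minimal sketch of the direction I have in mind, assuming the fix lands in `_to_numpy_type` itself (the helper name and the `TimestampType` branch below are illustrative, not the final change): map timestamps straight to microsecond precision, since that is all Spark's Arrow conversion accepts.

   ```python
   import numpy as np

   from pyspark.sql.types import DataType, TimestampType

   def _to_numpy_type_sketch(dt: DataType) -> "np.dtype":
       # Illustrative only: Spark's TimestampType carries microsecond
       # precision (see the ArrowUtils link above), so datetime64[us]
       # avoids the datetime64[ns] mismatch described in this thread.
       if dt == TimestampType():
           return np.dtype("datetime64[us]")
       raise TypeError(f"Unsupported type for this sketch: {dt}")
   ```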


