Re: [PR] [SPARK-48761][SQL] Introduce clusterBy DataFrameWriter API for Scala [spark]

via GitHub Tue, 16 Jul 2024 01:57:43 -0700


cloud-fan commented on code in PR #47301:
URL: https://github.com/apache/spark/pull/47301#discussion_r1679019564



##########
connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala:
##########
@@ -201,6 +201,22 @@ final class DataFrameWriter[T] private[sql] (ds: Dataset[T]) {
     this
   }
 
+  /**
+   * Clusters the output by the given columns on the file system. The rows with matching values in

Review Comment:
   Let's be a bit more general, since data sources are not always backed by a 
file system. How about `... given columns on the storage.`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


