cloud-fan commented on code in PR #54014:
URL: https://github.com/apache/spark/pull/54014#discussion_r2757869861


##########
sql/api/src/main/scala/org/apache/spark/sql/Dataset.scala:
##########
@@ -2010,6 +2010,35 @@ abstract class Dataset[T] extends Serializable {
    */
   def exceptAll(other: Dataset[T]): Dataset[T]
 
+  /**
+   * Returns a new [[Dataset]] by appending a column containing consecutive 0-based Long indices,
+   * similar to `RDD.zipWithIndex()`.
+   *
+   * The index column is appended as the last column of the resulting [[DataFrame]].
+   *
+   * @group typedrel

Review Comment:
   ```suggestion
      * @group untypedrel
   ```



##########
sql/api/src/main/scala/org/apache/spark/sql/Dataset.scala:
##########
@@ -2010,6 +2010,35 @@ abstract class Dataset[T] extends Serializable {
    */
   def exceptAll(other: Dataset[T]): Dataset[T]
 
+  /**
+   * Returns a new [[Dataset]] by appending a column containing consecutive 0-based Long indices,
+   * similar to `RDD.zipWithIndex()`.
+   *
+   * The index column is appended as the last column of the resulting [[DataFrame]].
+   *
+   * @group typedrel
+   * @since 4.2.0
+   */
+  def zipWithIndex(): DataFrame = zipWithIndex("index")
+
+  /**
+   * Returns a new [[Dataset]] by appending a column containing consecutive 0-based Long indices,
+   * similar to `RDD.zipWithIndex()`.
+   *
+   * The index column is appended as the last column of the resulting [[DataFrame]].
+   *
+   * @note
+   *   If a column with `indexColName` already exists in the schema, the resulting [[DataFrame]]
+   *   will have duplicate column names. Selecting the duplicate column by name will throw
+   *   `AMBIGUOUS_REFERENCE`, and writing the [[DataFrame]] will throw `COLUMN_ALREADY_EXISTS`.
+   *
+   * @param indexColName
+   *   The name of the index column to append.
+   * @group typedrel

Review Comment:
   ```suggestion
      * @group untypedrel
   ```
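
For readers following the thread, here is a rough sketch of the behavior the Scaladoc above describes, built only on APIs that already exist (`Dataset.rdd`, `RDD.zipWithIndex()`, `SparkSession.createDataFrame`). The object name `ZipWithIndexSketch` and the helper `zipWithIndexSketch` are hypothetical and are not the implementation in this PR; the sketch also assumes a classic (non-Connect) session where `df.rdd` is available.

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.types.{LongType, StructField, StructType}

object ZipWithIndexSketch {

  // Hypothetical helper, NOT the PR's implementation: appends a 0-based Long
  // index as the last column by round-tripping through RDD.zipWithIndex().
  def zipWithIndexSketch(df: DataFrame, indexColName: String = "index"): DataFrame = {
    // The index column goes last in the schema, as the Scaladoc states.
    val schema = StructType(
      df.schema.fields :+ StructField(indexColName, LongType, nullable = false))

    // RDD.zipWithIndex() pairs every row with a consecutive 0-based Long.
    val indexedRows = df.rdd.zipWithIndex().map { case (row, idx) =>
      Row.fromSeq(row.toSeq :+ idx)
    }

    df.sparkSession.createDataFrame(indexedRows, schema)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("zipWithIndexSketch")
      .getOrCreate()
    import spark.implicits._

    val df = Seq("a", "b", "c").toDF("letter")
    // Result has columns `letter` and `index`, with `index` last.
    zipWithIndexSketch(df).show()

    spark.stop()
  }
}
```

Because the result is a `DataFrame` rather than a `Dataset[T]`, the sketch also illustrates why the method fits the untyped group suggested above.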



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
