eason-yuchen-liu commented on code in PR #54298:
URL: https://github.com/apache/spark/pull/54298#discussion_r2818479117


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDBStateStoreProvider.scala:
##########
@@ -1408,6 +1408,28 @@ class RocksDBStateStoreChangeDataReader(
 
   override protected val changelogSuffix: String = "changelog"
 
+  /**
+   * Read the next changelog record, skipping DELETE_RANGE_RECORD entries as they cannot
+   * be represented as individual key-value change records in the state change data feed.
+   * Returns null if there are no more records.
+   */
+  private def readNextChangelogRecord():
+      (RecordType.Value, Array[Byte], Array[Byte]) = {
+    var reader = currentChangelogReader()

Review Comment:
   `reader` can be a `val`, and there is no need for a while loop.
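
   One possible reading of this suggestion, assuming the skipping behavior
   stays: replace the mutable variable and loop with a tail-recursive helper.
   This is only a sketch. The local `RecordType` enumeration, the `Record`
   alias, and the plain `Iterator` standing in for the changelog reader are
   hypothetical stand-ins, not the actual Spark classes.

```scala
import scala.annotation.tailrec

// Hypothetical stand-in for Spark's RecordType enumeration.
object RecordType extends Enumeration {
  val PUT_RECORD, DELETE_RECORD, DELETE_RANGE_RECORD = Value
}

object SkipSketch {
  type Record = (RecordType.Value, Array[Byte], Array[Byte])

  // Returns the next record that is not a DELETE_RANGE_RECORD,
  // or null when the iterator is exhausted. No var, no while loop.
  @tailrec
  def readNext(records: Iterator[Record]): Record = {
    if (!records.hasNext) {
      null
    } else {
      val next = records.next()
      if (next._1 == RecordType.DELETE_RANGE_RECORD) readNext(records) else next
    }
  }
}
```

   The `@tailrec` annotation makes the compiler reject the method if the
   recursion cannot be compiled into a loop, so a long run of skipped records
   does not grow the stack.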



##########
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDBStateStoreProvider.scala:
##########
@@ -1408,6 +1408,30 @@ class RocksDBStateStoreChangeDataReader(
 
   override protected val changelogSuffix: String = "changelog"
 
+  /**
+   * Read the next changelog record, skipping DELETE_RANGE_RECORD entries as they cannot
+   * be represented as individual key-value change records in the state change data feed.
+   * Returns null if there are no more records.
+   */
+  private def readNextChangelogRecord():
+      (RecordType.Value, Array[Byte], Array[Byte]) = {
+    while (true) {
+      val reader = currentChangelogReader()
+      if (reader == null) {
+        return null
+      }
+      val nextRecord = reader.next()
+      if (nextRecord._1 == RecordType.DELETE_RANGE_RECORD) {
+        logWarning(log"Skipping DELETE_RANGE_RECORD in state change data feed " +

Review Comment:
   I would prefer to show the delete_range entry but leave its key and value
   as null. Better still, we could show the actual range to delete in the key
   column, though this may require a slight schema change to the key column.
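
   A minimal sketch of the first option, assuming the record tuple keeps its
   current shape. The local `RecordType` enumeration, the `Record` alias, and
   the `toChangeRecord` helper are hypothetical names for illustration, not
   existing Spark APIs.

```scala
// Hypothetical stand-in for Spark's RecordType enumeration.
object RecordType extends Enumeration {
  val PUT_RECORD, DELETE_RECORD, DELETE_RANGE_RECORD = Value
}

object SurfaceSketch {
  type Record = (RecordType.Value, Array[Byte], Array[Byte])

  // Keep DELETE_RANGE_RECORD visible in the feed, but null out the key and
  // value, since a range cannot be expressed as a single key-value pair.
  // All other record types pass through unchanged.
  def toChangeRecord(record: Record): Record = record match {
    case (RecordType.DELETE_RANGE_RECORD, _, _) =>
      (RecordType.DELETE_RANGE_RECORD, null, null)
    case other => other
  }
}
```

   Downstream consumers would then need to tolerate null keys for this record
   type, which is the trade-off against silently dropping the entry.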



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

