kywe665 commented on a change in pull request #4107:
URL: https://github.com/apache/hudi/pull/4107#discussion_r757680456



##########
File path: website/docs/markers.md
##########
@@ -0,0 +1,90 @@
+---
+title: Write Markers
+toc: true
+---
+
+## Purpose of Markers
+A write operation can fail before it completes, leaving partial or corrupt data files on storage. Markers are used to track
+and cleanup any partial or failed write operations. As a write operation begins, a marker is created indicating
+that a file write is in progress. When the write commit succeeds, the marker is deleted. If a write operation fails part
+way through, a marker is left behind which indicates that the file is incomplete. Two important operations that use markers include:
+
+- **Removing duplicate/partial data files**: 
+  - in Spark, the Hudi write client delegates the data file writing to multiple executors. One executor can fail the task,

Review comment:
       thanks!




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

