yihua commented on code in PR #6450:
URL: https://github.com/apache/hudi/pull/6450#discussion_r950435345
##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/DataSourceOptions.scala:
##########
@@ -386,11 +386,11 @@ object DataSourceWriteOptions {
.withDocumentation(" Config to indicate how long (by millisecond) before a
retry should issued for failed microbatch")
/**
- * By default true (in favor of streaming progressing over data integrity)
+ * By default false. If users prefer streaming progress over data integrity, can set this to true.
*/
val STREAMING_IGNORE_FAILED_BATCH: ConfigProperty[String] = ConfigProperty
.key("hoodie.datasource.write.streaming.ignore.failed.batch")
- .defaultValue("true")
+ .defaultValue("false")
Review Comment:
I wonder if we should remove this sweep-under-the-rug kind of config, which
could hide critical errors. At the very least, we should clearly warn users
in the docs about the consequences of turning it on.
For this particular config, could you add docs saying that it could hide
write status errors while the checkpoint moves on? A sketch of what that
could look like follows.
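To make the ask concrete, here is a rough sketch of how the expanded warning
could read, reusing the ConfigProperty builder from the diff above; the
wording is only a suggestion, not committed code:

val STREAMING_IGNORE_FAILED_BATCH: ConfigProperty[String] = ConfigProperty
  .key("hoodie.datasource.write.streaming.ignore.failed.batch")
  .defaultValue("false")
  .withDocumentation("Config to ignore a failed streaming microbatch instead of " +
    "failing the streaming query. WARNING: enabling this can hide write status " +
    "errors while the streaming checkpoint advances, so the data of the failed " +
    "batch may be silently lost. Leave this disabled unless streaming progress " +
    "matters more than data integrity.")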
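For context on what users would be opting into, here is a minimal Structured
Streaming sketch (the table name, key/precombine fields, and paths are made
up for illustration) where turning the flag back on lets the checkpoint
advance past a failed write:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.Trigger

val spark = SparkSession.builder().appName("ignore-failed-batch-sketch").getOrCreate()

// Toy source: the rate source emits (timestamp, value) rows.
val df = spark.readStream.format("rate").load()

df.writeStream
  .format("hudi")
  .option("hoodie.table.name", "demo_tbl")                          // made-up table name
  .option("hoodie.datasource.write.recordkey.field", "value")
  .option("hoodie.datasource.write.precombine.field", "timestamp")
  // Opting back into the old default: a failed microbatch is skipped and
  // the checkpoint still advances, so that batch's data is lost.
  .option("hoodie.datasource.write.streaming.ignore.failed.batch", "true")
  .option("checkpointLocation", "/tmp/hudi_ckpt")                   // made-up path
  .trigger(Trigger.ProcessingTime("10 seconds"))
  .start("/tmp/demo_tbl")                                           // made-up path

With the new false default, the same failure stops the query instead, which
seems like the safer behavior to call out in the docs.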