[ https://issues.apache.org/jira/browse/HADOOP-17215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mukund Thakur reopened HADOOP-17215:
------------------------------------

Some customer Spark jobs are failing intermittently because of this:
{noformat}
org.apache.hadoop.fs.PathIOException: `abfs://d...@xxxx.dfs.core.windows.net/warehouse/tablespace/external/hive/audit/bdp/consumption/structured/impala/audit.db/job_status/_SUCCESS': Input/output error: Parallel access to the create path detected. Failing request to honor single writer semantics
{noformat}
When conditional overwrite is disabled, the jobs succeed. We need to identify why this was done.

cc [~anujmodi] [~manika137] [~ste...@apache.org]

> ABFS: Support for conditional overwrite
> ---------------------------------------
>
>                 Key: HADOOP-17215
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17215
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/azure
>    Affects Versions: 3.3.0
>            Reporter: Sneha Vijayarajan
>            Assignee: Sneha Vijayarajan
>            Priority: Major
>              Labels: abfsactive
>             Fix For: 3.3.1, 3.4.0
>
>
> Filesystem Create APIs that do not accept an argument for the overwrite flag
> end up defaulting it to true.
> We are observing a higher request count for creates with overwrite=true,
> primarily because the flag defaults to true in the called Create API. When a
> create with overwrite times out, we have observed that it can lead to race
> conditions between the first create and the retried one running almost in
> parallel.
> To avoid this scenario for create requests with overwrite=true, the ABFS
> driver will always attempt to create without overwrite first. If the create
> fails due to fileAlreadyPresent, it will resend the request with
> overwrite=true.
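
For readers following the thread, here is a minimal sketch of the flow the issue description outlines: attempt the create without overwrite, and only if the path is already present resend it with overwrite=true. Everything in it is an assumption made for illustration: `FakeStore`, `FileAlreadyPresentException` and `createWithConditionalOverwrite` are hypothetical names backed by an in-memory map, not the ABFS driver's classes or its REST calls. It also does not model whatever check produces the "Parallel access to the create path detected" error quoted in the reopen comment.

```java
import java.util.HashMap;
import java.util.Map;

class ConditionalOverwriteSketch {

    /** Hypothetical stand-in for the store's "path already exists" failure. */
    static class FileAlreadyPresentException extends RuntimeException {
        FileAlreadyPresentException(String path) {
            super("already present: " + path);
        }
    }

    /** Hypothetical in-memory store; it only mimics create semantics. */
    static class FakeStore {
        final Map<String, String> files = new HashMap<>();
        int requests = 0; // counts every create request sent to the store

        void create(String path, String content, boolean overwrite) {
            requests++;
            if (!overwrite && files.containsKey(path)) {
                throw new FileAlreadyPresentException(path);
            }
            files.put(path, content);
        }
    }

    /**
     * Caller asked for overwrite=true. Try a plain create first and
     * resend with overwrite=true only if the path already exists.
     */
    static void createWithConditionalOverwrite(FakeStore store, String path, String content) {
        try {
            store.create(path, content, false);
        } catch (FileAlreadyPresentException e) {
            store.create(path, content, true);
        }
    }

    public static void main(String[] args) {
        FakeStore store = new FakeStore();
        createWithConditionalOverwrite(store, "/job_status/_SUCCESS", "v1");
        createWithConditionalOverwrite(store, "/job_status/_SUCCESS", "v2");
        System.out.println(store.files.get("/job_status/_SUCCESS") + " after " + store.requests + " requests");
    }
}
```

In this sketch a create on a new path costs one request, while a create on an existing path costs two (the failed attempt plus the resend). The sketch is single-threaded, so it says nothing about how concurrent writers to the same path, such as the `_SUCCESS` marker in the failing jobs, would interact.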