[ https://issues.apache.org/jira/browse/SQOOP-3471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ruben Agudo updated SQOOP-3471:
-------------------------------
    Description:
We are running the sqoop-export tool in Qubole to export data from S3 back to a SQL Server database.

Our issue is that sometimes one of the mappers in the map phase seems to fail or restart. We see the progress going backwards, as in the following image:

!image-2020-04-21-10-36-15-108.png!

This is causing duplicates in our destination table. I'm a bit lost, because the documentation says that *"If an export map task fails due to these or other reasons, it will cause the export job to fail."* and this is not the behaviour we are seeing.

Unfortunately, we can't reproduce it consistently.

The command that we are running is:

sqoop export \
  -Dsqoop.export.records.per.statement=50000 \
  -Dsqoop.export.statements.per.transaction=100 \
  -Dsqoop.throwOnError=1 \
  --connection-manager org.apache.sqoop.manager.SQLServerManager \
  --driver com.microsoft.sqlserver.jdbc.SQLServerDriver \
  --connect connectionString \
  --table config.table \
  --export-dir config.source \
  --input-fields-terminated-by , \
  --num-mappers 8 \
  --columns theColumnsToCopy \
  --batch \
  --schema theSchema

I removed the things that I can't share for privacy reasons.

The table we want to export contains 237,371,726 records.

What could be causing the mapper progress to go backwards? And, if that happens, is it possible to make the sqoop export fail?

Also, if this isn't the correct channel for this, please let me know.

Thanks!

> While doing sqoop-export mapper progress goes back causing duplicated data
> --------------------------------------------------------------------------
>
>                 Key: SQOOP-3471
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3471
>             Project: Sqoop
>          Issue Type: Bug
>    Affects Versions: 1.4.6
>            Reporter: Ruben Agudo
>            Priority: Major
>         Attachments: image-2020-04-21-10-36-15-108.png
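
To make the duplicate-row concern concrete, here is a rough Python sketch of the arithmetic behind the settings in the command above. It is only an illustration under stated assumptions, not a description of Sqoop's internals: it assumes each committed transaction carries records.per.statement x statements.per.transaction rows, that the input is split evenly across the 8 mappers, and that a re-run mapper starts its split again from the beginning while rows committed by the earlier attempt stay in the table. The function names and the 12,000,000-row failure point are hypothetical.

```python
# Values taken from the command and description in this issue.
RECORDS_PER_STATEMENT = 50_000
STATEMENTS_PER_TRANSACTION = 100
NUM_MAPPERS = 8
TOTAL_RECORDS = 237_371_726


def rows_per_transaction(records_per_statement, statements_per_transaction):
    """Rows committed together, ASSUMING one commit per full set of statements."""
    return records_per_statement * statements_per_transaction


def duplicates_after_retry(split_rows, rows_per_txn, rows_written_before_failure):
    """Rows that would be inserted twice if the split is re-run from the start.

    ASSUMPTION: only fully committed transactions survive the failed attempt;
    the in-flight transaction is rolled back.
    """
    if not 0 <= rows_written_before_failure < split_rows:
        raise ValueError("failure point must fall inside the split")
    committed_transactions = rows_written_before_failure // rows_per_txn
    return committed_transactions * rows_per_txn


if __name__ == "__main__":
    txn_rows = rows_per_transaction(RECORDS_PER_STATEMENT, STATEMENTS_PER_TRANSACTION)
    split_rows = TOTAL_RECORDS // NUM_MAPPERS  # even split is an assumption
    print("rows per transaction:", txn_rows)
    print("rows per mapper (approx.):", split_rows)
    # Hypothetical failure after the mapper has written 12,000,000 rows.
    print("duplicated rows:", duplicates_after_retry(split_rows, txn_rows, 12_000_000))
```

Under these assumptions a single re-run mapper could leave millions of duplicated rows behind, which would be consistent with the progress going backwards and with the duplicates we see in the destination table.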