Ruben Agudo created SQOOP-3471:
----------------------------------

             Summary: While doing sqoop-export mapper progress goes back causing duplicated data
                 Key: SQOOP-3471
                 URL: https://issues.apache.org/jira/browse/SQOOP-3471
             Project: Sqoop
          Issue Type: Bug
    Affects Versions: 1.4.6
            Reporter: Ruben Agudo
         Attachments: image-2020-04-21-10-36-15-108.png

We are running the sqoop-export tool in Qubole to export data from S3 back to a SQL Server database.

Our issue is that, sometimes, one of the mappers in the map phase appears to fail or restart. We see its progress going backwards, as in the following image:

!image-2020-04-21-10-36-15-108!

This is causing duplicate rows in our destination table.

I'm a bit lost, because the documentation says that *"If an export map task fails due to these or other reasons, it will cause the export job to fail."* and this is not the behaviour we are seeing.

Unfortunately, we can't reproduce it consistently.

The command that we are running is:

    sqoop export \
      -Dsqoop.export.records.per.statement=50000 \
      -Dsqoop.export.statements.per.transaction=100 \
      -Dsqoop.throwOnError=1 \
      --connection-manager org.apache.sqoop.manager.SQLServerManager \
      --driver com.microsoft.sqlserver.jdbc.SQLServerDriver \
      --connect connectionString \
      --table config.table \
      --export-dir config.source \
      --input-fields-terminated-by , \
      --num-mappers 8 \
      --columns theColumnsToCopy \
      --batch \
      --schema theSchema

I have replaced the values that I can't share for privacy reasons with placeholders.

What could cause a mapper's progress to go backwards? And, if that happens, is it possible to make the sqoop export fail?

Also, if this isn't the correct channel for this, please let me know.

Thanks!

--
This message was sent by Atlassian Jira
(v8.3.4#803005)