Naresh AR created SQOOP-3436:
--------------------------------

             Summary: sqoop imports data from oracle exadata has duplicates
                 Key: SQOOP-3436
                 URL: https://issues.apache.org/jira/browse/SQOOP-3436
             Project: Sqoop
          Issue Type: Bug
          Components: sqoop2-build, sqoop2-jdbc-connector
    Affects Versions: 1.4.7
         Environment: sqoop1.4.7,hortonworks2.6.3
            Reporter: Naresh AR

Hi,

I have used Sqoop to import data from Oracle Exadata, and the imported data contains completely duplicated rows. At present we remove the duplicates with a DISTINCT query and write the result into another target table. Please suggest how to address this.

Background on the Oracle tables:

The Oracle tables used for the Sqoop import have no simple primary key; that is, they are SCD type 2 tables with complex keys as primary keys, which do not suit the --split-by option. The tables are also very large (100 GB).

Command used for the Sqoop import from Oracle Exadata:

sqoop import --connect %s@//%s:%s/%s --username %s -password %s --table %s.%s --fields-terminated-by '%s' --hive-drop-import-delims --hive-import --hive-overwrite --hive-table %s.%s --null-string '\\\N' --null-non-string '\\\N' --m %s --fetch-size=2500

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)