Naresh AR created SQOOP-3436:
--------------------------------

             Summary: sqoop imports data from oracle exadata has duplicates
                 Key: SQOOP-3436
                 URL: https://issues.apache.org/jira/browse/SQOOP-3436
             Project: Sqoop
          Issue Type: Bug
          Components: sqoop2-build, sqoop2-jdbc-connector
    Affects Versions: 1.4.7
         Environment: sqoop1.4.7,hortonworks2.6.3
            Reporter: Naresh AR


Hi, I am using Sqoop to import from Oracle Exadata, and the imported data 
contains fully duplicated rows. At present we remove them with a DISTINCT query 
and write the result into another target table. Please advise on how to avoid 
the duplicates.
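
The DISTINCT workaround mentioned above can be sketched roughly as follows. 
This is only an illustration: the helper name build_dedup_query and the table 
names staging_db.orders_raw and target_db.orders are hypothetical placeholders, 
not the real objects, and the exact query used in production may differ.

```shell
# Build a HiveQL statement that copies only distinct rows from the
# Sqoop-loaded staging table ($1) into a separate target table ($2).
# Both table names are placeholders supplied by the caller.
build_dedup_query() {
  printf 'INSERT OVERWRITE TABLE %s SELECT DISTINCT * FROM %s;' "$2" "$1"
}

# Assumed usage (requires the Hive CLI, so it is left commented out):
# hive -e "$(build_dedup_query staging_db.orders_raw target_db.orders)"
```

This removes the duplicates only after the import has finished, so every row 
still has to be transferred and then rewritten once.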

Background on the Oracle tables:

The Oracle tables used for the Sqoop import have no simple primary key: they 
are SCD Type 2 tables whose primary keys are complex (composite) keys, which do 
not suit the --split-by option, and the tables are very large (100 GB).

Command used for the Sqoop import from Oracle Exadata:

sqoop import --connect %s@//%s:%s/%s --username %s --password %s \
  --table %s.%s --fields-terminated-by '%s' --hive-drop-import-delims \
  --hive-import --hive-overwrite --hive-table %s.%s \
  --null-string '\\\N' --null-non-string '\\\N' -m %s --fetch-size=2500



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
