[jira] [Created] (SQOOP-3161) Import : Controlling Parallelism with Splitting column not evenly distributed

Romain Mercier (JIRA) Fri, 24 Mar 2017 04:43:43 -0700

Romain Mercier created SQOOP-3161:
-------------------------------------

             Summary: Import : Controlling Parallelism with Splitting column 
not evenly distributed
                 Key: SQOOP-3161
                 URL: https://issues.apache.org/jira/browse/SQOOP-3161
             Project: Sqoop
          Issue Type: Improvement
    Affects Versions: 1.4.6
            Reporter: Romain Mercier
            Priority: Minor



Hello everyone, 

Proposed improvement to --split-by:

To import a large table, I parallelise the import across multiple mappers.
But when the splitting column is not evenly distributed and there is no other 
column to use, the load is handled by only a few mappers.

Is there a way to distribute the load evenly across the mappers based on the 
actual number of rows (rather than the min and max of a column)?
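
To make the problem concrete, here is a minimal Python sketch. It is not 
Sqoop code: the function names, the sample ids and the mapper count of 4 are 
all made up for illustration. It compares equal-width boundaries computed from 
the min and max of the column with boundaries taken at equal row-count 
positions.

```python
from bisect import bisect_right


def range_splits(values, num_mappers):
    """Equal-width boundaries between min and max of the split column."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_mappers
    return [lo + width * i for i in range(1, num_mappers)]


def row_count_splits(values, num_mappers):
    """Boundaries taken at equal row-count positions of the sorted values."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(n * i) // num_mappers] for i in range(1, num_mappers)]


def rows_per_mapper(values, boundaries):
    """Count rows per mapper; mapper i gets boundaries[i-1] <= v < boundaries[i]."""
    counts = [0] * (len(boundaries) + 1)
    for v in values:
        counts[bisect_right(boundaries, v)] += 1
    return counts


# Hypothetical skewed column: 997 ids packed into 1..997 plus three outliers.
ids = list(range(1, 998)) + [500_000, 750_000, 1_000_000]

print(rows_per_mapper(ids, range_splits(ids, 4)))      # [997, 1, 1, 1]
print(rows_per_mapper(ids, row_count_splits(ids, 4)))  # [250, 250, 250, 250]
```

With the min/max approach one mapper ends up with almost all the rows, while 
boundaries based on row counts give each mapper the same share. Computing such 
boundaries needs a sort or a sample of the column, so it would have a cost on 
the database side.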

Thank you.





--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
