Romain Mercier created SQOOP-3161:
-------------------------------------
Summary: Import : Controlling Parallelism with Splitting column not evenly distributed
Key: SQOOP-3161
URL: https://issues.apache.org/jira/browse/SQOOP-3161
Project: Sqoop
Issue Type: Improvement
Affects Versions: 1.4.6
Reporter: Romain Mercier
Priority: Minor
Hello everyone,
Improvement of --split-by:
To import a large table, I parallelise the import across multiple mappers.
But when the splitting column is not evenly distributed, and there is no other
column to use, the load is handled by only a few mappers.
Is there a way to distribute the load evenly across the mappers based on the
actual number of rows (rather than the min and max of the column)?
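
To illustrate the problem, here is a minimal Python sketch. It is a simplified
model of splitting a column into equal-width ranges between its min and max,
not Sqoop's actual splitter code, compared against boundaries chosen by row
count. The function names and the sample data are hypothetical and only serve
the illustration.

```python
from bisect import bisect_right


def range_splits(values, num_mappers):
    """Equal-width boundaries between min and max (simplified min/max split)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_mappers
    return [lo + i * width for i in range(1, num_mappers)]


def row_count_splits(values, num_mappers):
    """Boundaries taken at equal row-count positions of the sorted values."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // num_mappers] for i in range(1, num_mappers)]


def rows_per_mapper(values, boundaries):
    """Count rows per mapper; mapper i handles [boundaries[i-1], boundaries[i])."""
    counts = [0] * (len(boundaries) + 1)
    for v in values:
        counts[bisect_right(boundaries, v)] += 1
    return counts


# Hypothetical skewed column: 9,000 ids packed into 1..9000,
# plus 1,000 ids spread sparsely up to about 1,000,000.
ids = list(range(1, 9001)) + list(range(10_000, 1_000_000, 990))

skewed = rows_per_mapper(ids, range_splits(ids, 4))
balanced = rows_per_mapper(ids, row_count_splits(ids, 4))

print("min/max ranges:", skewed)    # most rows land on the first mapper
print("row-count based:", balanced)  # rows are spread evenly
```

With the min/max approach, the first mapper receives almost all of the rows,
while boundaries derived from the row count give each mapper the same share.
The second behaviour is what I am asking for.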
Thank you.