Before you look at any new library/tool, understand your current import process: 
what is the original file format, the file size, the compression, etc.? Once you 
have investigated this you can start improving the existing job. Only then, as a 
last step, should a new framework be explored.
Feel free to share those details so we can help you better.
BTW, if you need to use Spark then go for 2.x - it is also available in HDP.
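One thing worth checking before switching frameworks: ImportTsv is much faster when it writes HFiles for bulk loading (via -Dimporttsv.bulk.output) instead of issuing Puts directly against the region servers. A sketch of that mode, assuming a table named "mytable" with a single column family "cf" and placeholder HDFS paths:

```shell
# Generate HFiles instead of writing Puts through the region servers.
# Column mapping, table name, and paths below are placeholders.
hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
  -Dimporttsv.columns=HBASE_ROW_KEY,cf:col1,cf:col2 \
  -Dimporttsv.bulk.output=hdfs:///tmp/hfiles \
  mytable hdfs:///data/input

# Hand the generated HFiles to the region servers (an almost free operation
# compared to per-row Puts).
hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles \
  hdfs:///tmp/hfiles mytable
```

If the job is already using bulk output and is still slow, the bottleneck is more likely pre-splitting of the table or the input format/compression than the framework itself.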

> On 22. Oct 2017, at 10:20, Pradeep <pradeep.mi...@mail.com> wrote:
> 
> We are on Hortonworks 2.5 and very soon upgrading to 2.6. Spark version 1.6.2.
> 
> We have a large volume of data that we bulk load into HBase using ImportTsv. 
> The MapReduce job is very slow, and we are looking at whether we can use Spark 
> to improve performance. Please let me know if this can be optimized with Spark 
> and what packages or libraries can be used.
> 
> PM
> 
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
> 

