Before you look at any new library or tool, investigate the import process itself: what is the original file format, the file size, the compression, and so on? Once you have investigated this you can start improving it; only then, as a last step, should a new framework be explored. Feel free to share those details and we can help you better. BTW, if you need to use Spark, go for 2.x - it is also available in HDP.
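On the ImportTsv path specifically: before moving to Spark it is worth checking whether the job is issuing per-row Puts or doing a true bulk load. A sketch of the two-phase HFile bulk load follows - the table name `mytable`, the column family/qualifiers `cf:c1,cf:c2`, and the HDFS paths are placeholders of my own, so adapt them to your schema:

```
# Phase 1: generate HFiles instead of issuing Puts (usually much faster)
hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
  -Dimporttsv.columns=HBASE_ROW_KEY,cf:c1,cf:c2 \
  -Dimporttsv.bulk.output=hdfs:///tmp/hfiles \
  mytable hdfs:///data/input.tsv

# Phase 2: move the generated HFiles into the table's regions
hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles \
  hdfs:///tmp/hfiles mytable
```

If the current job already uses `-Dimporttsv.bulk.output`, the bottleneck is more likely input parsing, compression, or region splits, and that is worth profiling before switching frameworks.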
> On 22. Oct 2017, at 10:20, Pradeep <pradeep.mi...@mail.com> wrote:
>
> We are on Hortonworks 2.5 and very soon upgrading to 2.6. Spark version 1.6.2.
>
> We have a large volume of data that we bulk load to HBase using ImportTsv. The
> MapReduce job is very slow and, looking for options, we could use Spark to
> improve performance. Please let me know if this can be optimized with Spark
> and what packages or libs can be used.
>
> PM
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org