I'm trying to use MongoDB as a destination for an ETL I'm writing in Spark. It appears I'm incurring a lot of overhead in my system databases (and possibly in the primary documents themselves); I can only assume it's because I'm stuck using PairRDD.saveAsNewAPIHadoopFile.
- Is there a way to batch some of the data together and use Casbah natively so I can do bulk inserts? (A rough sketch of what I have in mind is below.)
- Is there maybe a less "hacky" way to load into MongoDB (instead of using saveAsNewAPIHadoopFile)?

Thanks in advance!
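For the first question, this is roughly what I'm imagining: open a Casbah connection per partition and insert in fixed-size batches. The record type, host, database, and collection names here are just placeholders for my actual ETL output, and I'm not sure whether a connection per partition is the right approach:

```scala
import com.mongodb.casbah.Imports._
import org.apache.spark.rdd.RDD

// Placeholder record type standing in for whatever my ETL actually produces.
case class Event(id: String, payload: String)

def saveToMongo(rdd: RDD[Event]): Unit = {
  rdd.foreachPartition { partition =>
    // Create the client on the executor so nothing non-serializable
    // gets captured in the closure shipped from the driver.
    val client = MongoClient("localhost", 27017)   // placeholder host/port
    val coll   = client("etl_db")("events")        // placeholder db/collection
    try {
      partition
        .map(e => MongoDBObject("_id" -> e.id, "payload" -> e.payload))
        .grouped(1000) // batch size pulled out of thin air
        .foreach { batch =>
          // One insert call per batch instead of one round trip per document.
          coll.insert(batch.head, batch.tail: _*)
        }
    } finally {
      client.close()
    }
  }
}
```

Is something along these lines reasonable, or does it just trade one kind of overhead for another?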