May I ask why you wrote your own MongoDB UDF when one already exists? https://github.com/mongodb/mongo-hadoop/blob/master/pig/README.md
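For reference, the mongo-hadoop Pig adapter mentioned above is used as a Storer rather than a UDF. A minimal sketch, with the thread's own input file and database/collection names; the jar paths are assumptions and depend on your installation:

```pig
-- Register the mongo-hadoop jars (paths are illustrative, not from the thread)
REGISTER /path/to/mongo-java-driver.jar;
REGISTER /path/to/mongo-hadoop-core.jar;
REGISTER /path/to/mongo-hadoop-pig.jar;

A = LOAD '/home/hduser/pigfiles/input.txt' USING TextLoader() AS (line:chararray);

-- Store directly into MongoDB; the adapter manages connections per task,
-- instead of one connection per record.
STORE A INTO 'mongodb://localhost:27017/hadoopDB.output0'
    USING com.mongodb.hadoop.pig.MongoInsertStorage();
```

With this approach the connection lifecycle is handled by the storage function, which also answers the per-line connection concern raised later in the thread.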
On Thu, Dec 4, 2014 at 3:58 PM, Suraj Nayak M <snay...@gmail.com> wrote:

> Hi Cesar,
>
> A UDF is good for processing data, but for writing data you should write a
> custom Storer. A Storer for MongoDB, with richer features, is already
> available on GitHub.
>
> Take a look at: https://github.com/mongodb/mongo-hadoop/tree/master/pig
>
> MongoStorage code:
> https://github.com/mongodb/mongo-hadoop/blob/master/pig/src/main/java/com/mongodb/hadoop/pig/MongoStorage.java
>
> Thanks,
> Suraj Nayak
>
> On Wednesday 03 December 2014 01:50 PM, Cesar Pumar García wrote:
>
>> Hi there,
>>
>> We are given a text file containing several lines, each of which
>> corresponds to a Mongo document, and we load it as follows:
>>
>> DEFINE PigToMongo com.beeva.PigToMongo.PigToMongo();
>>
>> A = LOAD '/home/hduser/pigfiles/input.txt' USING TextLoader() AS
>> (line:chararray);
>>
>> B = FOREACH A GENERATE PigToMongo(line);
>>
>> DUMP B;
>>
>> By calling PigToMongo(line), we connect to Mongo, map A, write, and close
>> the connection.
>>
>> PigToMongo creates a connection for each line, as follows (which brings
>> our MongoDB down*):
>>
>> MongoClient mongoClient = new MongoClient("localhost", 27017);
>> DB db = mongoClient.getDB("hadoopDB");
>> DBCollection coll = db.getCollection("output0");
>>
>> I wonder whether it is possible to open and close the connection only
>> once, outside the UDF.
>>
>> - By the way, does MongoDB support multiple connections at the same
>> time (from several reducers storing data during a map/reduce job, for
>> example)?
>>
>> Thank you,
>>
>> CÉSAR PUMAR GARCÍA
>> BEEVA FOR GRADUATES
>> cesar.pu...@beeva.com
>> http://www.beeva.com

--
Russell Jurney twitter.com/rjurney russell.jur...@gmail.com datasyndrome.com
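On the question of opening the connection only once: a common pattern is to hold the client as a field of the UDF and create it lazily on the first call, closing it when the task finishes (Pig's `EvalFunc` offers a `finish()` hook for this). The sketch below shows the pattern itself; the `Connection` class is a self-contained stub standing in for `MongoClient`, and the class and method names are illustrative, not part of the mongo-hadoop or Pig APIs:

```java
// Sketch: reuse one connection across many per-record calls.
// "Connection" is a stub for com.mongodb.MongoClient so the pattern is
// runnable here without the driver jar; in a real EvalFunc you would
// construct the MongoClient in the same lazy way and close it in finish().
public class LazyMongoUdf {

    // Stub for MongoClient; counts how many times a connection is opened.
    static class Connection {
        static int opens = 0;
        Connection() { opens++; }           // real code: new MongoClient("localhost", 27017)
        void insert(String doc) { }          // real code: coll.insert(new BasicDBObject(...))
        void close() { }                     // real code: mongoClient.close()
    }

    // One connection per UDF instance (i.e. per task JVM), not per record.
    private Connection conn;

    private Connection connection() {
        if (conn == null) {
            conn = new Connection();
        }
        return conn;
    }

    // Analogue of EvalFunc.exec(Tuple): invoked once per input line.
    public void exec(String line) {
        connection().insert(line);
    }

    // Analogue of EvalFunc.finish(): invoked once when the task ends.
    public void finish() {
        if (conn != null) {
            conn.close();
            conn = null;
        }
    }
}
```

However many records `exec` processes, the connection is opened once and closed once, which avoids the per-line connect/disconnect that was bringing MongoDB down. As for concurrent connections, MongoDB handles many simultaneous clients, so several reducers each holding one connection is the normal case.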