I sorted this out by spending a fair bit of time on the jar files needed. These are my versions:
Hive 3
Hadoop 3.1
MongoDB 4.0.1

You need the following jar files added to Hive:

ADD JAR hdfs://rhes75:9000/jars/mongo-hadoop-core-2.0.2.jar;
ADD JAR hdfs://rhes75:9000/jars/mongo-hadoop-hive-2.0.2.jar;
ADD JAR hdfs://rhes75:9000/jars/mongo-java-driver-3.8.1.jar;
--ADD JAR hdfs://rhes75:9000/jars/hive-serde-3.1.0.jar;
ADD JAR hdfs://rhes75:9000/jars/hive-serde-0.11.0.jar;

The jar file hive-serde-0.11.0.jar is the one you need to use, not the Hive 3 serde jar!

This code works for mapping a Hive external table (here accounts.ll_18740868_mongo) to an existing MongoDB collection:

ADD JAR hdfs://rhes75:9000/jars/mongo-hadoop-core-2.0.2.jar;
ADD JAR hdfs://rhes75:9000/jars/mongo-hadoop-hive-2.0.2.jar;
ADD JAR hdfs://rhes75:9000/jars/mongo-java-driver-3.8.1.jar;
--ADD JAR hdfs://rhes75:9000/jars/hive-serde-3.1.0.jar;
ADD JAR hdfs://rhes75:9000/jars/hive-serde-0.11.0.jar;
DROP TABLE IF EXISTS accounts.ll_18740868_mongo;
CREATE EXTERNAL TABLE accounts.ll_18740868_mongo (
    id Int
  , TRANSACTIONDATE Date
  , TRANSACTIONTYPE String
  , SORTCODE String
  , ACCOUNTNUMBER String
  , TRANSACTIONDESCRIPTION String
  , DEBITAMOUNT Double
  , CREDITAMOUNT Double
  , BALANCE Double
)
COMMENT 'Mongo external table'
STORED BY 'com.mongodb.hadoop.hive.MongoStorageHandler'
WITH SERDEPROPERTIES('mongo.columns.mapping'=
'{
   "id":"_id"
 , "TRANSACTIONDATE":"TRANSACTIONDATE"
 , "TRANSACTIONTYPE":"TRANSACTIONTYPE"
 , "SORTCODE":"SORTCODE"
 , "ACCOUNTNUMBER":"ACCOUNTNUMBER"
 , "TRANSACTIONDESCRIPTION":"TRANSACTIONDESCRIPTION"
 , "DEBITAMOUNT":"DEBITAMOUNT"
 , "CREDITAMOUNT":"CREDITAMOUNT"
 , "BALANCE":"BALANCE"
}'
)
TBLPROPERTIES ('mongo.uri'='mongodb://account_user_RO:<password>@rhes75:60100/accounts.ll_18740868');

desc accounts.ll_18740868_mongo;

You will need to load the jar files whenever you need to access this table.

In MongoDB I have:

> use accounts;
switched to db accounts
> db.ll_18740868.count()
3623

And confirmed in Hive:

0: jdbc:hive2://rhes75:10099/default> select count(1) from accounts.ll_18740868_mongo;
+-------+
|  _c0  |
+-------+
| 3623  |
+-------+
1 row selected (1.426 seconds)

HTH

Dr Mich Talebzadeh

LinkedIn https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com

Disclaimer: Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.

On Sun, 26 Aug 2018 at 23:05, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> Hi,
>
> Trying to create an external table in Hive to be accessed by MongoDB.
>
> This is the code in Hive:
>
> ADD JAR /home/hduser/jars/mongo-hadoop-core-2.0.2.jar;
> ADD JAR /home/hduser/jars/mongo-hadoop-hive-2.0.2.jar;
> ADD JAR /home/hduser/jars/mongo-java-driver-3.8.1.jar;
> use accounts;
> DROP TABLE IF EXISTS ll_18740868_mongo;
> CREATE EXTERNAL TABLE ll_18740868_mongo (
>     TransactionDate DATE
>   , TransactionType String
>   , SortCode String
>   , AccountNumber String
>   , TransactionDescription String
>   , DebitAmount Double
>   , CreditAmount Double
>   , Balance Double
> )
> COMMENT 'Mongo external table'
> STORED BY 'com.mongodb.hadoop.hive.MongoStorageHandler'
> WITH SERDEPROPERTIES('mongo.columns.mapping'=
> '{
> "TransactionDate":"TransactionDate",
> "TransactionType":"TransactionType",
> "SortCode":"SortCode",
> "AccountNumber":"AccountNumber",
> "TransactionDescription":"TransactionDescription",
> "DebitAmount":"DebitAmount",
> "CreditAmount":"CreditAmount",
> "Balance":"Balance"
> }'
> )
> TBLPROPERTIES ('mongo.uri'='mongodb://account_user_RO:mongodb@rhes75:60100/accounts.ll_18740868_mongo')
> ;
>
> In debug mode it throws this error:
>
> CREATE EXTERNAL TABLE ll_18740868_mongo (
>     TransactionDate DATE
>   , TransactionType String
>   , SortCode String
>   , AccountNumber String
>   , TransactionDescription String
>   , DebitAmount Double
>   , CreditAmount Double
>   , Balance Double
> )
> COMMENT 'Mongo external table'
> STORED BY 'com.mongodb.hadoop.hive.MongoStorageHandler'
> WITH SERDEPROPERTIES('mongo.columns.mapping'=
> '{
> "TransactionDate":"TransactionDate",
> "TransactionType":"TransactionType",
> "SortCode":"SortCode",
> "AccountNumber":"AccountNumber",
> "TransactionDescription":"TransactionDescription",
> "DebitAmount":"DebitAmount",
> "CreditAmount":"CreditAmount",
> "Balance":"Balance"
> }'
> )
> TBLPROPERTIES ('mongo.uri'='mongodb://account_user_RO:mongodb@rhes75:60100/accounts.ll_18740868_mongo')
>
> 2018-08-26 23:01:32,424 INFO [a928186b-244d-4b81-bda9-dd3a83e306b2 main] ql.Driver: Starting task [Stage-0:DDL] in serial mode
> 2018-08-26 23:01:32,426 ERROR [a928186b-244d-4b81-bda9-dd3a83e306b2 main] exec.DDLTask: Failed
> java.lang.NoClassDefFoundError: org/apache/hadoop/hive/serde2/SerDe
>         at java.lang.ClassLoader.defineClass1(Native Method)
>         at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
>         at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>         at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
>         at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>         at com.mongodb.hadoop.hive.MongoStorageHandler.getSerDeClass(MongoStorageHandler.java:88)
>         at org.apache.hadoop.hive.ql.plan.CreateTableDesc.toTable(CreateTableDesc.java:744)
>         at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4882)
>         at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:428)
>         at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205)
>         at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
>         at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2479)
>         at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2150)
>         at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1826)
>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1567)
>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1561)
>         at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157)
>         at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:218)
>         at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
>         at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188)
>         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402)
>         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:335)
>         at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:471)
>         at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:487)
>         at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:793)
>         at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
>         at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.serde2.SerDe
>         at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>         ... 39 more
> 2018-08-26 23:01:32,426 INFO [a928186b-244d-4b81-bda9-dd3a83e306b2 main] reexec.ReOptimizePlugin: ReOptimization: retryPossible: false
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. org/apache/hadoop/hive/serde2/SerDe
> 2018-08-26 23:01:32,426 ERROR [a928186b-244d-4b81-bda9-dd3a83e306b2 main] ql.Driver: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. org/apache/hadoop/hive/serde2/SerDe
> 2018-08-26 23:01:32,426 INFO [a928186b-244d-4b81-bda9-dd3a83e306b2 main] ql.Driver: Completed executing command(queryId=hduser_20180826230132_53320afb-5265-407f-82a3-a1b22e584e20); Time taken: 0.005 seconds
> 2018-08-26 23:01:32,426 INFO [a928186b-244d-4b81-bda9-dd3a83e306b2 main] lockmgr.DbTxnManager: Stopped heartbeat for query: hduser_20180826230132_53320afb-5265-407f-82a3-a1b22e584e20
> 2018-08-26 23:01:32,437 INFO [a928186b-244d-4b81-bda9-dd3a83e306b2 main] conf.HiveConf: Using the default value passed in for log id: a928186b-244d-4b81-bda9-dd3a83e306b2
> 2018-08-26 23:01:32,437 INFO [a928186b-244d-4b81-bda9-dd3a83e306b2 main] session.SessionState: Resetting thread name to main
> 2018-08-26 23:01:32,437 INFO [main] conf.HiveConf: Using the default value passed in for log id: a928186b-244d-4b81-bda9-dd3a83e306b2
> 2018-08-26 23:01:32,447 INFO [main] session.SessionState: Deleted directory: /tmp/hive/hduser/a928186b-244d-4b81-bda9-dd3a83e306b2 on fs with scheme hdfs
> 2018-08-26 23:01:32,447 INFO [main] session.SessionState: Deleted directory: /tmp/hive/a928186b-244d-4b81-bda9-dd3a83e306b2 on fs with scheme file
> 2018-08-26 23:01:32,447 INFO [main] metastore.HiveMetaStoreClient: Closed a connection to metastore, current connections: 0
>
> Maybe I am using incompatible jars here?
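P.S. For anyone scripting this against wider tables, the one-to-one 'mongo.columns.mapping' JSON used above is mechanical enough to generate rather than hand-write. Below is a minimal Python sketch; build_mongo_mapping is a hypothetical helper of my own, not part of the mongo-hadoop connector. It maps each Hive column to the MongoDB field of the same name and sends the designated id column to Mongo's mandatory _id field, as in the working DDL:

```python
import json

def build_mongo_mapping(columns, id_column="id"):
    """Build the JSON value for the 'mongo.columns.mapping' SerDe property.

    Every Hive column maps to the MongoDB field of the same name, except the
    designated id column, which maps to MongoDB's mandatory _id field.
    """
    mapping = {}
    for col in columns:
        mapping[col] = "_id" if col == id_column else col
    return json.dumps(mapping)

# Columns from the working external table definition above.
cols = ["id", "TRANSACTIONDATE", "TRANSACTIONTYPE", "SORTCODE",
        "ACCOUNTNUMBER", "TRANSACTIONDESCRIPTION", "DEBITAMOUNT",
        "CREDITAMOUNT", "BALANCE"]
print(build_mongo_mapping(cols))
```

Paste the printed JSON (single-quoted) as the value of 'mongo.columns.mapping' in WITH SERDEPROPERTIES.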