This constant (JobCounter.MB_MILLIS_MAPS) was added in Hadoop 2.3. Maybe you are using an older version?
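Since you mention Hadoop 2.6, the missing constant is probably coming from older Hadoop jars on Sqoop's own classpath rather than from the cluster itself. A quick way to check which Hadoop actually gets resolved is something like the following (a minimal, untested sketch using the standard VersionInfo and JobCounter APIs):

    import org.apache.hadoop.util.VersionInfo
    import org.apache.hadoop.mapreduce.JobCounter
    import scala.util.Try

    object HadoopClasspathCheck {
      def main(args: Array[String]): Unit = {
        // Which hadoop-common was actually resolved on the classpath
        println(s"Hadoop version: ${VersionInfo.getVersion}")
        // The constant Sqoop trips over; it exists only in Hadoop >= 2.3
        val hasCounter = Try(JobCounter.valueOf("MB_MILLIS_MAPS")).isSuccess
        println(s"JobCounter.MB_MILLIS_MAPS present: $hasCounter")
      }
    }

If that prints false, the MapReduce classes on the classpath predate 2.3.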
~bhaskar

On Thu, Aug 25, 2016 at 3:04 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> Actually I started using Spark to import data from an RDBMS (in this case
> Oracle) after upgrading to Hive 2, running an import like the one below:
>
> sqoop import --connect "jdbc:oracle:thin:@rhes564:1521:mydb12" --username scratchpad -P \
>   --query "select * from scratchpad.dummy2 where \$CONDITIONS" \
>   --split-by ID \
>   --hive-import --hive-table "test.dumy2" --target-dir "/tmp/dummy2" --direct
>
> This gets the data into HDFS and then throws this error:
>
> ERROR [main] tool.ImportTool: Imported Failed: No enum constant
> org.apache.hadoop.mapreduce.JobCounter.MB_MILLIS_MAPS
>
> I could easily get the data into Hive from the file on HDFS, or dig into
> the problem (Spark 2, Hive 2, Hadoop 2.6, Sqoop 1.4.5), but I find Spark
> trouble-free, like below:
>
> val df = HiveContext.read.format("jdbc").options(
>   Map("url" -> dbURL,
>     "dbtable" -> "scratchpad.dummy",
>     "partitionColumn" -> partitionColumnName,
>     "lowerBound" -> lowerBoundValue,
>     "upperBound" -> upperBoundValue,
>     "numPartitions" -> numPartitionsValue,
>     "user" -> dbUserName,
>     "password" -> dbPassword)).load
>
> It does work: it opens parallel connections to the Oracle DB and creates a
> DataFrame with the specified number of partitions.
>
> One thing I am not sure about, and have not tried, is whether Spark
> supports direct mode yet.
>
> HTH
>
> Dr Mich Talebzadeh
> http://talebzadehmich.wordpress.com
>
> On 25 August 2016 at 09:07, Bhaskar Dutta <bhas...@gmail.com> wrote:
>
>> Which RDBMS are you using here, and what are the data volume and
>> frequency of pulling data off the RDBMS? Specifying these would help in
>> giving better answers.
>>
>> Sqoop has direct (non-JDBC) mode support for Postgres, MySQL and Oracle,
>> so you can use that for better performance if you are using one of these
>> databases.
>>
>> And don't forget that Sqoop can load data directly into Parquet or Avro
>> (I think direct mode is not supported in this case). You can also use the
>> Kite SDK with Sqoop to manage/transform datasets, perform schema
>> evolution and such.
>>
>> ~bhaskar
>>
>> On Thu, Aug 25, 2016 at 3:09 AM, Venkata Penikalapati <
>> mail.venkatakart...@gmail.com> wrote:
>>
>>> Team,
>>> Please help me in choosing between Sqoop and Spark JDBC to fetch data
>>> from an RDBMS. Sqoop has a lot of optimizations for fetching data; does
>>> Spark JDBC also have those?
>>>
>>> I'm performing some analytics using Spark on data that resides in an
>>> RDBMS.
>>>
>>> Please guide me with this.
>>>
>>> Thanks
>>> Venkata Karthik P
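For reference, here is the same partitioned read, plus the Hive load that the Sqoop job was doing, written against the Spark 2 SparkSession API rather than the older HiveContext. This is a minimal sketch, not a tested recipe: the URL, table, bounds and credentials are placeholders lifted from Mich's example, and it assumes the Oracle JDBC driver is on the classpath.

    import org.apache.spark.sql.{SaveMode, SparkSession}

    object OracleToHive {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("oracle-to-hive")
          .enableHiveSupport() // required for saveAsTable against the Hive metastore
          .getOrCreate()

        // Partitioned read: Spark opens numPartitions connections, each
        // fetching a slice of the ID range between lowerBound and upperBound.
        val df = spark.read.format("jdbc")
          .option("url", "jdbc:oracle:thin:@rhes564:1521:mydb12")
          .option("dbtable", "scratchpad.dummy2")
          .option("partitionColumn", "ID")
          .option("lowerBound", "1")        // placeholder bound
          .option("upperBound", "1000000")  // placeholder bound
          .option("numPartitions", "4")
          .option("user", "scratchpad")
          .option("password", "...")        // placeholder
          .load()

        // Replaces the separate "copy from HDFS into Hive" step entirely.
        df.write.mode(SaveMode.Overwrite).saveAsTable("test.dummy2")
      }
    }

Note that lowerBound and upperBound only control how the ID range is split across partitions; rows outside that range are still read, just by the edge partitions.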