I'm using

 * Sqoop 1.3.0-cdh3u2
 * Hive 0.7.1-cdh3u2

My /tmp/${USER}/hive.log file is not very informative:
2011-11-29 08:04:22,636 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.core.resources" but it cannot be resolved.
2011-11-29 08:04:22,636 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.core.resources" but it cannot be resolved.
2011-11-29 08:04:22,638 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.core.runtime" but it cannot be resolved.
2011-11-29 08:04:22,638 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.core.runtime" but it cannot be resolved.
2011-11-29 08:04:22,638 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.text" but it cannot be resolved.
2011-11-29 08:04:22,638 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.text" but it cannot be resolved.

I also changed the permissions on my warehouse directory to "rwxrwxrwx".

Any other pointers are welcome; a rough workaround sketch is appended at the bottom of this message.

Thanks,
Jurgen

On Mon, Nov 28, 2011 at 11:44 PM, arv...@cloudera.com <arv...@cloudera.com> wrote:
> Hi Jurgen,
>
> What version of Hive and Sqoop are you using? Also, please look at the
> /tmp/${USER}/hive.log file, which will have more detailed information on
> what may be going wrong.
>
> Thanks,
> Arvind
>
> On Mon, Nov 28, 2011 at 3:17 PM, Jurgen Van Gael <jur...@rangespan.com> wrote:
>>
>> Hi,
>>
>> I am running the Cloudera CDH3 Hive distribution in pseudo-distributed
>> mode on my local Mac OS Lion laptop. Hive generally works fine except
>> when I use it together with Sqoop. A command like
>>
>>   sqoop import --connect jdbc:mysql://localhost/db --username root \
>>     --password foobar --table sometable --warehouse-dir /user/hive/warehouse
>>
>> completes successfully and generates part files, a _logs directory and
>> a _SUCCESS file in the Hive warehouse directory on HDFS. However, when
>> I add --hive-import to the Sqoop command, the import itself still works
>> but Hive seems to get into an infinite loop. Looking at the logs I find
>> entries like
>>
>> 2011-11-28 22:54:57,279 WARN org.apache.hadoop.hdfs.StateChange: DIR*
>> FSDirectory.unprotectedRenameTo: failed to rename
>> /user/hive/warehouse/sometable/_SUCCESS to
>> /user/hive/warehouse/sometable/_SUCCESS_copy_2 because source does not exist
>> 2011-11-28 22:54:57,281 WARN org.apache.hadoop.hdfs.StateChange: DIR*
>> FSDirectory.unprotectedRenameTo: failed to rename
>> /user/hive/warehouse/sometable/_SUCCESS to
>> /user/hive/warehouse/sometable/_SUCCESS_copy_3 because source does not exist
>> 2011-11-28 22:54:57,282 WARN org.apache.hadoop.hdfs.StateChange: DIR*
>> FSDirectory.unprotectedRenameTo: failed to rename
>> /user/hive/warehouse/sometable/_SUCCESS to
>> /user/hive/warehouse/sometable/_SUCCESS_copy_4 because source does not exist
>>
>> I started digging into the source code and can trace the loop back to
>> ql/metadata/Hive.java:checkPaths, which tries to find a free name for the
>> _SUCCESS file during the actual Hive load but fails because the Sqoop
>> import MR job has already created a _SUCCESS file. I already tried
>> disabling MR creation of _SUCCESS files, but Hive seems to wait for that
>> file to kick off the Hive import and hence fails.
>>
>> Does anyone have any suggestions on where to search next?
>>
>> Thanks!
>> Jurgen

--
___________________
Jurgen Van Gael
Data Scientist at rangespan.com
Mobile: +44 (0) 794 3407 007
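A rough workaround sketch, assuming a matching Hive table "sometable" already exists and using a purely illustrative staging path: run the import without --hive-import, remove the _SUCCESS marker and _logs directory by hand, and then load the part files with a plain LOAD DATA statement.

  # Plain import, no --hive-import; the staging directory is only an example.
  sqoop import --connect jdbc:mysql://localhost/db --username root \
      --password foobar --table sometable --target-dir /tmp/sometable_staging

  # Remove the job artifacts that the Hive load would otherwise trip over.
  hadoop fs -rm  /tmp/sometable_staging/_SUCCESS
  hadoop fs -rmr /tmp/sometable_staging/_logs

  # Move the remaining part files into the pre-existing Hive table.
  hive -e "LOAD DATA INPATH '/tmp/sometable_staging' INTO TABLE sometable;"

Because the Hive load never sees a pre-existing _SUCCESS file this way, the _SUCCESS_copy_N rename loop reported above should not be triggered.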