Hi, Michael.

I have the same problem. My warehouse directory is always created locally. I copied the default hive-site.xml into the $SPARK_HOME/conf directory on each node. I then executed the code below:

    val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
    hiveContext.hql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)")
    hiveContext.hql("LOAD DATA LOCAL INPATH '/extdisk2/tools/spark/examples/src/main/resources/kv1.txt' INTO TABLE src")
    hiveContext.hql("FROM src SELECT key, value").collect()

I got the exception below:

    java.io.FileNotFoundException: File file:/user/hive/warehouse/src/kv1.txt does not exist
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:520)
        at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:398)
        at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:137)
        at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:339)
        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:763)
        at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:106)
        at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
        at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:193)

Eventually, I found that /user/hive/warehouse/src/kv1.txt was created on the node where I started spark-shell.

I am using the pre-built Spark 1.0.1 for Hadoop 2.

Thanks in advance.

Michael Armbrust wrote
> The warehouse and the metastore directories are two different things. The
> metastore holds the schema information about the tables and will by default
> be a local directory. With javax.jdo.option.ConnectionURL you can
> configure it to be something like mysql. The warehouse directory is the
> default location where the actual contents of the tables is stored. What
> directory are seeing created locally?

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/HiveContext-is-creating-metastore-warehouse-locally-instead-of-in-hdfs-tp10838p11024.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.