Gave it another try - it seems that it picks up the variable and prints out the correct value, but it still puts the metastore_db folder in the current directory, regardless.
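
If I understand correctly, that metastore_db folder is the embedded Derby database backing the metastore, and its location is governed by javax.jdo.option.ConnectionURL rather than hive.metastore.warehouse.dir -- the default connection string points at a path relative to the working directory, which would explain what I'm seeing. My next attempt is something along these lines in hive-site.xml (the path below is just a placeholder):

    <property>
      <name>javax.jdo.option.ConnectionURL</name>
      <!-- placeholder absolute path; should keep the Derby files out of the current working directory -->
      <value>jdbc:derby:;databaseName=/some/absolute/path/metastore_db;create=true</value>
      <description>JDBC connect string for the embedded Derby metastore</description>
    </property>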
On Sat, May 16, 2015 at 1:13 PM, Tamas Jambor <jambo...@gmail.com> wrote:

> Thank you for the reply.
>
> I have tried your experiment; it seems that it does not print the settings
> out in spark-shell (I'm using 1.3, by the way).
>
> Strangely, I have been experimenting with a SQL connection instead, which
> works after all (still, if I go to spark-shell and try to print out the SQL
> settings that I put in hive-site.xml, it does not print them).
>
>
> On Fri, May 15, 2015 at 7:22 PM, Yana Kadiyska <yana.kadiy...@gmail.com> wrote:
>
>> My point was more about how to verify that properties are picked up from
>> the hive-site.xml file. You don't really need hive.metastore.uris if
>> you're not running against an external metastore. I just did an
>> experiment with warehouse.dir.
>>
>> My hive-site.xml looks like this:
>>
>> <configuration>
>>   <property>
>>     <name>hive.metastore.warehouse.dir</name>
>>     <value>/home/ykadiysk/Github/warehouse_dir</value>
>>     <description>location of default database for the warehouse</description>
>>   </property>
>> </configuration>
>>
>> and spark-shell code:
>>
>> scala> val hc= new org.apache.spark.sql.hive.HiveContext(sc)
>> hc: org.apache.spark.sql.hive.HiveContext = org.apache.spark.sql.hive.HiveContext@3036c16f
>>
>> scala> hc.sql("show tables").collect
>> 15/05/15 14:12:57 INFO HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
>> 15/05/15 14:12:57 INFO ObjectStore: ObjectStore, initialize called
>> 15/05/15 14:12:57 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored
>> 15/05/15 14:12:58 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
>> 15/05/15 14:12:58 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
>> 15/05/15 14:13:03 INFO ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
>> 15/05/15 14:13:03 INFO ObjectStore: Initialized ObjectStore
>> 15/05/15 14:13:04 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 0.12.0-protobuf-2.5
>> 15/05/15 14:13:05 INFO HiveMetaStore: 0: get_tables: db=default pat=.*
>> 15/05/15 14:13:05 INFO audit: ugi=ykadiysk ip=unknown-ip-addr cmd=get_tables: db=default pat=.*
>> 15/05/15 14:13:05 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
>> 15/05/15 14:13:05 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
>> res0: Array[org.apache.spark.sql.Row] = Array()
>>
>> scala> hc.getConf("hive.metastore.warehouse.dir")
>> res1: String = /home/ykadiysk/Github/warehouse_dir
>>
>> I have not tried an HDFS path, but you should at least be able to verify
>> that the variable is being read. It might be that your value is read but is
>> otherwise not liked...
>>
>> On Fri, May 15, 2015 at 2:03 PM, Tamas Jambor <jambo...@gmail.com> wrote:
>>
>>> Thanks for the reply.
>>> I am trying to use it without a Hive setup (Spark standalone), so it prints
>>> something like this:
>>>
>>> hive_ctx.sql("show tables").collect()
>>> 15/05/15 17:59:03 INFO HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
>>> 15/05/15 17:59:03 INFO ObjectStore: ObjectStore, initialize called
>>> 15/05/15 17:59:04 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored
>>> 15/05/15 17:59:04 INFO Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
>>> 15/05/15 17:59:04 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
>>> 15/05/15 17:59:05 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
>>> 15/05/15 17:59:08 INFO BlockManagerMasterActor: Registering block manager xxxx:42819 with 3.0 GB RAM, BlockManagerId(2, xxx, 42819)
>>> 15/05/15 17:59:18 INFO ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
>>> 15/05/15 17:59:18 INFO MetaStoreDirectSql: MySQL check failed, assuming we are not on mysql: Lexical error at line 1, column 5. Encountered: "@" (64), after : "".
>>> 15/05/15 17:59:20 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
>>> 15/05/15 17:59:20 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
>>> 15/05/15 17:59:28 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
>>> 15/05/15 17:59:29 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
>>> 15/05/15 17:59:31 INFO ObjectStore: Initialized ObjectStore
>>> 15/05/15 17:59:32 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 0.13.1aa
>>> 15/05/15 17:59:33 WARN MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-azure-file-system.properties,hadoop-metrics2.properties
>>> 15/05/15 17:59:33 INFO MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>> 15/05/15 17:59:33 INFO MetricsSystemImpl: azure-file-system metrics system started
>>> 15/05/15 17:59:33 INFO HiveMetaStore: Added admin role in metastore
>>> 15/05/15 17:59:34 INFO HiveMetaStore: Added public role in metastore
>>> 15/05/15 17:59:34 INFO HiveMetaStore: No user is added in admin role, since config is empty
>>> 15/05/15 17:59:35 INFO SessionState: No Tez session required at this point. hive.execution.engine=mr.
>>> 15/05/15 17:59:37 INFO HiveMetaStore: 0: get_tables: db=default pat=.*
>>> 15/05/15 17:59:37 INFO audit: ugi=testuser ip=unknown-ip-addr cmd=get_tables: db=default pat=.*
>>>
>>> Not sure what to put in hive.metastore.uris in this case?
>>>
>>>
>>> On Fri, May 15, 2015 at 2:52 PM, Yana Kadiyska <yana.kadiy...@gmail.com> wrote:
>>>
>>>> This should work. Which version of Spark are you using? Here is what I
>>>> do -- make sure hive-site.xml is in the conf directory of the machine
>>>> you're using the driver from.
>>>> Now let's run spark-shell from that machine:
>>>>
>>>> scala> val hc= new org.apache.spark.sql.hive.HiveContext(sc)
>>>> hc: org.apache.spark.sql.hive.HiveContext = org.apache.spark.sql.hive.HiveContext@6e9f8f26
>>>>
>>>> scala> hc.sql("show tables").collect
>>>> 15/05/15 09:34:17 INFO metastore: Trying to connect to metastore with URI thrift://hostname.com:9083   <-- here should be a value from your hive-site.xml
>>>> 15/05/15 09:34:17 INFO metastore: Waiting 1 seconds before next connection attempt.
>>>> 15/05/15 09:34:18 INFO metastore: Connected to metastore.
>>>> res0: Array[org.apache.spark.sql.Row] = Array([table1,false],
>>>>
>>>> scala> hc.getConf("hive.metastore.uris")
>>>> res13: String = thrift://hostname.com:9083
>>>>
>>>> scala> hc.getConf("hive.metastore.warehouse.dir")
>>>> res14: String = /user/hive/warehouse
>>>>
>>>> The first line tells you which metastore it's trying to connect to --
>>>> this should be the string specified under the hive.metastore.uris property
>>>> in your hive-site.xml file. I have not mucked with warehouse.dir too much,
>>>> but I know that the value of the metastore URI is in fact picked up from
>>>> there, as I regularly point to different systems...
>>>>
>>>>
>>>> On Thu, May 14, 2015 at 6:26 PM, Tamas Jambor <jambo...@gmail.com> wrote:
>>>>
>>>>> I have tried to put the hive-site.xml file in the conf/ directory;
>>>>> it seems it is not picked up from there.
>>>>>
>>>>>
>>>>> On Thu, May 14, 2015 at 6:50 PM, Michael Armbrust <mich...@databricks.com> wrote:
>>>>>
>>>>>> You can configure Spark SQL's Hive interaction by placing a
>>>>>> hive-site.xml file in the conf/ directory.
>>>>>>
>>>>>> On Thu, May 14, 2015 at 10:24 AM, jamborta <jambo...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> is it possible to set hive.metastore.warehouse.dir, which is internally
>>>>>>> created by Spark, to be stored externally (e.g. S3 on AWS or WASB on Azure)?
>>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> View this message in context:
>>>>>>> http://apache-spark-user-list.1001560.n3.nabble.com/store-hive-metastore-on-persistent-store-tp22891.html
>>>>>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
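
For the original question at the bottom of the thread (putting the warehouse on external storage), this is the rough, untested sketch I am planning to try next. It assumes the Hadoop filesystem jars and credentials for S3/WASB are already on the classpath, and the bucket/container names below are placeholders I made up:

    scala> val hc = new org.apache.spark.sql.hive.HiveContext(sc)

    scala> // placeholder URI -- e.g. wasb://container@account.blob.core.windows.net/hive/warehouse on Azure
    scala> hc.setConf("hive.metastore.warehouse.dir", "s3n://my-bucket/hive/warehouse")

    scala> hc.getConf("hive.metastore.warehouse.dir")   // verify the value was picked up

    scala> hc.sql("CREATE TABLE test_tbl (id INT)")     // data for new managed tables should land under the warehouse dir

Putting the property in hive-site.xml before the metastore is first created is probably safer than calling setConf at runtime, since the default database location may already be recorded in the metastore by then.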