Thanks Yin, here are the logs:
INFO SparkContext - Added JAR file:/home/jegreen1/mms/zookeeper-3.4.6.jar at http://10.39.65.122:38933/jars/zookeeper-3.4.6.jar with timestamp 1453907484092
INFO SparkContext - Added JAR file:/home/jegreen1/mms/mms-http-0.2-SNAPSHOT.jar at http://10.39.65.122:38933/jars/mms-http-0.2-SNAPSHOT.jar with timestamp 1453907484093
INFO Executor - Starting executor ID driver on host localhost
INFO Utils - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 41220.
INFO NettyBlockTransferService - Server created on 41220
INFO BlockManagerMaster - Trying to register BlockManager
INFO BlockManagerMasterEndpoint - Registering block manager localhost:41220 with 511.1 MB RAM, BlockManagerId(driver, localhost, 41220)
INFO BlockManagerMaster - Registered BlockManager
INFO HiveContext - Initializing execution hive, version 1.2.1
INFO ClientWrapper - Inspected Hadoop version: 2.6.0
INFO ClientWrapper - Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.6.0
WARN HiveConf - HiveConf of name hive.enable.spark.execution.engine does not exist
INFO HiveMetaStore - 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
INFO ObjectStore - ObjectStore, initialize called
INFO Persistence - Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
INFO Persistence - Property datanucleus.cache.level2 unknown - will be ignored
WARN HiveConf - HiveConf of name hive.enable.spark.execution.engine does not exist
INFO ObjectStore - Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
INFO Datastore - The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
INFO Datastore - The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
INFO Datastore - The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
INFO Datastore - The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
INFO MetaStoreDirectSql - Using direct SQL, underlying DB is DERBY
INFO ObjectStore - Initialized ObjectStore
WARN ObjectStore - Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
WARN ObjectStore - Failed to get database default, returning NoSuchObjectException
INFO HiveMetaStore - Added admin role in metastore
INFO HiveMetaStore - Added public role in metastore
INFO HiveMetaStore - No user is added in admin role, since config is empty
INFO HiveMetaStore - 0: get_all_databases
INFO audit - ugi=jegreen1 ip=unknown-ip-addr cmd=get_all_databases
INFO HiveMetaStore - 0: get_functions: db=default pat=*
INFO audit - ugi=jegreen1 ip=unknown-ip-addr cmd=get_functions: db=default pat=*
INFO Datastore - The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
WARN NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
INFO SessionState - Created local directory: /tmp/9b102c97-c3f4-4d92-b722-0a2e257d3b5b_resources
INFO SessionState - Created HDFS directory: /tmp/hive/jegreen1/9b102c97-c3f4-4d92-b722-0a2e257d3b5b
INFO SessionState - Created local directory: /tmp/jegreen1/9b102c97-c3f4-4d92-b722-0a2e257d3b5b
INFO SessionState - Created HDFS directory: /tmp/hive/jegreen1/9b102c97-c3f4-4d92-b722-0a2e257d3b5b/_tmp_space.db
WARN HiveConf - HiveConf of name hive.enable.spark.execution.engine does not exist
INFO HiveContext - default warehouse location is /user/hive/warehouse
INFO HiveContext - Initializing HiveMetastoreConnection version 1.2.1 using Spark classes.
INFO ClientWrapper - Inspected Hadoop version: 2.6.0
INFO ClientWrapper - Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.6.0
WARN HiveConf - HiveConf of name hive.enable.spark.execution.engine does not exist
INFO metastore - Trying to connect to metastore with URI thrift://dkclusterm2.imp.net:9083
INFO metastore - Connected to metastore.
INFO SessionState - Created local directory: /tmp/7e230580-37af-47d3-81cc-eb4829b8da62_resources
INFO SessionState - Created HDFS directory: /tmp/hive/jegreen1/7e230580-37af-47d3-81cc-eb4829b8da62
INFO SessionState - Created local directory: /tmp/jegreen1/7e230580-37af-47d3-81cc-eb4829b8da62
INFO SessionState - Created HDFS directory: /tmp/hive/jegreen1/7e230580-37af-47d3-81cc-eb4829b8da62/_tmp_space.db
INFO ParquetRelation - Listing hdfs://dkclusterm1.imp.net:8020/user/jegreen1/ex208 on driver
INFO SparkContext - Starting job: parquet at ThriftTest.scala:39
INFO DAGScheduler - Got job 0 (parquet at ThriftTest.scala:39) with 32 output partitions
INFO DAGScheduler - Final stage: ResultStage 0 (parquet at ThriftTest.scala:39)
INFO DAGScheduler - Parents of final stage: List()
INFO DAGScheduler - Missing parents: List()
INFO DAGScheduler - Submitting ResultStage 0 (MapPartitionsRDD[1] at parquet at ThriftTest.scala:39), which has no missing parents
INFO MemoryStore - Block broadcast_0 stored as values in memory (estimated size 65.5 KB, free 65.5 KB)
INFO MemoryStore - Block broadcast_0_piece0 stored as bytes in memory (estimated size 22.9 KB, free 88.3 KB)
INFO BlockManagerInfo - Added broadcast_0_piece0 in memory on localhost:41220 (size: 22.9 KB, free: 511.1 MB)
INFO SparkContext - Created broadcast 0 from broadcast at DAGScheduler.scala:1006
INFO DAGScheduler - Submitting 32 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at parquet at ThriftTest.scala:39)
INFO TaskSchedulerImpl - Adding task set 0.0 with 32 tasks
INFO TaskSetManager - Starting task 0.0 in stage 0.0 (TID 0, localhost, partition 0,PROCESS_LOCAL, 6528 bytes)
INFO TaskSetManager - Starting task 1.0 in stage 0.0 (TID 1, localhost, partition 1,PROCESS_LOCAL, 6528 bytes)
INFO TaskSetManager - Starting task 2.0 in stage 0.0 (TID 2, localhost, partition 2,PROCESS_LOCAL, 6528 bytes)
INFO TaskSetManager - Starting task 3.0 in stage 0.0 (TID 3, localhost, partition 3,PROCESS_LOCAL, 6528 bytes)
INFO TaskSetManager - Starting task 4.0 in stage 0.0 (TID 4, localhost, partition 4,PROCESS_LOCAL, 6528 bytes)
INFO TaskSetManager - Starting task 5.0 in stage 0.0 (TID 5, localhost, partition 5,PROCESS_LOCAL, 6528 bytes)

From: Yin Huai [mailto:yh...@databricks.com]
Sent: 26 January 2016 17:48
To: Green, James (UK Guildford)
Cc: dev@spark.apache.org
Subject: Re: spark hivethriftserver problem on 1.5.0 -> 1.6.0 upgrade

Can you post more logs, especially the lines around "Initializing execution hive ..." (this is for an internally used fake metastore, and it is derby) and "Initializing HiveMetastoreConnection version ..." (this is for the real metastore; it should be your remote one)?

Also, those temp tables are stored in memory and are associated with a HiveContext. If you cannot see temp tables, it usually means that the HiveContext used for JDBC was different from the one used to create the temp table. However, in your case you are using HiveThriftServer2.startWithContext(hiveContext), so it will be good to see more logs and work out what happened.
Thanks,
Yin

On Tue, Jan 26, 2016 at 1:33 AM, james.gre...@baesystems.com <james.gre...@baesystems.com> wrote:

Hi

I posted this on the user list yesterday; I am posting it here now because, on further investigation, I am pretty sure this is a bug.

On upgrade from 1.5.0 to 1.6.0 I have a problem with the HiveThriftServer2. I have this code:

val hiveContext = new HiveContext(SparkContext.getOrCreate(conf))
val thing = hiveContext.read.parquet("hdfs://dkclusterm1.imp.net:8020/user/jegreen1/ex208")
thing.registerTempTable("thing")
HiveThriftServer2.startWithContext(hiveContext)

When I start things up on the cluster my hive-site.xml is found - I can see that the metastore connects:

INFO metastore - Trying to connect to metastore with URI thrift://dkclusterm2.imp.net:9083
INFO metastore - Connected to metastore.

But later on the thrift server seems not to connect to the remote hive metastore, and instead starts a derby instance:

INFO AbstractService - Service:CLIService is started.
INFO ObjectStore - ObjectStore, initialize called
INFO Query - Reading in results for query "org.datanucleus.store.rdbms.query.SQLQuery@0" since the connection used is closing
INFO MetaStoreDirectSql - Using direct SQL, underlying DB is DERBY
INFO ObjectStore - Initialized ObjectStore
INFO HiveMetaStore - 0: get_databases: default
INFO audit - ugi=jegreen1 ip=unknown-ip-addr cmd=get_databases: default
INFO HiveMetaStore - 0: Shutting down the object store...
INFO audit - ugi=jegreen1 ip=unknown-ip-addr cmd=Shutting down the object store...
INFO HiveMetaStore - 0: Metastore shutdown complete.
INFO audit - ugi=jegreen1 ip=unknown-ip-addr cmd=Metastore shutdown complete.
INFO AbstractService - Service:ThriftBinaryCLIService is started.
INFO AbstractService - Service:HiveServer2 is started.

On 1.5.0 the same bit of the log reads:

INFO AbstractService - Service:CLIService is started.
INFO metastore - Trying to connect to metastore with URI thrift://dkclusterm2.imp.net:9083    ******* i.e. 1.5.0 connects to remote hive
INFO metastore - Connected to metastore.
INFO AbstractService - Service:ThriftBinaryCLIService is started.
INFO AbstractService - Service:HiveServer2 is started.
INFO ThriftCLIService - Starting ThriftBinaryCLIService on port 10000 with 5...500 worker threads

So if I connect to this with JDBC I can see all the tables on the hive server - but not any temporary tables; I guess they are going to derby. I see someone on the Databricks website is also having this problem.

Thanks

James

Please consider the environment before printing this email. This message should be regarded as confidential. If you have received this email in error please notify the sender and destroy it immediately. Statements of intent shall only become binding when confirmed in hard copy by an authorised signatory. The contents of this email may relate to dealings with other companies under the control of BAE Systems Applied Intelligence Limited, details of which can be found at http://www.baesystems.com/Businesses/index.htm.
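[For readers following the thread, here is a minimal sketch of the pattern under discussion: registering an in-memory temp table and starting the thrift server against the same HiveContext, which is what makes the temp table visible over JDBC. The object name and SparkConf settings are illustrative; the HDFS path follows James's snippet. This is a sketch assuming Spark 1.6.x with spark-hive and spark-hive-thriftserver on the classpath, not a definitive reproduction.]

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2

// Hypothetical driver object; illustrates the startWithContext pattern only.
object ThriftTempTableSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("thrift-temp-table")
    val sc = SparkContext.getOrCreate(conf)
    val hiveContext = new HiveContext(sc)

    // The temp table lives in memory and is tied to THIS HiveContext.
    val thing = hiveContext.read.parquet(
      "hdfs://dkclusterm1.imp.net:8020/user/jegreen1/ex208")
    thing.registerTempTable("thing")

    // Pass the same context; a thrift server holding a different
    // HiveContext would not see "thing" over JDBC.
    HiveThriftServer2.startWithContext(hiveContext)
  }
}
```

The point Yin makes above is that temp-table visibility hinges on context identity: if the thrift server internally constructs a second HiveContext (e.g. one backed by the local derby execution metastore rather than the remote thrift metastore), JDBC clients will see the permanent Hive tables but not the temp table.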