I have figured it out.
As the code below shows, if the HiveContext hc is created in the MyActor
companion object and then used to create the database in response to a message,
it throws a NullPointerException. Creating the HiveContext inside the MyActor
class instead fixes this. I also tested the code with Thread in place of Actor;
the problem and the fix are similar.
Du
——
import akka.actor.{Actor, ActorRef, ActorSystem, Props}
import org.apache.spark.SparkContext
import org.apache.spark.sql.hive.HiveContext

abstract class MyMessage
case object CreateDB extends MyMessage

object MyActor {
  def init(_sc: SparkContext) = {
    if (actorSystem == null || actorRef == null) {
      actorSystem = ActorSystem("root")
      actorRef = actorSystem.actorOf(Props(new MyActor(_sc)), "myactor")
    }
    // Creating the context here, in the companion object, led to the NPE:
    //hc = new MyHiveContext(_sc)
  }

  def !(m: MyMessage) {
    actorRef ! m
  }

  //var hc: MyHiveContext = _
  private var actorSystem: ActorSystem = null
  private var actorRef: ActorRef = null
}

class MyActor(sc: SparkContext) extends Actor {
  // Creating the context inside the actor class is the fix.
  val hc = new MyHiveContext(sc)

  def receive: Receive = {
    case CreateDB => hc.createDB()
  }
}

class MyHiveContext(sc: SparkContext) extends HiveContext(sc) {
  def createDB() {...}
}
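
The Thread-based variant mentioned above is not included in this message. A
minimal sketch of what such a variant could look like follows. It uses only the
JDK, and FakeContext, WorkerMsg, Worker and WorkerDemo are hypothetical
stand-ins rather than the code that was actually tested. The point it
illustrates is that the context is built inside run(), so it is constructed and
used on the same thread.

```scala
import java.util.concurrent.LinkedBlockingQueue

// Hypothetical stand-in for MyHiveContext; it only records which thread built it.
class FakeContext {
  val createdOn: String = Thread.currentThread().getName
  def createDB(): String =
    s"createDB on ${Thread.currentThread().getName}, context built on $createdOn"
}

sealed trait WorkerMsg
case object CreateDb extends WorkerMsg
case object Stop extends WorkerMsg

class Worker extends Thread("worker") {
  private val inbox = new LinkedBlockingQueue[WorkerMsg]()
  @volatile var lastResult: String = ""

  // Called from any thread; the message is handled on the worker thread.
  def send(m: WorkerMsg): Unit = inbox.put(m)

  override def run(): Unit = {
    // Built inside run(), so construction and use happen on the same thread.
    val ctx = new FakeContext
    var running = true
    while (running) {
      inbox.take() match {
        case CreateDb => lastResult = ctx.createDB()
        case Stop     => running = false
      }
    }
  }
}

object WorkerDemo {
  def run(): String = {
    val w = new Worker
    w.start()
    w.send(CreateDb)
    w.send(Stop)
    w.join()
    w.lastResult
  }
}
```

Calling WorkerDemo.run() returns a string naming "worker" both as the thread
that handled the message and as the thread that built the context.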
From: "Chester @work" <[email protected]>
Date: Thursday, September 18, 2014 at 7:17 AM
To: Du Li <[email protected]>
Cc: Michael Armbrust <[email protected]>, "Cheng, Hao" <[email protected]>,
"[email protected]" <[email protected]>
Subject: Re: problem with HiveContext inside Actor
Akka actors are managed by a thread pool, so the same actor can run on
different threads.
If you create the HiveContext in the actor, is it possible that you are
essentially creating different instances of HiveContext?
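
Why the executing thread can matter is easiest to see with per-thread state.
The sketch below assumes, as a guess that this thread does not confirm, that
some of the state involved is held in a thread-local. It uses only
java.lang.ThreadLocal, and FakeSession and ThreadLocalDemo are hypothetical
names, not Hive or Spark classes.

```scala
// Hypothetical stand-in for per-thread session state; not Hive's actual class.
object FakeSession {
  // A ThreadLocal without an initial value reads as null on every thread
  // that has not called set() itself.
  private val current = new ThreadLocal[String]()
  def start(conf: String): Unit = current.set(conf)
  def get: String = current.get()
}

object ThreadLocalDemo {
  // Runs body on a fresh thread and returns its result to the caller.
  def onOtherThread[A](body: => A): A = {
    var result: Option[A] = None
    val t = new Thread(new Runnable {
      def run(): Unit = { result = Some(body) }
    })
    t.start()
    t.join()
    result.get
  }

  def run(): (String, String, String) = {
    FakeSession.start("main-conf")                      // set on the calling thread only
    val seenHere = FakeSession.get                      // value set above
    val seenElsewhere = onOtherThread(FakeSession.get)  // null: never set on that thread
    val fixed = onOtherThread {                         // set and read on the same thread
      FakeSession.start("worker-conf")
      FakeSession.get
    }
    (seenHere, seenElsewhere, fixed)
  }
}
```

The second value is null because the other thread never initialized its own
copy, which is the kind of failure a NullPointerException on a conf field would
be consistent with.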
On Sep 17, 2014, at 10:14 PM, Du Li <[email protected]> wrote:
Thanks for your reply.
Michael: No. I only create one HiveContext in the code.
Hao: Yes. I subclass HiveContext and define my own function to create a
database, and then subclass the Akka Actor to call that function in response to
an abstract message. Following your suggestion, I called
println(sessionState.getConf.getAllProperties), which printed
tons of properties; however, the same NullPointerException was still thrown.
As mentioned, the weird thing is that everything works fine if I simply call
actor.hiveContext.createDB() directly. But it throws the NullPointerException
from Driver.java if I do "actor ! CreateSomeDB", which seems to me to be the
same thing, because the actor does nothing but call createDB().
Du
From: Michael Armbrust <[email protected]>
Date: Wednesday, September 17, 2014 at 7:40 PM
To: "Cheng, Hao" <[email protected]>
Cc: Du Li <[email protected]>, "[email protected]" <[email protected]>
Subject: Re: problem with HiveContext inside Actor
- dev
Is it possible that you are constructing more than one HiveContext in a single
JVM? Due to global state in Hive code this is not allowed.
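
One common way to guarantee a single instance per JVM is a holder object that
creates the instance on first request and returns the same one afterwards. The
sketch below shows the pattern with a hypothetical FakeGlobalContext standing
in for HiveContext; SingleContextHolder and getOrCreate are illustrative names,
not part of Spark.

```scala
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical stand-in for a context that relies on process-wide state.
class FakeGlobalContext(val name: String) {
  FakeGlobalContext.created.incrementAndGet()
}

object FakeGlobalContext {
  // Counts how many instances have been constructed in this JVM.
  val created = new AtomicInteger(0)
}

// Holds the single instance; later calls ignore their argument and
// return the instance built by the first call.
object SingleContextHolder {
  private var instance: Option[FakeGlobalContext] = None

  def getOrCreate(name: String): FakeGlobalContext = synchronized {
    if (instance.isEmpty) instance = Some(new FakeGlobalContext(name))
    instance.get
  }
}
```

With this holder, SingleContextHolder.getOrCreate("first") followed by
SingleContextHolder.getOrCreate("second") returns the same object both times,
and only one FakeGlobalContext is ever constructed.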
Michael
On Wed, Sep 17, 2014 at 7:21 PM, Cheng, Hao <[email protected]> wrote:
Hi, Du
I am not sure what you mean by "triggers the HiveContext to create a database".
Did you create a subclass of HiveContext? Just be sure to call
"HiveContext.sessionState" eagerly, since it sets the proper "hiveconf" into the
SessionState; otherwise the HiveDriver will always get a null value when
retrieving the HiveConf.
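
To illustrate what "eagerly" means here, the sketch below contrasts a Scala
lazy val that is first touched by whichever thread happens to use it with one
that is forced in the constructor. It assumes the field in question is lazily
initialized; LazyHolder, EagerHolder and LazyDemo are hypothetical names, and
the thread name merely stands in for whatever the initializer sets up.

```scala
// The lazy val is initialized by the first thread that reads it.
class LazyHolder {
  lazy val state: String = Thread.currentThread().getName
}

// Reading the lazy val in the constructor forces initialization
// on the thread that constructs the object.
class EagerHolder {
  lazy val state: String = Thread.currentThread().getName
  val initializedOn: String = state
}

object LazyDemo {
  // Runs body on a fresh thread with the given name and returns its result.
  def onThread[A](name: String)(body: => A): A = {
    var result: Option[A] = None
    val t = new Thread(new Runnable {
      def run(): Unit = { result = Some(body) }
    }, name)
    t.start()
    t.join()
    result.get
  }

  def run(): (String, String) = {
    val lazyHolder = onThread("builder")(new LazyHolder)
    val eagerHolder = onThread("builder")(new EagerHolder)
    val lazySeen = onThread("user")(lazyHolder.state)   // initialized on "user"
    val eagerSeen = onThread("user")(eagerHolder.state) // already set on "builder"
    (lazySeen, eagerSeen)
  }
}
```

LazyDemo.run() returns ("user", "builder"): the lazy field was initialized by
the thread that used it, the eager one by the thread that built the object.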
Cheng Hao
From: Du Li [mailto:[email protected]]
Sent: Thursday, September 18, 2014 7:51 AM
To: [email protected]; [email protected]
Subject: problem with HiveContext inside Actor
Hi,
I wonder whether anybody has had a similar experience or has any suggestions
here.
I have an Akka Actor that processes database requests sent as high-level
messages. Inside this Actor, it creates a HiveContext object that does the
actual db work. The main thread creates the needed SparkContext and passes it
in to the Actor to create the HiveContext.
When a message is sent to the Actor, it is processed properly, except that when
the message triggers the HiveContext to create a database, it
throws a NullPointerException in hive.ql.Driver.java, which suggests that its
conf variable is not initialized.
Oddly, it works fine if my main thread directly calls actor.hiveContext to
create the database. The Spark version is 1.1.0.
Thanks,
Du