One more question. If the SemanticAnalyzer isn't fully thread safe, could you provide any pointers as to why it may not be thread safe? It's a 9000 line file, so any hints as to where to get started would be much appreciated. I don't see anything very obvious like globally shared member variables, so I'm guessing it's more subtle than that.

Thanks,
Kris

On 8/15/13 5:29 PM, "Kristopher Glover" <kglo...@appnexus.com> wrote:

>Thanks for all the great insight. I'll poke around a little more to see
>if I could at least start documenting the changes required to make
>everything thread safe as well as remove the synchronization.
>
>@Xuefu-
>I completely understand your points. I was just trying to figure out if
>there was a specific functional reason for making them public when there
>was a known vulnerability. For instance, why not synchronize the compile
>method itself instead of relying on external synchronization? From the
>sound of it there were no specific reasons, other than no one has gotten
>around to making the improvements yet. Maybe it'll be something I can
>contribute back.
>
>Thanks again,
>Kris
>
>Xuefu Zhang wrote:
>To add,
>
>1. Being public doesn't necessarily guarantee thread-safety. Of course,
>this is no excuse for not documenting thread-safety.
>2. Sometimes a method is made public for testing, which is bad in my
>opinion, but I saw many instances like this before.
>
>--Xuefu
>
>On Thu, Aug 15, 2013 at 1:11 PM, Brock Noland <br...@cloudera.com> wrote:
>
>> Well you would have probably found the areas we need to fix! :) The hive
>> source is not strict about methods and member visibility. The good news
>> is that we have been making significant improvements in this aspect.
>>
>> Brock
>>
>> On Thu, Aug 15, 2013 at 2:55 PM, Kristopher Glover
>> <kglo...@appnexus.com> wrote:
>>
>> > Interesting, I didn't realize that. If that's the case then I suppose
>> > it'd be really bad for me to circumvent the lock by reproducing the
>> > Driver#run method by calling Driver#compile and Driver#execute
>> > directly from within my app.
>> >
>> > If that is the case why make Driver#compile and Driver#execute public
>> > methods? There doesn't seem to be any inheritance that requires them
>> > to be public and the fact that they are public opens up a thread
>> > safety issue.
>> >
>> > Thanks,
>> > Kris
>> >
>> > On 8/15/13 1:11 PM, "Brock Noland" <br...@cloudera.com> wrote:
>> >
>> > >The hive semantic analyzer is not fully thread safe. We'd like to
>> > >remove that lock but it will be a large project.
>> > >
>> > >Brock
>> > >
>> > >On Thu, Aug 15, 2013 at 11:12 AM, Kristopher Glover
>> > ><kglo...@appnexus.com> wrote:
>> > >
>> > >> Hi Everyone,
>> > >>
>> > >> I'm experiencing a threading issue with the Hive client where I
>> > >> want to run multiple queries on the same JVM.
>> > >>
>> > >> The problem I'm having is that
>> > >> org.apache.hadoop.hive.ql.Driver#run (line 907) has the following
>> > >> few lines of code:
>> > >>
>> > >>     synchronized (compileMonitor) {
>> > >>       ret = compile(command);
>> > >>     }
>> > >>
>> > >> The compileMonitor is a static so it blocks all threads even though
>> > >> I'm using different instances of the Driver class. I could
>> > >> explicitly call Driver#compile then Driver#execute to avoid the
>> > >> synchronized block but I don't know if it's serving a special
>> > >> purpose. Does anyone know why that synchronized block is there and
>> > >> if it's really necessary?
>> > >>
>> > >> Thanks,
>> > >>
>> > >> Kris
>> > >
>> > >--
>> > >Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
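
For anyone following the thread, here is a minimal sketch of the behavior described in the original message: a static monitor serializes callers even when each thread has its own instance. This is not Hive code. FakeDriver, SHARED_MONITOR, fakeCompile and maxConcurrency are hypothetical names made up for the illustration, and the sleep only stands in for compilation work. It says nothing about whether removing the real lock would be safe.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

class StaticMonitorDemo {

    // Hypothetical stand-in for a driver class; NOT the Hive Driver.
    static class FakeDriver {
        // One monitor shared by every instance, like a static compileMonitor.
        private static final Object SHARED_MONITOR = new Object();
        // A per-instance monitor, for comparison only.
        private final Object instanceMonitor = new Object();

        private final AtomicInteger active;
        private final AtomicInteger maxActive;

        FakeDriver(AtomicInteger active, AtomicInteger maxActive) {
            this.active = active;
            this.maxActive = maxActive;
        }

        void runWithStaticLock() throws InterruptedException {
            synchronized (SHARED_MONITOR) {
                fakeCompile();
            }
        }

        void runWithInstanceLock() throws InterruptedException {
            synchronized (instanceMonitor) {
                fakeCompile();
            }
        }

        // Records how many threads are inside "compile" at the same time.
        private void fakeCompile() throws InterruptedException {
            int now = active.incrementAndGet();
            maxActive.accumulateAndGet(now, Math::max);
            Thread.sleep(100); // pretend to do some work
            active.decrementAndGet();
        }
    }

    // Runs one FakeDriver per thread and returns the peak number of
    // threads observed inside fakeCompile() simultaneously.
    static int maxConcurrency(boolean useStaticLock, int threads) {
        AtomicInteger active = new AtomicInteger();
        AtomicInteger maxActive = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            FakeDriver driver = new FakeDriver(active, maxActive);
            workers[i] = new Thread(() -> {
                try {
                    start.await(); // release all workers together
                    if (useStaticLock) {
                        driver.runWithStaticLock();
                    } else {
                        driver.runWithInstanceLock();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }
        start.countDown();
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return maxActive.get();
    }

    public static void main(String[] args) {
        // With the static monitor the peak is always 1, whatever the instance count.
        System.out.println("static lock peak:   " + maxConcurrency(true, 4));
        // With per-instance monitors the threads can overlap; the exact peak
        // depends on scheduling.
        System.out.println("instance lock peak: " + maxConcurrency(false, 4));
    }
}
```

The static-lock run never reports more than one thread inside the critical section, which matches the blocking seen across separate Driver instances. The per-instance variant only shows what the lock scope changes; it would only be safe if the code under the lock had no shared mutable state, which is exactly what is in question for the SemanticAnalyzer.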