In pyspark, 'sqlc' is the injected variable name at the moment. So the current workaround could be simply doing 'sqlContext = sqlc'.
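For example, in a pyspark paragraph (a minimal sketch; 'sc' and 'sqlc' are the variables Zeppelin injects into pyspark, and the sample data is just illustrative):

%pyspark
from pyspark.sql import Row

# alias the injected SQLContext so code written against 'sqlContext' runs unchanged
sqlContext = sqlc
df = sqlContext.createDataFrame(sc.parallelize([Row(id=1), Row(id=2)]))
df.show()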
I think there's no good reason not to use sqlContext as a variable name (keeping sqlc, too, for backward compatibility). It would be great if you could contribute. The related code is here:
https://github.com/apache/incubator-zeppelin/blob/master/spark/src/main/resources/python/zeppelin_pyspark.py#L66

Best,
moon

On Thu, Jun 25, 2015 at 6:30 AM Dafne van Kuppevelt <dafnevankuppev...@gmail.com> wrote:

> Hi,
>
> I run into the problem that the 'global' sqlContext variable is not
> available in the pyspark interpreter.
>
> If I have, for example, the following code:
> %pyspark
> df = sqlContext.createDataFrame(...)
>
> I get the error:
> (<type 'exceptions.NameError'>, NameError("name 'sqlContext' is not
> defined",))
>
> When I add the sqlContext explicitly:
> from pyspark.sql import SQLContext
> sqlContext = SQLContext(sc)
>
> the df will be created, but if I register it as a (temp) table, it is not
> available in the sql interpreter! (Or in the SQLContext in Scala.)
>
> If I do the same thing in Scala it works fine, for example if I run the
> example notebook with the 'bank' table.
>
> Some info about my environment:
> I'm running Spark in yarn-client mode; the spark.home and
> zeppelin.pyspark.python properties of the interpreter are set, to
> Spark 1.3 and Python 2.7 respectively.
>
> Thanks in advance for your help,
>
> Dafne
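To illustrate the temp table issue described above: tables registered on the injected 'sqlc' should be visible to the %sql interpreter, while tables registered on a hand-built SQLContext live in that context's own registry. A sketch, assuming Spark 1.3 (table and variable names are illustrative):

%pyspark
from pyspark.sql import Row, SQLContext

# registered on the injected, shared context: visible to %sql
sqlContext = sqlc
people = sqlContext.createDataFrame(sc.parallelize([Row(name="a"), Row(name="b")]))
people.registerTempTable("people")

# registered on a freshly constructed SQLContext: kept in that context's own
# temp table registry, so the %sql interpreter cannot see it
myContext = SQLContext(sc)
other = myContext.createDataFrame(sc.parallelize([Row(name="c")]))
other.registerTempTable("hidden")

After running that paragraph, a '%sql select * from people' paragraph should work, while 'select * from hidden' should fail with a table-not-found error.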