In pyspark, 'sqlc' is the injected variable name at the moment.
So, the current workaround could be simply doing
'sqlContext = sqlc'
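
For example, in a notebook paragraph (the DataFrame contents below are just
illustrative):

%pyspark
sqlContext = sqlc  # alias Zeppelin's injected SQLContext
df = sqlContext.createDataFrame([(1, "a")], ["id", "value"])
df.registerTempTable("example")

Since 'sqlc' is the SQLContext Zeppelin shares with the %sql interpreter,
a table registered through it should also be visible there, unlike one
registered on a fresh SQLContext(sc).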

I think there's no good reason not to use sqlContext as the variable
name (while keeping sqlc, too, for backward compatibility).

It would be great if you could contribute. The related code is here:
https://github.com/apache/incubator-zeppelin/blob/master/spark/src/main/resources/python/zeppelin_pyspark.py#L66
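
The change could be as small as adding an alias where 'sqlc' is bound,
roughly like this (the surrounding code is elided; this is only a sketch,
not the exact file contents):

# ...existing code in zeppelin_pyspark.py that builds the shared
# SQLContext and binds it to 'sqlc'...
sqlContext = sqlc  # proposed: also expose the standard pyspark name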

Best,
moon

On Thu, Jun 25, 2015 at 6:30 AM Dafne van Kuppevelt <
dafnevankuppev...@gmail.com> wrote:

> Hi,
>
> I run into the problem that the 'global' sqlContext variable is not
> available in the pyspark interpreter.
>
> If I have, for example, the following code:
> %pyspark
> df = sqlContext.createDataFrame(...)
>
> I get the error:
> (<type 'exceptions.NameError'>, NameError("name 'sqlContext' is not
> defined",))
>
> When I add the sqlContext explicitly:
> from pyspark.sql import SQLContext
> sqlContext = SQLContext(sc)
>
> the df is created, but if I register it as a (temp) table, it is not
> available in the sql interpreter (or in the SQLContext in Scala)!
>
> If I do the same thing in Scala it works fine, for example if I run the
> example notebook with the 'bank' table.
>
> Some info about my environment:
> I'm running Spark in yarn-client mode; the spark.home and
> zeppelin.pyspark.python properties of the interpreter are set to
> Spark 1.3 and Python 2.7, respectively.
>
> Thanks in advance for your help,
>
> Dafne
>
