1.4 and I did set the second parameter. The DSL works fine, but trying it out
with SQL doesn't.
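
For what it's worth, the TypeError in the quoted 1.3 tests looks like plain Python behaviour rather than anything SQL-specific: the built-in pow() and the ** operator both dispatch to the left operand's __pow__ method, and Python raises exactly that message when the type doesn't define one. Here is a minimal sketch with no Spark involved. FakeColumn and _expr are hypothetical stand-ins I made up for illustration; they are not the pyspark classes.

```python
class FakeColumn(object):
    """Hypothetical stand-in for a column expression; NOT the pyspark Column."""

    def __init__(self, expr):
        self.expr = expr

    def __mul__(self, other):
        # Build a new expression string instead of computing a value.
        return FakeColumn("(%s * %s)" % (self.expr, _expr(other)))

    def __add__(self, other):
        return FakeColumn("(%s + %s)" % (self.expr, _expr(other)))

    # Deliberately no __pow__: pow(col, 2) and col ** 2 have nothing to call.


def _expr(value):
    """Return the expression text for a FakeColumn, or str() for a literal."""
    return value.expr if isinstance(value, FakeColumn) else str(value)


age = FakeColumn("age")
print((age * age).expr)  # (age * age)
print((age + 1).expr)    # (age + 1)

try:
    pow(age, 2)          # built-in pow() looks for FakeColumn.__pow__
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

That would explain why * and + work in the DSL while pow() and ** fail, and why pow(2,4) on plain ints is fine. It says nothing about whether pow is registered on the SQL side.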

On Mon, Jun 29, 2015, 4:32 PM Salih Oztop <[email protected]> wrote:

> Hi Bob,
> I tested your scenario with Spark 1.3, and I assumed you did not miss the
> second parameter of pow(x,y).
>
> from pyspark.sql import SQLContext
> sqlContext = SQLContext(sc)
> df = sqlContext.jsonFile("/vagrant/people.json")
> # Displays the content of the DataFrame to stdout
> df.show()
> # These are all fine
> df.select("name", (df.age)*(df.age)).show()
>
> name    (age * age)
> Michael null
> Andy    900
> Justin  361
>
>
> df.select("name", (df.age)+1).show()
>
> name    (age + 1)
> Michael null
> Andy    31
> Justin  20
>
>
> However, the following tests both give the same error.
>
> df.select("name", pow(df.age,2)).show()
>
> ---------------------------------------------------------------------------
> TypeError                                 Traceback (most recent call last)
> <ipython-input-27-ce7299d3ef76> in <module>()
> ----> 1 df.select("name", pow(df.age,2)).show()
>
> TypeError: unsupported operand type(s) for ** or pow(): 'Column' and 'int'
>
>
> df.select("name", (df.age)**2).show()
>
> ---------------------------------------------------------------------------
> TypeError                                 Traceback (most recent call last)
> <ipython-input-24-29540c3536bf> in <module>()
> ----> 1 df.select("name", (df.age)**2).show()
>
> TypeError: unsupported operand type(s) for ** or pow(): 'Column' and 'int'
>
>
> Moreover, when tested individually on plain numbers, the functions work fine.
>
> pow(2,4)
>
> 16
>
> 2**4
>
> 16
>
>
>
> Kind Regards
> Salih Oztop
>
>   ------------------------------
>  *From:* Bob Corsaro <[email protected]>
> *To:* user <[email protected]>
> *Sent:* Monday, June 29, 2015 7:27 PM
> *Subject:* SparkSQL built in functions
>
> I'm having trouble using "select pow(col) from table". It seems the
> function is not registered for SparkSQL. Is this on purpose or an
> oversight? I'm using pyspark.
>
>
>
