Hi, tested with Spark 1.4.
We need to import pow from pyspark.sql.functions; otherwise Python's built-in pow is used, I guess.
>>> from pyspark.sql.functions import pow
>>> df.select(pow(df.age,df.age)).show()
15/06/29 22:36:05 INFO Ta
+--------------------+
|     POWER(age, age)|
+--------------------+
|                null|
| 2.05891132094649E44|
|1.978419655660313...|
+--------------------+
>>> df.select(pow(df.age,2)).show()
+---------------+
|POWER(age, 2.0)|
+---------------+
|           null|
|          900.0|
|          361.0|
+---------------+
Kind Regards
Salih Oztop

      From: Krishna Sankar <[email protected]>
 To: Bob Corsaro <[email protected]> 
Cc: Salih Oztop <[email protected]>; user <[email protected]> 
 Sent: Monday, June 29, 2015 9:52 PM
 Subject: Re: SparkSQL built in functions
   
Interesting. Looking at the definitions, sql.functions.pow is defined only for 
(col,col). Just as an experiment, create a column with value 2 and see if that 
works.
Cheers
<k/>


On Mon, Jun 29, 2015 at 1:34 PM, Bob Corsaro <[email protected]> wrote:

I'm on 1.4, and I did set the second parameter. The DSL works fine, but trying 
the same thing with SQL doesn't.
On Mon, Jun 29, 2015, 4:32 PM Salih Oztop <[email protected]> wrote:

Hi Bob,
I tested your scenario with Spark 1.3 and I assumed you did not miss the 
second parameter of pow(x,y).
from pyspark.sql import SQLContext
sqlContext = SQLContext(sc)
df = sqlContext.jsonFile("/vagrant/people.json")
# Displays the content of the DataFrame to stdout
df.show()
# These are all fine
df.select("name", (df.age)*(df.age)).show()
name    (age * age)
Michael null       
Andy    900        
Justin  361  
df.select("name", (df.age)+1).show()
name    (age + 1)
Michael null     
Andy    31       
Justin  20
However, the following tests give the same error.
df.select("name", pow(df.age,2)).show()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-27-ce7299d3ef76> in <module>()
----> 1 df.select("name", pow(df.age,2)).show()

TypeError: unsupported operand type(s) for ** or pow(): 'Column' and 'int'

df.select("name", (df.age)**2).show()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-24-29540c3536bf> in <module>()
----> 1 df.select("name", (df.age)**2).show()

TypeError: unsupported operand type(s) for ** or pow(): 'Column' and 'int'
Moreover, testing the functions individually, they work fine.
pow(2,4)
16
2**4
16
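
For what it's worth, the TypeError above is the message Python's built-in pow (and the ** operator) gives for any object that does not define __pow__. Here is a minimal sketch in plain Python, with no Spark involved. FakeColumn and the module-level pow below are hypothetical stand-ins, not pyspark's Column or pyspark.sql.functions.pow; they only illustrate how a name called pow in the current namespace shadows the built-in.

```python
import builtins


class FakeColumn:
    """Hypothetical stand-in for a column type that defines no __pow__."""

    def __init__(self, name):
        self.name = name


def describe_failure(op):
    """Run op() and return the TypeError message, or None if it succeeds."""
    try:
        op()
    except TypeError as exc:
        return str(exc)
    return None


age = FakeColumn("age")

# The built-in pow and the ** operator both look for __pow__ on the operand,
# so both fail the same way for a type that lacks it.
builtin_msg = describe_failure(lambda: builtins.pow(age, 2))
operator_msg = describe_failure(lambda: age ** 2)


# A module-level name called pow shadows the built-in from here on.
# This toy version (an assumption for illustration) just builds a label.
def pow(col, exponent):
    return "POWER({}, {})".format(col.name, float(exponent))


shadowed = pow(age, 2)

print(builtin_msg)
print(operator_msg)
print(shadowed)
```

Once a library-provided pow is imported into the session, the same call expression resolves to that function instead of the built-in, which would be consistent with the different behaviour seen between the two sessions in this thread.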

Kind Regards
Salih Oztop
      From: Bob Corsaro <[email protected]>
 To: user <[email protected]> 
 Sent: Monday, June 29, 2015 7:27 PM
 Subject: SparkSQL built in functions
   
I'm having trouble using "select pow(col) from table". It seems the function is 
not registered for SparkSQL. Is this on purpose or an oversight? I'm using 
pyspark.

 




  
