I'm on Spark 1.4, and I did set the second parameter. The DSL works fine, but the same function doesn't work when I try it in SQL.
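
For what it's worth, the TypeError in the quoted message below looks like plain Python behaviour rather than anything Spark-specific: the built-in pow(x, y) and x ** y both dispatch to x.__pow__(y), and Python raises exactly that message when the left operand doesn't define it. Here is a minimal sketch with no Spark involved. The Column and ColumnWithPow classes are toy stand-ins I made up for illustration, not the real pyspark classes, and I'm only assuming this is what happens inside the 1.3 Column.

```python
class Column(object):
    """Toy stand-in for a column expression. NOT the real pyspark Column."""

    def __init__(self, expr):
        self.expr = expr

    def __mul__(self, other):
        # Build a new expression instead of computing a value.
        return Column("(%s * %s)" % (self.expr, _expr(other)))

    def __add__(self, other):
        return Column("(%s + %s)" % (self.expr, _expr(other)))

    # Deliberately no __pow__ here, to reproduce the reported error.


def _expr(value):
    """Render either a toy Column or a plain Python literal as text."""
    return value.expr if isinstance(value, Column) else str(value)


class ColumnWithPow(Column):
    """Hypothetical variant that also overloads ** and the built-in pow()."""

    def __pow__(self, other):
        return ColumnWithPow("pow(%s, %s)" % (self.expr, _expr(other)))


age = Column("age")
print((age * age).expr)  # multiplication is overloaded, so this works
print((age + 1).expr)    # addition is overloaded, so this works

error_message = None
try:
    pow(age, 2)  # the built-in pow looks for Column.__pow__ and finds none
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# With __pow__ defined, both spellings build an expression.
print((ColumnWithPow("age") ** 2).expr)
print(pow(ColumnWithPow("age"), 2).expr)
```

So inside df.select(...) a bare pow is Python's built-in, not a Spark function, which would explain why * and + work while pow and ** do not. It doesn't explain the SQL side, though, which is the part I'm actually stuck on.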

On Mon, Jun 29, 2015, 4:32 PM Salih Oztop <[email protected]> wrote:

> Hi Bob,
> I tested your scenario with Spark 1.3 and I assumed you did not miss the
> second parameter of pow(x,y)
>
> from pyspark.sql import SQLContext
> sqlContext = SQLContext(sc)
> df = sqlContext.jsonFile("/vagrant/people.json")
> # Displays the content of the DataFrame to stdout
> df.show()
> # These are all fine
> df.select("name", (df.age)*(df.age)).show()
>
> name    (age * age)
> Michael null
> Andy    900
> Justin  361
>
> df.select("name", (df.age)+1).show()
>
> name    (age + 1)
> Michael null
> Andy    31
> Justin  20
>
> However the following tests give the same error.
>
> df.select("name", pow(df.age,2)).show()
>
> ---------------------------------------------------------------------------
> TypeError                                 Traceback (most recent call last)
> <ipython-input-27-ce7299d3ef76> in <module>()
> ----> 1 df.select("name", pow(df.age,2)).show()
>
> TypeError: unsupported operand type(s) for ** or pow(): 'Column' and 'int'
>
> df.select("name", (df.age)**2).show()
>
> ---------------------------------------------------------------------------
> TypeError                                 Traceback (most recent call last)
> <ipython-input-24-29540c3536bf> in <module>()
> ----> 1 df.select("name", (df.age)**2).show()
>
> TypeError: unsupported operand type(s) for ** or pow(): 'Column' and 'int'
>
> Moreover testing the functions individually they are working fine.
>
> pow(2,4)
>
> 16
>
> 2**4
>
> 16
>
> Kind Regards
> Salih Oztop
>
> ------------------------------
> *From:* Bob Corsaro <[email protected]>
> *To:* user <[email protected]>
> *Sent:* Monday, June 29, 2015 7:27 PM
> *Subject:* SparkSQL built in functions
>
> I'm having trouble using "select pow(col) from table" It seems the
> function is not registered for SparkSQL. Is this on purpose or an
> oversight? I'm using pyspark.
