Sojan James created ZEPPELIN-1411:
-------------------------------------

             Summary: UDF with pyspark not working
                 Key: ZEPPELIN-1411
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-1411
             Project: Zeppelin
          Issue Type: Bug
          Components: python-interpreter
    Affects Versions: 0.6.1
            Reporter: Sojan James


The following PySpark UDF example fails in Zeppelin 0.6.1. The call to udf() raises an AttributeError before the DataFrame is created.

{code}
from pyspark.sql.types import StringType
from pyspark.sql.functions import udf

maturity_udf = udf(lambda age: "adult" if age >= 18 else "child", StringType())

df = sqlContext.createDataFrame([{'name': 'Alice', 'age': 1}])
df.withColumn("maturity", maturity_udf(df.age))
{code}

Stack trace:
{code}
Traceback (most recent call last):
  File "/tmp/zeppelin_pyspark-64075962331083004.py", line 266, in <module>
    raise Exception(traceback.format_exc())
Exception: Traceback (most recent call last):
  File "/tmp/zeppelin_pyspark-64075962331083004.py", line 259, in <module>
    exec(code)
  File "<stdin>", line 3, in <module>
  File "/home/sjames/zeppelin/zeppelin-0.6.1-bin-all/interpreter/spark/pyspark/pyspark.zip/pyspark/sql/functions.py", line 1789, in udf
    return UserDefinedFunction(f, returnType)
  File "/home/sjames/zeppelin/zeppelin-0.6.1-bin-all/interpreter/spark/pyspark/pyspark.zip/pyspark/sql/functions.py", line 1751, in __init__
    self._judf = self._create_judf(name)
  File "/home/sjames/zeppelin/zeppelin-0.6.1-bin-all/interpreter/spark/pyspark/pyspark.zip/pyspark/sql/functions.py", line 1758, in _create_judf
    jdt = ctx._ssql_ctx.parseDataType(self.returnType.json())
AttributeError: 'JavaMember' object has no attribute 'parseDataType'
{code}
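
The last frame suggests that ctx._ssql_ctx holds a py4j JavaMember (a handle to a Java method) rather than the JavaObject for the SQL context. In py4j, any attribute lookup on a JavaObject returns a JavaMember, so this could happen if a method was referenced somewhere without being called. That is only a guess at the cause. The sketch below reproduces the same failure mode in plain Python. FakeJavaObject, FakeJavaMember, interpreter and getSQLContext are hypothetical stand-ins, not py4j or Zeppelin code.

```python
class FakeJavaMember:
    """Hypothetical stand-in for py4j's JavaMember: a callable method handle."""

    def __init__(self, name, target):
        self.name = name
        self.target = target

    def __call__(self, *args):
        # Calling the handle invokes the underlying method.
        return self.target(*args)


class FakeJavaObject:
    """Hypothetical stand-in for py4j's JavaObject."""

    def __init__(self, methods):
        self._methods = methods

    def __getattr__(self, name):
        # Any attribute lookup yields a method handle, as in py4j.
        return FakeJavaMember(name, self._methods.get(name))


# Assumed setup: an interpreter object that hands out the SQL context.
sql_context = FakeJavaObject({"parseDataType": lambda s: "DataType(" + s + ")"})
interpreter = FakeJavaObject({"getSQLContext": lambda: sql_context})

wrong = interpreter.getSQLContext    # method handle, never called
right = interpreter.getSQLContext()  # the object returned by the call


def describe(ssql_ctx):
    """Mimic the failing line: ctx._ssql_ctx.parseDataType(...)."""
    try:
        return ssql_ctx.parseDataType('"string"')
    except AttributeError as exc:
        return "AttributeError: " + str(exc)


print(describe(right))
print(describe(wrong))
```

If this is what happens here, the second call fails with the same shape of message as the trace above, and the fix would belong in the code that builds the sqlContext rather than in the UDF itself.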




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
