Greetings,

I've been trying to migrate some code from Scala (Spark 2.x) to
PySpark 3.0.1. Part of the software includes a User-Defined Aggregate
Function (UDAF), which poses a two-fold problem:

   - The UserDefinedAggregateFunction abstract class is deprecated in Spark
   >= 3.0 in favor of the Aggregator abstract class
   <https://spark.apache.org/docs/latest/sql-ref-functions-udf-aggregate.html>.
   - PySpark doesn't implement UDAFs. It only has, as far as I can tell, a
   registerJavaUDAF function
   <https://spark.apache.org/docs/latest/api/python/pyspark.sql.html?highlight=udaf#pyspark.sql.UDFRegistration.registerJavaUDAF>.

I thus reimplemented the UserDefinedAggregateFunction as an Aggregator, and
attempted to use registerJavaUDAF() to register the Scala code in PySpark.
However, I'm met with the following exception:

    AnalysisException: class <class-name> doesn't implement interface UserDefinedAggregateFunction;
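
For reference, here is a minimal sketch of the kind of registration call I mean on the Python side. The class name com.example.MyAggregator, the SQL name my_agg, and the two helper functions are placeholders for illustration, not the real names; the sketch assumes the jar containing the Scala class is already on the classpath (for example via --jars).

```python
def register_scala_udaf(spark, sql_name, jvm_class_name):
    """Register a JVM aggregate class under sql_name for use in Spark SQL.

    spark is expected to be an active pyspark.sql.SparkSession.
    jvm_class_name is the fully qualified name of the Scala/Java class
    (hypothetical here). In Spark 3.0.1 this call raises AnalysisException
    unless the class extends UserDefinedAggregateFunction.
    """
    spark.udf.registerJavaUDAF(sql_name, jvm_class_name)
    return sql_name


def build_query(sql_name, column, table):
    """Build the SQL text that would invoke the registered aggregate."""
    return f"SELECT {sql_name}({column}) AS result FROM {table}"


# Intended usage (requires pyspark and the jar, so left as comments):
#   from pyspark.sql import SparkSession
#   spark = SparkSession.builder.getOrCreate()
#   name = register_scala_udaf(spark, "my_agg", "com.example.MyAggregator")
#   spark.sql(build_query(name, "value", "events")).show()
```

With the class rewritten as an Aggregator, it is the registerJavaUDAF call inside register_scala_udaf that fails with the exception above.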


I'm able to use the old class implementing the UserDefinedAggregateFunction
abstract class, but keeping legacy code isn't desirable. My understanding of
the documentation led me to believe PySpark would be able to register a Scala
UDAF implementing the Aggregator abstract class. I feel this is a bug, or
at least a sign that the documentation is outdated. It also means there is,
as far as I can tell, no way to use a UDAF natively with PySpark >= 3.0.

Am I missing a solution?

Regards,
G. Dugernier
