Is it possible to have a UDF which takes a variable number of arguments?

e.g. df.select(myUdf($"*")) fails with

org.apache.spark.sql.AnalysisException: unresolved operator 'Project
[scalaUDF(*) AS scalaUDF(*)#26];

What I would like to do is pass a generic DataFrame to a UDF that scores
a model. The UDF needs to know the schema so it can map column names in
the model to columns in the DataFrame.

The model has hundreds of factors (very wide), so I can't just have a
scoring UDF that takes 500 parameters (for obvious reasons).
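
To make the ask concrete, here is a rough sketch of the shape I have in
mind. It assumes all columns can be packed into a single struct column and
that the UDF can receive that struct as a Row and look fields up by name; I
have not confirmed this is the intended route. It also assumes a Spark
version that has SparkSession, which may be newer than the one producing the
error above. The names WideScoringSketch, weights, score, x1 and x2 are
made up for illustration.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.functions.{col, struct, udf}

object WideScoringSketch {
  // Hypothetical model: column name -> coefficient.
  // A real model with hundreds of factors would be loaded from elsewhere.
  val weights: Map[String, Double] = Map("x1" -> 0.5, "x2" -> 2.0)

  // Look each factor up by column name in the Row and sum the weighted values.
  // Assumes every model column exists in the Row and holds a Double.
  def score(row: Row): Double =
    weights.iterator.map { case (name, w) => w * row.getAs[Double](name) }.sum

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("wide-scoring-sketch")
      .getOrCreate()
    import spark.implicits._

    val df = Seq((1.0, 2.0), (3.0, 4.0)).toDF("x1", "x2")

    // One UDF argument (a struct of all columns) instead of one per column.
    val scoreUdf = udf((row: Row) => score(row))
    val scored = df.withColumn("score", scoreUdf(struct(df.columns.map(col): _*)))

    scored.show()
    spark.stop()
  }
}
```

The point of the struct is that the UDF signature stays at one argument no
matter how wide the DataFrame is, and the field names travel with the Row.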

Cheers,
~N
