Good day,

I have been attempting to submit a job to a session cluster. The job
involves a pair of dynamic tables and a SQL query. The query calls a UDF
that I register via the Table API's createTemporarySystemFunction()
method. The job runs locally, but when I submit it to a remote session
cluster, it fails with the error:

`Cannot load user class: <fully-qualified-class-name>`

If I place a fat jar containing all of my local dependencies on the
JobManagers and TaskManagers, the UDF is loaded. However, I would
expect the UDF to be serialized and sent with the rest of the job. I have
looked over the UDF documentation and don't see a reason why it would
not be. Since there is no error related to serializing the UDF, though, my
assumptions about UDF serialization must be incorrect. Is there a hint I
can use to make the closure cleaner identify the UDF for serialization? I
suspect it is not being included because it is referenced only in the SQL
query, and not in the streams feeding the input table or the stream
consuming the output table.
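
To make the distinction I may be missing concrete: plain Java
serialization writes an object's class name and field values, not the
class bytecode, so the receiving JVM must already be able to load the
class. The sketch below uses only the JDK to show that behavior. MyUdf,
UdfSerializationSketch, and the restricted class loader are hypothetical
stand-ins for illustration, not Flink's actual code path, and I am only
assuming that the cluster-side failure is analogous.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class UdfSerializationSketch {

    /** Hypothetical stand-in for a UDF: a little serializable state plus one method. */
    public static class MyUdf implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String prefix;

        public MyUdf(String prefix) {
            this.prefix = prefix;
        }

        public String eval(String s) {
            return prefix + s;
        }
    }

    /** Serializes the instance. The bytes hold the class name and field values, not bytecode. */
    public static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject(o);
        }
        return bos.toByteArray();
    }

    /** Deserializes, resolving every class through the given loader only. */
    public static Object deserialize(byte[] bytes, ClassLoader loader)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes)) {
            @Override
            protected Class<?> resolveClass(ObjectStreamClass desc)
                    throws ClassNotFoundException {
                return Class.forName(desc.getName(), false, loader);
            }
        }) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = serialize(new MyUdf("hello, "));

        // Same classpath as the writer: the class resolves and the object comes back.
        MyUdf local = (MyUdf) deserialize(bytes, UdfSerializationSketch.class.getClassLoader());
        System.out.println(local.eval("world"));

        // A loader that sees only JDK classes, standing in for a JVM without my jar.
        ClassLoader jdkOnly = new ClassLoader(null) {};
        try {
            deserialize(bytes, jdkOnly);
        } catch (ClassNotFoundException e) {
            System.out.println("Cannot load class: " + e.getMessage());
        }
    }
}
```

If Flink works the same way, then serializing the UDF instance would
never be enough on its own, and the real question becomes how the jar
containing the class is shipped with the job.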

Summary of questions:
- Are UDFs serialized with the job, or are they never included?
- Is it possible to hint at what should be serialized and sent along with
the job?

Thank you,
Joel


-- 
Joel Edwards
Software Architect
Ed-Craft Software Solutions
