Is there a way you can identify those patterns in a file or in its name
and then tackle them in separate jobs? I use the function
input_file_name() to find the name of the input file for each record and
then filter out certain files.
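
For example, a rough PySpark sketch; the input path and the file names in
bad_files below are made-up placeholders, adjust them to your data:

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, input_file_name

    spark = SparkSession.builder.getOrCreate()

    # Tag every record with the full path of the file it came from.
    df = (spark.read.text("s3://bucket/input/")
              .withColumn("source_file", input_file_name()))

    # Files known to hang the converter, identified in earlier runs.
    bad_files = ["s3://bucket/input/huge_1.dat",
                 "s3://bucket/input/huge_2.dat"]

    # Process the well-behaved files in the main job ...
    good_df = df.filter(~col("source_file").isin(bad_files))

    # ... and tackle the stragglers in a separate job.
    bad_df = df.filter(col("source_file").isin(bad_files))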

Regards,
Gourav

On Wed, Jul 10, 2019 at 6:47 AM Wei Chen <weic...@apache.org> wrote:

> Hello All,
>
> I am using Spark to process some files in parallel.
> While most files can be processed within 3 seconds,
> we can get stuck on 1 or 2 files that never finish (or
> take more than 48 hours).
> Since the converter is a 3rd-party file conversion tool, we are not able
> to debug why it gets stuck.
>
> Is it possible to set a timeout for the process, throw exceptions for
> those tasks, and still continue with the other successful tasks?
>
> Best Regards
> Wei
>
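
On the timeout question above: Spark itself has no per-task timeout, but
since the converter is an external tool, one option is to run it in a
subprocess and kill it when it overruns. A rough sketch, assuming the
converter is invoked as a command-line tool ("converter", file_paths and
the SparkContext sc are placeholders here):

    import subprocess

    def convert_with_timeout(path, timeout_s=60):
        # subprocess.run kills the child process once the timeout
        # expires and raises TimeoutExpired.
        try:
            subprocess.run(["converter", path], check=True,
                           timeout=timeout_s)
            return (path, "ok")
        except subprocess.TimeoutExpired:
            return (path, "timed out")
        except subprocess.CalledProcessError as e:
            return (path, "failed: %s" % e)

    # One conversion per file; slow files report "timed out" instead of
    # blocking the job for 48 hours.
    results = sc.parallelize(file_paths).map(convert_with_timeout).collect()

Returning a status instead of raising keeps the hanging files from
failing the whole job; you can then collect the timed-out paths and
retry or skip them separately.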
