Re: DataFlow Template error - SDK not reporting number of elements processed

2023-01-11 Thread Patrick McQuighan via user
Hi, I think I finally managed to track down the difference - the Dataflow job runs correctly when it has the pipeline option tempLocation set (in addition to temp_location). I have been having issues trying to get that field set via the gcloud CLI, but using the Python SDK
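
A minimal sketch of what "setting both options" could look like when building the pipeline arguments in Python. The message above is cut off, so this is not necessarily how the author did it: the helper name build_temp_location_args and the bucket path are hypothetical, and it is an assumption that the runner forwards the camelCase flag unchanged.

```python
def build_temp_location_args(temp_path):
    """Return command-line style flags that set both spellings of the option.

    Assumption: the job reads `tempLocation` in addition to the Python SDK's
    usual `temp_location`, as described in the message above.
    """
    return [
        f"--temp_location={temp_path}",  # standard Python SDK spelling
        f"--tempLocation={temp_path}",   # camelCase spelling named in the thread
    ]


# Hypothetical bucket path, used only for illustration.
args = build_temp_location_args("gs://example-bucket/tmp")
print(args)
```

A list like this can be handed to apache_beam.options.pipeline_options.PipelineOptions along with the other job flags; whether an unrecognized flag survives into the generated template should be checked against the template file itself.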

Re: DataFlow Template error - SDK not reporting number of elements processed

2023-01-11 Thread Patrick McQuighan via user
Hi Bruno, Thanks for the response. The SDK version and everything else should be identical - this issue occurs using code from the exact same commit in git, and the dependencies are frozen. I should mention this is using the Python SDK version 2.39.0. Diffing between the templates only appears to show expec
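
For comparing two template files, a rough sketch using only the Python standard library is shown here. It assumes both templates are JSON documents that have already been read into strings; the function name diff_templates and the sample contents are hypothetical and do not reflect the real template layout.

```python
import difflib
import json


def diff_templates(old_text, new_text):
    """Return unified-diff lines between two JSON documents.

    Both inputs are re-serialized with sorted keys first, so differences in
    key order or whitespace do not show up as changes.
    """
    old_lines = json.dumps(json.loads(old_text), indent=2, sort_keys=True).splitlines()
    new_lines = json.dumps(json.loads(new_text), indent=2, sort_keys=True).splitlines()
    return list(
        difflib.unified_diff(
            old_lines,
            new_lines,
            fromfile="old_template",
            tofile="new_template",
            lineterm="",
        )
    )


# Hypothetical template fragments, for illustration only.
old = '{"options": {"temp_location": "gs://example-bucket/tmp"}}'
new = (
    '{"options": {"temp_location": "gs://example-bucket/tmp", '
    '"tempLocation": "gs://example-bucket/tmp"}}'
)
for line in diff_templates(old, new):
    print(line)
```

Normalizing before diffing keeps the output focused on options that were actually added, removed, or changed between the two templates.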

Re: DataFlow Template error - SDK not reporting number of elements processed

2023-01-10 Thread Bruno Volpato via user
Hi Patrick, I have a few questions that might help troubleshoot this: Did you use the same SDK? Have you updated Beam or any other dependencies? Are there any other error logs (prior to the trace above) that could help understand it? Do you still have the previous template so you can compare the

DataFlow Template error - SDK not reporting number of elements processed

2023-01-10 Thread Patrick McQuighan via user
user@beam.apache.org Hi, I recently started encountering a strange error where a Dataflow job launched from a template never completes, but runs when launched directly. The template has been in use since Dec 14 without issue, but trying to recreate the template today (or the past week) and execut