Hi team,
We are facing an issue while trying to push data to an RDBMS (Oracle in our
case). It runs fine for a small number of records, but when run against a
bigger dataset it fails, throwing this error:
Error message from worker: org.apache.beam.sdk.util.UserCodeException:
java.sql.BatchUpdateException
Which runner are you using?
Also, do you have the bottom of the stack trace? It's possibly due to the
Docker containers running the Java SDK not having access to your database,
but I'm not sure based on the information provided.
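One complication with BatchUpdateException is that the real Oracle error (for example an ORA-00060 deadlock) is often buried in the chained SQLExceptions rather than in the top-level message, so even a complete stack trace may not show it. Logging the chain yourself from inside the DoFn can recover it. This is only a sketch: the exception below is constructed by hand purely to demonstrate the traversal, and the ORA-00060 message is an illustrative assumption, not taken from the thread.

```java
import java.sql.BatchUpdateException;
import java.sql.SQLException;

public class ChainDump {
    // Walk the SQLException chain and collect every message.
    // The root cause is often several links deep rather than
    // in the top-level BatchUpdateException.
    static String dumpChain(SQLException top) {
        StringBuilder sb = new StringBuilder();
        for (SQLException e = top; e != null; e = e.getNextException()) {
            sb.append(e.getMessage()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hand-built stand-in for what a JDBC batch execute would throw;
        // -3 (Statement.EXECUTE_FAILED) marks the failed element.
        BatchUpdateException top =
            new BatchUpdateException("batch failed", "72000", 0, new int[] {1, 1, -3});
        top.setNextException(
            new SQLException("ORA-00060: deadlock detected while waiting for resource"));
        System.out.print(dumpChain(top));
    }
}
```

Calling `dumpChain` on the caught exception (and printing the result) gets the full chain into worker logs even when the trace itself is cut off.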
Thanks,
Cham
On Tue, Feb 21, 2023 at 11:32 AM Somnath Chouwdhury
wrote:
Hello Cham,
The Runner in use is Dataflow Runner.
The last 28 lines aren't available in Cloud Logging either.
The code shared above works just fine with 2-3 records but starts to fail
when we try it with a bigger source data payload.
Does it look like multiple threads trying to acquire a write lock?
When you run larger workloads, Dataflow likely splits the work more finely
and may also be autoscaling to add more workers, so the number of parallel
connections to the database goes up with the size of the workload. Try
adjusting the database settings to allow more parallel connections.
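The inverse mitigation is also possible client-side: cap the number of in-flight database connections per worker so the total stays below the database's limit, regardless of how far Dataflow scales out. A minimal plain-Java sketch of the idea, independent of any Beam API (the cap of 4, the thread counts, and the simulated 10 ms round trip are all arbitrary assumptions for illustration):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ConnectionCap {
    static final int MAX_CONNECTIONS = 4;                 // assumed per-worker cap
    static final Semaphore permits = new Semaphore(MAX_CONNECTIONS);
    static final AtomicInteger inFlight = new AtomicInteger();
    static final AtomicInteger peak = new AtomicInteger();

    // Stand-in for a batched JDBC write; records peak concurrency
    // so we can verify the cap actually holds.
    static void writeBatch() throws InterruptedException {
        permits.acquire();                                // block once the cap is reached
        try {
            int now = inFlight.incrementAndGet();
            peak.accumulateAndGet(now, Math::max);
            Thread.sleep(10);                             // simulate the DB round trip
        } finally {
            inFlight.decrementAndGet();
            permits.release();
        }
    }

    public static void main(String[] args) throws Exception {
        // 16 threads submitting 64 writes, but at most 4 run concurrently.
        ExecutorService pool = Executors.newFixedThreadPool(16);
        for (int i = 0; i < 64; i++) {
            pool.submit(() -> {
                try {
                    writeBatch();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(30, TimeUnit.SECONDS);
        System.out.println("peak concurrent writes: " + peak.get());
    }
}
```

In a real pipeline the same effect is usually achieved through a connection pool's maximum-size setting rather than a hand-rolled semaphore, but the principle is identical: bound client concurrency to what the database can accept.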