[ 
https://issues.apache.org/jira/browse/BEAM-8906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17121776#comment-17121776
 ] 

Kenneth Knowles commented on BEAM-8906:
---------------------------------------

This issue is assigned but has not received an update in 30 days so it has been 
labeled "stale-assigned". If you are still working on the issue, please give an 
update and remove the label. If you are no longer working on the issue, please 
unassign so someone else may work on it. In 7 days the issue will be 
automatically unassigned.

> Long BigQuery dry runs cause avalanche delay
> --------------------------------------------
>
>                 Key: BEAM-8906
>                 URL: https://issues.apache.org/jira/browse/BEAM-8906
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-gcp
>    Affects Versions: 2.16.0
>         Environment: Google Cloud Platform
>            Reporter: June Oh
>            Assignee: Chamikara Madhusanka Jayalath
>            Priority: P2
>              Labels: stale-assigned
>
> Reproduction Steps:
> 1. Compose a BigQuery SELECT query that will take over 80 seconds for a dry 
> run.
> 2. Run the query with Beam SDK's BigQueryIO.
> 3. Observe the 10+ minute delay before the actual query job is created.
>
> When running readTableRows(), BigQueryIO estimates the query size by 
> performing a dry run, even if withoutValidation() is set. If the dry-run 
> request takes over 80 seconds 
> (RetryHttpRequestInitializer.HANGING_GET_TIMEOUT_SEC), 
> RetryHttpRequestInitializer times out and retries, up to 9 times 
> (BigQueryServicesImpl.MAX_RPC_RETRIES). Hence, once a dry run crosses the 
> 80-second tipping point, the retries cascade into a delay of 720 seconds 
> (9 x 80 seconds). Since size estimation is not required to run the query 
> [1], BigQueryIO should provide a way to avoid this unnecessary delay, 
> especially for time-critical enterprise workloads.
>
> There can be several ways to address this:
> - increasing the timeout threshold (which will still create a tipping point);
> - preventing the dry run requests from retrying; or
> - adding an option to skip the size estimation within 
> serializeToCloudSource().
>
> [1] https://github.com/apache/beam/blob/2ec3b0495c191597c9a88830d25a2c360b3277e0/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/internal/CustomSources.java#L75
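
The 720-second figure in the quoted description can be checked with a small 
model of the timeout-and-retry behavior. This is only a sketch under stated 
assumptions: the class DryRunDelay and the method worstCaseDelay are 
hypothetical names, not part of Beam; the constants are the values quoted in 
the issue (80 seconds, 9 attempts); and any backoff sleep between attempts is 
ignored, so a real delay could be somewhat longer.

```java
import java.time.Duration;

// Hypothetical helper for illustration only; it is not part of the Beam SDK.
final class DryRunDelay {
    // Values as quoted in the issue description (assumed, not read from Beam).
    static final int HANGING_GET_TIMEOUT_SEC = 80;
    static final int MAX_RPC_RETRIES = 9;

    // Models the delay before the real query job can start.
    // Assumption: every attempt of a too-slow dry run hits the full timeout,
    // and backoff between attempts is ignored.
    static Duration worstCaseDelay(Duration dryRunDuration, int timeoutSec, int maxAttempts) {
        Duration timeout = Duration.ofSeconds(timeoutSec);
        if (dryRunDuration.compareTo(timeout) <= 0) {
            // The dry run finishes within the timeout on the first attempt.
            return dryRunDuration;
        }
        // Each attempt times out, so the delay is timeout * attempts.
        return timeout.multipliedBy(maxAttempts);
    }

    public static void main(String[] args) {
        Duration fast = worstCaseDelay(
            Duration.ofSeconds(79), HANGING_GET_TIMEOUT_SEC, MAX_RPC_RETRIES);
        Duration slow = worstCaseDelay(
            Duration.ofSeconds(81), HANGING_GET_TIMEOUT_SEC, MAX_RPC_RETRIES);
        System.out.println(fast.getSeconds()); // prints 79
        System.out.println(slow.getSeconds()); // prints 720
    }
}
```

Under this model, a dry run that takes two seconds longer moves the delay 
from 79 seconds to 720 seconds, which is the tipping point the reporter 
describes.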



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
