[ 
https://issues.apache.org/jira/browse/BEAM-6514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133486#comment-17133486
 ] 

Beam JIRA Bot commented on BEAM-6514:
-------------------------------------

This issue is assigned but has not received an update in 30 days so it has been 
labeled "stale-assigned". If you are still working on the issue, please give an 
update and remove the label. If you are no longer working on the issue, please 
unassign so someone else may work on it. In 7 days the issue will be 
automatically unassigned.

> Dataflow Batch Job Failure is leaving Datasets/Tables behind in BigQuery
> ------------------------------------------------------------------------
>
>                 Key: BEAM-6514
>                 URL: https://issues.apache.org/jira/browse/BEAM-6514
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-gcp
>            Reporter: Rumeshkrishnan Mohan
>            Assignee: Chamikara Madhusanka Jayalath
>            Priority: P2
>              Labels: stale-assigned
>
> Dataflow leaves datasets and tables behind in BigQuery when a batch pipeline 
> is cancelled or fails. I cancelled a job (or it failed at run time), and it 
> left behind a dataset and table in BigQuery.
>  # The `cleanupTempResource` method cleans up the temporary tables and 
> dataset after a batch job succeeds.
>  # If the job fails midway or is cancelled explicitly, the temporary dataset 
> and tables remain. I do see a table expiration period of 1 day in the 
> `getTableToExtract` function in BigQueryQuerySource.java.
>  # I understand that keeping the temporary tables and dataset after a 
> failure is useful for debugging.
>  # Can we have an optional pipeline or job parameter that cleans up the 
> temporary dataset and tables when the job is cancelled or fails?
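
The optional parameter requested in the last item could behave roughly as in the sketch below. It is only an illustration of the requested behaviour, not Beam's implementation: the function name, the `cleanup_on_failure` flag, the state strings, and the `delete_dataset` callback are all hypothetical names chosen for this example.

```python
from typing import Callable, List

# Terminal states in which temporary resources are currently left behind.
# These strings are placeholders for this sketch, not Beam's own constants.
FAILED_STATES = {"FAILED", "CANCELLED"}


def cleanup_temp_resources(
    final_state: str,
    temp_datasets: List[str],
    delete_dataset: Callable[[str], None],
    cleanup_on_failure: bool = False,
) -> List[str]:
    """Delete temporary datasets depending on the job's final state.

    On success the datasets are always deleted. On failure or cancellation
    they are deleted only if the (hypothetical) cleanup_on_failure flag is
    set, so the default still keeps them around for debugging.
    Returns the names of the datasets that were deleted.
    """
    succeeded = final_state == "DONE"
    failed = final_state in FAILED_STATES
    if not (succeeded or (failed and cleanup_on_failure)):
        return []

    deleted = []
    for dataset in temp_datasets:
        # delete_dataset is supplied by the caller, e.g. a wrapper around
        # a BigQuery client call that removes the dataset and its tables.
        delete_dataset(dataset)
        deleted.append(dataset)
    return deleted
```

With the flag left at its default, a failed or cancelled job keeps its temporary dataset, which matches the debugging use case in the third item; passing `cleanup_on_failure=True` removes it in those cases as well.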



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
