Hi Lina,

There are multiple reasons why a copy job is used with temporary tables:
- you may be using dynamic destinations
- you are loading lots of data, probably with WRITE_TRUNCATE

This way we ensure atomicity, since we can trigger a copy from multiple temp tables into one final table. Can you confirm, or paste a snippet of how you configured apache_beam.io.gcp.bigquery.WriteToBigQuery?
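To illustrate what I mean, here is a minimal sketch of the kind of configuration that ends up on the load-job + temp-table + copy-job path. I don't know your actual setup, so the table, schema, and field names are made up; the additional_bq_parameters part is the time_partitioning workaround you mentioned:

import apache_beam as beam
from apache_beam.io.gcp.bigquery import WriteToBigQuery, BigQueryDisposition

with beam.Pipeline() as p:
    _ = (
        p
        | "Create" >> beam.Create([{"event_ts": "2025-03-19T00:00:00", "value": 1}])
        | "WriteToBQ" >> WriteToBigQuery(
            table="my-project:my_dataset.my_table",     # hypothetical
            schema="event_ts:TIMESTAMP,value:INTEGER",  # hypothetical
            method=WriteToBigQuery.Method.FILE_LOADS,
            write_disposition=BigQueryDisposition.WRITE_TRUNCATE,
            create_disposition=BigQueryDisposition.CREATE_NEVER,
            # The workaround you mentioned: declaring the partitioning on the
            # sink, so the temp tables are created partitioned and the final
            # copy is partitioned -> partitioned, which BigQuery does allow.
            additional_bq_parameters={
                "timePartitioning": {"type": "DAY", "field": "event_ts"},
            },
        )
    )

Without the additional_bq_parameters, the temp tables come out unpartitioned, and the final copy into your (now partitioned) destination table fails with the error you saw.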
BigQuery doesn't allow copying non-partitioned tables into partitioned tables - it's a BQ limitation. Have you tried other loading methods, e.g. the Storage Write API? (There's a sketch of that below the quoted message.)

Radek

On Wed, Mar 19, 2025 at 5:42 AM Lina Mårtensson via user <user@beam.apache.org> wrote:

> Hi,
>
> We have, by now, a large set of different Beam jobs all written in Python
> that all write to a set of BigQuery tables that more or less behave the
> same way in a single dataset. These tables aren't partitioned at all, but
> going forward, we need them to be.
>
> I partitioned a single table to start with, and was very surprised to find
> that a Beam job that wrote to it couldn't do so:
>
> Failed to copy Non partitioned table to Column partitioned table: not
> supported.
>
> We have a bunch of pre-created tables, and I would've thought I could just
> keep writing to those without changing settings even after setting up
> partitioning on them. It doesn't seem to matter whether the
> create_disposition is CREATE_IF_NEEDED or CREATE_NEVER.
> It does work when I set the additional_bq_parameters to add
> time_partitioning, but it would be a huge undertaking not only to update
> all of our currently running jobs across many projects, but also to make
> sure to synchronize these changes with updating the underlying BigQuery
> tables. And it doesn't seem like it should be necessary to specify if we're
> not creating any new tables?
>
> Is there any way to just write the data we have to pre-created,
> partitioned tables without having to set time_partitioning in
> additional_bq_parameters, or potentially if there's some other
> recommended way to solve this problem?
>
> Thanks,
> -Lina
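P.S. The Storage Write API sketch I mentioned above - again with made-up table, schema, and field names, and assuming a reasonably recent SDK (in the Python SDK this method goes through a cross-language transform, as far as I recall):

import apache_beam as beam
from apache_beam.io.gcp.bigquery import WriteToBigQuery, BigQueryDisposition

with beam.Pipeline() as p:
    _ = (
        p
        | "Create" >> beam.Create([{"event_ts": "2025-03-19T00:00:00", "value": 1}])
        | "WriteToBQ" >> WriteToBigQuery(
            table="my-project:my_dataset.my_table",     # hypothetical
            schema="event_ts:TIMESTAMP,value:INTEGER",  # hypothetical
            method=WriteToBigQuery.Method.STORAGE_WRITE_API,
            create_disposition=BigQueryDisposition.CREATE_NEVER,
            # Rows are appended directly to the existing partitioned table,
            # so there are no temp tables and no copy jobs involved.
        )
    )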