Yes, I figured out the above by reading the source code again. I hope these
steps can be documented somewhere in Beam.

But I still cannot find the details for those jobs. For example,

bq show -j --format=prettyjson --project_id=.... beam_bq_job_COPY_

gives me:

BigQuery error in show operation: Not found: Job project-data
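
(For anyone else hitting this: the full job id carries a random suffix after
beam_bq_job_COPY_, so bq show needs the exact id. Below is a rough sketch of
listing recent jobs with the plain Java BigQuery client and filtering on that
prefix; the project id and page size are placeholders, not taken from my
setup.)

import com.google.api.gax.paging.Page;
import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.Job;

public class FindBeamCopyJobs {
  public static void main(String[] args) {
    // Placeholder project id; use the project the pipeline writes to.
    BigQuery bigquery =
        BigQueryOptions.newBuilder().setProjectId("my-project").build().getService();

    // List recent jobs from all users, since Dataflow workers submit the load
    // and copy jobs under a service account rather than your own user.
    Page<Job> jobs =
        bigquery.listJobs(BigQuery.JobListOption.allUsers(),
            BigQuery.JobListOption.pageSize(500));

    for (Job job : jobs.iterateAll()) {
      String jobId = job.getJobId().getJob();
      if (jobId != null && jobId.startsWith("beam_bq_job_COPY")) {
        // getError() is null when the copy succeeded; otherwise it holds the reason.
        System.out.printf("%s state=%s error=%s%n",
            jobId, job.getStatus().getState(), job.getStatus().getError());
      }
    }
  }
}

The CLI equivalent should be roughly bq ls -j -a -n 500 --project_id=<project>,
and then bq show -j on the full id once it shows up. (A minimal sketch of the
write configuration itself is at the bottom of this message, below the quoted
thread.)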

On Thu, Oct 3, 2024 at 2:17 PM Ahmed Abualsaud via user <
[email protected]> wrote:

> For small/medium writes, it should load directly to the table.
>
> For larger writes (your case), it writes to multiple temp tables then
> performs a single copy job [1] that copies their contents to the final
> table. Afterwards, the sink will clean up all those temp tables.
> My guess is your pipeline is failing at the copy step. Note what Reuven
> said in the other thread: Dataflow will retry "indefinitely for streaming",
> so your pipeline will continue running. You should still be able to see
> error messages in your logs, though.
>
> As to why it's failing, we'd have to know more about your use case or see
> a stack trace. With these things, it's best to submit a support ticket so
> the engineers can investigate. From my experience though, jobs failing at
> the copy step are usually caused by trying to copy partitioned columns,
> which isn't supported by BigQuery (see the copy job limitations [2]).
>
> [1] https://cloud.google.com/bigquery/docs/managing-tables#copy-table
> [2]
> https://cloud.google.com/bigquery/docs/managing-tables#limitations_on_copying_tables
>
> On Thu, Oct 3, 2024 at 11:56 PM [email protected] <[email protected]> wrote:
>
>> Hey guys,
>>
>> Any help is appreciated. I'm using the BigQueryIO file loads (FILE_LOADS)
>> method to load data to BQ. I don't see any errors or warnings, but I also
>> don't see a SINGLE row inserted into the table either.
>>
>> The only thing I see is hundreds of load jobs like
>> beam_bq_job_TEMP_TABLE_LOAD_.....
>> and hundreds of temp tables created.
>>
>> Most jobs are done and I can see the data in the temp tables, but there is
>> not a single row written to the final destination?
>>
>> I know there is no way to track row-level errors, but at least the
>> runner/Beam API should give me some hint about what is wrong at any of the
>> steps. And there is zero documentation or examples about this either.
>>
>>
>> Regards,
>>
>>
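
For anyone who finds this thread later, here is a minimal sketch of the kind
of streaming FILE_LOADS write being discussed above. It is not the actual
pipeline: the source, schema, table name, and triggering values are all
placeholders (Beam Java SDK assumed).

import com.google.api.services.bigquery.model.TableFieldSchema;
import com.google.api.services.bigquery.model.TableRow;
import com.google.api.services.bigquery.model.TableSchema;
import java.util.Collections;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.GenerateSequence;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.CreateDisposition;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.WriteDisposition;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.MapElements;
import org.apache.beam.sdk.values.TypeDescriptor;
import org.joda.time.Duration;

public class FileLoadsSketch {
  public static void main(String[] args) {
    Pipeline p =
        Pipeline.create(PipelineOptionsFactory.fromArgs(args).withValidation().create());

    TableSchema schema = new TableSchema().setFields(Collections.singletonList(
        new TableFieldSchema().setName("id").setType("INTEGER")));

    p.apply("Source", GenerateSequence.from(0).withRate(10, Duration.standardSeconds(1)))
        .apply("ToTableRow",
            MapElements.into(TypeDescriptor.of(TableRow.class))
                .via((Long i) -> new TableRow().set("id", i)))
        .apply("WriteToBQ",
            BigQueryIO.writeTableRows()
                // Final destination. As explained above, for larger writes the sink
                // loads into temp tables (beam_bq_job_TEMP_TABLE_LOAD_*) and then
                // runs copy jobs (beam_bq_job_COPY_*) into this table before cleanup.
                .to("my-project:my_dataset.my_table")
                .withSchema(schema)
                .withMethod(BigQueryIO.Write.Method.FILE_LOADS)
                // Required for FILE_LOADS on an unbounded input: how often loads fire.
                .withTriggeringFrequency(Duration.standardMinutes(5))
                .withNumFileShards(10)
                .withCreateDisposition(CreateDisposition.CREATE_IF_NEEDED)
                .withWriteDisposition(WriteDisposition.WRITE_APPEND));

    p.run();
  }
}

As noted in Ahmed's reply, the copy into the final table is the step subject
to BigQuery's copy-job limitations ([2] above), so that is where issues like
the partitioned-column case he mentions would surface.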
