Yeah, I found those failed jobs, but none of them records why it failed, and `bq show` gives me "job not found".
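One thing worth ruling out: BigQuery jobs are regional, and a "Not found" from `bq show` can be a location mismatch rather than a missing job. Below is a minimal sketch using the BigQuery Java client that looks a copy job up with an explicit location; the project id, location, and job id are placeholders, not the real values from this pipeline. With the CLI, the equivalent is roughly bq show --location=<dataset location> -j <job id>.

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.Job;
import com.google.cloud.bigquery.JobId;

public class InspectBeamCopyJob {
  public static void main(String[] args) {
    BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
    // Placeholders: use the full job id from the Dataflow worker logs and the
    // location (region) of the destination dataset.
    JobId jobId =
        JobId.newBuilder()
            .setProject("my-project")
            .setLocation("US")
            .setJob("beam_bq_job_COPY_example")
            .build();
    Job job = bigquery.getJob(jobId);
    if (job == null) {
      System.out.println("No such job in this project/location");
    } else if (job.getStatus().getError() != null) {
      // The copy job's failure reason, e.g. a copy-job limitation being hit.
      System.out.println("Copy job failed: " + job.getStatus().getError());
    } else {
      System.out.println("Job state: " + job.getStatus().getState());
    }
  }
}
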
On Thu, Oct 3, 2024 at 2:33 PM Ahmed Abualsaud <[email protected]> wrote:

> I'd check your Dataflow worker logs and look for any messages about
> `beam_bq_job_COPY`.
>
> On Fri, Oct 4, 2024 at 12:31 AM [email protected] <[email protected]> wrote:
>
>> And interestingly, in the BigQuery UI I only see beam_bq_job_LOAD, not
>> beam_bq_job_COPY, although the job id did show up in the logs.
>>
>> On Thu, Oct 3, 2024 at 2:28 PM [email protected] <[email protected]> wrote:
>>
>>> Yes, I figured out the above from reading the source code again. I hope
>>> these steps can be documented somewhere in Beam.
>>> But I still cannot find the details for those jobs. For example,
>>> bq show -j --format=prettyjson --project_id=.... beam_bq_job_COPY_
>>> gives me
>>> BigQuery error in show operation: Not found: Job project-data
>>>
>>> On Thu, Oct 3, 2024 at 2:17 PM Ahmed Abualsaud via user <[email protected]> wrote:
>>>
>>>> For small/medium writes, it should load directly to the table.
>>>>
>>>> For larger writes (your case), it writes to multiple temp tables, then
>>>> performs a single copy job [1] that copies their contents to the final
>>>> table. Afterwards, the sink cleans up all those temp tables.
>>>> My guess is your pipeline is failing at the copy step. Note what Reuven
>>>> said in the other thread: Dataflow will retry "indefinitely for
>>>> streaming", so your pipeline will continue running. You should still be
>>>> able to see error messages in your logs, though.
>>>>
>>>> As to why it's failing, we'd have to know more about your use case or
>>>> see a stack trace. With these things, it's best to submit a support ticket
>>>> so the engineers can investigate. From my experience, though, jobs failing
>>>> at the copy step are usually caused by trying to copy partitioned columns,
>>>> which isn't supported by BigQuery (see the copy job limitations [2]).
>>>>
>>>> [1] https://cloud.google.com/bigquery/docs/managing-tables#copy-table
>>>> [2]
>>>> https://cloud.google.com/bigquery/docs/managing-tables#limitations_on_copying_tables
>>>>
>>>> On Thu, Oct 3, 2024 at 11:56 PM [email protected] <[email protected]>
>>>> wrote:
>>>>
>>>>> Hey guys,
>>>>>
>>>>> Any help is appreciated. I'm using the BigQueryIO file loads method to
>>>>> load data to BQ. I don't see any errors or warnings, but I also don't
>>>>> see a SINGLE row inserted into the table either.
>>>>>
>>>>> The only thing I see is hundreds of load jobs like
>>>>> beam_bq_job_TEMP_TABLE_LOAD_.....
>>>>> and hundreds of temp tables created.
>>>>>
>>>>> Most jobs are done and I can see the data in the temp tables, but there
>>>>> is not a single row written to the final destination.
>>>>>
>>>>> I know there is no way to track row-level errors, but at least the
>>>>> runner/Beam API should give me some hint about what went wrong in any of
>>>>> the steps. And there is zero documentation or examples about this either.
>>>>>
>>>>>
>>>>> Regards,
>>>>>
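For later readers, here is a minimal sketch of the kind of streaming FILE_LOADS write being discussed, i.e. the mode that produces the beam_bq_job_TEMP_TABLE_LOAD_* load jobs into temp tables followed by a beam_bq_job_COPY_* copy into the final table. The table name, schema, triggering frequency, and shard count below are placeholders, not the actual pipeline from this thread.

import com.google.api.services.bigquery.model.TableFieldSchema;
import com.google.api.services.bigquery.model.TableRow;
import com.google.api.services.bigquery.model.TableSchema;
import java.util.Arrays;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.CreateDisposition;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.WriteDisposition;
import org.apache.beam.sdk.values.PCollection;
import org.joda.time.Duration;

public class FileLoadsWriteSketch {
  // Attach a FILE_LOADS write to a streaming PCollection of TableRows.
  static void writeToBigQuery(PCollection<TableRow> rows) {
    TableSchema schema =
        new TableSchema()
            .setFields(
                Arrays.asList(
                    new TableFieldSchema().setName("id").setType("STRING"),
                    new TableFieldSchema().setName("value").setType("INT64")));

    rows.apply(
        "WriteToBQ",
        BigQueryIO.writeTableRows()
            .to("my-project:my_dataset.final_table")
            .withSchema(schema)
            // FILE_LOADS in streaming: rows are staged to files, loaded into
            // temp tables, then copied into the final table by a copy job.
            .withMethod(BigQueryIO.Write.Method.FILE_LOADS)
            // A triggering frequency is required for FILE_LOADS in streaming,
            // together with a shard count (or auto-sharding).
            .withTriggeringFrequency(Duration.standardMinutes(5))
            .withNumFileShards(100)
            .withCreateDisposition(CreateDisposition.CREATE_IF_NEEDED)
            .withWriteDisposition(WriteDisposition.WRITE_APPEND));
  }
}
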

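And since the copy jobs were not visible in the BigQuery UI: INFORMATION_SCHEMA.JOBS_BY_PROJECT lists jobs from all users in the project (including the Dataflow worker service account), so it may surface jobs that the personal job history view does not. A sketch, with the project id and region as placeholders, that prints recent Beam copy jobs and their error messages:

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.FieldValueList;
import com.google.cloud.bigquery.QueryJobConfiguration;

public class ListBeamCopyJobs {
  public static void main(String[] args) throws Exception {
    BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
    // Placeholders: qualify INFORMATION_SCHEMA with your project and the
    // region of the destination dataset (e.g. region-us, region-eu).
    String sql =
        "SELECT job_id, state, error_result.message AS message "
            + "FROM `my-project.region-us.INFORMATION_SCHEMA.JOBS_BY_PROJECT` "
            + "WHERE job_type = 'COPY' AND job_id LIKE 'beam_bq_job_COPY%' "
            + "AND creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY) "
            + "ORDER BY creation_time DESC";
    for (FieldValueList row :
        bigquery.query(QueryJobConfiguration.newBuilder(sql).build()).iterateAll()) {
      // error_result is NULL for jobs that succeeded.
      System.out.printf(
          "%s %s %s%n",
          row.get("job_id").getStringValue(),
          row.get("state").getStringValue(),
          row.get("message").isNull() ? "(no error)" : row.get("message").getStringValue());
    }
  }
}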