Try looking at the worker logs to get a full stack trace. Take a look at
this page for some debugging guidance[1] or consider opening a support case
with GCP.

[1] https://cloud.google.com/dataflow/docs/guides/troubleshooting-your-pipeline

On Thu, Jun 25, 2020 at 1:42 AM Mohil Khare <[email protected]> wrote:

> BTW, to rule out a problem with any individual PTransform, I enabled
> them one at a time, and the pipeline started successfully each time.
> The issue appears as soon as I enable more than one of the
> aforementioned new PTransforms.
>
> Thanks and regards
> Mohil
>
> On Thu, Jun 25, 2020 at 1:26 AM Mohil Khare <[email protected]> wrote:
>
>> Hello All,
>>
>> I am using the Beam Java SDK, version 2.19.0, on Google Cloud Dataflow.
>>
>> I need urgent help debugging an issue.
>>
>> I recently added 3-4 new PTransforms to an existing pipeline, where
>> I read data from BQ for a certain timestamp and create a
>> PCollectionView<Map<Key,value>> to be used as a side input in other
>> PTransforms.
>>
>> i.e. something like this:
>>
>> /**
>>  * Get PCollectionView Stats1
>>  */
>> PCollectionView<Map<Stats1Key, Stats1>> stats1View =
>>     jobCompleteStatus
>>         .apply("Reload_MonthlyS2Stats_FromBQ", new ReadStatsS1())
>>         .apply("View_S1STATS", View.asSingleton());
>>
>> /**
>>  * Get PCollectionView of Stats2
>>  */
>> PCollectionView<Map<Stats2Key, Stats2>> stats2View =
>>     jobCompleteStatus
>>         .apply("Reload_OptimalAppCharsInfo_FromBQ", new ReadStatsS2())
>>         .apply("View_S2STATS", View.asSingleton());
>>
>>
>> and a couple more PTransforms like these. Here jobCompleteStatus is a
>> message received from PubSub that acts as a trigger to reload these views.
>>
>> The moment I deployed the above pipeline, it didn't start, and error
>> reporting gave weird exceptions (see attached screenshot1 and
>> screenshot) which I don't know how to debug.
>>
>> Then, as an experiment, I made a change where I enabled only one new
>> transform and disabled the others. This time I didn't see any issue.
>>
>> So it looks like some kind of memory issue.
>>
>> I also compared the worker logs between the working and non-working
>> cases, and it looks like resources were not granted in the non-working
>> case. (See attached working-workerlogs and nonworking-workerlogs.)
>>
>> I couldn't find any other logs.
>>
>> I would really appreciate quick help here.
>>
>> Thanks and Regards
>> Mohil
