Hey,
When trying to read a file from S3 with a combine action, the pipeline
seems to be stuck. When replacing it with a GCP source it works fine.
Furthermore - if I comment out the Count.PerElement part it also works.

Anyone has an idea why that is?

 lines = p | beam.io.ReadFromText('s3://...')
        transformed = (
                lines
                | 'SplitLine' >>
(beam.ParDo(WordExtractingDoFn()).with_output_types(unicode))
                | 'Count' >> beam.combiners.Count.PerElement()
                | 'Format' >> beam.MapTuple(lambda w, c: f'{w}: {c}')
        )

        transformed | 'Write' >> beam.io.WriteToText('s3://...')

Thanks!
Nir

Reply via email to