I wrote an Apache Beam pipeline in Python to read messages from a Pub/Sub subscription.
Messages are published to the Pub/Sub topic, which is in the us-east4 region, at a rate of 10,000 tuples/sec. The pipeline looks like this:

    | 'read from pubsub' >> beam.io.textio.ReadFromText()
    | ' print' >> beam.Map(print)

I created a template for this pipeline and submitted the job to Dataflow on an n2d-standard-4 machine. It is using 90% of the CPU just to read from Pub/Sub, and the backlog is around 10 seconds, which stays constant over time.

My questions are:

1. Is it normal to use 90% of the CPU just to read messages from Pub/Sub?
2. What could be the possible reasons for this?
3. Why is it not able to clear the backlog? In fact, the backlog starts increasing after some time, as the throughput also decreases.

Thank you.
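For reference, here is a minimal sketch of the full pipeline as described. It assumes the read step is meant to be `beam.io.ReadFromPubSub`, since the `ReadFromText` transform named in the snippet reads text files, not Pub/Sub. The project ID, subscription ID, and the `subscription_path` helper are hypothetical placeholders, not the real values from the job.

```python
def subscription_path(project, subscription):
    """Build the fully qualified subscription name that Pub/Sub expects."""
    return f"projects/{project}/subscriptions/{subscription}"


def run(argv=None):
    # Imported here so the helper above can be used without Beam installed.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import (
        PipelineOptions,
        StandardOptions,
    )

    options = PipelineOptions(argv)
    # Reading from Pub/Sub is an unbounded source, so streaming mode is needed.
    options.view_as(StandardOptions).streaming = True

    # Hypothetical placeholders: substitute the real project and subscription.
    subscription = subscription_path("my-project", "my-subscription")

    with beam.Pipeline(options=options) as p:
        (
            p
            | "read from pubsub" >> beam.io.ReadFromPubSub(subscription=subscription)
            # Each element is the raw message payload (bytes).
            | "print" >> beam.Map(print)
        )
```

Calling `run()` with the usual Dataflow flags (runner, project, region, machine type) would launch the job. Note that in this sketch every message is passed to `print`, so the `print` step runs 10,000 times per second on the worker alongside the read.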