Re: Source advancing before previously read records are processed fully

2021-03-22 Thread Ahmet Altay
/cc @Boyuan Zhang On Wed, Mar 17, 2021 at 3:38 AM Pradyumna Achar wrote: > Hello, > > I am running into a strange issue with the KafkaIO streaming source. > > The source just keeps reading records from the Kafka topics even before > the downstream DoFns in the pipeline have got a chance to proc

Re: Is there a perf comparison between Beam (on spark) and native Spark?

2021-03-22 Thread Boyuan Zhang
+Kyle Weaver Kyle, do you happen to have some information here? On Mon, Mar 22, 2021 at 10:00 AM Tao Li wrote: > Hi Beam community, > > > > I am wondering if there is a doc to compare perf of Beam (on Spark) and > native Spark for batch processing? For example, using the TPC-DS benchmark. > > > > I did

Custom template issue

2021-03-22 Thread Tajdar Siddiqui
Any ideas? https://stackoverflow.com/questions/66731268/gcp-dataflow-custom-templated-job-read-from-bigquery-fails-during-cleanup

Is there a perf comparison between Beam (on spark) and native Spark?

2021-03-22 Thread Tao Li
Hi Beam community, I am wondering if there is a doc to compare perf of Beam (on Spark) and native Spark for batch processing? For example, using the TPC-DS benchmark. I did find some relevant links like this