How large is the BigQuery table? Does it fit in memory?
10 columns (each column's data is small) and 800,000 rows, so the table should fit comfortably in memory.

At 2020-05-24 11:13:04, "Reuven Lax" <re...@google.com> wrote:
> How large is the BigQuery table? Does it fit in memory?
>
> On Sat, May 23, 2020 at 7:01 PM 杨胜 <liff...@163.com> wrote:
>> Hi everyone,
>>
>> I am new to Apache Beam, but I have experience with Spark Streaming.
>>
>> I have a daily-updated BigQuery table that I want to use as a lookup
>> table: read it into Beam as a bounded PCollection<TableRow> and
>> refresh that collection within Beam once a day; I named this variable
>> bigqueryTableRows. I also have a Pub/Sub topic whose messages I want
>> to read as an unbounded PCollection<TableRow>, named pubsubTableRows.
>> I then want to join bigqueryTableRows with pubsubTableRows and
>> finally write the result to BigQuery.
>>
>> I have checked all the examples under Beam's GitHub repository:
>> https://github.com/apache/beam/tree/d906270f243bb4de20a7f0baf514667590c8c494/examples/java/src/main/java/org/apache/beam/examples
>> but none matches my case. Any suggestions on how I should implement
>> my pipeline?
>>
>> Many thanks,
>> Steven
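For what it's worth, since the table fits in memory this looks like Beam's documented "slowly updating global window side inputs" pattern: re-read the BigQuery table on a daily GenerateSequence tick, turn the result into a map-valued side input, and join each Pub/Sub element against that map inside a DoFn. Below is a minimal sketch under those assumptions; the class name, fetchLookupTable helper, the id/value column names, and all project/dataset/table/topic names are placeholders, and the Pub/Sub parsing is elided.

import com.google.api.services.bigquery.model.TableRow;
import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.FieldValueList;
import com.google.cloud.bigquery.QueryJobConfiguration;
import java.util.HashMap;
import java.util.Map;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.coders.MapCoder;
import org.apache.beam.sdk.coders.StringUtf8Coder;
import org.apache.beam.sdk.io.GenerateSequence;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.io.gcp.bigquery.TableRowJsonCoder;
import org.apache.beam.sdk.io.gcp.pubsub.PubsubIO;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.transforms.View;
import org.apache.beam.sdk.transforms.windowing.AfterProcessingTime;
import org.apache.beam.sdk.transforms.windowing.GlobalWindows;
import org.apache.beam.sdk.transforms.windowing.Repeatedly;
import org.apache.beam.sdk.transforms.windowing.Window;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.PCollectionView;
import org.joda.time.Duration;

public class LookupJoinPipeline {

  // Hypothetical helper: pull the whole lookup table with the BigQuery
  // client library and key it by an assumed "id" column.
  static Map<String, TableRow> fetchLookupTable() throws InterruptedException {
    BigQuery bq = BigQueryOptions.getDefaultInstance().getService();
    Map<String, TableRow> map = new HashMap<>();
    for (FieldValueList row :
        bq.query(QueryJobConfiguration.of(
                "SELECT id, value FROM `my_project.my_dataset.lookup_table`"))
            .iterateAll()) {
      map.put(row.get("id").getStringValue(),
          new TableRow()
              .set("id", row.get("id").getStringValue())
              .set("value", row.get("value").getStringValue()));
    }
    return map;
  }

  public static void main(String[] args) {
    Pipeline p = Pipeline.create();

    // Re-read the lookup table once a day and expose it as a side input.
    PCollectionView<Map<String, TableRow>> lookupView =
        p.apply("Tick", GenerateSequence.from(0).withRate(1, Duration.standardDays(1)))
            .apply("FetchLookup", ParDo.of(new DoFn<Long, Map<String, TableRow>>() {
              @ProcessElement
              public void process(OutputReceiver<Map<String, TableRow>> out)
                  throws InterruptedException {
                out.output(fetchLookupTable());
              }
            }))
            .setCoder(MapCoder.of(StringUtf8Coder.of(), TableRowJsonCoder.of()))
            // Global window with a repeated trigger, so each daily fetch
            // replaces the previous pane's contents.
            .apply("RefreshWindow", Window.<Map<String, TableRow>>into(new GlobalWindows())
                .triggering(Repeatedly.forever(AfterProcessingTime.pastFirstElementInPane()))
                .withAllowedLateness(Duration.ZERO)
                .discardingFiredPanes())
            .apply(View.asSingleton());

    // Unbounded stream; assumes one message maps to one TableRow
    // (real parsing, e.g. with Jackson, goes in ParseMessage).
    PCollection<TableRow> pubsubTableRows =
        p.apply("ReadPubsub",
                PubsubIO.readStrings().fromTopic("projects/my_project/topics/my_topic"))
            .apply("ParseMessage", ParDo.of(new DoFn<String, TableRow>() {
              @ProcessElement
              public void process(@Element String msg, OutputReceiver<TableRow> out) {
                out.output(new TableRow().set("id", msg));
              }
            }));

    // Join each streamed row against the side-input map by the assumed "id" key.
    PCollection<TableRow> joined =
        pubsubTableRows.apply("Join", ParDo.of(new DoFn<TableRow, TableRow>() {
          @ProcessElement
          public void process(ProcessContext c) {
            Map<String, TableRow> lookup = c.sideInput(lookupView);
            TableRow match = lookup.get((String) c.element().get("id"));
            if (match != null) {
              c.output(c.element().clone().set("value", match.get("value")));
            }
          }
        }).withSideInputs(lookupView));

    joined.apply("WriteBQ", BigQueryIO.writeTableRows()
        .to("my_project:my_dataset.output_table")
        .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND)
        .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER));

    p.run();
  }
}

The side-input route avoids a full streaming join (CoGroupByKey over windows), which matters here because the two inputs have very different cadences: one changes daily, the other arrives continuously. If the table ever outgrows memory, the usual alternatives are View.asMultimap with a smaller projection, or per-element BigQuery lookups with caching.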