Re: SplitEnumerator for Bigquery Source.

2022-10-18 Thread Lavkesh Lahngir
Hi Martijn, The tables are partitioned on a timestamp, just like Hive. They can be range-partitioned too; it doesn't matter. Option two in the first email describes one split per partition. Are you suggesting something different? Thanks On Tue, 18 Oct 2022 at 15:28, Martijn Visse

Re: SplitEnumerator for Bigquery Source.

2022-10-18 Thread Martijn Visser
Hi Lavkesh, I'm not familiar with BigQuery, but when looking through the BQ API, I noticed that the `Table` resource provides both a `timePartitioning` and a `rangePartitioning` field. [1] Couldn't you use that? Best regards, Martijn [1] https://cloud.google.com/bigquery/docs/reference/rest/v2/tables#Table On T
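
As a rough illustration of the "one split per partition" idea applied to a range-partitioned table, the sketch below derives half-open intervals from the start, end, and interval values that the `rangePartitioning.range` field of the `Table` resource carries. The class and method names are hypothetical and belong to neither Flink nor the BigQuery client library. It assumes the start is inclusive and the end is exclusive, and it ignores rows that fall outside the range.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not a Flink or BigQuery API.
final class RangePartitionSplits {

    // Returns one {from, to} pair per partition, each a half-open interval [from, to).
    static List<long[]> enumerate(long start, long end, long interval) {
        if (interval <= 0 || end <= start) {
            throw new IllegalArgumentException("need interval > 0 and end > start");
        }
        List<long[]> splits = new ArrayList<>();
        for (long from = start; from < end; from += interval) {
            // Clamp the last partition so it never extends past the configured end.
            splits.add(new long[] {from, Math.min(from + interval, end)});
        }
        return splits;
    }

    public static void main(String[] args) {
        for (long[] s : enumerate(0, 100, 25)) {
            System.out.println("[" + s[0] + ", " + s[1] + ")");
        }
    }
}
```

A real SplitEnumerator would wrap each interval in its own split type and hand the splits to readers on request; the sketch only covers the range arithmetic.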

Re: SplitEnumerator for Bigquery Source.

2022-10-17 Thread yuxia
I'm familiar with the Hive source but don't know much about BigQuery. From my side, approach number three sounds the most reasonable. Option 1 sounds a little complex and may be time-consuming when generating splits. Option 2 seems inflexible and too coarse-grained. Option 4 need