It is pretty clear. Thanks Caizhi!

On Thu, Oct 7, 2021 at 7:27 PM Caizhi Weng <tsreape...@gmail.com> wrote:

> Hi!
>
> These configurations are not required to merely read from a database. They
> are here to accelerate the reads by allowing sources to read data in
> parallel.
>
> This optimization works by dividing the data into several
> (scan.partition.num) partitions and each partition will be read by a task
> slot (not a task manager, as a task manager may have multiple task slots).
> You can set scan.partition.column to specify the partition key and also set
> the lower and upper bounds for the range of data.
>
> Let's say your partition key is the column "k" which ranges from 0 to 999.
> If you set the lower bound to 0, the upper bound to 999 and the number of
> partitions to 10, then all data satisfying 0 <= k < 100 will be divided
> into the first partition and read by the first task slot, all 100 <= k <
> 200 will be divided into the second partition and read by the second task
> slot and so on. So these configurations should have nothing to do with the
> number of rows you have, but should be related to the range of your
> partition key.
>
> Qihua Yang <yang...@gmail.com> 于2021年10月7日周四 上午7:43写道:
>
>> Hi,
>>
>> I am trying to read data from database with JDBC driver. From [1], I have
>> to config below parameters. I am not quite sure if I understand it
>> correctly. lower-bound is smallest value of the first partition,
>> upper-bound is largest value of the last partition. For example, if the db
>> table has 1000 rows. lower-bound is 0, upper-bound is 999. Is that correct?
>> If  setting scan.partition.num to 10, each partition read 100 row?
>> if I set scan.partition.num to 10 and I have 10 task managers. Each task
>> manager will pick a partition to read?
>>
>>    - scan.partition.column: The column name used for partitioning the
>>    input.
>>    - scan.partition.num: The number of partitions.
>>    - scan.partition.lower-bound: The smallest value of the first
>>    partition.
>>    - scan.partition.upper-bound: The largest value of the last partition.
>>
>> [1]
>> https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/connectors/table/jdbc/
>>
>> Thanks,
>> Qihua
>>
>

Reply via email to