K8s should not restart a finished job. Are you seeing this? How did you configure the job?
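For context on why this symptom appears at all: when the Flink JobManager runs under a Kubernetes Deployment, Kubernetes recreates the pod whenever its process exits, even after a successful batch run, because a Deployment's purpose is to keep pods alive. Deploying the application cluster as a Kubernetes Job instead lets a finished run stay finished. A minimal sketch (image name, job classname, and resource names are illustrative, not taken from this thread):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: flink-batch-scan        # illustrative name
spec:
  backoffLimit: 2               # retry at most twice on failure
  template:
    spec:
      restartPolicy: OnFailure  # do NOT restart after a clean (exit 0) finish
      containers:
        - name: jobmanager
          image: flink:1.13     # assumption: Flink 1.13, matching the docs linked below
          args: ["standalone-job", "--job-classname", "com.example.ScanJob"]
```

With `kind: Deployment` the pod is recreated on any exit; with `kind: Job` and `restartPolicy: OnFailure`, a successful batch job terminates and is not rerun.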
On Wed, Oct 13, 2021 at 7:39 AM Qihua Yang <yang...@gmail.com> wrote:

> Hi,
>
> If I configure batch mode, the application will stop after the job is
> complete, right? Then k8s will restart the pod and rerun the job. That is
> not what we want.
>
> Thanks,
> Qihua
>
> On Tue, Oct 12, 2021 at 7:27 PM Caizhi Weng <tsreape...@gmail.com> wrote:
>
>> Hi!
>>
>> It seems that you want to run a batch job instead of a streaming job.
>> Call EnvironmentSettings.newInstance().inBatchMode().build() to create
>> your environment settings for a batch job.
>>
>> Qihua Yang <yang...@gmail.com> wrote on Wed, Oct 13, 2021 at 5:50 AM:
>>
>>> Hi,
>>>
>>> Sorry for asking again. I plan to use the JDBC connector to scan a
>>> database. How do I know when it is done? Are there any metrics I can
>>> track? We want to monitor the progress and stop the Flink application
>>> when it is done.
>>>
>>> Thanks,
>>> Qihua
>>>
>>> On Fri, Oct 8, 2021 at 10:07 AM Qihua Yang <yang...@gmail.com> wrote:
>>>
>>>> It is pretty clear. Thanks Caizhi!
>>>>
>>>> On Thu, Oct 7, 2021 at 7:27 PM Caizhi Weng <tsreape...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi!
>>>>>
>>>>> These configurations are not required to merely read from a database.
>>>>> They are there to accelerate reads by allowing sources to read data
>>>>> in parallel.
>>>>>
>>>>> This optimization works by dividing the data into several
>>>>> (scan.partition.num) partitions, and each partition will be read by a
>>>>> task slot (not a task manager, as a task manager may have multiple
>>>>> task slots). You can set scan.partition.column to specify the
>>>>> partition key, and also set the lower and upper bounds for the range
>>>>> of data.
>>>>>
>>>>> Let's say your partition key is the column "k", which ranges from 0
>>>>> to 999.
>>>>> If you set the lower bound to 0, the upper bound to 999 and the
>>>>> number of partitions to 10, then all data satisfying 0 <= k < 100
>>>>> will go into the first partition and be read by the first task slot,
>>>>> all data satisfying 100 <= k < 200 will go into the second partition
>>>>> and be read by the second task slot, and so on. So these
>>>>> configurations have nothing to do with the number of rows you have;
>>>>> they relate to the range of your partition key.
>>>>>
>>>>> Qihua Yang <yang...@gmail.com> wrote on Thu, Oct 7, 2021 at 7:43 AM:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am trying to read data from a database with the JDBC driver. From
>>>>>> [1], I have to configure the parameters below. I am not quite sure I
>>>>>> understand them correctly. lower-bound is the smallest value of the
>>>>>> first partition, and upper-bound is the largest value of the last
>>>>>> partition. For example, if the db table has 1000 rows, lower-bound
>>>>>> is 0 and upper-bound is 999. Is that correct? If I set
>>>>>> scan.partition.num to 10, does each partition read 100 rows? If I
>>>>>> set scan.partition.num to 10 and I have 10 task managers, will each
>>>>>> task manager pick a partition to read?
>>>>>>
>>>>>> - scan.partition.column: The column name used for partitioning the
>>>>>>   input.
>>>>>> - scan.partition.num: The number of partitions.
>>>>>> - scan.partition.lower-bound: The smallest value of the first
>>>>>>   partition.
>>>>>> - scan.partition.upper-bound: The largest value of the last
>>>>>>   partition.
>>>>>>
>>>>>> [1]
>>>>>> https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/connectors/table/jdbc/
>>>>>>
>>>>>> Thanks,
>>>>>> Qihua
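The range-splitting described above can be checked with a few lines of arithmetic. The sketch below illustrates the idea of dividing the inclusive key range [lower-bound, upper-bound] into scan.partition.num contiguous partitions; it is an illustration of the concept, not Flink's exact internal code:

```java
// Sketch of how scan.partition.lower-bound / upper-bound / num split a key
// range into partitions. Not Flink's internal implementation, just the idea.
public class PartitionRanges {

    /** Split the inclusive key range [lower, upper] into `num` contiguous partitions. */
    static long[][] ranges(long lower, long upper, int num) {
        long count = upper - lower + 1;   // size of the KEY space, not the row count
        long base = count / num;          // keys per partition
        long rem = count % num;           // first `rem` partitions get one extra key
        long[][] out = new long[num][2];
        long start = lower;
        for (int i = 0; i < num; i++) {
            long size = base + (i < rem ? 1 : 0);
            out[i][0] = start;            // inclusive start of partition i
            out[i][1] = start + size - 1; // inclusive end of partition i
            start += size;
        }
        return out;
    }

    public static void main(String[] args) {
        // Caizhi's example: k in [0, 999], 10 partitions
        // -> 0..99, 100..199, ..., 900..999
        for (long[] p : ranges(0, 999, 10)) {
            System.out.println(p[0] + ".." + p[1]);
        }
    }
}
```

Each range then corresponds to one split read by one task slot; roughly speaking, the source issues one query per partition with a range predicate on the partition column (e.g. covering k from 0 to 99 for the first split). Note that the bounds describe the key range, not the row count: if a 1000-row table had k values spread over 0 to 9999, upper-bound should be 9999, and partitions would then hold unequal numbers of rows.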