Re: Flink custom source

Anil Dasari Sun, 06 Oct 2024 08:48:41 -0700

Hi Ahmed,
Thanks for the response.
This is the part that I find unclear in the documentation and FLIP-27. The
actual split assignment happens in the SplitEnumerator#handleSplitRequest
method and not in SplitEnumerator#start. Both KafkaSource and JdbcSource
use context.callAsync to identify new splits in SplitEnumerator#start. From
what I understand, split assignment does not need to occur in the start
method. SplitEmulator assigns the assigned splits to the defined reader in
the source.


SplitEnumerator#handleSplitRequest calls decided by parallelism ? i think,
yes.

Kafka and Jdbc Source split emulators a bit complex due to the nature of
the auto discovery required for the given configuration. Here is the simple
split emulator from flink repo
https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/connector/source/lib/util/IteratorSourceEnumerator.java#L58
where split assignment happen in handleRequest.

Let me know if you have any questions.

Thanks.

On Sun, Oct 6, 2024 at 6:44 AM Ahmed Hamdy <hamdy10...@gmail.com> wrote:

> Hi Anil
> I am glad the new (yet to be only) source API is getting attention,
> according to the execution model the assignments of splits is the
> responsibility of the enumerator and it seems that the enumerator is not
> assigning the readers any splits.
> Check kafka source for reference[1]
> 1-
> https://github.com/apache/flink-connector-kafka/blob/main/flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/source/enumerator/KafkaSourceEnumerator.java#L286
> <https://github.com/apache/flink-connector-kafka/blob/main/flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/source/enumerator/KafkaSourceEnumerator.java#L286>
> Best Regards
> Ahmed Hamdy
>
>
> On Sun, 6 Oct 2024 at 06:41, Anil Dasari <adas...@guidewire.com> wrote:
>
>> Hello,
>> I have implemented a custom source that reads tables in parallel, with
>> each split corresponding to a table and custom source implementation can be
>> found here -
>> https://github.com/adasari/mastering-flink/blob/main/app/src/main/java/org/example/paralleljdbc/DatabaseSource.java
>> <https://github.com/adasari/mastering-flink/blob/main/app/src/main/java/org/example/paralleljdbc/DatabaseSource.java>
>>
>> However, it seems the source splits are not being scheduled and data is
>> not being read from the tables. Can someone help me identify the issue in
>> the implementation?
>>
>> Thanks
>>
>>

Re: Flink custom source

Reply via email to