Re: Supporting Collector API in Kinesis Connector

Danny Cranmer Thu, 14 Apr 2022 00:51:45 -0700

Just to clarify, the native KCL/KPL aggregation [1] handles the partition
key rebalancing for you out of the box.



[1] https://docs.aws.amazon.com/streams/latest/dev/kinesis-kpl-concepts.html#kinesis-kpl-concepts-aggretation
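
To illustrate what managing the partitioning yourself would involve, here is
a minimal Python sketch. It is not taken from the KPL or the Flink connector.
It relies on Kinesis mapping the MD5 hash of a partition key into a shard's
hash key range; the shard ids, the ranges, and the helper names (hash_key,
shard_for, group_by_shard) are made up for illustration.

```python
import hashlib
from collections import defaultdict

MAX_HASH = 2**128 - 1  # Kinesis hash keys are 128-bit integers


def hash_key(partition_key: str) -> int:
    """MD5 of the partition key, read as an unsigned 128-bit integer."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def shard_for(partition_key: str, shard_ranges: dict) -> str:
    """Return the id of the shard whose hash key range contains the key."""
    h = hash_key(partition_key)
    for shard_id, (start, end) in shard_ranges.items():
        if start <= h <= end:
            return shard_id
    raise ValueError("hash key ranges do not cover the full key space")


def group_by_shard(partition_keys: list, shard_ranges: dict) -> dict:
    """Group keys by target shard, as a sender doing its own batching would."""
    groups = defaultdict(list)
    for key in partition_keys:
        groups[shard_for(key, shard_ranges)].append(key)
    return dict(groups)


# Hypothetical stream layout before and after splitting the only shard.
one_shard = {"shard-0": (0, MAX_HASH)}
mid = MAX_HASH // 2
two_shards = {"shard-1": (0, mid), "shard-2": (mid + 1, MAX_HASH)}

keys = [f"user-{i}" for i in range(8)]
before = group_by_shard(keys, one_shard)
after = group_by_shard(keys, two_shards)
```

Any grouping computed against the old ranges (before) is stale once the
shard is split, so a sender that batches by shard has to refresh its view of
the ranges and regroup (after).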

On Thu, Apr 14, 2022 at 8:48 AM Danny Cranmer <dannycran...@apache.org>
wrote:

> Hey Blake,
>
> I am happy to take a look, but I will not have capacity until next week.
>
> The current way to achieve multiple records per PUT is to use the native
> KCL/KPL aggregation [1], which is supported by the Flink connector. A
> downside of aggregation is that the sender has to manage the partitioning
> strategy. For example, each record in your list will be sent to the same
> shard. If the sender implements grouping of records by partition key, then
> care needs to be taken during shard scaling.
>
> Thanks,
>
> [1]
> https://docs.aws.amazon.com/streams/latest/dev/kinesis-kpl-concepts.html#kinesis-kpl-concepts-aggretation
>
>
> On Tue, Apr 12, 2022 at 3:52 AM Blake Wilson <bl...@yellowpapersun.com>
> wrote:
>
>> Hello, I recently submitted a pull request to support the Collector API
>> for the Kinesis Streams Connector.
>>
>> The ability to use this API would save a great deal of shuttling bytes
>> around in multiple Flink programs I've worked on. This is because to
>> construct a stream of the desired type without Collector support, the
>> Kinesis source must emit a List[Type], and this must be flattened to a
>> Type stream.
>>
>> Because of the way Kinesis pricing works, it rarely makes sense to send
>> one value per Kinesis record. In provisioned mode, Kinesis PUTs are
>> priced in 25KB payload units, rounded up
>> (https://aws.amazon.com/kinesis/data-streams/pricing/), so records are
>> more sensibly packed with multiple values unless these values are quite
>> large. Therefore, I suspect the need to handle multiple values per
>> Kinesis record is quite common.
>>
>> The PR is located at https://github.com/apache/flink/pull/19417, and I'd
>> love to get some feedback on GitHub or here.
>>
>> Thanks!
>>
>
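
As a rough illustration of the two points in the original message (PUT
pricing in 25KB units and the List[Type] flattening step), here is a minimal
Python sketch. It is not code from the PR or from the Flink connector: the
JSON packing, the helper names, and treating 25KB as 25 * 1024 bytes are all
assumptions made for the example.

```python
import json
import math

# Assumption: one PUT payload unit is taken as 25 * 1024 bytes.
PAYLOAD_UNIT_BYTES = 25 * 1024


def put_payload_units(record_size_bytes: int) -> int:
    """Payload units for one record: its size rounded up to whole units."""
    return math.ceil(record_size_bytes / PAYLOAD_UNIT_BYTES)


def pack(values: list) -> bytes:
    """Pack several values into one record body (JSON array, hypothetical)."""
    return json.dumps(values).encode("utf-8")


def deserialize_to_list(record: bytes) -> list:
    """Without a collector: one record yields one list of values."""
    return json.loads(record)


def deserialize_with_collector(record: bytes, collect) -> None:
    """With a collector: each value is emitted individually."""
    for value in json.loads(record):
        collect(value)


values = [{"id": i} for i in range(100)]
record = pack(values)

# One small value per record costs one unit each; packed, they share a unit.
units_one_per_record = sum(put_payload_units(len(pack([v]))) for v in values)
units_packed = put_payload_units(len(record))

# List-based path: a stream of lists that needs an extra flattening step.
list_stream = [deserialize_to_list(record)]
flattened = [v for batch in list_stream for v in batch]

# Collector-based path: values arrive already flattened.
collected = []
deserialize_with_collector(record, collected.append)
```

Both paths end with the same values; the collector variant simply skips the
intermediate list stream and the separate flattening operator.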
