Thanks for updating the FLIP Danny. Changes look good to me. Please feel free to proceed with a vote soon.
On Tue, Jun 30, 2020 at 11:19 PM Cranmer, Danny <cranm...@amazon.com.invalid> wrote: > Hey Gordon, > > I have updated the FLIP [1] to include support for configurable > registration strategies: > - Added 2 additional configuration keys > - Added Registration/De-registration Configuration section > - Updated Stream Consumer Registration/Tear Down section > - Remove rejected alternative (since we are now optionally supporting it) > > A slight tweak. I have added a third option, "none". This will disable > registration/de-registration and allow the user to directly pass in the > ConsumerARN. This option adds overhead/complexity to the user configuration > but it will remove all start-up and teardown AWS SDK calls. > > Let me know if you are happy to proceed to a vote or have any further > feedback. > > Thanks, > Danny Cranmer > > [1] > https://cwiki.apache.org/confluence/display/FLINK/FLIP-128%3A+Enhanced+Fan+Out+for+AWS+Kinesis+Consumers > > On 29/06/2020, 15:28, "Cranmer, Danny" <cranm...@amazon.com.INVALID> > wrote: > > Hey Gordon, > > Thank-you for you review and feedback. > > I agree with your suggestion for the contribution plan. I have updated > the FLIP to include the additional step with the FanOutKinesisProxy/AWS SDK > 2.x dependency. I have also added another precursor step to generally > improve test coverage. I have been playing around in the connector code and > discovered that restarting consumption from an aggregated record is not > covered by unit/integration tests. I have written some additional tests > with simulated Kinesis behaviour, pushing these tests in advance would > increase confidence in the next contribution step. > > Regarding the consumer de-/registration. I had not considered making > it optional via configuration as you propose. I agree with your observation > on the additional configuration complexity and it would also expose > internal implementation details to the user. That being said, if a user > application has a very high parallelism they could quite easily exceed the > ListStreamConsumers quota (5 TPS [1]), and increase the application > start-up time substantially with back-off delays. The client could > conditionally (based on config) register the stream consumer and add the > ConsumerARN(s) to the consumer properties, eliminating the required > parallel calls to ListStreamConsumers by the Flink tasks. I will update the > FLIP to include this change, and reply to this thread once it is done. > > [1] > https://docs.aws.amazon.com/kinesis/latest/APIReference/API_ListStreamConsumers.html > > Thanks, > Danny, > > On 29/06/2020, 05:19, "Tzu-Li (Gordon) Tai" <tzuli...@apache.org> > wrote: > > CAUTION: This email originated from outside of the organization. > Do not click links or open attachments unless you can confirm the sender > and know the content is safe. > > > > Also, if it wasn't clear, I'll be happy to provide committer > support on > reviewing and merging this FLIP, if it gets approved :) > > On Mon, Jun 29, 2020 at 12:12 PM Tzu-Li (Gordon) Tai < > tzuli...@apache.org> > wrote: > > > Hi Cranmer, > > > > Thank you for proposing the feature and starting the discussion > thread. > > This is really great work! > > > > Overall, +1 to adding EFO support to the Kinesis connector. > > I can see that having a dedicated throughput quota for each > consuming > > Flink application is definitely a requirement for AWS users. > > In the past, we worked around this by using adaptive polling to > avoid > > exceeding the quotas with multiple consumers, this would > probably go away > > with this implemented. > > > > There are a few things I like about the current proposal: > > - EFO is an opt-in feature for now. Once we decide to > reimplement the > > Kinesis connector on top of the new source interface (FLIP-27), > we can > > probably consider enabling EFO by default to match the defaults > of the > > higher-level KCL library. > > - From the design, it seems like the changes can indeed be fairly > > consolidated in the Kinesis connector. The change should be > fairly safe as > > well, since we're essentially abstracting record publishing > concerns only, > > which is transparent to the exactly-once semantics / watermark > components. > > > > Concerning competing stream consumer de-/registration: > > this would most likely go away with the new source interface, > where this > > can be done on the source split enumerator. > > I'm personally okay with the proposed strategy of competing with > backoff. > > As food for thought, have you considered an opt-in / opt-out, > where the > > user knows that the client has access to KDS, and can choose to > register > > once only on the client side instead of competing in TMs? > > I'm not sure if this is worth the extra configuration complexity > though. > > > > For the concrete next steps for implementation after this FLIP > has passed: > > From the FLIP I can conclude that implementation wise, this > would come in > > a few steps - > > 1. Abstract away record publishing behind a new interface. > Initially we > > only have one implementation (the current polling mechanism). > > 2. Add in AWS SDK 2.x dependency, with a new FanOutKinesisProxy > > implementation. > > 3. Add FanOutRecordPublisher to finalize EFO support. > > > > I think step 2. wasn't explicitly mentioned in the FLIP, but I > strongly > > suggest to consolidate that step as a single PR, as we would > need to do a > > license check for dependency changes and it would be nice to > move forward > > with that with at least interference of code changes as possible. > > > > Pushing this FLIP forward for approval: > > Since this FLIP is a fairly consolidated change, we should be > safe to > > proceed with a vote soon. > > That usually happens in a separate vote thread, linking to this > discussion > > thread. > > > > cc'ing Thomas Weise as well, as he has also worked substantially > on the > > Kinesis connector in the past. > > > > Cheers, > > Gordon > > > > On Tue, Jun 23, 2020 at 6:02 PM hey_wxl < > xiaolong.w...@smartnews.com> > > wrote: > > > >> Hi, Cranmer. > >> > >> I'm Roland Wang. I've read the FLIP you wrote, and agree > with your > >> design. > >> Recently, I'm working on this feature too, and have made > some progress: > >> > >> 1. I add two methods: getOrRegisterConsumer & > subscribeToShard on > >> KinesisProxyInterface. > >> 2. I re-implement the KinesisProxy using AWS SDK V2.x. > >> 3. I use the new KinesisProxy to implement ShardConsumer. > >> > >> Though my design is not fully considered, I hope we can > discuss a > >> little > >> bit about this feature. I wish to make some contribution to the > community. > >> > >> > >> > >> > >> -- > >> Sent from: > >> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ > >> > > > > >