Good point Tom. We will update the KIP with the ACLs section and also the message format changes.
> On Feb 2, 2017, at 06:45, Tom Crayford <tcrayf...@heroku.com> wrote: > > I said this in the voting thread, but can the authors include a section > about new ACLs if there are going to be ACLs for TransactionalId. It's > mentioned in the google doc, but I think new ACLs should be in a KIP > directly. > >> On Thu, Feb 2, 2017 at 2:42 PM, Ismael Juma <ism...@juma.me.uk> wrote: >> >> Thanks for the responses and updates to the document, Guozhang and Jason. >> They look good. One follow-up and one additional comment: >> >> 1. I did some benchmarking and CRC32C seems to be a massive win when using >> the hardware instruction (particularly for messages larger than 65k), so >> I'm keen on taking advantage of the message format version bump to add >> support for it. I can write a separate KIP for this as it's not tied to >> Exactly-once, but it would be good to include the code change in the same >> PR that bumps the message format version. The benchmark and results can be >> found in the following link: >> https://gist.github.com/ijuma/e58ad79d489cb831b290e83b46a7d7bb. >> >> 2. The message timestamp field is 8 bytes. Did we consider storing the >> first timestamp in the message set and then storing deltas using varints in >> the messages like we do for offsets (the difference would be the usage of >> signed varints)? It seems like the deltas would be quite a bit smaller in >> the common case (potentially 0 for log append time, so we could even not >> store them at all using attributes like we do for key/value lengths). An >> alternative is using MaxTimestamp that is already present in the message >> set and computing deltas from that, but that seems more complicated. In any >> case, details aside, was this idea considered and rejected or is it worth >> exploring further? >> >> Ismael >> >> On Thu, Feb 2, 2017 at 1:02 AM, Jason Gustafson <ja...@confluent.io> >> wrote: >> >>> Ismael, >>> >>> Thanks for the comments. A few responses below: >>> >>> >>>> 2. `ProducerAppId` is a new authorization resource type. This >> introduces >>> a >>>> compatibility issue with regards to existing third-party authorizers. >> It >>>> would be good to highlight this in the migration/compatibility section. >>> >>> >>> Ack. I added a note in the migration section. >>> >>> 4. The Migration plan is relatively brief at the moment. Have we >>> considered >>>> if there's any additional work required due to KIP-97 (introduced in >>>> 0.10.2.0)? >>> >>> >>> Thanks, I added a few notes about client compatibility to the migration >>> section. I covered the main issues that come to mind, but let me know if >>> you think of others. >>> >>> 7. It seems like there is a bit of inconsistency when it comes to naming >>>> convention. For example, we have `InitPIDRequest`, `PidSnapshot` >>>> and `InvalidPidMapping`. The latter two match Kafka's naming >> conventions. >>>> There are a few other examples like that and it would be good to clean >>> them >>>> up. >>> >>> >>> Let's go with InitPidRequest for consistency. Haha, "InitPIdRequest" >> seems >>> like a compromise which satisfies no one. >>> >>> >>> -Jason >>> >>> On Wed, Feb 1, 2017 at 4:14 PM, Guozhang Wang <wangg...@gmail.com> >> wrote: >>> >>>> Ismael, thanks for your feedbacks. Replied inline. >>>> >>>>> On Wed, Feb 1, 2017 at 8:03 AM, Ismael Juma <ism...@juma.me.uk> wrote: >>>>> >>>>> Hi all, >>>>> >>>>> A few comments follow: >>>>> >>>>> 1. The document states "inter-broker communications will be increased >>> by >>>> M >>>>> * N * P round trips per sec. We need to conduct some system >> performance >>>>> test to make sure this additional inter-broker traffic would not >>> largely >>>>> impact the broker cluster". Has this testing been done? And if not, >> are >>>> we >>>>> planning to do it soon? It seems important to validate this sooner >>> rather >>>>> than later. This applies more generally too, it would be great to >>>>> understand how the new message format affects the producer with small >>>>> messages, for example. >>>>> >>>>> >>>> Yes we are conducting the perf tests with the message format changes in >>> the >>>> first stage; then the inter-broker communication with minimal >> transaction >>>> coordinator implementations in the second stage. >>>> >>>> >>>>> 2. `ProducerAppId` is a new authorization resource type. This >>> introduces >>>> a >>>>> compatibility issue with regards to existing third-party authorizers. >>> It >>>>> would be good to highlight this in the migration/compatibility >> section. >>>>> >>>>> 3. I was happy to see that default values for the new configs have >> been >>>>> added to the document since I last checked it. It would be good to >>>> explain >>>>> the motivation for the choices. >>>>> >>>>> >>>> Updated doc. >>>> >>>> >>>>> 4. The Migration plan is relatively brief at the moment. Have we >>>> considered >>>>> if there's any additional work required due to KIP-97 (introduced in >>>>> 0.10.2.0)? >>>>> >>>>> 5. transactional.id sounds good >>>>> >>>>> 6. Since we are keeping per message CRCs for auditing apps, have we >>>>> considered mitigating the performance cost by using the more >> performant >>>>> CRC32c in the new message format version? >>>>> >>>>> >>>> We have not discussed about this before. But I think it should be >> doable >>> as >>>> long as we can include the additional conversion logic in the migration >>>> plan. >>>> >>>> >>>>> Nits: >>>>> >>>>> 7. It seems like there is a bit of inconsistency when it comes to >>> naming >>>>> convention. For example, we have `InitPIDRequest`, `PidSnapshot` >>>>> and `InvalidPidMapping`. The latter two match Kafka's naming >>> conventions. >>>>> There are a few other examples like that and it would be good to >> clean >>>> them >>>>> up. >>>>> >>>>> >>>> I agree with the inconsistency issue. About the name itself though, >>> should >>>> it be "InitPIdRequest", "PIdSnapshot" "InvalidPIdMapping" though, since >>> we >>>> need to capitalize "I" right? >>>> >>>> >>>>> 8. The document states "The first four fields of a message set in >> this >>>>> format must to be the same as the existing format because any fields >>>> before >>>>> the magic byte cannot be changed in order to provide a path for >>> upgrades >>>>> following a similar approach as was used in KIP-32". This makes >> things >>>>> easier, but it seems to me that the only strict requirement is that >> the >>>>> magic byte remains in the same offset and with the same size. >>>>> >>>>> >>>> I agree theoretically it is not required, but I think in practice it is >>>> actually better to make it more restrict: the three fields before magic >>>> byte are offset, length, and crc. Among them, crc needs to be before >>> magic >>>> byte if it wants to cover the magic byte fields; length would better be >>>> before the magic byte as well for pre-allocate memory to >> deser/decompress >>>> the message set, and the only field that does not matter too much to be >>>> after magic byte is offset, but in KIP-98 we will use it as the base >>> offset >>>> for message set and some validation checks can be optimized to not scan >>>> through the whole message with this field in front of the format. >>>> >>>> >>>>> On Mon, Jan 30, 2017 at 10:37 PM, Guozhang Wang <wangg...@gmail.com> >>>>> wrote: >>>>> >>>>>> Hello Folks, >>>>>> >>>>>> We have addressed all the comments collected so far, and would like >>> to >>>>>> propose a voting thread this Wednesday. If you have any further >>>> comments >>>>> on >>>>>> this KIP, please feel free to continue sending them on this thread >>>> before >>>>>> that. >>>>>> >>>>>> >>>>>> Guozhang >>>>>> >>>>>> >>>>>> On Mon, Jan 30, 2017 at 1:10 PM, Jason Gustafson < >> ja...@confluent.io >>>> >>>>>> wrote: >>>>>> >>>>>>> +1 for transactional.id. >>>>>>> >>>>>>> -Jason >>>>>>> >>>>>>> On Mon, Jan 30, 2017 at 1:05 PM, Guozhang Wang < >> wangg...@gmail.com >>>> >>>>>> wrote: >>>>>>> >>>>>>>> If I have to choose between app.id and >> transactional.instance.id >>> , >>>>> I'd >>>>>>>> choose the latter. >>>>>>>> >>>>>>>> Renaming transactional.instance.id to transactional.id sounds >>> even >>>>>>> better. >>>>>>>> >>>>>>>> >>>>>>>> Guozhang >>>>>>>> >>>>>>>> >>>>>>>> On Mon, Jan 30, 2017 at 11:49 AM, Apurva Mehta < >>>> apu...@confluent.io> >>>>>>>> wrote: >>>>>>>> >>>>>>>>>> Bumping one suggestion from Apurva above. The name "AppID" >>> has >>>>>> caused >>>>>>>>> some >>>>>>>>>> confusion. We're considering the following renaming: >>>>>>>>>> >>>>>>>>>> 1. AppID -> ProducerId (transaction.app.id -> producer.id) >>>>>>>>>> 2. PID -> IPID (internal producer ID) >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> How about AppId -> TransactionalId (transaction.app.id -> >>>>>>>> transactional.id >>>>>>>>> ) >>>>>>>>> >>>>>>>>> This makes it clear that this id just needs to be set when >> the >>>>>>>> application >>>>>>>>> wishes to use transactions. I also think it is more intuitive >>> in >>>>> the >>>>>>>>> context of how this id is used, viz. to maintain transactions >>>>> across >>>>>>>>> producer sessions. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Apurva >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> -- Guozhang >>>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> -- Guozhang >>>>>> >>>>> >>>> >>>> >>>> >>>> -- >>>> -- Guozhang >>>> >>> >>