Hi, Ismael, Thanks for the KIP.
Regarding log.record.version.force.upgrade: 1. Is that a topic level or a broker level config? The slight benefit of topic level is that it allows people to upgrade incrementally. The potential downside is the additional complexity associated with it. 2. When log.record.version.force.upgrade is set to 2, do we need to scan all batches in all existing segments during broker restart to verify the record format? Since this is rarely needed, the performance impact may not be a big concern. However, it will probably be useful to document this in the KIP. 3. Related to 2, we need a way to know when the upgrading to v2 message format has completed. This could be as simple as some logging. It would be useful to document this in the KIP. 4. Related to 2, once the upgrade to v2 message format has completed, do we expect the user to set log.record.version.force.upgrade to null? Otherwise, we will pay the batch scanning overhead on every broker restart. It would be useful to document the upgrade process related to this. Thanks, Jun On Tue, Jun 8, 2021 at 8:21 AM Ismael Juma <ism...@juma.me.uk> wrote: > Hi Jason, > > Thanks for the thoughtful comments. Responses below. > > 1. Good point. I updated the KIP to up-convert during replication if the > IBP is 3.0. I am still not 100% sure if this is truly required, but it does > seem like a good idea to avoid flip-flopping like this. I'll follow up with > you to understand if this is a correctness requirement or something that > makes the system easier to reason about and then update the KIP and this > thread. > > 2. Agreed. I updated the KIP to mention that we will up-convert to single > record batches for the uncompressed case. > > 3. I added a note regarding produce request versions we will no longer > support. I originally did not include this since the broker doesn't seem to > do the validation for produce requests while it does for fetch requests. > But from a specification perspective, it makes sense to include it since > the current behavior is confusing (it may be this way for compatibility > with now very old clients). With regards to list offsets v0, I suggest we > do that via a separate KIP. As you said, there might be other similar > opportunities and since the target would be AK 4.0, we have a bit of time > to do the analysis. > > Ismael > > On Fri, Jun 4, 2021 at 2:10 PM Jason Gustafson <ja...@confluent.io.invalid > > > wrote: > > > Hi Ismael, > > > > I agree it would be awesome to drop support for the old formats! A few > > comments below: > > > > 1. Suppose that a partition with 3 replicas begins at v0. One broker is > > upgraded to v2 with the new "force upgrade" config as part of a rolling > > restart, which leaves two replicas on v0 and one on v2. The KIP documents > > that replicas will take the format from the leader without any > conversion, > > so I'm a little concerned that leader changes in this state could result > in > > the version flipping from v2 to v0. Could that happen? Say, for example, > if > > the third replica was catching up while the rolling restart was in > > progress. Since it is probably an unlikely case in practice, would it > make > > sense to allow up-conversion during replication when "force upgrade" has > > been enabled? Not sure I see any great alternatives to this. > > > > 2. A related question concerns batching when upgrading from the > non-batched > > formats (i.e. v0 and v1 when no compression is in use). I suspect the > only > > reasonable way to do this is to treat each individual record as a batch > of > > 1 during the conversion process. Otherwise, I am not sure we can come up > > with a deterministic approach to re-batching which can be applied > > consistently on all replicas (especially taking into account compaction). > > The reason this is important is that replication is aligned by batches, > so > > if the batches are not consistent, the replication won't work. The only > > problem with using single-record batches is that the v2 message format > has > > some additional overhead for the batch itself when compared with the v0 > and > > v1 individual record format. So the up-conversion process would likely > > increase storage. I don't think this is a dealbreaker, but probably worth > > calling out somewhere. > > > > 3. In 4.0, we are dropping support for Fetch versions 0 through 3. I > might > > have missed it in the KIP, but it sounds like we need a similar > restriction > > on Produce versions since up-conversion will also not be supported? Would > > it be reasonable to drop ListOffsets v0 at the same time? It is not > related > > to the message format, but the Fetch restriction will already effectively > > kill any client that might still be using this wonky version. Not sure if > > there are any similar opportunities, but might be worth checking. > > > > Thanks, > > Jason > > > > > > > > On Fri, Jun 4, 2021 at 6:14 AM Ismael Juma <ism...@juma.me.uk> wrote: > > > > > If there are no comments or objections, I will start a vote on this > soon. > > > > > > On Tue, Jun 1, 2021 at 7:59 AM Ismael Juma <ism...@juma.me.uk> wrote: > > > > > > > Hi all, > > > > > > > > It's time to start the process of sunsetting message formats v0 and > v1 > > in > > > > order to establish a new baseline in terms of supported client/broker > > > > behavior and to improve maintainability & supportability of Kafka. > > Please > > > > take a look at the proposal: > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-724%3A+Drop+support+for+message+formats+v0+and+v1 > > > > > > > > Ismael > > > > > > > > > >