Jun, KAFKA-2147 doesn't seem to have a commit associated with it, so I can't cherrypick just this fix. I suggest leaving this out since there is a 0.8.2.x workaround in the JIRA.
Gwen

On Mon, Aug 17, 2015 at 5:24 PM, Jun Rao <j...@confluent.io> wrote:

> Gwen,
>
> Thanks for putting the list together.
>
> I'd recommend that we exclude the following:
> KAFKA-1702: This is for the old producer and is only a problem if there are
> some unexpected exceptions (e.g. UnknownClass).
> KAFKA-2336: Most people don't change offsets.topic.num.partitions.
> KAFKA-1724: The patch there was never committed since the fix is included in
> another jira (a much larger patch).
> KAFKA-2241: This doesn't seem to be a common problem. It only happens when the
> fetch request blocks on the broker for an extended period of time, which
> should be rare.
>
> I'd also recommend that we include the following:
> KAFKA-2147: This impacts the memory size of the purgatory and a number of
> people have experienced that. The fix is small and has been tested in
> production usage. It hasn't been committed though since the issue is
> already fixed in trunk and we weren't planning for an 0.8.2.2 release then.
>
> Thanks,
>
> Jun
>
> On Mon, Aug 17, 2015 at 2:56 PM, Gwen Shapira <g...@confluent.io> wrote:
>
> > Thanks for creating a list, Grant!
> >
> > I placed it on the wiki with a quick evaluation of the content and whether
> > it should be in 0.8.2.2:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/Proposed+patches+for+0.8.2.2
> >
> > I'm attempting to only cherrypick fixes that are both important for a large
> > number of users (or very critical to some users) and very safe (mostly
> > judged by the size of the change, but not only).
> >
> > If your favorite bugfix is missing from the list, or is there but marked
> > "No", please let us know (in this thread) what we are missing and why it is
> > both important and safe.
> > Also, if I accidentally included something you consider unsafe, speak up!
> >
> > Gwen
> >
> > On Mon, Aug 17, 2015 at 8:17 AM, Grant Henke <ghe...@cloudera.com> wrote:
> >
> > > +dev
> > >
> > > Adding dev list back in. Somehow it got dropped.
> > >
> > > On Mon, Aug 17, 2015 at 10:16 AM, Grant Henke <ghe...@cloudera.com> wrote:
> > >
> > > > Below is a list of candidate bug fix jiras marked fixed for 0.8.3. I don't
> > > > suspect all of these will (or should) make it into the release but this
> > > > should be a relatively complete list to work from:
> > > >
> > > > - KAFKA-2114 <https://issues.apache.org/jira/browse/KAFKA-2114>: Unable to change min.insync.replicas default
> > > > - KAFKA-1702 <https://issues.apache.org/jira/browse/KAFKA-1702>: Messages silently Lost by producer
> > > > - KAFKA-2012 <https://issues.apache.org/jira/browse/KAFKA-2012>: Broker should automatically handle corrupt index files
> > > > - KAFKA-2406 <https://issues.apache.org/jira/browse/KAFKA-2406>: ISR propagation should be throttled to avoid overwhelming controller.
> > > > - KAFKA-2336 <https://issues.apache.org/jira/browse/KAFKA-2336>: Changing offsets.topic.num.partitions after the offset topic is created breaks consumer group partition assignment
> > > > - KAFKA-2337 <https://issues.apache.org/jira/browse/KAFKA-2337>: Verify that metric names will not collide when creating new topics
> > > > - KAFKA-2393 <https://issues.apache.org/jira/browse/KAFKA-2393>: Correctly Handle InvalidTopicException in KafkaApis.getTopicMetadata()
> > > > - KAFKA-2189 <https://issues.apache.org/jira/browse/KAFKA-2189>: Snappy compression of message batches less efficient in 0.8.2.1
> > > > - KAFKA-2308 <https://issues.apache.org/jira/browse/KAFKA-2308>: New producer + Snappy face un-compression errors after broker restart
> > > > - KAFKA-2042 <https://issues.apache.org/jira/browse/KAFKA-2042>: New producer metadata update always get all topics.
> > > > - KAFKA-1367 <https://issues.apache.org/jira/browse/KAFKA-1367>: Broker topic metadata not kept in sync with ZooKeeper
> > > > - KAFKA-972 <https://issues.apache.org/jira/browse/KAFKA-972>: MetadataRequest returns stale list of brokers
> > > > - KAFKA-1867 <https://issues.apache.org/jira/browse/KAFKA-1867>: liveBroker list not updated on a cluster with no topics
> > > > - KAFKA-1650 <https://issues.apache.org/jira/browse/KAFKA-1650>: Mirror Maker could lose data on unclean shutdown.
> > > > - KAFKA-2009 <https://issues.apache.org/jira/browse/KAFKA-2009>: Fix UncheckedOffset.removeOffset synchronization and trace logging issue in mirror maker
> > > > - KAFKA-2407 <https://issues.apache.org/jira/browse/KAFKA-2407>: Only create a log directory when it will be used
> > > > - KAFKA-2327 <https://issues.apache.org/jira/browse/KAFKA-2327>: broker doesn't start if config defines advertised.host but not advertised.port
> > > > - KAFKA-1788: producer record can stay in RecordAccumulator forever if leader is no available
> > > > - KAFKA-2234 <https://issues.apache.org/jira/browse/KAFKA-2234>: Partition reassignment of a nonexistent topic prevents future reassignments
> > > > - KAFKA-2096 <https://issues.apache.org/jira/browse/KAFKA-2096>: Enable keepalive socket option for broker to prevent socket leak
> > > > - KAFKA-1057 <https://issues.apache.org/jira/browse/KAFKA-1057>: Trim whitespaces from user specified configs
> > > > - KAFKA-1641 <https://issues.apache.org/jira/browse/KAFKA-1641>: Log cleaner exits if last cleaned offset is lower than earliest offset
> > > > - KAFKA-1648 <https://issues.apache.org/jira/browse/KAFKA-1648>: Round robin consumer balance throws an NPE when there are no topics
> > > > - KAFKA-1724 <https://issues.apache.org/jira/browse/KAFKA-1724>: Errors after reboot in single node setup
> > > > - KAFKA-1758 <https://issues.apache.org/jira/browse/KAFKA-1758>: corrupt recovery file prevents startup
> > > > - KAFKA-1866 <https://issues.apache.org/jira/browse/KAFKA-1866>: LogStartOffset gauge throws exceptions after log.delete()
> > > > - KAFKA-1883 <https://issues.apache.org/jira/browse/KAFKA-1883>: NullPointerException in RequestSendThread
> > > > - KAFKA-1896 <https://issues.apache.org/jira/browse/KAFKA-1896>: Record size funcition of record in mirror maker hit NPE when the message value is null.
> > > > - KAFKA-2101 <https://issues.apache.org/jira/browse/KAFKA-2101>: Metric metadata-age is reset on a failed update
> > > > - KAFKA-2112 <https://issues.apache.org/jira/browse/KAFKA-2112>: make overflowWheel volatile
> > > > - KAFKA-2117 <https://issues.apache.org/jira/browse/KAFKA-2117>: OffsetManager uses incorrect field for metadata
> > > > - KAFKA-2164 <https://issues.apache.org/jira/browse/KAFKA-2164>: ReplicaFetcherThread: suspicious log message on reset offset
> > > > - KAFKA-1668 <https://issues.apache.org/jira/browse/KAFKA-1668>: TopicCommand doesn't warn if --topic argument doesn't match any topics
> > > > - KAFKA-2198 <https://issues.apache.org/jira/browse/KAFKA-2198>: kafka-topics.sh exits with 0 status on failures
> > > > - KAFKA-2235 <https://issues.apache.org/jira/browse/KAFKA-2235>: LogCleaner offset map overflow
> > > > - KAFKA-2241 <https://issues.apache.org/jira/browse/KAFKA-2241>: AbstractFetcherThread.shutdown() should not block on ReadableByteChannel.read(buffer)
> > > > - KAFKA-2272 <https://issues.apache.org/jira/browse/KAFKA-2272>: listeners endpoint parsing fails if the hostname has capital letter
> > > > - KAFKA-2345 <https://issues.apache.org/jira/browse/KAFKA-2345>: Attempt to delete a topic already marked for deletion throws ZkNodeExistsException
> > > > - KAFKA-2353 <https://issues.apache.org/jira/browse/KAFKA-2353>: SocketServer.Processor should catch exception and close the socket properly in configureNewConnections.
> > > > - KAFKA-1836 <https://issues.apache.org/jira/browse/KAFKA-1836>: metadata.fetch.timeout.ms set to zero blocks forever
> > > > - KAFKA-2317 <https://issues.apache.org/jira/browse/KAFKA-2317>: De-register isrChangeNotificationListener on controller resignation
> > > >
> > > > Note: KAFKA-2120 <https://issues.apache.org/jira/browse/KAFKA-2120> &
> > > > KAFKA-2421 <https://issues.apache.org/jira/browse/KAFKA-2421> were
> > > > mentioned in previous emails, but are not in the list because they are not
> > > > committed yet.
> > > >
> > > > Hope that helps the effort.
> > > >
> > > > Thanks,
> > > > Grant
> > > >
> > > > On Mon, Aug 17, 2015 at 12:09 AM, Grant Henke <ghe...@cloudera.com> wrote:
> > > >
> > > >> +1 to that suggestion. Though I suspect that requires a committer to do.
> > > >> Making it part of the standard commit process could work too.
> > > >> On Aug 16, 2015 11:01 PM, "Gwen Shapira" <g...@confluent.io> wrote:
> > > >>
> > > >>> BTW. I think it will be great for Apache Kafka to have a 0.8.2 "release
> > > >>> manager" whose role is to cherrypick low-risk bug-fixes into the 0.8.2
> > > >>> branch and once enough bug fixes happened (or if sufficiently critical
> > > >>> fixes happened) to roll out a new maintenance release (with every 3 months
> > > >>> as a reasonable bugfix release target).
> > > >>>
> > > >>> This will add some predictability regarding how fast we release fixes for
> > > >>> bugs.
> > > >>>
> > > >>> Gwen
> > > >>>
> > > >>> On Sun, Aug 16, 2015 at 8:09 PM, Jeff Holoman <jholo...@cloudera.com> wrote:
> > > >>>
> > > >>> > +1 for the release and also including
> > > >>> >
> > > >>> > https://issues.apache.org/jira/browse/KAFKA-2114
> > > >>> >
> > > >>> > Thanks
> > > >>> >
> > > >>> > Jeff
> > > >>> >
> > > >>> > On Sun, Aug 16, 2015 at 2:51 PM, Stevo Slavić <ssla...@gmail.com> wrote:
> > > >>> >
> > > >>> > > +1 (non-binding) for 0.8.2.2 release
> > > >>> > >
> > > >>> > > Would be nice to include in that release new producer resiliency bug fixes
> > > >>> > > https://issues.apache.org/jira/browse/KAFKA-1788 and
> > > >>> > > https://issues.apache.org/jira/browse/KAFKA-2120
> > > >>> > >
> > > >>> > > On Fri, Aug 14, 2015 at 4:03 PM, Gwen Shapira <g...@confluent.io> wrote:
> > > >>> > >
> > > >>> > > > Will be nice to include Kafka-2308 and fix two critical snappy issues in
> > > >>> > > > the maintenance release.
> > > >>> > > >
> > > >>> > > > Gwen
> > > >>> > > > On Aug 14, 2015 6:16 AM, "Grant Henke" <ghe...@cloudera.com> wrote:
> > > >>> > > >
> > > >>> > > > > Just to clarify. Will KAFKA-2189 be the only patch in the release?
> > > >>> > > > >
> > > >>> > > > > On Fri, Aug 14, 2015 at 7:35 AM, Manikumar Reddy <ku...@nmsworks.co.in> wrote:
> > > >>> > > > >
> > > >>> > > > > > +1 for 0.8.2.2 release
> > > >>> > > > > >
> > > >>> > > > > > On Fri, Aug 14, 2015 at 5:49 PM, Ismael Juma <ism...@juma.me.uk> wrote:
> > > >>> > > > > >
> > > >>> > > > > > > I think this is a good idea as the change is minimal on our side and it
> > > >>> > > > > > > has been tested in production for some time by the reporter.
> > > >>> > > > > > >
> > > >>> > > > > > > Best,
> > > >>> > > > > > > Ismael
> > > >>> > > > > > >
> > > >>> > > > > > > On Fri, Aug 14, 2015 at 1:15 PM, Jun Rao <j...@confluent.io> wrote:
> > > >>> > > > > > >
> > > >>> > > > > > > > Hi, Everyone,
> > > >>> > > > > > > >
> > > >>> > > > > > > > Since the release of Kafka 0.8.2.1, a number of people have reported an
> > > >>> > > > > > > > issue with snappy compression (https://issues.apache.org/jira/browse/KAFKA-2189).
> > > >>> > > > > > > > Basically, if they use snappy in 0.8.2.1, they will experience a 2-3X space
> > > >>> > > > > > > > increase. The issue has since been fixed in trunk (just a snappy jar upgrade).
> > > >>> > > > > > > > Since 0.8.3 is still a few months away, it may make sense to do an 0.8.2.2
> > > >>> > > > > > > > release just to fix this issue. Any objections?
> > > >>> > > > > > > >
> > > >>> > > > > > > > Thanks,
> > > >>> > > > > > > >
> > > >>> > > > > > > > Jun
> > > >>> > > > >
> > > >>> > > > > --
> > > >>> > > > > Grant Henke
> > > >>> > > > > Software Engineer | Cloudera
> > > >>> > > > > gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
> > > >>> >
> > > >>> > --
> > > >>> > Jeff Holoman
> > > >>> > Systems Engineer