[ https://issues.apache.org/jira/browse/KAFKA-3730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15293589#comment-15293589 ]
Clint Hillerman commented on KAFKA-3730:
----------------------------------------

Thanks for the response.

I'm fairly certain I updated the code. I actually tried both 0.9.0.0 and 0.9.0.1. I removed the old install dir, including the bin dir where I run kafka from, stopped kafka, and replaced it with the new version. I updated the config to have `inter.broker.protocol.version=0.8.2.X` and restarted kafka. I did this on all the boxes, one at a time. Everything appeared to work fine. Then when I tried to bump the version up to 0.9.0.0 it started printing that error.

I just checked, and all of the files in my lib dirs have the 0.9.0.0 version (or 0.9.0.1 when I try that version). I only have three nodes, so it's easy to check them all and make sure.

Would having the zookeepers on the same boxes cause any trouble? Do I need to change any settings or anything on the zookeeper side?

The thing I noticed now is that the cluster says it's in sync with 4 nodes even though I only have 3. I'm taking this cluster over from someone else, so I wasn't involved with the initial setup, but I'm fairly certain there is no fourth node. In my configs I have the broker ids set to 100, 101, and 102, but when I do a describe I see a node 12. Node 12 is sometimes the leader and it's in sync. I made sure I don't have two versions of kafka running on one of the nodes or something.

Is there a way for me to check what zookeeper thinks node 12 is? Or do you have any advice on figuring out why it thinks node 12 is in sync? Could it be that at one point 102 was typed as 12 and kafka is just handling 102 and 12 the same?
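For what it's worth, here is a rough sketch of how the describe output could be scanned for broker ids that are not in my configs. It assumes the usual Leader/Replicas/Isr layout that kafka-topics.sh --describe prints; the topic name and sample lines are made up for illustration, and the function names are my own. Brokers register themselves under /brokers/ids in zookeeper, so any id this reports (12 in my case) could then be looked up there, for example with `get /brokers/ids/12` from bin/zookeeper-shell.sh.

```python
import re

# Broker ids as set in the three server.properties files described above.
EXPECTED_BROKER_IDS = {100, 101, 102}


def broker_ids_in_describe(describe_output):
    """Collect every broker id named in a Leader, Replicas, or Isr field."""
    ids = set()
    pattern = r"(?:Leader|Replicas|Isr):\s*([\d,]+)"
    for field in re.finditer(pattern, describe_output):
        for token in field.group(1).split(","):
            if token:
                ids.add(int(token))
    return ids


def unexpected_broker_ids(describe_output, expected=EXPECTED_BROKER_IDS):
    """Return the ids seen in the output that are not in the expected set."""
    return sorted(broker_ids_in_describe(describe_output) - set(expected))


# Hypothetical sample, not copied from the real cluster.
SAMPLE = (
    "Topic:events\tPartitionCount:2\tReplicationFactor:3\tConfigs:\n"
    "\tTopic: events\tPartition: 0\tLeader: 12\tReplicas: 100,101,12\tIsr: 100,101,12\n"
    "\tTopic: events\tPartition: 1\tLeader: 100\tReplicas: 101,100,102\tIsr: 100,101,102\n"
)

if __name__ == "__main__":
    print(unexpected_broker_ids(SAMPLE))
```

Running this against the real describe output instead of SAMPLE would list every id the cluster reports that none of the three configs define.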

> Problem when updating from 0.8.2 to 0.9.0
> -----------------------------------------
>
>                 Key: KAFKA-3730
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3730
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8.2.1, 0.9.0.0
>         Environment: SUSE SLE 10.3 64bit
>            Reporter: Clint Hillerman
>            Priority: Critical
>              Labels: newbie
>
> Hello,
> I'm having trouble upgrading a 3 node kafka cluster from 0.8.2.1 to 0.9.0.0.
> I have followed the steps in the upgrade guide here:
> http://kafka.apache.org/documentation.html
> Also, my zookeepers are on the same box as kafka. Each node is both a
> zookeeper and a broker.
> Here's what I did:
> On each box, one at a time, I:
> - stopped kafka.
> - replaced the code with the new version. Just removed the old kafka dir and
> untarred the new 0.9.0.0 version into its place. Note: the data dir is in a
> different location and was not deleted.
> - copied the server.properties file from the 0.8.2.1 version to the 0.9.0.0
> config dir.
> - added the "inter.broker.protocol.version=0.8.2.X" line to the
> server.properties in 0.9.0.0's config dir.
> - restarted kafka
> After I completed that process on all 3 broker/zookeeper boxes, I switched
> the version to 0.9.0.0 in the server.properties on one broker and restarted
> kafka.
> This caused an error in my server.log, about once every few seconds:
> [2016-05-18 15:00:27,956] WARN [ReplicaFetcherThread-0-12], Error in fetch
> kafka.server.ReplicaFetcherThread$FetchRequest@45597bba. Possible cause:
> org.apache.kafka.common.protocol.types.SchemaException: Error reading field
> 'responses': Error reading field 'topic': java.nio.BufferUnderflowException
> (kafka.server.ReplicaFetcherThread)
> A few other things I tried:
> Restarting zookeepers. Their status was also correct when I ran "server
> mapr-zookeeper" qstatus.
> The same process with 0.9.0.1, which gave this error instead:
> [2016-05-18 14:07:15,545] WARN [ReplicaFetcherThread-0-12], Error in fetch
> kafka.server.ReplicaFetcherThread$FetchRequest@484ad173. Possible cause:
> org.apache.kafka.common.protocol.types.SchemaException: Error reading field
> 'responses': Error reading array of size 1078124, only 176 bytes available
> (kafka.server.ReplicaFetcherThread)
> Restarting everything at once (all broker and zookeeper processes)
> Please let me know if I should provide more information or if I posted this
> in the wrong location. I'm also not sure if this is the right location to
> post bugs like this. If there is a forum or something where this is more
> appropriate, please point me in that direction.
> Thanks,
> cmhillerman

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)