Thanks Carl. Always fun to do this stuff in production... ;)

Appreciate the input. I'll try a full cycle and see how that works. In your
opinion, if I stop all brokers and all Zookeeper nodes, then restart all
Zookeepers...at that point can I start both brokers at the same time, or
should I let one broker fully start and read all the unflushed segments from
disk before starting the second broker?

Again, many thanks.
Chris

On Fri, Jul 21, 2017 at 12:13 PM, Carl Haferd <chaf...@groupon.com.invalid> wrote:

> I have encountered similar difficulties in a test environment and it may be
> necessary to stop the Kafka process on each broker and take Zookeeper
> offline before removing the files and zookeeper paths. Otherwise there may
> be a race condition between brokers which could cause the cluster to retain
> information for the topic.
>
> Carl
>
> On Fri, Jul 21, 2017 at 9:06 AM, Chris Neal <cwn...@gmail.com> wrote:
>
> > Welp. Surprisingly, that did not fix the problem. :(
> >
> > I cleaned out all the entries for these topics from /config/topics, and
> > removed the logs from the file system for those topics, and the messages
> > are still flying by in the server.log file.
> >
> > Also, more concerning, when I was looking through the log files for the
> > other broker in the cluster, I noticed the same type of message for a
> > topic that should actually be there:
> >
> > [2017-07-21 16:03:29,140] ERROR Conditional update of path /brokers/topics/perf_dstorage_raw/partitions/4/state with data {"controller_epoch":34,"leader":0,"version":1,"leader_epoch":0,"isr":[0]} and expected version 0 failed due to org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /brokers/topics/perf_dstorage_raw/partitions/4/state (kafka.utils.ZkUtils$)
> > [2017-07-21 16:03:29,142] ERROR Conditional update of path /brokers/topics/perf_dstorage_raw/partitions/0/state with data {"controller_epoch":34,"leader":0,"version":1,"leader_epoch":0,"isr":[0]} and expected version 0 failed due to org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /brokers/topics/perf_dstorage_raw/partitions/0/state (kafka.utils.ZkUtils$)
> > [2017-07-21 16:03:29,142] ERROR Conditional update of path /brokers/topics/perf_dstorage_raw/partitions/0/state with data {"controller_epoch":34,"leader":0,"version":1,"leader_epoch":0,"isr":[0]} and expected version 0 failed due to org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /brokers/topics/perf_dstorage_raw/partitions/0/state (kafka.utils.ZkUtils$)
> >
> > So, the issue is not isolated to just these "should-have-been-removed"
> > topics, unfortunately.
> >
> > Really appreciate the input so far everyone. Still looking though for a
> > solution. Many thanks. :)
> >
> > Chris
> >
> > On Fri, Jul 21, 2017 at 10:58 AM, M. Manna <manme...@gmail.com> wrote:
> >
> > > Just to add (in case the platform is Windows)
> > >
> > > For Windows based cluster implementation, log/topic cleanup doesn't work
> > > out of the box.
> > > Users are more or less aware of it, and doing their own
> > > maintenance as workaround.
> > > If you have issues on Topic deletion not working properly on Windows
> > > (i.e. with topic deletion enabled and all other settings), then you have
> > > to manually delete the files.
> > >
> > >
> > > On 21 July 2017 at 16:53, Chris Neal <cwn...@gmail.com> wrote:
> > >
> > > > @Carl,
> > > >
> > > > There is nothing under /admin/delete_topics other than
> > > >
> > > > []
> > > >
> > > > And nothing under /admin other than delete_topics :)
> > > >
> > > > The topics DO exist, however, under /config/topics! We may be on to
> > > > something. I will remove them here and see if that clears it up.
> > > >
> > > > Thanks so much for all the help!
> > > > Chris
> > > >
> > > > On Thu, Jul 20, 2017 at 10:37 PM, Chris Neal <cwn...@gmail.com> wrote:
> > > >
> > > > > Thanks again for the replies. VERY much appreciated. I'll check both
> > > > > /admin/delete_topics and /config/topics.
> > > > >
> > > > > Chris
> > > > >
> > > > > On Thu, Jul 20, 2017 at 9:22 PM, Carl Haferd <chaf...@groupon.com.invalid>
> > > > > wrote:
> > > > >
> > > > >> If delete normally works, there would hopefully be some log entries
> > > > >> when it fails. Are there any unusual zookeeper entries in the
> > > > >> /admin/delete_topics path or in the other /admin folders?
> > > > >>
> > > > >> Does the topic name still exist in zookeeper under /config/topics? If
> > > > >> so, that should probably be deleted as well.
> > > > >>
> > > > >> Carl
> > > > >>
> > > > >> On Thu, Jul 20, 2017 at 6:42 PM, Chris Neal <cwn...@gmail.com> wrote:
> > > > >>
> > > > >> > Delete is definitely there.
> > > > >> > The delete worked fine, based on the fact that there is nothing in
> > > > >> > Zookeeper, and that the controller reported that the delete was
> > > > >> > successful; it's just that something seems to have gotten out of sync.
> > > > >> >
> > > > >> > delete.topic.enable is true. I've successfully deleted topics in the
> > > > >> > past, so I know it *should* work. :)
> > > > >> >
> > > > >> > I also had already checked in Zookeeper, and there is no directory for
> > > > >> > the topics under /brokers/topics.... Very strange indeed.
> > > > >> >
> > > > >> > If I just remove the log directories from the filesystem, is that
> > > > >> > enough to get the broker to stop asking about the topics? I would guess
> > > > >> > there would need to be more than just that, but I could be wrong.
> > > > >> >
> > > > >> > Thanks guys for the suggestions though!
> > > > >> >
> > > > >> > On Thu, Jul 20, 2017 at 8:19 PM, Stephen Powis <spo...@salesforce.com>
> > > > >> > wrote:
> > > > >> >
> > > > >> > > I could be totally wrong, but I seem to recall that delete wasn't
> > > > >> > > fully implemented in 0.8.x?
> > > > >> > >
> > > > >> > > On Fri, Jul 21, 2017 at 10:10 AM, Carl Haferd
> > > > >> > > <chaf...@groupon.com.invalid> wrote:
> > > > >> > >
> > > > >> > > > Chris,
> > > > >> > > >
> > > > >> > > > You could first check to make sure that delete.topic.enable is true
> > > > >> > > > and try deleting again if not. If that doesn't work with 0.8.1.1 you
> > > > >> > > > might need to manually remove the topic's log files from the
> > > > >> > > > configured log.dirs folder on each broker in addition to removing the
> > > > >> > > > topic's zookeeper path.
> > > > >> > > >
> > > > >> > > > Carl
> > > > >> > > >
> > > > >> > > > On Thu, Jul 20, 2017 at 10:06 AM, Chris Neal <cwn...@gmail.com> wrote:
> > > > >> > > >
> > > > >> > > > > Hi all,
> > > > >> > > > >
> > > > >> > > > > I have a weird situation here. I have deleted a few topics on my
> > > > >> > > > > 0.8.1.1 cluster (old, I know...). The deletes succeeded according to
> > > > >> > > > > the controller.log:
> > > > >> > > > >
> > > > >> > > > > [2017-07-20 16:40:31,175] INFO [TopicChangeListener on Controller 1]: New topics: [Set()], deleted topics: [Set(perf_doorway-supplier-adapter-uat_raw)], new partition replica assignment [Map()] (kafka.controller.PartitionStateMachine$TopicChangeListener)
> > > > >> > > > > [2017-07-20 16:40:33,507] INFO [TopicChangeListener on Controller 1]: New topics: [Set()], deleted topics: [Set(perf_doorway-supplier-scheduler-uat_raw)], new partition replica assignment [Map()] (kafka.controller.PartitionStateMachine$TopicChangeListener)
> > > > >> > > > > [2017-07-20 16:40:36,504] INFO [TopicChangeListener on Controller 1]: New topics: [Set()], deleted topics: [Set(perf_gocontent-uat_raw)], new partition replica assignment [Map()] (kafka.controller.PartitionStateMachine$TopicChangeListener)
> > > > >> > > > > [2017-07-20 16:40:38,290] INFO [TopicChangeListener on Controller 1]: New topics: [Set()], deleted topics: [Set(perf_goplatform-uat_raw)], new partition replica assignment [Map()] (kafka.controller.PartitionStateMachine$TopicChangeListener)
> > > > >> > > > >
> > > > >> > > > > When I query Zookeeper, the path is not there under /brokers/topics
> > > > >> > > > > either.
> > > > >> > > > >
> > > > >> > > > > But, one of the nodes in my cluster continues to try and use them:
> > > > >> > > > >
> > > > >> > > > > [2017-07-20 17:04:36,723] ERROR Conditional update of path /brokers/topics/perf_doorway-supplier-scheduler-uat_raw/partitions/3/state with data {"controller_epoch":34,"leader":1,"version":1,"leader_epoch":2,"isr":[1,0]} and expected version 69 failed due to org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /brokers/topics/perf_doorway-supplier-scheduler-uat_raw/partitions/3/state (kafka.utils.ZkUtils$)
> > > > >> > > > > [2017-07-20 17:04:36,723] INFO Partition [perf_doorway-supplier-scheduler-uat_raw,3] on broker 1: Cached zkVersion [69] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
> > > > >> > > > > [2017-07-20 17:04:36,723] INFO Partition [perf_doorway-supplier-scheduler-uat_raw,3] on broker 1: Cached zkVersion [69] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
> > > > >> > > > > [2017-07-20 17:04:36,764] INFO Partition [perf_goplatform-uat_raw,2] on broker 1: Shrinking ISR for partition [perf_goplatform-uat_raw,2] from 1,0 to 1 (kafka.cluster.Partition)
> > > > >> > > > > [2017-07-20 17:04:36,764] INFO Partition [perf_goplatform-uat_raw,2] on broker 1: Shrinking ISR for partition [perf_goplatform-uat_raw,2] from 1,0 to 1 (kafka.cluster.Partition)
> > > > >> > > > > [2017-07-20 17:04:36,765] ERROR Conditional update of path /brokers/topics/perf_goplatform-uat_raw/partitions/2/state with data {"controller_epoch":34,"leader":1,"version":1,"leader_epoch":2,"isr":[1]} and expected version 70 failed due to org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /brokers/topics/perf_goplatform-uat_raw/partitions/2/state (kafka.utils.ZkUtils$)
> > > > >> > > > > [2017-07-20 17:04:36,765] ERROR Conditional update of path /brokers/topics/perf_goplatform-uat_raw/partitions/2/state with data {"controller_epoch":34,"leader":1,"version":1,"leader_epoch":2,"isr":[1]} and expected version 70 failed due to org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /brokers/topics/perf_goplatform-uat_raw/partitions/2/state (kafka.utils.ZkUtils$)
> > > > >> > > > > [2017-07-20 17:04:36,765] INFO Partition [perf_goplatform-uat_raw,2] on broker 1: Cached zkVersion [70] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
> > > > >> > > > > [2017-07-20 17:04:36,765] INFO Partition [perf_goplatform-uat_raw,2] on broker 1: Cached zkVersion [70] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
> > > > >> > > > > [2017-07-20 17:04:36,981] INFO Partition [perf_gocontent-uat_raw,1] on broker 1: Shrinking ISR for partition [perf_gocontent-uat_raw,1] from 1,0 to 1 (kafka.cluster.Partition)
> > > > >> > > > > [2017-07-20 17:04:36,981] INFO Partition [perf_gocontent-uat_raw,1] on broker 1: Shrinking ISR for partition [perf_gocontent-uat_raw,1] from 1,0 to 1 (kafka.cluster.Partition)
> > > > >> > > > > [2017-07-20 17:04:36,988] ERROR Conditional update of path /brokers/topics/perf_gocontent-uat_raw/partitions/1/state with data {"controller_epoch":34,"leader":1,"version":1,"leader_epoch":4,"isr":[1]} and expected version 90 failed due to org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /brokers/topics/perf_gocontent-uat_raw/partitions/1/state (kafka.utils.ZkUtils$)
> > > > >> > > > > [2017-07-20 17:04:36,988] ERROR Conditional update of path /brokers/topics/perf_gocontent-uat_raw/partitions/1/state with data {"controller_epoch":34,"leader":1,"version":1,"leader_epoch":4,"isr":[1]} and expected version 90 failed due to org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /brokers/topics/perf_gocontent-uat_raw/partitions/1/state (kafka.utils.ZkUtils$)
> > > > >> > > > > [2017-07-20 17:04:36,988] INFO Partition [perf_gocontent-uat_raw,1] on broker 1: Cached zkVersion [90] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
> > > > >> > > > > [2017-07-20 17:04:36,988] INFO Partition [perf_gocontent-uat_raw,1] on broker 1: Cached zkVersion [90] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
> > > > >> > > > >
> > > > >> > > > > I've tried a rolling restart of the cluster to see if that fixed it,
> > > > >> > > > > but it did not.
> > > > >> > > > >
> > > > >> > > > > Can someone please help me out here? I'm not sure how I can get
> > > > >> > > > > things back in sync.
> > > > >> > > > >
> > > > >> > > > > Thank you so much for your time.
> > > > >> > > > > Chris
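
For the manual cleanup discussed in this thread (removing a deleted topic's log files from the configured log.dirs folder on each broker), here is a minimal sketch of how one might first list the leftover partition directories. It assumes the usual Kafka layout where each partition lives in a directory named `<topic>-<partition>` directly under a log directory. The function name `find_leftover_partition_dirs` is hypothetical, not part of any Kafka tooling, and the sketch only lists directories; it deletes nothing.

```python
import os


def find_leftover_partition_dirs(log_dir, deleted_topics):
    """List partition directories under log_dir that belong to deleted topics.

    Assumption: partition directories are named "<topic>-<partition>",
    where <partition> is a non-negative integer.
    """
    leftovers = []
    for name in sorted(os.listdir(log_dir)):
        path = os.path.join(log_dir, name)
        if not os.path.isdir(path):
            continue  # skip checkpoint files and other non-directories
        # Topic names may themselves contain hyphens, so split on the last one.
        topic, sep, partition = name.rpartition("-")
        if sep and partition.isdigit() and topic in deleted_topics:
            leftovers.append(path)
    return leftovers
```

Calling it with one broker's log directory and a set such as `{"perf_gocontent-uat_raw", "perf_goplatform-uat_raw"}` returns the matching paths for review before anything is removed by hand, ideally while the broker is stopped as Carl suggests.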