Hi,

We got the error below in our logs, and our consumers stopped consuming data. 
They worked again only after a restart.

We would like to confirm that this happened because we are running the 0.8-beta 
version rather than the 0.8 release, so that we can convince "THE MGMT" guys.

Please let me know whether KAFKA-1382 is what is causing the issue.

Thanks,

Balaji

From: Gulia, Vikram
Sent: Wednesday, September 24, 2014 8:43 AM
To: Sharma, Navdeep; #IT-MAD DES; #IT-MAA
Cc: Alam, Mohammad Shah
Subject: RE: 9/23 prod issue - offline kafka partitions.

Adding full MAA distro.

DES Offshore looked into the logs on the kafka servers, and it seems the issue we 
encountered yesterday may be the one described in these threads. Please have a look:

http://permalink.gmane.org/gmane.comp.apache.kafka.user/1904

https://issues.apache.org/jira/browse/KAFKA-1382 (it describes the fix/patch 
which is available in 0.8.1.2/0.8.2)

Thank You,
Vikram Gulia

From: Sharma, Navdeep
Sent: Wednesday, September 24, 2014 6:53 AM
To: Gulia, Vikram; #IT-MAD DES
Cc: #IT-MAA Offshore; Alam, Mohammad Shah
Subject: RE: 9/23 prod issue - offline kafka partitions.

Hi Vikram,

We analyzed the issue mentioned below with MAA-Offshore (Abhishek) and found that 
the error occurred only on 23 Sept. It is not a recurring issue: we checked the 
last 4 days of logs and it does not appear earlier.

It looks like the consumer was stopped on September 22, 2014 for Linux patching 
activity. MAA started the consumer on September 23, 2014 at 1:00 AM.

The server log shows "BadVersion for 
/brokers/topics/rain-burn-in/partitions/121/state", which is not present in the 
previous 4 days of logs.
More detail on this error can be found at:
http://permalink.gmane.org/gmane.comp.apache.kafka.user/1904

We are not sure whether data was lost in this scenario and are working on this.

Let us know if you have any concerns.

Navdeep Sharma
Developer - offshore,  Middleware Applications & Development
o India: 0120-4532000 - 2234
c: +91-9911698102

From: Gulia, Vikram
Sent: Tuesday, September 23, 2014 6:17 PM
To: #IT-MAD DES
Subject: FW: 9/23 prod issue - offline kafka partitions.

DES Offshore dev,

Please work with MAA offshore to monitor the kafka brokers, as we had an 
incident where a lot of partitions went offline around 1:45 PM MST and MAA had to 
restart the kafka servers. We may have lost messages, and we need to see whether 
there is a way to figure out what the impact was.

Also, check the logs on the kafka servers and see if we can figure out why the 
partitions went offline or became unavailable. Let us know if you find anything 
relevant.

Thank You,
Vikram Gulia

From: Nielsen, Andy
Sent: Tuesday, September 23, 2014 5:04 PM
To: #IT-MAD DES; Gulia, Vikram
Cc: #IT-MAA
Subject: 9/23 prod issue - offline kafka partitions.

desadmin@pc1mwdpl01 ~/bin $ ./kafka.sh topic --unavailable-partitions
topic: account-access   partition: 21   leader: -1      replicas: 4,6,1 isr: 1
topic: account-access   partition: 51   leader: -1      replicas: 4,6,1 isr:
topic: account-access   partition: 81   leader: -1      replicas: 4,6,1 isr: 1
topic: account-access   partition: 111  leader: -1      replicas: 4,6,1 isr: 1
topic: account-activated        partition: 13   leader: -1      replicas: 4,6,1 isr:
topic: account-activated        partition: 43   leader: -1      replicas: 4,6,1 isr:
topic: account-activated        partition: 73   leader: -1      replicas: 4,6,1 isr:
topic: account-activated        partition: 103  leader: -1      replicas: 4,6,1 isr: 1
topic: account-adjustment-issued        partition: 27   leader: -1      replicas: 4,6,1 isr:
topic: account-adjustment-issued        partition: 57   leader: -1      replicas: 4,6,1 isr:
topic: account-adjustment-issued        partition: 87   leader: -1      replicas: 4,6,1 isr: 1
topic: account-adjustment-issued        partition: 117  leader: -1      replicas: 4,6,1 isr:
topic: account-created  partition: 11   leader: -1      replicas: 4,6,1 isr:
topic: account-created  partition: 41   leader: -1      replicas: 4,6,1 isr:
topic: account-created  partition: 71   leader: -1      replicas: 4,6,1 isr:
topic: account-created  partition: 101  leader: -1      replicas: 4,6,1 isr:
topic: account-info-updated     partition: 7    leader: -1      replicas: 4,6,1 isr: 1
topic: account-info-updated     partition: 37   leader: -1      replicas: 4,6,1 isr: 1
topic: account-info-updated     partition: 67   leader: -1      replicas: 4,6,1 isr:
topic: account-info-updated     partition: 97   leader: -1      replicas: 4,6,1 isr: 1
topic: account-info-updated     partition: 127  leader: -1      replicas: 4,6,1 isr: 1
topic: application-access       partition: 21   leader: -1      replicas: 4,6,1 isr: 1
topic: application-access       partition: 51   leader: -1      replicas: 4,6,1 isr: 1
topic: application-access       partition: 81   leader: -1      replicas: 4,6,1 isr: 1
topic: application-access       partition: 111  leader: -1      replicas: 4,6,1 isr: 1
topic: bill-generated   partition: 3    leader: -1      replicas: 4,6,1 isr:
topic: bill-generated   partition: 33   leader: -1      replicas: 4,6,1 isr:
topic: bill-generated   partition: 63   leader: -1      replicas: 4,6,1 isr:
topic: bill-generated   partition: 93   leader: -1      replicas: 4,6,1 isr:
topic: bill-generated   partition: 123  leader: -1      replicas: 4,6,1 isr: 1
topic: collected-event  partition: 29   leader: -1      replicas: 4,6,1 isr: 1
topic: collected-event  partition: 59   leader: -1      replicas: 4,6,1 isr:
topic: collected-event  partition: 89   leader: -1      replicas: 4,6,1 isr:
topic: collected-event  partition: 119  leader: -1      replicas: 4,6,1 isr: 1
topic: customer-cues    partition: 27   leader: -1      replicas: 4,6,1 isr:
topic: customer-cues    partition: 57   leader: -1      replicas: 4,6,1 isr:
topic: customer-cues    partition: 87   leader: -1      replicas: 4,6,1 isr: 1
topic: customer-cues    partition: 117  leader: -1      replicas: 4,6,1 isr:
topic: dish-promo-application-access    partition: 23   leader: -1      replicas: 4,6,1 isr:
topic: dish-promo-application-access    partition: 53   leader: -1      replicas: 4,6,1 isr:
topic: dish-promo-application-access    partition: 83   leader: -1      replicas: 4,6,1 isr:
topic: dish-promo-application-access    partition: 113  leader: -1      replicas: 4,6,1 isr:
topic: event-response   partition: 2    leader: -1      replicas: 4,6,1 isr:
topic: event-response   partition: 32   leader: -1      replicas: 4,6,1 isr:
topic: event-response   partition: 62   leader: -1      replicas: 4,6,1 isr:
topic: event-response   partition: 92   leader: -1      replicas: 4,6,1 isr:
topic: event-response   partition: 122  leader: -1      replicas: 4,6,1 isr: 1
topic: leads-service    partition: 24   leader: -1      replicas: 4,6,1 isr:
topic: leads-service    partition: 54   leader: -1      replicas: 4,6,1 isr:
topic: leads-service    partition: 84   leader: -1      replicas: 4,6,1 isr:
topic: leads-service    partition: 114  leader: -1      replicas: 4,6,1 isr:
topic: logprod_v3       partition: 3    leader: -1      replicas: 4,6,1 isr:
topic: logprod_v3       partition: 33   leader: -1      replicas: 4,6,1 isr: 1
topic: logprod_v3       partition: 63   leader: -1      replicas: 4,6,1 isr:
topic: logprod_v3       partition: 93   leader: -1      replicas: 4,6,1 isr:
topic: logprod_v3       partition: 123  leader: -1      replicas: 4,6,1 isr: 1
topic: online-account-registration-attempted    partition: 21   leader: -1      replicas: 4,6,1 isr:
topic: online-account-registration-attempted    partition: 51   leader: -1      replicas: 4,6,1 isr: 1
topic: online-account-registration-attempted    partition: 81   leader: -1      replicas: 4,6,1 isr:
topic: online-account-registration-attempted    partition: 111  leader: -1      replicas: 4,6,1 isr:
topic: order-cancelled  partition: 29   leader: -1      replicas: 4,6,1 isr:
topic: order-cancelled  partition: 59   leader: -1      replicas: 4,6,1 isr:
topic: order-cancelled  partition: 89   leader: -1      replicas: 4,6,1 isr:
topic: order-cancelled  partition: 119  leader: -1      replicas: 4,6,1 isr: 1
topic: order-completed  partition: 24   leader: -1      replicas: 4,6,1 isr:
topic: order-completed  partition: 54   leader: -1      replicas: 4,6,1 isr:
topic: order-completed  partition: 84   leader: -1      replicas: 4,6,1 isr: 1
topic: order-completed  partition: 114  leader: -1      replicas: 4,6,1 isr:
topic: order-created    partition: 25   leader: -1      replicas: 4,6,1 isr:
topic: order-created    partition: 55   leader: -1      replicas: 4,6,1 isr:
topic: order-created    partition: 85   leader: -1      replicas: 4,6,1 isr:
topic: order-created    partition: 115  leader: -1      replicas: 4,6,1 isr:
topic: order-modified   partition: 8    leader: -1      replicas: 4,6,1 isr: 1
topic: order-modified   partition: 38   leader: -1      replicas: 4,6,1 isr:
topic: order-modified   partition: 68   leader: -1      replicas: 4,6,1 isr:
topic: order-modified   partition: 98   leader: -1      replicas: 4,6,1 isr:
topic: order-modified   partition: 128  leader: -1      replicas: 4,6,1 isr: 1
topic: order-request    partition: 24   leader: -1      replicas: 4,6,1 isr:
topic: order-request    partition: 54   leader: -1      replicas: 4,6,1 isr:
topic: order-request    partition: 84   leader: -1      replicas: 4,6,1 isr: 1
topic: order-request    partition: 114  leader: -1      replicas: 4,6,1 isr:
topic: order-response   partition: 27   leader: -1      replicas: 4,6,1 isr: 1
topic: order-response   partition: 57   leader: -1      replicas: 4,6,1 isr:
topic: order-response   partition: 87   leader: -1      replicas: 4,6,1 isr:
topic: order-response   partition: 117  leader: -1      replicas: 4,6,1 isr:
topic: outbound-call-attempted  partition: 13   leader: -1      replicas: 4,6,1 isr:
topic: outbound-call-attempted  partition: 43   leader: -1      replicas: 4,6,1 isr: 1
topic: outbound-call-attempted  partition: 73   leader: -1      replicas: 4,6,1 isr: 1
topic: outbound-call-attempted  partition: 103  leader: -1      replicas: 4,6,1 isr:
topic: outbound-communications  partition: 4    leader: -1      replicas: 4,6,1 isr:
topic: outbound-communications  partition: 34   leader: -1      replicas: 4,6,1 isr:
topic: outbound-communications  partition: 64   leader: -1      replicas: 4,6,1 isr:
topic: outbound-communications  partition: 94   leader: -1      replicas: 4,6,1 isr: 1
topic: outbound-communications  partition: 124  leader: -1      replicas: 4,6,1 isr: 1
topic: postal-mail-undeliverable        partition: 15   leader: -1      replicas: 4,6,1 isr: 1
topic: postal-mail-undeliverable        partition: 45   leader: -1      replicas: 4,6,1 isr:
topic: postal-mail-undeliverable        partition: 75   leader: -1      replicas: 4,6,1 isr:
topic: postal-mail-undeliverable        partition: 105  leader: -1      replicas: 4,6,1 isr:
topic: rain-burn-in     partition: 4    leader: -1      replicas: 4,6,1 isr:
topic: rain-burn-in     partition: 34   leader: -1      replicas: 4,6,1 isr: 1
topic: rain-burn-in     partition: 64   leader: -1      replicas: 4,6,1 isr: 1
topic: rain-burn-in     partition: 94   leader: -1      replicas: 4,6,1 isr:
topic: rain-burn-in     partition: 124  leader: -1      replicas: 4,6,1 isr:
topic: rain-enhanced    partition: 26   leader: -1      replicas: 4,6,1 isr: 1
topic: rain-enhanced    partition: 56   leader: -1      replicas: 4,6,1 isr: 1
topic: rain-enhanced    partition: 86   leader: -1      replicas: 4,6,1 isr:
topic: rain-enhanced    partition: 116  leader: -1      replicas: 4,6,1 isr: 1
topic: rain-listener    partition: 23   leader: -1      replicas: 4,6,1 isr:
topic: rain-listener    partition: 53   leader: -1      replicas: 4,6,1 isr:
topic: rain-listener    partition: 83   leader: -1      replicas: 4,6,1 isr: 1
topic: rain-listener    partition: 113  leader: -1      replicas: 4,6,1 isr: 1
topic: rain-load-test   partition: 8    leader: -1      replicas: 4,6,1 isr:
topic: rain-load-test   partition: 38   leader: -1      replicas: 4,6,1 isr: 1
topic: rain-load-test   partition: 68   leader: -1      replicas: 4,6,1 isr:
topic: rain-load-test   partition: 98   leader: -1      replicas: 4,6,1 isr: 1
topic: rain-load-test   partition: 128  leader: -1      replicas: 4,6,1 isr:
topic: submit-agreement partition: 2    leader: -1      replicas: 4,6,1 isr:
topic: submit-agreement partition: 32   leader: -1      replicas: 4,6,1 isr: 1
topic: submit-agreement partition: 62   leader: -1      replicas: 4,6,1 isr:
topic: submit-agreement partition: 92   leader: -1      replicas: 4,6,1 isr:
topic: submit-agreement partition: 122  leader: -1      replicas: 4,6,1 isr:
topic: threshold-exceeded       partition: 14   leader: -1      replicas: 4,6,1 isr:
topic: threshold-exceeded       partition: 44   leader: -1      replicas: 4,6,1 isr:
topic: threshold-exceeded       partition: 74   leader: -1      replicas: 4,6,1 isr:
topic: threshold-exceeded       partition: 104  leader: -1      replicas: 4,6,1 isr: 1
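
For anyone triaging a listing like the one above, here is a minimal Python sketch that counts leaderless partitions per topic and picks out the ones whose ISR is empty. It assumes the output keeps the "topic: ... partition: ... leader: ... replicas: ... isr: ..." layout shown here. `kafka.sh` is our own wrapper, so the format may differ elsewhere. The names `ENTRY` and `summarize` are made up for illustration and are not part of any Kafka tooling.

```python
import re
from collections import Counter

# One entry of the listing. \s+ between fields also tolerates entries
# that a mail client wrapped onto a second line; the ISR field may be empty.
ENTRY = re.compile(
    r"topic:\s*(?P<topic>\S+)\s+partition:\s*(?P<partition>\d+)\s+"
    r"leader:\s*(?P<leader>-?\d+)\s+replicas:\s*(?P<replicas>[\d,]+)\s+"
    r"isr:[ \t]*(?P<isr>[\d,]*)"
)

def summarize(listing):
    """Return (leaderless partition count per topic, partitions with empty ISR)."""
    leaderless = Counter()
    empty_isr = []
    for m in ENTRY.finditer(listing):
        if m.group("leader") == "-1":
            leaderless[m.group("topic")] += 1
        if not m.group("isr"):
            empty_isr.append((m.group("topic"), int(m.group("partition"))))
    return leaderless, empty_isr

# A few entries copied from the listing, including one wrapped entry.
sample = (
    "topic: account-access   partition: 21   leader: -1      replicas: 4,6,1 isr: 1\n"
    "topic: account-access   partition: 51   leader: -1      replicas: 4,6,1 isr:\n"
    "topic: account-activated        partition: 13   leader: -1      replicas: 4,6,1 \n"
    "isr:\n"
)

leaderless, empty_isr = summarize(sample)
print(dict(leaderless))
print(empty_isr)
```

Feeding it the full command output instead of `sample` should give a per-topic picture of how many partitions lost their leader and which ones have no in-sync replica recorded at all.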

Andy Nielsen
Middleware Application Admin
303-723-2347
cell:720-971-2856
