Re: Zookeeper load burst during replication

2014-12-03 Thread Yury Ruchin
Hi Guozhang,

No consumer clients were consuming when the broker was replicating, at
least none that would commit offsets to ZK. As for the controller log,
that's a good idea, I will try it out. Thanks!
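
A minimal sketch of that controller-log check (the log path and the line
text below are stand-ins, not the real 0.8.1.1 controller.log format —
adjust to your installation and to the actual incident window):

```shell
# Hypothetical excerpt of a controller log; real 0.8.x entries differ in
# wording, but the idea is the same: count how often the controller wrote
# replica/ISR state into ZK during the replication window.
cat > /tmp/controller-excerpt.log <<'EOF'
[2014-12-02 10:00:01] INFO [Controller 1]: New broker startup callback for 3
[2014-12-02 10:00:02] INFO [Controller 1]: Sending LeaderAndIsr request to broker 3
[2014-12-02 10:00:05] INFO [Controller 1]: Updating ISR for partition [test,0] in ZK
[2014-12-02 10:00:06] INFO [Controller 1]: Updating ISR for partition [test,1] in ZK
EOF
# A large count here would point at the controller as the source of the
# ZK write burst; a small one would point elsewhere (e.g. offset commits).
grep -c "in ZK" /tmp/controller-excerpt.log
```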

2014-12-02 22:28 GMT+03:00 Guozhang Wang :

> Kafka brokers use ZK for metadata storage, and Kafka consumer clients use
> ZK for offset and membership management.
>
> For metadata storage, when there are replica state changes (for example, a
> new replica added after a broker restart, as in your case), the controller
> will write to ZK to record those changes, but this is a one-time event and
> should not significantly increase ZK load.
>
> So you can try to check:
>
> 1. Whether there are consumer clients running at the same time that would
> be writing to ZK heavily to commit offsets.
> 2. The controller log on the broker, to see if it abnormally updates such
> metadata in ZK during that period.
>
> Guozhang
>
> On Tue, Dec 2, 2014 at 7:38 AM, Yury Ruchin  wrote:
>
> > Hello,
> >
> > In a multi-broker Kafka 0.8.1.1 setup, I had one broker crashed. I
> > restarted it after some noticeable time, so it started catching up the
> > leader very intensively. During the replication, I see that the disk load
> > on the ZK leader bursts abnormally, resulting in ZK performance
> > degradation. What could cause that? How does Kafka use ZK during
> > replication?
> >
> > Thanks,
> > Yury
> >
>
>
>
> --
> -- Guozhang
>


KafkaException: Should not set log end offset on partition

2014-12-03 Thread svante karlsson
I've installed (for Ansible scripting test purposes) 3 VMs, each
containing Kafka and ZooKeeper, clustered together.

Ubuntu 14.04
ZooKeeper is 3.4.6 and Kafka is 2.11-0.8.2-beta
The kafka servers have broker id's 2, 4, 6

The ZooKeeper nodes seem happy.
The Kafka servers start up and seem happy.

I created two test topics as in the getting started guide:
bin/kafka-topics.sh --create --zookeeper localhost:2181
--replication-factor 1 --partitions 1 --topic test
bin/kafka-topics.sh --create --zookeeper localhost:2181
--replication-factor 3 --partitions 32 --topic test3
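
To see which broker each broker believes is the leader for a partition,
the same tool can describe the topic: `bin/kafka-topics.sh --describe
--zookeeper localhost:2181 --topic test3` (needs the running cluster). A
sketch of pulling the leader id out of one hypothetical output line:

```shell
# Hypothetical one-line excerpt of `kafka-topics.sh --describe` output.
line='Topic: test3 Partition: 3 Partition: 3 Leader: 2 Replicas: 2,4,6 Isr: 2,4,6'
# Extract the leader broker id. If a ReplicaFetcherThread keeps fetching
# this partition from a broker other than that id ("Leader not local"),
# the brokers' views of the cluster metadata are inconsistent.
echo "$line" | awk '{for (i = 1; i <= NF; i++) if ($i == "Leader:") print $(i+1)}'
```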

I published some messages on each topic, and after a while I noticed that
I had run out of disk space and there were millions of entries in syslog
similar to the ones below.

Any hints on what can cause this problem?


/svante


Dec  2 07:40:17 ubuntu supervisord: kafka-broker [2014-12-02 07:40:17,120]
WARN [Replica Manager on Broker 4]: Fetch request with  correlation id 21
from client ReplicaFetcherThread-0-6 on partition [test3,3] failed due to
Leader not local for partition [test3, 3] on broker 4
(kafka.server.ReplicaManager)
Dec  2 07:40:17 ubuntu supervisord: kafka-broker [2014-12-02 07:40:17,120]
WARN [Replica Manager on Broker 4]: Fetch request with correlation id 21
from client ReplicaFetcherThread-0-6 on partition [test3,4] failed due to
Leader not local for partition [test3, 4] on broker 4
(kafka.server.ReplicaManager)
Dec  2 07:40:17 ubuntu supervisord: kafka-broker [2014-12-02 07:40:17,120]
WARN [Replica Manager on Broker 4]: Fetch request with  correlation id 21
from client ReplicaFetcherThread-0-6 on partition [test3,26] failed due to
Leader not local for partition [test3,26] on broker 4
(kafka.server.ReplicaManager)
Dec  2 07:40:17 ubuntu supervisord: kafka-broker [2014-12-02 07:40:17,121]
ERROR [KafkaApi-4] error when handling request Name: FetchRequest; Version:
0; CorrelationId: 21; ClientId: ReplicaFetcherThread-0-6; ReplicaId: 4;
MaxWait: 500 ms; MinBytes: 1 bytes; RequestInfo: [test3,22] ->
PartitionFetchInfo(0,1048576),[test3,31] ->
PartitionFetchInfo(0,1048576),[test3,21]
->PartitionFetchInfo(0,1048576),[test3,13] ->
PartitionFetchInfo(0,1048576),[test3,9] ->
PartitionFetchInfo(0,1048576),[test3,28] ->
PartitionFetchInfo(0,1048576),[test3,27] ->
PartitionFetchInfo(0,1048576),[test3,15] ->
PartitionFetchInfo(0,1048576),[test3,7] ->
PartitionFetchInfo(0,1048576),[test3,8] ->
PartitionFetchInfo(0,1048576),[test3,2] ->
PartitionFetchInfo(0,1048576),[test3,19] ->
PartitionFetchInfo(0,1048576),[test3,25] ->
PartitionFetchInfo(0,1048576),[test3,20] ->
PartitionFetchInfo(0,1048576),[test3,14] ->
PartitionFetchInfo(0,1048576),[test2,0] ->
PartitionFetchInfo(0,1048576),[test3,16] ->
PartitionFetchInfo(0,1048576),[test3,1] ->
PartitionFetchInfo(0,1048576),[test3,10] ->
PartitionFetchInfo(0,1048576),[test3,3] ->
PartitionFetchInfo(0,1048576),[test3,4] ->
PartitionFetchInfo(0,1048576),[test3,26] -> PartitionFetchInfo(0,1048576)
(kafka.server.KafkaApis)
Dec  2 07:40:17 ubuntu supervisord: kafka-broker
kafka.common.KafkaException: Should not set log end offset on partition
[test3,22]'s local replica 4
Dec  2 07:40:17 ubuntu supervisord: kafka-broker #011at
kafka.cluster.Replica.logEndOffset_$eq(Replica.scala:52)
Dec  2 07:40:17 ubuntu supervisord: kafka-broker #011at
kafka.server.ReplicaManager.updateReplicaLEOAndPartitionHW(ReplicaManager.scala:565)
Dec  2 07:40:17 ubuntu supervisord: kafka-broker #011at
kafka.server.KafkaApis$$anonfun$recordFollowerLogEndOffsets$2.apply(KafkaApis.scala:348)
Dec  2 07:40:17 ubuntu supervisord: kafka-broker #011at
kafka.server.KafkaApis$$anonfun$recordFollowerLogEndOffsets$2.apply(KafkaApis.scala:346)
Dec  2 07:40:17 ubuntu supervisord: kafka-broker #011at
scala.collection.MapLike$MappedValues$$anonfun$foreach$3.apply(MapLike.scala:245)
Dec  2 07:40:17 ubuntu supervisord: kafka-broker #011at
scala.collection.MapLike$MappedValues$$anonfun$foreach$3.apply(MapLike.scala:245)
Dec  2 07:40:17 ubuntu supervisord: kafka-broker #011at
scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:778)
Dec  2 07:40:17 ubuntu supervisord: kafka-broker #011at
scala.collection.immutable.HashMap$HashMap1.foreach(HashMap.scala:221)
Dec  2 07:40:17 ubuntu supervisord: kafka-broker #011at
scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:428)
Dec  2 07:40:17 ubuntu supervisord: kafka-broker #011at
scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:777)
Dec  2 07:40:17 ubuntu supervisord: kafka-broker #011at
scala.collection.MapLike$MappedValues.foreach(MapLike.scala:245)
Dec  2 07:40:17 ubuntu supervisord: kafka-broker #011at
kafka.server.KafkaApis.recordFollowerLogEndOffsets(KafkaApis.scala:346)
Dec  2 07:40:17 ubuntu supervisord: kafka-broker #011at
kafka.server.KafkaApis.handleFetchRequest(KafkaApis.scala:311)
Dec  2 07:40:17 ubuntu supervisord: kafka-broker #011at
kafka.server.KafkaApis.handle(KafkaApis.scala:60)
Dec  2 07:40:17 ubuntu supervisord: kafka-b

Re: KafkaException: Should not set log end offset on partition

2014-12-03 Thread svante karlsson
I found some logs like this before everything started to go wrong

...
[2014-12-02 07:08:11,722] WARN Partition [test3,13] on broker 2: No
checkpointed highwatermark is found for partition [test3,7]
(kafka.cluster.Partition)
[2014-12-02 07:08:11,722] WARN Partition [test3,7] on broker 2: No
checkpointed highwatermark is found for partition [test3,7]
(kafka.cluster.Partition)


and further down

[2014-12-02 07:08:11,740] INFO Truncating log test3-13 to offset 0.
(kafka.log.Log)
[2014-12-02 07:08:11,740] INFO Truncating log test3-9 to offset 0.
(kafka.log.Log)
[2014-12-02 07:08:11,740] INFO Truncating log test3-6 to offset 0.
(kafka.log.Log)
[2014-12-02 07:08:11,740] INFO Truncating log test3-28 to offset 0.
(kafka.log.Log)
[2014-12-02 07:08:11,741] INFO Truncating log test3-27 to offset 0.
(kafka.log.Log)
[2014-12-02 07:08:11,741] INFO Truncating log test3-15 to offset 0.
(kafka.log.Log)
[2014-12-02 07:08:11,741] INFO Truncating log test3-7 to offset 0.
(kafka.log.Log)


After that, the logs from the first post start. I don't know if that helps
or explains anything.
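
Those "No checkpointed highwatermark" warnings refer to a small text file
each broker keeps per log directory. A sketch of inspecting it — the path
assumes the default log.dirs, and the contents below are made up to show
the 0.8.x layout (a version line, an entry count, then one
"topic partition offset" entry per line):

```shell
# Fabricated example of a broker's highwatermark checkpoint file.
cat > /tmp/replication-offset-checkpoint <<'EOF'
0
2
test3 13 0
test3 7 0
EOF
# List which partitions have a checkpointed highwatermark; partitions in
# the WARN messages but missing here get the "No checkpointed
# highwatermark" warning and start from offset 0.
awk 'NR > 2 {print $1"-"$2": hw="$3}' /tmp/replication-offset-checkpoint
```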

/svante