Re: [DISCUSS] Release 0.8.2-beta before 0.8.2?

2014-10-21 Thread Olson,Andrew
https://issues.apache.org/jira/browse/KAFKA-1647 sounds serious enough to 
include in 0.8.2-beta if possible.

CONFIDENTIALITY NOTICE This message and any included attachments are from 
Cerner Corporation and are intended only for the addressee. The information 
contained in this message is confidential and may constitute inside or 
non-public information under international, federal, or state securities laws. 
Unauthorized forwarding, printing, copying, distribution, or use of such 
information is strictly prohibited and may be unlawful. If you are not the 
addressee, please promptly delete this message and notify the sender of the 
delivery error by e-mail or you may call Cerner's corporate offices in Kansas 
City, Missouri, U.S.A at (+1) (816)221-1024.


Kafka log compression change in 0.8.2.1?

2015-05-11 Thread Olson,Andrew
After a recent 0.8.2.1 upgrade we noticed a significant increase in used 
filesystem space for our Kafka log data. We have another Kafka cluster still on 
0.8.1.1 whose Kafka data is being copied over to the upgraded cluster, and it 
is clear that the disk consumption is higher on 0.8.2.1 for the same message 
data. The log retention config for the two clusters is the same also.

We ran some tests to figure out what was happening, and it appears that in 
0.8.2.1 the Kafka brokers re-compress each message individually (we’re using 
Snappy), while in 0.8.1.1 they applied the compression across an entire batch 
of messages written to the log. For producers sending large batches of small 
similar messages, the difference can be quite substantial (in our case, it 
looks like a little over 2x). Is this a bug, or the expected new behavior?

thanks,
Andrew
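
To illustrate why batching matters so much for small, similar messages, here is a minimal sketch comparing per-message and whole-batch compression. It uses zlib from the Python standard library as a stand-in for Snappy (an assumption made so the example is self-contained), and the message payloads are hypothetical. It shows the general effect only and does not reproduce the broker's or snappy-java's actual behavior.

```python
import zlib


def compressed_sizes(messages):
    """Return (per_message_total, batched_total) compressed sizes in bytes."""
    # Each message compressed on its own: little redundancy to exploit,
    # and the codec's fixed overhead is paid once per message.
    per_message = sum(len(zlib.compress(m)) for m in messages)
    # The whole batch compressed as one unit: repeated structure across
    # messages is shared.
    batched = len(zlib.compress(b"".join(messages)))
    return per_message, batched


# Hypothetical payload: 1000 small, similar JSON-like messages.
messages = [b'{"event":"click","user_id":%d,"page":"/home"}' % i for i in range(1000)]
per_message, batched = compressed_sizes(messages)
print(f"per-message: {per_message} bytes, batched: {batched} bytes")
```

The exact ratio depends on the codec and the data; the point is only that the batched total is smaller when messages share structure.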



Re: Kafka log compression change in 0.8.2.1?

2015-05-12 Thread Olson,Andrew
Hi Jun,

I figured it out this morning and opened
https://issues.apache.org/jira/browse/KAFKA-2189 -- it turned out to be a bug
in versions 1.1.1.2 through 1.1.1.6 of snappy-java that has recently been
fixed (I was very happy to see their new unit test named
"batchingOfWritesShouldNotAffectCompressedDataSize"). We will be patching
1.1.1.7 out to our clusters as soon as we can.

Regarding the mirror maker question, we wrote our own custom replication code
and are not using the mirror maker to copy the data between clusters. We’re
still using the old java producer, and confirmed the issue was present with
both the 0.8.1.1 and 0.8.2.1 old producer client.

thanks,
Andrew

On 5/12/15, 3:08 PM, "Jun Rao"  wrote:

>Andrew,
>
>The recompression logic didn't change in 0.8.2.1. The broker still takes
>all messages in a single request, assigns offsets and recompresses them
>into a single compressed message.
>
>Are you using mirror maker to copy data from the 0.8.1 cluster to the 0.8.2
>cluster? If so, this may have to do with the batching in the producer in
>mirror maker. Did you enable the new java producer in mirror maker?
>
>Thanks,
>
>Jun
>
>
>On Mon, May 11, 2015 at 12:53 PM, Olson,Andrew  wrote:
>
>> After a recent 0.8.2.1 upgrade we noticed a significant increase in used
>> filesystem space for our Kafka log data. We have another Kafka cluster
>> still on 0.8.1.1 whose Kafka data is being copied over to the upgraded
>> cluster, and it is clear that the disk consumption is higher on 0.8.2.1
>> for the same message data. The log retention config for the two clusters
>> is the same also.
>>
>> We ran some tests to figure out what was happening, and it appears that
>> in 0.8.2.1 the Kafka brokers re-compress each message individually (we’re
>> using Snappy), while in 0.8.1.1 they applied the compression across an
>> entire batch of messages written to the log. For producers sending large
>> batches of small similar messages, the difference can be quite
>> substantial (in our case, it looks like a little over 2x). Is this a bug,
>> or the expected new behavior?
>>
>> thanks,
>> Andrew



Re: Questions about unclean leader election and "Halting because log truncation is not allowed"

2016-02-26 Thread Olson,Andrew
unclean.leader.election.enable is actually a valid topic-level configuration; 
I opened https://issues.apache.org/jira/browse/KAFKA-3298 to get the 
documentation updated.

That code comment doesn’t tell the complete story and could probably be updated 
for clarity, as we’ve learned a lot since then. The situation is still 
theoretically possible in certain severe split-brain scenarios such as the one 
your reproduction introduces. Hopefully 
https://issues.apache.org/jira/browse/KAFKA-2143 helps prevent it from arising.
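
The check behind the "Halting because log truncation is not allowed" message can be modeled roughly as below. This is a toy sketch, not the broker's code: the function and exception names are hypothetical, and raising an exception stands in for the broker halting its process.

```python
class TruncationNotAllowed(Exception):
    """Stands in for the broker halting (a real broker exits the process)."""


def resolve_follower_offset(leader_latest, replica_latest, unclean_election_enabled):
    """Return the offset a follower should resume fetching from (toy model)."""
    if leader_latest < replica_latest:
        # The follower holds messages the leader does not have. Catching up
        # would require truncating the follower's log, i.e. losing data.
        if not unclean_election_enabled:
            raise TruncationNotAllowed(
                f"leader's latest offset {leader_latest} is less than "
                f"replica's latest offset {replica_latest}"
            )
        return leader_latest  # truncate the follower to the leader's log end
    return replica_latest


# Offsets from the reported log line: leader at 0, replica at 151.
print(resolve_follower_offset(0, 151, unclean_election_enabled=True))
```

With unclean election disabled, the same call raises instead of truncating, which corresponds to the halt seen in step 10 of the reproduction.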

On 2/25/16, 3:46 PM, "James Cheng"  wrote:

>Hi,
>
>I ran into a scenario where one of my brokers would continually shut down, 
>with the error message:
>[2016-02-25 00:29:39,236] FATAL [ReplicaFetcherThread-0-1], Halting because 
>log truncation is not allowed for topic test, Current leader 1's latest offset 
>0 is less than replica 2's latest offset 151 
>(kafka.server.ReplicaFetcherThread)
>
>I managed to reproduce it with the following scenario:
>1. Start broker1, with unclean.leader.election.enable=false
>2. Start broker2, with unclean.leader.election.enable=false
>
>3. Create topic, single partition, with replication-factor 2.
>4. Write data to the topic.
>
>5. At this point, both brokers are in the ISR. Broker1 is the partition leader.
>
>6. Ctrl-Z on broker2. (Simulates a GC pause or a slow network) Broker2 gets 
>dropped out of ISR. Broker1 is still the leader. I can still write data to the 
>partition.
>
>7. Shutdown Broker1. Hard or controlled, doesn't matter.
>
>8. rm -rf the log directory of broker1. (This simulates a disk replacement or 
>full hardware replacement)
>
>9. Resume broker2. It attempts to connect to broker1, but doesn't succeed 
>because broker1 is down. At this point, the partition is offline. Can't write 
>to it.
>
>10. Resume broker1. Broker1 resumes leadership of the topic. Broker2 attempts 
>to join ISR, and immediately halts with the error message:
>[2016-02-25 00:29:39,236] FATAL [ReplicaFetcherThread-0-1], Halting because 
>log truncation is not allowed for topic test, Current leader 1's latest offset 
>0 is less than replica 2's latest offset 151 
>(kafka.server.ReplicaFetcherThread)
>
>I am able to recover by setting unclean.leader.election.enable=true on my 
>brokers.
>
>I'm trying to understand a couple things:
>* Is my scenario a valid supported one, or is it along the lines of "don't 
>ever do that"?
>* In step 10, why is broker1 allowed to resume leadership even though it has 
>no data?
>* In step 10, why is it necessary to stop the entire broker due to one 
>partition that is in this state? Wouldn't it be possible for the broker to 
>continue to serve traffic for all the other topics, and just mark this one as 
>unavailable?
>* Would it make sense to allow an operator to manually specify which broker 
>they want to become the new master? This would give me more control over how 
>much data loss I am willing to handle. In this case, I would want broker2 to 
>become the new master. Or, is that possible and I just don't know how to do it?
>* Would it be possible to make unclean.leader.election.enable a per-topic 
>configuration? This would let me control how much data loss I am willing to 
>handle.
>
>Btw, the comment in the source code for that error message indicates:
>https://github.com/apache/kafka/blob/01aeea7c7bca34f1edce40116b7721335938b13b/core/src/main/scala/kafka/server/ReplicaFetcherThread.scala#L164-L166
>
>  // Prior to truncating the follower's log, ensure that doing so is not
>  // disallowed by the configuration for unclean leader election.
>  // This situation could only happen if the unclean election configuration
>  // for a topic changes while a replica is down. Otherwise, we should never
>  // encounter this situation since a non-ISR leader cannot be elected if
>  // disallowed by the broker configuration.
>
>But I don't believe that happened. I never changed the configuration. But I 
>did venture into "unclean leader election" territory, so I'm not sure if the 
>comment still applies.
>
>Thanks,
>-James
>
>
>
>
>
>This email and any attachments may contain confidential and privileged 
>material for the sole use of the intended recipient. Any review, copying, or 
>distribution of this email (or any attachments) by others is prohibited. If 
>you are not the intended recipient, please contact the sender immediately and 
>permanently delete this email and any attachments. No employee or agent of 
>TiVo Inc. is authorized to conclude any binding agreement on behalf of TiVo 
>Inc. by email. Binding agreements with TiVo Inc. may only be made by a signed 
>written agreement.


Re: Turn off retention.ms

2016-03-01 Thread Olson,Andrew
A very high retention time (e.g. 50 years in milliseconds) would effectively 
accomplish this.
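
For reference, a minimal sketch of the arithmetic for "50 years in milliseconds". It assumes 365-day years (leap days ignored), and the helper name is hypothetical; the resulting number is what would go into the topic's retention.ms setting.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000


def years_to_ms(years, days_per_year=365):
    """Convert a number of years to milliseconds, ignoring leap days."""
    return years * days_per_year * MS_PER_DAY


retention_ms = years_to_ms(50)
print(f"retention.ms={retention_ms}")  # prints retention.ms=1576800000000
```

The value is far larger than a 32-bit integer can hold, so it only works where the setting is read as a 64-bit number.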



On 3/1/16, 8:39 AM, "Roshan Punnoose"  wrote:

>Hi,
>
>Is it possible to set up a topic with only a size limit (retention.bytes)
>and not a time limit (retention.ms)?
>
>Roshan



Re: session.timeout.ms limit - Kafka Consumer

2016-03-02 Thread Olson,Andrew
This topic is currently being discussed at 
https://issues.apache.org/jira/browse/KAFKA-2986 and 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-41%3A+KafkaConsumer+Max+Records
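
The idea of capping how many records are handled between polls can be sketched as simple arithmetic. This is only an illustration under stated assumptions: the helper name, the 50% safety factor, and the 200 ms per-record processing time are hypothetical, and the 30-second timeout is used as an example figure. It does not call any Kafka API.

```python
def max_records_per_poll(session_timeout_ms, ms_per_record, safety_factor=0.5):
    """Largest batch that can be processed before the next poll is due.

    safety_factor leaves headroom so the next poll (and with it the
    heartbeat) happens well inside the session timeout.
    """
    if ms_per_record <= 0:
        raise ValueError("ms_per_record must be positive")
    budget_ms = int(session_timeout_ms * safety_factor)
    return max(1, budget_ms // ms_per_record)


# Example: 30 s session timeout, 200 ms of processing per record.
print(max_records_per_poll(30_000, 200))  # prints 75
```

If a single record can take longer than the whole budget, no batch size helps, and the function still returns 1; that case needs a different approach than limiting records per poll.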


On 3/2/16, 8:11 AM, "Vanessa Gligor"  wrote:

>Hello,
>
>I am using Kafka higher consumer 0.9.0. I am not using the auto commit for
>the offsets, so after I consume the messages (poll from Kafka) I will have
>to commit the offsets manually.
>
>The issue that I have is actually that the processing of the messages takes
>longer than 30s (and I cannot call poll again before these messages are
>processed), and when I try to commit the offset an exception is thrown:
>ERROR o.a.k.c.c.i.ConsumerCoordinator - Error ILLEGAL_GENERATION occurred
>while committing offsets for group MetadataConsumerSpout.
>(I have found on stackoverflow this explanation: so if you wait for longer
>than the timeout request then the coordinator for the topic will kick out
>the consumer because it will think it is dead and it will rebalance the group)
>
>In order to get rid of this I have thought about a couple of solutions:
>
>1. The configuration session.timeout.ms has a maximum value, so if I try to
>set it to 60 seconds, also I get an exception, because this value is not in
>the valid interval.
>
>2. I have tried to find a solution to get a paginated request when the
>polling method is called - no success.
>
>3. I have tried to send a heart beat from the outside of the poll (because
>this method sends the heartbeats) - no success.
>
>
>Thank you.



Re: Kafka Mock

2014-04-30 Thread Olson,Andrew
You might take a look at the KafkaServerTestHarness [1] if you’re not already 
familiar with it. It’s not exactly “mock”, but it makes it easy to test Kafka 
functionality without needing to connect to any external broker or ZK 
processes. We ported this test harness to Java for our junit and integration 
tests, and use a junit suite to keep it running for all tests.

[1] 
https://github.com/apache/kafka/blob/trunk/core/src/test/scala/unit/kafka/integration/KafkaServerTestHarness.scala

