RE: Exactly-once publication behaviour

2016-02-21 Thread Andrew Schofield
Hi,
That's good, but somewhat mysterious :) I'd like to help accelerate exactly-once
behaviour. Perhaps it might be a candidate for the release after 0.10, if
there's sufficient interest in the community.

Could someone share the latest thinking on this subject? I'm not sure what the 
appropriate forum is. How have you previously been sharing high-level 
architectural plans for people to pick up and deliver? Since there's no KIP I 
guess it's a bit premature to discuss on the KIP call.

Thanks
Andrew

> Subject: Re: Exactly-once publication behaviour
> From: b...@confluent.io
> Date: Fri, 19 Feb 2016 15:13:07 -0800
> To: users@kafka.apache.org
> 
> Hi Andrew
> 
> There are plans to add exactly-once behaviour. This will likely go a little
> beyond idempotent producers, the motivation being to provide better
> delivery guarantees for Connect, Streams and MirrorMaker.
> 
> B
> 
> 
> 
>> On 19 Feb 2016, at 13:54, Andrew Schofield wrote:
>> 
>> When publishing messages to Kafka, you make a choice between at-most-once 
>> and at-least-once delivery, depending on whether you wait for 
>> acknowledgments and whether you retry on failures. In most cases, those 
>> options are good enough. However, some systems offer exactly-once 
>> reliability too. Although my view is that the practical use of exactly-once 
>> is limited in the situations that Kafka is generally used for, when you're 
>> connecting other systems to Kafka or bridging between protocols, I think 
>> there is value in propagating the reliability level that the other system 
>> expects.
>> 
>> As a consumer, you can manage your offset and get exactly-once delivery, or 
>> more likely exactly-once processing, of the messages.
>> 
>> I've read about idempotent producers 
>> (https://cwiki.apache.org/confluence/display/KAFKA/Idempotent+Producer) and 
>> I know there's been some discussion about transactions too.
>> 
>> Is there a plan to provide the tools to enable exactly-once publication 
>> behaviour? Is this a planned enhancement to Kafka Connect? Is there already 
>> some technique that people are using effectively to get exactly-once?
>> 
>> Andrew Schofield 
> 
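
The at-most-once versus at-least-once choice described in the quoted message comes down to a couple of producer settings. The following is a minimal sketch, assuming the standard producer configuration keys `acks` and `retries`; the class name, helper methods, retry count and broker address are illustrative and do not come from the thread. It only builds `java.util.Properties` objects, so it runs without Kafka on the classpath.

```java
import java.util.Properties;

public class DeliverySemantics {
    // At-most-once: do not wait for acknowledgments and never retry,
    // so a message may be lost but is never sent twice by the producer.
    static Properties atMostOnce(String bootstrapServers) {
        Properties p = new Properties();
        p.setProperty("bootstrap.servers", bootstrapServers);
        p.setProperty("acks", "0");
        p.setProperty("retries", "0");
        return p;
    }

    // At-least-once: wait for the in-sync replicas to acknowledge and
    // retry on failure, so a retry after a lost acknowledgment can
    // produce a duplicate.
    static Properties atLeastOnce(String bootstrapServers) {
        Properties p = new Properties();
        p.setProperty("bootstrap.servers", bootstrapServers);
        p.setProperty("acks", "all");
        p.setProperty("retries", "3"); // illustrative value
        return p;
    }

    public static void main(String[] args) {
        // "localhost:9092" is a placeholder broker address.
        System.out.println(atMostOnce("localhost:9092"));
        System.out.println(atLeastOnce("localhost:9092"));
    }
}
```

Neither profile gives exactly-once publication, which is the gap the thread is asking about: the second one trades possible loss for possible duplicates.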
  

Re: Exactly-once publication behaviour

2016-02-21 Thread Jay Kreps
Hey Andrew,

Yeah, I think the current state is that we did several designs and prototypes
(the transaction work, the idempotence design, and the conditional write
KIP), but none of these offshoots is really fully rationalized with
the other ones. Slow progress in this area has been mainly due to time
constraints--no one working on it full time. We're interested in picking up
the work at Confluent and hope to get some design ideas out on wiki for
discussion in the next month or so.

-Jay

On Sun, Feb 21, 2016 at 5:52 AM, Andrew Schofield <
andrew_schofield_j...@outlook.com> wrote:

> Hi,
> That's good, but somewhat mysterious :) I'd like to help accelerate
> exactly-once behaviour. Perhaps it might be a candidate for the release
> after 0.10, if there's sufficient interest in the community.
>
> Could someone share the latest thinking on this subject? I'm not sure what
> the appropriate forum is. How have you previously been sharing high-level
> architectural plans for people to pick up and deliver? Since there's no KIP
> I guess it's a bit premature to discuss on the KIP call.
>
> Thanks
> Andrew
>


RE: Exactly-once publication behaviour

2016-02-21 Thread Andrew Schofield
Hi Jay,
Thanks for the response. Happy to engage in the discussions once there's 
something concrete on the wiki.

Andrew


> Date: Sun, 21 Feb 2016 10:23:55 -0800
> Subject: Re: Exactly-once publication behaviour
> From: j...@confluent.io
> To: users@kafka.apache.org
>
> Hey Andrew,
>
> Yeah, I think the current state is that we did several designs and prototypes
> (the transaction work, the idempotence design, and the conditional write
> KIP), but none of these offshoots is really fully rationalized with
> the other ones. Slow progress in this area has been mainly due to time
> constraints--no one working on it full time. We're interested in picking up
> the work at Confluent and hope to get some design ideas out on wiki for
> discussion in the next month or so.
>
> -Jay
>

Leader was set to -1 for some topic partitions

2016-02-21 Thread Raju Bairishetti
Hello,
We are using Kafka version 0.8.2 and running 5 brokers in the prod cluster.
Each topic has two partitions. We are seeing issues with some topic
partitions.

For some topic partitions, the leader was set to -1. I am not seeing any
errors in the controller or server logs. After a server restart, a leader
was set again for some topic partitions. Will there be data loss for such a
topic partition? According to my application metrics there is no data loss,
but I do not have any server logs to prove it from the Kafka side.

*kafka-topics --zookeeper localhost:2181 --describe --topic click_json*

Topic: click_json PartitionCount:2 ReplicationFactor:3 Configs:retention.bytes=42949672960

Topic: click_json Partition: 0 Leader: 4 Replicas: 4,5,1 Isr: 4,1,5

Topic: click_json *Partition: 1 Leader: -1 *Replicas: 5,1,2 *Isr: *


*Why was the leader set to -1?*

*What is the impact when the leader is set to -1?*

*How do we recover from this error? Which option is better: restarting the
broker, or choosing a leader by running the preferred replica election script?*


FYI, we have set unclean.leader.election.enable to false on 3 machines and
unclean.leader.election.enable to true on 2 machines.


Thanks in advance!!!

--
Thanks,
Raju Bairishetti,
www.lazada.com
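
The describe output in the message above shows the symptom as `Leader: -1` on a partition line. The following is a minimal sketch, assuming the line layout shown in that output, for flagging such partitions in captured `--describe` text. The class and method names are hypothetical, and the parser is not part of Kafka.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LeaderlessPartitions {
    // Matches partition lines such as
    // "Topic: click_json Partition: 1 Leader: -1 Replicas: 5,1,2 Isr: ".
    // The topic summary line ("PartitionCount:...") does not match.
    private static final Pattern LINE = Pattern.compile(
        "Topic:\\s*(\\S+)\\s+Partition:\\s*(\\d+)\\s+Leader:\\s*(-?\\d+)");

    // Returns "topic-partition" for every partition whose leader is -1.
    static List<String> find(String describeOutput) {
        List<String> result = new ArrayList<>();
        for (String raw : describeOutput.split("\\r?\\n")) {
            // Drop the '*' emphasis markers added by the mail client.
            String line = raw.replace("*", "");
            Matcher m = LINE.matcher(line);
            if (m.find() && "-1".equals(m.group(3))) {
                result.add(m.group(1) + "-" + m.group(2));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String sample = String.join("\n",
            "Topic: click_json PartitionCount:2 ReplicationFactor:3",
            "Topic: click_json Partition: 0 Leader: 4 Replicas: 4,5,1 Isr: 4,1,5",
            "Topic: click_json Partition: 1 Leader: -1 Replicas: 5,1,2 Isr: ");
        System.out.println(find(sample)); // prints [click_json-1]
    }
}
```

A check like this only reports the symptom; it does not explain why the leader was lost or whether any data was lost.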


Re: Leader was set to -1 for some topic partitions

2016-02-21 Thread Salman Ahmed
We saw a similar issue a while back. If the leader is -1, I believe ingestion
won't work for that partition. Was there any dip in data ingestion?

On Sun, Feb 21, 2016 at 7:44 PM Raju Bairishetti wrote:

> Hello,
> We are using Kafka version 0.8.2 and running 5 brokers in the prod cluster.
> Each topic has two partitions. We are seeing issues with some topic
> partitions.
>
> For some topic partitions, the leader was set to -1. I am not seeing any
> errors in the controller or server logs. After a server restart, a leader
> was set again for some topic partitions. Will there be data loss for such a
> topic partition? According to my application metrics there is no data loss,
> but I do not have any server logs to prove it from the Kafka side.
>
> *kafka-topics --zookeeper localhost:2181 --describe --topic click_json*
>
> Topic: click_json PartitionCount:2 ReplicationFactor:3
> Configs:retention.bytes=42949672960
>
> Topic: click_json Partition: 0 Leader: 4 Replicas: 4,5,1 Isr: 4,1,5
>
> Topic: click_json *Partition: 1 Leader: -1 *Replicas: 5,1,2 *Isr: *
>
>
> *Why was the leader set to -1?*
>
> *What is the impact when the leader is set to -1?*
>
> *How do we recover from this error? Which option is better: restarting the
> broker, or choosing a leader by running the preferred replica election script?*
>
>
> FYI, we have set unclean.leader.election.enable to false on 3 machines and
> unclean.leader.election.enable to true on 2 machines.
>
>
> Thanks in advance!!!
>
> --
> Thanks,
> Raju Bairishetti,
> www.lazada.com
>


Re: Leader was set to -1 for some topic partitions

2016-02-21 Thread Raju Bairishetti
On Mon, Feb 22, 2016 at 1:13 PM, Salman Ahmed wrote:

> We saw a similar issue a while back. If the leader is -1, I believe ingestion
> won't work for that partition. Was there any dip in data ingestion?
>


*No, I am not seeing any data dip for the topic overall, but there is no data
for the partition whose leader was set to -1.*




-- 
Thanks,
Raju Bairishetti,

www.lazada.com


build for kafka-avro-serializer

2016-02-21 Thread Venkatesh Rudraraju
How do I include "kafka-avro-serializer" in my Maven build? It's not
available in the Maven repo as mentioned here:

http://docs.confluent.io/1.0/app-development.html#java-applications-serializers

Thanks,
Venkatesh