Kafka 0.8.2.1: ISR update skipped

2015-12-18 Thread Mazhar Shaikh
Hi All,

I'm using a 2-node cluster (with the 3rd ZooKeeper node running on one of these
machines).

For some reason, data is not being replicated to the other Kafka
broker.

Kafka Version : kafka_2.10-0.8.2.1

# ./bin/kafka-topics.sh --zookeeper localhost:2181 --describe --topic topic1
Topic: topic1   PartitionCount: 16   ReplicationFactor: 2   Configs:
        Topic: topic1   Partition: 0   Leader: 1   Replicas: 0,1   Isr: 1
        Topic: topic1   Partition: 1   Leader: 1   Replicas: 1,0   Isr: 1


server.log:

[2015-12-18 03:42:10,071] INFO Partition [topic1,1] on broker 1: Expanding ISR for partition [topic1,1] from 1 to 1,0 (kafka.cluster.Partition)
[2015-12-18 03:42:10,082] INFO Partition [topic1,1] on broker 1: Cached zkVersion [6] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
[2015-12-18 03:42:10,084] INFO Partition [topic1,0] on broker 1: Expanding ISR for partition [topic1,0] from 1 to 1,0 (kafka.cluster.Partition)
[2015-12-18 03:42:10,092] INFO Partition [topic1,0] on broker 1: Cached zkVersion [5] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
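
For reference, the zkVersion the broker caches should correspond to the dataVersion of the partition's state znode in ZooKeeper. A hedged way to compare the two, and to see which broker currently thinks it is the controller (zookeeper-shell.sh ships with Kafka; zkCli.sh from a ZooKeeper install works the same way):

# Show the partition state JSON plus znode stats (dataVersion is the zkVersion the broker caches)
./bin/zookeeper-shell.sh localhost:2181 get /brokers/topics/topic1/partitions/1/state

# Show the current controller as recorded in ZooKeeper
./bin/zookeeper-shell.sh localhost:2181 get /controller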


It looks like some version mismatch with respect to ZooKeeper.

Can someone please help with this?


Thank you.

Regards,
Mazhar Shaikh.


Re: reassign __consumer_offsets partitions

2015-12-18 Thread Damian Guy
I was just trying to get it to generate the JSON for reassignment, and the
output was empty, i.e.:

offsets.json
=
{"topics": [
  {"topic": "__consumer_offsets"}
],
 "version":1
}



bin/kafka-reassign-partitions.sh --zookeeper blah
--topics-to-move-json-file ~/offsets.json --broker-list
"2006,2007,2008,2009,2010" --generate
Current partition replica assignment

{"version":1,"partitions":[]}
Proposed partition reassignment configuration

{"version":1,"partitions":[]}


On 17 December 2015 at 15:32, Ben Stopford  wrote:

> Hi Damian
>
> The reassignment should treat the offsets topic as any other topic. I did
> a quick test and it seemed to work for me. Do you see anything suspicious
> in the controller log?
>
> B
> > On 16 Dec 2015, at 14:51, Damian Guy  wrote:
> >
> > Hi,
> >
> >
> > We have had some temporary nodes in our kafka cluster and I now need to
> > move assigned partitions off of those nodes onto the permanent members. I'm
> > familiar with the kafka-reassign-partitions script, but ... how do I get it
> > to work with the __consumer_offsets partition? It currently seems to ignore
> > it.
> >
> > Thanks,
> > Damian
>
>


Re: reassign __consumer_offsets partitions

2015-12-18 Thread Damian Guy
And in doing so I've answered my own question (I think!) - I don't
believe the topic has been created on that cluster yet...
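
One way to confirm is to check whether the topic exists on that cluster (the zookeeper address is a placeholder, as above):

bin/kafka-topics.sh --zookeeper blah --list | grep __consumer_offsets
bin/kafka-topics.sh --zookeeper blah --describe --topic __consumer_offsets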



how to reset kafka offset in zookeeper

2015-12-18 Thread Akhilesh Pathodia
Hi,

I want to reset the Kafka offset in ZooKeeper so that the consumer will
start reading messages from the first offset. I am using Flume as a consumer of
Kafka. I have set the Kafka property kafka.auto.offset.reset to "smallest",
but it does not reset the offset in ZooKeeper, and that's why Flume will not
read messages from the first offset.

Is there any way to reset the Kafka offset in ZooKeeper?

Thanks,
Akhilesh


Re: how to reset kafka offset in zookeeper

2015-12-18 Thread Todd Palino
The way to reset to smallest is to stop the consumer, delete the consumer
group from Zookeeper, and then restart with the property set to smallest.
Once your consumer has recreated the group and committed offsets, you can
change the auto.offset.reset property back to largest (if that is your
preference).
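
For reference, a hedged sketch of the delete step for an old (ZooKeeper-based) consumer group, using the ZooKeeper CLI; the group name is a placeholder, and every consumer in the group should be stopped first:

# Remove the group's state (offsets, owners, ids) from ZooKeeper
bin/zookeeper-shell.sh localhost:2181 rmr /consumers/group-name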

-Todd



-- 
Todd Palino
Staff Site Reliability Engineer
Data Infrastructure Streaming

linkedin.com/in/toddpalino


Re: how to reset kafka offset in zookeeper

2015-12-18 Thread Jens Rantil
Hi,

I noticed that a consumer in the new consumer API supports setting the
offset for a partition to the beginning. I assume doing so would also update
the offset in ZooKeeper eventually.
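
For reference, a minimal sketch of that with the 0.9 Java consumer (bootstrap server, group, topic, and partition are placeholders; note the new consumer commits offsets to Kafka's internal __consumer_offsets topic rather than to ZooKeeper):

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class SeekToStart {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // placeholder
        props.put("group.id", "my-group");                   // placeholder
        props.put("key.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        TopicPartition tp = new TopicPartition("my-topic", 0);
        consumer.assign(Collections.singletonList(tp));
        // Rewind this partition to its earliest available offset.
        consumer.seekToBeginning(tp);
        // position() forces the lazy seek to be evaluated.
        System.out.println("next offset: " + consumer.position(tp));
        consumer.close();
    }
}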

Cheers,
Jens



-- 
Jens Rantil
Backend engineer
Tink AB

Email: jens.ran...@tink.se
Phone: +46 708 84 18 32
Web: www.tink.se



Re: how to reset kafka offset in zookeeper

2015-12-18 Thread Marko Bonaći
You can also do this:
1. stop consumers
2. export offsets from ZK
3. make changes to the exported file
4. import offsets to ZK
5. start consumers

e.g.
bin/kafka-run-class.sh kafka.tools.ExportZkOffsets --group group-name
--output-file /tmp/zk-offsets --zkconnect localhost:2181
bin/kafka-run-class.sh kafka.tools.ImportZkOffsets --input-file
/tmp/zk-offsets --zkconnect localhost:2181
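
For reference, the exported file contains one line per partition, roughly in this form (group, topic, and offset values here are made-up examples); this is what you edit before importing:

/consumers/group-name/offsets/topic1/0:286894308
/consumers/group-name/offsets/topic1/1:284803985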

Marko Bonaći
Monitoring | Alerting | Anomaly Detection | Centralized Log Management
Solr & Elasticsearch Support
Sematext  | Contact




Re: how to reset kafka offset in zookeeper

2015-12-18 Thread Todd Palino
That works if you want to set to an arbitrary offset, Marko. However in the
case the OP described, wanting to reset to smallest, it is better to just
delete the consumer group and start the consumer with auto.offset.reset set
to smallest. The reason is that while you can pull the current smallest
offsets from the brokers and set them in Zookeeper for the consumer, by the
time you do that the smallest offset is likely no longer valid. This means
you’re going to resort to the offset reset logic anyways.

-Todd





-- 
Todd Palino
Staff Site Reliability Engineer
Data Infrastructure Streaming

linkedin.com/in/toddpalino


Re: how to reset kafka offset in zookeeper

2015-12-18 Thread Marko Bonaći
Hmm, I guess you're right, Todd :)
Just to confirm, you meant that, while you're changing the exported file, it
might happen that one of the segment files becomes eligible for cleanup by
retention, which would then make the imported offsets out of range?

Marko Bonaći
Monitoring | Alerting | Anomaly Detection | Centralized Log Management
Solr & Elasticsearch Support
Sematext  | Contact




Re: how to reset kafka offset in zookeeper

2015-12-18 Thread Dana Powers
If you don't like messing w/ ZK directly, another alternative is to
manually seek to offset 0 on all relevant topic-partitions (via
OffsetCommitRequest or your favorite client api) and change the
auto-offset-reset policy on your consumer to earliest/smallest. Bonus is
that this should also work for consumers that use kafka-backed offset
commit storage.
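
For example, a hedged sketch with the 0.9 Java consumer, assuming Kafka-backed offset storage (bootstrap server, group, topic, and partition are placeholders): commit offset 0 for the group, then restart the real consumer with auto.offset.reset=earliest (new consumer) or smallest (old consumer):

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class ResetToZero {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");  // placeholder
        props.put("group.id", "my-group");                  // the group to reset
        props.put("key.deserializer",
                  "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("value.deserializer",
                  "org.apache.kafka.common.serialization.ByteArrayDeserializer");

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            TopicPartition tp = new TopicPartition("my-topic", 0);
            consumer.assign(Collections.singletonList(tp));
            // Overwrite the group's committed offset for this partition with 0.
            consumer.commitSync(Collections.singletonMap(tp, new OffsetAndMetadata(0L)));
        }
    }
}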

-Dana



Re: how to reset kafka offset in zookeeper

2015-12-18 Thread Todd Palino
Yes, that’s right. It’s just work for no real gain :)

-Todd




-- 
Todd Palino
Staff Site Reliability Engineer
Data Infrastructure Streaming

linkedin.com/in/toddpalino


RE: Local Storage

2015-12-18 Thread Heath Ivie
Thank you very much Gwen

-Original Message-
From: Gwen Shapira [mailto:g...@confluent.io] 
Sent: Thursday, December 17, 2015 3:45 PM
To: users@kafka.apache.org
Subject: Re: Local Storage

Hi,

Kafka *is* a data store. It writes data to files on the OS file system: one
directory per partition, and a new segment file every configurable amount of time
(you can control this with log.roll.ms). The data format is specific to Kafka.
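
For example, a single partition's directory looks roughly like this (the /tmp/kafka-logs path assumes the default log.dirs setting; segment files are named after the base offset they start at):

$ ls /tmp/kafka-logs/topic1-0
00000000000000000000.index  00000000000000000000.log
00000000000037651230.index  00000000000037651230.log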

Hope this helps,

Gwen

On Thu, Dec 17, 2015 at 3:32 PM, Heath Ivie  wrote:

> Maybe someone can answer this question for me, because I cannot seem to
> find it.
>
> What is the data store that Kafka uses when it writes the logs to disk?
>
> I thought I saw a reference to KahaDB, but I am not sure if that is 
> correct.
>
>
> Heath Ivie
> Solutions Architect
>
>


Re: kafka-connect-jdbc: ids, timestamps, and transactions

2015-12-18 Thread Mark Drago
Ewen,

Thanks for the reply.  We'll proceed while keeping all of your points in
mind.  I looked around for a more focused forum for the jdbc connector
before posting here but didn't come across the confluent-platform group.
I'll direct any more questions about the jdbc connector there.  I'll also
close the github issue with a link to this thread.

Thanks again,
Mark.

On Wed, Dec 16, 2015 at 9:51 PM Ewen Cheslack-Postava 
wrote:

> Mark,
>
> There are definitely limitations to using JDBC for change data capture.
> Using a database-specific implementation, especially if you can read
> directly off the database's log, will be able to handle more situations
> like this. Cases like the one you describe are difficult to address
> efficiently working only with simple queries.
>
> The JDBC connector offers a few different modes for handling incremental
> queries. One of them uses both a timestamp and a unique ID, which will be
> more robust to issues like these. However, even with both, you can still
> come up with variants that can cause issues like the one you describe. You
> also have the option of using a custom query which might help if you can do
> something smarter by making assumptions about your table, but for now
> that's pretty limited for constructing incremental queries since the
> connector doesn't provide a way to track offset columns with custom
> queries. I'd like to improve the support for this in the future, but at
> some point it starts making sense to look at database-specific connectors.
>
> (By the way, this gets even messier once you start thinking about the
> variety of different isolation levels people may be using...)
>
> -Ewen
>
> P.S. Where to ask these questions is a bit confusing since Connect is part
> of Kafka. In general, for specific connectors I'd suggest asking on the
> corresponding mailing list for the project, which in the case of the JDBC
> connector would be the Confluent Platform mailing list here:
> https://groups.google.com/forum/#!forum/confluent-platform
>
> On Wed, Dec 16, 2015 at 5:27 AM, Mark Drago  wrote:
>
> > I had asked this in a github issue but I'm reposting here to try and get
> an
> > answer from a wider audience.
> >
> > Has any thought gone into how kafka-connect-jdbc will be impacted by SQL
> > transactions committing IDs and timestamps out-of-order?  Let me give an
> > example with two connections.
> >
> > 1: begin transaction
> > 1: insert (get id 1)
> > 2: begin transaction
> > 2: insert (get id 2)
> > 2: commit (recording id 2)
> > kafka-connect-jdbc runs and thinks it has handled everything through id 2
> > 1: commit (recording id 1)
> >
> > This would result in kafka-connect-jdbc missing id 1. The same thing
> could
> > happen with timestamps. I've read through some of the kafka-connect-jdbc
> > code and I think it may be susceptible to this problem, but I haven't run
> > it or verified that it would be an issue. Has this come up before? Are
> > there plans to deal with this situation?
> >
> > Obviously something like bottled-water for postgresql would handle this
> > nicely as it would get the changes once they're committed.
> >
> >
> > Thanks for any insight,
> >
> > Mark.
> >
> >
> > Original github issue:
> > https://github.com/confluentinc/kafka-connect-jdbc/issues/27
> >
>
>
>
> --
> Thanks,
> Ewen
>
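
For reference, the timestamp+incrementing mode Ewen mentions is configured roughly like this (a hedged sketch; the connection URL and column names are placeholders for your own schema):

name=jdbc-source-example
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
tasks.max=1
connection.url=jdbc:postgresql://localhost:5432/mydb
mode=timestamp+incrementing
timestamp.column.name=modified_at
incrementing.column.name=id
topic.prefix=jdbc-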


Monitoring MirrorMaker

2015-12-18 Thread Rajiv Jivan
I have a couple of questions on how to monitor MirrorMaker using
ConsumerOffsetChecker.
1. When viewing a topic I see multiple rows. Is each row for one partition?
2. I am looking to write a Nagios plugin to alert if MirrorMaker is running but
isn't keeping up, for example when there is a network connectivity issue. Is summing
up the lag and triggering alerts based on the value recommended?
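
For reference, a hedged sketch of the command and its output (group, topic, and owner are placeholders); each output row corresponds to one partition (the Pid column), and Lag is the column a check would sum:

bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --zkconnect localhost:2181 \
  --group mirrormaker-group --topic topic1

Group              Topic   Pid  Offset    logSize   Lag  Owner
mirrormaker-group  topic1  0    1200432   1200450   18   mirrormaker-group_host-1
mirrormaker-group  topic1  1    1187765   1187765   0    mirrormaker-group_host-1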

Also curious to know what other folks have been doing.

Regards,

Rajiv


Better strategy for sending a message to multiple topics

2015-12-18 Thread Abel .
Hi,

I have this scenario where I need to send a message to multiple topics. I
create a single KafkaProducer, prepare the payload, and then I call the send
method of the producer for each topic with the corresponding ProducerRecord
for the topic and the fixed message. However, I have noticed that this
procedure takes some time depending on the number of topics. For instance,
sending a message to 30 topics takes more than 3s, because each request
takes about 100ms to return from the send method. Is there a better way to
accomplish this same task? Any recommendation?
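
For reference, with the new Java producer, send() is asynchronous and returns a Future; if the ~100ms per topic comes from blocking on each Future (or from the old sync producer), firing all sends first and only waiting at the end lets requests batch and pipeline. A minimal sketch (broker address and topic names are placeholders):

import java.util.Arrays;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class MultiTopicSend {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");  // placeholder
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");

        List<String> topics = Arrays.asList("topic-a", "topic-b", "topic-c");  // placeholders
        String payload = "the fixed message";

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            for (String topic : topics) {
                // Asynchronous send with a callback; do not block on the Future here.
                producer.send(new ProducerRecord<>(topic, payload),
                        (metadata, exception) -> {
                            if (exception != null) {
                                exception.printStackTrace();
                            }
                        });
            }
            // close() flushes any outstanding sends before returning.
        }
    }
}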

Regards,

Abel.


Re: Better strategy for sending a message to multiple topics

2015-12-18 Thread Jens Rantil
Hi,


Why don't your consumers instead subscribe to a single topic used to broadcast 
to all of them? That way your consumers and producer will be much simpler.




Cheers,

Jens







Producing when broker goes down

2015-12-18 Thread Buck Tandyco
I'm stress testing my Kafka setup. I have a producer that is working just fine,
and then I kill off one of the two brokers that I have running with a replication
factor of 2. I'm able to keep receiving from my consumer thread, but my
producer generates this exception: "kafka.common.FailedToSendMessageException:
Failed to send messages after 3 tries"
I've tried messing with the producer config such as timeouts, reconnect
intervals, etc., but haven't made any progress.
Does anyone have any ideas of what I might try?
Thanks,
Zack


Re: kafka-connect-jdbc: ids, timestamps, and transactions

2015-12-18 Thread James Cheng
Mark, what database are you using?

If you are using MySQL...



There is a not-yet-finished Kafka MySQL Connector at 
https://github.com/wushujames/kafka-mysql-connector. It tails the MySQL binlog, 
and so will handle the situation you describe.

But, as I mentioned, I haven't finished it yet.

If you are using MySQL and don't specifically need/want Kafka Connect, then 
there are a bunch of other options. There is a list of them at 
https://github.com/wushujames/mysql-cdc-projects. But, I'd recommend using the 
Kafka Connect framework, since it was built for this exact purpose.



-James

> On Dec 18, 2015, at 12:08 PM, Mark Drago  wrote:
>
> Ewen,
>
> Thanks for the reply.  We'll proceed while keeping all of your points in
> mind.  I looked around for a more focused forum for the jdbc connector
> before posting here but didn't come across the confluent-platform group.
> I'll direct any more questions about the jdbc connector there.  I'll also
> close the github issue with a link to this thread.
>
> Thanks again,
> Mark.
>
> On Wed, Dec 16, 2015 at 9:51 PM Ewen Cheslack-Postava 
> wrote:
>
>> Mark,
>>
>> There are definitely limitations to using JDBC for change data capture.
>> Using a database-specific implementation, especially if you can read
>> directly off the database's log, will be able to handle more situations
>> like this. Cases like the one you describe are difficult to address
>> efficiently working only with simple queries.
>>
>> The JDBC connector offers a few different modes for handling incremental
>> queries. One of them uses both a timestamp and a unique ID, which will be
>> more robust to issues like these. However, even with both, you can still
>> come up with variants that can cause issues like the one you describe. You
>> also have the option of using a custom query which might help if you can do
>> something smarter by making assumptions about your table, but for now
>> that's pretty limited for constructing incremental queries since the
>> connector doesn't provide a way to track offset columns with custom
>> queries. I'd like to improve the support for this in the future, but at
>> some point it starts making sense to look at database-specific connectors.
>>
>> (By the way, this gets even messier once you start thinking about the
>> variety of different isolation levels people may be using...)
>>
>> -Ewen
>>
>> P.S. Where to ask these questions is a bit confusing since Connect is part
>> of Kafka. In general, for specific connectors I'd suggest asking on the
>> corresponding mailing list for the project, which in the case of the JDBC
>> connector would be the Confluent Platform mailing list here:
>> https://groups.google.com/forum/#!forum/confluent-platform
>>
>> On Wed, Dec 16, 2015 at 5:27 AM, Mark Drago  wrote:
>>
>>> I had asked this in a github issue but I'm reposting here to try and get
>> an
>>> answer from a wider audience.
>>>
>>> Has any thought gone into how kafka-connect-jdbc will be impacted by SQL
>>> transactions committing IDs and timestamps out-of-order?  Let me give an
>>> example with two connections.
>>>
>>> 1: begin transaction
>>> 1: insert (get id 1)
>>> 2: begin transaction
>>> 2: insert (get id 2)
>>> 2: commit (recording id 2)
>>> kafka-connect-jdbc runs and thinks it has handled everything through id 2
>>> 1: commit (recording id 1)
>>>
>>> This would result in kafka-connect-jdbc missing id 1. The same thing
>> could
>>> happen with timestamps. I've read through some of the kafka-connect-jdbc
>>> code and I think it may be susceptible to this problem, but I haven't run
>>> it or verified that it would be an issue. Has this come up before? Are
>>> there plans to deal with this situation?
>>>
>>> Obviously something like bottled-water for postgresql would handle this
>>> nicely as it would get the changes once they're committed.
>>>
>>>
>>> Thanks for any insight,
>>>
>>> Mark.
>>>
>>>
>>> Original github issue:
>>> https://github.com/confluentinc/kafka-connect-jdbc/issues/27
>>>
>>
>>
>>
>> --
>> Thanks,
>> Ewen
>>




This email and any attachments may contain confidential and privileged material 
for the sole use of the intended recipient. Any review, copying, or 
distribution of this email (or any attachments) by others is prohibited. If you 
are not the intended recipient, please contact the sender immediately and 
permanently delete this email and any attachments. No employee or agent of TiVo 
Inc. is authorized to conclude any binding agreement on behalf of TiVo Inc. by 
email. Binding agreements with TiVo Inc. may only be made by a signed written 
agreement.


Re: Producing when broker goes down

2015-12-18 Thread Alex Loddengaard
Hi Buck,

What are your settings for:

   - acks
   - request.timeout.ms
   - timeout.ms
   - min.insync.replicas (on the broker)
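
For reference, kafka.common.FailedToSendMessageException is thrown by the old Scala producer; with that producer the retry-related settings look roughly like this (a hedged sketch, values illustrative rather than recommendations):

# Wait for all in-sync replicas to acknowledge
request.required.acks=-1
# Default is 3, hence "Failed to send messages after 3 tries"
message.send.max.retries=10
# Back off between retries so leader election can complete
retry.backoff.ms=500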

Thanks,

Alex



Re: Monitoring MirrorMaker

2015-12-18 Thread Pablo Fischer
We ran through this a few months ago. Here is a list of things and tools
I'd recommend:

 - Install Burrow. It monitors the consumers and makes sure they are not
lagging behind; it also covers other corner cases that can get tricky with
the offset checker. We query Burrow (it has an HTTP API) and then generate alerts,
though it also has email/alert support.
 - Not sure how it is on 0.9 (we are still on 0.8.2), but we use a different
MirrorMaker that preserves partitions. If I remember correctly, when we used
the official one we found it was mirroring everything to a single partition
:/. Regardless, we also scan the logs for errors, report them, and even
auto-restart the process if needed.
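
For reference, a hedged sketch of querying Burrow from a Nagios check (host, port, cluster name, and group name are placeholders; the paths follow Burrow's v2 HTTP API):

curl -s http://burrow-host:8000/v2/kafka/local/consumer/mirrormaker-group/status
curl -s http://burrow-host:8000/v2/kafka/local/consumer/mirrormaker-group/lag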




-- 
Pablo