Re: Programmatic Kafka version detection/extraction?

2014-11-11 Thread Steve Morin
That would be great!

On Mon, Nov 10, 2014 at 9:45 PM, Jun Rao  wrote:

> Otis,
>
> We don't have an api for that now. We can probably expose this as a JMX as
> part of kafka-1481.
>
> Thanks,
>
> Jun
>
> On Mon, Nov 10, 2014 at 7:17 PM, Otis Gospodnetic <
> otis.gospodne...@gmail.com> wrote:
>
> > Hi,
> >
> > Is there a way to detect which version of Kafka one is running?
> > Is there an API for that, or a constant with this value, or maybe an
> MBean
> > or some other way to get to this info?
> >
> > Thanks,
> > Otis
> > --
> > Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> > Solr & Elasticsearch Support * http://sematext.com/
> >
>


Re: Adding replicas to existing topic cause data loss in some partitions

2014-11-11 Thread Shangan Chen
Sorry, I didn't pay attention to those logs. I ran a script at night and
received the feedback in the morning; by then the server log had overflowed
with OutOfRange exceptions, so I couldn't catch any clues. I'll take more
care next time. Anyway, thanks a lot.

On Mon, Nov 10, 2014 at 9:59 AM, Jun Rao  wrote:

> Any error in the controller/state-change log when you increased the
> replication factor? If you describe those topics, are both replicas in ISR?
> The answers to those questions will help us understand whether this is a
> broker side or consumer-side issue.
>
> Thanks,
>
> Jun
>
> On Thu, Nov 6, 2014 at 11:56 PM, Shangan Chen 
> wrote:
>
> > I have a kafka cluster where every topic has only one replica. Recently I
> > extended every topic to 2 replicas. Most topics work fine, but some large
> > topics have problems with part of their partitions. The consumer throws an
> > offset OutOfRange exception; in fact the consumer's requested offset is
> > bigger than the latest offset. I suspect there is a bug in the tool that
> > adds replicas to an existing topic.
> >
> > I added replicas by following this guide:
> >
> >
> http://kafka.apache.org/081/documentation.html#basic_ops_increase_replication_factor
> >
> >
> > Has anyone faced this problem before and figured out how to avoid
> > it?
> >
> > --
> > have a good day!
> > chenshang'an
> >
>



-- 
have a good day!
chenshang'an


Re: expanding cluster and reassigning parititions without restarting producer

2014-11-11 Thread Shlomi Hazan
Neha, I understand that the producer kafka.javaapi.producer.Producer shown
in the examples is old, and that a new producer
(org.apache.kafka.clients.producer) is available. Is it available for
0.8.1.1? How does it work? Does it fire a trigger when partitions are
added, or does the producer refresh some cache every given time period?

Shlomi


On Tue, Nov 11, 2014 at 4:25 AM, Neha Narkhede 
wrote:

> How can I auto refresh keyed producers to use new partitions as these
> partitions are added?
>
> Try using the new producer under org.apache.kafka.clients.producer.
>
> Thanks,
> Neha
>
> On Mon, Nov 10, 2014 at 8:52 AM, Bhavesh Mistry <
> mistry.p.bhav...@gmail.com>
> wrote:
>
> > I had a different experience with expanding partitions for the new
> > producer and its impact. I only tried it for non-keyed messages. I would
> > always advise keeping the batch size relatively low, or planning for
> > expansion with the new java producer in advance (or since inception);
> > otherwise running producer code is impacted.
> >
> > Here is mail chain:
> >
> >
> http://mail-archives.apache.org/mod_mbox/kafka-dev/201411.mbox/%3ccaoejijit4cgry97dgzfjkfvaqfduv-o1x1kafefbshgirkm...@mail.gmail.com%3E
> >
> > Thanks,
> >
> > Bhavesh
> >
> > On Mon, Nov 10, 2014 at 5:20 AM, Shlomi Hazan  wrote:
> >
> > > Hmmm..
> > > The Java producer example seems to ignore added partitions too...
> > > How can I auto refresh keyed producers to use new partitions as these
> > > partitions are added?
> > >
> > >
> > > On Mon, Nov 10, 2014 at 12:33 PM, Shlomi Hazan 
> wrote:
> > >
> > > > One more thing:
> > > > I saw that the Python client is also unaffected by addition of
> > partitions
> > > > to a topic and that it continues to send requests only to the old
> > > > partitions.
> > > > Is this also handled appropriately by the Java producer? Will it see
> > > > the change and produce to the new partitions as well?
> > > > Shlomi
> > > >
> > > > On Mon, Nov 10, 2014 at 9:34 AM, Shlomi Hazan 
> > wrote:
> > > >
> > > >> No I don't see anything like that, the question was aimed at
> learning
> > if
> > > >> it is worthwhile to make the effort of reimplementing the Python
> > > producer
> > > >> in Java, so I will not make all the effort just to be disappointed
> > > >> afterwards.
> > > >> I understand I have nothing to worry about, so I will try to simulate
> > this
> > > >> situation in small scale...
> > > >> maybe 3 brokers, one topic with one partition and then add
> partitions.
> > > >> we'll see.
> > > >> thanks for clarifying.
> > > >> Oh, Good luck with Confluent!!
> > > >> :)
> > > >>
> > > >> On Mon, Nov 10, 2014 at 4:17 AM, Neha Narkhede <
> > neha.narkh...@gmail.com
> > > >
> > > >> wrote:
> > > >>
> > > >>> The producer might get an error code if the leader of the partitions
> > > >>> being reassigned also changes. However, it should retry and succeed.
> > > >>> Do you see a behavior that suggests otherwise?
> > > >>>
> > > >>> On Sat, Nov 8, 2014 at 11:45 PM, Shlomi Hazan 
> > > wrote:
> > > >>>
> > > >>> > Hi All,
> > > >>> > I recently had an issue producing from python where expanding a
> > > cluster
> > > >>> > from 3 to 5 nodes and reassigning partitions forced me to restart
> > the
> > > >>> > producer b/c of KeyError thrown.
> > > >>> > Is this situation handled by the Java producer automatically or
> > need
> > > I
> > > >>> do
> > > >>> > something to have the java producer refresh itself to see the
> > > >>> reassigned
> > > >>> > partition layout and produce away ?
> > > >>> > Shlomi
> > > >>> >
> > > >>>
> > > >>
> > > >>
> > > >
> > >
> >
>


zookeeper snapshot files eat up disk space

2014-11-11 Thread Shlomi Hazan
Hi,
My zookeeper 'dataLogDir' is eating up my disk with tons of snapshot files.
What are these files? Which files can I delete? Are week-old files
disposable?
This folder only gets bigger...
How can I avoid blowing up my disk?
Thanks,
Shlomi


Re: zookeeper snapshot files eat up disk space

2014-11-11 Thread Joe Stein
http://zookeeper.apache.org/doc/r3.4.6/zookeeperAdmin.html#Ongoing+Data+Directory+Cleanup

/***
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop 
/

On Tue, Nov 11, 2014 at 9:24 AM, Shlomi Hazan  wrote:

> Hi,
> My zookeeper 'dataLogDir' is eating up my disk with tons of snapshot files.
> what are these files? what files can I delete? are week old files
> disposable?
> This folder only gets bigger...
> How can I avoid blowing my disk?
> Thanks,
> Shlomi
>


Re: zookeeper snapshot files eat up disk space

2014-11-11 Thread Shlomi Hazan
That looks like a complete answer.
BUT
Just to be sure: it says "Automatic purging of the snapshots and
corresponding transaction logs was introduced in version 3.4.0".
Using 0.8.1.1 means that I will have to purge manually, right?
Is there some convention for Kafka users? E.g.: delete all but the last X=3
maybe?
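For reference, ZooKeeper 3.4+ can purge these automatically via two zoo.cfg settings; the values below are illustrative, not a recommendation:

```properties
# zoo.cfg (ZooKeeper 3.4.0 and later)
# keep only the 3 most recent snapshots and their transaction logs
autopurge.snapRetainCount=3
# run the purge task every 24 hours (0 disables purging, which is the default)
autopurge.purgeInterval=24
```

On older ZooKeeper versions, the bundled zkCleanup.sh script (which wraps the org.apache.zookeeper.server.PurgeTxnLog class) can be run periodically from cron to the same effect.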
Shlomi

On Tue, Nov 11, 2014 at 4:27 PM, Joe Stein  wrote:

>
> http://zookeeper.apache.org/doc/r3.4.6/zookeeperAdmin.html#Ongoing+Data+Directory+Cleanup
>
> /***
>  Joe Stein
>  Founder, Principal Consultant
>  Big Data Open Source Security LLC
>  http://www.stealth.ly
>  Twitter: @allthingshadoop 
> /
>
> On Tue, Nov 11, 2014 at 9:24 AM, Shlomi Hazan  wrote:
>
> > Hi,
> > My zookeeper 'dataLogDir' is eating up my disk with tons of snapshot
> files.
> > what are these files? what files can I delete? are week old files
> > disposable?
> > This folder only gets bigger...
> > How can I avoid blowing my disk?
> > Thanks,
> > Shlomi
> >
>


Re: kafka test jars in sbt?

2014-11-11 Thread Markus Jais
Thanks Joe, 

that does the trick with sbt and the 0.8.2-beta test jar.

Regards,

Markus
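For anyone else hitting this, the build.sbt lines end up looking something like the following sketch (the 0.8.2-beta version string is an assumption; adjust to the Kafka release you use):

```scala
// build.sbt -- pull in both the regular kafka jar and its test jar;
// the "test" classifier selects the test-jar artifact, and the "test"
// configuration limits it to the test classpath
libraryDependencies ++= Seq(
  "org.apache.kafka" %% "kafka" % "0.8.2-beta",
  "org.apache.kafka" %% "kafka" % "0.8.2-beta" % "test" classifier "test"
)
```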


Joe Crobak wrote on Sunday, 9 November 2014 at 23:28:

>
>
>For sbt, you need to use something like:
>
>"org.apache.kafka" %% "kafka" % "0.8.2-beta" % "test" classifier "test"
>
>That tells sbt to pull in the kafka artifact with the "test" classifier
>only when running tests. The %% tells sbt to fill in the scala version (so
>it'll map to "kafka_2.10" like in your example).
>
>
>On Thu, Nov 6, 2014 at 6:56 PM, Jun Rao  wrote:
>
>> The following is how samza references the kafka test jar in gradle.
>>
>> testCompile "org.apache.kafka:kafka_$scalaVersion:$kafkaVersion:test"
>>
>> Thanks,
>>
>> Jun
>>
>>
>> On Thu, Nov 6, 2014 at 6:38 AM, Markus Jais  wrote:
>>
>> > Hello,
>> >
>> > I want to use the kafka_2.10-0.8.2-beta-test.jar in my Scala project.
>> >
>> > It can be found here:
>> > http://repo.maven.apache.org/maven2/org/apache/kafka/kafka_2.10/0.8.1.1/
>> >
>> >
>> > In my build.sbt I write the following definition:
>> > "org.apache.kafka" % "kafka_2.10" % "0.8.2-beta-test"
>> >
>> >
>> > But sbt cannot find it. Has anybody had any success with this?
>> >
>> > I already used "gradle testJar" and the test jar gets published to:
>> >
>> > .m2/repository/org/apache/kafka/kafka_2.10/0.8.2-beta
>> >
>> >
>> > but sbt is looking for a:
>> >
>> >
>> .m2/repository/org/apache/kafka/kafka_2.10/0.8.2-beta-test/kafka_2.10-0.8.2-beta-test.pom
>> >
>> > any tips on how to use the kafka test jar (together with the regular
>> kafka
>> > jar) in an build.sbt file?
>> >
>> > I want to start a kafka cluster for a unit test.
>> >
>> > Cheers,
>> >
>> > Markus
>>
>
>
>

Re: Programmatic Kafka version detection/extraction?

2014-11-11 Thread Otis Gospodnetic
Hi Jun,

Sounds good.  But is the version number stored anywhere from where it could
be gotten?

Thanks,
Otis
--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/


On Tue, Nov 11, 2014 at 12:45 AM, Jun Rao  wrote:

> Otis,
>
> We don't have an api for that now. We can probably expose this as a JMX as
> part of kafka-1481.
>
> Thanks,
>
> Jun
>
> On Mon, Nov 10, 2014 at 7:17 PM, Otis Gospodnetic <
> otis.gospodne...@gmail.com> wrote:
>
> > Hi,
> >
> > Is there a way to detect which version of Kafka one is running?
> > Is there an API for that, or a constant with this value, or maybe an
> MBean
> > or some other way to get to this info?
> >
> > Thanks,
> > Otis
> > --
> > Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> > Solr & Elasticsearch Support * http://sematext.com/
> >
>


Re: zookeeper snapshot files eat up disk space

2014-11-11 Thread Ray Rodriguez
There is a sidecar JVM project called Exhibitor that manages this for you.
It's from Netflix, so it's a bit AWS-centric, but it is still a good source
for how to manage those log files. You may also want to look into some ZK
config settings to make sure your logs do not grow too large, truncating
based on space or time.

On Tue, Nov 11, 2014 at 9:40 AM, Joe Stein  wrote:

> http://zookeeper.apache.org/doc/r3.4.6/zookeeperAdmin.html#Ongoing+Data+Directory+Cleanup
> /***
>  Joe Stein
>  Founder, Principal Consultant
>  Big Data Open Source Security LLC
>  http://www.stealth.ly
>  Twitter: @allthingshadoop 
> /
> On Tue, Nov 11, 2014 at 9:24 AM, Shlomi Hazan  wrote:
>> Hi,
>> My zookeeper 'dataLogDir' is eating up my disk with tons of snapshot files.
>> what are these files? what files can I delete? are week old files
>> disposable?
>> This folder only gets bigger...
>> How can I avoid blowing my disk?
>> Thanks,
>> Shlomi
>>

Re: JavaKafkaWordCount not working under Spark Streaming

2014-11-11 Thread Akhil Das
Here's a simple working version.


import com.google.common.collect.Lists;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.function.FlatMapFunction;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.api.java.function.Function2;
import org.apache.spark.api.java.function.PairFunction;
import org.apache.spark.streaming.Duration;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaPairDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka.KafkaUtils;
import scala.Tuple2;

import java.util.HashMap;
import java.util.Map;

/**
 * Created by akhld on 11/11/14.
 */
public class KafkaWordcount {

    public static void main(String[] args) {

        // Location of the Spark directory
        String sparkHome = "/home/akhld/mobi/localcluster/spark-1";

        // URL of the Spark cluster
        String sparkUrl = "spark://akhldz:7077";

        // Location of the required JAR files
        String jarFiles =
                "/home/akhld/mobi/temp/kafkwc.jar,/home/akhld/.ivy2/cache/org.apache.spark/spark-streaming-kafka_2.10/jars/spark-streaming-kafka_2.10-1.1.0.jar,/home/akhld/.ivy2/cache/com.101tec/zkclient/jars/zkclient-0.3.jar,/home/akhld/.ivy2/cache/org.apache.kafka/kafka_2.10/jars/kafka_2.10-0.8.0.jar,/home/akhld/.ivy2/cache/com.yammer.metrics/metrics-core/jars/metrics-core-2.2.0.jar";

        SparkConf sparkConf = new SparkConf();
        sparkConf.setAppName("JavaKafkaWordCount");
        sparkConf.setJars(new String[]{jarFiles});
        sparkConf.setMaster(sparkUrl);
        sparkConf.setSparkHome(sparkHome);

        // These are the minimal things that are required
        Map<String, Integer> topicMap = new HashMap<String, Integer>();
        topicMap.put("test", 1);
        String kafkaGroup = "groups";
        String zkQuorum = "localhost:2181";

        // Create the context with a 2 second batch size
        JavaStreamingContext jssc =
                new JavaStreamingContext(sparkConf, new Duration(2000));

        JavaPairDStream<String, String> messages =
                KafkaUtils.createStream(jssc, zkQuorum, kafkaGroup, topicMap);

        JavaDStream<String> lines = messages.map(
                new Function<Tuple2<String, String>, String>() {
                    @Override
                    public String call(Tuple2<String, String> tuple2) {
                        return tuple2._2();
                    }
                });

        JavaDStream<String> words = lines.flatMap(
                new FlatMapFunction<String, String>() {
                    @Override
                    public Iterable<String> call(String x) {
                        return Lists.newArrayList(x.split(" "));
                    }
                });

        JavaPairDStream<String, Integer> wordCounts = words.mapToPair(
                new PairFunction<String, String, Integer>() {
                    @Override
                    public Tuple2<String, Integer> call(String s) {
                        return new Tuple2<String, Integer>(s, 1);
                    }
                }).reduceByKey(new Function2<Integer, Integer, Integer>() {
                    @Override
                    public Integer call(Integer i1, Integer i2) {
                        return i1 + i2;
                    }
                });

        wordCounts.print();
        jssc.start();
        jssc.awaitTermination();
    }
}



Thanks
Best Regards

On Tue, Nov 11, 2014 at 5:37 AM, Something Something <
mailinglist...@gmail.com> wrote:

> I am not running locally.  The Spark master is:
>
> "spark://:7077"
>
>
>
> On Mon, Nov 10, 2014 at 3:47 PM, Tathagata Das <
> tathagata.das1...@gmail.com> wrote:
>
>> What is the Spark master that you are using. Use local[4], not local
>> if you are running locally.
>>
>> On Mon, Nov 10, 2014 at 3:01 PM, Something Something
>>  wrote:
>> > I am embarrassed to admit but I can't get a basic 'word count' to work
>> under
>> > Kafka/Spark streaming.  My code looks like this.  I  don't see any word
>> > counts in console output.  Also, don't see any output in UI.  Needless
>> to
>> > say, I am newbie in both 'Spark' as well as 'Kafka'.
>> >
>> > Please help.  Thanks.
>> >
>> > Here's the code:
>> >
>> > public static void main(String[] args) {
>> > if (args.length < 4) {
>> > System.err.println("Usage: JavaKafkaWordCount 
>> 
>> >  ");
>> > System.exit(1);
>> > }
>> >
>> > //StreamingExamples.setStreamingLogLevels();
>> > //SparkConf sparkConf = new
>> > SparkConf().setAppName("JavaKafkaWordCount");
>> >
>> > // Location of the Spark directory
>> > String sparkHome = "/opt/mapr/spark/spark-1.0.2/";
>> >
>> > // URL of the Spark cluster
>> > String sparkUrl = "spark://mymachine:7077";
>> >
>> > // Location of the required JAR files
>> > String jarFiles =
>> >
>> "./spark-streaming-kafka_2.10-1.1.0.jar,./DlSpark-1.0-SNAPSHOT.jar,./zkclient-0.3.jar,./kafka_2.10-0.8.1.1.jar,./metrics-core-2.2.0.jar";
>> >
>> > SparkConf sparkConf = new SparkConf();
>> > sparkConf.setAppName("JavaKafkaWordCount");
>> > sparkConf.setJars(new String[]{jarFiles});
>> > sparkConf.setMaster(sparkUrl);
>> > spark

Re: Programmatic Kafka version detection/extraction?

2014-11-11 Thread Jun Rao
Currently, the version number is only stored in our build config file,
gradle.properties. Not sure how we can automatically extract it and expose
it in an mbean. How do other projects do this?

Thanks,

Jun

On Tue, Nov 11, 2014 at 7:05 AM, Otis Gospodnetic <
otis.gospodne...@gmail.com> wrote:

> Hi Jun,
>
> Sounds good.  But is the version number stored anywhere from where it could
> be gotten?
>
> Thanks,
> Otis
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/
>
>
> On Tue, Nov 11, 2014 at 12:45 AM, Jun Rao  wrote:
>
> > Otis,
> >
> > We don't have an api for that now. We can probably expose this as a JMX
> as
> > part of kafka-1481.
> >
> > Thanks,
> >
> > Jun
> >
> > On Mon, Nov 10, 2014 at 7:17 PM, Otis Gospodnetic <
> > otis.gospodne...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > Is there a way to detect which version of Kafka one is running?
> > > Is there an API for that, or a constant with this value, or maybe an
> > MBean
> > > or some other way to get to this info?
> > >
> > > Thanks,
> > > Otis
> > > --
> > > Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> > > Solr & Elasticsearch Support * http://sematext.com/
> > >
> >
>


Re: change retention for a topic on the fly does not work

2014-11-11 Thread Chen Wang
For those who might need to do the same thing, the command is
bin/kafka-topics.sh --zookeeper localhost:2182 --alter --topic yourconfig
 --config retention.ms=17280
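Note that retention.ms is specified in milliseconds, so the 17280 above is only about 17 seconds. As a quick sanity check, the 48 hours used in the original attempt works out to:

```shell
# retention.ms takes milliseconds; 48 hours is
echo $((48 * 60 * 60 * 1000))   # prints 172800000
```

i.e., --config retention.ms=172800000 for a 48-hour retention.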

On Mon, Nov 10, 2014 at 4:46 PM, Chen Wang 
wrote:

> Hey guys,
> i am using kafka_2.9.2-0.8.1.1
>
>  bin/kafka-topics.sh --zookeeper localhost:2182 --alter --topic my_topic
>  --config log.retention.hours.per.topic=48
>
> It says:
> Error while executing topic command requirement failed: Unknown
> configuration "log.retention.hours.per.topic".
> java.lang.IllegalArgumentException: requirement failed: Unknown
> configuration "log.retention.hours.per.topic".
> at scala.Predef$.require(Predef.scala:214)
> at kafka.log.LogConfig$$anonfun$validateNames$1.apply(LogConfig.scala:138)
> at kafka.log.LogConfig$$anonfun$validateNames$1.apply(LogConfig.scala:137)
> at scala.collection.Iterator$class.foreach(Iterator.scala:772)
> at
> scala.collection.JavaConversions$JEnumerationWrapper.foreach(JavaConversions.scala:578)
> at kafka.log.LogConfig$.validateNames(LogConfig.scala:137)
> at kafka.log.LogConfig$.validate(LogConfig.scala:145)
>
> Any idea?
> Chen
>
>
>


Re: Cannot connect to Kafka from outside of EC2

2014-11-11 Thread Sameer Yami
Hi Guozhang,

I was wondering if you found anything wrong in the logs?

thanks


On Fri, Nov 7, 2014 at 4:19 PM, Sameer Yami  wrote:

> Hi Guozhang,
>
> Attached are the two logs with debug enabled.
>
> Thanks!
>
> On Fri, Nov 7, 2014 at 2:09 PM, Sameer Yami  wrote:
>
>> The version is kafka_2.10-0.8.1.1. It is not the latest trunk.
>> Will try enabling debug logging.
>>
>> thanks
>>
>>
>> On Thu, Nov 6, 2014 at 9:37 PM, Guozhang Wang  wrote:
>>
>>> Sameer,
>>>
>>> The server logs do not contain any non-INFO logs, which is a bit weird.
>>> Did
>>> you deploy the current trunk of Kafka? Also could you enable DEBUG level
>>> logging on Kafka brokers?
>>>
>>> Guozhang
>>>
>>> On Wed, Nov 5, 2014 at 3:50 PM, Sameer Yami  wrote:
>>>
>>> > The server.log was taken separately.
>>> > We ran the test again and the server and producer logs are below (to
>>> get
>>> > same timings).
>>> >
>>> >
>>> > Thanks!
>>> >
>>> >
>>> >
>>> 
>>> >
>>> >
>>> >
>>> > Producer Logs -
>>> >
>>> >
>>> > 2014-11-05 23:38:58,693
>>> > Thread-3-SendThread(ip-172-31-25-198.us-west-1.compute.internal:2181)
>>> > DEBUG org.apache.zookeeper.ClientCnxn-759: Got ping response for
>>> sessionid:
>>> > 0x1498251e8680002 after 0ms
>>> > 2014-11-05 23:39:00,695
>>> > Thread-3-SendThread(ip-172-31-25-198.us-west-1.compute.internal:2181)
>>> > DEBUG org.apache.zookeeper.ClientCnxn-759: Got ping response for
>>> sessionid:
>>> > 0x1498251e8680002 after 0ms
>>> > 2014-11-05 23:39:02,696
>>> > Thread-3-SendThread(ip-172-31-25-198.us-west-1.compute.internal:2181)
>>> > DEBUG org.apache.zookeeper.ClientCnxn-759: Got ping response for
>>> sessionid:
>>> > 0x1498251e8680002 after 0ms
>>> > 2014-11-05 23:39:02,828 pool-13-thread-2   INFO
>>> > kafka.utils.VerifiableProperties-68: Verifying properties
>>> > 2014-11-05 23:39:02,829 pool-13-thread-2   INFO
>>> > kafka.utils.VerifiableProperties-68: Property auto.commit.interval.ms
>>> is
>>> > overridden to 1000
>>> > 2014-11-05 23:39:02,829 pool-13-thread-2   INFO
>>> > kafka.utils.VerifiableProperties-68: Property auto.offset.reset is
>>> > overridden to smallest
>>> > 2014-11-05 23:39:02,829 pool-13-thread-2   INFO
>>> > kafka.utils.VerifiableProperties-68: Property consumer.timeout.ms is
>>> > overridden to 10
>>> > 2014-11-05 23:39:02,829 pool-13-thread-2   INFO
>>> > kafka.utils.VerifiableProperties-68: Property group.id is overridden
>>> to
>>> > TestCheck
>>> > 2014-11-05 23:39:02,830 pool-13-thread-2   WARN
>>> > kafka.utils.VerifiableProperties-83: Property serializer.class is not
>>> valid
>>> > 2014-11-05 23:39:02,830 pool-13-thread-2   INFO
>>> > kafka.utils.VerifiableProperties-68: Property zookeeper.connect is
>>> > overridden to 172.31.25.198:2181
>>> > 2014-11-05 23:39:02,831 pool-13-thread-2   INFO
>>> > kafka.consumer.ZookeeperConsumerConnector-68:
>>> > [TestCheck_ip-172-31-25-198-1415230742830-f3dfc362], Connecting to
>>> > zookeeper instance at 172.31.25.198:2181
>>> > 2014-11-05 23:39:02,831 pool-13-thread-2  DEBUG
>>> > org.I0Itec.zkclient.ZkConnection-63: Creating new ZookKeeper instance
>>> to
>>> > connect to 172.31.25.198:2181.
>>> > 2014-11-05 23:39:02,831 pool-13-thread-2   INFO
>>> > org.apache.zookeeper.ZooKeeper-379: Initiating client connection,
>>> > connectString=172.31.25.198:2181 sessionTimeout=6000
>>> > watcher=org.I0Itec.zkclient.ZkClient@3903b165
>>> > 2014-11-05 23:39:02,831 ZkClient-EventThread-29-172.31.25.198:2181
>>>  INFO
>>> > org.I0Itec.zkclient.ZkEventThread-64: Starting ZkClient event thread.
>>> > 2014-11-05 23:39:02,831 pool-13-thread-1   INFO
>>> > kafka.utils.VerifiableProperties-68: Verifying properties
>>> > 2014-11-05 23:39:02,836 pool-13-thread-2-SendThread()   INFO
>>> > org.apache.zookeeper.ClientCnxn-1061: Opening socket connection to
>>> server /
>>> > 172.31.25.198:2181
>>> > 2014-11-05 23:39:02,836 pool-13-thread-1   WARN
>>> > kafka.utils.VerifiableProperties-83: Property batch.size is not valid
>>> > 2014-11-05 23:39:02,832 pool-13-thread-2  DEBUG
>>> > org.I0Itec.zkclient.ZkClient-878: Awaiting connection to Zookeeper
>>> server
>>> > 2014-11-05 23:39:02,836 pool-13-thread-1   INFO
>>> > kafka.utils.VerifiableProperties-68: Property message.send.max.retries
>>> is
>>> > overridden to 10
>>> > 2014-11-05 23:39:02,836 pool-13-thread-2  DEBUG
>>> > org.I0Itec.zkclient.ZkClient-628: Waiting for keeper state
>>> SyncConnected
>>> > 2014-11-05 23:39:02,837 pool-13-thread-1   INFO
>>> > kafka.utils.VerifiableProperties-68: Property metadata.broker.list is
>>> > overridden to 172.31.25.198:9092
>>> > 2014-11-05 23:39:02,837 pool-13-thread-1   INFO
>>> > kafka.utils.VerifiableProperties-68: Property retry.backoff.ms is
>>> > overridden to 1000
>>> > 2014-11-05 23:39:02,837 pool-13-thread-1   INFO
>>> > kafka.utils.VerifiableProperties-68: Property serializer.class is
>>> > overridden to kafka.serializer.StringEncoder
>>> > 2014-11-05 23:39:02,837
>>> >
>>> >
>>> pool

Re: Cannot connect to Kafka from outside of EC2

2014-11-11 Thread Guozhang Wang
Hi Sameer,

I think apache mailing list has blocked your attachment. If it is too long
to include in the email body could you paste it somewhere and give me the
link?

Guozhang

On Tue, Nov 11, 2014 at 10:01 AM, Sameer Yami  wrote:

> Hi Guozhang,
>
> I was wondering if you found anything wrong in the logs/
>
> thanks
>
>
> On Fri, Nov 7, 2014 at 4:19 PM, Sameer Yami  wrote:
>
> > Hi Guozhang,
> >
> > Attached are the two logs with debug enabled.
> >
> > Thanks!
> >
> > On Fri, Nov 7, 2014 at 2:09 PM, Sameer Yami  wrote:
> >
> >> The version is kafka_2.10-0.8.1.1. It is not the latest trunk.
> >> Will try enabling debug version.
> >>
> >> thanks
> >>
> >>
> >> On Thu, Nov 6, 2014 at 9:37 PM, Guozhang Wang 
> wrote:
> >>
> >>> Sameer,
> >>>
> >>> The server logs do not contain any non-INFO logs, which is a bit wired.
> >>> Did
> >>> you deploy the current trunk of Kafka? Also could you enable DEBUG
> level
> >>> logging on Kafka brokers?
> >>>
> >>> Guozhang
> >>>
> >>> On Wed, Nov 5, 2014 at 3:50 PM, Sameer Yami  wrote:
> >>>
> >>> > The server.log was taken separately.
> >>> > We ran the test again and the server and producer logs are below (to
> >>> get
> >>> > same timings).
> >>> >
> >>> >
> >>> > Thanks!
> >>> >
> >>> >
> >>> >
> >>>
> 
> >>> >
> >>> >
> >>> >
> >>> > Producer Logs -
> >>> >
> >>> >
> >>> > 2014-11-05 23:38:58,693
> >>> > Thread-3-SendThread(ip-172-31-25-198.us-west-1.compute.internal:2181)
> >>> > DEBUG org.apache.zookeeper.ClientCnxn-759: Got ping response for
> >>> sessionid:
> >>> > 0x1498251e8680002 after 0ms
> >>> > 2014-11-05 23:39:00,695
> >>> > Thread-3-SendThread(ip-172-31-25-198.us-west-1.compute.internal:2181)
> >>> > DEBUG org.apache.zookeeper.ClientCnxn-759: Got ping response for
> >>> sessionid:
> >>> > 0x1498251e8680002 after 0ms
> >>> > 2014-11-05 23:39:02,696
> >>> > Thread-3-SendThread(ip-172-31-25-198.us-west-1.compute.internal:2181)
> >>> > DEBUG org.apache.zookeeper.ClientCnxn-759: Got ping response for
> >>> sessionid:
> >>> > 0x1498251e8680002 after 0ms
> >>> > 2014-11-05 23:39:02,828 pool-13-thread-2   INFO
> >>> > kafka.utils.VerifiableProperties-68: Verifying properties
> >>> > 2014-11-05 23:39:02,829 pool-13-thread-2   INFO
> >>> > kafka.utils.VerifiableProperties-68: Property
> auto.commit.interval.ms
> >>> is
> >>> > overridden to 1000
> >>> > 2014-11-05 23:39:02,829 pool-13-thread-2   INFO
> >>> > kafka.utils.VerifiableProperties-68: Property auto.offset.reset is
> >>> > overridden to smallest
> >>> > 2014-11-05 23:39:02,829 pool-13-thread-2   INFO
> >>> > kafka.utils.VerifiableProperties-68: Property consumer.timeout.ms is
> >>> > overridden to 10
> >>> > 2014-11-05 23:39:02,829 pool-13-thread-2   INFO
> >>> > kafka.utils.VerifiableProperties-68: Property group.id is overridden
> >>> to
> >>> > TestCheck
> >>> > 2014-11-05 23:39:02,830 pool-13-thread-2   WARN
> >>> > kafka.utils.VerifiableProperties-83: Property serializer.class is not
> >>> valid
> >>> > 2014-11-05 23:39:02,830 pool-13-thread-2   INFO
> >>> > kafka.utils.VerifiableProperties-68: Property zookeeper.connect is
> >>> > overridden to 172.31.25.198:2181
> >>> > 2014-11-05 23:39:02,831 pool-13-thread-2   INFO
> >>> > kafka.consumer.ZookeeperConsumerConnector-68:
> >>> > [TestCheck_ip-172-31-25-198-1415230742830-f3dfc362], Connecting to
> >>> > zookeeper instance at 172.31.25.198:2181
> >>> > 2014-11-05 23:39:02,831 pool-13-thread-2  DEBUG
> >>> > org.I0Itec.zkclient.ZkConnection-63: Creating new ZookKeeper instance
> >>> to
> >>> > connect to 172.31.25.198:2181.
> >>> > 2014-11-05 23:39:02,831 pool-13-thread-2   INFO
> >>> > org.apache.zookeeper.ZooKeeper-379: Initiating client connection,
> >>> > connectString=172.31.25.198:2181 sessionTimeout=6000
> >>> > watcher=org.I0Itec.zkclient.ZkClient@3903b165
> >>> > 2014-11-05 23:39:02,831 ZkClient-EventThread-29-172.31.25.198:2181
> >>>  INFO
> >>> > org.I0Itec.zkclient.ZkEventThread-64: Starting ZkClient event thread.
> >>> > 2014-11-05 23:39:02,831 pool-13-thread-1   INFO
> >>> > kafka.utils.VerifiableProperties-68: Verifying properties
> >>> > 2014-11-05 23:39:02,836 pool-13-thread-2-SendThread()   INFO
> >>> > org.apache.zookeeper.ClientCnxn-1061: Opening socket connection to
> >>> server /
> >>> > 172.31.25.198:2181
> >>> > 2014-11-05 23:39:02,836 pool-13-thread-1   WARN
> >>> > kafka.utils.VerifiableProperties-83: Property batch.size is not valid
> >>> > 2014-11-05 23:39:02,832 pool-13-thread-2  DEBUG
> >>> > org.I0Itec.zkclient.ZkClient-878: Awaiting connection to Zookeeper
> >>> server
> >>> > 2014-11-05 23:39:02,836 pool-13-thread-1   INFO
> >>> > kafka.utils.VerifiableProperties-68: Property
> message.send.max.retries
> >>> is
> >>> > overridden to 10
> >>> > 2014-11-05 23:39:02,836 pool-13-thread-2  DEBUG
> >>> > org.I0Itec.zkclient.ZkClient-628: Waiting for keeper state
> >>> SyncConnected
> >>> > 2014-11-05 23:39:02,837 pool-13-thread-1   INFO
> >>> >

Re: Programmatic Kafka version detection/extraction?

2014-11-11 Thread Gwen Shapira
In Sqoop we do the following:

Maven runs a shell script, passing the version as a parameter.
The shell-script generates a small java class, which is then built with a
Maven plugin.
Our code references this generated class when we expose "getVersion()".

It's complex and ugly, so I'm kind of hoping that there's a better way to do
it :)
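For what it's worth, a minimal sketch of what such a generated class might look like (all names here, including the Version class and the placeholder version string, are hypothetical, not Sqoop's actual generated code):

```java
// Hypothetical sketch of a build-generated version class.
// A build-time script would substitute the real version string for
// VERSION before compilation; "0.8.2-beta" is only a placeholder.
public class Version {
    public static final String VERSION = "0.8.2-beta";

    public static String getVersion() {
        return VERSION;
    }

    public static void main(String[] args) {
        // prints the version string baked in at build time
        System.out.println(getVersion());
    }
}
```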

Gwen

On Tue, Nov 11, 2014 at 9:42 AM, Jun Rao  wrote:

> Currently, the version number is only stored in our build config file,
> gradle.properties. Not sure how we can automatically extract it and expose
> it in an mbean. How do other projects do this?
>
> Thanks,
>
> Jun
>
> On Tue, Nov 11, 2014 at 7:05 AM, Otis Gospodnetic <
> otis.gospodne...@gmail.com> wrote:
>
> > Hi Jun,
> >
> > Sounds good.  But is the version number stored anywhere from where it
> could
> > be gotten?
> >
> > Thanks,
> > Otis
> > --
> > Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> > Solr & Elasticsearch Support * http://sematext.com/
> >
> >
> > On Tue, Nov 11, 2014 at 12:45 AM, Jun Rao  wrote:
> >
> > > Otis,
> > >
> > > We don't have an api for that now. We can probably expose this as a JMX
> > as
> > > part of kafka-1481.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Mon, Nov 10, 2014 at 7:17 PM, Otis Gospodnetic <
> > > otis.gospodne...@gmail.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > Is there a way to detect which version of Kafka one is running?
> > > > Is there an API for that, or a constant with this value, or maybe an
> > > MBean
> > > > or some other way to get to this info?
> > > >
> > > > Thanks,
> > > > Otis
> > > > --
> > > > Monitoring * Alerting * Anomaly Detection * Centralized Log
> Management
> > > > Solr & Elasticsearch Support * http://sematext.com/
> > > >
> > >
> >
>


Re: Programmatic Kafka version detection/extraction?

2014-11-11 Thread Joey Echeverria
In Kite, we parse the version from the POM properties file that Maven
builds into our jar for us:

private String getVersion() {
  String location = "/META-INF/maven/org.kitesdk/kite-tools/pom.properties";
  String version = "unknown";
  InputStream pomPropertiesStream = null;
  try {
    Properties pomProperties = new Properties();
    pomPropertiesStream = Main.class.getResourceAsStream(location);
    pomProperties.load(pomPropertiesStream);

    version = pomProperties.getProperty("version");
  } catch (Exception ex) {
    if (debug) {
      console.warn("Unable to determine version from the {} file", location);
      console.warn("Exception:", ex);
    } else {
      console.warn("Unable to determine version from the {} file: {}",
          location, ex.getMessage());
    }
  } finally {
    Closeables.closeQuietly(pomPropertiesStream);
  }

  return version;
}

On Tue, Nov 11, 2014 at 10:34 AM, Gwen Shapira  wrote:
> In Sqoop we do the following:
>
> Maven runs a shell script, passing the version as a parameter.
> The shell-script generates a small java class, which is then built with a
> Maven plugin.
> Our code references this generated class when we expose "getVersion()".
>
> Its complex and ugly, so I'm kind of hoping that there's a better way to do
> it :)
>
> Gwen
>
> On Tue, Nov 11, 2014 at 9:42 AM, Jun Rao  wrote:
>
>> Currently, the version number is only stored in our build config file,
>> gradle.properties. Not sure how we can automatically extract it and expose
>> it in an mbean. How do other projects do this?
>>
>> Thanks,
>>
>> Jun
>>
>> On Tue, Nov 11, 2014 at 7:05 AM, Otis Gospodnetic <
>> otis.gospodne...@gmail.com> wrote:
>>
>> > Hi Jun,
>> >
>> > Sounds good.  But is the version number stored anywhere from where it
>> could
>> > be gotten?
>> >
>> > Thanks,
>> > Otis
>> > --
>> > Monitoring * Alerting * Anomaly Detection * Centralized Log Management
>> > Solr & Elasticsearch Support * http://sematext.com/
>> >
>> >
>> > On Tue, Nov 11, 2014 at 12:45 AM, Jun Rao  wrote:
>> >
>> > > Otis,
>> > >
>> > > We don't have an api for that now. We can probably expose this as a JMX
>> > as
>> > > part of kafka-1481.
>> > >
>> > > Thanks,
>> > >
>> > > Jun
>> > >
>> > > On Mon, Nov 10, 2014 at 7:17 PM, Otis Gospodnetic <
>> > > otis.gospodne...@gmail.com> wrote:
>> > >
>> > > > Hi,
>> > > >
>> > > > Is there a way to detect which version of Kafka one is running?
>> > > > Is there an API for that, or a constant with this value, or maybe an
>> > > MBean
>> > > > or some other way to get to this info?
>> > > >
>> > > > Thanks,
>> > > > Otis
>> > > > --
>> > > > Monitoring * Alerting * Anomaly Detection * Centralized Log
>> Management
>> > > > Solr & Elasticsearch Support * http://sematext.com/
>> > > >
>> > >
>> >
>>



-- 
Joey Echeverria


Re: Programmatic Kafka version detection/extraction?

2014-11-11 Thread Bhavesh Mistry
If it is a Maven artifact, then you will get the following pre-built
properties file from the Maven build at
/META-INF/maven/<groupId>/<artifactId>/pom.properties.

Here is sample:
#Generated by Maven
#Mon Oct 10 10:44:31 EDT 2011
version=10.0.1
groupId=com.google.guava
artifactId=guava
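For illustration, a minimal sketch of reading a pom.properties payload with java.util.Properties — here loaded from a string holding the sample content above; in a packaged jar you would open the /META-INF/maven/... resource instead:

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class PomVersion {
    // Parses a pom.properties-style payload. In a real jar you would
    // instead open the resource under /META-INF/maven/<groupId>/<artifactId>/.
    public static String versionFrom(String pomProperties) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(pomProperties));
        return props.getProperty("version", "unknown");
    }

    public static void main(String[] args) throws IOException {
        String sample = "version=10.0.1\ngroupId=com.google.guava\nartifactId=guava\n";
        System.out.println(versionFrom(sample)); // prints "10.0.1"
    }
}
```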

Thanks,

Bhavesh

On Tue, Nov 11, 2014 at 10:34 AM, Gwen Shapira 
wrote:

> In Sqoop we do the following:
>
> Maven runs a shell script, passing the version as a parameter.
> The shell-script generates a small java class, which is then built with a
> Maven plugin.
> Our code references this generated class when we expose "getVersion()".
>
> Its complex and ugly, so I'm kind of hoping that there's a better way to do
> it :)
>
> Gwen
>
> On Tue, Nov 11, 2014 at 9:42 AM, Jun Rao  wrote:
>
> > Currently, the version number is only stored in our build config file,
> > gradle.properties. Not sure how we can automatically extract it and
> expose
> > it in an mbean. How do other projects do this?
> >
> > Thanks,
> >
> > Jun
> >
> > On Tue, Nov 11, 2014 at 7:05 AM, Otis Gospodnetic <
> > otis.gospodne...@gmail.com> wrote:
> >
> > > Hi Jun,
> > >
> > > Sounds good.  But is the version number stored anywhere from where it
> > could
> > > be gotten?
> > >
> > > Thanks,
> > > Otis
> > > --
> > > Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> > > Solr & Elasticsearch Support * http://sematext.com/
> > >
> > >
> > > On Tue, Nov 11, 2014 at 12:45 AM, Jun Rao  wrote:
> > >
> > > > Otis,
> > > >
> > > > We don't have an api for that now. We can probably expose this as a
> JMX
> > > as
> > > > part of kafka-1481.
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > > On Mon, Nov 10, 2014 at 7:17 PM, Otis Gospodnetic <
> > > > otis.gospodne...@gmail.com> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > Is there a way to detect which version of Kafka one is running?
> > > > > Is there an API for that, or a constant with this value, or maybe
> an
> > > > MBean
> > > > > or some other way to get to this info?
> > > > >
> > > > > Thanks,
> > > > > Otis
> > > > > --
> > > > > Monitoring * Alerting * Anomaly Detection * Centralized Log
> > Management
> > > > > Solr & Elasticsearch Support * http://sematext.com/
> > > > >
> > > >
> > >
> >
>


Re: expanding cluster and reassigning parititions without restarting producer

2014-11-11 Thread Neha Narkhede
The new producer is available in 0.8.2-beta (the most recent Kafka
release). The old producer only detects new partitions at an interval
configured by topic.metadata.refresh.interval.ms. This constraint is no
longer true for the new producer and you would likely end up with an even
distribution of data across all partitions. If you want to stay with the
old producer on 0.8.1.1, you can try reducing
topic.metadata.refresh.interval.ms but it may have some performance impact
on the Kafka cluster since it ends up sending topic metadata requests to
the broker at that interval.

Thanks,
Neha
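As a sketch, lowering the old producer's refresh interval would look like this (broker host names are placeholders, and only the java.util.Properties side is shown, not the kafka.javaapi.producer classes):

```java
import java.util.Properties;

public class OldProducerConfigSketch {
    // Builds config for the old (0.8.1.1) producer with a reduced metadata
    // refresh interval so newly added partitions are picked up sooner.
    public static Properties build() {
        Properties props = new Properties();
        props.put("metadata.broker.list", "broker1:9092,broker2:9092"); // placeholder hosts
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        // Default is 600000 ms (10 minutes); lowering it increases the
        // metadata request load on the brokers.
        props.put("topic.metadata.refresh.interval.ms", "60000");
        return props;
    }
}
```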

On Tue, Nov 11, 2014 at 1:45 AM, Shlomi Hazan  wrote:

> Neha, I understand that the producer kafka.javaapi.producer.Producer shown
> in examples is old,
> and that a new producer (org.apache.kafka.clients.producer) is avail? is it
> available for 0.8.1.1?
> how does it work? does it have a trigger fired when partitions are added or
> does the producer refresh some cache every some given time period?
>
> Shlomi
>
>
> On Tue, Nov 11, 2014 at 4:25 AM, Neha Narkhede 
> wrote:
>
> > How can I auto refresh keyed producers to use new partitions as these
> > partitions are added?
> >
> > Try using the new producer under org.apache.kafka.clients.producer.
> >
> > Thanks,
> > Neha
> >
> > On Mon, Nov 10, 2014 at 8:52 AM, Bhavesh Mistry <
> > mistry.p.bhav...@gmail.com>
> > wrote:
> >
> > > I had different experience with expanding partition for new producer
> and
> > > its impact. I only tried for non-key messages. I would always advise
> > to
> > > keep batch size relatively low or plan for expansion with new java
> > producer
> > > in advance or since inception otherwise running producer code is
> > impacted.
> > >
> > > Here is mail chain:
> > >
> > >
> >
> http://mail-archives.apache.org/mod_mbox/kafka-dev/201411.mbox/%3ccaoejijit4cgry97dgzfjkfvaqfduv-o1x1kafefbshgirkm...@mail.gmail.com%3E
> > >
> > > Thanks,
> > >
> > > Bhavesh
> > >
> > > On Mon, Nov 10, 2014 at 5:20 AM, Shlomi Hazan 
> wrote:
> > >
> > > > Hmmm..
> > > > The Java producer example seems to ignore added partitions too...
> > > > How can I auto refresh keyed producers to use new partitions as these
> > > > partitions are added?
> > > >
> > > >
> > > > On Mon, Nov 10, 2014 at 12:33 PM, Shlomi Hazan 
> > wrote:
> > > >
> > > > > One more thing:
> > > > > I saw that the Python client is also unaffected by addition of
> > > partitions
> > > > > to a topic and that it continues to send requests only to the old
> > > > > partitions.
> > > > > is this also handled appropriately by the Java producer? Will he
> see
> > > the
> > > > > change and produce to the new partitions as well?
> > > > > Shlomi
> > > > >
> > > > > On Mon, Nov 10, 2014 at 9:34 AM, Shlomi Hazan 
> > > wrote:
> > > > >
> > > > >> No I don't see anything like that, the question was aimed at
> > learning
> > > if
> > > > >> it is worthwhile to make the effort of reimplementing the Python
> > > > producer
> > > > >> in Java, I so I will not make all the effort just to be
> disappointed
> > > > >> afterwards.
> > > > >> understand I have nothing to worry about, so I will try to
> simulate
> > > this
> > > > >> situation in small scale...
> > > > >> maybe 3 brokers, one topic with one partition and then add
> > partitions.
> > > > >> we'll see.
> > > > >> thanks for clarifying.
> > > > >> Oh, Good luck with Confluent!!
> > > > >> :)
> > > > >>
> > > > >> On Mon, Nov 10, 2014 at 4:17 AM, Neha Narkhede <
> > > neha.narkh...@gmail.com
> > > > >
> > > > >> wrote:
> > > > >>
> > > > >>> The producer might get an error code if the leader of the
> > partitions
> > > > >>> being
> > > > >>> reassigned also changes. However it should retry and succeed. Do
> > you
> > > > see
> > > > >>> a
> > > > >>> behavior that suggests otherwise?
> > > > >>>
> > > > >>> On Sat, Nov 8, 2014 at 11:45 PM, Shlomi Hazan 
> > > > wrote:
> > > > >>>
> > > > >>> > Hi All,
> > > > >>> > I recently had an issue producing from python where expanding a
> > > > cluster
> > > > >>> > from 3 to 5 nodes and reassigning partitions forced me to
> restart
> > > the
> > > > >>> > producer b/c of KeyError thrown.
> > > > >>> > Is this situation handled by the Java producer automatically or
> > > need
> > > > I
> > > > >>> do
> > > > >>> > something to have the java producer refresh itself to see the
> > > > >>> reassigned
> > > > >>> > partition layout and produce away ?
> > > > >>> > Shlomi
> > > > >>> >
> > > > >>>
> > > > >>
> > > > >>
> > > > >
> > > >
> > >
> >
>


Re: Cannot connect to Kafka from outside of EC2

2014-11-11 Thread Sameer Yami
Is it ok, if I send you directly?

On Tue, Nov 11, 2014 at 10:17 AM, Guozhang Wang  wrote:

> Hi Sameer,
>
> I think apache mailing list has blocked your attachment. If it is too long
> to include in the email body could you paste it somewhere and give me the
> link?
>
> Guozhang
>
> On Tue, Nov 11, 2014 at 10:01 AM, Sameer Yami  wrote:
>
> > Hi Guozhang,
> >
> > I was wondering if you found anything wrong in the logs/
> >
> > thanks
> >
> >
> > On Fri, Nov 7, 2014 at 4:19 PM, Sameer Yami  wrote:
> >
> > > Hi Guozhang,
> > >
> > > Attached are the two logs with debug enabled.
> > >
> > > Thanks!
> > >
> > > On Fri, Nov 7, 2014 at 2:09 PM, Sameer Yami  wrote:
> > >
> > >> The version is kafka_2.10-0.8.1.1. It is not the latest trunk.
> > >> Will try enabling debug version.
> > >>
> > >> thanks
> > >>
> > >>
> > >> On Thu, Nov 6, 2014 at 9:37 PM, Guozhang Wang 
> > wrote:
> > >>
> > >>> Sameer,
> > >>>
> > >>> The server logs do not contain any non-INFO logs, which is a bit
> weird.
> > >>> Did
> > >>> you deploy the current trunk of Kafka? Also could you enable DEBUG
> > level
> > >>> logging on Kafka brokers?
> > >>>
> > >>> Guozhang
> > >>>
> > >>> On Wed, Nov 5, 2014 at 3:50 PM, Sameer Yami 
> wrote:
> > >>>
> > >>> > The server.log was taken separately.
> > >>> > We ran the test again and the server and producer logs are below
> (to
> > >>> get
> > >>> > same timings).
> > >>> >
> > >>> >
> > >>> > Thanks!
> > >>> >
> > >>> >
> > >>> >
> > >>>
> >
> 
> > >>> >
> > >>> >
> > >>> >
> > >>> > Producer Logs -
> > >>> >
> > >>> >
> > >>> > 2014-11-05 23:38:58,693
> > >>> >
> Thread-3-SendThread(ip-172-31-25-198.us-west-1.compute.internal:2181)
> > >>> > DEBUG org.apache.zookeeper.ClientCnxn-759: Got ping response for
> > >>> sessionid:
> > >>> > 0x1498251e8680002 after 0ms
> > >>> > 2014-11-05 23:39:00,695
> > >>> >
> Thread-3-SendThread(ip-172-31-25-198.us-west-1.compute.internal:2181)
> > >>> > DEBUG org.apache.zookeeper.ClientCnxn-759: Got ping response for
> > >>> sessionid:
> > >>> > 0x1498251e8680002 after 0ms
> > >>> > 2014-11-05 23:39:02,696
> > >>> >
> Thread-3-SendThread(ip-172-31-25-198.us-west-1.compute.internal:2181)
> > >>> > DEBUG org.apache.zookeeper.ClientCnxn-759: Got ping response for
> > >>> sessionid:
> > >>> > 0x1498251e8680002 after 0ms
> > >>> > 2014-11-05 23:39:02,828 pool-13-thread-2   INFO
> > >>> > kafka.utils.VerifiableProperties-68: Verifying properties
> > >>> > 2014-11-05 23:39:02,829 pool-13-thread-2   INFO
> > >>> > kafka.utils.VerifiableProperties-68: Property
> > auto.commit.interval.ms
> > >>> is
> > >>> > overridden to 1000
> > >>> > 2014-11-05 23:39:02,829 pool-13-thread-2   INFO
> > >>> > kafka.utils.VerifiableProperties-68: Property auto.offset.reset is
> > >>> > overridden to smallest
> > >>> > 2014-11-05 23:39:02,829 pool-13-thread-2   INFO
> > >>> > kafka.utils.VerifiableProperties-68: Property consumer.timeout.ms
> is
> > >>> > overridden to 10
> > >>> > 2014-11-05 23:39:02,829 pool-13-thread-2   INFO
> > >>> > kafka.utils.VerifiableProperties-68: Property group.id is
> overridden
> > >>> to
> > >>> > TestCheck
> > >>> > 2014-11-05 23:39:02,830 pool-13-thread-2   WARN
> > >>> > kafka.utils.VerifiableProperties-83: Property serializer.class is
> not
> > >>> valid
> > >>> > 2014-11-05 23:39:02,830 pool-13-thread-2   INFO
> > >>> > kafka.utils.VerifiableProperties-68: Property zookeeper.connect is
> > >>> > overridden to 172.31.25.198:2181
> > >>> > 2014-11-05 23:39:02,831 pool-13-thread-2   INFO
> > >>> > kafka.consumer.ZookeeperConsumerConnector-68:
> > >>> > [TestCheck_ip-172-31-25-198-1415230742830-f3dfc362], Connecting to
> > >>> > zookeeper instance at 172.31.25.198:2181
> > >>> > 2014-11-05 23:39:02,831 pool-13-thread-2  DEBUG
> > >>> > org.I0Itec.zkclient.ZkConnection-63: Creating new ZookKeeper
> instance
> > >>> to
> > >>> > connect to 172.31.25.198:2181.
> > >>> > 2014-11-05 23:39:02,831 pool-13-thread-2   INFO
> > >>> > org.apache.zookeeper.ZooKeeper-379: Initiating client connection,
> > >>> > connectString=172.31.25.198:2181 sessionTimeout=6000
> > >>> > watcher=org.I0Itec.zkclient.ZkClient@3903b165
> > >>> > 2014-11-05 23:39:02,831 ZkClient-EventThread-29-172.31.25.198:2181
> > >>>  INFO
> > >>> > org.I0Itec.zkclient.ZkEventThread-64: Starting ZkClient event
> thread.
> > >>> > 2014-11-05 23:39:02,831 pool-13-thread-1   INFO
> > >>> > kafka.utils.VerifiableProperties-68: Verifying properties
> > >>> > 2014-11-05 23:39:02,836 pool-13-thread-2-SendThread()   INFO
> > >>> > org.apache.zookeeper.ClientCnxn-1061: Opening socket connection to
> > >>> server /
> > >>> > 172.31.25.198:2181
> > >>> > 2014-11-05 23:39:02,836 pool-13-thread-1   WARN
> > >>> > kafka.utils.VerifiableProperties-83: Property batch.size is not
> valid
> > >>> > 2014-11-05 23:39:02,832 pool-13-thread-2  DEBUG
> > >>> > org.I0Itec.zkclient.ZkClient-878: Awaiting connection to Zookeeper
> > >>> server
> >

Re: spikes in producer requests/sec

2014-11-11 Thread Magnus Edenhill
Hi Wes,

are you monitoring librdkafka statistics as well?
If so, are there any correlating spikes in the per-broker and per-partition
statistics?
Such as:
 - brokers.<broker>.rtt.avg  <-- broker round-trip-time (latency)
 - brokers.<broker>.waitresp_cnt  <-- requests in flight
 - topics.<topic>.partitions.<partition>.msgq_cnt  <-- internal message queue
 - topics.<topic>.partitions.<partition>.xmitq_cnt  <-- transmit queue

Regards,
Magnus
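As a rough illustration of correlating such stats, a sample could be flagged as a spike when it exceeds a multiple of the mean of recent samples. This is a generic sketch, not part of the librdkafka API:

```java
import java.util.List;

public class SpikeDetector {
    // Returns true when the sample exceeds the mean of prior samples by
    // the given factor; a crude way to flag outliers in a metric stream.
    public static boolean isSpike(List<Double> history, double sample, double factor) {
        if (history.isEmpty()) {
            return false;
        }
        double mean = history.stream()
                .mapToDouble(Double::doubleValue)
                .average()
                .orElse(0.0);
        return sample > mean * factor;
    }

    public static void main(String[] args) {
        List<Double> rtt = List.of(10.0, 12.0, 11.0); // e.g. recent rtt.avg samples
        System.out.println(isSpike(rtt, 50.0, 3.0)); // prints "true"
        System.out.println(isSpike(rtt, 12.0, 3.0)); // prints "false"
    }
}
```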



2014-11-11 19:50 GMT+01:00 Wes Chow :

>
> We're seeing periodic spikes in req/sec rates across our nodes. Our
> cluster is 10 nodes, and the topic has a replication factor of 3. We push
> around 200k messages / sec into Kafka.
>
>
> The machines are running the most recent version of Kafka and we're
> connecting via librdkafka. pingstream02-10 are using the CMS garbage
> collector, but I switched pingstream01 to use G1GC under the theory that
> maybe these were GC pauses. The graph shows that likely didn't improve the
> situation.
>
> My next thought is that maybe this is the effect of log rolling. Checking
> in the logs, I see a lot of this:
>
> [2014-11-11 13:46:45,836] 72952071 [ReplicaFetcherThread-0-7] INFO
> kafka.log.Log  - Rolled new log segment for 'pings-342' in 3 ms.
> [2014-11-11 13:46:47,116] 72953351 [kafka-request-handler-0] INFO
> kafka.log.Log  - Rolled new log segment for 'pings-186' in 2 ms.
> [2014-11-11 13:46:48,155] 72954390 [ReplicaFetcherThread-0-8] INFO
> kafka.log.Log  - Rolled new log segment for 'pings-253' in 3 ms.
> [2014-11-11 13:46:48,408] 72954643 [ReplicaFetcherThread-0-4] INFO
> kafka.log.Log  - Rolled new log segment for 'pings-209' in 3 ms.
> [2014-11-11 13:46:48,436] 72954671 [ReplicaFetcherThread-0-4] INFO
> kafka.log.Log  - Rolled new log segment for 'pings-299' in 2 ms.
> [2014-11-11 13:46:48,687] 72954922 [kafka-request-handler-0] INFO
> kafka.log.Log  - Rolled new log segment for 'pings-506' in 2 ms.
>
> The "pings" topic in question has 512 partitions, so it does this 512
> times every so often. We have an effective retention period of a bit less
> than 30 min, so rolling happens pretty frequently. Still, if I assume worst
> case that rolling locks up the process for 2ms and there are 512 rolls
> every few minutes, I'd expect halting to happen for about a second at a
> time. The graphs seem to indicate much longer dips, but it's hard for me to
> know if I'm looking at real data or some sort of artifact.
>
> Fwiw, the producers are not reporting any errors, so it does not seem like
> we're losing data.
>
> I'm new to Kafka. Should I be worried? If so, how should I be debugging
> this?
>
> Thanks,
> Wes
>
>


Re: Programmatic Kafka version detection/extraction?

2014-11-11 Thread Gwen Shapira
So it looks like we can use Gradle to add properties to manifest file and
then use getResourceAsStream to read the file and parse it.

The Gradle part would be something like:
jar.manifest {
  attributes('Implementation-Title': project.name,
             'Implementation-Version': project.version,
             'Built-By': System.getProperty('user.name'),
             'Built-JDK': System.getProperty('java.version'),
             'Built-Host': getHostname(),
             'Source-Compatibility': project.sourceCompatibility,
             'Target-Compatibility': project.targetCompatibility
  )
}

The code part would be:
this.getClass().getClassLoader().getResourceAsStream("META-INF/MANIFEST.MF")
(note: ClassLoader resource paths are already absolute, so no leading slash)

Does that look like the right approach?

Gwen
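The reading side could look roughly like this — a sketch that falls back to "unknown" when no manifest or attribute is found, e.g. when running from unpacked classes:

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

public class ManifestVersion {
    // Reads Implementation-Version from a manifest on the classpath.
    // Caveat: with many jars on the classpath this may find another jar's
    // manifest first; resolving this class's own code source is more robust.
    public static String getVersion() {
        try (InputStream in = ManifestVersion.class
                .getResourceAsStream("/META-INF/MANIFEST.MF")) {
            if (in == null) {
                return "unknown";
            }
            Attributes attrs = new Manifest(in).getMainAttributes();
            String version = attrs.getValue("Implementation-Version");
            return version != null ? version : "unknown";
        } catch (IOException e) {
            return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println(getVersion());
    }
}
```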

On Tue, Nov 11, 2014 at 10:43 AM, Bhavesh Mistry  wrote:

> If is maven artifact then you will get following pre-build property file
> from maven build called pom.properties under
> /META-INF/maven/groupid/artifactId/pom.properties folder.
>
> Here is sample:
> #Generated by Maven
> #Mon Oct 10 10:44:31 EDT 2011
> version=10.0.1
> groupId=com.google.guava
> artifactId=guava
>
> Thanks,
>
> Bhavesh
>
> On Tue, Nov 11, 2014 at 10:34 AM, Gwen Shapira 
> wrote:
>
> > In Sqoop we do the following:
> >
> > Maven runs a shell script, passing the version as a parameter.
> > The shell-script generates a small java class, which is then built with a
> > Maven plugin.
> > Our code references this generated class when we expose "getVersion()".
> >
> > Its complex and ugly, so I'm kind of hoping that there's a better way to
> do
> > it :)
> >
> > Gwen
> >
> > On Tue, Nov 11, 2014 at 9:42 AM, Jun Rao  wrote:
> >
> > > Currently, the version number is only stored in our build config file,
> > > gradle.properties. Not sure how we can automatically extract it and
> > expose
> > > it in an mbean. How do other projects do this?
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Tue, Nov 11, 2014 at 7:05 AM, Otis Gospodnetic <
> > > otis.gospodne...@gmail.com> wrote:
> > >
> > > > Hi Jun,
> > > >
> > > > Sounds good.  But is the version number stored anywhere from where it
> > > could
> > > > be gotten?
> > > >
> > > > Thanks,
> > > > Otis
> > > > --
> > > > Monitoring * Alerting * Anomaly Detection * Centralized Log
> Management
> > > > Solr & Elasticsearch Support * http://sematext.com/
> > > >
> > > >
> > > > On Tue, Nov 11, 2014 at 12:45 AM, Jun Rao  wrote:
> > > >
> > > > > Otis,
> > > > >
> > > > > We don't have an api for that now. We can probably expose this as a
> > JMX
> > > > as
> > > > > part of kafka-1481.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > > > On Mon, Nov 10, 2014 at 7:17 PM, Otis Gospodnetic <
> > > > > otis.gospodne...@gmail.com> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > Is there a way to detect which version of Kafka one is running?
> > > > > > Is there an API for that, or a constant with this value, or maybe
> > an
> > > > > MBean
> > > > > > or some other way to get to this info?
> > > > > >
> > > > > > Thanks,
> > > > > > Otis
> > > > > > --
> > > > > > Monitoring * Alerting * Anomaly Detection * Centralized Log
> > > Management
> > > > > > Solr & Elasticsearch Support * http://sematext.com/
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: spikes in producer requests/sec

2014-11-11 Thread Jay Kreps
There are some fixes in 0.8.2-beta for periodic latency spikes if you are
using acks=-1 in the producer.

-Jay

On Tue, Nov 11, 2014 at 10:50 AM, Wes Chow  wrote:

>
> We're seeing periodic spikes in req/sec rates across our nodes. Our
> cluster is 10 nodes, and the topic has a replication factor of 3. We push
> around 200k messages / sec into Kafka.
>
>
> The machines are running the most recent version of Kafka and we're
> connecting via librdkafka. pingstream02-10 are using the CMS garbage
> collector, but I switched pingstream01 to use G1GC under the theory that
> maybe these were GC pauses. The graph shows that likely didn't improve the
> situation.
>
> My next thought is that maybe this is the effect of log rolling. Checking
> in the logs, I see a lot of this:
>
> [2014-11-11 13:46:45,836] 72952071 [ReplicaFetcherThread-0-7] INFO
> kafka.log.Log  - Rolled new log segment for 'pings-342' in 3 ms.
> [2014-11-11 13:46:47,116] 72953351 [kafka-request-handler-0] INFO
> kafka.log.Log  - Rolled new log segment for 'pings-186' in 2 ms.
> [2014-11-11 13:46:48,155] 72954390 [ReplicaFetcherThread-0-8] INFO
> kafka.log.Log  - Rolled new log segment for 'pings-253' in 3 ms.
> [2014-11-11 13:46:48,408] 72954643 [ReplicaFetcherThread-0-4] INFO
> kafka.log.Log  - Rolled new log segment for 'pings-209' in 3 ms.
> [2014-11-11 13:46:48,436] 72954671 [ReplicaFetcherThread-0-4] INFO
> kafka.log.Log  - Rolled new log segment for 'pings-299' in 2 ms.
> [2014-11-11 13:46:48,687] 72954922 [kafka-request-handler-0] INFO
> kafka.log.Log  - Rolled new log segment for 'pings-506' in 2 ms.
>
> The "pings" topic in question has 512 partitions, so it does this 512
> times every so often. We have an effective retention period of a bit less
> than 30 min, so rolling happens pretty frequently. Still, if I assume worst
> case that rolling locks up the process for 2ms and there are 512 rolls
> every few minutes, I'd expect halting to happen for about a second at a
> time. The graphs seem to indicate much longer dips, but it's hard for me to
> know if I'm looking at real data or some sort of artifact.
>
> Fwiw, the producers are not reporting any errors, so it does not seem like
> we're losing data.
>
> I'm new to Kafka. Should I be worried? If so, how should I be debugging
> this?
>
> Thanks,
> Wes
>
>


Re: expanding cluster and reassigning parititions without restarting producer

2014-11-11 Thread Jun Rao
Just to extend what Neha said. The new producer also picks up the new
partitions by refreshing the metadata periodically (controlled by
metadata.max.age.ms). The new producer distributes the data more evenly to
all partitions than the old producer.

Thanks,

Jun
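For the new producer, the equivalent knob would be set like this (a sketch using string config keys; broker host names are placeholders):

```java
import java.util.Properties;

public class NewProducerConfigSketch {
    // Config for the new (org.apache.kafka.clients.producer) producer;
    // metadata.max.age.ms bounds how long before new partitions are noticed.
    public static Properties build() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092,broker2:9092"); // placeholder hosts
        // Force a metadata refresh at least every 60 seconds.
        props.put("metadata.max.age.ms", "60000");
        return props;
    }
}
```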

On Tue, Nov 11, 2014 at 11:19 AM, Neha Narkhede 
wrote:

> The new producer is available in 0.8.2-beta (the most recent Kafka
> release). The old producer only detects new partitions at an interval
> configured by topic.metadata.refresh.interval.ms. This constraint is no
> longer true for the new producer and you would likely end up with an even
> distribution of data across all partitions. If you want to stay with the
> old producer on 0.8.1.1, you can try reducing
> topic.metadata.refresh.interval.ms but it may have some performance impact
> on the Kafka cluster since it ends up sending topic metadata requests to
> the broker at that interval.
>
> Thanks,
> Neha
>
> On Tue, Nov 11, 2014 at 1:45 AM, Shlomi Hazan  wrote:
>
> > Neha, I understand that the producer kafka.javaapi.producer.Producer
> shown
> > in examples is old,
> > and that a new producer (org.apache.kafka.clients.producer) is avail? is
> it
> > available for 0.8.1.1?
> > how does it work? does it have a trigger fired when partitions are added
> or
> > does the producer refresh some cache every some given time period?
> >
> > Shlomi
> >
> >
> > On Tue, Nov 11, 2014 at 4:25 AM, Neha Narkhede 
> > wrote:
> >
> > > How can I auto refresh keyed producers to use new partitions as these
> > > partitions are added?
> > >
> > > Try using the new producer under org.apache.kafka.clients.producer.
> > >
> > > Thanks,
> > > Neha
> > >
> > > On Mon, Nov 10, 2014 at 8:52 AM, Bhavesh Mistry <
> > > mistry.p.bhav...@gmail.com>
> > > wrote:
> > >
> > > > I had different experience with expanding partition for new producer
> > and
> > > > its impact.  I only tried for non-key message.I would always
> advice
> > > to
> > > > keep batch size relatively low or plan for expansion with new java
> > > producer
> > > > in advance or since inception otherwise running producer code is
> > > impacted.
> > > >
> > > > Here is mail chain:
> > > >
> > > >
> > >
> >
> http://mail-archives.apache.org/mod_mbox/kafka-dev/201411.mbox/%3ccaoejijit4cgry97dgzfjkfvaqfduv-o1x1kafefbshgirkm...@mail.gmail.com%3E
> > > >
> > > > Thanks,
> > > >
> > > > Bhavesh
> > > >
> > > > On Mon, Nov 10, 2014 at 5:20 AM, Shlomi Hazan 
> > wrote:
> > > >
> > > > > Hmmm..
> > > > > The Java producer example seems to ignore added partitions too...
> > > > > How can I auto refresh keyed producers to use new partitions as
> these
> > > > > partitions are added?
> > > > >
> > > > >
> > > > > On Mon, Nov 10, 2014 at 12:33 PM, Shlomi Hazan 
> > > wrote:
> > > > >
> > > > > > One more thing:
> > > > > > I saw that the Python client is also unaffected by addition of
> > > > partitions
> > > > > > to a topic and that it continues to send requests only to the old
> > > > > > partitions.
> > > > > > is this also handled appropriately by the Java producer? Will he
> > see
> > > > the
> > > > > > change and produce to the new partitions as well?
> > > > > > Shlomi
> > > > > >
> > > > > > On Mon, Nov 10, 2014 at 9:34 AM, Shlomi Hazan 
> > > > wrote:
> > > > > >
> > > > > >> No I don't see anything like that, the question was aimed at
> > > learning
> > > > if
> > > > > >> it is worthwhile to make the effort of reimplementing the Python
> > > > > producer
> > > > > >> in Java, I so I will not make all the effort just to be
> > disappointed
> > > > > >> afterwards.
> > > > > >> understand I have nothing to worry about, so I will try to
> > simulate
> > > > this
> > > > > >> situation in small scale...
> > > > > >> maybe 3 brokers, one topic with one partition and then add
> > > partitions.
> > > > > >> we'll see.
> > > > > >> thanks for clarifying.
> > > > > >> Oh, Good luck with Confluent!!
> > > > > >> :)
> > > > > >>
> > > > > >> On Mon, Nov 10, 2014 at 4:17 AM, Neha Narkhede <
> > > > neha.narkh...@gmail.com
> > > > > >
> > > > > >> wrote:
> > > > > >>
> > > > > >>> The producer might get an error code if the leader of the
> > > partitions
> > > > > >>> being
> > > > > >>> reassigned also changes. However it should retry and succeed.
> Do
> > > you
> > > > > see
> > > > > >>> a
> > > > > >>> behavior that suggests otherwise?
> > > > > >>>
> > > > > >>> On Sat, Nov 8, 2014 at 11:45 PM, Shlomi Hazan <
> shl...@viber.com>
> > > > > wrote:
> > > > > >>>
> > > > > >>> > Hi All,
> > > > > >>> > I recently had an issue producing from python where
> expanding a
> > > > > cluster
> > > > > >>> > from 3 to 5 nodes and reassigning partitions forced me to
> > restart
> > > > the
> > > > > >>> > producer b/c of KeyError thrown.
> > > > > >>> > Is this situation handled by the Java producer automatically
> or
> > > > need
> > > > > I
> > > > > >>> do
> > > > > >>> > something to have the java producer refresh itself to see the
> > > > > >>> reassigned
> > > > > >>>

Re: No longer supporting Java 6, if? when?

2014-11-11 Thread Gwen Shapira
Perhaps relevant:

Hadoop is moving toward dropping Java6 in next release.
https://issues.apache.org/jira/browse/HADOOP-10530


On Thu, Nov 6, 2014 at 11:03 AM, Jay Kreps  wrote:

> Yeah it is a little bit silly that people are still using Java 6.
>
> I guess this is a tradeoff--being more conservative in our java support
> means more people can use our software, whereas upgrading gives us
> developers a better experience since we aren't stuck with ancient stuff.
>
> Nonetheless I would argue for being a bit conservative here. Sadly a
> shocking number of people are still using Java 6. The Kafka clients get
> embedded in applications all over the place, and likely having even one
> application not yet upgraded would block adopting the new Kafka version
> that dropped java 6 support. So unless there is something in Java 7 we
> really really want I think it might be good to hold out a bit.
>
> As an example we dropped java 6 support in Samza and immediately had people
> blocked by that, and unlike the Kafka clients, Samza use is pretty
> centralized.
>
> -Jay
>
> On Wed, Nov 5, 2014 at 5:32 PM, Joe Stein  wrote:
>
> > This has been coming up in a lot of projects and for other reasons too I
> > wanted to kick off the discussion about if/when we end support for Java
> 6.
> > Besides any API we may want to use in >= 7 we also compile our binaries
> for
> > 6 for release currently.
> >
> > /***
> >  Joe Stein
> >  Founder, Principal Consultant
> >  Big Data Open Source Security LLC
> >  http://www.stealth.ly
> >  Twitter: @allthingshadoop 
> > /
> >
>


Security in 0.8.2 beta

2014-11-11 Thread Kashyap Mhaisekar
Hi,
Is there a way to secure the topics created in Kafka 0.8.2 beta? The need
is to ensure no one is able to read data from the topic without
authorization.

Regards
Kashyap


Re: Security in 0.8.2 beta

2014-11-11 Thread Gwen Shapira
Nope.

Here's the JIRA where we are still actively working on security, targeting
0.9:
https://issues.apache.org/jira/browse/KAFKA-1682

Gwen

On Tue, Nov 11, 2014 at 7:37 PM, Kashyap Mhaisekar 
wrote:

> Hi,
> Is there a way to secure the topics created in Kafka 0.8.2 beta? The need
> is to ensure no one is able to read data from the topic without
> authorization.
>
> Regards
> Kashyap
>


Re: expanding cluster and reassigning parititions without restarting producer

2014-11-11 Thread Shlomi Hazan
Understood.
Thank you guys.

On Wed, Nov 12, 2014 at 4:48 AM, Jun Rao  wrote:

> Just to extend what Neha said. The new producer also picks up the new
> partitions by refreshing the metadata periodically (controlled
> metadata.max.age.ms). The new producer distributes the data more evenly to
> all partitions than the old producer.
>
> Thanks,
>
> Jun
>
> On Tue, Nov 11, 2014 at 11:19 AM, Neha Narkhede 
> wrote:
>
> > The new producer is available in 0.8.2-beta (the most recent Kafka
> > release). The old producer only detects new partitions at an interval
> > configured by topic.metadata.refresh.interval.ms. This constraint is no
> > longer true for the new producer and you would likely end up with an even
> > distribution of data across all partitions. If you want to stay with the
> > old producer on 0.8.1.1, you can try reducing
> > topic.metadata.refresh.interval.ms but it may have some performance
> impact
> > on the Kafka cluster since it ends up sending topic metadata requests to
> > the broker at that interval.
> >
> > Thanks,
> > Neha
> >
> > On Tue, Nov 11, 2014 at 1:45 AM, Shlomi Hazan  wrote:
> >
> > > Neha, I understand that the producer kafka.javaapi.producer.Producer
> > shown
> > > in examples is old,
> > > and that a new producer (org.apache.kafka.clients.producer) is avail?
> is
> > it
> > > available for 0.8.1.1?
> > > how does it work? does it have a trigger fired when partitions are
> added
> > or
> > > does the producer refresh some cache every some given time period?
> > >
> > > Shlomi
> > >
> > >
> > > On Tue, Nov 11, 2014 at 4:25 AM, Neha Narkhede <
> neha.narkh...@gmail.com>
> > > wrote:
> > >
> > > > How can I auto refresh keyed producers to use new partitions as these
> > > > partitions are added?
> > > >
> > > > Try using the new producer under org.apache.kafka.clients.producer.
> > > >
> > > > Thanks,
> > > > Neha
> > > >
> > > > On Mon, Nov 10, 2014 at 8:52 AM, Bhavesh Mistry <
> > > > mistry.p.bhav...@gmail.com>
> > > > wrote:
> > > >
> > > > > I had different experience with expanding partition for new
> producer
> > > and
> > > > > its impact.  I only tried for non-key message.I would always
> > advice
> > > > to
> > > > > keep batch size relatively low or plan for expansion with new java
> > > > producer
> > > > > in advance or since inception otherwise running producer code is
> > > > impacted.
> > > > >
> > > > > Here is mail chain:
> > > > >
> > > > >
> > > >
> > >
> >
> http://mail-archives.apache.org/mod_mbox/kafka-dev/201411.mbox/%3ccaoejijit4cgry97dgzfjkfvaqfduv-o1x1kafefbshgirkm...@mail.gmail.com%3E
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Bhavesh
> > > > >
> > > > > On Mon, Nov 10, 2014 at 5:20 AM, Shlomi Hazan 
> > > wrote:
> > > > >
> > > > > > Hmmm..
> > > > > > The Java producer example seems to ignore added partitions too...
> > > > > > How can I auto refresh keyed producers to use new partitions as
> > these
> > > > > > partitions are added?
> > > > > >
> > > > > >
> > > > > > On Mon, Nov 10, 2014 at 12:33 PM, Shlomi Hazan  >
> > > > wrote:
> > > > > >
> > > > > > > One more thing:
> > > > > > > I saw that the Python client is also unaffected by addition of
> > > > > partitions
> > > > > > > to a topic and that it continues to send requests only to the
> old
> > > > > > > partitions.
> > > > > > > is this also handled appropriately by the Java producer? Will
> he
> > > see
> > > > > the
> > > > > > > change and produce to the new partitions as well?
> > > > > > > Shlomi
> > > > > > >
> > > > > > > On Mon, Nov 10, 2014 at 9:34 AM, Shlomi Hazan <
> shl...@viber.com>
> > > > > wrote:
> > > > > > >
> > > > > > >> No I don't see anything like that, the question was aimed at
> > > > learning
> > > > > if
> > > > > > >> it is worthwhile to make the effort of reimplementing the
> Python
> > > > > > producer
> > > > > > >> in Java, so I will not make all the effort just to be
> > > disappointed
> > > > > > >> afterwards.
> > > > > > >> I understand I have nothing to worry about, so I will try to
> > > simulate
> > > > > this
> > > > > > >> situation in small scale...
> > > > > > >> maybe 3 brokers, one topic with one partition and then add
> > > > partitions.
> > > > > > >> we'll see.
> > > > > > >> thanks for clarifying.
> > > > > > >> Oh, Good luck with Confluent!!
> > > > > > >> :)
> > > > > > >>
> > > > > > >> On Mon, Nov 10, 2014 at 4:17 AM, Neha Narkhede <
> > > > > neha.narkh...@gmail.com
> > > > > > >
> > > > > > >> wrote:
> > > > > > >>
> > > > > > >>> The producer might get an error code if the leader of the
> > > > partitions
> > > > > > >>> being
> > > > > > >>> reassigned also changes. However it should retry and succeed.
> > Do
> > > > you
> > > > > > see
> > > > > > >>> a
> > > > > > >>> behavior that suggests otherwise?
> > > > > > >>>
> > > > > > >>> On Sat, Nov 8, 2014 at 11:45 PM, Shlomi Hazan <
> > shl...@viber.com>
> > > > > > wrote:
> > > > > > >>>
> > > > > > >>> > Hi All,
> > > > > > >>> > I recently had an issue producing from pyth
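The behavior discussed in this thread, where the old producer keeps sending only to the partitions it already knows about until its next metadata refresh, can be modeled with a small sketch. This is a toy illustration, not Kafka's actual code: the class name and the `fetch` callback are invented, and the 600-second default mirrors the `topic.metadata.refresh.interval.ms` default of 10 minutes.

```python
import time

class PartitionMetadataCache:
    """Toy model of the old producer's metadata handling: the partition
    count for a topic is cached and only re-fetched once the refresh
    interval has elapsed, so newly added partitions stay invisible to
    the producer until the next refresh."""

    def __init__(self, fetch, refresh_interval_s=600.0, clock=time.monotonic):
        self.fetch = fetch                    # callable: topic -> partition count
        self.refresh_interval_s = refresh_interval_s
        self.clock = clock
        self._cache = {}                      # topic -> (count, fetched_at)

    def partitions(self, topic):
        now = self.clock()
        entry = self._cache.get(topic)
        if entry is None or now - entry[1] >= self.refresh_interval_s:
            entry = (self.fetch(topic), now)  # refresh from the "broker"
            self._cache[topic] = entry
        return entry[0]
```

With the default interval, a partition added right after a refresh is ignored for up to ten minutes; lowering the interval shortens that lag at the cost of more metadata requests to the brokers, which is the trade-off Neha describes above.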

Re: Security in 0.8.2 beta

2014-11-11 Thread Mathias Herberts
Simply encrypt your messages with a PSK between producers and consumers.
On Nov 12, 2014 4:38 AM, "Kashyap Mhaisekar"  wrote:

> Hi,
> Is there a way to secure the topics created in Kafka 0.8.2 beta? The need
> is to ensure no one is able to read data from the topic without
> authorization.
>
> Regards
> Kashyap
>
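Mathias's pre-shared-key suggestion can be sketched roughly as follows. This is a toy construction for illustration only: the HMAC-derived keystream stands in for a real authenticated cipher (a production deployment would use something vetted such as AES-GCM keyed from the PSK), and all function names here are invented.

```python
import hashlib
import hmac
import os

def _keystream(psk: bytes, nonce: bytes, length: int) -> bytes:
    """Derive a keystream from the PSK and a per-message nonce
    (toy construction; not a substitute for a real cipher)."""
    out = b""
    counter = 0
    while len(out) < length:
        block = nonce + counter.to_bytes(4, "big")
        out += hmac.new(psk, block, hashlib.sha256).digest()
        counter += 1
    return out[:length]

def encrypt(psk: bytes, plaintext: bytes) -> bytes:
    # Fresh nonce per message so the keystream never repeats.
    nonce = os.urandom(16)
    ks = _keystream(psk, nonce, len(plaintext))
    return nonce + bytes(a ^ b for a, b in zip(plaintext, ks))

def decrypt(psk: bytes, message: bytes) -> bytes:
    nonce, body = message[:16], message[16:]
    ks = _keystream(psk, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, ks))
```

The point of the sketch is the deployment model, not the cipher: brokers store and serve only ciphertext, so anyone who can consume the topic but does not hold the PSK learns nothing from the payloads.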


Re: Security in 0.8.2 beta

2014-11-11 Thread Joe Stein
I know a few implementations that do this ("encrypt your messages with a
PSK between producers and consumers"). One of them actually writes the
encrypted symmetric key on a different topic for each downstream
consumer private key that can read the message. This way, when you are
consuming, you consume from two topics: 1) the topic with the (encrypted)
message you want, and 2) the topic from which you can decrypt the
symmetric key with your private key (because your public key was used to
encrypt it) and then use that key to decrypt the message. You join the
two streams by the uuid, so each message has a different secret key
encrypted with your public key. The other ones I can't talk about =8^)
but this one I mention is an interesting solution to this problem with
Kafka that I really like.

/***
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop 
/

On Wed, Nov 12, 2014 at 2:41 AM, Mathias Herberts <
mathias.herbe...@gmail.com> wrote:

> Simply encrypt your messages with a PSK between producers and consumers.
> On Nov 12, 2014 4:38 AM, "Kashyap Mhaisekar"  wrote:
>
> > Hi,
> > Is there a way to secure the topics created in Kafka 0.8.2 beta? The need
> > is to ensure no one is able to read data from the topic without
> > authorization.
> >
> > Regards
> > Kashyap
> >
>
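Joe's two-topic scheme can be sketched structurally like this. The XOR `toy_cipher` below is a deliberately insecure stand-in for the real primitives (a symmetric cipher for the payload, public-key wrapping for the per-message key), and the dictionaries modeling topics, the function names, and the single `consumer_key` are all invented for illustration; the part that matters is the per-message key and the join on uuid.

```python
import os
import uuid

def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR with a repeated key: a placeholder for real crypto. XOR is
    its own inverse, so the same call encrypts and decrypts."""
    stream = (key * (len(data) // len(key) + 1))[:len(data)]
    return bytes(a ^ b for a, b in zip(data, stream))

data_topic = {}  # uuid -> payload encrypted under a fresh per-message key
key_topic = {}   # uuid -> that per-message key, wrapped for one consumer

consumer_key = os.urandom(32)  # stands in for the consumer's key pair

def produce(payload: bytes) -> str:
    mid = str(uuid.uuid4())
    message_key = os.urandom(32)            # different secret key per message
    data_topic[mid] = toy_cipher(message_key, payload)
    key_topic[mid] = toy_cipher(consumer_key, message_key)
    return mid

def consume(mid: str) -> bytes:
    # Join the two streams on the uuid, unwrap the key, then decrypt.
    message_key = toy_cipher(consumer_key, key_topic[mid])
    return toy_cipher(message_key, data_topic[mid])
```

To add another authorized consumer, the producer would publish the same per-message key wrapped for that consumer's key on its own key topic; revoking access is then just a matter of no longer wrapping keys for that consumer.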