date:20130107

Best practices for changing partition numbers

2013-01-07 Thread David Ross

Hello, We have found that, for our application, having a number of total partitions as a multiple of the number of consumer hosts is beneficial. Because of this, whenever we add or remove consumer hosts, we have to change the number of partitions in the server config. What are best practices for

LinkedIn's Kafka->Hadoop ETL pipeline is open source

2013-01-07 Thread Jay Kreps

Hey All, There has been interest in getting something a little more sophisticated than the Input- and OutputFormat we include in contrib for reading Kafka data into HDFS. Internally at LinkedIn we have had a pretty sophisticated system that we use for Kafka ETL. It automatically discovers topi

Re: ETL with Kafka

2013-01-07 Thread Ken Krugler

On Jan 7, 2013, at 2:05pm, Russell Jurney wrote: > I previously posted a link to contrib in this thread. Thanks, I missed that - all I saw was the long URL to the Talend integration doc on Hortonworks. > No, it's not a > cascading tap. It's a complete job. One to read kafka events to hdfs, one t

Re: ETL with Kafka

2013-01-07 Thread Russell Jurney

I previously posted a link to contrib in this thread. No, it's not a cascading tap. It's a complete job. One to read kafka events to hdfs, one to generate kafka events from hdfs. ETL can happen in between. On Jan 7, 2013 1:51 PM, "Ken Krugler" wrote: > Hi Russell, > > On Jan 7, 2013, at 12:48pm, Ru

Re: ETL with Kafka

2013-01-07 Thread Ken Krugler

Hi Russell, On Jan 7, 2013, at 12:48pm, Russell Jurney wrote: > Just to be clear - a Kafka 'Tap' of sorts exists in contrib: it scans > Hadoop records, which may be ETL'd first, and emits new Kafka events. Can you point me at the code? And just to confirm, you're talking about a Cascading Tap,

Re: Can't start Kafka server with 0.8.0.

2013-01-07 Thread Jason Huang

OK. thanks! Jason On Mon, Jan 7, 2013 at 3:55 PM, Neha Narkhede wrote: > Jason, > > In 0.8, we changed the zookeeper data structures as well. You might want to > either use a new zk namespace or delete all of your zk data and restart 0.8. > > Thanks, > Neha > > > On Mon, Jan 7, 2013 at 12:28 PM

Re: Can't start Kafka server with 0.8.0.

2013-01-07 Thread Neha Narkhede

Jason, In 0.8, we changed the zookeeper data structures as well. You might want to either use a new zk namespace or delete all of your zk data and restart 0.8. Thanks, Neha On Mon, Jan 7, 2013 at 12:28 PM, Jason Huang wrote: > Never mind - I was able to start the server after removing the > p

Re: ETL with Kafka

2013-01-07 Thread Russell Jurney

Just to be clear - a Kafka 'Tap' of sorts exists in contrib: it scans Hadoop records, which may be ETL'd first, and emits new Kafka events. On Mon, Jan 7, 2013 at 9:57 AM, Ken Krugler wrote: > Hi Guy, > > On Jan 6, 2013, at 11:11pm, Guy Doulberg wrote: > > > Hi, > > Thanks David, > > > > I am lo

Re: Can't start Kafka server with 0.8.0.

2013-01-07 Thread Jason Huang

Never mind - I was able to start the server after removing the previously installed 0.7.2 instance of Kafka. Jason On Mon, Jan 7, 2013 at 2:56 PM, Jason Huang wrote: > Hello, > > I am trying out Kafka 0.8 using only one broker and I am unable to > start the server. > > With the instruction from th

Can't start Kafka server with 0.8.0.

2013-01-07 Thread Jason Huang

Hello, I am trying out Kafka 0.8 using only one broker and I am unable to start the server. With the instruction from this link - https://cwiki.apache.org/confluence/display/KAFKA/Kafka+0.8+Quick+Start, I was able to download and install 0.8. Since I only have one machine, I did the following co

Re: Is anyone able to consume from Kafka 0.7.x and write into Hadoop CDH 4.x ?

2013-01-07 Thread Felix GV

OOOH that's awesome :D !! I'll take a look at this shiny stuff right away! Thanks a lot :D !! -- Felix On Mon, Jan 7, 2013 at 2:28 PM, Neha Narkhede wrote: > > Finally, I haven't seen anything mentioned about the LinkedIn > > kafka/avro/hadoop ETL stuff we've been hearing about for a while. >

Re: Is anyone able to consume from Kafka 0.7.x and write into Hadoop CDH 4.x ?

2013-01-07 Thread Neha Narkhede

> Finally, I haven't seen anything mentioned about the LinkedIn > kafka/avro/hadoop ETL stuff we've been hearing about for a while. > The LinkedIn ETL kafka/avro/hadoop project is open sourced. See here - https://github.com/linkedin/camus/wiki/Camus-Overview Thanks, Neha

Is anyone able to consume from Kafka 0.7.x and write into Hadoop CDH 4.x ?

2013-01-07 Thread Felix GV

Hello all, I haven't been reading the list for the past couple of weeks, I've been quite busy... but I've searched and didn't find any discussions related to my current issue, so I thought I'd ask while I'm still investigating on my own...! We've been running a Kafka 0.7.0 cluster without problem for a w

Re: Graceful termination of kafka broker after draining all the data consumed

2013-01-07 Thread Bae, Jae Hyeon

0.8 sounds really great! OK, I will try after you release a stable build of 0.8. Thank you. Best, Jae On Sun, Jan 6, 2013 at 10:36 AM, Neha Narkhede wrote: > In 0.8, we will provide a way for you to shut down the broker in a > controlled fashion. What that would include is moving all the leaders aw

Re: Consumer rebalance per topic

2013-01-07 Thread Pablo Barrera González

Thank you Jun and Neha. I was trying to avoid adding more partitions. I have enough partitions if you count all partitions in all topics. I understand the problem with different data load per topic, but the current scheme does not solve this problem either, so we shouldn't be worse off if we consider all

Re: Kafka 0.8 - KeyedMessage?

2013-01-07 Thread Jason Huang

I see. This makes sense. thanks Neha, Jason On Mon, Jan 7, 2013 at 1:52 PM, Neha Narkhede wrote: > Jason, > > If you specify a key for a message but do not explicitly wire in a > partitioner, messages with the same key will still land up in the same > partition. This is because we use a defaul

Re: Kafka 0.8 - KeyedMessage?

2013-01-07 Thread Neha Narkhede

Jason, If you specify a key for a message but do not explicitly wire in a partitioner, messages with the same key will still land up in the same partition. This is because we use a default partitioner that does a simple hash(key) % num_partitions. Thanks, Neha On Mon, Jan 7, 2013 at 9:30 AM, Ja
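The `hash(key) % num_partitions` rule described in this reply can be illustrated with a minimal sketch. This is not Kafka's own partitioner code: the function name `pick_partition` is hypothetical, and CRC32 is used here only as a stand-in for whatever hash the real default partitioner applies to the key.

```python
import zlib


def pick_partition(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition with hash(key) % num_partitions.

    CRC32 is an assumed stand-in hash, chosen because it is stable
    across processes; it is not the hash Kafka itself uses.
    """
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    return zlib.crc32(key) % num_partitions


if __name__ == "__main__":
    # The same key always maps to the same partition for a fixed count.
    for key in (b"user-1", b"user-2", b"user-1"):
        print(key, "->", pick_partition(key, 8))
```

Because the partition count is the modulus, changing the number of partitions can move an existing key to a different partition, which is worth keeping in mind when reading the thread on changing partition numbers.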

Re: Consumer rebalance per topic

2013-01-07 Thread Neha Narkhede

Pablo, That is a good suggestion. Ideally, the partitions across all topics should be distributed evenly across consumer streams instead of a per-topic based decision. There is no particular advantage to the current scheme of per-topic rebalancing that I can think of. Would you mind filing a JIRA
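The difference between the two schemes discussed in this thread can be sketched as follows. Both functions are hypothetical illustrations, not Kafka's rebalance code: `assign_per_topic` splits each topic's partitions among the consumers independently, while `assign_globally` pools the partitions of all topics and deals them out round-robin, as the suggestion proposes. The topic names and consumer ids are made up.

```python
def assign_per_topic(topics, consumers):
    """Split each topic's partitions among consumers, one topic at a time.

    topics: dict mapping topic name -> partition count.
    Returns a dict mapping consumer id -> list of (topic, partition).
    """
    ordered = sorted(consumers)
    assignment = {c: [] for c in ordered}
    for topic in sorted(topics):
        per_consumer, extra = divmod(topics[topic], len(ordered))
        start = 0
        for i, consumer in enumerate(ordered):
            # The first `extra` consumers take one additional partition.
            count = per_consumer + (1 if i < extra else 0)
            assignment[consumer].extend(
                (topic, p) for p in range(start, start + count)
            )
            start += count
    return assignment


def assign_globally(topics, consumers):
    """Pool the partitions of all topics and deal them out round-robin."""
    ordered = sorted(consumers)
    assignment = {c: [] for c in ordered}
    pooled = [(t, p) for t in sorted(topics) for p in range(topics[t])]
    for i, topic_partition in enumerate(pooled):
        assignment[ordered[i % len(ordered)]].append(topic_partition)
    return assignment


if __name__ == "__main__":
    topics = {"a": 2, "b": 2, "c": 2}
    consumers = ["c1", "c2", "c3", "c4"]
    print(assign_per_topic(topics, consumers))
    print(assign_globally(topics, consumers))
```

With three topics of two partitions each and four consumers, the per-topic version gives every partition to `c1` and `c2` and leaves `c3` and `c4` idle, whereas the pooled version gives each of the four consumers at least one partition.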

Re: ETL with Kafka

2013-01-07 Thread Ken Krugler

Hi Guy, On Jan 6, 2013, at 11:11pm, Guy Doulberg wrote: > Hi, > Thanks David, > > I am looking for a product (open source or not), something like Talend or > Pentaho that in which I can design the ETL (from and to kafka), and run the > the ETL in Storm/ IronCount or even maybe I can run it in

Re: Kafka 0.8 - KeyedMessage?

2013-01-07 Thread Jason Huang

Jun, Thanks for the response. If I understand you correctly, messages with the same key will not be automatically stored at the same partition unless I implement a partition function to route the message based on the key? The quick start guide for 0.7 has the following: "Send a message with a par

Re: Consumer rebalance per topic

2013-01-07 Thread Jun Rao

Pablo, Currently, partition is the smallest unit that we distribute data among consumers (in the same consumer group). So, if the # of consumers is larger than the total number of partitions in a Kafka cluster (across all brokers), some consumers will never get any data. Such a decision is done on

Re: Kafka 0.8 - KeyedMessage?

2013-01-07 Thread Jun Rao

Jason, In 0.8, each message can optionally have a key. The key is retained as part of the message and will be stored in the broker. One can design a partition function to route the message based on the key. The default partitioner ignores the key and selects a partition at random. Thanks, Jun O

Consumer rebalance per topic

2013-01-07 Thread Pablo Barrera González

Hello We are starting to use Kafka in production but we found an unexpected (at least for me) behavior with the use of partitions. We have a bunch of topics with a few partitions each. We try to consume all data from several consumers (just one consumer group). The problem is in the rebalance ste

Kafka 0.8 - KeyedMessage?

2013-01-07 Thread Jason Huang

Hello, I did some search on the web but couldn't find any documentation for 0.8 so I am trying to ask here: KeyedMessage is introduced in 0.8.0: class KeyedMessage[K, V](val topic: String, val key: K, val message: V) Does the parameter "key" = "partition key"? If I build a KeyedMessage with a s


24 matches
