Re: Kafka 0.8 Maven and IntelliJ

2013-08-07 Thread Florin Trofin
An update on this issue: I still can't build the 0.8 branch using Maven. My automated build system uses Maven, so I need to get this working. Here are my steps:
- Get the latest version of 0.8:
> git clone https://git-wip-us.apache.org/repos/asf/kafka.git kafka
> cd kafka
> git checkout -b 0.8

Re: LeaderNotAvailableException

2013-08-07 Thread Tejas Patil
Controlled shutdown transfers the leadership of the partitions from the broker (to be shut down) to any one of the available in-sync replicas for that partition. It won't change the broker-partition assignments, so there would be at least one live broker to lead the partition. In you

Re: WELCOME to users@kafka.apache.org

2013-08-07 Thread Joe Stein
see inline /*** Joe Stein Founder, Principal Consultant Big Data Open Source Security LLC http://www.stealth.ly Twitter: @allthingshadoop / On Wed, Aug 7, 2013 at 2:4

Exception with kafka 0.8

2013-08-07 Thread 小宇
Hi, I'm new to Kafka and am following the quickstart guide. When I got to Step 2, I ran bin/kafka-server-start.sh config/server.properties, but got an exception: [2013-08-06 09:55:14,603] INFO 0 successfully elected as leader (kafka.server.ZookeeperLeaderElector) [2013-08-06 09:55:14,657] ERROR Erro

Re: problem with adapter

2013-08-07 Thread sphinx jiang
Sorry, I didn't find what should be set from the start page. Here is the newest setting (with brokerlist): log4j.appender.KAFKA=kafka.producer.KafkaLog4jAppender log4j.appender.KAFKA.zkConnect=127.0.0.1:2180 log4j.appender.KAFKA.BrokerList=0:localhost:9092 log4j.appender.KAFKA.SerializerClass=kafka.s

Sending Data from more than one producer

2013-08-07 Thread Yavar Husain
How can I make multiple producers write data? I have written a producer that produces some data for 15 seconds on a single-machine setup. Now when I run another instance of the same producer, it says the port is in use (which is natural, as I think the first producer is sending data using TCP). So it

Re: Sending Data from more than one producer

2013-08-07 Thread Jay Kreps
Can you provide the error message you are getting? -Jay On Wed, Aug 7, 2013 at 2:55 AM, Yavar Husain wrote: > How can I make multiple producers to write data? I have written a producer > that produces some data for 15 seconds on a single machine setup. Now when > I run another instance of same

Re: LeaderNotAvailableException

2013-08-07 Thread Jun Rao
We do have a tool ReassignPartitionsCommand that allows you to move data from one broker to another. It's still being tested and improved. It will be complete in the 0.8 final release. Thanks, Jun On Tue, Aug 6, 2013 at 11:18 PM, Vadim Keylis wrote: > Your assumption is correct. However I was

Re: Exception with kafka 0.8

2013-08-07 Thread Jun Rao
What's the host/port registered under /brokers/ids/[brokerId] in ZK? Thanks, Jun On Wed, Aug 7, 2013 at 1:40 AM, 小宇 wrote: > Hi,I'm new to kafka, and i follow the quickstart guide, when i come to the > Step 2,i run bin/kafka-server-start.sh config/server.properties , but got > exception: > [

Reading Kafka directly from Pig?

2013-08-07 Thread David Arthur
I've thrown together a Pig LoadFunc to read data from Kafka, so you could load data like: QUERY_LOGS = load 'kafka://localhost:9092/logs.query#8' using com.mycompany.pig.KafkaAvroLoader('com.mycompany.Query'); The path part of the uri is the Kafka topic, and the fragment is the number of par

Re: problem with adapter

2013-08-07 Thread Jun Rao
Actually, our default ZK port is 2181, not 2180. Could you try the following setting? log4j.appender.KAFKA.zkConnect=127.0.0.1:2181 Thanks, Jun On Wed, Aug 7, 2013 at 2:13 AM, sphinx jiang wrote: > Sorry, didn't find what should be set from the start page. > > Here i

Re: Reading Kafka directly from Pig?

2013-08-07 Thread Jun Rao
David, That's interesting. Kafka provides an infinite stream of data whereas Pig works on a finite amount of data. How did you solve the mismatch? Thanks, Jun On Wed, Aug 7, 2013 at 7:41 AM, David Arthur wrote: > I've thrown together a Pig LoadFunc to read data from Kafka, so you could > loa

Re: LeaderNotAvailableException

2013-08-07 Thread Vadim Keylis
Jun, What is the process now if we want to add and remove servers? How can I recover from the error in the meantime? When is the final release? Thanks, Vadim On Wed, Aug 7, 2013 at 7:37 AM, Jun Rao wrote: > We do have a tool ReassignPartitionsCommand that allows you to move data > from one broker to an

Lost of messages at C++ Kafka client

2013-08-07 Thread Tianning Zhang
Hi, I am using Kafka (version 0.7.2) for publishing events from some C++ applications (ca. 20K events/sec). The C++ client is used, which is based on asio. With the asynchronous and batched messaging, one issue I have is that in case the Kafka server is broken, the client will not notice the socket

Re: Reading Kafka directly from Pig?

2013-08-07 Thread Russell Jurney
David, can you share the code on Github so we can take a look? This sounds awesome. Russell Jurney http://datasyndrome.com On Aug 7, 2013, at 7:49 AM, Jun Rao wrote: > David, > > That's interesting. Kafka provides an infinite stream of data whereas Pig > works on a finite amount of data. How di

Re: LeaderNotAvailableException

2013-08-07 Thread Jun Rao
For now, you can still add servers, but only newly created topics will go there. If you just remove a server, you will be down 1 replica. What you can do is to replace a server with a new one by keeping the same broker id. To recover from your error: (1) bring the 3 old brokers back up; (2) bring

Does C++ client support zookeeper based producer load balancing?

2013-08-07 Thread Jan Rudert
Hi, I am starting with Kafka. We use version 0.7.2 currently. Does anyone know whether automatic producer load balancing based on ZooKeeper is supported by the C++ client? Thank you! -- Jan

Re: Lost of messages at C++ Kafka client

2013-08-07 Thread Philip O'Toole
If I understand what you are asking, I have dealt successfully with the same type of issue. It can take more than one Boost async_write() over a broken connection before the client software notices that the connection is gone. The best way to detect if a connection is broken is not by detecting th

Re: Lost of messages at C++ Kafka client

2013-08-07 Thread Jun Rao
Yes, in 0.7, one of the limitations is that the producer doesn't wait for any acknowledgement from the broker. So, if the broker is down, messages still in the client socket buffer may be lost. In 0.8, we added the option so that the producer can wait for an ack, which will help this particular kin

Re: LeaderNotAvailableException

2013-08-07 Thread Vadim Keylis
Thanks so much sir. On Wed, Aug 7, 2013 at 8:00 AM, Jun Rao wrote: > For now, you can still add servers, but only newly created topics will go > there. If you just remove a server, you will be down 1 replica. What you > can do is to replace a server with a new one by keeping the same broker id.

Re: Reading Kafka directly from Pig?

2013-08-07 Thread David Arthur
Right now it only terminates if SimpleConsumer hits the timeout. So, in theory it can run forever. To bound the InputFormat, I would probably add a max time or max number of messages to consume (in addition to the timeout). I started by looking at the Camus code, but it was easier to whip up a sim
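
The bounding idea mentioned above (a max time or max number of messages) could look roughly like the sketch below. It is not the LoadFunc's actual code: `bounded`, `max_messages`, and `max_seconds` are hypothetical names, and the message source is any Python iterable standing in for a consumer.

```python
import time
from typing import Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


def bounded(
    messages: Iterable[T],
    max_messages: Optional[int] = None,
    max_seconds: Optional[float] = None,
) -> Iterator[T]:
    """Yield from a possibly endless source until a count or time limit is hit.

    Hypothetical helper: `messages` stands in for whatever the consumer returns.
    """
    deadline = None if max_seconds is None else time.monotonic() + max_seconds
    source = iter(messages)
    count = 0
    while True:
        # Check both limits before fetching, so no message is pulled and dropped.
        if max_messages is not None and count >= max_messages:
            return
        if deadline is not None and time.monotonic() >= deadline:
            return
        try:
            message = next(source)
        except StopIteration:
            return  # the source itself ended (e.g. the consumer timed out)
        yield message
        count += 1
```

Note that the time limit is only checked between messages, so a fetch that blocks still relies on the consumer's own timeout.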

Re: Reading Kafka directly from Pig?

2013-08-07 Thread David Arthur
I'd be happy to, if and when it becomes a real thing. Still very alpha quality right now On 8/7/13 10:58 AM, Russell Jurney wrote: David, can you share the code on Github so we can take a look? This sounds awesome. Russell Jurney http://datasyndrome.com On Aug 7, 2013, at 7:49 AM, Jun Rao wr

Re: Kafka/Hadoop consumers and producers

2013-08-07 Thread aotto
Hi all, Over at the Wikimedia Foundation, we're trying to figure out the best way to do our ETL from Kafka into Hadoop. We don't currently use Avro and I'm not sure if we are going to. I came across this post. If the plan is to remove the hadoop-consumer from Kafka contrib, do you think we s

Re: Kafka/Hadoop consumers and producers

2013-08-07 Thread Oleg Ruchovets
I am also interested in Hadoop+Kafka capabilities. I am using Kafka 0.7, so my question: what is the best way to consume content from Kafka and write it to HDFS? At this time I need only the consuming functionality. Thanks, Oleg. On Wed, Aug 7, 2013 at 7:33 PM, wrote: > Hi all, > > Over at

at-least-once guarantee?

2013-08-07 Thread Yang
I wonder why an at-least-once guarantee is easier to maintain than exactly-once (in that the latter requires 2PC while the former does not, according to http://research.microsoft.com/en-us/um/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf ). If you achieve an at-least-once guarantee, you are a

Re: at-least-once guarantee?

2013-08-07 Thread Jay Kreps
Yeah I'm not sure how good our understanding was when we wrote that. Here is my take now: At least once delivery is not that hard but you need the ability to deduplicate things--basically you turn the "at least once delivery channel" into the "exactly once channel" by throwing away duplicates. Th

Re: at-least-once guarantee?

2013-08-07 Thread Niek Sanders
With at-least-once, you can retry until your target confirms delivery. Trivial. Exactly once means handling all sorts of nasty cases. E.g. delivered but not confirmed by recipient due to crash. Delivered but not yet processed by client because it's in the incoming queue. (Sending again would mean >
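
The "retry until confirmed" idea can be sketched as follows. This is an illustration only, not Kafka's producer API: `send_at_least_once`, `FlakyReceiver`, and their parameters are hypothetical names. The receiver drops the first few confirmations to show how a message that was delivered but not confirmed gets delivered again.

```python
import time
from typing import Callable, List


def send_at_least_once(
    send: Callable[[bytes], bool],
    message: bytes,
    max_attempts: int = 5,
    backoff_seconds: float = 0.0,
) -> int:
    """Resend until `send` reports a confirmation; return the attempts used."""
    for attempt in range(1, max_attempts + 1):
        if send(message):
            return attempt
        time.sleep(backoff_seconds)
    raise RuntimeError(f"no confirmation after {max_attempts} attempts")


class FlakyReceiver:
    """Hypothetical receiver that stores every message but loses early acks."""

    def __init__(self, lost_acks: int) -> None:
        self.delivered: List[bytes] = []
        self._lost_acks = lost_acks

    def send(self, message: bytes) -> bool:
        self.delivered.append(message)  # the message always arrives
        if self._lost_acks > 0:
            self._lost_acks -= 1
            return False  # the confirmation is lost on the way back
        return True
```

With `FlakyReceiver(lost_acks=2)`, the sender succeeds on the third attempt and the receiver holds three copies of the same message, which is why at-least-once needs deduplication to become exactly-once.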

Re: at-least-once guarantee?

2013-08-07 Thread Milind Parikh
Interesting... Wouldn't the producer sequence grow without bounds, in the first case, even with the simpler non-ha of key assumption, to provide strict exactly-once semantics? In other words, wouldn't you need to store the entire set of keys that the broker has ever seen to ensure that a potent

Re: at-least-once guarantee?

2013-08-07 Thread Jay Kreps
Not if the system does ordered delivery and the id is a sequentially increasing integer (mod something). Essentially you need to keep (producer_id, topic, partition, max_seq_num) on each broker. That is, rather than storing all keys you just need to know the max you have seen if the seq_num is <
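
One possible reading of the bookkeeping described above, as a minimal sketch: it assumes ordered delivery per producer and partition, treats any sequence number at or below the stored maximum as a duplicate, and ignores the wrap-around ("mod something") case. `SequenceDeduplicator` and `accept` are hypothetical names, not part of Kafka.

```python
from typing import Dict, Tuple


class SequenceDeduplicator:
    """Keeps only the max sequence number per (producer_id, topic, partition)."""

    def __init__(self) -> None:
        self._max_seq: Dict[Tuple[str, str, int], int] = {}

    def accept(self, producer_id: str, topic: str, partition: int, seq_num: int) -> bool:
        """Return True for a new message, False for a resend already seen."""
        key = (producer_id, topic, partition)
        last_seen = self._max_seq.get(key, -1)
        if seq_num <= last_seen:
            return False  # assumed duplicate: not newer than the stored max
        self._max_seq[key] = seq_num
        return True
```

The state is one integer per (producer, topic, partition) rather than a set of every key ever seen, which is the point of the reply to the unbounded-growth concern.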

Re: Does C++ client support zookeeper based producer load balancing?

2013-08-07 Thread Jun Rao
Jan, My understanding is that most c++ client implementations use broker list, instead of ZK for load balancing. Thanks, Jun On Wed, Aug 7, 2013 at 8:06 AM, Jan Rudert wrote: > Hi, > > I am starting with kafka. We use version 0.7.2 currently. Does anyone know > wether automatic producer load

Re: Exception with kafka 0.8

2013-08-07 Thread Neha Narkhede
The invalid argument exception on a socket connection looks weird. If you enable debug on kafka.controller.ControllerChannelManager, it will tell you which broker the newly elected controller is trying to talk to. Then you will have to make sure that every broker can connect to every other broker.

Re: Kafka/Hadoop consumers and producers

2013-08-07 Thread Ken
Hi Andrew, Camus can be made to work without Avro. You will need to implement a message decoder and a data writer. We need to add a better tutorial on how to do this, but it isn't that difficult. If you decide to go down this path, you can always ask questions on this list. I try to make