Re: Kafka with Zookeeper behind AWS ELB

2017-07-20 Thread Pradeep Gollakota
Luigi, I strongly urge you to consider a 5 node ZK deployment. I've always done that in the past for resiliency during maintenance. In a 3 node cluster, you can only tolerate one "failure", so if you bring one node down for maintenance and another node crashes during said maintenance, your ZK clus
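
The arithmetic behind the 3-node versus 5-node advice can be sketched as follows. This is a minimal illustration that assumes ZooKeeper's usual majority-quorum rule; the function name `zk_fault_tolerance` is hypothetical and not part of any ZooKeeper API.

```python
def zk_fault_tolerance(ensemble_size: int) -> int:
    """Return how many nodes may be down while a strict majority survives.

    Assumes a majority quorum: more than half of the ensemble must be up.
    """
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    quorum = ensemble_size // 2 + 1  # smallest strict majority
    return ensemble_size - quorum


if __name__ == "__main__":
    for size in (3, 5):
        print(f"{size} nodes tolerate {zk_fault_tolerance(size)} failure(s)")
```

With 3 nodes, taking one down for maintenance leaves no room for a second failure; with 5 nodes, one node can be in maintenance while another crashes and the ensemble still has a majority.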

Re: Scaling up kafka consumers

2017-02-24 Thread Pradeep Gollakota
Within a consumer group, a single partition can be consumed by at most one consumer. Consumers compete to take ownership of a partition. So, in order to gain parallelism you need to add more partitions. There is a library that allows multiple consumers to consume from a single partition https://github.com/gerritjvv/k
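
To illustrate why adding consumers beyond the partition count gains nothing, here is a toy sketch. It assumes a simple round-robin assignment within one group; `assign_partitions` is a hypothetical helper and not Kafka's actual assignor.

```python
from typing import Dict, List


def assign_partitions(partitions: List[int], consumers: List[str]) -> Dict[str, List[int]]:
    """Give each partition to exactly one consumer, round-robin (toy model)."""
    if not consumers:
        raise ValueError("at least one consumer is required")
    assignment: Dict[str, List[int]] = {c: [] for c in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


if __name__ == "__main__":
    # Two partitions, three consumers: the third consumer owns nothing.
    print(assign_partitions([0, 1], ["c1", "c2", "c3"]))
```

In this model the number of busy consumers is capped at the number of partitions, which is why more partitions are needed for more parallelism.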

Re: How does one deploy to consumers without causing re-balancing for real time use case?

2017-02-10 Thread Pradeep Gollakota
e > consumer need to be handled by some other group members." > > So does this mean that the consumer should inform the group ahead of > time before it goes down? Currently, I just shutdown the process. > > > On Fri, Feb 10, 2017 at 8:35 AM, Pradeep Gollakota > wrote:

Re: How does one deploy to consumers without causing re-balancing for real time use case?

2017-02-10 Thread Pradeep Gollakota
I asked a similar question a while ago. There doesn't appear to be a way to avoid triggering the rebalance. But I'm not sure why it would be taking > 1hr in your case. For us it was pretty fast. https://www.mail-archive.com/users@kafka.apache.org/msg23925.html On Fri, Feb 10, 2017 at 4:28 AM, Krz

Re: Consumer Rebalancing Question

2017-01-06 Thread Pradeep Gollakota
rk, and a rebalance sorts it out and reassigns it to another member of > the group. This happens once and then the "issue" is resolved without any > additional interruptions. > > -Ewen > > On Thu, Jan 5, 2017 at 3:01 PM, Pradeep Gollakota > wrote: > > > I see... doe

Re: Consumer Rebalancing Question

2017-01-05 Thread Pradeep Gollakota
timeout before completing a rebalance. So aside from > the latency of cleanup/committing offsets/rejoining after a heartbeat, > rolling bounces should be fast for consumer groups. > > -Ewen > > On Wed, Jan 4, 2017 at 5:19 PM, Pradeep Gollakota > wrote: > > > Hi Kafka folks

Consumer Rebalancing Question

2017-01-04 Thread Pradeep Gollakota
Hi Kafka folks! When a consumer is closed, it will issue a LeaveGroupRequest. Does anyone know how long the coordinator waits before reassigning the partitions that were assigned to the leaving consumer to a new consumer? I ask because I'm trying to understand the behavior of consumers if you're d

Re: kafka + autoscaling groups fuckery

2016-06-28 Thread Pradeep Gollakota
Just out of curiosity, if you guys are in AWS for everything, why not use Kinesis? On Tue, Jun 28, 2016 at 3:49 PM, Charity Majors wrote: > Hi there, > > I just finished implementing kafka + autoscaling groups in a way that made > sense to me. I have a _lot_ of experience with ASGs and various

Re: Re: Re: Re: Kafka lost data issue

2015-11-12 Thread Pradeep Gollakota
What is your producer configuration? Specifically, how many acks are you requesting from Kafka? On Thu, Nov 12, 2015 at 2:03 AM, jinxing wrote: > in kafka_0.8.3.0: > kafkaProducer = new KafkaProducer<>(properties, new ByteArraySerializer(), > new ByteArraySerializer()); > kafkaProducer.flush();

Re: Dealing with large messages

2015-10-06 Thread Pradeep Gollakota
> > >> Here’s an article that Gwen wrote earlier this year on handling large > > >> messages in Kafka. > > >> > > >> http://ingest.tips/2015/01/21/handling-large-messages-kafka/ > > >> > > >> -James > > >> > > >>

Re: Datacenter to datacenter over the open internet

2015-10-06 Thread Pradeep Gollakota
At Lithium, we have multiple datacenters and we distcp our data across our Hadoop clusters. We have 2 DCs in NA and 1 in EU. We have a non-redundant direct connect from our EU cluster to one of our NA DCs. If and when this fails, we have automatic failover to a VPN that goes over the internet. The

Dealing with large messages

2015-10-05 Thread Pradeep Gollakota
Fellow Kafkaers, We have a pretty heavyweight legacy event logging system for batch processing. We're now sending the events into Kafka for realtime analytics. But we have some pretty large messages (> 40 MB). I'm wondering if any of you have use cases where you have to send large messages to

Kafka Recovery Errors

2015-10-01 Thread Pradeep Gollakota
Hi All, We’ve been dealing with a Kafka outage for a few days. In an attempt to recover, we’ve shut down all of our producers and consumers. So the only connections we see to/from the brokers are other brokers and zookeeper. The symptoms we’re seeing are: 1. Uncaught exception on kafka-ne

Re: number of topics given many consumers and groups within the data

2015-09-30 Thread Pradeep Gollakota
To add a little more context to Shaun's question, we have around 400 customers. Each customer has a stream of events. Some customers generate a lot of data while others don't. We need to ensure that each customer's data is sorted globally by timestamp. We have two use cases around consumption: 1.

Re: integrate Camus and Hive?

2015-03-09 Thread Pradeep Gollakota
If I understood your question correctly, you want to be able to read the output of Camus in Hive and be able to know partition values. If my understanding is right, you can do so by using the following. Hive provides the ability to provide custom patterns for partitions. You can use this in combin

Re: [kafka-clients] Re: [VOTE] 0.8.2.0 Candidate 3

2015-02-03 Thread Pradeep Gollakota
Lithium Technologies would love to host you guys for a release party in SF if you guys want. :) On Tue, Feb 3, 2015 at 11:04 AM, Gwen Shapira wrote: > When's the party? > :) > > On Mon, Feb 2, 2015 at 8:13 PM, Jay Kreps wrote: > > Yay! > > > > -Jay > > > > On Mon, Feb 2, 2015 at 2:23 PM, Neha

Re: Kafka ETL Camus Question

2015-02-02 Thread Pradeep Gollakota
Hi Bhavesh, At Lithium, we don't run Camus in our pipelines yet, though we plan to. But I just wanted to comment regarding speculative execution. We have it disabled at the cluster level and typically don't need it for most of our jobs. Especially with something like Camus, I don't see any need to

Re: New Producer - ONLY sync mode?

2015-02-02 Thread Pradeep Gollakota
d their status. But > they don't :) > > Any thoughts on how you'd like it to work? > > Gwen > > > On Mon, Feb 2, 2015 at 1:38 PM, Pradeep Gollakota > wrote: > > This is a great question Otis. Like Gwen said, you can accomplish Sync > mode > > by se

Re: New Producer - ONLY sync mode?

2015-02-02 Thread Pradeep Gollakota
This is a great question Otis. Like Gwen said, you can accomplish Sync mode by setting the batch size to 1. But this does highlight a shortcoming of the new producer API. I really like the design of the new API and it has really great properties and I'm enjoying working with it. However, once API

Re: Max. storage for Kafka and impact

2014-12-19 Thread Pradeep Gollakota
@Joe, Achanta is using Indian English numerals which is why it's a little confusing. http://en.wikipedia.org/wiki/Indian_English#Numbering_system 1,00,000 [1 lakh] (Indian English) == 100,000 [1 hundred thousand] (The rest of the world :P) On Fri Dec 19 2014 at 9:40:29 AM Achanta Vamsi Subhash < a
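
The two digit-grouping conventions can be compared with a short sketch. `format_indian` is a hypothetical helper written for illustration: the last three digits form one group and the remaining digits are grouped in pairs.

```python
def format_indian(n: int) -> str:
    """Format an integer with Indian digit grouping (e.g. lakh, crore)."""
    sign = "-" if n < 0 else ""
    digits = str(abs(n))
    if len(digits) <= 3:
        return sign + digits
    head, tail = digits[:-3], digits[-3:]
    groups = []
    # Peel pairs of digits off the right end of the leading part.
    while len(head) > 2:
        groups.insert(0, head[-2:])
        head = head[:-2]
    groups.insert(0, head)
    return sign + ",".join(groups + [tail])


if __name__ == "__main__":
    value = 100000
    print(format_indian(value))  # Indian grouping: 1 lakh
    print(f"{value:,}")          # Western grouping: one hundred thousand
```

Both strings denote the same number; only the comma placement differs.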

Re: [DISCUSS] Kafka Security Specific Features

2014-06-06 Thread Pradeep Gollakota
I'm actually not convinced that encryption needs to be handled server side in Kafka. I think the best solution for encryption is to handle it producer/consumer side just like compression. This will offload key management to the users and we'll still be able to leverage the sendfile optimization for

Re: Remote Zookeeper

2014-03-11 Thread Pradeep Gollakota
Is there a firewall that's blocking connections on port 9092? Also, the broker list should be comma-separated. On Tue, Mar 11, 2014 at 9:02 AM, A A wrote: > Sorry one of the brokers was down. Brought it back up. Tried the > following > > $KAFKA_HOME/bin/kafka-console-producer.sh --broker-li

Re: New Consumer API discussion

2014-02-13 Thread Pradeep Gollakota
Hi Neha, 6. It seems like #4 can be avoided by using Map> Long> or Map as the argument type. > > How? lastCommittedOffsets() is independent of positions(). I'm not sure I > understood your suggestion. I think of subscription as you're subscribing to a Set of TopicPartitions. Because the argume

Re: New Consumer API discussion

2014-02-11 Thread Pradeep Gollakota
c. > > Thanks, > Neha > > > On Tue, Feb 11, 2014 at 11:45 AM, Pradeep Gollakota >wrote: > > > Hi Jay, > > > > I apologize for derailing the conversation about the consumer API. We > > should start a new discussion about hierarchical topics, if we wa

Re: New Consumer API discussion

2014-02-11 Thread Pradeep Gollakota
? > > -Jay > > > On Mon, Feb 10, 2014 at 3:37 PM, Pradeep Gollakota >wrote: > > > WRT to hierarchical topics, I'm referring to > > KAFKA-1175<https://issues.apache.org/jira/browse/KAFKA-1175>. > > I would just like to think through the implications

Re: New Consumer API discussion

2014-02-10 Thread Pradeep Gollakota
gative value for the timeout > would block in the poll forever until there is new data > 3. We don't have hierarchical topics support. Would you mind explaining > what you meant? > 4. I'm not so sure that we need a class to express a topic which is a > string and a separate clas

Re: Config for new clients (and server)

2014-02-10 Thread Pradeep Gollakota
+1 Jun. On Mon, Feb 10, 2014 at 2:17 PM, Sriram Subramanian < srsubraman...@linkedin.com> wrote: > +1 on Jun's suggestion. > > On 2/10/14 2:01 PM, "Jun Rao" wrote: > > >I actually prefer to see those at INFO level. The reason is that the > >config > >system in an application can be complex. Som

Re: New Consumer API discussion

2014-02-10 Thread Pradeep Gollakota
Couple of very quick thoughts. 1. +1 about renaming commit(...) and commitAsync(...) 2. I'd also like to extend the above for the poll() method as well. poll() and pollWithTimeout(long, TimeUnit)? 3. Have you guys given any thought around how this API would be used with hierarchical topics? 4. Wo

Re: Building a producer/consumer supporting exactly-once messaging

2014-02-10 Thread Pradeep Gollakota
Have you read this part of the documentation? http://kafka.apache.org/documentation.html#semantics Just wondering if that solves your use case. On Mon, Feb 10, 2014 at 9:11 AM, Garry Turkington < g.turking...@improvedigital.com> wrote: > Hi, > > I've been doing some prototyping on Kafka for a f