Hi,
We are using Kafka for our messaging system and we estimate a volume of 200
TB/week in the coming months. Will this impact Kafka's performance?
PS: We will have more than 2 lakh (200,000) partitions.
--
Regards
Vamsi Subhash
We definitely need a retention policy of a week; hence the volume.
On Fri, Dec 19, 2014 at 7:40 PM, Achanta Vamsi Subhash <
achanta.va...@flipkart.com> wrote:
>
> Hi,
>
> We are using Kafka for our messaging system and we have an estimate for
> 200 TB/week in the coming months. Will it impact any performance…
Hi,
A few things you have to plan for:
a. From a resilience point of view, ensure that you have sufficient
follower replicas (brokers) for your partitions.
b. In my testing of Kafka (50 TB/week) so far, I haven't seen much issue with
CPU utilization or memory. I had 24 CPUs and 32 GB RAM.
c. 200,000 partitions…
Yes, we need that many partitions at a maximum, as we have a central messaging
service and thousands of topics.
On Friday, December 19, 2014, nitin sharma
wrote:
> hi,
>
> Few things you have to plan for:
> a. Ensure that from resilience point of view, you are having sufficient
> follower brokers for you…
Technically/conceptually it is possible to have 200,000 topics, but do you
really need it like that? What do you intend to do with those messages, i.e.
how do you foresee them being processed downstream? And are those topics really
there to segregate different kinds of processing or different ids…
We require:
- many topics
- ordering of messages for every topic
- Consumers hit different HTTP endpoints, which may be slow (in a push
model). In a pull model, consumers may pull at the rate at which
they can process.
- We need parallelism, hitting those endpoints with as many consumers as
possible. Hence, we currently…
See some comments inline.
On Fri, Dec 19, 2014 at 11:30 AM, Achanta Vamsi Subhash <
achanta.va...@flipkart.com> wrote:
>
> We require:
> - many topics
> - ordering of messages for every topic
>
Ordering is only on a per-partition basis, so you might have to pick a
partition key that makes sense for…
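As an illustration of picking such a key, a minimal sketch against the 0.8.2
Java producer; the topic name and the orderId key are invented for the
example, not taken from the thread:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class KeyedOrdering {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092"); // placeholder address
            props.put("key.serializer",
                      "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                      "org.apache.kafka.common.serialization.StringSerializer");
            KafkaProducer<String, String> producer = new KafkaProducer<>(props);

            // Records with the same key hash to the same partition, so
            // per-key ordering holds even with thousands of partitions.
            String orderId = "order-42"; // hypothetical business key
            producer.send(new ProducerRecord<>("order-events", orderId, "created"));
            producer.send(new ProducerRecord<>("order-events", orderId, "shipped"));
            producer.close();
        }
    }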
Wait, how do you get 2,000 topics each with 50 partitions == 1,000,000
partitions? I think you can take what I said below and change my 250 to 25,
as I went with your result (1,000,000) and not your arguments (2,000 x 50 =
100,000). And you should think of the processing as a separate step from fetch
and com…
Hi Jay,
Many thanks for the info. All that makes sense, but from an API
standpoint, when something is labelled async and returns a Future, this will
be misconstrued, and developers will place async sends in critical
client-facing request/response pathways of code that should never block. If
the app…
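To make the hazard concrete, a sketch against the 0.8.2 producer API (topic
and payload are invented): the Future only defers the I/O, and any get() on
the request path blocks the calling thread; the callback overload avoids
that, though send() itself can still block fetching metadata, which is the
bug Jay mentions later in this digest.

    import java.util.Properties;
    import java.util.concurrent.Future;
    import org.apache.kafka.clients.producer.Callback;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.clients.producer.RecordMetadata;

    public class SendModes {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092"); // placeholder
            props.put("key.serializer",
                      "org.apache.kafka.common.serialization.ByteArraySerializer");
            props.put("value.serializer",
                      "org.apache.kafka.common.serialization.ByteArraySerializer");
            KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props);
            byte[] payload = "hello".getBytes();

            // "Async" send that quietly blocks: get() waits for the broker ack.
            Future<RecordMetadata> f =
                producer.send(new ProducerRecord<byte[], byte[]>("events", payload));
            RecordMetadata md = f.get(); // never do this on a request/response path

            // Non-blocking alternative: handle the ack in a callback instead.
            producer.send(new ProducerRecord<byte[], byte[]>("events", payload),
                new Callback() {
                    public void onCompletion(RecordMetadata metadata, Exception e) {
                        if (e != null) {
                            e.printStackTrace(); // handle/log the failure
                        }
                    }
                });
            producer.close();
        }
    }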
Joe,
- Correction: it's 1,00,000 partitions.
- We can have at most one consumer per partition, not 50 per partition.
Yes, we have a hashing mechanism to support future partition increases as
well. We override the Default Partitioner.
- We use both the Simple and High Level consumers depending on the cons…
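For reference, a sketch of overriding the partitioner with the 0.8-era
(old Scala) producer API; the virtual-bucket detail is an assumption, since
the thread doesn't show Flipkart's actual scheme:

    import kafka.producer.Partitioner;
    import kafka.utils.VerifiableProperties;

    // Configured via partitioner.class=<fully qualified class name>
    // in the producer properties.
    public class HashedPartitioner implements Partitioner {
        // Kafka instantiates partitioners reflectively with this constructor.
        public HashedPartitioner(VerifiableProperties props) {}

        public int partition(Object key, int numPartitions) {
            // Hash into a fixed number of virtual buckets first so a future
            // partition increase remaps keys predictably (assumed scheme,
            // not necessarily what Flipkart does).
            int buckets = 1024;
            int bucket = (key.hashCode() & 0x7fffffff) % buckets;
            return bucket % numPartitions;
        }
    }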
Hi all,
I was wondering why every ProducerRecord sent requires a serialized
key. I am using Kafka to send opaque bytes, and I end up creating
garbage keys because I don't really have a good one.
Thanks,
Rajiv
Hi Rajiv,
You can send messages without keys. Just provide null for the key.
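For example, a sketch against the 0.8.2 producer API (topic name invented):

    import org.apache.kafka.clients.producer.ProducerRecord;

    // The two-argument constructor leaves the key null; the default
    // partitioner then spreads keyless records across partitions.
    byte[] payload = "opaque bytes".getBytes();
    ProducerRecord<byte[], byte[]> record =
        new ProducerRecord<byte[], byte[]>("my-topic", payload);
    // Equivalent to passing the key explicitly as null:
    // new ProducerRecord<byte[], byte[]>("my-topic", null, payload);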
Jiangjie (Becket) Qin
On 12/19/14, 10:14 AM, "Rajiv Kurian" wrote:
>Hi all,
>
>I was wondering why every ProducerRecord sent requires a serialized
>key. I am using Kafka to send opaque bytes and I am ending up crea…
Hey Paul,
I agree we should document this better.
We allow and encourage using partitions to semantically distribute data. So
unfortunately we can't just arbitrarily assign a partition (say 0) as that
would actually give incorrect answers for any consumer that made use of the
partitioning. It is…
Hi folks,
I am new to both Kafka and Storm, and I have a problem getting KafkaSpout to
read data from Kafka in our three-node environment with Kafka 0.8.1.1 and
Storm 0.9.3.
What is working:
- I have a Kafka producer (a Java application) that generates random strings to
a topic, and I was able to run the f…
@Joe, Achanta is using Indian English numerals which is why it's a little
confusing. http://en.wikipedia.org/wiki/Indian_English#Numbering_system
1,00,000 [1 lakh] (Indian English) == 100,000 [1 hundred thousand] (The
rest of the world :P)
On Fri Dec 19 2014 at 9:40:29 AM Achanta Vamsi Subhash <
a…
Hi,
I would like to get some feedback on design choices with Kafka consumers.
We have an application in which a consumer reads a message and the thread does
a number of things, including database accesses, before a message is
produced to another topic. The time between consuming and producing the
message…
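A sketch of that loop with the 0.8 high-level consumer, assuming manual
commits so the offset only advances after the downstream work; doWork() and
all names here are placeholders, not the poster's code:

    import java.util.Collections;
    import java.util.List;
    import java.util.Map;
    import java.util.Properties;
    import kafka.consumer.Consumer;
    import kafka.consumer.ConsumerConfig;
    import kafka.consumer.ConsumerIterator;
    import kafka.consumer.KafkaStream;
    import kafka.javaapi.consumer.ConsumerConnector;
    import kafka.message.MessageAndMetadata;

    public class ConsumeTransformProduce {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("zookeeper.connect", "zk:2181"); // placeholder
            props.put("group.id", "my-group");         // placeholder
            props.put("auto.commit.enable", "false");  // commit only after the work

            ConsumerConnector consumer =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
            Map<String, List<KafkaStream<byte[], byte[]>>> streams =
                consumer.createMessageStreams(
                    Collections.singletonMap("input-topic", 1));
            ConsumerIterator<byte[], byte[]> it =
                streams.get("input-topic").get(0).iterator();

            while (it.hasNext()) {
                MessageAndMetadata<byte[], byte[]> msg = it.next();
                byte[] result = doWork(msg.message()); // DB accesses etc.
                // produce `result` to the output topic here, then:
                consumer.commitOffsets(); // note: commits all partitions in 0.8
            }
        }

        static byte[] doWork(byte[] in) { return in; } // stand-in for real logic
    }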
Thanks, didn't know that.
On Fri, Dec 19, 2014 at 10:39 AM, Jiangjie Qin
wrote:
>
> Hi Rajiv,
>
> You can send messages without keys. Just provide null for key.
>
> Jiangjie (Becket) Qin
>
>
> On 12/19/14, 10:14 AM, "Rajiv Kurian" wrote:
>
> >Hi all,
> >
> >I was wondering why every Produce…
Hi Jay,
I have implemented a wrapper around the producer to behave like I want it
to. Where it diverges from the current 0.8.2 producer is that it accepts three
new inputs:
- A list of expected topics
- A timeout value for initializing metadata for those topics during producer
creation
- An option to blow up if…
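Roughly, something like this (a sketch of the idea only; Paul's actual code
isn't in the thread, and the class and parameter names are invented):

    import java.util.List;
    import java.util.Properties;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;
    import java.util.concurrent.TimeUnit;
    import org.apache.kafka.clients.producer.KafkaProducer;

    public class PreWarmedProducer {
        private final KafkaProducer<byte[], byte[]> producer;

        public PreWarmedProducer(Properties props, List<String> expectedTopics,
                                 long initTimeoutMs, boolean blowUp) {
            this.producer = new KafkaProducer<>(props);
            ExecutorService pool = Executors.newSingleThreadExecutor();
            try {
                for (final String topic : expectedTopics) {
                    // partitionsFor() blocks until metadata is available,
                    // so bound it with our own timeout.
                    Future<?> f = pool.submit(new Runnable() {
                        public void run() { producer.partitionsFor(topic); }
                    });
                    try {
                        f.get(initTimeoutMs, TimeUnit.MILLISECONDS);
                    } catch (Exception e) {
                        if (blowUp) {
                            throw new RuntimeException("No metadata for " + topic, e);
                        }
                    }
                }
            } finally {
                pool.shutdownNow();
            }
        }
    }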
Also, if log.cleaner.enable is true in your broker config, that enables the
log-compaction retention strategy.
Then, for topics with the per-topic "cleanup.policy=compact" config
parameter set, Kafka will scan the topic periodically, nuking old versions of
the data with the same key.
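For instance (illustrative names and values, not from the thread):

    # broker config (server.properties)
    log.cleaner.enable=true

    # per-topic override, e.g. when creating the topic:
    # bin/kafka-topics.sh --zookeeper zk:2181 --create --topic latest-values \
    #   --replication-factor 2 --partitions 8 --config cleanup.policy=compact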
Yeah, if you want to file a JIRA and post a patch for a new option, it's
possible others would want it. Maybe something like:
pre.initialize.topics=x,y,z
pre.initialize.timeout=x
The metadata fetch timeout is a bug...that behavior is inherited from
Object.wait, which defines zero to mean infinite.
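For anyone unfamiliar with that corner of the JDK, the contract he means:

    Object lock = new Object();
    synchronized (lock) {
        try {
            // wait(0) does NOT mean "return immediately": per
            // java.lang.Object, a timeout of zero waits indefinitely
            // (until notify/interrupt), which is the behavior the
            // producer's metadata wait inherited.
            lock.wait(0);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }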