Re: Kafka Setup for Daily counts on wide array of keys

2018-03-05 Thread Thakrar, Jayesh
And not to overthink this, but as I'm new to Kafka and Streams, I want to make sure that it makes the most sense for my use case. With the streams and grouping, it looks like

Re: Kafka Setup for Daily counts on wide array of keys

2018-03-05 Thread Matt Daum
> Ah good call, so you are really having an AVRO wrapper around your sing

Re: Kafka Setup for Daily counts on wide array of keys

2018-03-05 Thread Thakrar, Jayesh
From: Matt Daum <m...@setfive.com> Date: Monday, March 5, 2018 at 5:54 AM To: "Thakrar, Jayesh" <jthak...@conversantmedia.com> Cc: "users@kafka.apache.org" <users@kafka.apache.org> Subject: Re: Kafka Setup f

Re: Kafka Setup for Daily counts on wide array of keys

2018-03-05 Thread Matt Daum
> izing that overhead across many rows.

Re: Kafka Setup for Daily counts on wide array of keys

2018-03-05 Thread Thakrar, Jayesh
Thanks for the suggestions! It does look like it's using local RocksDB stores for the state info by default. Will look into using

Re: Kafka Setup for Daily counts on wide array of keys

2018-03-05 Thread Matt Daum
> I don’t have any experience/knowledge on the K

Re: Kafka Setup for Daily counts on wide array of keys

2018-03-04 Thread Thakrar, Jayesh
I don’t have any experience with or knowledge of the Kafka inbuilt datastore, but I believe that for some portions of streaming, Kafka uses (used?) RocksDB to locally store som

Re: Kafka Setup for Daily counts on wide array of keys

2018-03-04 Thread Thakrar, Jayesh
Thanks! For the counts, I'd need to use a global table to make sure it's across all the data, right? Also having milli

Re: Kafka Setup for Daily counts on wide array of keys

2018-03-04 Thread Matt Daum
> We actually don't have a Kafka cluster set up y

Re: Kafka Setup for Daily counts on wide array of keys

2018-03-04 Thread Thakrar, Jayesh
r. Jayesh > We actually don't have a Kafka cluster set up yet at all. Right now just have 8 of our applicati

Re: Kafka Setup for Daily counts on wide array of keys

2018-03-04 Thread Matt Daum
We actually don't have a Kafka cluster set up yet at all. Right now we just have 8 of our application servers. We currently sample some impressions and then dedupe/count them outside, at a different DC, but we are looking to analyze all impressions for some overall analytics. Our requests are around 1

Re: Kafka Setup for Daily counts on wide array of keys

2018-03-03 Thread Thakrar, Jayesh
Matt, if I understand correctly, you have an 8-node Kafka cluster and need to support about 1 million requests/sec into the cluster from source servers, and you expect to consume that for aggregation. How big are your messages? I would suggest looking into batching multiple requests per single Kafka m
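The batching suggestion above can be illustrated with a minimal sketch. It only shows the packing and unpacking of many request records into one message payload; it does not talk to Kafka. JSON is used here purely as a stand-in encoding (the thread later mentions an AVRO wrapper), and the names `batch_requests`, `unpack_batch`, and `max_batch` are hypothetical, not part of any Kafka API.

```python
import json
from typing import Iterable, Iterator, List


def batch_requests(requests: Iterable[dict], max_batch: int = 500) -> Iterator[bytes]:
    """Group request records into payloads of at most max_batch records each.

    Each yielded payload would be sent as the value of ONE Kafka message,
    so per-message overhead is spread across many requests.
    JSON is an assumption made for this sketch only.
    """
    batch: List[dict] = []
    for req in requests:
        batch.append(req)
        if len(batch) >= max_batch:
            yield json.dumps(batch).encode("utf-8")
            batch = []
    if batch:
        # Flush the final, possibly smaller, batch.
        yield json.dumps(batch).encode("utf-8")


def unpack_batch(payload: bytes) -> List[dict]:
    """Consumer side: recover the individual request records from one payload."""
    return json.loads(payload.decode("utf-8"))
```

On the consuming side, each message would be unpacked with `unpack_batch` before the individual requests are aggregated. A real producer would likely also flush on a time limit, not only on a record count.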

Re: Kafka Setup for Daily counts on wide array of keys

2018-03-02 Thread Matt Daum
Actually, it looks like the better way would be to output the counts to a new topic and then ingest that topic into the DB itself. Is that the correct way? On Fri, Mar 2, 2018 at 9:24 AM, Matt Daum wrote: > I am new to Kafka but I think I have a good use case for it. I am trying to build daily co

Kafka Setup for Daily counts on wide array of keys

2018-03-02 Thread Matt Daum
I am new to Kafka, but I think I have a good use case for it. I am trying to build daily counts of requests based on a number of different attributes in a high-throughput system (~1 million requests/sec across all 8 servers). The different attributes are unbounded in terms of values, and some wi
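As a rough illustration of the kind of aggregation described above, here is a minimal in-memory sketch of daily counts keyed by (day, attribute, value). It is not a Kafka Streams topology and would not scale to the stated throughput on its own. The event shape (a `ts` field in epoch seconds plus arbitrary attribute fields) and the function name `daily_counts` are assumptions made for the example.

```python
from collections import Counter
from datetime import datetime, timezone
from typing import Iterable, Sequence


def daily_counts(events: Iterable[dict], attributes: Sequence[str]) -> Counter:
    """Count events per (UTC day, attribute name, attribute value).

    Assumes each event carries a 'ts' field in epoch seconds; attributes
    missing from an event are simply skipped.
    """
    counts: Counter = Counter()
    for ev in events:
        day = datetime.fromtimestamp(ev["ts"], tz=timezone.utc).strftime("%Y-%m-%d")
        for attr in attributes:
            if attr in ev:
                counts[(day, attr, ev[attr])] += 1
    return counts
```

Because attribute values are unbounded, the number of keys in such a counter grows with the data, which is why the thread goes on to discuss where the state is stored.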