try https://www.confluent.io/ - that's what they do
/svante

2018-03-02 21:21 GMT+01:00 Matt Stone <mst...@nexeohr.com>:

> We are looking for a consultant or contractor who can come onsite to our
> Ogden, Utah location in the US to help with a Kafka setup and maintenance
> project. What we need is someone with the knowledge and experience to
> build out the Kafka environment from scratch.
>
> We are thinking they would need to be onsite for 6-12 months to set it
> up and mentor some of our team so they can get up to speed to do the
> maintenance once the contractor is gone. If you have experience
> setting up Kafka from scratch in a Linux environment, maintaining node
> clusters, and training others on the team how to do it, and you are
> interested in a long-term project working at the client site, I would love
> to start a discussion to see if we could use you for the role.
>
> I would also be interested in hearing about any consulting firms that
> might have resources that could help with this role.
>
> Matt Stone
>
>
> -----Original Message-----
> From: Matt Daum [mailto:m...@setfive.com]
> Sent: Friday, March 2, 2018 1:11 PM
> To: users@kafka.apache.org
> Subject: Re: Kafka Setup for Daily counts on wide array of keys
>
> Actually, it looks like the better way would be to output the counts to a
> new topic and then ingest that topic into the DB itself. Is that the
> correct way?
>
> On Fri, Mar 2, 2018 at 9:24 AM, Matt Daum <m...@setfive.com> wrote:
>
> > I am new to Kafka, but I think I have a good use case for it. I am
> > trying to build daily counts of requests based on a number of
> > different attributes in a high-throughput system (~1 million
> > requests/sec. across all 8 servers). The different attributes are
> > unbounded in terms of values, and some will spread across hundreds of
> > millions of values. This is my current thought process; let me know
> > where I could be more efficient or if there is a better way to do it.
> >
> > I'll create an Avro object "Impression" which has all the attributes
> > of the inbound request. On each request, my application servers will
> > create and send this to a single Kafka topic.
> >
> > I'll then have a consumer which creates a stream from the topic. From
> > there I'll use windowed timeframes and groupBy to group by the
> > attributes on each given day. At the end of the day I'd need to read
> > the data store out to an external system for storage. Since I won't
> > know all the values, I'd need something similar to KVStore.all(),
> > but for windowed KV stores. It appears this will be possible in 1.1
> > with this commit:
> > https://github.com/apache/kafka/commit/1d1c8575961bf6bce7decb049be7f10ca76bd0c5
> >
> > Is this the best approach? Or would I be better off using the stream
> > to listen and then an external DB like Aerospike to store the counts
> > and read out of it directly at the end of the day?
> >
> > Thanks for the help!
> > Daum
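For what it's worth, the windowed daily count Matt describes maps fairly directly onto the Kafka Streams DSL. Below is a minimal sketch against the 1.x Streams API; the topic names, the plain String values standing in for the Avro Impression records, and the attribute extraction in groupBy are placeholders for illustration, not anything from this thread:

import java.util.Properties;
import java.util.concurrent.TimeUnit;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.kstream.TimeWindows;

public class DailyImpressionCounts {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "daily-impression-counts");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();

        // "impressions" is a stand-in topic name; in the real setup the
        // values would be the Avro Impression records, not plain strings.
        KStream<String, String> impressions = builder.stream("impressions");

        impressions
            // Re-key each record by the attribute being counted; here the
            // whole value stands in for that attribute.
            .groupBy((key, value) -> value)
            // Tumbling one-day windows give a separate count per attribute
            // value per day.
            .windowedBy(TimeWindows.of(TimeUnit.DAYS.toMillis(1)))
            .count()
            // Flatten the windowed key and emit the counts to an output
            // topic for downstream ingestion into the external store.
            .toStream((windowedKey, count) ->
                windowedKey.key() + "@" + windowedKey.window().start())
            .to("impression-counts-daily", Produced.with(Serdes.String(), Serdes.Long()));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}

One caveat: count() is backed by a windowed state store (RocksDB by default), so with hundreds of millions of distinct attribute values the store sizing deserves attention up front. Emitting the counts to an output topic and ingesting that into the external DB, as Matt suggested, is the usual pattern for getting them out rather than scanning the store at end of day.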