Kane,

The built-in offset management is already in the master branch and will be included in 0.8.2. For now you can give the current trunk a spin.

Guozhang

On Fri, Aug 8, 2014 at 1:42 PM, Kane Kane <kane.ist...@gmail.com> wrote:

> Hello Guozhang,
>
> Is storing offsets in kafka topic already in master branch?
> We would like to use that feature, when do you plan to release 0.8.2?
> Can we use master branch meanwhile (i.e. is it stable enough).
>
> Thanks.
>
> On Fri, Aug 8, 2014 at 1:38 PM, Guozhang Wang <wangg...@gmail.com> wrote:
>
> > Hi Roman,
> >
> > Current Kafka messaging guarantee is at-least once, and we are working on
> > transactional messaging features to make it exactly once. We are expecting
> > it to be used as synchronization/replication layer for some storage systems
> > as your use case after that.
> >
> > As for your design, since you will probably have a lot of users and each
> > user's data is small, you will end up with many small files on Kafka. If
> > all you want is order preserving per user, you can probably just use
> > keyed-messages with key as the user id, by that all messages with the same
> > key will end up into the same partition and hence consumed by the same
> > consumer client. With that you only need a fixed small number of partitions.
> >
> > Guozhang
> >
> > On Fri, Aug 8, 2014 at 12:35 PM, Roman Iakovlev <roman.iakov...@live.com> wrote:
> >
> >> Dear all,
> >>
> >> I'm new to Kafka, and I'm considering using it for a maybe not very usual
> >> purpose. I want it to be a backend for data synchronization between a
> >> magnitude of devices, which are not always online (mobile and embedded
> >> devices). All the synchronized information belong to some user, and can be
> >> identified by the user id. There are several data types, and a user can
> >> have many entries of each data type coming from many different devices.
> >>
> >> This solution has to scale up to hundreds of thousands of users, and, as
> >> far as I understand, Kafka stores every partition in a single file. I've
> >> been thinking about creating a topic for every data type and a separate
> >> partition for every user. Amount of data stored by every user is no more
> >> than several megabytes over the whole lifetime, because the data stored
> >> would be keyed messages, and I'm expecting it to be compacted.
> >>
> >> So what I'm wondering is, would Kafka be a right approach for such task,
> >> and if yes, would this architecture (one topic per data type and one
> >> partition per user) scale to specified extent?
> >>
> >> Thanks,
> >>
> >> Roman.
> >
> > --
> > -- Guozhang
>
> --

-- Guozhang
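
For reference, the keyed-message suggestion in the quoted reply (use the user id as the message key so that all of a user's messages land in the same partition) can be sketched roughly as below. This is only an illustration of the idea, not Kafka's actual partitioner: the CRC32 hash, the partition count of 8, and the names `partition_for` and `route` are all assumptions made up for this sketch.

```python
import zlib

NUM_PARTITIONS = 8  # hypothetical: a small, fixed number of partitions


def partition_for(user_id: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a message key (the user id) to a partition using a stable hash.

    Illustrative only; a real Kafka producer applies its own partitioner.
    """
    return zlib.crc32(user_id.encode("utf-8")) % num_partitions


def route(messages, num_partitions: int = NUM_PARTITIONS):
    """Group (user_id, payload) pairs by partition, keeping arrival order."""
    partitions = {p: [] for p in range(num_partitions)}
    for user_id, payload in messages:
        partitions[partition_for(user_id, num_partitions)].append((user_id, payload))
    return partitions


if __name__ == "__main__":
    msgs = [("alice", "v1"), ("bob", "v1"), ("alice", "v2")]
    for partition, batch in route(msgs).items():
        if batch:
            print(partition, batch)
```

Because the key-to-partition mapping is deterministic, every message for a given user ends up in one partition in the order it was sent, and the number of partitions stays fixed no matter how many users there are.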