This didn't change in 0.8.2, unfortunately. What I typically do with the high-level consumer is read messages into my own buffer, and once I'm done processing them with no errors, I clear my buffer, commit offsets, and read more messages from Kafka. This way, if I hit errors I can retry from my buffer. If I crash and the buffer is gone, the consumer will re-read those messages, since the offsets were not committed yet. It would have been nice if the consumer handled this for me, but managing the buffer is not bad.

Gwen
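For reference, here is a minimal, untested sketch of that pattern, assuming the 0.8.x high-level consumer API with auto.commit.enable=false. The ZooKeeper address, group id, topic name, batch size and process() method are placeholders, not anything from this thread:

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

public class BufferedConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "localhost:2181");  // placeholder address
        props.put("group.id", "my-group");                 // placeholder group id
        props.put("auto.commit.enable", "false");          // we commit manually below

        ConsumerConnector connector =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
        Map<String, Integer> topicCount = Collections.singletonMap("my-topic", 1);
        KafkaStream<byte[], byte[]> stream =
                connector.createMessageStreams(topicCount).get("my-topic").get(0);

        List<byte[]> buffer = new ArrayList<byte[]>();
        ConsumerIterator<byte[], byte[]> it = stream.iterator();
        while (it.hasNext()) {
            buffer.add(it.next().message());
            if (buffer.size() >= 100) {        // arbitrary batch size
                process(buffer);               // may throw; nothing is committed yet, so a
                                               // crash here means these messages get re-read
                connector.commitOffsets();     // commit only after successful processing
                buffer.clear();
            }
        }
    }

    private static void process(List<byte[]> batch) {
        // application-specific processing goes here
    }
}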
On Sun, Feb 8, 2015 at 10:38 AM, Christopher Piggott <cpigg...@gmail.com> wrote:
> Have there been any changes in 0.8.2 to how the marker gets moved when
> you use the high-level consumer?
>
> One problem I have always had was: what if I pull something from the
> stream, but then I have an error processing it? I don't really want to
> move the marker.
>
> I would almost like the client to have a callback mechanism for
> processing, where the marker only gets moved if the high-level consumer
> successfully runs my callback/processor (with no exceptions, at least).
>
> On Sun, Feb 8, 2015 at 9:49 AM, Gwen Shapira <gshap...@cloudera.com> wrote:
> > On Sun, Feb 8, 2015 at 6:39 AM, Christopher Piggott <cpigg...@gmail.com> wrote:
> > > > The consumer used Zookeeper to store offsets, in 0.8.2 there's an
> > > > option to use Kafka itself for that (by setting *offsets.storage =
> > > > kafka*).
> > >
> > > Do the offsets still really live in ZooKeeper, with Kafka proxying
> > > the requests through?
> >
> > They don't live in Zookeeper. They live in an internal Kafka topic
> > (__consumer_offsets).
> >
> > For migration purposes, you can set dual.commit.enable = true and then
> > offsets will be stored in both Kafka and ZK, but the intention is to
> > migrate to 100% Kafka storage.
> >
> > > On Sun, Feb 8, 2015 at 9:25 AM, Gwen Shapira <gshap...@cloudera.com> wrote:
> > > > Hi Eduardo,
> > > >
> > > > 1. "Why do applications sometimes prefer to connect to ZooKeeper
> > > > instead of the brokers?"
> > > >
> > > > I assume you are talking about the clients and some of our tools?
> > > > These are parts of an older design and we are actively working on
> > > > fixing this. The consumer used Zookeeper to store offsets; in 0.8.2
> > > > there's an option to use Kafka itself for that (by setting
> > > > *offsets.storage = kafka*). We are planning on fixing the tools in
> > > > 0.9, but obviously they are less performance sensitive than the
> > > > consumers.
> > > >
> > > > 2. Regarding your tests and disk usage - I'm not sure exactly what
> > > > fills your disk. If it's the Kafka logs (i.e. log.dir), then we
> > > > expect to store the size of all messages sent, times the replication
> > > > factor configured for each topic. We keep messages for the amount of
> > > > time specified in the *log.retention* parameters. If the disk fills
> > > > within minutes, either set log.retention.minutes very low (at the
> > > > risk of losing data if consumers need a restart), or make sure your
> > > > disk capacity matches the rate at which producers send data.
> > > >
> > > > Gwen
> > > >
> > > > On Sat, Feb 7, 2015 at 3:01 AM, Eduardo Costa Alfaia
> > > > <e.costaalf...@unibs.it> wrote:
> > > > > Hi Guys,
> > > > >
> > > > > I have some doubts about Kafka. The first is: why do applications
> > > > > sometimes prefer to connect to ZooKeeper instead of the brokers?
> > > > > Connecting to ZooKeeper could create overhead, because we are
> > > > > inserting another element between producer and consumer.
> > > > > Another question is about the information sent by the producer: in
> > > > > my tests the producer sends messages to the brokers and within a
> > > > > few minutes my hard disk (250 GB) is full. Is there something I can
> > > > > do in the configuration to minimize this?
> > > > >
> > > > > Thanks
> > > > > --
> > > > > Privacy notice: http://www.unibs.it/node/8155
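To make the offsets.storage point above concrete, a 0.8.2 high-level consumer could be configured roughly like this. This is only a sketch; the ZooKeeper address and group id are placeholders:

import java.util.Properties;
import kafka.consumer.ConsumerConfig;

public class OffsetStorageConfig {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "localhost:2181");  // placeholder address
        props.put("group.id", "my-group");                 // placeholder group id
        props.put("offsets.storage", "kafka");      // commit offsets to the internal offsets topic
        props.put("dual.commit.enable", "true");    // also commit to ZooKeeper while migrating
        ConsumerConfig config = new ConsumerConfig(props);  // pass to createJavaConsumerConnector(...)
    }
}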
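And on the full 250 GB disk: the required log.dir capacity is roughly the producer ingress rate, times the replication factor, times the retention period. A back-of-the-envelope calculation, using made-up example numbers rather than Eduardo's actual rates:

public class RetentionSizing {
    public static void main(String[] args) {
        // All numbers below are assumed examples, not measurements from the thread.
        long ingressBytesPerSec = 200L * 1024 * 1024;   // assume 200 MB/s aggregate producer rate
        int replicationFactor = 3;                      // assume replication factor 3
        long retentionSeconds = 24L * 60 * 60;          // assume log.retention.hours=24
        long diskBytes = 250L * 1000 * 1000 * 1000;     // the 250 GB disk mentioned in the thread

        long requiredBytes = ingressBytesPerSec * replicationFactor * retentionSeconds;
        System.out.printf("Retaining 24h needs roughly %.1f TB across the cluster%n",
                requiredBytes / 1e12);                          // ~54.4 TB in this example
        System.out.printf("One 250 GB disk fills in about %.1f minutes (ignoring replication)%n",
                diskBytes / (double) ingressBytesPerSec / 60);  // ~19.9 minutes
        // Either lower log.retention.minutes / log.retention.bytes, add capacity,
        // or reduce the producer rate.
    }
}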