Nilesh,
It's expected that a lot of memory is used for cache. This makes sense
because under the hood, Kafka mostly just reads and writes data to/from
files. While Kafka does manage some in-memory data, mostly it is writing
produced data (or replicated data) to log files and then serving those sam
I think these would definitely be useful statistics to have and I've tried
to do similar tests! The biggest difference is probably going to be the
hardware specs on whatever cluster you decide to run it on. Maybe
benchmarks performed on different AWS servers would be helpful too, but I'd
like to se
I would be using the servers available at my place of work. I don't have
access to AWS servers. I would be starting off with a small number of nodes in
the cluster and then plot a graph with x-axis as the number of servers in
the cluster and y-axis as the number of topics with partitions, before the
cl
@Jiefu Gong,
Are the results of your tests available publicly?
Regards,
Prabhjot
On Tue, Jul 28, 2015 at 10:35 PM, Prabhjot Bharaj
wrote:
> I would be using the servers available at my place of work. I don't have
> access to AWS servers. I would be starting off with a small number of nodes in
> th
Hi,
I'm using Mirror Maker with a cluster of 3 nodes and cluster of 5 nodes.
I would like to ask - is the number of nodes a restriction for Mirror Maker?
Also, are there any other restrictions or properties that should be common
across both clusters so that they continue mirroring?
I'm aski
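As far as I know, Mirror Maker itself doesn't require the two clusters to have the same node count: it is essentially a consumer group on the source cluster feeding a producer on the target cluster, so a 3-node source can mirror to a 5-node target. A minimal launch sketch for the 0.8-era tooling (the file names, host names, and the catch-all whitelist below are assumptions, not values from this thread):

```shell
# sourceCluster.consumer.properties (file name is an assumption)
zookeeper.connect=source-zk:2181
group.id=mirror-maker-group

# targetCluster.producer.properties
metadata.broker.list=target-broker:9092

# Launch: consume everything from the source, re-produce to the target.
bin/kafka-mirror-maker.sh \
  --consumer.config sourceCluster.consumer.properties \
  --producer.config targetCluster.producer.properties \
  --whitelist=".*"
```

The properties that matter most in practice are the consumer's zookeeper.connect (source cluster) and the producer's broker list (target cluster); topic-level settings like partition counts do not have to match.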
I'm trying to get a basic consumer off the ground. I can create the consumer
but I can't do anything at the message level:
consumer = KafkaConsumer(topic,
                         group_id=group_id,
                         bootstrap_servers=[ip + ":" + port])
for m in consumer:
    print "x"
Hi all,
Currently working on building some tools revolving around the Simple
Consumer, and I was wondering if anyone had any good documentation or
examples for it? I am aware that this:
https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example
exists but it doesn't give any in
Can you confirm that there are indeed messages in the topic that you
published to?
bin/kafka-console-consumer.sh --zookeeper [details] --topic [topic]
--from-beginning
That should be the right command, and you can use that to first verify that
messages have indeed been published to the topic in q
Thank you. It looks like I had the 'topic' slightly wrong. I didn't realize it
was case-sensitive. I got past that error, but now I'm bumping up against
another error:
Traceback (most recent call last):
...
File "/usr/local/lib/python2.7/dist-packages/kafka/consumer/kafka.py", line
59, in _
This won't be very helpful as I am not too experienced with Python or your
use case, but Java consumers do indeed have to create a String out of the
byte array returned from a successful consumption, like:
String actualMsg = new String(messageAndMetadata.message());
Consult something like this as i
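The same step looks like this in Python, since the original poster is on kafka-python: Kafka hands message values back as raw bytes, and decoding them is up to the client. A minimal sketch with no broker needed (the literal byte string below just stands in for a consumed message value):

```python
# Kafka delivers message payloads as raw bytes; decoding is the client's job.
raw_value = b"hello kafka"  # stands in for messageAndMetadata.message()

# Python equivalent of Java's `new String(bytes)` (UTF-8 assumed):
actual_msg = raw_value.decode("utf-8")

print(actual_msg)  # hello kafka
```

If the producer wrote something other than UTF-8 text (JSON, Avro, raw binary), substitute the matching deserialization step here.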
Hi Keith,
kafka-python raises FailedPayloadsError on unspecified server failures.
Typically this is caused by a server exception that results in a 0 byte
response. Have you checked your server logs?
-Dana
On Tue, Jul 28, 2015 at 2:01 PM, JIEFU GONG wrote:
> This won't be very helpful as I am
Hello Jason,
Thanks for the reply!
About your proposal: in the general case it might be helpful. In my case it
will not help much - I'm allowing each ConsumerRecord or subset of
ConsumerRecords to be processed and ACKed independently, outside the HLC
process/thread (so as not to block the partition), and then commit
Correct me if I'm wrong. If compaction is used, +1 to indicate the next
offset is no longer valid: in the compacted section the offsets are not
sequentially increasing. I think you need to read the next offset of the
last processed record to figure out what the next offset will be
On Wed, 29 Jul 2015 at
It seems less weird if you think of the offset as the position of the
consumer, i.e. it is "on" record 5. In some sense the consumer is actually
in between records, i.e. if it has processed 4 and not processed 5 do you
think about your position as being on 4 or on 5? Well not on 4 because it
alread
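That "position between records" reading can be sketched without a broker. Below is a toy model, with illustrative names only (this is not the Kafka consumer API): after processing the record at offset 4, the position, which is also the value you would commit, is 5, the offset of the next record to read.

```python
class ToyPartition:
    """Toy model of a consumer's position in one partition.

    Illustrates that the committed offset is the offset of the *next*
    record to consume, not the last one read, so a restart resumes at
    the first unprocessed record.
    """

    def __init__(self):
        self.position = 0  # offset of the next record to read

    def poll_one(self, log):
        """Return the next record and advance the position past it."""
        record = log[self.position]
        self.position += 1
        return record

    def committed_offset(self):
        # Committing the current position, not the last-read offset.
        return self.position


log = ["r0", "r1", "r2", "r3", "r4", "r5"]
p = ToyPartition()
for _ in range(5):  # process the records at offsets 0..4
    p.poll_one(log)

print(p.committed_offset())  # 5: record 4 is done, record 5 is next
```

Note this toy model assumes contiguous offsets; as mentioned earlier in the thread, a compacted log breaks that assumption, and there the safe value to commit is the next offset reported for the last processed record.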
I haven’t found a way to specify that a consumer should read from the
beginning, or from any other explicit offset, or that the offset should be
“reset” in any way. The command-line shell scripts (which I believe simply
wrap the Scala tools) have flags for this sort of thing. Is there any way
Hi Kafka experts,
I am trying to understand how the Kafka new producer works. The new
producer code/comments indicate the producer is thread safe, meaning we
could have multiple clients sharing one KafkaProducer. Is RecordAccumulator
the only place that takes care of thread synchronization for KafkaProd
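A toy model may make the question concrete. This is not Kafka's code, and I can't say whether RecordAccumulator is the real producer's only synchronization point; it is just an illustration of how one lock-protected accumulator lets many caller threads safely share a single producer object.

```python
import threading
from collections import defaultdict


class ToyAccumulator:
    """Toy stand-in for the new producer's accumulator idea.

    Caller threads append records concurrently; a lock keeps the
    per-partition batches consistent. The real RecordAccumulator is
    more sophisticated (per-batch locking, memory limits, a separate
    sender thread draining batches to the network).
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._batches = defaultdict(list)  # partition -> list of records

    def append(self, partition, record):
        with self._lock:
            self._batches[partition].append(record)

    def drain(self):
        """Atomically take all pending batches (what a sender would do)."""
        with self._lock:
            batches, self._batches = self._batches, defaultdict(list)
            return batches


acc = ToyAccumulator()


def worker(thread_id):
    for i in range(1000):
        acc.append(i % 4, (thread_id, i))


threads = [threading.Thread(target=worker, args=(t,)) for t in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

drained = acc.drain()
total = sum(len(v) for v in drained.values())
print(total)  # 8000: no appends lost across 8 concurrent threads
```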
Hi Keith,
you can use the `auto_offset_reset` parameter to kafka-python's
KafkaConsumer. It behaves the same as the java consumer configuration of
the same name. See
http://kafka-python.readthedocs.org/en/latest/apidoc/kafka.consumer.html#kafka.consumer.KafkaConsumer.configure
for more details on
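A minimal sketch of wiring that parameter up, assuming kafka-python's KafkaConsumer. The broker address, group id, and topic below are placeholders, and note that older kafka-python releases spell the values 'smallest'/'largest' rather than 'earliest'/'latest'.

```python
# Hypothetical connection details; substitute your own cluster's.
config = {
    "group_id": "my-group",
    "bootstrap_servers": ["localhost:9092"],
    # With no committed offset (or an out-of-range one), start from the
    # beginning of the log. Older kafka-python versions use "smallest".
    "auto_offset_reset": "earliest",
}

# With a reachable broker, this would start reading from the beginning:
# from kafka import KafkaConsumer
# consumer = KafkaConsumer("my-topic", **config)
```

Note that auto_offset_reset only applies when there is no valid committed offset for the group; to force a re-read with an existing group id, you would need to reset or change the group.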
Hi Ewen,
Thanks for the reply.
The assumptions you made for replication and partitions are correct:
120 is the total number of partitions, and the replication factor is 1
for all the topics.
Does that mean that a broker will keep all the messages that are
produced in memory, or will only the unconsumed m