Those are good questions. See my answers inlined below.

Thanks,

Jun

On Fri, Jul 18, 2014 at 1:33 PM, shweta khare <shweta.p.kh...@gmail.com> wrote:

> hi,
>
> I have the following doubts regarding some kafka config parameters:
>
> For example if I have a Throughput topic with replication factor 1 and a
> single partition 0, then I will see the following files under
> /tmp/kafka-logs/Throughput_0:
>
> 00000000000000000000.index
> 00000000000000000000.log
>
> 00000000000070117826.index
> 00000000000070117826.log
>
> 1) *log.delete.delay.ms:*
>
> > The period of time we hold log files around after they are removed from the
> > *index*. This period of time allows any in-progress reads to complete
> > uninterrupted without locking. [6000]
>
> In the above description, does “*index*” refer to the in-memory
> segment-list and not the 00000****.index file (in example above)?
>
> As per documentation, kafka maintains an in-memory segment list:
>
> > To enable read operations, kafka maintains an in-memory range (segment
> > list) for each file. To avoid locking reads while still allowing deletes
> > that modify the segment list we use a copy-on-write style segment list
> > implementation that provides consistent views to allow a binary search to
> > proceed on an immutable static snapshot view of the log segments while
> > deletes are progressing.

Yes, this refers to the in-memory segment list, not the .index file.

> 2) *socket.request.max.bytes:* The maximum request size the server will
> allow.
>
> how is this different from message.max.bytes (The maximum size of a message
> that the server can receive.)

A request can consist of data from multiple topic partitions and therefore can contain many messages. A request bigger than socket.request.max.bytes will be rejected.
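
To make the difference between the two limits concrete, here is a rough sketch of the two checks. It is not broker code: the class, the method, and both limit values are made up for illustration, and it ignores request headers and per-message overhead. It assumes the whole request is measured against socket.request.max.bytes first, and each message inside it is then measured against message.max.bytes.

```java
// Hypothetical model of the two size limits; not taken from the Kafka source.
class RequestSizeCheck {
    // Illustrative values only; the real ones come from the broker configuration.
    static final int MESSAGE_MAX_BYTES = 1000;        // stands in for message.max.bytes
    static final int SOCKET_REQUEST_MAX_BYTES = 5000; // stands in for socket.request.max.bytes

    /**
     * Each inner array holds the message sizes (in bytes) destined for one
     * topic partition. Returns "ok" or the name of the limit that was exceeded.
     */
    static String check(int[][] messageSizesPerPartition) {
        // The request as a whole is the sum over all partitions it carries.
        long total = 0;
        for (int[] partition : messageSizesPerPartition) {
            for (int size : partition) {
                total += size;
            }
        }
        if (total > SOCKET_REQUEST_MAX_BYTES) {
            return "request too large";
        }
        // Only after the request is accepted is each message checked individually.
        for (int[] partition : messageSizesPerPartition) {
            for (int size : partition) {
                if (size > MESSAGE_MAX_BYTES) {
                    return "message too large";
                }
            }
        }
        return "ok";
    }
}
```

So a request made up entirely of messages that each fit within message.max.bytes can still be rejected if their sum exceeds socket.request.max.bytes.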

> 3) *fetch.wait.max.ms:*
>
> > The maximum amount of time the *server* will block before answering the
> > fetch request if there isn't sufficient data to immediately satisfy
> > fetch.min.bytes
>
> Does the server above refer to kafka consumer, which will block for
> fetch.wait.max.ms? How is fetch.wait.max.ms different from
> *consumer.timeout.ms*?

fetch.wait.max.ms is used in the server and consumer.timeout.ms is used in the consumer client in case the server doesn't send a response in time. consumer.timeout.ms should be larger than fetch.wait.max.ms.

> 4) Is there any correlation between a producer's
> *queue.buffering.max.messages* and *send.buffer.bytes*?

The former controls how many messages are grouped into a produce request and the latter controls the socket buffer size.

> 5) Will batching not happen in case producer.type=async and
> request.required.acks=1 or -1? Since next message will only be sent after
> an ack is received from leader/all ISR replicas?

Batching is independent of the ack mode. We simply group multiple messages into a single produce request. The ack mode is used in the produce request.
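
As a rough illustration of why the two are independent, the sketch below groups queued messages into requests and simply stamps the ack setting on each one. Everything here is hypothetical (the ProduceRequest class, the batch method, and the batchSize parameter are stand-ins, not the producer's actual classes); the point is only that the grouping logic never looks at the ack value.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of async batching; not the actual producer implementation.
class BatchingSketch {
    /** Stand-in for a produce request: one ack setting, many messages. */
    static final class ProduceRequest {
        final int requiredAcks;
        final List<String> messages;

        ProduceRequest(int requiredAcks, List<String> messages) {
            this.requiredAcks = requiredAcks;
            this.messages = messages;
        }
    }

    /** Splits queued messages into requests of at most batchSize messages each. */
    static List<ProduceRequest> batch(List<String> queued, int batchSize, int requiredAcks) {
        List<ProduceRequest> requests = new ArrayList<>();
        for (int i = 0; i < queued.size(); i += batchSize) {
            int end = Math.min(i + batchSize, queued.size());
            // The ack mode travels with the request; it does not affect the grouping.
            requests.add(new ProduceRequest(requiredAcks, new ArrayList<>(queued.subList(i, end))));
        }
        return requests;
    }
}
```

With five queued messages and a batch size of two, this yields three requests whether requiredAcks is 0, 1 or -1; the ack then applies to each request as a whole rather than to each message.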

> 6) *topic.metadata.refresh.interval.ms:*
> After every 10 mins I see the following on my producer side:
>
> 1200483 [main] INFO kafka.client.ClientUtils$ - Fetching metadata from
> broker id:0,host:localhost,port:9092 with correlation id 15078270 for 1
> topic(s) Set(Throughput)
>
> 1200484 [main] INFO kafka.producer.SyncProducer - Connected to
> localhost:9092 for producing
>
> 1200486 [main] INFO kafka.producer.SyncProducer - Disconnecting from
> localhost:9092
>
> 1200486 [main] INFO kafka.producer.SyncProducer - Disconnecting from
> sdp08:9092
>
> 1200487 [main] INFO kafka.producer.SyncProducer - Connected to sdp08:9092
> for producing
>
> Why is there a disconnection and re-connection happening on each metadata
> refresh even though the leader is alive? I have noticed that I lose some
> messages when this happens (with request.required.acks=0)?

Yes, currently we close the connection after issuing metadata requests so that we don't hold idle connections open. Refreshing metadata periodically is useful for picking up changes like increases in the number of partitions in a topic. The data loss you saw is related to acks=0. For details, see the explanation in http://kafka.apache.org/documentation.html#producerconfigs.

> thank you,
> shweta
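
For reference, a minimal sketch of producer settings that keep the 10 minute metadata refresh but ask the leader for an acknowledgement instead of using acks=0. It only builds a plain java.util.Properties object with the keys discussed in this thread; the class and method names are made up, the values are examples rather than recommendations, and a real producer would need further settings (such as the broker list) that are not shown.

```java
import java.util.Properties;

// Hypothetical helper; only the property keys come from the discussion above.
class ProducerSettings {
    static Properties build() {
        Properties props = new Properties();
        // Async mode, so messages are batched before being sent.
        props.setProperty("producer.type", "async");
        // Wait for the leader's ack rather than firing and forgetting (acks=0).
        props.setProperty("request.required.acks", "1");
        // 10 minutes, matching the refresh period seen in the log output.
        props.setProperty("topic.metadata.refresh.interval.ms", "600000");
        return props;
    }
}
```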