I am not setting the group id for the console consumer. When I say the .log
files are all 0 bytes long, that is after the producer has gone through 96 GB
worth of data. Apart from this topic, where I am dumping 96GB of data, I have
some test topics where I publish very small amounts of data. I don't have any
problem reading messages from those topics: the .log files for those topics
are properly sized and I can read the messages using multiple console
consumers at the same time. I have a feeling that this specific topic is
having trouble due to the amount of data I am publishing, but I am failing to
understand which Kafka settings play a role here.
I am sure 96GB of data is really not a big deal for Kafka and I am not the 
first one to do this.
On Wednesday, July 17, 2019, 04:58:48 PM EDT, Peter Bukowinski
<pmb...@gmail.com> wrote:
 
Are you setting a group.id for your console consumer, perhaps, and keeping it
static? That would explain the inability to reconsume the data. As for why
your logs look empty, Kafka likes to hold data in memory and leaves it to the
OS to flush it to disk. On a non-busy broker, the interval between when data
arrives and when it is flushed to disk can be quite long.
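If you want to see data hit the disk sooner while debugging, you can tighten
the flush settings in server.properties. The values below are illustrative
only, not recommendations:

log.flush.interval.messages=10000
log.flush.interval.ms=1000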


> On Jul 17, 2019, at 1:39 PM, Sachin Nikumbh <saniku...@yahoo.com.INVALID> 
> wrote:
> 
> Hi Jamie,
> I have 3 brokers and the replication factor for my topic is set to 3. I know 
> for sure that the producer is producing data successfully because I am 
> running a console consumer at the same time and it shows me the messages. 
> After the producer produces all the data, I have /var/log/kafka/myTopic-* 
> directories (15 of them) and all of them have only one .log file with size of 
> 0 bytes. So, I am not sure if that addresses your question around the active 
> segment.
> Thanks,
> Sachin
> On Wednesday, July 17, 2019, 04:00:56 PM EDT, Jamie
> <jamied...@aol.co.uk.INVALID> wrote:
> 
> Hi Sachin, 
> My understanding is that the active segment is never deleted, which means
> you should have at least 1GB of data in your partition if the data is indeed
> being produced to Kafka. Are there any errors in your broker logs? How many
> brokers do you have, and what is the replication factor of the topic? If you
> have fewer than 3 brokers, have you set offsets.topic.replication.factor to
> the number of brokers?
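> For example, with 3 brokers the line in server.properties would be:
> 
> offsets.topic.replication.factor=3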
> 
> Thanks, 
> Jamie
> 
> -----Original Message-----
> From: Sachin Nikumbh <saniku...@yahoo.com.INVALID>
> To: users <users@kafka.apache.org>
> Sent: Wed, 17 Jul 2019 20:21
> Subject: Re: Kafka logs are getting deleted too soon
> 
> Broker configs:
> ===========
> broker.id=36
> num.network.threads=3
> num.io.threads=8
> socket.send.buffer.bytes=102400
> socket.receive.buffer.bytes=102400
> socket.request.max.bytes=104857600
> log.dirs=/var/log/kafka
> num.partitions=1
> num.recovery.threads.per.data.dir=1
> offsets.topic.replication.factor=1
> transaction.state.log.replication.factor=1
> transaction.state.log.min.isr=1
> log.retention.hours=168
> log.segment.bytes=1073741824
> log.retention.check.interval.ms=300000
> zookeeper.connect=myserver1:2181,myserver2:2181,myserver3:2181
> zookeeper.connection.timeout.ms=6000
> confluent.support.metrics.enable=true
> confluent.support.customer.id=anonymous
> group.initial.rebalance.delay.ms=0
> auto.create.topics.enable=false
> 
> Topic configs:
> ==========
> --partitions 15
> --replication-factor 3
> retention.ms=31449600000
> retention.bytes=10737418240
> As you can see, I have explicitly overridden retention.bytes to 10GB per
> partition. 96GB spread over 15 partitions comes to about 6.4GB per
> partition, so I gave myself more than enough headroom. Even then, I am left
> with no logs. Here's an example:
> Here's an example:
> % ls -ltr /var/log/kafka/MyTopic-0
> total 4
> -rw-r--r-- 1 root root       14 Jul 17 15:05 leader-epoch-checkpoint
> -rw-r--r-- 1 root root 10485756 Jul 17 15:05 00000000000005484128.timeindex
> -rw-r--r-- 1 root root        0 Jul 17 15:05 00000000000005484128.log
> -rw-r--r-- 1 root root 10485760 Jul 17 15:05 00000000000005484128.index
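> To double check, I can also dump that segment with the tool that ships with
> Kafka (file path taken from the listing above):
> 
> kafka-run-class.sh kafka.tools.DumpLogSegments --files /var/log/kafka/MyTopic-0/00000000000005484128.log --print-data-log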
> 
> I kept an eye on each partition's directory while the producer was
> publishing data, and I saw .deleted files appear periodically. Does that
> mean Kafka was deleting log segments?
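> Is there something I should grep for in the broker application logs, e.g.
> (the log4j log location below is an assumption on my part):
> 
> grep -i 'deleting segment' /opt/kafka/logs/server.log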
> Any help would be highly appreciated.
> On Wednesday, July 17, 2019, 01:47:44 PM EDT, Peter Bukowinski
> <pmb...@gmail.com> wrote:
> 
> Can you share your broker and topic config here?
> 
>> On Jul 17, 2019, at 10:09 AM, Sachin Nikumbh <saniku...@yahoo.com.INVALID> 
>> wrote:
>> 
>> Thanks for the quick response, Tom.
>> I should have mentioned in my original post that I am always using
>> --from-beginning with my console consumer. Even then, I don't get any data.
>> And as mentioned, the .log files are 0 bytes in size.
>> On Wednesday, July 17, 2019, 11:09:22 AM EDT, Thomas Aley
>> <thomas.a...@ibm.com> wrote:
>> 
>> Hi Sachin,
>> 
>> Try adding --from-beginning to your console consumer to view the 
>> historically produced data. By default the console consumer starts from 
>> the last offset.
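>> Something like this (broker address is an assumption):
>> 
>> kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic MyTopic --from-beginning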
>> 
>> Tom Aley
>> thomas.a...@ibm.com
>> 
>> 
>> 
>> From:  Sachin Nikumbh <saniku...@yahoo.com.INVALID>
>> To:    Kafka Users <users@kafka.apache.org>
>> Date:  17/07/2019 16:01
>> Subject:        [EXTERNAL] Kafka logs are getting deleted too soon
>> 
>> 
>> 
>> Hi all,
>> I have ~96GB of data in files that I am trying to get into a Kafka
>> cluster. I have ~11000 keys for the data and have created 15 partitions
>> for my topic. While my producer is dumping data into Kafka, a console
>> consumer shows me that Kafka is receiving the data. The producer runs
>> for a few hours before it is done. However, at that point, when I run the
>> console consumer, it does not fetch any data. If I look at the logs
>> directory, the .log files for all partitions are 0 bytes in size.
>> If I am not wrong, the default value for log.retention.bytes is -1, which
>> means there is no size limit for the logs per partition (I do want to
>> confirm that this setting applies per partition). Given that the default
>> time-based retention is 7 days, I am failing to understand why the logs
>> are getting deleted. The other thing that confuses me is that when I use
>> kafka.tools.GetOffsetShell, it shows me large offset values for all 15
>> partitions.
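>> For reference, I am running it roughly like this (broker address assumed).
>> With --time -1 it reports the latest offset per partition and with
>> --time -2 the earliest, so comparing the two should show whether data has
>> been removed:
>> 
>> kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic MyTopic --time -1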
>> Can someone please help me understand why I don't see any logs, and why
>> kafka.tools.GetOffsetShell makes me believe there is data?
>> Thanks,
>> Sachin
>> 
>> 
