Hi, 

Like I mentioned earlier, we always keep 2 data files in each data 
directory (the ".meta" files are metadata associated with the actual data). Once 
log-8 is created (when log-7 gets rotated because it hits its maximum size) and 
all of the events in log-6 have been taken, log-6 will get deleted, but you 
will still see log-7 and log-8. So what you are seeing is not unexpected.
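
For reference, these are the file channel properties that control where those 
data files live and how large each log-N file can grow before a new one is 
started. The paths below are taken from your log output and the other values 
are the stock defaults, so treat this as a sketch rather than your actual 
settings:

    a1.channels.c2.type = file
    # checkpoint and data locations (default to ~/.flume/file-channel/...)
    a1.channels.c2.checkpointDir = /home/zhiwensun/.flume/file-channel/checkpoint
    a1.channels.c2.dataDirs = /home/zhiwensun/.flume/file-channel/data
    # maximum size of a single data file (log-N) in bytes before it is rotated
    a1.channels.c2.maxFileSize = 2146435071
    # how often a checkpoint is written, in milliseconds
    a1.channels.c2.checkpointInterval = 30000
    # maximum number of events the channel can hold
    a1.channels.c2.capacity = 1000000

Lowering maxFileSize makes the channel roll to a new log-N sooner, so fully 
taken files can be reclaimed earlier, but the channel will still keep at least 
2 data files per data directory.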


Hari 

-- 
Hari Shreedharan


On Tuesday, March 19, 2013 at 6:30 PM, Zhiwen Sun wrote:

> Thanks all for your reply.
> 
> @Kenison 
> I stopped my tail -F | nc program and there are no new event files in HDFS, so 
> I think no events are arriving. To make sure, I will test again with JMX 
> enabled.
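> 
> For that test, the agent can be started with JMX switched on via flume-env.sh, 
> roughly like this (the port number is just a placeholder):
> 
>     # expose the agent's MBeans over plain JMX (no auth/SSL, local testing only)
>     export JAVA_OPTS="-Dcom.sun.management.jmxremote \
>         -Dcom.sun.management.jmxremote.port=5445 \
>         -Dcom.sun.management.jmxremote.authenticate=false \
>         -Dcom.sun.management.jmxremote.ssl=false"
> 
> Then jconsole can be attached to port 5445 to watch the source, channel, and 
> sink counters while the test runs.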
> 
> @Alex
> 
> The latest log follows. I can't see any exceptions or warnings.
> 
> > 13/03/19 15:28:16 INFO hdfs.BucketWriter: Renaming hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490901.tmp to hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490901
> > 13/03/19 15:28:16 INFO hdfs.BucketWriter: Creating hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490902.tmp
> > 13/03/19 15:28:17 INFO file.EventQueueBackingStoreFile: Start checkpoint for /home/zhiwensun/.flume/file-channel/checkpoint/checkpoint, elements to sync = 3
> > 13/03/19 15:28:17 INFO file.EventQueueBackingStoreFile: Updating checkpoint metadata: logWriteOrderID: 1363659953997, queueSize: 0, queueHead: 362981
> > 13/03/19 15:28:17 INFO file.LogFileV3: Updating log-7.meta currentPosition = 216278208, logWriteOrderID = 1363659953997
> > 13/03/19 15:28:17 INFO file.Log: Updated checkpoint for file: /home/zhiwensun/.flume/file-channel/data/log-7 position: 216278208 logWriteOrderID: 1363659953997
> > 13/03/19 15:28:26 INFO hdfs.BucketWriter: Renaming hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490902.tmp to hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490902
> > 13/03/19 15:28:27 INFO hdfs.BucketWriter: Creating hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490903.tmp
> > 13/03/19 15:28:37 INFO hdfs.BucketWriter: Renaming hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490903.tmp to hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490903
> > 13/03/19 15:28:37 INFO hdfs.BucketWriter: Creating hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490904.tmp
> > 
> > 13/03/19 15:28:47 INFO file.EventQueueBackingStoreFile: Start checkpoint for /home/zhiwensun/.flume/file-channel/checkpoint/checkpoint, elements to sync = 2
> > 13/03/19 15:28:47 INFO file.EventQueueBackingStoreFile: Updating checkpoint metadata: logWriteOrderID: 1363659954200, queueSize: 0, queueHead: 362981
> > 13/03/19 15:28:47 INFO file.LogFileV3: Updating log-7.meta currentPosition = 216288815, logWriteOrderID = 1363659954200
> > 13/03/19 15:28:47 INFO file.Log: Updated checkpoint for file: /home/zhiwensun/.flume/file-channel/data/log-7 position: 216288815 logWriteOrderID: 1363659954200
> > 13/03/19 15:28:48 INFO hdfs.BucketWriter: Renaming hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490904.tmp to hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490904
> 
> @Hari
> Hmm, 12 hours have passed and the size of the file channel directory has not gone down.
> 
> Files in file channel directory:
> 
> > -rw-r--r-- 1 zhiwensun zhiwensun    0 2013-03-19 09:15 in_use.lock
> > -rw-r--r-- 1 zhiwensun zhiwensun 1.0M 2013-03-19 10:11 log-6
> > -rw-r--r-- 1 zhiwensun zhiwensun   29 2013-03-19 10:12 log-6.meta
> > -rw-r--r-- 1 zhiwensun zhiwensun 207M 2013-03-19 15:28 log-7
> > -rw-r--r-- 1 zhiwensun zhiwensun   29 2013-03-19 15:28 log-7.meta
> > -rw-r--r-- 1 zhiwensun zhiwensun 207M 2013-03-19 15:28 ./file-channel/data/log-7
> > -rw-r--r-- 1 zhiwensun zhiwensun   29 2013-03-19 10:12 ./file-channel/data/log-6.meta
> > -rw-r--r-- 1 zhiwensun zhiwensun   29 2013-03-19 15:28 ./file-channel/data/log-7.meta
> > -rw-r--r-- 1 zhiwensun zhiwensun    0 2013-03-19 09:15 ./file-channel/data/in_use.lock
> > -rw-r--r-- 1 zhiwensun zhiwensun 1.0M 2013-03-19 10:11 ./file-channel/data/log-6
> 
> 
> 
> 
> 
> Zhiwen Sun 
> 
> 
> 
> On Wed, Mar 20, 2013 at 2:32 AM, Hari Shreedharan <hshreedha...@cloudera.com> wrote:
> > It is possible for the directory size to increase even if no writes are 
> > going into the channel. If the channel size is non-zero and the sink is 
> > still writing events to HDFS, the takes get written to disk as well (so we 
> > know which events in the files have been removed if the channel/agent 
> > restarts). Eventually the channel will clean up the files whose events have 
> > all been taken (though it will keep at least 2 files per data directory, 
> > just to be safe). 
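> > 
> > If the agent is started with Flume's HTTP metrics server enabled (for 
> > example with -Dflume.monitoring.type=http -Dflume.monitoring.port=34545 on 
> > the flume-ng command line; the port here is just an example), the channel 
> > counters can be checked while the sink drains it, e.g.:
> > 
> >     curl http://localhost:34545/metrics
> >     # look at the "CHANNEL.c2" entry: ChannelSize should drop towards 0 as
> >     # the sink takes events, and EventTakeSuccessCount should keep growing
> > 
> > Once the channel size reaches 0 and a newer data file exists, the older, 
> > fully taken files become candidates for cleanup.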
> > 
> > -- 
> > Hari Shreedharan
> > 
> > 
> > On Tuesday, March 19, 2013 at 10:32 AM, Alexander Alten-Lorenz wrote:
> > 
> > > Hey,
> > > 
> > > What does debug say? Can you gather the logs and attach them?
> > > 
> > > - Alex
> > > 
> > > On Mar 19, 2013, at 5:27 PM, "Kenison, Matt" <matt.keni...@disney.com> wrote: 
> > > 
> > > > Check the JMX counters first, to make sure you really are not sending 
> > > > new events. If you aren't, is it your checkpoint directory or your data 
> > > > directory that is increasing in size? 
> > > > 
> > > > 
> > > > From: Zhiwen Sun <pens...@gmail.com>
> > > > Reply-To: "user@flume.apache.org" <user@flume.apache.org>
> > > > Date: Tue, 19 Mar 2013 01:19:19 -0700
> > > > To: "user@flume.apache.org" <user@flume.apache.org>
> > > > Subject: Why used space of file channel buffer directory increase?
> > > > 
> > > > hi all:
> > > > 
> > > > I am testing flume-ng on my local machine. The data flow is:
> > > > 
> > > > tail -F file | nc 127.0.0.01 4444  ->  flume agent  ->  hdfs 
> > > > 
> > > > My configuration file is here:
> > > > 
> > > > > a1.sources = r1
> > > > > a1.channels = c2
> > > > > 
> > > > > a1.sources.r1.type = netcat
> > > > > a1.sources.r1.bind = 192.168.201.197
> > > > > a1.sources.r1.port = 44444
> > > > > a1.sources.r1.max-line-length = 1000000
> > > > > 
> > > > > a1.sinks.k1.type = logger
> > > > > 
> > > > > a1.channels.c1.type = memory
> > > > > a1.channels.c1.capacity = 10000
> > > > > a1.channels.c1.transactionCapacity = 10000
> > > > > 
> > > > > a1.channels.c2.type = file
> > > > > a1.sources.r1.channels = c2
> > > > > 
> > > > > a1.sources.r1.interceptors = i1
> > > > > a1.sources.r1.interceptors.i1.type = timestamp
> > > > > 
> > > > > a1.sinks = k2
> > > > > a1.sinks.k2.type = hdfs
> > > > > a1.sinks.k2.channel = c2 
> > > > > a1.sinks.k2.hdfs.path = hdfs://127.0.0.1:9000/flume/events/%Y-%m-%d
> > > > > a1.sinks.k2.hdfs.writeFormat = Text
> > > > > a1.sinks.k2.hdfs.rollInterval = 10
> > > > > a1.sinks.k2.hdfs.rollSize = 10000000
> > > > > a1.sinks.k2.hdfs.rollCount = 0
> > > > > 
> > > > > a1.sinks.k2.hdfs.filePrefix = app 
> > > > > a1.sinks.k2.hdfs.fileType = DataStream
> > > > > 
> > > > 
> > > > 
> > > > 
> > > > 
> > > > It seems that events were collected correctly.
> > > > 
> > > > But there is a problem bothering me: the used space of the file channel 
> > > > directory (~/.flume) keeps increasing, even when there are no new events. 
> > > > 
> > > > Is my configuration wrong, or is there some other problem? 
> > > > 
> > > > Thanks.
> > > > 
> > > > 
> > > > Best regards.
> > > > 
> > > > Zhiwen Sun 
> > > 
> > > --
> > > Alexander Alten-Lorenz
> > > http://mapredit.blogspot.com
> > > German Hadoop LinkedIn Group: http://goo.gl/N8pCF
> > > 
> > > 
> > > 
> > 
> > 
> 
