Hi Jeff,

Thanks for your comments. But what I am really looking for is this: suppose we are copying a 1 GB file to the spool directory. While the copy is still in progress, how does Flume recognize that the complete file has been copied into the spool directory and that the file is ready for processing?
How does Flume make sure it doesn't start processing a partially copied file?

On Tue, Jul 22, 2014 at 11:15 PM, Jeff Lord <jl...@cloudera.com> wrote:

> I believe the way this works is that flume creates a meta directory to
> track which file is being read.
> In the event of a restart of the agent the entire file will be re-read
> which will create some duplicate events.
>
> https://github.com/apache/flume/blob/flume-1.5/flume-ng-core/src/main/java/org/apache/flume/client/avro/ReliableSpoolingFileEventReader.java#L474
>
> On Tue, Jul 22, 2014 at 6:15 AM, SaravanaKumar TR <saran0081...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I am planning to use spooling directory to move logfiles in hdfs sink.
>>
>> I like to know how flume identifies the file we are moving to spool
>> directory is complete file or partial & its move still in progress.
>>
>> if suppose a file is of large size and we started moving it to spooler
>> directory , how flume identifies that the complete file is transferred or
>> is still in progress.
>>
>> Please help me out here.
>>
>> Thanks,
>> saravana