Your understanding is correct :)

On Mon, Mar 9, 2015 at 6:54 AM, Lin Ma <lin...@gmail.com> wrote:
> Thanks Ashish,
>
> I followed your guidance and found the instructions below, about which I
> have further questions to confirm with you. It seems we need to close the
> files and never touch them again for Flume to process them correctly, so I
> am not sure whether the following is good practice: (1) let the application
> write log files in the existing way, e.g. with an hourly or 5-minute
> rotation pattern; (2) close and move the files to another directory, which
> serves as the input for a Flume agent using the Spooling Directory source?
>
> "This source will watch the specified directory for new files, and will
> parse events out of new files as they appear."
>
> "If a file is written to after being placed into the spooling directory,
> Flume will print an error to its log file and stop processing.
> If a file name is reused at a later time, Flume will print an error to its
> log file and stop processing."
>
> regards,
> Lin
>
> On Sun, Mar 8, 2015 at 12:23 AM, Ashish <paliwalash...@gmail.com> wrote:
>>
>> Please look at the following:
>> Spooling Directory Source
>> (http://flume.apache.org/FlumeUserGuide.html#spooling-directory-source)
>> and
>> HDFS Sink (http://flume.apache.org/FlumeUserGuide.html#hdfs-sink)
>>
>> The Spooling Directory source needs immutable files, meaning a file must
>> not be written to once it is being consumed. In short, your application
>> cannot write to a file while Flume is reading it.
>>
>> The log format is not an issue, as long as you don't need it to be
>> interpreted by Flume components. Since it's a log, I'm assuming a single
>> record per line, with a line separator at the end of each line.
>>
>> You can also look at the Exec source
>> (http://flume.apache.org/FlumeUserGuide.html#exec-source) for tailing
>> a file that is still being written by the application. The documentation
>> at the links above covers the details.
>>
>> HTH!
>>
>> On Sun, Mar 8, 2015 at 12:32 PM, Lin Ma <lin...@gmail.com> wrote:
>> > Hi Flume masters,
>> >
>> > I want to install Flume on a box, consume a local log file as the
>> > source, and send it to a remote HDFS sink. The log format is private,
>> > plain text (not Avro or JSON).
>> >
>> > I am reading the Flume guide and the many advanced source
>> > configurations, and am wondering whether there are any reference
>> > samples for a plain local log file source. Also, I am not sure whether
>> > Flume can consume the local file while the application is still
>> > writing to it? Thanks.
>> >
>> > regards,
>> > Lin
>>
>> --
>> thanks
>> ashish
>>
>> Blog: http://www.ashishpaliwal.com/blog
>> My Photo Galleries: http://www.pbase.com/ashishpaliwal
--
thanks
ashish

Blog: http://www.ashishpaliwal.com/blog
My Photo Galleries: http://www.pbase.com/ashishpaliwal
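[Editor's note] The pipeline discussed in this thread (Spooling Directory source feeding an HDFS sink) can be sketched as a minimal Flume agent configuration. The agent/component names (a1, r1, c1, k1), the spool directory, and the HDFS URL below are illustrative assumptions, not values from the thread:

```properties
# Minimal sketch: spooling directory source -> memory channel -> HDFS sink.
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Spooling Directory source: reads completed, immutable files dropped here.
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /var/spool/flume
a1.sources.r1.channels = c1

# Memory channel buffers events between source and sink.
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000

# HDFS sink: writes the plain-text events to the remote cluster.
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://namenode:8020/flume/logs/%Y-%m-%d
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.useLocalTimeStamp = true
a1.sinks.k1.channel = c1
```

`hdfs.fileType = DataStream` keeps the private text format as-is instead of wrapping events in Flume's default SequenceFile container.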
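[Editor's note] The close-and-move rotation Lin proposes can be sketched in shell. The directory paths and file name below are illustrative assumptions; the key points are that the file is only moved after it is closed, that `mv` within one filesystem is atomic (so Flume never sees a half-written file), and that a spooled file name is never reused:

```shell
#!/bin/sh
# Hypothetical rotation step: hand a closed log file over to Flume's
# spooling directory. Paths and file name are assumptions for illustration.
LOG_DIR=/tmp/app-logs
SPOOL_DIR=/tmp/flume-spool
mkdir -p "$LOG_DIR" "$SPOOL_DIR"

# Simulate the application closing an hourly log file.
echo "2015-03-09 06:00:00 INFO sample event" > "$LOG_DIR/app-2015030906.log"

# Atomic rename on the same filesystem: Flume only ever sees the
# complete, immutable file. Never reuse this name in the spool dir.
mv "$LOG_DIR/app-2015030906.log" "$SPOOL_DIR/app-2015030906.log"
```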