If I grab the latest snapshot, would I get these changes?

On Tue, Nov 20, 2012 at 3:24 PM, Mohit Anchlia <mohitanch...@gmail.com> wrote:
> that's awesome!
>
>
> On Tue, Nov 20, 2012 at 3:11 PM, Mike Percy <mpe...@apache.org> wrote:
>
>> Mohit,
>> No problem, but Juhani did all the work. :)
>>
>> The behavior is that you can configure an HDFS sink to close a file if it
>> hasn't gotten any writes in some time. After it's been idle for 5 minutes
>> or something, it gets closed. If you get a "late" event that goes to the
>> same path after the file is closed, it will just create a new file in the
>> same path as usual.
>>
>> Regards,
>> Mike
>>
>>
>> On Tue, Nov 20, 2012 at 12:56 PM, Brock Noland <br...@cloudera.com> wrote:
>>
>>> We are currently voting on a 1.3.0 RC on the dev@ list:
>>>
>>> http://s.apache.org/OQ0W
>>>
>>> You don't have to be a committer to vote! :)
>>>
>>> Brock
>>>
>>> On Tue, Nov 20, 2012 at 2:53 PM, Mohit Anchlia <mohitanch...@gmail.com> wrote:
>>> > Thanks a lot!! Now with this what should be the expected behaviour? After
>>> > file is closed a new file is created for writes that come after closing the
>>> > file?
>>> >
>>> > Thanks again for committing this change. Do you know when 1.3.0 is out? I am
>>> > currently using the snapshot version of 1.3.0
>>> >
>>> > On Tue, Nov 20, 2012 at 11:16 AM, Mike Percy <mpe...@apache.org> wrote:
>>> >>
>>> >> Mohit,
>>> >> FLUME-1660 is now committed and it will be in 1.3.0. In the case where you
>>> >> are using 1.2.0, I suggest running with hdfs.rollInterval set so the files
>>> >> will roll normally.
>>> >>
>>> >> Regards,
>>> >> Mike
>>> >>
>>> >>
>>> >> On Thu, Nov 15, 2012 at 11:23 PM, Juhani Connolly
>>> >> <juhani_conno...@cyberagent.co.jp> wrote:
>>> >>>
>>> >>> I am actually working on a patch for exactly this, refer to FLUME-1660
>>> >>>
>>> >>> The patch is on review board right now, I fixed a corner case issue that
>>> >>> came up with unit testing, but the implementation is not really to my
>>> >>> satisfaction. If you are interested please have a look and add your opinion.
>>> >>>
>>> >>> https://issues.apache.org/jira/browse/FLUME-1660
>>> >>> https://reviews.apache.org/r/7659/
>>> >>>
>>> >>>
>>> >>> On 11/16/2012 01:16 PM, Mohit Anchlia wrote:
>>> >>>
>>> >>> Another question I had was about rollover. What's the best way to
>>> >>> rollover files in reasonable timeframe? For instance our path is YY/MM/DD/HH
>>> >>> so every hour there is new file and the -1 hr is just sitting with .tmp and
>>> >>> it takes sometimes even hour before .tmp is closed and renamed to .snappy.
>>> >>> In this situation is there a way to tell flume to rollover files sooner
>>> >>> based on some idle time limit?
>>> >>>
>>> >>> On Thu, Nov 15, 2012 at 8:14 PM, Mohit Anchlia <mohitanch...@gmail.com>
>>> >>> wrote:
>>> >>>>
>>> >>>> Thanks Mike it makes sense. Anyway I can help?
>>> >>>>
>>> >>>>
>>> >>>> On Thu, Nov 15, 2012 at 11:54 AM, Mike Percy <mpe...@apache.org> wrote:
>>> >>>>>
>>> >>>>> Hi Mohit, this is a complicated issue. I've filed
>>> >>>>> https://issues.apache.org/jira/browse/FLUME-1714 to track it.
>>> >>>>>
>>> >>>>> In short, it would require a non-trivial amount of work to implement
>>> >>>>> this, and it would need to be done carefully. I agree that it would be
>>> >>>>> better if Flume handled this case more gracefully than it does today. Today,
>>> >>>>> Flume assumes that you have some job that would go and clean up the .tmp
>>> >>>>> files as needed, and that you understand that they could be partially
>>> >>>>> written if a crash occurred.
>>> >>>>>
>>> >>>>> Regards,
>>> >>>>> Mike
>>> >>>>>
>>> >>>>>
>>> >>>>> On Sun, Nov 11, 2012 at 8:32 AM, Mohit Anchlia <mohitanch...@gmail.com>
>>> >>>>> wrote:
>>> >>>>>>
>>> >>>>>> What we are seeing is that if flume gets killed either because of
>>> >>>>>> server failure or other reasons, it keeps around the .tmp file. Sometimes
>>> >>>>>> for whatever reasons .tmp file is not readable. Is there a way to rollover
>>> >>>>>> .tmp file more gracefully?
>>> >>>>>
>>> >>>>>
>>> >>>>
>>> >>>
>>> >>>
>>> >>
>>> >
>>>
>>>
>>>
>>> --
>>> Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/
>>
>>
>
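
For reference, the idle-close behavior discussed in this thread (FLUME-1660) might be configured roughly as in the sketch below. This is only an illustration, not a tested setup: the agent, sink and channel names (a1, k1, c1), the HDFS path and the timeout values are all assumptions chosen for the example, and hdfs.idleTimeout is only expected to work on builds that include FLUME-1660 (1.3.0 or a recent snapshot).

```properties
# Hypothetical names: agent "a1", sink "k1", channel "c1", and the HDFS path.
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1

# Hourly buckets (YY/MM/DD/HH). The escape sequences need a "timestamp"
# header on each event, e.g. added by a timestamp interceptor.
a1.sinks.k1.hdfs.path = /flume/events/%y/%m/%d/%H

# Snappy-compressed output, as in the .snappy files mentioned in the thread.
a1.sinks.k1.hdfs.fileType = CompressedStream
a1.sinks.k1.hdfs.codeC = snappy

# Close a file that has received no writes for 300 seconds (0 disables this).
# A late event for the same path simply opens a new file there.
a1.sinks.k1.hdfs.idleTimeout = 300

# Time-based roll in seconds; on 1.2.0, where idleTimeout is not available,
# this is the setting suggested in the thread for getting .tmp files closed.
a1.sinks.k1.hdfs.rollInterval = 3600
```

With both settings in place, the previous hour's .tmp file should be closed and renamed about five minutes after its last write instead of waiting for the roll interval; the values themselves would need tuning for the actual event rate.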