I am not sure there is a simple, perfect solution that avoids both loss and duplication on failure, with Flume or any other tool. For example, with Flume-OG in E2E (end-to-end) reliability mode you can minimize loss, but duplication can happen; in BE (best-effort) mode with startFromEnd=true for tail you can minimize duplication, but loss can happen.
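To illustrate why you get one or the other, here is a minimal Python sketch. It is not Flume code, only an analogy: `run`, `commit_first` and `crash_at` are hypothetical names, and the "persisted offset" stands in for whatever marker a tail-like source keeps. Sending before saving the marker behaves roughly like a retrying mode (a crash causes a resend), while saving the marker before sending behaves roughly like a best-effort mode (a crash causes a gap).

```python
def run(lines, commit_first, crash_at):
    """Deliver `lines` using a persisted offset, crashing once at index `crash_at`.

    Returns the list of lines the (hypothetical) collector received.
    """
    delivered = []   # what the collector has received so far
    offset = 0       # persisted checkpoint: index of the next line to send
    crashed = False  # the simulated crash happens only once

    while offset < len(lines):
        i = offset
        if commit_first:
            offset = i + 1               # save the marker before sending
            if i == crash_at and not crashed:
                crashed = True           # crash before the send happens
                continue                 # restart from marker: line i is lost
            delivered.append(lines[i])
        else:
            delivered.append(lines[i])   # send before saving the marker
            if i == crash_at and not crashed:
                crashed = True           # crash before the marker is saved
                continue                 # restart from marker: line i is resent
            offset = i + 1
    return delivered


if __name__ == "__main__":
    log = ["a", "b", "c"]
    print(run(log, commit_first=True, crash_at=1))   # loss, no duplicate
    print(run(log, commit_first=False, crash_at=1))  # duplicate, no loss
```

Neither ordering avoids both problems on its own; removing duplicates would need something extra downstream, such as de-duplicating on a per-event ID, which is an assumption on my part and not something Flume-OG does for you.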
At the moment, we use a combination of our own plug-ins to minimize the effect of a failure, plus a monitoring/alerting system so we can respond quickly.

-JS

On 2/7/13 12:24 PM, 周梦想 wrote:
> So all users of Flume don't care if the agent breaks down and misses or
> duplicates log content? They have to write their own sources and sinks?
> They don't care about the correctness of their logs? What do they do if
> the Flume agent exits?
> I don't understand yet.
>
> Andy
>
> 2013/2/7 周梦想 <[email protected]>
>
> > I see, there is no easy or configuration-based way to know the details
> > of what has been sent and what hasn't.
> > I have to write my own source or sink code to do this.
> > Thank you, Alex and all friends.
> >
> > Andy
> >
> > 2013/2/6 Alexander Alten-Lorenz <[email protected]>
> >
> > > You have no control in such situations, since tailDir uses
> > > tail and holds the marker in memory.
> > >
> > > We had a thread about this a few days ago:
> > >
> > > http://search-hadoop.com/m/JV0lh2RDXLX/flume+tail+source+problem+and+performance&subj=flume+tail+source+problem+and+performance
> > >
> > > - Alex
> > >
> > > On Feb 6, 2013, at 3:45 AM, 周梦想 <[email protected]> wrote:
> > >
> > > > Hello,
> > > >
> > > > I'm using the agent's tailDirs('mydir') source to gather logs into
> > > > Hadoop HDFS. I notice some documents advise that if the agent
> > > > crashes, I have to remove the files in 'mydir' and clear
> > > > flume.agent.logdir. Thus I will lose some data or have duplicate
> > > > data. And I don't know which line the agent has sent up to.
> > > >
> > > > I'm worried about the agent failing and then resending or failing to
> > > > send content to the collector. I want to know how to check which
> > > > line of the log file the agent had sent if the agent exits suddenly.
> > > > The files in the Flume log dir, such as sending and sent, can't be
> > > > read.
> > > >
> > > > Please give some advice on how to handle this situation.
> > > > Thanks.
> > > >
> > > > Andy Zhou
> > >
> > > --
> > > Alexander Alten-Lorenz
> > > http://mapredit.blogspot.com
> > > German Hadoop LinkedIn Group: http://goo.gl/N8pCF

--
Jeong-shik Jang / [email protected]
Gruter, Inc., R&D Team Leader
www.gruter.com
Enjoy Connecting
