We (www.plumbee.co.uk) have been using Flume NG in combination with S3 successfully for about 10 months now without any major issues. Our whole tech stack is hosted in AWS and on average we process 450 million events per day (approx. 120GB), all of which is collected via Flume, aggregated using the FileChannel backed by EBS volumes and uploaded to S3 using the HDFS event sink.
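For reference, our agent configuration looks roughly like the sketch below. The names, source type, paths and roll settings are illustrative placeholders rather than our exact production values, and the S3 credentials would normally live in core-site.xml (fs.s3n.awsAccessKeyId / fs.s3n.awsSecretAccessKey):

agent.sources = avroSrc
agent.channels = fileCh
agent.sinks = s3Sink

# Source receiving events from the game servers
agent.sources.avroSrc.type = avro
agent.sources.avroSrc.bind = 0.0.0.0
agent.sources.avroSrc.port = 4141
agent.sources.avroSrc.channels = fileCh

# Durable FileChannel with checkpoint and data dirs on EBS volumes
agent.channels.fileCh.type = file
agent.channels.fileCh.checkpointDir = /mnt/ebs/flume/checkpoint
agent.channels.fileCh.dataDirs = /mnt/ebs/flume/data

# HDFS event sink writing to S3 via the s3n filesystem
# (the %Y/%m/%d escapes need a timestamp header on the events)
agent.sinks.s3Sink.type = hdfs
agent.sinks.s3Sink.channel = fileCh
agent.sinks.s3Sink.hdfs.path = s3n://my-analytics-bucket/events/%Y/%m/%d
agent.sinks.s3Sink.hdfs.fileType = DataStream
agent.sinks.s3Sink.hdfs.rollInterval = 300
agent.sinks.s3Sink.hdfs.rollSize = 0
agent.sinks.s3Sink.hdfs.rollCount = 0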
The data we collect represents analytics events from our gaming platform which cannot be recovered if lost, so reliability and durability are very important to us. Although Flume has a great transactional model to achieve this, the S3 filesystem implementation provided by the Hadoop project has several issues which resulted in us modifying it heavily.

One such problem is that the syncFs() method, which should force any buffered data to be written out, actually does nothing in the context of S3. So while Flume believes the data is safe and removes it from the channel, you have no guarantee that it actually is. The S3 filesystem also buffers data locally on disk first, and only on close() are the contents of the file uploaded to S3. If Flume crashes or the box dies for whatever reason, the contents of those files are just orphaned on the local filesystem and you have to recover them manually (assuming they aren't also corrupted). There's a small sketch of what this looks like at the API level at the end of this mail.

If you have any other questions about our setup just ask!

Cheers,
Dennis
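P.S. For anyone who wants to see the problem at the API level, here's a minimal sketch against the plain Hadoop FileSystem API (bucket and path are made up, and this is not the Flume code itself). Nothing here is durable until close():

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class S3nSyncDemo {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Credentials come from core-site.xml (fs.s3n.awsAccessKeyId / fs.s3n.awsSecretAccessKey)
    FileSystem fs = FileSystem.get(URI.create("s3n://my-analytics-bucket/"), conf);

    FSDataOutputStream out = fs.create(new Path("s3n://my-analytics-bucket/events/part-0001"));
    out.write("some analytics event\n".getBytes("UTF-8"));

    // On HDFS this pushes the data out to the datanodes; with the s3n
    // filesystem the wrapped stream isn't Syncable, so this is a no-op
    // and the bytes are still sitting in a temp file on local disk.
    // (sync() on Hadoop 1.x; hflush() in newer versions.)
    out.sync();

    // Only here does NativeS3FileSystem actually upload the buffered
    // temp file to S3. Crash before this point and the data never
    // leaves the box.
    out.close();
  }
}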