Thanks Andrew. Arpit also told me about this but I forgot to bring it up here.

Best,

> On Aug 24, 2017, at 10:59 AM, Andrew Wang <andrew.w...@cloudera.com> wrote:
> 
> FYI that committer +1s are binding on merges, so Sean and Mingliang's +1s
> can be upgraded to binding.
> 
> On Thu, Aug 24, 2017 at 6:09 AM, Kihwal Lee <kih...@oath.com.invalid> wrote:
> 
>> +1 (binding)
>> Great work guys!
>> 
>> On Thu, Aug 24, 2017 at 5:01 AM, Steve Loughran <ste...@hortonworks.com>
>> wrote:
>> 
>>> 
>>> On 23 Aug 2017, at 19:21, Aaron Fabbri <fab...@cloudera.com> wrote:
>>> 
>>> 
>>> On Tue, Aug 22, 2017 at 10:24 AM, Steve Loughran <ste...@hortonworks.com> wrote:
>>> video being processed: https://www.youtube.com/watch?v=oIe5Zl2YsLE&feature=youtu.be
>>> 
>>> 
>>> Awesome demo Steve, thanks for doing this.  Particularly glad to see folks
>>> using and extending the failure injection client.
>>> 
>>> The HADOOP-13786 iteration turns on throttle event generation. All the new
>>> committer stuff is ready for it, but all the existing S3A FS ops react to a
>>> throttle exception by failing, when they need to just back off a bit. This
>>> complicates testing, as I have to explicitly turn off fault injection for
>>> setup & teardown.
>>> 
>>> 
>>> Demoing the CLI tool was great as well.
>>> 
>>> 
>>> I'm going to have to do another iteration on that CLI tool post-merge, as
>>> I had one big problem: working out if the bucket and all the binding
>>> settings meant it was "guarded". I think we'll need to track what issues
>>> like that crop up in the field and add the diagnostics/other options.
>>> 
>>> +I think another one that'd be useful would be to enumerate all s3guard DDB
>>> tables in a region/globally & list their allocated IOPs. I know the AWS UI
>>> can list tables by region, but you need to look around every region to find
>>> out if you've accidentally created one. If you enumerate all tables & look
>>> for a s3guard version marker, then you can identify the s3guard tables.
>>> 
>>> Wanted to mention two things:
>>> 
>>> 1. Authoritative mode is not fully implemented yet with Dynamo (it needs
>>> to persist an extra bit for directories).  I do have an auth-mode patch
>>> (done for a hackathon) that I need to post which shows large performance
>>> improvements over what S3Guard has today.  As you said, we don't consider
>>> authoritative mode ready for production yet: we want to play with it more
>>> and improve the prune algorithm first.  Authoritative mode can be thought
>>> of as a nice bonus in the future: The main goal of S3Guard v1 is to fix the
>>> get / list consistency issues you mentioned, which it does well.
>>> 
>>> 
>>> we need to call that out in the release notes.
>>> 
>>> 2. Also wanted to thank Lei (Eddy) Xu, he was very active during early
>>> design and contributed some patches as well.
>>> 
>>> 
>>> good point. Lei: you will get a special mention the next time I do the demo
>>> 
>>> Again, great demo, enjoyed it!
>>> 
>>> -AF
>>> 
>>> 
>>> It's actually quite hard to show any benefits of s3guard on the command
>>> line, so I've ended up showing some scala tests where I turn on the
>>> (bundled) inconsistent AWS client to show how you then need to enable
>>> s3guard to make the stack traces go away.
>>> 
>>> 
>>> On 22 Aug 2017, at 11:17, Steve Loughran <ste...@hortonworks.com> wrote:
>>> 
>>> +1 (binding)
>>> 
>>> I'm happy with it; it's a great piece of work by (in no particular order):
>>> Chris Nauroth, Aaron Fabbri, Sean Mackrory & Mingliang Liu, plus a few bits
>>> in the corners where I got to break things while they were all asleep. Also
>>> deserving a mention: Thomas Demoor & Ewan Higgs @ WDC for consultancy on
>>> the corners of S3, everyone who tested it (including our QA team), Sanjay
>>> Radia, & others.
>>> 
>>> I've already done a couple of iterations of fixing checkstyles & code
>>> reviews, so I think it is ready. I also have a branch-2 patch based on
>>> earlier work by Mingliang, for people who want that.
>>> 
>>> 
>>> 
>>> 
>>> On 17 Aug 2017, at 23:07, Aaron Fabbri <fab...@cloudera.com> wrote:
>>> 
>>> Hello,
>>> 
>>> I'd like to open a vote (7 days, ending August 24 at 3:10 PST) to merge the
>>> HADOOP-13345 feature branch into trunk.
>>> 
>>> This branch contains the new S3Guard feature which adds metadata
>>> consistency features to the S3A client.  Formatted site documentation can
>>> be found here:
>>> 
>>> https://github.com/apache/hadoop/blob/HADOOP-13345/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/s3guard.md
>>> 
>>> The current patch against trunk is posted here:
>>> 
>>> https://issues.apache.org/jira/browse/HADOOP-13998
>>> 
>>> The branch modifies the s3a portion of the hadoop-tools/hadoop-aws module:
>>> 
>>> - The feature is off by default, and care has been taken to ensure it has
>>> no impact when disabled.
>>> - S3Guard can be enabled with the production database which is backed by
>>> DynamoDB, or with a local, in-memory implementation that facilitates
>>> integration testing without having to pay for a database.
>>> - getFileStatus() as well as directory listing consistency has been
>>> implemented and thoroughly tested, including delete tracking.
>>> - Convenient Maven profiles for testing with and without S3Guard.
>>> - New failure injection code and integration tests that exercise it.  We
>>> use timers and a wrapper around the Amazon SDK client object to force
>>> consistency delays to occur.  This allows us to assert that S3Guard works
>>> as advertised.  This will be extended with more types of failure injection
>>> to continue hardening the S3A client.
>>> 
>>> Outside of hadoop-tools/hadoop-aws's s3a directory there are some minor
>>> changes:
>>> 
>>> - core-default.xml defaults and documentation for s3guard parameters.
>>> - A couple additional FS contract test cases around rename.
>>> - More goodies in LambdaTestUtils
>>> - A new CLI tool for inspecting and manipulating S3Guard features,
>>> including the backing MetadataStore database.
>>> 
>>> This branch has seen extensive testing as well as use in production.  This
>>> branch makes significant improvements to S3A's test toolkit as well.
>>> 
>>> Performance is typically on par with, and in some cases better than, the
>>> existing S3A code without S3Guard enabled.
>>> 
>>> This feature was developed with contributions and feedback from many
>>> people.  I'd like to thank everyone who worked on HADOOP-13345 as well as
>>> all of those who contributed feedback and work on the original design
>>> document.
>>> 
>>> This is the first major Apache Hadoop project I've worked on from start to
>>> finish, and I've really enjoyed it.  Please shout if I've missed anything
>>> important here or in the VOTE process.
>>> 
>>> Cheers,
>>> Aaron Fabbri
>>> 
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
>>> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>>> 
>>> 
>>> 
>>> 
>>> 
>> 


