Since S3A now works perfectly with S3Guard turned off, could the Magic
Committer work with S3Guard off? If yes, will performance degrade? Or,
once HADOOP-17400 is fixed, will it have comparable performance?

Steve Loughran <ste...@cloudera.com.invalid> wrote on Fri, Dec 4, 2020 at 10:00 PM:

> as sent to hadoop-general.
>
> TL;DR. S3 is consistent, and S3A now works perfectly with S3Guard turned
> off; if not, file a JIRA. Rename still isn't real, though, so don't rely
> on it or on create(path, overwrite=false) for atomic operations.
>
> -------
>
> If you've missed the announcement, AWS S3 storage is now strongly
> consistent: https://aws.amazon.com/s3/consistency/
>
> That's full CRUD consistency, consistent listing, and no 404 caching.
>
> You don't get: rename, or an atomic create-no-overwrite. Applications need
> to know that and code for it.
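>
> To make that concrete, a minimal sketch (bucket and path names are
> hypothetical): create(path, overwrite=false) on S3A is an existence
> probe followed by a separate PUT, not an atomic exclusive create, so it
> can't serve as a lock:
>
>     import java.net.URI;
>     import org.apache.hadoop.conf.Configuration;
>     import org.apache.hadoop.fs.FSDataOutputStream;
>     import org.apache.hadoop.fs.FileSystem;
>     import org.apache.hadoop.fs.Path;
>
>     public class NotALock {
>       public static void main(String[] args) throws Exception {
>         FileSystem fs = FileSystem.get(
>             new URI("s3a://my-bucket/"), new Configuration());
>         // Unsafe: the existence check and the PUT are separate requests,
>         // so a racing second process may also "create" the file.
>         try (FSDataOutputStream out =
>                 fs.create(new Path("s3a://my-bucket/app/lock"), false)) {
>           out.writeUTF("owner-1");
>         }
>       }
>     }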
>
> This is enabled for all S3 buckets; no need to change endpoints or any
> other settings. No extra cost, no performance impact. This is the biggest
> change in S3 semantics since it launched.
>
> What does this mean for the Hadoop S3A connector?
>
>
>    1. We've been testing it for a while, and no problems have surfaced.
>    2. There's no need for S3Guard; leave the default settings alone. If
>    you were using it, turn it off (see the sketch after this list),
>    restart *everything* and then you can delete the DDB table.
>    3. Without S3Guard, listings may get a bit slower.
>    4. There's been a lot of work in branch-3.3 on speeding up listings
>    against raw S3, especially for code which uses listStatusIterator() and
>    listFiles (HADOOP-17400).
>
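> Turning S3Guard off (point 2) just means reverting to the default null
> metadata store. A minimal sketch with the Hadoop Configuration API (the
> property names are the standard S3A ones; the bucket name is
> hypothetical):
>
>     import org.apache.hadoop.conf.Configuration;
>
>     Configuration conf = new Configuration();
>     // Default: no metadata store, raw (now consistent) S3.
>     conf.set("fs.s3a.metadatastore.impl",
>         "org.apache.hadoop.fs.s3a.s3guard.NullMetadataStore");
>     // Per-bucket override, if S3Guard was only enabled for one bucket.
>     conf.set("fs.s3a.bucket.my-bucket.metadatastore.impl",
>         "org.apache.hadoop.fs.s3a.s3guard.NullMetadataStore");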
>
> It'll be time to get Hadoop 3.3.1 out the door for people to play with;
> it's got a fair few other s3a-side enhancements.
>
> People are still using S3Guard and it needs to be maintained for now, but
> we'll have to be fairly ruthless about what gets fixed and what gets
> closed as WONTFIX. I'm worried here about anyone using S3Guard against
> non-AWS consistent stores. If you are, send me an email.
>
> And so for releases/PRs, doing test runs with and without S3Guard is
> important. I've added an optional backwards-incompatible change recently
> for better scalability: HADOOP-13230, "S3A to optionally retain directory
> markers", which adds markers=keep/delete to the test matrix (config
> sketch below). This is a pain, though as you can choose two options at a
> time it's manageable.
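>
> A hedged sketch of the marker-retention switch from HADOOP-13230 (the
> property name is the one the patch adds; "delete" is the
> backwards-compatible default):
>
>     import org.apache.hadoop.conf.Configuration;
>
>     Configuration conf = new Configuration();
>     // "keep" skips marker deletion and scales better, but clients
>     // without the HADOOP-13230 code can misread retained markers.
>     conf.set("fs.s3a.directory.marker.retention", "keep"); // or "delete"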
>
> Apache HBase
> ============
>
> You still need the HBoss extension in front of the S3A connector to use
> Zookeeper to lock files during compaction.
>
>
> Apache Spark
> ============
>
> Any workflows which chained together reads directly after
> writes/overwrites of files should now work reliably with raw S3.
>
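> A minimal illustration of that read-after-write pattern (the path is
> hypothetical; fs is an S3A FileSystem as in the sketch further up):
>
>     import org.apache.hadoop.fs.*;
>
>     static void readAfterWrite(FileSystem fs) throws Exception {
>       Path p = new Path("s3a://my-bucket/data/part-0000");
>       // Overwrite the object, then read it straight back: consistent
>       // S3 returns the new bytes, with no stale data or cached 404s.
>       try (FSDataOutputStream out = fs.create(p, true)) {
>         out.writeUTF("v2");
>       }
>       try (FSDataInputStream in = fs.open(p)) {
>         System.out.println(in.readUTF()); // prints "v2"
>       }
>     }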
>
>    - The classic FileOutputCommitter commit-by-rename algorithms aren't
>    going to fail with FileNotFoundException during task commit.
>    - They will still use copy to rename work, so take O(data) time to
>    commit files. Without atomic directory rename, the v1 commit algorithm
>    can't isolate the commit operations of two task attempts, so it's
>    unsafe as well as very slow.
>    - The v2 commit is still slow, and has no isolation between task
>    attempt commits against any filesystem. If different task attempts are
>    generating unique filenames (possibly to work around S3 update
>    inconsistencies), it's not safe. Turn that option off.
>    - The S3A committers' algorithms are happy talking directly to S3,
>    but SPARK-33402 is needed to fix a race condition in the staging
>    committer.
>    - The "Magic" committer, which has relied on a consistent store, is
>    safe (see the sketch after this list). There's a fix in HADOOP-17318
>    for the staging committer; hadoop-aws builds with that fix in will
>    work safely with older Spark versions.
>
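> Enabling the magic committer is just a couple of S3A options; a minimal
> sketch (the second switch is only required on older 3.x releases):
>
>     import org.apache.hadoop.conf.Configuration;
>
>     Configuration conf = new Configuration();
>     // Select the magic committer for s3a:// output paths.
>     conf.set("fs.s3a.committer.name", "magic");
>     // Enable the "magic" path support the committer relies on.
>     conf.setBoolean("fs.s3a.committer.magic.enabled", true);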
>
> Any formats which commit work by writing a file with a unique name &
> updating a reference to it in a consistent store (iceberg &c) are still
> going to work great. Naming is irrelevant and commit-by-writing-a-file is
> S3's best story.
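>
> A hedged sketch of that pattern (TableCatalog and swapCurrentSnapshot
> are hypothetical stand-ins for the format's real metadata API, not
> Iceberg's actual classes):
>
>     import java.util.UUID;
>     import org.apache.hadoop.fs.*;
>
>     static void commitSnapshot(FileSystem fs, byte[] rowBytes,
>         TableCatalog catalog) throws Exception {
>       // Write the data under a unique, never-reused key...
>       Path data = new Path("s3a://my-bucket/table/data/"
>           + UUID.randomUUID() + ".parquet");
>       try (FSDataOutputStream out = fs.create(data, false)) {
>         out.write(rowBytes);
>       }
>       // ...then commit by updating a reference in a consistent store;
>       // the atomic step lives in the catalog, and S3 never renames.
>       catalog.swapCurrentSnapshot(data);
>     }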
>
> (+ SPARK-33135 and other uses of incremental listing will get the benefits
> of async prefetching of the next page of list results)
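>
> The incremental listing API in question, as a minimal sketch (the path
> is hypothetical):
>
>     import org.apache.hadoop.fs.*;
>
>     static void listAll(FileSystem fs) throws Exception {
>       RemoteIterator<LocatedFileStatus> it =
>           fs.listFiles(new Path("s3a://my-bucket/logs/"), true);
>       // hasNext()/next() page through the listing; with HADOOP-17400
>       // the next page of results is prefetched asynchronously.
>       while (it.hasNext()) {
>         System.out.println(it.next().getPath());
>       }
>     }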
>
> Distcp
> ======
>
> There'll be no cached 404s to break uploads, even if you don't have the
> relevant fixes to stop HEAD requests before creating files (HADOOP-16932
> and the revert of HADOOP-8143) or update inconsistency (HADOOP-16775).
>
>    - If your distcp version supports -direct, use it to avoid rename
>    performance penalties (example after this list).
>    - If your distcp version doesn't have HADOOP-15209 it can issue
>    needless DELETE calls to S3 after a big update, and end up being
>    throttled badly. Upgrade if you can.
>    - If people are seeing problems: issues.apache.org + component HADOOP
>    is where to file JIRAs; please tag the version of hadoop libraries you've
>    been running with.
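>
> Direct-write example (paths are hypothetical):
>
>     hadoop distcp -direct hdfs://namenode/datasets/logs s3a://my-bucket/logs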
>
>
> thanks,
>
> -Steve
>
