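Hi Mark,

if you add `fs.s3a.fast.upload.buffer` to your Flink configuration, it should be added to the respective Hadoop configuration when the file system is created. As far as I know, that option takes a buffer type (`disk`, `array`, or `bytebuffer`) rather than `true`, so a line such as `fs.s3a.fast.upload.buffer: bytebuffer` is probably what you need to avoid the on-disk buffer files. Note that I haven't tried it, but all keys with the prefixes "s3.", "s3a.", and "fs.s3a." should be forwarded.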
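To illustrate what "forwarded" means here, the following is a minimal sketch of prefix-based key forwarding using plain maps. It is not Flink's actual code: the class name `PrefixForwardingSketch`, the method `forward`, and the assumption that forwarded keys are normalized to the `fs.s3a.` prefix are all hypothetical, and the example values (`bytebuffer`, `50`, `4g`) are placeholders.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; Flink's real loader may differ in detail.
final class PrefixForwardingSketch {

    // The three prefixes named in the message above.
    static final String[] FLINK_PREFIXES = {"s3.", "s3a.", "fs.s3a."};

    // Assumption: every forwarded key ends up under the S3A prefix.
    static final String HADOOP_PREFIX = "fs.s3a.";

    // Copies each entry whose key starts with a known prefix into a new map,
    // replacing that prefix with HADOOP_PREFIX. Other keys are ignored.
    static Map<String, String> forward(Map<String, String> flinkConf) {
        Map<String, String> hadoopConf = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : flinkConf.entrySet()) {
            for (String prefix : FLINK_PREFIXES) {
                if (entry.getKey().startsWith(prefix)) {
                    String suffix = entry.getKey().substring(prefix.length());
                    hadoopConf.put(HADOOP_PREFIX + suffix, entry.getValue());
                    break;
                }
            }
        }
        return hadoopConf;
    }

    public static void main(String[] args) {
        Map<String, String> flinkConf = new LinkedHashMap<>();
        flinkConf.put("fs.s3a.fast.upload.buffer", "bytebuffer");
        flinkConf.put("s3.connection.maximum", "50");
        flinkConf.put("taskmanager.heap.size", "4g"); // no S3 prefix, not forwarded
        System.out.println(forward(flinkConf));
    }
}
```

If the forwarding works roughly like this, the option can live in the Flink configuration under any of the three prefixes, and no core-site.xml is involved.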
-- Arvid

On Mon, Jan 27, 2020 at 5:16 PM Piotr Nowojski <pi...@ververica.com> wrote:

> Hi,
>
> I think reducing the frequency of the checkpoints and decreasing the
> parallelism of the operators using the S3AOutputStream class would help to
> mitigate the issue.
>
> I don’t know about other solutions. I would suggest asking this question
> directly to Steve L. in the bug ticket [1], as he is the one who fixed the
> issue. If there is no workaround, maybe it would be possible to put some
> pressure on the Hadoop guys to backport the fix to older versions?
>
> Piotrek
>
> [1] https://issues.apache.org/jira/browse/HADOOP-15658
>
> On 27 Jan 2020, at 15:41, Cliff Resnick <cre...@gmail.com> wrote:
>
> I know from experience that Flink's shaded S3A FileSystem does not
> reference core-site.xml, though I don't remember offhand what file(s) it
> does reference. However, since it's shaded, maybe this could be fixed by
> building a Flink FS referencing 3.3.0? Last I checked I think it referenced
> 3.1.0.
>
> On Mon, Jan 27, 2020, 8:48 AM David Magalhães <speeddra...@gmail.com>
> wrote:
>
>> Does StreamingFileSink use core-site.xml? When I was using it, it didn't
>> load any configurations from core-site.xml.
>>
>> On Mon, Jan 27, 2020 at 12:08 PM Mark Harris <mark.har...@hivehome.com>
>> wrote:
>>
>>> Hi Piotr,
>>>
>>> Thanks for the link to the issue.
>>>
>>> Do you know if there's a workaround? I've tried setting the following in
>>> my core-site.xml:
>>>
>>> fs.s3a.fast.upload.buffer=true
>>>
>>> To try and avoid writing the buffer files, but the taskmanager breaks
>>> with the same problem.
>>>
>>> Best regards,
>>>
>>> Mark
>>> ------------------------------
>>> *From:* Piotr Nowojski <pi...@data-artisans.com> on behalf of Piotr
>>> Nowojski <pi...@ververica.com>
>>> *Sent:* 22 January 2020 13:29
>>> *To:* Till Rohrmann <trohrm...@apache.org>
>>> *Cc:* Mark Harris <mark.har...@hivehome.com>; flink-u...@apache.org <
>>> flink-u...@apache.org>; kkloudas <kklou...@apache.org>
>>> *Subject:* Re: GC overhead limit exceeded, memory full of DeleteOnExit
>>> hooks for S3a files
>>>
>>> Hi,
>>>
>>> This is probably a known issue in Hadoop [1]. Unfortunately it was only
>>> fixed in 3.3.0.
>>>
>>> Piotrek
>>>
>>> [1] https://issues.apache.org/jira/browse/HADOOP-15658
>>>
>>> On 22 Jan 2020, at 13:56, Till Rohrmann <trohrm...@apache.org> wrote:
>>>
>>> Thanks for reporting this issue Mark. I'm pulling Klou into this
>>> conversation who knows more about the StreamingFileSink. @Klou does the
>>> StreamingFileSink rely on DeleteOnExitHooks to clean up files?
>>>
>>> Cheers,
>>> Till
>>>
>>> On Tue, Jan 21, 2020 at 3:38 PM Mark Harris <mark.har...@hivehome.com>
>>> wrote:
>>>
>>> Hi,
>>>
>>> We're using flink 1.7.2 on an EMR cluster v emr-5.22.0, which runs
>>> hadoop v "Amazon 2.8.5". We've recently noticed that some TaskManagers fail
>>> (causing all the jobs running on them to fail) with a
>>> "java.lang.OutOfMemoryError: GC overhead limit exceeded". The taskmanager
>>> (and jobs that should be running on it) remain down until manually
>>> restarted.
>>>
>>> I managed to take and analyze a memory dump from one of the afflicted
>>> taskmanagers.
>>>
>>> It showed that 85% of the heap was made up of
>>> the java.io.DeleteOnExitHook.files hashset.
>>> The majority of the strings in
>>> that hashset (9041060 out of ~9041100) pointed to files that began with
>>> /tmp/hadoop-yarn/s3a/s3ablock
>>>
>>> The problem seems to affect jobs that make use of the StreamingFileSink
>>> - all of the taskmanager crashes have been on a taskmanager running at
>>> least one job using this sink, and a cluster running only a single
>>> taskmanager / job that uses the StreamingFileSink crashed with the GC
>>> overhead limit exceeded error.
>>>
>>> I've had a look for advice on handling this error more broadly without
>>> luck.
>>>
>>> Any suggestions or advice gratefully received.
>>>
>>> Best regards,
>>>
>>> Mark Harris
>>>
>>>
>>>
>>> The information contained in or attached to this email is intended only
>>> for the use of the individual or entity to which it is addressed. If you
>>> are not the intended recipient, or a person responsible for delivering it
>>> to the intended recipient, you are not authorised to and must not disclose,
>>> copy, distribute, or retain this message or any part of it. It may contain
>>> information which is confidential and/or covered by legal professional or
>>> other privilege under applicable law.
>>>
>>> The views expressed in this email are not necessarily the views of
>>> Centrica plc or its subsidiaries, and the company, its directors, officers
>>> or employees make no representation or accept any liability for its
>>> accuracy or completeness unless expressly stated to the contrary.
>>>
>>> Additional regulatory disclosures may be found here:
>>> https://www.centrica.com/privacy-cookies-and-legal-disclaimer#email
>>>
>>> PH Jones is a trading name of British Gas Social Housing Limited.
>>> British Gas Social Housing Limited (company no: 01026007), British Gas
>>> Trading Limited (company no: 03078711), British Gas Services Limited
>>> (company no: 3141243), British Gas Insurance Limited (company no:
>>> 06608316), British Gas New Heating Limited (company no: 06723244), British
>>> Gas Services (Commercial) Limited (company no: 07385984) and Centrica
>>> Energy (Trading) Limited (company no: 02877397) are all wholly owned
>>> subsidiaries of Centrica plc (company no: 3033654). Each company is
>>> registered in England and Wales with a registered office at Millstream,
>>> Maidenhead Road, Windsor, Berkshire SL4 5GD.
>>>
>>> British Gas Insurance Limited is authorised by the Prudential Regulation
>>> Authority and regulated by the Financial Conduct Authority and the
>>> Prudential Regulation Authority. British Gas Services Limited and Centrica
>>> Energy (Trading) Limited are authorised and regulated by the Financial
>>> Conduct Authority. British Gas Trading Limited is an appointed
>>> representative of British Gas Services Limited which is authorised and
>>> regulated by the Financial Conduct Authority.