Re: SAX2 driver class org.apache.xerces.parsers.SAXParser not found

2020-09-12 Thread Averell
Hello Robert, I'm not sure why the screenshot I attached in the previous post was not shown, so I'm re-attaching it in this post. As the screenshot shows, part-1-33, part-1-34, and part-1-35 have already been closed, but the temp file for part-1-33 is still there. Thanks and regards, Averell

Re: SAX2 driver class org.apache.xerces.parsers.SAXParser not found

2020-09-11 Thread Robert Metzger
Hi Averell, as far as I know these tmp files should be removed when the Flink job is recovering. So you should have these files around only for the latest incomplete checkpoint while recovery has not completed yet. On Tue, Sep 1, 2020 at 2:56 AM Averell wrote: > Hello Robert, Arvid, > > As I am

Re: SAX2 driver class org.apache.xerces.parsers.SAXParser not found

2020-08-31 Thread Averell
Hello Robert, Arvid, I am running on EMR, and AWS currently only supports version 1.10. I tried both solutions that you suggested ((i) copying a SAXParser implementation to the plugins folder and (ii) using the S3FS Plugin from 1.10.1), and both worked - I could have successful checkpoints. Ho

Re: SAX2 driver class org.apache.xerces.parsers.SAXParser not found

2020-08-27 Thread Arvid Heise
Hi Averell, This is a known bug [1], caused by the AWS S3 library in use not respecting the classloader [2]. The best solution is to upgrade to 1.10.1 (or take the s3-hadoop jar from 1.10.1). Don't try to put Xerces manually anywhere. [1] https://issues.apache.org/jira/browse/FLINK-16014 [2] https:

Re: SAX2 driver class org.apache.xerces.parsers.SAXParser not found

2020-08-27 Thread Robert Metzger
Hi, I guess you've loaded the S3 filesystem using the S3 FS plugin. You need to put the right jar file containing the SAX2 driver class into the plugin directory where you've also put the S3 filesystem plugin. You can probably find out the name of the right SAX2 jar file from your local setup wher
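
One way to find out which jar provides the SAX2 driver class on a local setup is to ask the JVM where it loaded the class from. The following is only a minimal sketch, not something posted in the thread: the class name `SaxDriverLocator` and the method `locate` are hypothetical, and it assumes the program is run with the same classpath as the setup where the class is known to resolve.

```java
import java.security.CodeSource;

// Hypothetical helper (not part of Flink or Xerces): reports where a class was loaded from.
final class SaxDriverLocator {

    // Returns the jar/directory URL of the class, or a short note if it is
    // missing or comes from the JDK runtime itself (no code source).
    static String locate(String className) {
        try {
            // Resolve through the context classloader without initializing the class.
            ClassLoader loader = Thread.currentThread().getContextClassLoader();
            Class<?> cls = Class.forName(className, false, loader);
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            if (src == null || src.getLocation() == null) {
                return "JDK runtime (no jar)";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException e) {
            return "not found";
        }
    }

    public static void main(String[] args) {
        // Default to the driver class named in the thread subject.
        String name = args.length > 0 ? args[0] : "org.apache.xerces.parsers.SAXParser";
        System.out.println(name + " -> " + locate(name));
    }
}
```

If this prints a jar location, that is the file that contains the class on that classpath. If it prints "not found", the class is not visible there, which by itself says nothing about what the plugin classloader sees.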