I'm seeing a different exception when producing to Kinesis, which seems to be 
related to back-pressure handling:


java.lang.RuntimeException: An exception was thrown while processing a record: 
Rate exceeded for shard shardId-000000000026 in stream turar-test-output under 
account 9999.

Rate exceeded for shard shardId-000000000026 in stream turar-test-output under 
account 9999.

Rate exceeded for shard shardId-000000000026 in stream turar-test-output under 
account 9999.

Rate exceeded for shard shardId-000000000026 in stream turar-test-output under 
account 9999.

Record has reached expiration


When this exception occurs, the entire job restarts after cancelling all the 
workers. We have the restart strategy configured with 3 attempts, so after 3 
restarts the entire application dies. We're also running 1.4.x on EMR, and 
rebuilding with `-Daws.kinesis-kpl-version=0.12.6` as suggested below didn't 
seem to help.
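
For reference, the restart behaviour is configured roughly like this (a minimal 
sketch of the programmatic equivalent; the delay value here is illustrative):

import java.util.concurrent.TimeUnit;
import org.apache.flink.api.common.restartstrategy.RestartStrategies;
import org.apache.flink.api.common.time.Time;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// Fixed-delay restart strategy: the job gets 3 restart attempts before it fails for good.
env.setRestartStrategy(RestartStrategies.fixedDelayRestart(
        3,                                // number of restart attempts
        Time.of(10, TimeUnit.SECONDS)));  // delay between attempts (illustrative)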

Is there a recommended way to handle these kinds of exceptions without the 
entire application getting killed and without losing data?
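
Would tuning the KPL's back-pressure settings through the producer's Properties 
be the right direction? Something like the sketch below (the values are made up, 
and I'm not sure whether the 1.4 connector even forwards these keys to the KPL):

import java.util.Properties;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.connectors.kinesis.FlinkKinesisProducer;
import org.apache.flink.streaming.connectors.kinesis.config.AWSConfigConstants;

Properties producerConfig = new Properties();
producerConfig.setProperty(AWSConfigConstants.AWS_REGION, "us-east-1"); // illustrative region
// KPL settings (key names from the KPL's KinesisProducerConfiguration):
producerConfig.setProperty("RecordTtl", "60000"); // keep queued records longer before they expire
producerConfig.setProperty("RateLimit", "50");    // cap sending at 50% of the per-shard limit

FlinkKinesisProducer<String> producer =
        new FlinkKinesisProducer<>(new SimpleStringSchema(), producerConfig);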

Thanks,
Turar


From: "Tzu-Li (Gordon) Tai" <tzuli...@apache.org>
Date: Friday, June 15, 2018 at 4:48 AM
To: "dyana.rose" <dyana.r...@salecycle.com>, "user@flink.apache.org" 
<user@flink.apache.org>
Subject: Re: Problem producing to Kinesis

@Alexey

If you’d like to stick to 1.4.x for now, you can just do:
`mvn clean install -Daws.kinesis-kpl-version=0.12.6` when building the Kinesis 
connector, to upgrade the KPL version used.

I think we should add this to the documentation. Here’s a JIRA to track that - 
https://issues.apache.org/jira/browse/FLINK-9595.

Cheers,
Gordon

On 15 June 2018 at 9:48:33 AM, dyana.rose (dyana.r...@salecycle.com) wrote:
Porting and rebuilding 1.4.x isn't a big issue. I've done it on our fork, back 
when I reported the upcoming issue, and we're running fine.

https://github.com/SaleCycle/flink/commit/d943a172ae7e6618309b45df848d3b9432e062d4

Ignore the circleci file, of course; the rest are the changes that I 
back-ported from 1.5.

Dyana

On 2018/06/14 20:44:10, Alexey Tsitkin <alexey.tsit...@gmail.com> wrote:
> Thanks Gordon!
>
> This indeed seems like the cause of the issue.
> I've run the program using 1.5.0, after building the appropriate connector,
> and it's working as expected.
>
> Wondering how difficult it would be to upgrade the 1.4 connector to a newer KPL
> version, as this kind of blocks running on EMR and producing to Kinesis...
> :-)
>
> Alexey
>
> On Thu, Jun 14, 2018 at 12:20 PM Tzu-Li (Gordon) Tai <tzuli...@apache.org>
> wrote:
>
> > Hi,
> >
> > This could be related:
> > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-1-4-and-below-STOPS-writing-to-Kinesis-after-June-12th-td22687.html#a22701
> > .
> >
> > Put shortly, the KPL library version used by default in the 1.4.x Kinesis
> > connector is no longer supported by AWS.
> > Users need to upgrade to a KPL version >= 0.12.6 and rebuild the Kinesis
> > connector for the producer to work.
> >
> > We should probably add a warning about this in the Kinesis connector docs.
> >
> > Cheers,
> > Gordon
> >
> > On 14 June 2018 at 9:16:04 PM, Lasse Nedergaard (lassenederga...@gmail.com)
> > wrote:
> >
> > Hi.
> >
> > We see the same error, and to my understanding it's a known issue on
> > Amazon's side. See
> > https://github.com/awslabs/amazon-kinesis-producer/issues/39#issuecomment-396219522
> >
> > We don't have a workaround and haven't found the cause of the exception.
> > It is one of the reasons why we are moving to Kafka in the near future.
> >
> > Med venlig hilsen / Best regards
> > Lasse Nedergaard
> >
> >
> > On 14 Jun 2018, at 20:24, Alexey Tsitkin <alexey.tsit...@gmail.com> wrote:
> >
> > Hi,
> > I'm trying to run a simple program which consumes from one Kinesis stream,
> > does a simple transformation, and produces to another stream.
> > Running on Flink 1.4.0.
> >
> > Code can be seen here (if needed I can also paste it directly on this
> > thread):
> >
> > https://stackoverflow.com/questions/50847164/flink-producing-to-kinesis-not-working
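> >
> > For reference, the structure is roughly the following (a trimmed sketch with
> > hypothetical class, stream, and region names; the full code is at the link
> > above):
> >
> > import java.util.Properties;
> >
> > import org.apache.flink.api.common.functions.MapFunction;
> > import org.apache.flink.api.common.serialization.SimpleStringSchema;
> > import org.apache.flink.streaming.api.datastream.DataStream;
> > import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
> > import org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer;
> > import org.apache.flink.streaming.connectors.kinesis.FlinkKinesisProducer;
> > import org.apache.flink.streaming.connectors.kinesis.config.AWSConfigConstants;
> >
> > public class KinesisCopyJob {
> >     public static void main(String[] args) throws Exception {
> >         Properties config = new Properties();
> >         config.setProperty(AWSConfigConstants.AWS_REGION, "eu-west-1");
> >
> >         StreamExecutionEnvironment env =
> >                 StreamExecutionEnvironment.getExecutionEnvironment();
> >
> >         // Source: read strings from the input stream.
> >         DataStream<String> input = env.addSource(
> >                 new FlinkKinesisConsumer<>("input-stream", new SimpleStringSchema(), config));
> >
> >         // Simple transformation.
> >         DataStream<String> transformed = input.map(new MapFunction<String, String>() {
> >             @Override
> >             public String map(String value) {
> >                 return value.toUpperCase();
> >             }
> >         });
> >
> >         // Sink: write the transformed strings to the output stream.
> >         FlinkKinesisProducer<String> producer =
> >                 new FlinkKinesisProducer<>(new SimpleStringSchema(), config);
> >         producer.setDefaultStream("output-stream");
> >         producer.setDefaultPartition("0");
> >         transformed.addSink(producer);
> >
> >         env.execute("kinesis-copy");
> >     }
> > }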
> >
> > Consuming the source stream works great, but trying to use the producer
> > causes the exception:
> >
> > org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.DaemonException:
> > The child process has been shutdown and can no longer accept messages.
> >     at org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.Daemon.add(Daemon.java:176)
> >     at org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.KinesisProducer.addUserRecord(KinesisProducer.java:477)
> >     at org.apache.flink.streaming.connectors.kinesis.FlinkKinesisProducer.invoke(FlinkKinesisProducer.java:248)
> >     ...
> >
> >
> > Has anyone seen something similar?
> > Or is there any way to debug the daemon itself, to understand the source
> > of the error?
> >
> > As you can see, this is a trivial example, which I mostly copy-pasted from
> > the documentation.
> >
> > Thanks,
> > Alexey
> >
> >
>
