Hi,
I am trying to access DynamoDB Streams from a different AWS account, but I am
getting a ResourceNotFoundException when the TaskManager tries to access the
streams. I have provided the following configuration:
*dynamodbStreamsConsumerConfig.setProperty(ConsumerConfigConstants.AWS_ROLE_CR
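For reference, a minimal sketch of how cross-account access to DynamoDB Streams
is typically configured with the assume-role credentials provider; the region,
role ARN, and stream ARN below are placeholders, not values from this message,
and depending on the connector version the ConsumerConfigConstants.AWS_ROLE_CREDENTIALS_PROVIDER
property (as in the truncated line above) may also need to be set:

import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kinesis.FlinkDynamoDBStreamsConsumer;
import org.apache.flink.streaming.connectors.kinesis.config.ConsumerConfigConstants;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

Properties config = new Properties();
config.setProperty(ConsumerConfigConstants.AWS_REGION, "us-east-1");
// Assume a role that lives in the account owning the table and its stream.
config.setProperty(ConsumerConfigConstants.AWS_CREDENTIALS_PROVIDER, "ASSUME_ROLE");
config.setProperty(ConsumerConfigConstants.AWS_ROLE_ARN,
        "arn:aws:iam::123456789012:role/cross-account-ddb-streams-reader");
config.setProperty(ConsumerConfigConstants.AWS_ROLE_SESSION_NAME, "flink-ddb-streams");

// The DynamoDB Streams consumer takes the stream ARN rather than a stream name.
env.addSource(new FlinkDynamoDBStreamsConsumer<>(
        "arn:aws:dynamodb:us-east-1:123456789012:table/my-table/stream/2019-10-24T00:00:00.000",
        new SimpleStringSchema(),
        config))
   .print();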
Yes, you can use it in another job. The uid needs only to be unique within a
job.
> On Oct 24, 2019, at 5:42 AM, John Smith wrote:
>
> When setting uid() of an operator does it have to be unique across all jobs
> or just unique within a job?
>
> For example can I use env.addSource(myKafkaConsumer).uid("kafka-consumer")
> in another job?
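To make the scoping concrete, a small sketch with a trivial source standing in
for the Kafka consumer from the question:

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class JobA {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // "kafka-consumer" only has to be unique among the operators of this job;
        // a completely separate job can reuse the exact same uid for its own source.
        env.fromElements("a", "b", "c")
                .uid("kafka-consumer")
                .print();
        env.execute("job-a");
    }
}

The uid is only used to match operators to their state within that job's
savepoints, which is why reusing the same string across jobs is harmless.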
Hello,
Suppose I am using a *nondeterministic* time-based partitioning scheme (e.g.
Flink processing time) to bucket S3 objects via the *BucketAssigner*,
configured through the *BulkFormatBuilder* of StreamingFileSink.
Suppose that after an S3 MPU has completed, but *before* Flink internally
commits (w
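For context, a minimal sketch of the kind of nondeterministic,
processing-time-based assigner being described; this is illustrative only (the
date pattern and generic element type are assumptions, not code from this
message):

import java.text.SimpleDateFormat;
import java.util.Date;

import org.apache.flink.core.io.SimpleVersionedSerializer;
import org.apache.flink.streaming.api.functions.sink.filesystem.BucketAssigner;
import org.apache.flink.streaming.api.functions.sink.filesystem.bucketassigners.SimpleVersionedStringSerializer;

public class ProcessingTimeBucketAssigner<T> implements BucketAssigner<T, String> {

    @Override
    public String getBucketId(T element, Context context) {
        // Wall-clock time at write, so a replay after a failure can land in a different bucket.
        return new SimpleDateFormat("yyyy-MM-dd--HH")
                .format(new Date(context.currentProcessingTime()));
    }

    @Override
    public SimpleVersionedSerializer<String> getSerializer() {
        return SimpleVersionedStringSerializer.INSTANCE;
    }
}

Such an assigner would be attached with
StreamingFileSink.forBulkFormat(path, writerFactory).withBucketAssigner(new ProcessingTimeBucketAssigner<>()).build().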
Hi everyone,
I have a pipeline where I union several streams. I want to test it and don't
want to populate one of the streams. I'm usually creating streams with:
DataStreamTestBase.createTestStreamWith(event).close();
The above statement creates a stream and puts the `event` inside. But in my
ca
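One possible workaround, purely as a sketch and not something stated in this
thread: build the empty input with plain Flink and give the type explicitly,
since it cannot be inferred from an empty collection, then union it with the
populated test stream:

import java.util.Collections;

import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

DataStream<String> populated = env.fromElements("event-1", "event-2");

// An empty but correctly typed stream; it never emits anything, so it does not
// disturb assertions on the unioned result.
DataStream<String> empty = env.fromCollection(
        Collections.<String>emptyList(), TypeInformation.of(String.class));

DataStream<String> unioned = populated.union(empty);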
When setting uid() of an operator does it have to be unique across all jobs
or just unique within a job?
For example can I use env.addSource(myKafkaConsumer).uid("kafka-consumer")
in another job?
Yeah thanks for the responses. We’re in the process of testing 1.9.1 after we
found https://issues.apache.org/jira/browse/FLINK-12342 as the cause of the
original issue. FLINK-9455 makes sense as to why it didn’t work on legacy mode.
From: Till Rohrmann
Sent: Wednesday, October 23, 2019 5:32
Hello all,
I am running into issues at the moment trying to print my DataStreams to an S3
bucket using writeAsText(“s3://bucket/result.json”) in my Flink job. I used
print() on the same DataStream and I see the output I am looking for in
standard output. I first confirm that my datastream has d
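For reference, the minimal shape of such a job (the bucket path is a
placeholder); note that sinks like writeAsText only produce output once
env.execute() actually runs the job, and writing to s3:// requires the S3
filesystem to be available and configured:

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

DataStream<String> results = env.fromElements("{\"k\": 1}", "{\"k\": 2}");

// print() goes to the TaskManagers' stdout; writeAsText() goes to the target filesystem.
results.print();
results.writeAsText("s3://bucket/result.json");

env.execute("write-to-s3");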
Hello,
I am using StreamingFileSink, KafkaConsumer010 as a Kafka -> S3 connector
(Flink 1.8.1, Kafka 0.10.1).
The setup is simple:
Data is bucketed first by datetime (granularity of one day), then by Kafka
partition.
I am using *event time* (the Kafka timestamp, recorded at the time of creation
Currently, we don't try to ensure that the key groups are spread as evenly as
possible across the cluster. As a workaround I would suggest increasing the
number of key groups (the max parallelism) or changing the key function.
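A rough sketch of the first workaround (the value is only an example; the
number of key groups equals the job's maximum parallelism):

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// More key groups means smaller groups, which tend to spread more evenly
// across the parallel subtasks.
env.setMaxParallelism(720);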
Cheers,
Till
On Wed, Oct 23, 2019 at 1:42 PM Piotr Nowojski wrote:
> Hi,
>
> This is a
Hi A.V.
To add a few more points to the previous two answers: there is no clear answer
to this question. Besides resource constraints, it depends on the size of
the messages you are dealing with and the complexity of the logic. If you
don't consider a lot of extra factors, look at the performance o
Hi Farouk,
Flink 1.9.1 was released only recently, so the community may not have had time
to provide the corresponding Dockerfiles yet. I can give you some information:
Flink's official Dockerfiles are maintained in this repository. [1]
I have seen many versions of Dockerfiles contributed by patricklucas[
There are a lot of variables: how many cores are allocated, how much RAM, etc.
There are companies doing billions of events per day and more.
Tell your boss it has proven to have extremely flat horizontal scaling, meaning
you can get it to process almost any number of events given sufficient hardware.
Hello A.V.
It depends on the underlying resources you are planning for your jobs; I
mean memory and processing will play a principal role in this answer.
Keep in mind you can break down your job into a number of parallel
tasks per environment, or even for a specific task within your p
Hi,
My boss wants to know how many events Flink can process, analyse, etc. per
second. I can't find this in the documentation.
Hi
Is there an official image published for Flink 1.9.1 on Docker Hub?
I can't find anything, although there is one for the older minor version 1.8.2.
Thanks
Farouk
Hi A.V.,
*// When I run the below code I get this error: Caused by:
java.lang.RuntimeException: Rowtime timestamp is null. Please make sure
that a proper TimestampAssigner is defined and the stream environment uses
the EventTime time characteristic. //*
You need to assign timestamps and watermarks to d
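A minimal sketch of what that usually looks like in the DataStream API; the
Tuple2 field f1 stands in for the unixDateTime (epoch-millisecond) column from
the question, and the values and bound are placeholders:

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
import org.apache.flink.streaming.api.windowing.time.Time;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

DataStream<Tuple2<String, Long>> rows =
        env.fromElements(Tuple2.of("a", 1571875200000L), Tuple2.of("b", 1571875201000L));

// Extract the event timestamp and emit watermarks that allow 5 seconds of lateness.
DataStream<Tuple2<String, Long>> withTimestamps = rows.assignTimestampsAndWatermarks(
        new BoundedOutOfOrdernessTimestampExtractor<Tuple2<String, Long>>(Time.seconds(5)) {
            @Override
            public long extractTimestamp(Tuple2<String, Long> row) {
                return row.f1;
            }
        });

// The stream can now be used with event-time windows or registered as a table
// with a rowtime attribute.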
Hi,
Have you checked the build of your job for dependency convergence errors, either
automatically or manually (with the `mvn dependency:tree` command)? Look for
version clashes among the dependencies that are pulling in the
`io/grpc/netty/shaded/io/netty/channel/AbstractChannel` class (grpc-netty?).
It co
Hi,
This is a known issue of Flink. For example, key groups can have sizes +/- 1, and
they are currently randomly distributed across the cluster, so some machines
will get more keys to handle than the others. If the number of keys is
relatively small, like 3 keys per key group, the load differenc
Hi Regina,
When using the FLIP-6 mode, you can control how long it takes for an idle
TaskManager to be released via resourcemanager.taskmanager-timeout. By
default it is set to 30s.
In the Flink version you are using, 1.6.4, we do not support TaskManagers
with multiple slots properly [1]. The co
Hi Chan,
After FLIP-6, the Flink ResourceManager dynamically allocates resources from
YARN on demand.
What's your Flink version? On the current code base, if the number of pending
containers in the ResourceManager is zero, then it will release all the excess
containers. Could you please check the
"Remaining pendi
Hi,
I am trying to create a tumbling time window of 2 rows each in Flink Java. It
must be based on the dateTime (TIMESTAMP(3) datatype) or unixDateTime (BIGINT
datatype) column. I've added the code of two different versions below; the
error messages I get are placed above the code.
When I print the
Hi Constantinos,
I think your analysis is correct: if you have a multi-tenant scenario but
there is no distinction between tenants in Kafka, then Flink can't treat
different tenants differently. Differences in the data volume of different
tenants easily lead to data hotspots.
A compromise is
*Hi All,*
*[Explanation]*
Two tables, say lineitem and orders:
Table orderstbl = bsTableEnv.fromDataStream(orders, "a,b,c,d,e,f,g,h,i,orders.rowtime");
Table lineitemtbl = bsTableEnv.fromDataStream(lineitem, "a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,lineitem.rowtime");
bsTableEnv.registerTable("Orders", orderst