Interesting. So if I understand correctly, basically you limited the
parallelism of the sources in order to avoid running the job with constant
backpressure, and then scaled up the windows to maximize throughput.
On Tue, May 4, 2021 at 11:23 PM vishalovercome wrote:
> In one of my jobs, windowin
Hi Sumeet,
This feature is supported in 1.13.0 which was just released and so there is no
documentation about it in 1.12.
Regards,
Dian
> On May 4, 2021, at 2:09 AM, Sumeet Malhotra wrote:
>
> Hi,
>
> Is keyed state [1] supported by PyFlink yet? I can see some code for it in
> the Flink master branch, bu
Hi Prasanna,
in the latest Flink version (1.13.0) I couldn't find these dependencies.
Which version of Flink are you looking at? What you could check is whether
one of these dependencies is contained in one of Flink's shaded
dependencies [1].
[1] https://github.com/apache/flink-shaded
Cheers,
Till
Thanks for managing the release. +1. I like the focus on improving operations
with this version.
From: "Matthias Pohl"
To: "Etienne Chauchot"
CC: "dev" , "Dawid Wysakowicz" ,
"user" , annou...@apache.org
Sent: Tuesday, May 4, 2021 21:53:31
Subject: Re: [ANNOUNCE] Apache Flink 1.13.0
Thanks Dian. Yes, I hadn't looked at the 1.13.0 documentation earlier.
On Wed, May 5, 2021 at 1:46 PM Dian Fu wrote:
> Hi Sumeet,
>
> This feature is supported in 1.13.0 which was just released and so there
> is no documentation about it in 1.12.
>
> Regards,
> Dian
>
> On May 4, 2021, at 2:09 AM, Sumeet Malhotra wrote:
Hi,
I would like to split streamed data from Kafka into 2 streams based on some
filter criteria, using PyFlink Table API. As described here [1], a way to
do this is to use .filter() which should split the stream for parallel
processing.
Does this approach work in Table API as well? I'm doing the
Hi Sumeet,
Yes, this approach also works in Table API.
Could you share which API you use to execute the job? For jobs with multiple
sinks, StatementSet should be used. You could refer to [1] for more details on
this.
Regards,
Dian
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1
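For reference, a minimal sketch of that pattern in the Java Table API (table and sink names here are hypothetical; PyFlink's TableEnvironment offers the same methods as create_statement_set/add_insert):

```java
import static org.apache.flink.table.api.Expressions.$;

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.StatementSet;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;

public class SplitByFilter {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
            EnvironmentSettings.newInstance().inStreamingMode().build());

        // Hypothetical source table, registered elsewhere via DDL.
        Table source = tEnv.from("transactions");

        // Split one input into two streams with complementary filters.
        Table large = source.filter($("amount").isGreater(100));
        Table small = source.filter($("amount").isLessOrEqual(100));

        // One StatementSet so both inserts run as a single job.
        StatementSet set = tEnv.createStatementSet();
        set.addInsert("sink_a", large);
        set.addInsert("sink_b", small);
        set.execute();
    }
}
```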
Hi Robert,
Due to the performance issues with the state processor API, I would probably
like to give it up and am trying StreamingFileSink in a streaming manner.
However, I need to store the files on GCS, where I encountered the error
below. It looks like Flink doesn't support GCS for
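For context, the StreamingFileSink part on its own looks roughly like this (the bucket name is hypothetical, and writing to a gs:// path additionally requires a GCS filesystem implementation, such as the Hadoop GCS connector, to be available to Flink):

```java
import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;

// Row-format sink writing strings to a (hypothetical) GCS bucket.
StreamingFileSink<String> sink = StreamingFileSink
    .forRowFormat(new Path("gs://my-bucket/output"),
                  new SimpleStringEncoder<String>("UTF-8"))
    .build();
```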
Hi,
can you check the client log in the "log/" directory?
The Flink client will try to access the K8s API server to retrieve the
endpoint of the jobmanager. For that, the pod needs to have permissions
(through a service account) to make such calls to K8s. My hope is that the
logs or previous messag
One of these (plexus-utils) is afaik used by maven, so the scanner is
potentially scanning the wrong thing. Or you are scanning all
dependencies downloaded during the build of Flink, including everything
used by various plugins of the build process & maven itself.
On 5/5/2021 11:08 AM, Till Ro
Hi,
> Could it be somehow partition info isn't up to date on TM when job is
> restarting?
Partition info should be up to date or become so eventually - but this
is assuming that JM is able to detect the failure.
The latter may not be the case, as Sihan You wrote previously:
> The strange thing i
To all Scala 2.12 users,
Due to a mistake during the release process of Flink 1.12.3, jars
intended to be built against Scala 2.12 were actually built against
Scala 2.11.
This affects all jars published to maven central; the convenience
binaries are not affected.
Scala 2.12 users are advised
Can someone provide a bit more explanation of why replacing
com.google.common.base.Objects.hashCode() with toString().hashCode()
makes it work?
Hi,
I'm assuming it's just a workaround for changing fields. The string
representation happens to be stable while the underlying values change.
It's best practice to use completely immutable types. If you have similar
issues, you should double-check that nothing can be changed in your data
type or
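As an illustration of that best practice, a hypothetical fully immutable key type; because no field can change after construction, its hashCode is stable for the lifetime of the record:

```java
// Hypothetical key type: all fields final, no setters, hashCode derived
// only from immutable state.
public final class OrderKey {
    private final String orderId;

    public OrderKey(String orderId) {
        this.orderId = orderId;
    }

    @Override
    public int hashCode() {
        return orderId.hashCode(); // stable: orderId can never change
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof OrderKey && ((OrderKey) o).orderId.equals(orderId);
    }
}
```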
Thanks for the reply. Here is an updated exception with DEBUG on. It
appears to be timing out:
2021-05-05 16:56:19,700 DEBUG
org.apache.flink.kubernetes.kubeclient.FlinkKubeClientFactory [] -
Setting namespace of Kubernetes client to cmdaa
2021-05-05 16:56:19,700 DEBUG
org.apache.flink.kubernetes.
Well, I was thinking you could have avoided overwhelming your internal
services by using something like Flink's async i/o operator, tuned to limit
the total number of concurrent requests. That way the pipeline could have
uniform parallelism without overwhelming those services, and then you'd
rely o
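A rough sketch of that setup (the service client and its query method are hypothetical placeholders):

```java
import java.util.Collections;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

import org.apache.flink.streaming.api.datastream.AsyncDataStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.async.ResultFuture;
import org.apache.flink.streaming.api.functions.async.RichAsyncFunction;

public class ThrottledEnrichment extends RichAsyncFunction<String, String> {

    @Override
    public void asyncInvoke(String key, ResultFuture<String> resultFuture) {
        // MyServiceClient.query() is a hypothetical non-blocking call.
        CompletableFuture
            .supplyAsync(() -> MyServiceClient.query(key))
            .thenAccept(result -> resultFuture.complete(Collections.singleton(result)));
    }

    public static DataStream<String> enrich(DataStream<String> input) {
        // The final capacity argument (100) caps in-flight requests, so the
        // pipeline backpressures instead of overwhelming the service.
        return AsyncDataStream.unorderedWait(
            input, new ThrottledEnrichment(), 30, TimeUnit.SECONDS, 100);
    }
}
```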
Hi Vishal,
WRT “bring down our internal services” - a common pattern with making requests
to external services is to measure latency, and throttle (delay) requests in
response to increased latency.
You’ll see this discussed frequently on web crawling forums as an auto-tuning
approach.
Typical
Okay, it appears to have resolved 10.43.0.1:30081 as the address of the
JobManager. Most likely, the container cannot access this address. Can you
validate this from within the container?
If I understand the Flink documentation correctly, you should be able to
manually specify rest.address, rest.
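A sketch of that override in flink-conf.yaml (both values are placeholders to adjust to an address the client can actually reach):

```yaml
rest.address: <reachable-jobmanager-address>
rest.port: 8081
```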
Hi Ragini,
Since this exception is coming from the HBase client, I assume the issue
has nothing to do with Flink directly.
I would recommend carefully studying the HBase client configuration
parameters, maybe set up a simple Java application that "hammers" data into
HBase at a maximum rate to under
Hi,
Bit of background: I have a stream of customers who have purchased some
product, reading these transactions from a Kafka topic. I want to aggregate
the number of products the customer has purchased in a particular duration
(say 10 seconds) and write to a sink.
I am using session windows to ac
Yes. While back-pressure would eventually ensure high throughput, hand-tuning
parallelism became necessary because the job with high source parallelism
would immediately bring down our internal services - not giving Flink enough
time to adjust the in-rate. Plus, running all operators at such a hi
Hi Arvid,
I sent a separate mail titled "Session Windows - not working as expected";
closing this thread.
Please have a look when you have a few minutes, much appreciated.
Regards,
Swagat
On Wed, May 5, 2021 at 7:24 PM Swagat Mishra wrote:
> Hi Arvid,
>
> Tried a small POC to reproduce the b
Hi Arvid,
I sent a separate mail titled "Session Windows - not working as expected"
(to the user community).
All other details are here if you need them; closing this thread.
Please have a look when you have a few minutes, much appreciated.
Regards,
Swagat
On Thu, May 6, 2021 at 1:50 AM Swagat Mis
Adding the code for CustomWatermarkGenerator:

@Override
public void onEvent(Customer customer, long timestamp, WatermarkOutput watermarkOutput) {
    // Track the highest event time seen so far.
    currentMaxTimestamp = Math.max(currentMaxTimestamp, customer.getEventTime());
}

@Override
public void onPeriodicEmit(WatermarkOutput watermarkOutput) {
    // Periodically emit a watermark trailing the highest event time seen.
    watermarkOutput.emitWatermark(new Watermark(currentMaxTimestamp - 1));
}
This seems to be working fine in processing time but doesn't work in event
time. Is there an issue with the way the watermark is defined or do we
need to set up timers?
Please advise.
WORKING:
customerStream
    .keyBy((KeySelector) Customer::getIdentifier)
    .window(ProcessingTimeSessionWindows.withGap(Time.seconds(10)))
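For the event-time variant, a minimal sketch of how the custom generator could be wired in (the 10-second gap and the aggregate function are assumptions, not taken from the thread):

```java
customerStream
    .assignTimestampsAndWatermarks(
        WatermarkStrategy
            .<Customer>forGenerator(ctx -> new CustomWatermarkGenerator())
            // The assigner tells Flink which field carries the event time.
            .withTimestampAssigner((customer, ts) -> customer.getEventTime()))
    .keyBy(Customer::getIdentifier)
    .window(EventTimeSessionWindows.withGap(Time.seconds(10)))
    .aggregate(new ProductCountAggregate()); // hypothetical AggregateFunction
```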
Thanks Dawid & Guowei for the great work, thanks everyone involved.
Best,
Leonard
> On May 5, 2021, at 17:12, Theo Diefenthal wrote:
>
> Thanks for managing the release. +1. I like the focus on improving operations
> with this version.
>
> From: "Matthias Pohl"
> To: "Etienne Chauchot"
> CC: "dev" ,
Hi Yik San,
You can check whether the execution environment used is
`LocalStreamEnvironment`; you can get the class object corresponding to
the underlying Java object through py4j in PyFlink. You can take a look
at the example I wrote below; I hope it helps.
```
from pyflink.table impo
Hi Shipeng,
Matthias is correct. FLINK-18202 should address this topic. There is
already a pull request there which is in good shape. You can also download
the PR and build the format jar yourself, and then it should work with
Flink 1.12.
Best,
Jark
On Mon, 3 May 2021 at 21:41, Matthias Pohl wr
It seems that you are using the NodePort to expose the rest service. If you
only want to access the Flink UI/rest inside the K8s cluster,
then I would suggest setting "kubernetes.rest-service.exposed.type" to
"ClusterIP", because we are using the K8s master node to
construct the JobManager rest endpoint
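For example, in flink-conf.yaml:

```yaml
kubernetes.rest-service.exposed.type: ClusterIP
```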
Hi Ragini,
How did you submit your job? The exception here is most likely caused
by `flink-clients` not being included in the classpath on the client side.
If the job is submitted via the Flink CLI, namely `flink run -c <main-class> xx.jar`,
it should be included by default, and if the job is submitted programmatically,
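In the programmatic case, one thing to check is that the flink-clients artifact is declared as a dependency. A sketch of what that could look like in Maven (the Scala suffix and version are assumptions matching a 1.13.0 / Scala 2.12 setup):

```xml
<!-- Hypothetical coordinates; match the version/Scala suffix to your cluster. -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-clients_2.12</artifactId>
  <version>1.13.0</version>
</dependency>
```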
Hi Sumeet,
I think you might first convert the table back to a DataStream [1],
then define the timestamp and watermark with
`assignTimestampsAndWatermarks(...)`,
and then convert it back to a table [2].
Best,
Yun
[1]
https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/table/common/
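A sketch of that round trip in the Java API (`tEnv` is a StreamTableEnvironment; the "ts" column name and the 5-second out-of-orderness bound are assumptions):

```java
// 1. Table -> DataStream.
DataStream<Row> stream = tEnv.toDataStream(inputTable);

// 2. Attach timestamps and watermarks on the stream.
DataStream<Row> withWatermarks = stream.assignTimestampsAndWatermarks(
    WatermarkStrategy.<Row>forBoundedOutOfOrderness(Duration.ofSeconds(5))
        .withTimestampAssigner((row, ts) -> (long) row.getField("ts")));

// 3. DataStream -> Table, exposing the stream's watermark to the planner.
Table result = tEnv.fromDataStream(
    withWatermarks,
    Schema.newBuilder()
        .columnByMetadata("rowtime", "TIMESTAMP_LTZ(3)")
        .watermark("rowtime", "SOURCE_WATERMARK()")
        .build());
```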
Hi Xingbo,
Thank you!
On Thu, May 6, 2021 at 10:01 AM Xingbo Huang wrote:
> Hi Yik San,
> You can check whether the execution environment used is
> `LocalStreamEnvironment` and you can get the class object corresponding to
> the corresponding java object through py4j in PyFlink. You can take a
Hi,
You could trigger a savepoint via the REST API [1] or refer to SavepointITCase [2]
to see how to trigger a savepoint in test code.
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/ops/rest_api/#jobs-jobid-savepoints
[2]
https://github.com/apache/flink/blob/c688bf3c83e72155ccf5
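For reference, triggering a savepoint through the REST API is a POST to the savepoints endpoint; a sketch (host, job id, and target directory are placeholders):

```
# Returns a trigger id that can be polled at /jobs/<job-id>/savepoints/<trigger-id>.
curl -X POST http://<jobmanager>:8081/jobs/<job-id>/savepoints \
  -H "Content-Type: application/json" \
  -d '{"target-directory": "s3://bucket/savepoints", "cancel-job": false}'
```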
Thanks Dawid & Guowei as the release managers, and everyone who has
contributed to this release.
Thank you~
Xintong Song
On Thu, May 6, 2021 at 9:51 AM Leonard Xu wrote:
> Thanks Dawid & Guowei for the great work, thanks everyone involved.
>
> Best,
> Leonard
>
> On May 5, 2021, at 17:12, Theo Diefenthal wrote:
Hi Users,
Is there a way that checkpointed data from Flink 1.9 can be read using the
State Processor API?
The docs [1] say: when reading operator state, users specify the operator
uid, the state name, and the type information.
What is the type for the Kafka operator, which needs to be specified whil
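As a shape-of-the-API sketch only (the uid, state name, and state type below are placeholders, not the actual values used by the Kafka connector):

```java
import org.apache.flink.api.common.typeinfo.TypeHint;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.runtime.state.memory.MemoryStateBackend;
import org.apache.flink.state.api.ExistingSavepoint;
import org.apache.flink.state.api.Savepoint;

ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
ExistingSavepoint savepoint =
    Savepoint.load(env, "hdfs:///path/to/checkpoint", new MemoryStateBackend());

// Sources typically keep their offsets in union list state; names/types
// here are placeholders.
DataSet<Tuple2<String, Long>> offsets = savepoint.readUnionState(
    "kafka-source-uid",
    "some-state-name",
    TypeInformation.of(new TypeHint<Tuple2<String, Long>>() {}));
```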
Hi,
Is there a working example somewhere that I can refer to for writing Avro
entities to Flink state, as well as Avro serialization in KafkaConsumer/Producer?
I tried to use Avro entities directly, but there is an issue beyond Apache Avro
1.7.7 in that the entities created have a serialVersionUid.
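For the Kafka side, one commonly used building block is flink-avro's AvroDeserializationSchema; a minimal sketch (MyAvroRecord stands in for a generated SpecificRecord class):

```java
import java.util.Properties;

import org.apache.flink.formats.avro.AvroDeserializationSchema;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

Properties props = new Properties();
props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder

// Deserialize Kafka records directly into the generated Avro class.
FlinkKafkaConsumer<MyAvroRecord> consumer = new FlinkKafkaConsumer<>(
    "my-topic",
    AvroDeserializationSchema.forSpecific(MyAvroRecord.class),
    props);
```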
Thanks to Dawid and Guowei for the great work, and thanks to everyone involved
in this release.
Best
Yun Tang
From: Xintong Song
Sent: Thursday, May 6, 2021 12:08
To: user ; dev
Subject: Re: [ANNOUNCE] Apache Flink 1.13.0 released
Thanks Dawid & Guowei as the rele
Hi,
Is your CustomerGenerator setting the event timestamp correctly? Are your
evictors evicting too early?
You can try to add some debug output into the watermark assigner and see if
it's indeed progressing as expected.
On Thu, May 6, 2021 at 12:48 AM Swagat Mishra wrote:
> This seems to be wo
Yes, the customer generator is setting the event timestamp correctly, as I see
below. I debugged and found that the events are getting late, so they are
never processed. I.e., in the window operator the method this.isWindowLate(
actualWindow) evaluates to false for the rest of the events
except the fi