Re: Regarding Flink Upgrades

2022-11-02 Thread Dawid Wysakowicz
ve any updates going forward. Best, Dawid On 02/11/2022 07:45, Prasanna kumar wrote: Hi Community, Currently we are using version 1.12.7 and it is running without any issue. And we see that version 1.17 is set to release early next year. That means we would be 5 versions behind. 1) So

Re: Avro deserialization issue

2022-04-14 Thread Dawid Wysakowicz
Just add a dependency on Avro 1.10. If I remember correctly that should simply work. If that does not solve the problem, I'd look into which field fails to be deserialized. Best, Dawid On 13/04/2022 18:11, Anitha Thankappan wrote: Hi Piotr, *The code I wrote in 1.13.1* public f

Re: Interval join operator is not forwarding watermarks correctly

2022-03-17 Thread Dawid Wysakowicz
'd suggest keeping the watermark generation just right after the source. If this is not possible, as a workaround before it is fixed in Flink, you need to cut off WatermarkStatuses somehow. You can do that either in a custom operator or by modifying the TimestampsAndWatermarksOperator.

Re: Move savepoint to another s3 bucket

2022-03-08 Thread Dawid Wysakowicz
Hi Lukas, I am afraid you're hitting this bug: https://issues.apache.org/jira/browse/FLINK-25952 Best, Dawid On 08/03/2022 16:37, Lukáš Drbal wrote: Hello everyone, I'm trying to move savepoint to another s3 account but restore always failed with some weird 404 error. We are

Re: Could not stop job with a savepoint

2022-03-07 Thread Dawid Wysakowicz
Hi, From the exception it seems the job had already finished when you triggered the savepoint. Best, Dawid On 07/03/2022 14:56, Vinicius Peracini wrote: Hello everyone, I have a Flink job (version 1.14.0 running on EMR) and I'm having this issue while trying to stop a

Re: Question about Flink counters

2022-03-07 Thread Dawid Wysakowicz
e not preserved across restarts. Counters are generally scoped. Therefore counters in UDFs are scoped[1] to the parallel instance that uses it. You should combine them on the monitoring system side if you need a more general overview. Hope that helps, Best, Dawid [1] https://nightlies.apach
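A minimal sketch of the scoping described above, in plain Java with no Flink dependency: each parallel instance only sees its own counter, and an overall figure exists only once the monitoring side sums the per-instance values. The class `ScopedCounters` and its methods are hypothetical names for illustration and are not Flink's metrics API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for per-subtask counters; not Flink's metrics API. */
class ScopedCounters {
    // One counter per parallel instance (subtask index -> count).
    private final Map<Integer, Long> perSubtask = new HashMap<>();

    /** Called by the user code running in one parallel instance. */
    void inc(int subtaskIndex) {
        perSubtask.merge(subtaskIndex, 1L, Long::sum);
    }

    /** What a single instance reports: only its own count. */
    long valueOf(int subtaskIndex) {
        return perSubtask.getOrDefault(subtaskIndex, 0L);
    }

    /** What the monitoring system would compute to get the overall view. */
    long sumAcrossSubtasks() {
        long total = 0;
        for (long v : perSubtask.values()) {
            total += v;
        }
        return total;
    }
}
```

In a real deployment the summation would be a query in the metrics backend rather than code inside the job.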

Re: streaming mode with both finite and infinite input sources

2022-02-25 Thread Dawid Wysakowicz
This should be supported in 1.14 if you enable checkpointing with finished tasks[1], which has been added in 1.14. In 1.15 it will be enabled by default. Best, Dawid [1] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/fault-tolerance/checkpointing/#enabling-and

Re: CDC using Query

2022-02-04 Thread Dawid Wysakowicz
ezium engine inside of Flink's source and can process DB changelog with all processing guarantees that Flink provides. As for the idea of processing further with Kafka Streams. Why not process data with Flink? What do you miss in Flink? Best, Dawid [1] https://github.com/ververica

Re: Reading from Kafka kafkarecorddeserializationschema

2022-02-04 Thread Dawid Wysakowicz
s used. BTW, have you tried looking at Table API? It would abstract quite a few things for you, e.g. translation of what I presume is a CSV format[2] in your case. Best, Dawid [1] https://github.com/apache/flink/blob/5846d8d61b4b2aa10f925e9f63885cb7f6686303/flink-examples/flink-examples-str

Re: Queryable State Deprecation

2022-02-04 Thread Dawid Wysakowicz
ry to achieve similar results with metrics. Best, Dawid On 01/02/2022 16:36, Jatti, Karthik wrote: Hi, I see on the Flink Roadmap that Queryable state API is scheduled to be deprecated but I couldn’t find much information on confluence or this mailing group’s archives to understand the backgrou

Re: flink savepoints not (completely) relocatable ?

2022-02-03 Thread Dawid Wysakowicz
I looked into the code again and unfortunately I have bad news :( Indeed we treat S3 as if it always injects entropy. Even if the entropy key is not specified, which effectively means it is disabled. I created a JIRA ticket[1] to fix it. Best, Dawid [1] https://issues.apache.org/jira/browse

Re: flink savepoints not (completely) relocatable ?

2022-02-03 Thread Dawid Wysakowicz
Hi Frank. Do you use entropy injection by chance? I am afraid savepoints are not relocatable in combination with entropy injection as described here[1]. Best, Dawid [1] https://nightlies.apache.org/flink/flink-docs-master/docs/ops/state/savepoints/#triggering-savepoints On 03/02/2022 14

Re: create savepoint on bounded source in streaming mode

2022-01-25 Thread Dawid Wysakowicz
Hi Shawn, You could also take a look at the hybrid source[1] Best, Dawid [1]https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/datastream/hybridsource/ On 26/01/2022 08:39, Guowei Ma wrote: Hi Shawn Currently Flink can not trigger the sp at the end of the input. An

Re: Examples / Documentation for Flink ML 2

2022-01-17 Thread Dawid Wysakowicz
I am adding a couple of people who worked on it. Hopefully, they will be able to answer you. On 17/01/2022 13:39, Bonino Dario wrote: > > Dear List, > > We are in the process of evaluating Flink ML version 2.0 in the > context of some ML task mainly concerned with classification and > clustering.

Re: [statefun] upgrade path - shared cluster use

2022-01-17 Thread Dawid Wysakowicz
I am pretty confident the goal is to be able to run on the newest Flink version. However, as the release cycle is decoupled for both modules it might take a bit. I added Igal to the conversation, who I hope will be able to give you an idea when you can expect that to happen. Best, Dawid On 17

Re: Flink per-job cluster HbaseSinkFunction fails before starting - Configuration issue

2022-01-17 Thread Dawid Wysakowicz
Hey Kamil, Have you followed this guide to setup kerberos authentication[1]? Best, Dawid [1] https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/security/security-kerberos/ On 14/01/2022 17:09, Kamil ty wrote: > Hello all, > I have a flink job that is usi

Re: Sorting/grouping keys and State management in BATCH mode

2022-01-12 Thread Dawid Wysakowicz
ortunately best you can get is the javadocs/comments in the class itself. Best, Dawid [1] https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#taskmanager-runtime-sort-spilling-threshold [2] https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/memory/me

Re: GenericWriteAheadSink, declined checkpoint for a finished source

2021-12-06 Thread Dawid Wysakowicz
just concerned with the entry in logs? Best, Dawid [1] https://github.com/apache/flink/blob/ef0e17ad6319175ce0054fc3c4db14b78e690dd6/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/environment/ExecutionCheckpointingOptions.java#L236 [2] https://nightlies.apache.org/flink/flink

Re: Upgrade from 1.13.1 to 1.13.2/1.13.3 failing

2021-11-09 Thread Dawid Wysakowicz
lized checkpoint created from 1.13.1. Best, Dawid [1] https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/cli/#starting-a-job-from-a-savepoint On 22/10/2021 16:39, Chesnay Schepler wrote: > The only suggestion I can offer is to take a savepoint with 1.13.1 and > try to restor

Re: Duplicate Calls to Cep Filter

2021-11-01 Thread Dawid Wysakowicz
to evaluate the condition for the IGNORE edge, because we should IGNORE the element (or in other words wait for a next element) only if we have not taken it. Therefore the condition for the IGNORE edge is NOT(TAKE_CONDITION). Best, Dawid On 29/10/2021 01:05, Puneet Duggal wrote: > Hi Yun
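The relationship between the two edges can be sketched in plain Java. This is an illustration only, assuming both edges are evaluated independently for every element; it does not reproduce Flink's NFA code, and `EdgeConditions`, `TAKE`, `IGNORE` and `evaluate` are hypothetical names. It shows why a user filter can be invoked twice per element when the IGNORE condition is defined as NOT(TAKE_CONDITION).

```java
import java.util.function.Predicate;

/** Illustration only: models the two edge conditions, not Flink's NFA code. */
class EdgeConditions {
    /** Counts how often the user-defined filter is invoked. */
    static int calls = 0;

    /** Hypothetical user filter, standing in for a CEP condition. */
    static final Predicate<Integer> TAKE = value -> {
        calls++;
        return value > 10;
    };

    /** The IGNORE edge applies only when the TAKE condition is false. */
    static final Predicate<Integer> IGNORE = TAKE.negate();

    /** Evaluates both outgoing edges for one element. */
    static String evaluate(int element) {
        boolean take = TAKE.test(element);     // first call of the user filter
        boolean ignore = IGNORE.test(element); // second call, via negate()
        return take ? "TAKE" : (ignore ? "IGNORE" : "NONE");
    }
}
```

Calling `EdgeConditions.evaluate(42)` returns `"TAKE"` and leaves `calls` at 2, because the negated predicate runs the same filter again.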

Re: [DISCUSS] Creating an external connector repository

2021-10-19 Thread Dawid Wysakowicz
lies with the individual serving as release manager. The specifics of the process may vary from project to project,*but the 'minimum quorum of three +1 votes' rule is universal.* Best, Dawid https://www.apache.org/foundation/voting.html#ReleaseVotes On 19/10/2021 14:2

Re: Issue with Flink UI for Flink 1.14.0

2021-10-14 Thread Dawid Wysakowicz
I am afraid it is a bug in flink 1.14. I created a ticket for it FLINK-24550[1]. I believe we should pick it up soonish. Thanks for reporting the issue! Best, Dawid [1] https://issues.apache.org/jira/browse/FLINK-24550 On 13/10/2021 20:32, Peter Westermann wrote: > > Hello, > >

Re: a question about flink table catalog.

2021-10-14 Thread Dawid Wysakowicz
flink-packages, but I am not aware of any plans. (cc Timo) Best, Dawid On 14/10/2021 15:05, Yuepeng Pan wrote: > Dawid Wysakowicz > >    Thanks for your reply.  Will community to plan to implement this > feature ?  > > > > Best,  > Roc > > > > At 2021-1

Re: a question about flink table catalog.

2021-10-14 Thread Dawid Wysakowicz
If I understand your question correctly, you're asking if you can somehow persist the GenericInMemoryCatalog. I am afraid it is not possible. The idea of the GenericInMemoryCatalog is that it is transient and is stored purely in memory. Best, Dawid On 14/10/2021 13:44, Yuepeng Pan wrote:

Re: Exception: SequenceNumber is treated as a generic type

2021-10-14 Thread Dawid Wysakowicz
9 to track the kinesis issue.[1] On the backpressure note, are you sure the issue is in the serialization? Have you tried identifying the slow task first?[2] Best, Dawid [1] https://issues.apache.org/jira/browse/FLINK-24549 [2] https://ci.apache.org/projects/flink/flink-docs-release-1.12/ops/

Re:

2021-10-14 Thread Dawid Wysakowicz
I hope Rui (in cc) will be able to help you. Best, Dawid On 12/10/2021 15:32, Andrew Otto wrote: > Hello, > > I'm trying to use HiveCatalog with Kerberos.  Our Hadoop cluster, our > Hive Metastore, and our Hive Server are kerberized.  I can > successfully submit Flink jobs t

Re: Issues while upgrading from 1.12.1 to 1.14.0

2021-10-06 Thread Dawid Wysakowicz
t what are the "accumulators" you refer to? Are they *State primitives[1] or really constructs that are called Accumulator[2]? The latter are not checkpointed. Best, Dawid [1] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/fault-tolerance/state/#using-ke

Re: Error: Timeout of 60000ms expired before the position for partition

2021-10-04 Thread Dawid Wysakowicz
Hi, Do you mean that you fail to start Kafka? Or do you get the exception from Flink? Could you please share the full stack trace of the error? Best, Dawid On 02/10/2021 16:58, Dipanjan Mazumder wrote: > Hi, > > I am getting below error while starting the flink as a standalone >

Re: Exception thrown during batch job execution on YARN even though job succeeded

2021-10-04 Thread Dawid Wysakowicz
Hey Ken, Regarding Rufus, I know he might be a bit eager in changing lines ;) If you want to ignore his changes in git blame, please take a look here[1]. For the main issue, do you mind creating a ticket? I hope someone will be able to pick it up. Best, Dawid [1] https://nightlies.apache.org

Re: Flink application mode with no ui , how to start job using k8s ?

2021-10-04 Thread Dawid Wysakowicz
simply not expose the port? Best, Dawid [1] https://flink.apache.org/2021/05/06/reactive-mode.html On 30/09/2021 02:41, Dhiru wrote: > Hi , > >    My requirement is to create Flink cluster application Mode on k8s > and do not want to expose UI, my requirement is to start the > long-

[ANNOUNCE] Apache Flink 1.14.0 released

2021-09-29 Thread Dawid Wysakowicz
?projectId=12315522&version=12349614 We would like to thank all contributors of the Apache Flink community who made this release possible! Regards, Xintong, Joe, Dawid

Re: User defined function (UDF) not sent when submitting job to session cluster

2021-09-07 Thread Dawid Wysakowicz
y for deserialization. Best, Dawid On 07/09/2021 16:31, Joel Edwards wrote: > Good day, > > I have been attempting to submit a job to a session cluster. This job > involves a pair of dynamic tables and a SQL query. The SQL query is > calling a UDF which I register via the table API's > c

Re: User defined function (UDF) not sent when submitting job to session cluster

2021-09-07 Thread Dawid Wysakowicz
there. Best, Dawid On 07/09/2021 16:31, Joel Edwards wrote: > Good day, > > I have been attempting to submit a job to a session cluster. This job > involves a pair of dynamic tables and a SQL query. The SQL query is > calling a UDF which I register via the table API's > c

Re: Broadcast data to all keyed streams

2021-09-07 Thread Dawid Wysakowicz
of the operator are either keyed or non-keyed. Best, Dawid [1] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/fault-tolerance/broadcast_state/ On 06/09/2021 18:02, James Sandys-Lumsdaine wrote: > Hello, > > I have a Flink workflow which is partitioned on a

Re: De/Serialization API to tear-down user code

2021-09-02 Thread Dawid Wysakowicz
mmunicates over REST. Moreover no other schema needs a close for now. For the Table API we also need only the open for generating the code of the serializer. Now that you're reaching out with such a requirement we might revisit it. WDYT Arvid? Best, Dawid [1] https://issues.apache.

Re: Watermark UI after checkpoint failure

2021-07-18 Thread Dawid Wysakowicz
Do you mean a failed checkpoint, or do you mean that it happens after a restore from a checkpoint? If it is the latter then this is kind of expected, as watermarks are not checkpointed and they need to be repopulated again. Best, Dawid On 19/07/2021 07:41, Dan Hill wrote: > After my dev fl

Re: Apache Flink - Reading Avro messages from Kafka with schema in schema registry

2021-07-15 Thread Dawid Wysakowicz
, which is usually a permanent storage you most probably want to store alongside records with different schemas (because the schema may evolve over time) thus you need schema registry. Best, Dawid [1] https://avro.apache.org/docs/1.10.2/spec.html#Data+Serialization+and+Deserialization [2] https://

Re: Apache Flink - Reading Avro messages from Kafka with schema in schema registry

2021-07-13 Thread Dawid Wysakowicz
use the writer schema retrieved from schema registry as the reader schema. I hope this answers your questions. Best, Dawid [1] https://avro.apache.org/docs/1.10.2/spec.html On 09/07/2021 03:09, M Singh wrote: > Hi: > > I am trying to read avro encoded messages from Kafka with schema

Re: My batch source doesn't emit MAX_WATERMARK when it finishes - why?

2021-07-08 Thread Dawid Wysakowicz
in 1.14 or by implementing a ProcessFunction with a timer for Long.MAX_VALUE, or lastly with a custom operator. Best, Dawid On 08/07/2021 14:51, Yik San Chan wrote: > Hi, > > According to the docs [1] > > When a source reaches the end of the input, it emits a final watermark &

Re: Flink cep checkpoint size

2021-07-08 Thread Dawid Wysakowicz
Hi, Sorry for the late reply. Indeed I found a couple of problems with clearing the state for short lived keys. I created a JIRA[1] issue to track it and opened a PR (which needs test coverage before it can be merged) with fixes for those. Best, Dawid [1] https://issues.apache.org/jira/browse

Re: Flink Metric Reporting from Job Manager

2021-07-08 Thread Dawid Wysakowicz
Hi, I think that is not directly supported. After all, the main method can also be executed outside of a JobManager and there you don't have any Flink context/connections/components set up. Best, Dawid On 08/07/2021 00:12, Mason Chen wrote: > Hi all, > > Does Flink support rep

Re: Question about POJO rules - why fields should be public or have public setter/getter?

2021-07-08 Thread Dawid Wysakowicz
ith a state stored in an incompatible way with the updated serializer. This is not a problem for Table/SQL programs as we control the state internally, and that's why we were able to change the requirements for POJOs in Table/SQL programs. [1] Best, Dawid [1] https://ci.apache.org/projects/f

Re: Write Kafka message header using FlinkKafkaProducer

2021-06-21 Thread Dawid Wysakowicz
Hi, You can use KafkaSerializationSchema[1] which can create a ProducerRecord with Headers. Best, Dawid [1] https://ci.apache.org/projects/flink/flink-docs-release-1.13/api/java/org/apache/flink/streaming/connectors/kafka/KafkaSerializationSchema.html On 21/06/2021 12:58, Tan, Min wrote

Re: unsubscribe

2021-06-21 Thread Dawid Wysakowicz
You should send a message to user-unsubscr...@flink.apache.org if you want to unsubscribe. On 20/06/2021 00:08, SANDEEP PUNIYA wrote:

Re: unsubscribe

2021-06-21 Thread Dawid Wysakowicz
You should send a message to user-unsubscr...@flink.apache.org if you want to unsubscribe. On 19/06/2021 18:04, 林俊良 wrote:

Re:

2021-06-21 Thread Dawid Wysakowicz
You should send a message to user-unsubscr...@flink.apache.org if you want to unsubscribe. On 21/06/2021 04:25, 张万新 wrote: > unsubscribe

Re: DSL for Flink CEP

2021-06-03 Thread Dawid Wysakowicz
aries. Lastly, I am not aware of any comparisons of CEP libraries/extensions that work with Flink. I am afraid you have to do the feature comparison yourself. I think the documentation for community supported library is a good start for it. Best, Dawid [1] https://ci.apache.org/projects/flink/

Re: [ANNOUNCE] Apache Flink 1.13.1 released

2021-05-31 Thread Dawid Wysakowicz
official image, as it depends on the maintainers of docker hub. Best, Dawid On 31/05/2021 08:01, Youngwoo Kim (김영우) wrote: > Great work! Thank you Dawid and all of the contributors. > I'm eager to adopt the new release, however can't find docker images for > that from https://hub

Re: Running multiple CEP pattern rules

2021-05-31 Thread Dawid Wysakowicz
I am afraid there is not much active development going on in the CEP library. I would not expect new features there in the near future. On 28/05/2021 22:00, Tejas wrote: > Hi Dawid, > Do you have any plans to bring this functionality in flink CEP in future ?

Re: Running multiple CEP pattern rules

2021-05-28 Thread Dawid Wysakowicz
can have many different patterns in a single job, but the number of vertices in your graph is not unlimited. In your scenario I'd try to combine the rules in a single operator. You could try to use the ProcessFunction for that. Best, Dawid On 28/05/2021 01:53, Tejas wrote: > Hi, > We

[ANNOUNCE] Apache Flink 1.13.1 released

2021-05-28 Thread Dawid Wysakowicz
/05/28/release-1.13.1.html The full release notes are available in Jira: https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12350058 We would like to thank all contributors of the Apache Flink community who made this release possible! Regards, Dawid

Re: [VOTE] Release 1.13.1, release candidate #1

2021-05-28 Thread Dawid Wysakowicz
Thank you all for the votes. I am happy to say we approved the release. I will write a separate summary mail. Best, Dawid On 28/05/2021 14:40, Robert Metzger wrote: > +1 (binding) > > - Tried out reactive mode in from the scala 2.11 binary locally (with > scale up & sto

Re: [VOTE] Release 1.13.1, release candidate #1

2021-05-27 Thread Dawid Wysakowicz
+1 (binding) * verified signatures and checksums * built from sources and run an example, quickly checked Web UI * checked diff of pom.xml and NOTICE files from 1.13.0, o there were no version changes, o checked the updated licenses of javascript dependencies Best, Dawid On

Re: Customer operator in BATCH execution mode

2021-05-26 Thread Dawid Wysakowicz
running in the BATCH execution mode. Moreover it uses a different kind of StateBackend. Actually a dummy one, which just imitates a real state backend. Best, Dawid [1] https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runt

[VOTE] Release 1.13.1, release candidate #1

2021-05-25 Thread Dawid Wysakowicz
lease-1.13.1-rc1" [5], * website pull request listing the new release and adding announcement blog post [6]. The vote will be open for at least 72 hours. It is adopted by majority approval, with at least 3 PMC affirmative votes. Best, Dawid [1] https://issues.apache.org/j

Re: Is it possible to leverage the sort order in DataStream Batch Execution Mode?

2021-05-21 Thread Dawid Wysakowicz
I am afraid it is not possible to leverage the sorting for business logic. The sorting is applied on the binary representation of the key, as it is not necessarily sorting per se, but rather grouping by the same keys. You can find more information in the FLIP of this feature e.g. here[1] Best, Dawid
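A small plain-Java sketch of why an order over serialized bytes groups equal keys but is not a usable business ordering. The big-endian `int` encoding and the unsigned byte-wise comparison are assumptions made for this illustration; they are not claimed to be what Flink's serializers and sorter do. `BinaryKeyOrder` and its methods are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

class BinaryKeyOrder {
    /** Big-endian bytes of an int key; an assumed serialization for this sketch. */
    static byte[] serialize(int key) {
        return ByteBuffer.allocate(4).putInt(key).array();
    }

    /** Unsigned, byte-by-byte comparison of two arrays of equal length. */
    static int compareBytes(byte[] a, byte[] b) {
        for (int i = 0; i < a.length; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return 0;
    }

    /** Sorts keys by their serialized bytes rather than by numeric value. */
    static List<Integer> sortByBytes(List<Integer> keys) {
        List<Integer> sorted = new ArrayList<>(keys);
        sorted.sort((x, y) -> compareBytes(serialize(x), serialize(y)));
        return sorted;
    }
}
```

For the input `[2, -1, 1, 2, -1]` this yields `[1, 2, 2, -1, -1]`: equal keys end up adjacent, which is all that grouping needs, yet `-1` sorts after `2`, so the order carries no numeric meaning.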

Re: Possible way to avoid unnecessary serialization calls.

2021-05-12 Thread Dawid Wysakowicz
Hi Alex, I cannot reproduce the issue. Do you mind checking if it is not an issue on your side? P.S. It would be nice if you could reply to the ML as well. That way other people can benefit from the answers. Moreover there will be more people who could help answering your question. Best, Dawid

Re: Flink: Clarification required

2021-05-10 Thread Dawid Wysakowicz
ache camel along with Flink ? I am not very familiar with Apache Camel so can't say much on this. As far as I know Apache Camel is more of a routing system, whereas Flink is a data processing framework. Best, Dawid [1] https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/connec

Re: Possible way to avoid unnecessary serialization calls.

2021-05-10 Thread Dawid Wysakowicz
look into enabling object reuse[2]. Make sure though you work with immutable objects. Secondly, all operators that simply forward records should be chained by default. If you need more fine-grained control over it you can look into these docs[3] Best, Dawid [1] https://ci.apache.org/projects/f

Re: What does enableObjectReuse exactly do?

2021-05-10 Thread Dawid Wysakowicz
Hi, In the streaming API, the biggest difference is that if you do not enable object reuse, records will be duplicated/copied when forwarding from an operator to the downstream one. If you are sure you work with immutable objects, I'd highly recommend enabling object reuse. Best, Dawid
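The reason immutability matters can be imitated in plain Java, without Flink. The sketch below assumes an upstream step that recycles a single mutable record and a downstream step that buffers what it receives; `ObjectReuseSketch`, `Event` and `run` are hypothetical names, and the boolean flag merely stands in for the object-reuse setting.

```java
import java.util.ArrayList;
import java.util.List;

class ObjectReuseSketch {
    /** A mutable record type; the hazard only exists because it is mutable. */
    static class Event {
        int value;

        Event(int value) {
            this.value = value;
        }

        Event copy() {
            return new Event(value);
        }
    }

    /**
     * Simulates an upstream operator that recycles one Event instance and a
     * downstream operator that buffers whatever it receives.
     */
    static List<Integer> run(boolean objectReuse) {
        List<Event> downstreamBuffer = new ArrayList<>();
        Event recycled = new Event(0);
        for (int v = 1; v <= 3; v++) {
            recycled.value = v; // upstream mutates its single instance
            // Without object reuse the record is copied on the way downstream.
            downstreamBuffer.add(objectReuse ? recycled : recycled.copy());
        }
        List<Integer> seen = new ArrayList<>();
        for (Event e : downstreamBuffer) {
            seen.add(e.value);
        }
        return seen;
    }
}
```

`run(false)` returns `[1, 2, 3]`, while `run(true)` returns `[3, 3, 3]` because every buffered reference points at the same mutated object. With immutable records the second case cannot occur, which is why skipping the copy is then safe.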

Re: Read kafka offsets from checkpoint - state processor

2021-05-10 Thread Dawid Wysakowicz
/streaming/connectors/kafka/FlinkKafkaConsumerBase.java#L984 Hope it helps. Best, Dawid On 07/05/2021 14:43, bat man wrote: > Anyone who has tried this or can help on this. > > Thanks. > > On Thu, May 6, 2021 at 10:34 AM bat man <mailto:tintin0...@gmail.com>> wrote: > >

Re: Unsubscribe

2021-05-10 Thread Dawid Wysakowicz
ou did confirm it and you are still receiving messages from the ML, make sure you're using the mail you're subscribed with. Best, Dawid On 06/05/2021 18:19, Dan Pettersson wrote: > I've also tried a few times now the last couple of months. I think it > would be very nice if t

Re: callback by using process function

2021-05-10 Thread Dawid Wysakowicz
r the current key. Best, Dawid On 06/05/2021 12:30, Abdullah bin Omar wrote: > Hi, > > According to [1] example section, > > (i) If we check the stored count of the last modification time against > the previous timestamp count, then emit the count if they (count from > las

Re: some questions about data skew

2021-05-10 Thread Dawid Wysakowicz
step. Best, Dawid On 06/05/2021 09:24, jester jim wrote: > Hi, > I have run a program to monitor the sum of the delay in every minutes > of a stream,this is my code: > .map(new RichMapFunction[String,(Long,Int)] { > override def map(in: String): (Long,Int) = { > var s

Re: Guaranteeing event order in a KeyedProcessFunction

2021-05-10 Thread Dawid Wysakowicz
from the "sorting" operator. If you emit records with timestamps larger than the Watermark that "triggered" its generation it can become late. Hope those tips could help you a bit. Best, Dawid On 04/05/2021 14:49, Miguel Araújo wrote: > Hi Timo, > > Thanks for

[ANNOUNCE] Apache Flink 1.13.0 released

2021-05-03 Thread Dawid Wysakowicz
nk all contributors of the Apache Flink community who made this release possible! Regards, Guowei & Dawid

Re: Contiguity in SQL vs CEP

2021-04-25 Thread Dawid Wysakowicz
Hi, MATCH_RECOGNIZE clause in SQL standard does not support different contiguities. The MATCH_RECOGNIZE always uses the strict contiguity. Best, Dawid On 21/04/2021 00:02, tbud wrote: > There's 3 different types of Contiguity defined in the CEP documentation [1] > looping +

Re: Contiguity and state storage in CEP library

2021-04-25 Thread Dawid Wysakowicz
es it will be processed and if it does not match it will be discarded and it won't be stored any longer. Best, Dawid On 21/04/2021 02:44, tbud wrote: > We are evaluating a use-case where there will be 100s of events stream coming > in per second and we want to run some fixed set of

Re: Python Integration with Ververica Platform

2021-04-13 Thread Dawid Wysakowicz
I'd recommend reaching out directly to Ververica. Ververica platform is not part of the open-source Apache Flink project. I can connect you with Konstantin who I am sure will be happy to answer your question ;) Best, Dawid On 12/04/2021 15:40, Robert Cullen wrote: > I've b

Re: NPE when aggregate window.

2021-04-13 Thread Dawid Wysakowicz
Hi, Could you check that your grouping key has a stable hashcode and equals? It is very likely caused by an unstable hashcode and that a record with an incorrect key ends up on a wrong task manager. Best, Dawid On 13/04/2021 08:47, Si-li Liu wrote: > Hi,  > > I encounter a weird NPE
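A minimal sketch of what a stable grouping key looks like in Java. `StableKey` derives `hashCode` and `equals` from field values only, so two records with the same logical key hash identically in every JVM; `UnstableKey` inherits the identity-based defaults from `Object`. The `route` method is a deliberately simplified stand-in for key-based routing and is not Flink's actual formula; all names here are hypothetical.

```java
import java.util.Objects;

class GroupingKeys {
    /** Key whose hashCode and equals depend only on field values. */
    static final class StableKey {
        final String user;
        final int region;

        StableKey(String user, int region) {
            this.user = user;
            this.region = region;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof StableKey)) {
                return false;
            }
            StableKey other = (StableKey) o;
            return region == other.region && user.equals(other.user);
        }

        @Override
        public int hashCode() {
            return Objects.hash(user, region);
        }
    }

    /** Key that inherits Object's identity-based hashCode and equals. */
    static final class UnstableKey {
        final String user;

        UnstableKey(String user) {
            this.user = user;
        }
    }

    /** Simplified routing: pick a task from the key's hash (not Flink's formula). */
    static int route(Object key, int parallelism) {
        return Math.floorMod(key.hashCode(), parallelism);
    }
}
```

Two `StableKey("a", 1)` instances always route to the same task. Two `UnstableKey("a")` instances are not even equal to each other, so state written under one may be looked up under the other on a different task.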

[ANNOUNCE] Release 1.13.0, release candidate #0

2021-04-01 Thread Dawid Wysakowicz
itory [3], * source code tag "release-1.2.3-rc3" [4], Your help testing the release will be greatly appreciated! Thanks, Dawid Wysakowicz [1] https://dist.apache.org/repos/dist/dev/flink/flink-1.13.0-rc0/ [2] https://dist.apache.org/repos/dist/release/fl

Re: [DISCUSS] Feature freeze date for 1.13

2021-03-31 Thread Dawid Wysakowicz
a proper review rather than rush unfinished feature and try to fix it later. Moreover it got broader support. Unless somebody else objects, I think we can merge this PR later and include it in RC1. Best, Dawid On 01/04/2021 08:39, Arvid Heise wrote: > Hi Dawid and Guowei, > > I'd lik

Re: PyFlink Table API: Interpret datetime field from Kafka as event time

2021-03-29 Thread Dawid Wysakowicz
Hey, I am not sure which format you use, but if you work with JSON maybe this option[1] could help you. Best, Dawid [1] https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/table/connectors/formats/json.html#json-timestamp-format-standard On 30/03/2021 06:45, Sumeet Malhotra wrote

Re: pipeline.auto-watermark-interval vs setAutoWatermarkInterval

2021-03-23 Thread Dawid Wysakowicz
way as DataStream API. The documentation you linked, Aeden, describes the SQL API. @Jark @Timo Could you verify if the SQL documentation is correct? Best, Dawid On 23/03/2021 15:20, Matthias Pohl wrote: > Hi Aeden, > sorry for the late reply. I looked through the code and verified that

[DISCUSS] Feature freeze date for 1.13

2021-03-23 Thread Dawid Wysakowicz
atly minimize the chances of failing tests 4. Push the change to the main branch Let us know what you think! Best, Guowei & Dawid

Re: Eliminating Shuffling Under FlinkSQL

2021-03-19 Thread Dawid Wysakowicz
Your understanding of a group by is correct. It is equivalent to a key by. I agree it would be a great feature to keep the Source's partitioning but unfortunately as of now it is not yet supported. Best, Dawid On 18/03/2021 18:28, Aeden Jameson wrote: > It's my understanding that

Re: Understanding Max Parallelism

2021-03-19 Thread Dawid Wysakowicz
works in this blogpost[1]. Following up on your other questions it is mainly a reservation as of now, but it will definitely be a cap in case of a reactive/auto scaling because of the above. Best, Dawid [1] https://flink.apache.org/features/2017/07/04/flink-rescalable-state.html On 18/03/2021 17
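A rough sketch of why max parallelism acts as a cap, assuming the key-group scheme that the linked blog post describes: keys are mapped to a fixed number of key groups (one per unit of max parallelism), and key groups are then spread over the running subtasks. The hashing here is simplified to `hashCode` modulo max parallelism, and the index formula is written from memory as an assumption; `KeyGroupSketch` and its methods are hypothetical names, not Flink classes.

```java
class KeyGroupSketch {
    /** Key group of a key; simplified, without any extra hash scrambling. */
    static int keyGroup(Object key, int maxParallelism) {
        return Math.floorMod(key.hashCode(), maxParallelism);
    }

    /** Subtask that owns a key group at the given parallelism (assumed formula). */
    static int operatorIndex(int keyGroup, int parallelism, int maxParallelism) {
        return keyGroup * parallelism / maxParallelism;
    }

    /** Number of key groups each subtask owns. */
    static int[] groupsPerSubtask(int parallelism, int maxParallelism) {
        int[] counts = new int[parallelism];
        for (int kg = 0; kg < maxParallelism; kg++) {
            counts[operatorIndex(kg, parallelism, maxParallelism)]++;
        }
        return counts;
    }
}
```

With a max parallelism of 4, `groupsPerSubtask(2, 4)` gives `[2, 2]` and `groupsPerSubtask(3, 4)` gives `[2, 1, 1]`. Asking for 8 subtasks gives `[1, 0, 1, 0, 1, 0, 1, 0]`: there are only four key groups to hand out, so any subtask beyond the fourth would own no keyed state.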

Re: Parameter to config read frequency in Kafka SQL connector

2021-03-19 Thread Dawid Wysakowicz
Hi, Unfortunately I have no experience with this. You can pass any Kafka client parameters through the properties.* option[1] and see if the setting works for you. Best, Dawid [1] https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/table/connectors/kafka.html#properties On 18/03

Re: Flink minimum resource recommendation on k8s cluster

2021-03-19 Thread Dawid Wysakowicz
I'd say no. It depends on your job. You can refer to a very good presentation from Robert on how to calculate resource requirements[1]. [1] https://www.youtube.com/watch?v=8l8dCKMMWkw On 18/03/2021 11:37, Amit Bhatia wrote: > Hi, > > Is there any minimum resource ( CPU & Memory) recommendation to

Re: The Role of TimerService in ProcessFunction

2021-03-19 Thread Dawid Wysakowicz
ty the old behaviour has been kept. Hope that it clarifies the things a bit. Best, Dawid On 17/03/2021 07:47, Chirag Dewan wrote: > Hi, > > Currently, both ProcessFunction and KeyedProcessFunction (and their > CoProcess counterparts) expose the Context and TimerService in the >

Re: Production Readiness of File Source

2021-03-18 Thread Dawid Wysakowicz
who wrote the File Source to grasp his opinion as well. Best, Dawid On 17/03/2021 06:53, Chirag Dewan wrote: > Hi, > > I am intending to use the File source for a production use case. I > have a few use cases that are currently not supported like deleting a > file once it

Re: ClassCastException after upgrading Flink application to 1.11.2

2021-03-18 Thread Dawid Wysakowicz
Could you share a full stacktrace with us? Could you check the stack trace also in the task managers logs? As a side note, make sure you are using the same version of all Flink dependencies. Best, Dawid On 17/03/2021 06:26, soumoks wrote: > Hi, > > We have upgraded an application o

Re: How to advance Kafka offsets manually with checkpointing enabled on Flink (TableAPI)

2021-03-18 Thread Dawid Wysakowicz
Another approach that you could try is to edit the checkpoint via the State Processor API[2] and increase the checkpointed offsets. Best, Dawid [1] https://ci.apache.org/projects/flink/flink-docs-release-1.12/deployment/cli.html#starting-a-job-from-a-savepoint [2] https://ci.apache.org/projects/flink

Re: custom metrics within a Trigger

2021-03-18 Thread Dawid Wysakowicz
Do you mind sharing the code showing how you register your metrics with the TriggerContext? It could help us identify where the name collisions come from. As far as I am aware it should be fine to use the TriggerContext for registering metrics. Best, Dawid On 16/03/2021 17:35, Aleksander Sumowski

Re: DataStream in batch mode - handling (un)ordered bounded data

2021-03-12 Thread Dawid Wysakowicz
it would be nice if you could create a jira ticket for it. Best, Dawid On 12/03/2021 15:37, Alexis Sarda-Espinosa wrote: > > Hello, > >   > > Regarding the new BATCH mode of the data stream API, I see that the > documentation states that some operators will process all d

Re: Gradually increasing checkpoint size

2021-03-11 Thread Dawid Wysakowicz
Therefore the cleaning logic in onTimer effectively uses the same logic. If I understand it correctly, this trick was introduced to deduplicate the method. There might be a bug somewhere, but I don't think it's where you pointed. I'd suggest to first investigate the progress of watermarks

Re: Flink + Hive + Compaction + Parquet?

2021-03-04 Thread Dawid Wysakowicz
Hi, I know Jingsong worked on Flink/Hive filesystem integration in the Table/SQL API. Maybe he can shed some light on your questions. Best, Dawid On 02/03/2021 21:03, Theo Diefenthal wrote: > Hi there, > > Currently, I have a Flink 1.11 job which writes parquet file

Re: State Schema Evolution within SQL API

2021-03-04 Thread Dawid Wysakowicz
result in a completely different physical plan. Generally speaking you should be fine when adding/removing fields in a projection. I'd say it is the only somewhat safe change, but it is not guaranteed in all cases nevertheless. Best, Dawid On 01/03/2021 17:41, Jan Oelschlegel wrote: > >

Re: timeWindow()s and queryable state

2021-03-04 Thread Dawid Wysakowicz
;("any", IntSerializer.INSTANCE); desc.setQueryable("vanilla"); Best, Dawid [1] https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/stream/state/queryable_state.html#managed-keyed-state On 01/03/2021 17:39, Ron Crocker wrote: > Hi all - > > I’m trying to keep s

Re: Producer Configuration

2021-03-04 Thread Dawid Wysakowicz
issue could you post your configuration and the stacktrace you are getting? Best, Dawid On 28/02/2021 03:14, Alexey Trenikhun wrote: > They are picked up, otherwise you would not be able to write any messages > at all. I believe the page you are referring to is not for displaying Kafka > properti

Re: Flink CEP: can't process PatternStream (v 1.12, EventTime mode)

2021-02-26 Thread Dawid Wysakowicz
Best, Dawid [1] https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/event_timestamps_watermarks.html#dealing-with-idle-sources [2] https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/event_timestamps_watermarks.html#watermark-strategies-and-the-kafka-connector On 26/02
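The idle-source issue behind link [1] above can be illustrated with a small plain-Java sketch. It assumes the documented rule that an operator's watermark is the minimum of its input watermarks, with inputs marked idle left out of that minimum. The class and method names (`WatermarkSketch`, `combinedWatermark`) are hypothetical and are not part of Flink's API; this is only a model of the behaviour, not Flink's implementation.

```java
import java.util.OptionalLong;

/** Plain-Java illustration of watermark combination; hypothetical, not a Flink class. */
public class WatermarkSketch {

    /**
     * Returns the minimum watermark over all inputs that are not marked idle,
     * or an empty result if every input is idle.
     */
    public static OptionalLong combinedWatermark(long[] inputWatermarks, boolean[] idle) {
        OptionalLong min = OptionalLong.empty();
        for (int i = 0; i < inputWatermarks.length; i++) {
            if (idle[i]) {
                continue; // idle inputs do not hold back the combined watermark
            }
            if (!min.isPresent() || inputWatermarks[i] < min.getAsLong()) {
                min = OptionalLong.of(inputWatermarks[i]);
            }
        }
        return min;
    }

    public static void main(String[] args) {
        // Input 0 has advanced to 1000; input 1 has never emitted a watermark.
        long[] watermarks = {1_000L, Long.MIN_VALUE};

        // Without idleness detection the silent input pins the combined watermark.
        System.out.println(combinedWatermark(watermarks, new boolean[] {false, false}));

        // Once the silent input is marked idle, the combined watermark can advance.
        System.out.println(combinedWatermark(watermarks, new boolean[] {false, true}));
    }
}
```

In this model, a partition that never produces data keeps event-time results (such as CEP matches) from being emitted until it is declared idle.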

Re: Flink CEP: can't process PatternStream

2021-02-26 Thread Dawid Wysakowicz
= CEP.pattern(stream, pattern).inProcessingTime(); Basically you are facing exactly the same problem as described in the stackoverflow entry you posted. Best, Dawid [1] https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/event_timestamps_watermarks.html#generating-watermarks On 26

[UPDATE] Release 1.13 feature freeze

2021-02-24 Thread Dawid Wysakowicz
more thorough update if you feel like so. Your release managers Guowei & Dawid

Re: Re: DataStream problem

2021-02-17 Thread Dawid Wysakowicz
I am sure you can achieve that with a ProcessFunction[1]. Best, Dawid [1] https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/stream/operators/process_function.html#process-function On 16/02/2021 07:28, ?g???U?[ wrote: > Hi Dawid > > For example, if user 001

Re: Failed to register Protobuf Kryo serialization

2021-02-15 Thread Dawid Wysakowicz
e the GenericType, but a PojoType and this message helps you to identify a problem with your POJO declaration. Best, Dawid On 15/02/2021 11:50, Svend Vanderveken wrote: > Oh! > > Indeed, my program was just not starting because I omitted the > flink.execute() part! I confirm it works

Re: DataStream problem

2021-02-15 Thread Dawid Wysakowicz
Hi Jiazhi, Could you elaborate on what exactly you want to achieve? What have you tried so far? Best, Dawid On 15/02/2021 11:11, ?g???U?[ wrote: > Hi all > Using DataStream, How to implement a message and the same > message appears again 10 minutes later?

Re: Performance issues when RocksDB block cache is full

2021-02-15 Thread Dawid Wysakowicz
as he is on vacation right now). Best, Dawid On 14/02/2021 06:57, Yaroslav Tkachenko wrote: > Hello, > > I observe throughput degradation when my pipeline reaches the maximum > of the allocated block cache.  > > The pipeline is consuming from a few Kafka topics at a high rate

Re: Failed to register Protobuf Kryo serialization

2021-02-15 Thread Dawid Wysakowicz
Hey, Why do you say the way you did it does not work? The logs you posted say the classes cannot be handled by Flink's built-in mechanism for serializing POJOs, so it falls back to a GenericType, which is serialized with Kryo and should go through your registered serializer. Best, Dawid

Re: Generated SinkConversion code ignores incoming StreamRecord timestamp

2021-02-14 Thread Dawid Wysakowicz
The best I can do is point you to the thread[1]. I am also cc'ing Yuan who is the release manager for 1.12.2. Best, Dawid [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Releasing-Apache-Flink-1-12-2-td48603.html On 15/02/2021 08:51, Yuval Itzchakov wrote:

Re: Generated SinkConversion code ignores incoming StreamRecord timestamp

2021-02-14 Thread Dawid Wysakowicz
Hey Yuval, Could it be that you are hitting this bug[1], which has been fixed recently? Best, Dawid [1] https://issues.apache.org/jira/browse/FLINK-21013 On 15/02/2021 08:20, Yuval Itzchakov wrote: > Hi, > > I have a source that generates events with timestamps. These flow > n
