Re: Debezium Flink EMR

2020-08-21 Thread Marta Paes Moreira
Hi, Rex. Part of what enabled CDC support in Flink 1.11 was the refactoring of the table source interfaces (FLIP-95 [1]), and the new ScanTableSource [2], which makes it possible to emit bounded/unbounded streams with insert, update and delete rows. In theory, you could consume data generated with Debezium

Re: Debezium Flink EMR

2020-08-24 Thread Marta Paes Moreira
alues I just > delete by pk, and then I can build out the rest of my joins like normal. > > Are there any performance implications of doing it this way that is > different from the out-of-the-box 1.11 solution? > > On Fri, Aug 21, 2020 at 2:28 AM Marta Paes Moreira > wrote:

Re: [ANNOUNCE] New PMC member: Dian Fu

2020-08-27 Thread Marta Paes Moreira
Congrats, Dian! On Thu, Aug 27, 2020 at 11:39 AM Yuan Mei wrote: > Congrats! > > On Thu, Aug 27, 2020 at 5:38 PM Xingbo Huang wrote: > >> Congratulations Dian! >> >> Best, >> Xingbo >> >> On Thu, Aug 27, 2020 at 5:24 PM jincheng sun wrote: >> >>> Hi all, >>> >>> On behalf of the Flink PMC, I'm happy to annou

Re: Debezium Flink EMR

2020-08-31 Thread Marta Paes Moreira
tors.kafka.internal.KafkaFetcher.partitionConsumerRecordsHandler(KafkaFetcher.java:181) >>>>>> ~[?:?] >>>>>> flink-jobmanager_1 | at >>>>>> org.apache.flink.streaming.connectors.kafka.internal.KafkaFetcher.runFetchLoop(KafkaFetcher.java:141)

Re: Does Flink support such a feature currently?

2020-09-22 Thread Marta Paes Moreira
Hi, Roc. *Note:* in the future, please send this type of question to the user mailing list instead (user@flink.apache.org)! If I understand your question correctly, this is possible using the LIKE clause and a registered catalog. There is currently no implementation for the MySQL JDBC catalog, b

Re: Alink and Flink ML

2020-03-09 Thread Marta Paes Moreira
Hi, Flavio. Indeed, Becket is the best person to answer this question, but as far as I understand the idea is that Alink will be contributed back to Flink in the form of a refactored Flink ML library (sitting on top of the Table API) [1]. You can follow the progress of these efforts by tracking FL

Re: How to handle timestamps in Flink SQL

2020-03-23 Thread Marta Paes Moreira
Hi, 吴志勇. Please use the *user-zh* mailing list (in CC) to get support in Chinese. Thanks! Marta On Mon, Mar 23, 2020 at 8:35 AM 吴志勇 <1154365...@qq.com> wrote: > As the subject says: > I wrote JSON-formatted data to Kafka > {"id":5,"price":40,"timestamp":1584942626828,"type":"math"} > {"id":2,"price":70,"timestamp":15849426296

Re: subscribe messages

2020-03-25 Thread Marta Paes Moreira
Hi, Jianhui! To subscribe, please send an e-mail to user-subscr...@flink.apache.org instead. For more information on mailing list subscriptions, check [1]. [1] https://flink.apache.org/community.html#mailing-lists On Wed, Mar 25, 2020 at 10:07 AM Jianhui <980513...@qq.com> wrote: > >

Re: [Third-party Tool] Flink memory calculator

2020-04-01 Thread Marta Paes Moreira
Hey, Yangze. I'd like to suggest that you submit this tool to Flink Community Pages [1]. That way it can get more exposure and it'll be easier for users to find it. Thanks for your contribution! [1] https://flink-packages.org/ On Tue, Mar 31, 2020 at 9:09 AM Yangze Guo wrote: > Hi, there. > >

Re: Anomaly detection Apache Flink

2020-04-03 Thread Marta Paes Moreira
Hi, Salvador. You can find some more examples of real-time anomaly detection with Flink in these presentations from Microsoft [1] and Salesforce [2] at Flink Forward. This blogpost [3] also describes how to build that kind of application using Kinesis Data Analytics (based on Flink). Let me know

Re: Anomaly detection Apache Flink

2020-04-03 Thread Marta Paes Moreira
sn’t help with getting it > working with Flink, but may be a good place to start for an algorithm. > > > > https://github.com/aws/random-cut-forest-by-aws > > > > Ryan > > > > *From:* Marta Paes Moreira > *Sent:* Friday, April 3, 2020 5:25 AM > *To:* Salva

Re: [ANNOUNCE] Apache Flink Stateful Functions 2.0.0 released

2020-04-07 Thread Marta Paes Moreira
Thank you for managing the release, Gordon — you did a tremendous job! And to everyone else who worked on pushing it through. Really excited about the new use cases that StateFun 2.0 unlocks for Flink users and beyond! Marta On Tue, Apr 7, 2020 at 4:47 PM Hequn Cheng wrote: > Thanks a lot for

Re: How to use OpenTSDB as Source?

2020-04-22 Thread Marta Paes Moreira
Hi, Lucas. There was a lot of refactoring in the Table API / SQL in the last release, so the user experience is not ideal at the moment — sorry for that. You can try using the DDL syntax to create your table, as shown in [1,2]. I'm CC'ing Timo and Jark, who should be able to help you further. Ma

Re: Task Assignment

2020-04-23 Thread Marta Paes Moreira
Hi, Navneeth. If you *key* your stream using stream.keyBy(…), this will logically split your input and all the records with the same key will be processed in the same operator instance. This is the default behavior in Flink for keyed streams and transparently handled. You can read more about it i
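The routing idea in this reply can be illustrated with a small, framework-free sketch. It is not Flink code and does not use the Flink API: the function names `route` and `partition`, the CRC32 hash, and the sample records are all assumptions made for illustration. Flink itself assigns keys to key groups and then maps key groups to operator instances, which this sketch simplifies to a single hash-modulo step. The property it demonstrates is the one described above: every record with the same key lands in the same instance.

```python
import zlib
from collections import defaultdict


def route(key: str, parallelism: int) -> int:
    """Pick an instance index for a key (hypothetical, simplified routing).

    A stable hash is used so the same key always maps to the same index.
    """
    return zlib.crc32(key.encode("utf-8")) % parallelism


def partition(records, key_fn, parallelism):
    """Group records by the instance index their key is routed to."""
    instances = defaultdict(list)
    for record in records:
        instances[route(key_fn(record), parallelism)].append(record)
    return dict(instances)


# Hypothetical sample input: (key, value) pairs.
records = [
    ("sensor-a", 1),
    ("sensor-b", 2),
    ("sensor-a", 3),
    ("sensor-c", 4),
    ("sensor-b", 5),
]

assignment = partition(records, key_fn=lambda r: r[0], parallelism=4)

for index, assigned in sorted(assignment.items()):
    print(index, assigned)
```

Different keys may still share an instance, so the number of distinct keys bounds how far such a stream can be parallelized.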

Re: Flink Forward 2020 Recorded Sessions

2020-04-23 Thread Marta Paes Moreira
Hi, Sivaprasanna. The talks will be up on Youtube sometime after the conference ends. Today, the starting schedule is different (9AM CEST / 12:30PM IST / 3PM CST) and more friendly to Europe, India and China. Hope you manage to join some sessions! Marta On Fri, 24 Apr 2020 at 06:58, Sivaprasann

Re: Task Assignment

2020-04-27 Thread Marta Paes Moreira
e a common key to do this > but I would have to parallelize as much as possible since the number of > incoming messages is too large to narrow down to a single key and > processing it. > > Thanks > > On Thu, Apr 23, 2020 at 2:02 AM Marta Paes Moreira > wrote: > >>

Re: Flink Forward 2020 Recorded Sessions

2020-04-28 Thread Marta Paes Moreira
the information. > > On Fri, 24 Apr 2020 at 11:20 AM, Marta Paes Moreira > wrote: > >> Hi, Sivaprasanna. >> >> The talks will be up on Youtube sometime after the conference ends. >> >> Today, the starting schedule is different (9AM CEST / 12:30PM IST / 3PM

Re: Python UDF from Java

2020-04-30 Thread Marta Paes Moreira
Hi, Flavio. Extending the scope of Python UDFs is described in FLIP-106 [1, 2] and is planned for the upcoming 1.11 release, according to Piotr's last update. Hope this addresses your question! Marta [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-106%3A+Support+Python+UDF+in+SQL+Fun

Re: Webinar: Unlocking the Power of Apache Beam with Apache Flink

2020-05-27 Thread Marta Paes Moreira
Thanks for sharing, Aizhamal - it was a great webinar! Marta On Wed, 27 May 2020 at 23:17, Aizhamal Nurmamat kyzy wrote: > Thank you all for attending today's session! Here is the YT recording: > https://www.youtube.com/watch?v=ZCV9aRDd30U > And link to the slides: > https://github.com/aijamaln

Re: Installing Ververica, unable to write to file system

2020-05-28 Thread Marta Paes Moreira
Hi, Charlie. This is not the best place for questions about Ververica Platform CE. Please use community-edit...@ververica.com instead — someone will be able to support you there! If you have any questions related to Flink itself, feel free to reach out to this mailing list again in the future. T

Re: Is Flink HIPAA certified

2020-06-30 Thread Marta Paes Moreira
Hi, Prasanna. We're not aware of any Flink users in the US healthcare space (as far as I know). I'm looping in Ryan from AWS, as he might be able to tell you more about how you can become HIPAA-compliant with Flink [1]. Marta [1] https://docs.aws.amazon.com/kinesisanalytics/latest/java/akda-jav

Re: [DISCUSS] FLIP-133: Rework PyFlink Documentation

2020-07-31 Thread Marta Paes Moreira
Hi, Jincheng! Thanks for creating this detailed FLIP, it will make a big difference in the experience of Python developers using Flink. I'm interested in contributing to this work, so I'll reach out to you offline! Also, thanks for sharing some information on the adoption of PyFlink, it's great t

Re: Community chat?

2021-02-24 Thread Marta Paes Moreira
Ah! That freenode channel dates back to...2014? The community is not maintaining any channels other than the Mailing List (and Stack Overflow), currently. But this is something we're looking into, as it's coming up more and more frequently. Would Slack be your first pick? Or would something async

Re: Size of state for any known production use case

2020-02-13 Thread Marta Paes Moreira
Hi, Reva. If you are looking for the maximum known state size, I believe Alibaba is using Flink at the largest scale in production [1]. There are also other examples of variable scale scattered across Flink Forward talks [2]. In particular, this Netflix talk [3] should be interesting to you. Mar

Re: [DISCUSS] Create a Flink ecosystem website

2019-07-19 Thread Marta Paes Moreira
Hey, Robert. I will keep an eye on the overall progress and get started on the blog post to make the community announcement. Are there (mid-term) plans to translate/localize this website as well? It might be a point worth mentioning in the blogpost. Hats off to you and Daryl — this turned out ama