Re: Flink native k8s integration vs. operator

2022-01-13 Thread Xintong Song
Thanks for volunteering to drive this effort, Marton, Thomas and Gyula. Looking forward to the public discussion. Please feel free to reach out if there's anything you need from us. Thank you~ Xintong Song On Fri, Jan 14, 2022 at 8:27 AM Chenya Zhang wrote: > Thanks Thomas, Gyula, and Marto

Re: [FEEDBACK] Metadata Platforms / Catalogs / Lineage integration

2022-01-13 Thread JIN FENG
Hi, I am a software engineer at Xiaomi. Last year we used Metacat (https://github.com/Netflix/metacat) to manage all metadata, including Hive, Kudu, Doris, Iceberg, Elasticsearch, Talos (Xiaomi's self-developed message queue), MySQL, and TiDB. Metacat is well compatible with the hive-metastore protoco

Unsubscribe

2022-01-13 Thread Jerome Li
Unsubscribe

FlinkKafkaConsumer and FlinkKafkaProducer and Kafka Cluster Migration

2022-01-13 Thread Alexey Trenikhun
Hello, Currently we are using FlinkKafkaConsumer and FlinkKafkaProducer and are planning to migrate to a different Kafka cluster. Are the bootstrap servers, username and password part of FlinkKafkaConsumer and FlinkKafkaProducer? So if we take a savepoint, change the bootstrap servers and credentials, and start

Re: Flink native k8s integration vs. operator

2022-01-13 Thread Chenya Zhang
Thanks Thomas, Gyula, and Marton for driving this effort! It would greatly ease the adoption of Apache Flink on Kubernetes and help to address the current operational pain points as mentioned. Look forward to the proposal and more discussions! Best, Chenya On Thu, Jan 13, 2022 at 12:15 PM Márton

Re: Flink native k8s integration vs. operator

2022-01-13 Thread Márton Balassi
Hi All, I am pleased to see the level of enthusiasm and technical consideration already emerging in this thread. I wholeheartedly support building an operator and endorsing it via placing it under the Apache Flink umbrella (as a separate repository) as the current lack of it is clearly becoming an

Re: [FEEDBACK] Metadata Platforms / Catalogs / Lineage integration

2022-01-13 Thread Maciej Obuchowski
Hello, I'm an OpenLineage committer - and previously, a minor Flink contributor. The OpenLineage community is very interested in a conversation about Flink metadata, and we'll be happy to cooperate with the Flink community. Best, Maciej Obuchowski On Thu, 13 Jan 2022 at 18:12 Martijn Visser wrote

Re: [FEEDBACK] Metadata Platforms / Catalogs / Lineage integration

2022-01-13 Thread Martijn Visser
Hi all, @Andrew thanks for sharing that! @Tero good point, I should have clarified the purpose. I want to understand which "metadata platform" tools are used or evaluated by the Flink community, what their purpose is in using such a tool (is it as a generic catalogue, as a data discovery tool, is

Re: [DISCUSS] Future of Per-Job Mode

2022-01-13 Thread Thomas Weise
Regarding session mode: ## Session Mode * main() method executed in client Session mode also supports execution of the main method on the Jobmanager with submission through the REST API. That's how Flink k8s operators like [1] work. It's actually an important capability because it allows for allocation

Re: [FEEDBACK] Metadata Platforms / Catalogs / Lineage integration

2022-01-13 Thread Pedro Silva
Hello, I'm part of the DataHub community and working in collaboration with the company behind it: http://acryldata.io Happy to have a conversation or clarify any questions you may have on DataHub :) Have a nice day! On Thu, Jan 13, 2022 at 15:33, Andrew Otto wrote: > Hello! The Wiki

Re: [FEEDBACK] Metadata Platforms / Catalogs / Lineage integration

2022-01-13 Thread Andrew Otto
Hello! The Wikimedia Foundation is currently doing a similar evaluation (although we are not currently including any Flink considerations). https://wikitech.wikimedia.org/wiki/Data_Catalog_Application_Evaluation_Rubric More details will be published there as folks keep working on this. Hope that

[FEEDBACK] Metadata Platforms / Catalogs / Lineage integration

2022-01-13 Thread Martijn Visser
Hi everyone, I'm currently checking out different metadata platforms, such as Amundsen [1] and DataHub [2]. In short, these types of tools try to address problems related to topics such as data discovery, data lineage and an overall data catalogue. I'm reaching out to the Dev and User mailing lis

Re: Upgrade to flink 1.14.2 and using new Data Source and Sink API

2022-01-13 Thread Mika Naylor
Hi Daniel, These logs look pretty normal. As for the -1 epochs, depending on which version you're using, I think that this might apply: "For a producer which is being initialized for the first time, the producerId and epoch will be set to -1. For a producer which is reinitializing, a positive

Flink 1.14.2 - Log4j2 -Dlog4j.configurationFile is ignored and falls back to default /opt/flink/conf/log4j-console.properties

2022-01-13 Thread Tamir Sagi
Hey all, I'm running Flink 1.14.2, and it seems to ignore the system property -Dlog4j.configurationFile and fall back to /opt/flink/conf/log4j-console.properties. I enabled debug logging for log4j2 (-Dlog4j2.debug): DEBUG StatusLogger Catching java.io.FileNotFoundException: file:/opt/flink/conf/log4

Re: [DISCUSS] Future of Per-Job Mode

2022-01-13 Thread Biao Geng
Hi Konstantin, Thanks a lot for starting this discussion! I hope my thoughts and experience on why users choose Per-Job Mode, especially on YARN, can help: #1. Per-job mode makes managing dependencies easier: I have met some customers who used Per-Job Mode to submit jobs with a lot of local user-defined

[DISCUSS] Future of Per-Job Mode

2022-01-13 Thread Konstantin Knauf
Hi everyone, I would like to discuss and understand if the benefits of having Per-Job Mode in Apache Flink outweigh its drawbacks. *# Background: Flink's Deployment Modes* Flink currently has three deployment modes. They differ in the following dimensions: * main() method executed on Jobmanager

Re: OutOfMemoryError: Java heap space while implementing flink sql api

2022-01-13 Thread Martijn Visser
Hi Ronak, As mentioned in the Flink Community & Project information [1], the primary place for support is the mailing lists, and user support should go to the User mailing list. Keep in mind that this is still done by the community and set up for asynchronous handling. If you want to have quick ack

Re: Flink native k8s integration vs. operator

2022-01-13 Thread Konstantin Knauf
Hi Thomas, Yes, I was referring to a separate repository under Apache Flink. Cheers, Konstantin On Thu, Jan 13, 2022 at 6:19 AM Thomas Weise wrote: > Hi everyone, > > Thanks for the feedback and discussion. A few additional thoughts: > > [Konstantin] > With respect to common lifecycle managem