Re: Obtain Source (Sink) out of Source (Sink) and f:A->B

2022-07-18 Thread Salva Alcántara
I've also posted an extended version of my original question in SO: - https://stackoverflow.com/questions/73031434/obtain-sourceb-sinkb-out-of-sourcea-sinka-and-fa-b-fb-a Salva On Mon, Jul 18, 2022 at 9:25 PM Alexander Fedulov wrote: > I had to do something similar recently for FLIP-238 (genera

Re: Obtain Source (Sink) out of Source (Sink) and f:A->B

2022-07-18 Thread Salva Alcántara
Thanks Alexander! That is different from what I initially had in mind, but interesting anyway so I will take a closer look at it. What I initially had in mind is a construct for simply: - appending a conversion functions f:A->B to a Source in order to get a Source (corresponds to "map") - prepend

Re: standalone mode support in the kubernetes operator (FLIP-25)

2022-07-18 Thread Yang Wang
I think at least we have the following advantages(in some cases). * We do not need to configure a service account for JobManager, which allows it could allocate/delete pods from Kubernetes APIServer. * The reactive mode[1] could only work with standalone cluster. [1]. https://nightlies.apache.org/

Re: Any usage examples for flink-table-api-java-bridge?

2022-07-18 Thread Alexander Fedulov
You are welcome, glad it helped! Best, Alexander Fedulov On Mon, Jul 18, 2022 at 8:06 PM Salva Alcántara wrote: > For the record, Alexander Fedulov pointed me to an example within the > kafka connector: > > > https://github.com/apache/flink/blob/025675725336cd572aa2601be525efd4995e5b84/flink-co

Re: Obtain Source (Sink) out of Source (Sink) and f:A->B

2022-07-18 Thread Alexander Fedulov
I had to do something similar recently for FLIP-238 (generator source) [1]. The PoC [2] reuses the NumberSequenceSource to produce other data types based on a user-defined mapper. The mapping happens within the SourceReader. Here are some relevant classes [3], [4]. Not sure if this is the best appr

RE: Any usage examples for flink-table-api-java-bridge?

2022-07-18 Thread Salva Alcántara
For the record, Alexander Fedulov pointed me to an example within the kafka connector: https://github.com/apache/flink/blob/025675725336cd572aa2601be525efd4995e5b84/flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/table/KafkaDynamicSink.java#L218 Th

Re: Obtain Source (Sink) out of Source (Sink) and f:A->B

2022-07-18 Thread Salva Alcántara
Yep, that is mostly it. I have (DataStream) connector (sources & sink) which works for a fixed type (`JsonNode` for what it's worth) as you say and I want to reuse it for Table/SQL, which requires working with `DataRow` as the underlying data type. But even beyond that specific use case, I think be

PyFlink SQL: force maximum use of slots

2022-07-18 Thread John Tipper
Hi all, Is there a way of forcing a pipeline to use as many slots as possible? I have a program in PyFlink using SQL and the Table API and currently all of my pipeline is using just a single slot. I've tried this:    StreamExecutionEnvironment.get_execution_environment().disable_operator_

Re: flink on yarn job always restart

2022-07-18 Thread SmileSmile
When I turn off zk's ha configuration and do a fault walkthrough, yarn's resourceManager log comes up with the following message. 2022-07-18 23:42:53,633 WARN org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop OPERATION=Application Finished - Failed TARGET=RMAppManager

Re: Obtain Source (Sink) out of Source (Sink) and f:A->B

2022-07-18 Thread Alexander Fedulov
Hi Salva, what is the goal? Do you have some source that already has a fixed type and you want to reuse its functionality for producing a different data type? Best, Alexander Fedulov On Mon, Jul 18, 2022 at 1:29 PM Salva Alcántara wrote: > If I have a Source (Sink), what would be the simplest

Re: flink on yarn job always restart

2022-07-18 Thread Geng Biao
The log shows that “Diagnostics Cluster entrypoint has been closed externally..” So are you trying to kill the YARN cluster entrypoint process directly in the terminal using “kill ”? If users want to kill a TM, they should go to the machine that the TM process resides and kill the TM process. C

Re: flink on yarn job always restart

2022-07-18 Thread SmileSmile
Thanks for the reply, our scenario was a failure test to see if the job would recover on its own after killing a TM. It turns out that the job gets a SIGNAL 15 hang during the switch from DEPLOYING to INITIALIZING. Because zk's ha appears to restart repeatedly My confusion 1. why does it receive

Re: flink on yarn job always restart

2022-07-18 Thread Geng Biao
Hi, One possible direction is to check your YARN log or TM log to see if the YARN RM kills the TM for some reason(e.g. physical memory is over limit) and as a result, the JM will try to recover the TM repeatedly according to your restart strategy. The snippet of JM logs you provide is usually n

Re: flink on yarn job always restart

2022-07-18 Thread SmileSmile
The previous logs were all job deployment logs, then suddenly JM received SIGNAL 15, and all components started to exit 2022-07-18 20:31:27,813 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - theOpeartion namex (839/3000) (f913468fb654c6d2c3466ef28d296396) switched

Re: PyFlink and parallelism

2022-07-18 Thread John Tipper
Thanks very much, given that I'm using SQL it doesn't look like I'm able to access the operators to be able to change the parallelism without dropping from the table API into the datastream API and then back again. In any case, conversion between a tablesteam and a datastream is broken if the t

Re: flink on yarn job always restart

2022-07-18 Thread Zhanghao Chen
Hi, could you provide the whole JM log? Best, Zhanghao Chen From: SmileSmile Sent: Monday, July 18, 2022 20:46 To: user Subject: flink on yarn job always restart hi all we meet a situation, parallelism 3000,the job contains multiple agg operation,the job recove

flink on yarn job always restart

2022-07-18 Thread SmileSmile
hi all we meet a situation, parallelism 3000,the job contains multiple agg operation,the job recover from checkpoint or savepoint must be unrecoverable, the job restarts repeatedly jm error logorg.apache.flink.runtime.entrypoint.ClusterEntrypoint[] - RECEIVED S IGNAL 15: SIGTERM. Shuttin

RE: Any usage examples for flink-table-api-java-bridge?

2022-07-18 Thread Salva Alcántara
Sorry, I forgot to copy the relevant text from the docs regarding this library: "If you want to develop a connector that needs to bridge with DataStream APIs (i.e. if you want to adapt a DataStream connector to the Table API), you need to add this dependency:" As I said, it's still not clear to m

Obtain Source (Sink) out of Source (Sink) and f:A->B

2022-07-18 Thread Salva Alcántara
If I have a Source (Sink), what would be the simplest way of obtaining a Source (Sink) based on a mapping/conversion function from A to B. AFAIK sources & sinks don't have map so I was just wondering how to approach this in the context of new sources/sinks apis. Regards, Salva

Any usage examples for flink-table-api-java-bridge?

2022-07-18 Thread Salva Alcántara
The following library is mentioned here: - https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sourcessinks/#project-configuration org.apache.flink flink-table-api-java-bridge 1.16-SNAPSHOT provided ``` The following is stated: When developing the connector

Re: unsubscribe

2022-07-18 Thread Yuxin Tan
Hi, you can send any content to user-unsubscr...@flink.apache.org to unsubscribe. Best Regards Yuxin Alex Drobinsky 于2022年7月17日周日 14:23写道: > >