Re: Deploying and managing multiple jobs on EMR

2019-05-29 Thread orips
I agree. At the moment deploying Flink to EMR is laborious and too custom. I would like to know too. -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Flink vs KStreams

2019-05-29 Thread orips
Elias Levy wrote
> Flink:
>
> Pros:
> * Intra-job traffic flows directly between workers.
> * More mature.
> * Higher-level constructs: SQL, CEP, etc.

How is SQL a Pro in Flink? Kafka Streams has KSQL which is at least as good as Flink's SQL.
-- Sent from: http://apache-flink-user-mailing-l

Retention properties for CANCELED/FINISHED jobs

2019-05-29 Thread Abdul Qadeer
Hi! Is it possible to persist the history of completed jobs across JobManager restarts without starting a history server? Also, is there a limit on how many jobs in the CANCELED state are stored at the JobManager?

Re: [DISCUSS] Proposal to support disk spilling in HeapKeyedStateBackend

2019-05-29 Thread Stefan Richter
Hi Yu, Sorry for the late reaction. As already discussed internally, I think this is a very good proposal and design that can help to improve a major limitation of the current state backend. I think that most discussion is happening in the design doc and I left my comments there. Looking forwar

RE: Building Flink distribution with Scala2.12

2019-05-29 Thread Visser, M.J.H. (Martijn)
Hi Boris, I believe you have to change your command to: mvn clean package -pl flink-dist -am -Pscala-2.12 -Dscala-2.12 -DskipTests Also see https://issues.apache.org/jira/browse/FLINK-12007 Thanks, Martijn From: Boris Lublinsky Sent: Wednesday, May 29, 2019 00:38 To: user Subject: Building Fli

Re: Building Flink distribution with Scala2.12

2019-05-29 Thread Boris Lublinsky
Thanks Martijn, this was it. It would be nice to have this in the documentation. Boris Lublinsky FDP Architect boris.lublin...@lightbend.com https://www.lightbend.com/ > On May 29, 2019, at 5:02 AM, Visser, M.J.H. (Martijn) > wrote: > > Hi Boris, > > I believe you have to change your command to

count(DISTINCT) in flink SQL

2019-05-29 Thread Vinod Mehra
Hi! We are using apache-flink-1.4.2. It seems this version doesn't support count(DISTINCT). I am trying to find a way to dedup the stream. So I tried: SELECT CONCAT_WS( '-', CAST(MONTH(longToDateTime(rowtime)) AS VARCHAR), CAST(YEAR(longToDateTime(rowtime)) AS VARCHAR),

Re: count(DISTINCT) in flink SQL

2019-05-29 Thread Vinod Mehra
More details on the error with query#1 that used COUNT(DISTINCT()): org.apache.flink.table.api.TableException: Cannot generate a valid execution plan for the given query: FlinkLogicalCalc(expr#0..8=[{inputs}], expr#9=[_UTF-16LE'-'], expr#10=[CAST($t1):VARCHAR(65536) CHARACTER SET "UTF-16LE" COLLA

Re: count(DISTINCT) in flink SQL

2019-05-29 Thread Vinod Mehra
Another interesting thing is that if I add DISTINCT in the 2nd query it doesn't complain. But it is a no-op there, because the inner select is already doing the deduping: SELECT CONCAT_WS( '-', CAST(MONTH(row_datetime) AS VARCHAR), CAST(YEAR(row_datetime) AS
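The workaround described in this thread (deduplicate in an inner select, then count in the outer query) can be illustrated with a small, self-contained sketch. This is not the poster's query and it does not run on Flink: it uses Python's built-in sqlite3 module as a stand-in SQL engine, and the table `events` and its columns `user_id` and `month` are hypothetical names chosen only for illustration. It shows that counting over an inner SELECT DISTINCT gives the same result as COUNT(DISTINCT ...), and that adding DISTINCT again in the outer query changes nothing.

```python
import sqlite3

# Assumption: SQLite stands in for Flink SQL; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id TEXT, month INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("a", 5), ("a", 5), ("b", 5), ("b", 6), ("b", 6)],
)

# Form 1: COUNT(DISTINCT ...) applied directly to the raw rows.
direct = conn.execute(
    "SELECT month, COUNT(DISTINCT user_id) FROM events "
    "GROUP BY month ORDER BY month"
).fetchall()

# Form 2: deduplicate in an inner SELECT DISTINCT, then use a plain COUNT outside.
nested = conn.execute(
    "SELECT month, COUNT(user_id) FROM "
    "(SELECT DISTINCT month, user_id FROM events) AS d "
    "GROUP BY month ORDER BY month"
).fetchall()

# Form 3: DISTINCT in the outer query as well; the inner select has already
# removed the duplicates, so the outer DISTINCT has no further effect.
nested_distinct = conn.execute(
    "SELECT month, COUNT(DISTINCT user_id) FROM "
    "(SELECT DISTINCT month, user_id FROM events) AS d "
    "GROUP BY month ORDER BY month"
).fetchall()

print(direct, nested, nested_distinct)
```

All three queries return one row per month with the number of distinct users. Whether a given Flink version can plan each form is a separate question that this sketch does not address.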

Re: What are savepoint state manipulation support plans

2019-05-29 Thread Tzu-Li (Gordon) Tai
FYI: Seth has started a FLIP for adding a savepoint connector that addresses this - http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Discuss-FLIP-43-Savepoint-Connector-td29233.html Please join the discussion there if you are interested! On Thu, Mar 28, 2019 at 5:23 PM Tzu-Li (Gordon)