Re: multiple joins in one job

2020-05-05 Thread Fabian Hueske
You can in fact forward both time attributes, because Flink makes sure that the watermark is automatically adjusted to the "slower" of both input streams. You can run the following queries in the SQL CLI client (here taking an example from a Flink SQL training [1]): Flink SQL> CREATE VIEW ridesWithFa
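
To illustrate the point with a sketch (table and column names here are assumptions, not taken from the training): an interval join can forward one input's rowtime attribute, and the result stays usable for further joins or windows without re-declaring a watermark.

    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.Table;
    import org.apache.flink.table.api.TableEnvironment;

    public class IntervalJoinSketch {
        public static void main(String[] args) {
            TableEnvironment tEnv = TableEnvironment.create(
                    EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build());
            // Assumes tables Rides(rideId, rideTime) and Fares(rideId, payTime, amount)
            // are already registered with rowtime attributes and watermarks.
            Table ridesWithFares = tEnv.sqlQuery(
                    "SELECT r.rideId, r.rideTime, f.amount " + // r.rideTime remains a time attribute
                    "FROM Rides r, Fares f " +
                    "WHERE r.rideId = f.rideId " +
                    "AND f.payTime BETWEEN r.rideTime - INTERVAL '5' MINUTE AND r.rideTime");
            // The view can feed a second interval join or a window directly.
            tEnv.createTemporaryView("RidesWithFares", ridesWithFares);
        }
    }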

Re: MongoDB sink;

2020-05-05 Thread myflink
My solution: first, Flink sinks the data to Kafka; second, MongoDB reads the data from Kafka. Done. -- Original message from "Aissa Elaffani"
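
A sketch of the first hop (broker address, topic name, and the sample record are placeholders): the Flink job writes JSON strings to a Kafka topic, and a separate consumer, for example the Kafka Connect MongoDB sink connector, moves them into MongoDB.

    import java.util.Properties;

    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer;

    public class KafkaBridgeJob {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            Properties props = new Properties();
            props.setProperty("bootstrap.servers", "localhost:9092"); // assumed broker

            env.fromElements("{\"sensor\":\"s1\",\"temp\":21.5}")     // stand-in for the real stream
               .addSink(new FlinkKafkaProducer<>("mongo-ingest", new SimpleStringSchema(), props));

            env.execute("flink-to-kafka bridge");
        }
    }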

MongoDB sink;

2020-05-05 Thread Aissa Elaffani
Hello, I want to sink my data to MongoDB, but as far as I know there is no sink connector for MongoDB. How can I implement a MongoDB sink? If there are any other solutions, I hope you can share them with me.
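
One common approach is a custom sink wrapping the MongoDB Java driver. A minimal sketch, assuming string-encoded JSON records; the URI, database, and collection names are placeholders, and batching, error handling, and delivery guarantees are left out:

    import com.mongodb.client.MongoClient;
    import com.mongodb.client.MongoClients;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;
    import org.bson.Document;

    public class MongoSink extends RichSinkFunction<String> {
        private transient MongoClient client;

        @Override
        public void open(Configuration parameters) {
            client = MongoClients.create("mongodb://localhost:27017"); // assumed URI
        }

        @Override
        public void invoke(String json, Context context) {
            // One write per record; a real sink would batch and handle failures.
            client.getDatabase("mydb").getCollection("events").insertOne(Document.parse(json));
        }

        @Override
        public void close() {
            if (client != null) {
                client.close();
            }
        }
    }

Attach it with stream.addSink(new MongoSink()).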

Re: Flink: For terabytes of keyed state.

2020-05-05 Thread Gowri Sundaram
Hi Congxian, Thank you so much for your response! We will go ahead and do a POC to test out how Flink performs at scale. Regards, - Gowri On Wed, May 6, 2020 at 8:34 AM Congxian Qiu wrote: > Hi > > From my experience, you should care about the state size for a single task (not > the whole job state si

Autoscaling vs backpressure

2020-05-05 Thread Manish G
Hi, as Flink doesn't provide out-of-the-box support for autoscaling, can backpressure be considered an alternative to it? Autoscaling allows us to add/remove nodes as load goes up/down. With backpressure, if load goes up, the system signals upstream to release data more slowly, so we don't need to add

What is the RocksDB local directory in flink checkpointing?

2020-05-05 Thread LakeShen
Hi community, I have a question about the Flink checkpoint local directory. Our Flink version is 1.6 and the job mode is Flink on YARN, per-job. I read the Flink source code and found that the checkpoint local directory is /tmp when "state.backend.rocksdb.localdir" is not configured. But I go i
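
For reference, besides the "state.backend.rocksdb.localdir" key in flink-conf.yaml, the local directory can also be set programmatically on the state backend; a sketch with assumed paths:

    import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class LocalDirSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            // Checkpoints go to the durable URI; RocksDB keeps its working files
            // under the local path below instead of the default temp directory.
            RocksDBStateBackend backend = new RocksDBStateBackend("hdfs:///flink/checkpoints");
            backend.setDbStoragePath("/data/flink/rocksdb"); // assumed local path
            env.setStateBackend(backend);
        }
    }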

Re: Flink: For terabytes of keyed state.

2020-05-05 Thread Congxian Qiu
Hi, from my experience, you should care about the state size for a single task (not the whole job state size); the download speed for a single thread is almost 100 MB/s (this may vary in different environments), and I do not have much performance data for loading state into RocksDB (we use an internal KV store in my comp
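
To make that concrete (numbers assumed purely for illustration): with 10 TB of total state spread evenly over 500 parallel tasks, each task holds about 20 GB; at roughly 100 MB/s per download thread, restoring one task's state takes on the order of 200 seconds, regardless of the total job size.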

Re: multiple joins in one job

2020-05-05 Thread Benchao Li
Yes. The watermark will be propagated correctly; it is the min of the two inputs. On Wed, May 6, 2020 at 9:46 AM, lec ssmi wrote: > Even if the time attribute field is retained, will the related watermark > be retained? > If not, and there is no SQL syntax to declare the watermark again, it is > equivalent to not b

Re: multiple joins in one job

2020-05-05 Thread lec ssmi
Even if the time attribute field is retained, will the related watermark be retained? If not, and there is no SQL syntax to declare the watermark again, it is equivalent to not being able to do multiple joins in one job. On Tue, May 5, 2020 at 9:23 PM, Benchao Li wrote: > You cannot select more than one time attri

Re: Writing _SUCCESS Files (Streaming and Batch)

2020-05-05 Thread Jingsong Li
Hi Peter, The troublesome part is how to know the "end" of a bucket in a streaming job. In 1.11, we are trying to implement a watermark-related bucket-ending mechanism [1] in Table/SQL. [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-115%3A+Filesystem+connector+in+Table Best, Jingsong Lee

Re: Cannot start native K8s

2020-05-05 Thread Yang Wang
Hi Dongwon Kim, I think it is a known issue. The native Kubernetes integration does not work with JDK 8u252 due to an okhttp issue [1]. Currently, you can upgrade your JDK to a newer version as a workaround. [1] https://issues.apache.org/jira/browse/FLINK-17416 On Wed, May 6, 2020 at 7:15 AM, Dongwon Kim wrote:

Export user metrics with Flink Prometheus endpoint

2020-05-05 Thread Eleanore Jin
Hi all, I just wonder whether it is possible to use the Flink metrics endpoint to allow Prometheus to scrape user-defined metrics? Context: in addition to Flink metrics, we also collect some application-level metrics using opencensus, and we run the opencensus agent as a sidecar in the Kubernetes pod to collect metri
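
For the Flink side, user-defined metrics registered on the runtime metric group are exposed through whatever reporter is configured, e.g. the PrometheusReporter set up in flink-conf.yaml. A minimal sketch (class and metric names are illustrative):

    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.metrics.Counter;

    public class CountingMapper extends RichMapFunction<String, String> {
        private transient Counter processed;

        @Override
        public void open(Configuration parameters) {
            // Registered metrics show up on the configured reporter endpoints.
            processed = getRuntimeContext().getMetricGroup().counter("processedRecords");
        }

        @Override
        public String map(String value) {
            processed.inc();
            return value;
        }
    }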

Casting from org.apache.flink.api.java.tuple.Tuple2 to scala.Product; using java and scala in flink job

2020-05-05 Thread Nick Bendtner
Hi guys, in our Flink job we use a Java source for deserializing messages from Kafka using a Kafka deserializer. The signature is as below: public class CustomAvroDeserializationSchema implements KafkaDeserializationSchema<Tuple2<..., ...>> The other parts of the streaming job are in Scala. When data has to b
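
One workaround, sketched with illustrative element types: org.apache.flink.api.java.tuple.Tuple2 does not implement scala.Product, but scala.Tuple2 does, so an explicit map can bridge the two.

    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.datastream.DataStream;

    public class ToScalaTuple {
        public static DataStream<scala.Tuple2<String, Long>> toScala(
                DataStream<Tuple2<String, Long>> in) {
            return in.map(new MapFunction<Tuple2<String, Long>, scala.Tuple2<String, Long>>() {
                @Override
                public scala.Tuple2<String, Long> map(Tuple2<String, Long> t) {
                    // scala.Tuple2 implements scala.Product, so the Scala side
                    // of the job can consume the result directly.
                    return new scala.Tuple2<>(t.f0, t.f1);
                }
            });
        }
    }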

MongoDB as a Sink;

2020-05-05 Thread Aissa Elaffani
Hello Guys, I am looking for some help concerning my Flink sink; I want the output to be stored in a MongoDB database. As far as I know, there is no sink connector for MongoDB, so I need to implement one myself, and I don't know how to do that. Can you please help me with this?

Cannot start native K8s

2020-05-05 Thread Dongwon Kim
Hi, I'm using Flink-1.10 and tested everything [1] successfully. While trying [2], I got the following message. Can anyone help please? [root@DAC-E04-W06 bin]# ./kubernetes-session.sh > 2020-05-06 08:10:49,411 INFO > org.apache.flink.configuration.GlobalConfiguration- Loading > confi

Flink pipeline;

2020-05-05 Thread Aissa Elaffani
Hello Guys, I am new to the real-time streaming field, and I am trying to build a big data architecture for processing real-time streams. I have some sensors that generate data in JSON format; they are sent to an Apache Kafka cluster, and then I want to consume them with Apache Flink in order to do some a
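
A sketch of the consuming side (broker address, group id, and topic name are placeholders): read the JSON strings from Kafka into a DataStream, then parse and aggregate them downstream.

    import java.util.Properties;

    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

    public class SensorPipeline {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            Properties props = new Properties();
            props.setProperty("bootstrap.servers", "localhost:9092"); // assumed broker
            props.setProperty("group.id", "sensor-consumer");         // assumed group

            env.addSource(new FlinkKafkaConsumer<>("sensors", new SimpleStringSchema(), props))
               // Parse the JSON and aggregate here, e.g. map + keyBy + window.
               .print();

            env.execute("sensor pipeline");
        }
    }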

Overriding hadoop core-site.xml keys using the flink-fs-hadoop-shaded assemblies

2020-05-05 Thread Jeff Henrikson
Has anyone had success overriding hadoop core-site.xml keys using the flink-fs-hadoop-shaded assemblies? If so, what versions were known to work? Using btrace, I am seeing a bug in the hadoop shaded dependencies distributed with 1.10.0. Some (but not all) of the core-site.xml keys cannot be

Flink - Hadoop Connectivity - Unable to read file

2020-05-05 Thread Samik Mukherjee
Hi All, I am trying to read a file from HDFS, which is installed locally, but I am not able to. I tried both of these ways, but every time the program ends with "Process finished with exit code 239." Any help would be appreciated. public class Processor { public static void main(String[]
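
For comparison, a minimal working read with the DataSet API (namenode address and path are assumptions; flink-shaded-hadoop or HADOOP_CLASSPATH must be on the classpath for the hdfs:// scheme to resolve):

    import org.apache.flink.api.java.ExecutionEnvironment;

    public class HdfsRead {
        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
            env.readTextFile("hdfs://localhost:9000/path/to/input.txt") // assumed URI
               .first(10)   // sample a few lines to verify connectivity
               .print();
        }
    }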

Re: table.show() in Flink

2020-05-05 Thread Fabian Hueske
There's also the Table API approach if you want to avoid typing a "full" SQL query: Table t = tEnv.from("myTable"); Cheers, Fabian On Tue, May 5, 2020 at 4:34 PM, Őrhidi Mátyás < matyas.orh...@gmail.com> wrote: > Thanks guys for the prompt answers! > > On Tue, May 5, 2020 at 2:49 PM Kurt You

Re: table.show() in Flink

2020-05-05 Thread Őrhidi Mátyás
Thanks guys for the prompt answers! On Tue, May 5, 2020 at 2:49 PM Kurt Young wrote: > A more straightforward way after FLIP-84 would be: > TableResult result = tEnv.executeSql("select xxx ..."); > result.print(); > > And if you are using 1.10 now, you can use TableUtils#collectToList(table) > t

Re: multiple joins in one job

2020-05-05 Thread Benchao Li
You cannot select more than one time attribute; the planner will give you an exception if you do that. On Tue, May 5, 2020 at 8:34 PM, lec ssmi wrote: > As you said, if I select all the time attribute fields from > both, which will be the final one? > > On Tue, May 5, 2020 at 5:26 PM, Benchao Li wrote: > >

Re: Restore from savepoint with Iterations

2020-05-05 Thread ashish pok
Let me see if I can add an artificial throttle somewhere. The volume of data is really high, hence I'm trying to avoid round trips through Kafka too. Looks like the options are "not so elegant" until FLIP-15. Thanks for the pointers again! On Monday, May 4, 2020, 11:06 PM, Ken Krugler wrote: Hi Ashish, The workaroun
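
If it helps, a crude artificial throttle can be as simple as a sleeping map (a sketch, not a production rate limiter; each parallel subtask is capped independently):

    import org.apache.flink.api.common.functions.MapFunction;

    public class ThrottleMap<T> implements MapFunction<T, T> {
        private final long pauseMillis;

        public ThrottleMap(long maxPerSecond) {
            this.pauseMillis = 1000L / maxPerSecond; // per-subtask cap
        }

        @Override
        public T map(T value) throws Exception {
            Thread.sleep(pauseMillis); // blocks the subtask between records
            return value;
        }
    }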

Re: table.show() in Flink

2020-05-05 Thread Kurt Young
A more straightforward way after FLIP-84 would be: TableResult result = tEnv.executeSql("select xxx ..."); result.print(); And if you are using 1.10 now, you can use TableUtils#collectToList(table) to collect the result into a list and then print the rows yourself. Best, Kurt On Tue, May 5, 2020
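
A sketch of the 1.10 variant, assuming the blink planner (where the TableUtils helper mentioned above lives) and a registered table small enough to collect client-side:

    import java.util.List;

    import org.apache.flink.table.api.Table;
    import org.apache.flink.table.api.TableUtils;
    import org.apache.flink.table.api.java.StreamTableEnvironment;
    import org.apache.flink.types.Row;

    public class ShowTable {
        static void show(StreamTableEnvironment tEnv) {
            Table table = tEnv.sqlQuery("SELECT * FROM my_table"); // assumed table
            List<Row> rows = TableUtils.collectToList(table);      // materialize client-side
            rows.forEach(System.out::println);
        }
    }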

Re: table.show() in Flink

2020-05-05 Thread Jark Wu
Hi Matyas, AFAIK, this is currently the recommended way to print the result of a table. In FLIP-84 [1], which is targeted for 1.11, we will introduce some new APIs to do fluent printing like this: Table table2 = tEnv.sqlQuery("select yy ..."); TableResult result2 = table2.execute(); result2.print(

Re: Reading from sockets using dataset api

2020-05-05 Thread Arvid Heise
Hi Kaan, explicitly mapping to physical nodes is currently not supported and would need some workarounds. I have re-added the user mailing list (please always include it in your responses); maybe someone can help you with that. Best, Arvid On Thu, Apr 30, 2020 at 10:12 AM Kaan Sancak wrote: > One q

Re: multiple joins in one job

2020-05-05 Thread lec ssmi
As you said, if I select all the time attribute fields from both, which will be the final one? On Tue, May 5, 2020 at 5:26 PM, Benchao Li wrote: > Hi lec, > > You don't need to specify the time attribute again like `TUMBLE_ROWTIME`, you > just select the time attribute field > from one of the inp

table.show() in Flink

2020-05-05 Thread Őrhidi Mátyás
Dear Flink Community, I'm missing Spark's table.show() method in Flink. I'm using the following alternative at the moment: Table results = tableEnv.sqlQuery("SELECT * FROM my_table"); tableEnv.toAppendStream(results, Row.class).print(); Is this the recommended way to print the content of a table?

Flink on Kubernetes unable to Recover from failure

2020-05-05 Thread Morgan Geldenhuys
Community, I am currently doing some fault-tolerance testing for Flink (1.10) running on Kubernetes (1.18) and am encountering an error where, after a running job experiences a failure, the job fails completely. A Flink session cluster has been created according to the documentation containe

Re: ML/DL via Flink

2020-05-05 Thread m@xi
Hello Becket, I just watched your Flink Forward talk. Really interesting! I leave the link here as it is related to the post. AI Flow (FF20) Best, Max

Re: multiple joins in one job

2020-05-05 Thread Benchao Li
Hi lec, You don't need to specify the time attribute again like `TUMBLE_ROWTIME`; you just select the time attribute field from one of the inputs, and it will be a time attribute automatically. On Tue, May 5, 2020 at 4:42 PM, lec ssmi wrote: > But I have not found there is any syntax to specify time > a

Re: Autoscaling flink application

2020-05-05 Thread Manish G
Thanks. It would help. On Tue, May 5, 2020 at 2:12 PM David Anderson wrote: > There's no explicit, out-of-the-box support for autoscaling, but it can be > done. > > Autoscaling came up a few times at the recent Virtual Flink Forward, > including a talk on Autoscaling Flink at Netflix [1] by Timo

Re: Autoscaling flink application

2020-05-05 Thread David Anderson
There's no explicit, out-of-the-box support for autoscaling, but it can be done. Autoscaling came up a few times at the recent Virtual Flink Forward, including a talk on Autoscaling Flink at Netflix [1] by Timothy Farkas. [1] https://www.youtube.com/watch?v=NV0jvA5ZDNc Regards, David On Mon,

Re: multiple joins in one job

2020-05-05 Thread lec ssmi
But I have not found any syntax to specify the time attribute field and watermark again in pure SQL. On Tue, May 5, 2020 at 3:47 PM, Fabian Hueske wrote: > Sure, you can write a SQL query with multiple interval joins that preserve > event-time attributes and watermarks. > There's no ne

Re: multiple joins in one job

2020-05-05 Thread Fabian Hueske
Sure, you can write a SQL query with multiple interval joins that preserve event-time attributes and watermarks. There's no need to feed data back to Kafka just to inject it again to assign new watermarks. On Tue, May 5, 2020 at 1:45 AM, lec ssmi wrote: > I mean using pure sql statement to ma