Re: Does Flink support TFRecordFileOutputFormat?

2020-07-12 Thread Peidian Li
Thanks, I'll check it out.

Jingsong Li wrote on Mon, Jul 13, 2020 at 2:50 PM:
> Hi,
>
> Flink also has `HadoopOutputFormat`; it can wrap a Hadoop OutputFormat as a Flink sink.
> You can give it a try.
>
> Best,
> Jingsong
>
> On Mon, Jul 13, 2020 at 2:34 PM 殿李 wrote:
>> Hi,
>>
>> Yes, TF means TensorFlow.
>> […]

Re: Does Flink support TFRecordFileOutputFormat?

2020-07-12 Thread Jingsong Li
Hi,

Flink also has `HadoopOutputFormat`; it can wrap a Hadoop OutputFormat as a Flink sink. You can give it a try.

Best,
Jingsong

On Mon, Jul 13, 2020 at 2:34 PM 殿李 wrote:
> Hi,
>
> Yes, TF means TensorFlow.
>
> This class may not be in the Spark package, but Spark supports writing
> this file format […]
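For readers landing on this thread: a sketch of what Jingsong describes, wiring the TensorFlow ecosystem's Hadoop output format (`org.tensorflow:tensorflow-hadoop`) into a Flink batch job through Flink's `HadoopOutputFormat` bridge. This needs a cluster plus the Flink, Hadoop, and tensorflow-hadoop dependencies on the classpath to actually run; the path and data are illustrative, not from the thread.

```java
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.hadoop.mapreduce.HadoopOutputFormat;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.tensorflow.hadoop.io.TFRecordFileOutputFormat;

public class TfRecordSinkJob {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Values are assumed to be already-serialized tf.Example protos.
        DataSet<Tuple2<BytesWritable, NullWritable>> records = env
            .fromElements("example-bytes-1", "example-bytes-2")
            .map(new MapFunction<String, Tuple2<BytesWritable, NullWritable>>() {
                @Override
                public Tuple2<BytesWritable, NullWritable> map(String s) {
                    return Tuple2.of(new BytesWritable(s.getBytes()), NullWritable.get());
                }
            });

        // Configure the Hadoop side (output path) on a mapreduce Job,
        // then hand the Hadoop OutputFormat to Flink's bridge as a sink.
        Job job = Job.getInstance();
        FileOutputFormat.setOutputPath(job, new Path("hdfs:///tmp/tfrecords"));

        records.output(new HadoopOutputFormat<>(new TFRecordFileOutputFormat(), job));
        env.execute("write TFRecords via HadoopOutputFormat");
    }
}
```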

Re: Does Flink support TFRecordFileOutputFormat?

2020-07-12 Thread 殿李
Hi,

Yes, TF means TensorFlow.

This class may not be in the Spark package, but Spark supports writing this file format to HDFS:

    tfRDD.saveAsNewAPIHadoopFile(output,
        "org.tensorflow.hadoop.io.TFRecordFileOutputFormat",
        keyClass="org.apache.hadoop.io.BytesWritable" […]
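For context on what `TFRecordFileOutputFormat` actually writes: a TFRecord file is just a sequence of framed records (little-endian u64 length, masked CRC32C of the length bytes, payload, masked CRC32C of the payload). A minimal, self-contained sketch of that framing in plain JDK (9+ for `CRC32C`), independent of Spark, Flink, and Hadoop:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.zip.CRC32C;

public class TfRecordFraming {

    // TFRecord stores CRC32C values "masked" by this rotate-and-add scheme.
    static int maskedCrc32c(byte[] data) {
        CRC32C crc = new CRC32C();
        crc.update(data, 0, data.length);
        int c = (int) crc.getValue();
        return ((c >>> 15) | (c << 17)) + 0xa282ead8;
    }

    // Little-endian encoding helper for the 8-byte length / 4-byte CRC fields.
    static byte[] le(int width, long value) {
        ByteBuffer buf = ByteBuffer.allocate(width).order(ByteOrder.LITTLE_ENDIAN);
        if (width == 8) buf.putLong(value); else buf.putInt((int) value);
        return buf.array();
    }

    // One record: u64 length, masked CRC of the length, payload, masked CRC of the payload.
    public static byte[] record(byte[] payload) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] len = le(8, payload.length);
        out.write(len);
        out.write(le(4, maskedCrc32c(len)));
        out.write(payload);
        out.write(le(4, maskedCrc32c(payload)));
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(record("hello".getBytes()).length); // 8 + 4 + 5 + 4 = 21
    }
}
```

Each record therefore costs 16 bytes of framing on top of the payload, which is why TFRecord readers can stream the file without an index.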

Re: Table API throws "No FileSystem for scheme: file" when loading local parquet

2020-07-12 Thread Danny Chan
> No FileSystem for scheme: file

It seems that your path does not work correctly; from the snippet you gave, the directory name 'test.parquet' seems invalid.

Best,
Danny Chan

On Jul 11, 2020, 8:07 AM +0800, Danny Chan wrote:
> It seems that your path does not work correctly; from the snippet you gave, the […]
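One other common cause of "No FileSystem for scheme: file" worth ruling out (an assumption about the build setup, not something stated in this thread): Hadoop registers its `FileSystem` implementations via `META-INF/services`, and an uber-jar build can let one jar's service file overwrite another's, losing the `file://` entry. With the Maven shade plugin, merging the service files avoids this:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <transformers>
      <!-- merge META-INF/services entries instead of letting one jar's copy win -->
      <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
    </transformers>
  </configuration>
</plugin>
```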

Re: Does Flink support TFRecordFileOutputFormat?

2020-07-12 Thread Danny Chan
I didn't see any class named TFRecordFileOutputFormat in Spark; by TF do you mean TensorFlow?

Best,
Danny Chan

On Jul 10, 2020, 5:28 PM +0800, 殿李 wrote:
> Hi,
>
> Does Flink support TFRecordFileOutputFormat? I can't find the relevant
> information in the documentation.
>
> As far as I know, Spark is suppor […]

Re: Flink 1.11 Table API cannot process Avro

2020-07-12 Thread Lian Jiang
Thanks Leonard and Jark. Here is my repo for reproducing the issue: https://bitbucket.org/jiangok/flink-playgrounds/src/0d242a51f02083711218d3810267117e6ce4260c/table-walkthrough/pom.xml#lines-131. As you can see, my pom.xml has already added the flink-avro and avro dependencies. You can reproduce by: git […]

Re: Flink 1.11 Table API cannot process Avro

2020-07-12 Thread Leonard Xu
Hi, Jiang

> Is there an uber jar or a list of runtime dependencies so that developers can
> easily make the above example of Flink SQL for Avro work? Thanks.

The dependency list for using Avro in Flink SQL is short, and there is no uber jar AFAIK; we only need to add the `flink-avro` and `avro` depend […]
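Concretely, the two dependencies Leonard names look like this in a Maven pom (versions are illustrative for a Flink 1.11 setup; match the `avro` version to whatever your `flink-avro` build expects):

```xml
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-avro</artifactId>
  <version>1.11.0</version>
</dependency>
<dependency>
  <groupId>org.apache.avro</groupId>
  <artifactId>avro</artifactId>
  <version>1.8.2</version>
</dependency>
```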

Re: Flink 1.11 Table API cannot process Avro

2020-07-12 Thread Jark Wu
From the latest exception message, it seems that the Avro factory problem has been resolved. The new exception indicates that you don't have the proper Apache Avro dependencies (flink-avro doesn't bundle Apache Avro), so you have to add Apache Avro to your project dependencies, or export HADOO […]

Error --GC Cleaner Provider -- Flink 1.11.0

2020-07-12 Thread Murali Krishna Pusala
Hi All,

I have written simple Java code that reads data from Hive and transforms it using the Table API (Blink planner) with Flink 1.11.0 on an HDP cluster. I am encountering a "java.lang.Error: Failed to find GC Cleaner among available providers" error. The full stack trace is at the end of the email. Does anyon […]

Re: Flink 1.11 Table API cannot process Avro

2020-07-12 Thread Lian Jiang
Thanks guys. I missed the runtime dependencies. After adding the lines below to https://github.com/apache/flink-playgrounds/blob/master/table-walkthrough/Dockerfile, the original "Could not find any factory for identifier" issue is gone.

    wget -P /opt/flink/lib/ https://repo1.maven.org/maven2/org/apach […]

Re: History Server Not Showing Any Jobs - File Not Found?

2020-07-12 Thread Chesnay Schepler
Ah, I remembered wrong, my apologies. Unfortunately there is no option to prevent the cleanup; it is something I have wanted for a long time, though...

On 11/07/2020 17:57, Hailu, Andreas wrote:
> Thanks for the clarity. To the point you made: (Note that by configuring "historyserver.web.t […]

Re: Customised RESTful trigger

2020-07-12 Thread Chesnay Schepler
You can specify arguments to your job via query parameters or a JSON body (recommended), as documented here.

On 10/07/2020 18:48, Jacek Grzebyta wrote:
> Hello, I am a newbie in the Apache Flin […]
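For reference, submitting a jar through the Flink REST API takes the program arguments in the JSON body of `POST /jars/:jarid/run`. A minimal body might look like the following (field names per the Flink REST API; class name, parallelism, and paths are illustrative):

```json
{
  "entryClass": "com.example.MyJob",
  "parallelism": 2,
  "programArgsList": ["--input", "hdfs:///data/in", "--output", "hdfs:///data/out"]
}
```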

Re: Table API jobs migration to Flink 1.11

2020-07-12 Thread godfrey he
Hi Flavio,

`BatchTableSource` can only be used with the old planner. If you want to use the Blink planner to run a batch job, your table source should implement `StreamTableSource` and its `isBounded` method should return true.

Best,
Godfrey

Flavio Pompermaier wrote on Fri, Jul 10, 2020 at 10:32 PM:
> Is it correct to do someth […]
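A minimal sketch of what Godfrey describes, against the legacy (pre-FLIP-95) source interfaces; the single-column string schema and element data are illustrative, and this needs the Flink table and streaming dependencies to compile:

```java
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.DataTypes;
import org.apache.flink.table.api.TableSchema;
import org.apache.flink.table.sources.StreamTableSource;

// A bounded source that the Blink planner can execute in batch mode.
public class MyBoundedTableSource implements StreamTableSource<String> {

    @Override
    public boolean isBounded() {
        return true; // the key point: tells the Blink planner this input is finite
    }

    @Override
    public DataStream<String> getDataStream(StreamExecutionEnvironment env) {
        return env.fromElements("a", "b", "c");
    }

    @Override
    public TableSchema getTableSchema() {
        return TableSchema.builder().field("word", DataTypes.STRING()).build();
    }

    @Override
    public TypeInformation<String> getReturnType() {
        return Types.STRING;
    }
}
```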

Re: Checkpoint failed because of TimeWindow cannot be cast to VoidNamespace

2020-07-12 Thread Si-li Liu
Someone told me that this issue may be Mesos-specific. I'm kind of a newbie in Flink; I dug into the code but could not reach a conclusion. What I really want is a better join window that emits the result and deletes it from the window state immediately once a join succeeds. Is there any ot […]

Re: Savepoint fails due to RocksDB 2GiB limit

2020-07-12 Thread Ori Popowski
> AFAIK, the 2GB limit is currently still there. As a workaround, maybe you can
> reduce the state size. If this cannot be done with the window operator, would
> the KeyedProcessFunction [1] work for you?

I'll see if I can introduce it to the code.

> If you do, the ProcessWindowFunction is getting as a […]