Flink 1.11 Table API cannot process Avro

2020-07-10 Thread Lian Jiang
Hi, According to https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/connectors/filesystem.html, Avro is supported by the Table API, but the code below failed: tEnv.executeSql("CREATE TABLE people (\n" + "id INT,\n" + "name STRING\n" + ") WITH (\n" +
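
For context, a minimal runnable sketch of the kind of program the message describes, assuming the filesystem connector with the Avro format; the path and the full WITH clause are assumptions (the original snippet is truncated), and the avro format also needs the flink-avro dependency on the classpath:

    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.TableEnvironment;

    public class AvroFilesystemExample {
        public static void main(String[] args) {
            TableEnvironment tEnv = TableEnvironment.create(
                    EnvironmentSettings.newInstance().inBatchMode().build());
            // Hypothetical completion of the truncated CREATE TABLE statement
            tEnv.executeSql(
                    "CREATE TABLE people (\n"
                    + "  id INT,\n"
                    + "  name STRING\n"
                    + ") WITH (\n"
                    + "  'connector' = 'filesystem',\n"
                    + "  'path' = 'file:///data/people',\n"
                    + "  'format' = 'avro'\n"
                    + ")");
            tEnv.executeSql("SELECT * FROM people").print();
        }
    }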

Table API throws "No FileSystem for scheme: file" when loading local parquet

2020-07-10 Thread Lian Jiang
Hi, I am trying the Table API in Flink 1.11: tEnv.executeSql("CREATE TABLE people (\n" + "id INT,\n" + "name STRING\n" + ") WITH (\n" + "'connector' = 'filesystem',\n" + "'path' = 'file:///data/test.parquet',\n" + "'format'=
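
"No FileSystem for scheme: file" is typically a Hadoop-side error: the parquet format reads files through Hadoop classes, so flink-parquet plus a Hadoop dependency (e.g. via HADOOP_CLASSPATH) must be visible, and the error usually means Hadoop's file:// filesystem implementation was not registered (this can also happen when a shaded jar merges dependencies without their META-INF/services entries). A hedged completion of the truncated DDL, with the format value inferred from the subject line:

    tEnv.executeSql(
            "CREATE TABLE people (\n"
            + "  id INT,\n"
            + "  name STRING\n"
            + ") WITH (\n"
            + "  'connector' = 'filesystem',\n"
            + "  'path' = 'file:///data/test.parquet',\n"
            + "  'format' = 'parquet'\n"
            + ")");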

Saving file to the ftp server

2020-07-10 Thread Paweł Goliszewski
Hi to all, I tried to send a file from local storage to an FTP server in a Docker container (image: stilliard/pure-ftpd) using Flink 1.10 with Hadoop 2.8.5. I tried to do so with the following code: final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment(); Dat
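
A minimal sketch of one way this can be wired up, assuming the thread's setup (Flink 1.10 + Hadoop 2.8.5): ftp:// paths can be served by Hadoop's FTPFileSystem through Flink's Hadoop filesystem bridge. The host, port, path, and credentials below are hypothetical placeholders:

    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.core.fs.FileSystem;

    public class FtpSinkExample {
        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
            DataSet<String> data = env.fromElements("a", "b", "c");
            // Credentials in the URI are illustrative only; Hadoop's
            // FTPFileSystem can also pick them up from fs.ftp.* settings.
            data.writeAsText("ftp://user:pass@localhost:21/data/out.txt",
                    FileSystem.WriteMode.OVERWRITE);
            env.execute("write to ftp");
        }
    }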

Customised RESTful trigger

2020-07-10 Thread Jacek Grzebyta
Hello, I am a newbie in the Apache Flink environment. I found that it is possible to trigger a job using the Monitoring REST API. Is it possible to customise a request to start a job with some parameters? From a bigger perspective, I would like to pass a large file URL into a Flink application to d
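
For reference, the REST API's POST /jars/:jarid/run endpoint accepts a "programArgs" field, so a file URL can be passed as a plain job argument. A hedged sketch of such a call; the jar id below is a hypothetical placeholder (the jar is first uploaded via POST /jars/upload, which returns the real id):

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class RestSubmitExample {
        public static void main(String[] args) throws Exception {
            // Pass the file URL as a program argument of the submitted job
            String body = "{\"programArgs\": \"--input https://example.com/big-file.csv\"}";
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("http://localhost:8081/jars/abc123_myjob.jar/run"))
                    .header("Content-Type", "application/json")
                    .POST(HttpRequest.BodyPublishers.ofString(body))
                    .build();
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.body()); // contains the job id on success
        }
    }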

Re: Table API jobs migration to Flink 1.11

2020-07-10 Thread Flavio Pompermaier
Is it correct to do something like this? TableSource myTableSource = new BatchTableSource() { @Override public TableSchema getTableSchema() { return new TableSchema(dsFields, ft); } @Override public DataSet getDataSet(ExecutionEnvironment execEnv) { re
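
A hedged reconstruction of the truncated snippet, an anonymous BatchTableSource wrapping an existing InputFormat; dsFields (String[]), ft (TypeInformation[]), and myInputFormat stand in for the thread's own variables:

    // Assumed imports: org.apache.flink.api.java.{DataSet, ExecutionEnvironment},
    // org.apache.flink.api.common.typeinfo.TypeInformation,
    // org.apache.flink.api.java.typeutils.RowTypeInfo,
    // org.apache.flink.table.api.TableSchema,
    // org.apache.flink.table.sources.BatchTableSource, org.apache.flink.types.Row
    BatchTableSource<Row> myTableSource = new BatchTableSource<Row>() {
        @Override
        public TableSchema getTableSchema() {
            return new TableSchema(dsFields, ft);
        }

        @Override
        public TypeInformation<Row> getReturnType() {
            // Assumption: the produced type mirrors the schema
            return new RowTypeInfo(ft, dsFields);
        }

        @Override
        public DataSet<Row> getDataSet(ExecutionEnvironment execEnv) {
            return execEnv.createInput(myInputFormat);
        }
    };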

Re: flink take single element from stream

2020-07-10 Thread Aljoscha Krettek
I'm afraid limit() is not yet available on the Table API, but you can use it via SQL, i.e. something like "select * FROM (VALUES 'Hello', 'CIAO', 'foo', 'bar') LIMIT 2;" works. You can execute that from the Table API via `TableEnvironment.executeSql()`. Best, Aljoscha On 09.07.20 17:53, Georg Heiler
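
A runnable sketch of that suggestion; the environment setup is an assumption, and the query is the one quoted above:

    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.TableEnvironment;

    public class LimitExample {
        public static void main(String[] args) {
            TableEnvironment tEnv = TableEnvironment.create(
                    EnvironmentSettings.newInstance().inBatchMode().build());
            // LIMIT through SQL, since the Table API has no limit() here
            tEnv.executeSql(
                    "SELECT * FROM (VALUES 'Hello', 'CIAO', 'foo', 'bar') LIMIT 2")
                .print();
        }
    }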

Re: map JSON to scala case class & off-heap optimization

2020-07-10 Thread Aljoscha Krettek
Hi Georg, I'm afraid the other suggestions are missing the point a bit. From your other emails it seems you want to use Kafka with JSON records together with the Table API/SQL. For that, take a look at [1] which describes how to define data sources for the Table API. Especially the Kafka and J
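
For orientation, a hedged sketch of what such a definition looks like with Flink 1.11's DDL options; the topic, servers, group id, and columns are hypothetical placeholders, and the json format needs flink-json on the classpath:

    tEnv.executeSql(
            "CREATE TABLE events (\n"
            + "  id BIGINT,\n"
            + "  payload STRING\n"
            + ") WITH (\n"
            + "  'connector' = 'kafka',\n"
            + "  'topic' = 'events',\n"
            + "  'properties.bootstrap.servers' = 'localhost:9092',\n"
            + "  'properties.group.id' = 'test-group',\n"
            + "  'scan.startup.mode' = 'earliest-offset',\n"
            + "  'format' = 'json'\n"
            + ")");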

Re: Task recovery?

2020-07-10 Thread John Smith
Yeah it's fine, but the thing is, since I don't have the history server, the UI wasn't showing any jobs and I didn't have any job id I could use to go and look for the checkpoints. I restarted them, but instead of a checkpoint I went and played back a few days before, just to be sure... All my jo

Re: MalformedClassName for scala case class

2020-07-10 Thread Aljoscha Krettek
Hi, could you please post the stacktrace with the exception and also let us know which Flink version you're using? I have tried the following code and it works on master/flink-1.11/flink-1.10: case class Foo(lang: String, count: Int) def main(args: Array[String]): Unit = { val senv

Re: Table API jobs migration to Flink 1.11

2020-07-10 Thread Flavio Pompermaier
How can you reuse an InputFormat to write a TableSource? I think that at least initially this could be the simplest way to test the migration... then I could try to implement the new TableSource interface. On Fri, Jul 10, 2020 at 3:38 PM godfrey he wrote: > hi Flavio, > Only old planner supports Batc
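
One way to reuse an existing InputFormat, sketched under the assumption that the legacy InputFormatTableSource base class is acceptable as a stop-gap; MyRowInputFormat and the two-column schema are hypothetical stand-ins:

    import org.apache.flink.api.common.io.InputFormat;
    import org.apache.flink.table.api.DataTypes;
    import org.apache.flink.table.api.TableSchema;
    import org.apache.flink.table.sources.InputFormatTableSource;
    import org.apache.flink.table.types.DataType;
    import org.apache.flink.types.Row;

    public class MyTableSource extends InputFormatTableSource<Row> {
        private final TableSchema schema = TableSchema.builder()
                .field("id", DataTypes.INT())
                .field("name", DataTypes.STRING())
                .build();

        @Override
        public InputFormat<Row, ?> getInputFormat() {
            return new MyRowInputFormat(); // the existing InputFormat to reuse
        }

        @Override
        public TableSchema getTableSchema() {
            return schema;
        }

        @Override
        public DataType getProducedDataType() {
            return schema.toRowDataType();
        }
    }

It can then be handed to the table environment, e.g. through the deprecated fromTableSource/registerTableSource methods, where those still exist in 1.11.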

Re: Table API jobs migration to Flink 1.11

2020-07-10 Thread godfrey he
hi Flavio, Only the old planner supports BatchTableEnvironment (which can convert to/from DataSet), while the Blink planner in batch mode only supports TableEnvironment, because the Blink planner converts batch queries to Transformations (corresponding to DataStream) instead of DataSet. One approach is you
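
A hedged sketch of the old-planner fallback mentioned above: BatchTableEnvironment (old planner, flink-table-planner) still bridges to DataSet, so an existing InputFormat keeps working via createInput while the migration is prepared. myInputFormat stands in for the legacy format:

    import org.apache.flink.api.common.io.InputFormat;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.table.api.Table;
    import org.apache.flink.table.api.bridge.java.BatchTableEnvironment;
    import org.apache.flink.types.Row;

    public class OldPlannerExample {
        public static void run(InputFormat<Row, ?> myInputFormat) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
            BatchTableEnvironment tEnv = BatchTableEnvironment.create(env);
            DataSet<Row> input = env.createInput(myInputFormat);
            tEnv.createTemporaryView("people", tEnv.fromDataSet(input));
            Table result = tEnv.sqlQuery("SELECT * FROM people");
            tEnv.toDataSet(result, Row.class).print();
        }
    }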

Re: Task recovery?

2020-07-10 Thread Aljoscha Krettek
On 03.07.20 18:42, John Smith wrote: > If I understand correctly on June 23rd it suspended the jobs? So at that > point they would no longer show in the UI or be restarted? Yes, that is correct, though in the logs it seems the jobs failed terminally on June 22nd: 2020-06-22 23:30:22,130 INFO or

Re: Table API jobs migration to Flink 1.11

2020-07-10 Thread Flavio Pompermaier
Thanks but I still can't understand how to migrate my legacy code. The main problem is that I can't create a BatchTableEnv anymore so I can't call createInput. Is there a way to reuse InputFormats? Should I migrate them to TableSource instead? public static void main(String[] args) throws Excepti

Possible bug clean up savepoint dump into filesystem and low network IO starting from savepoint

2020-07-10 Thread David Magalhães
Hi, yesterday I was creating a savepoint (to S3, around 8 GB of state) using 2 TaskManagers (8 GB) and it failed because one of the task managers filled up its disk (probably it didn't have enough RAM to save the state to S3 directly; I don't know what the disk space was) and reached 100% usage spac

Re: Checkpoint failed because of TimeWindow cannot be cast to VoidNamespace

2020-07-10 Thread Si-li Liu
Sorry, I can't reproduce it with reduce/aggregate/fold/apply, and due to some limitations in my working environment I can't use Flink 1.10 or 1.11. On Sun, Jul 5, 2020 at 6:21 PM, Congxian Qiu wrote: > Hi > > First, could you please check whether this problem is still there if you use Flink 1.10 or > 1.11? > > It seems strange,

Re: Table API jobs migration to Flink 1.11

2020-07-10 Thread Dawid Wysakowicz
You should be good with using the TableEnvironment. The StreamTableEnvironment is needed only if you want to convert to DataStream. We do not support converting batch Table programs to DataStream yet. The following code should work: EnvironmentSettings settings = EnvironmentSettings.newInstance().i
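
A hedged completion of the truncated snippet, on the assumption that it continues with batch mode on the Blink planner:

    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.TableEnvironment;

    EnvironmentSettings settings = EnvironmentSettings.newInstance()
            .useBlinkPlanner()
            .inBatchMode()
            .build();
    TableEnvironment tEnv = TableEnvironment.create(settings);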

Table API jobs migration to Flink 1.11

2020-07-10 Thread Flavio Pompermaier
Hi to all, I was trying to update my legacy code to Flink 1.11. Before, I was using a BatchTableEnv, and now I've tried to use the following: EnvironmentSettings settings = EnvironmentSettings.newInstance().inBatchMode().build(); Unfortunately in the StreamTableEnvironmentImpl code there's: if (!
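
A sketch of the failing combination the message points at; the guard itself is truncated above, but StreamTableEnvironmentImpl rejects batch-mode settings, so pairing StreamTableEnvironment with inBatchMode() throws a TableException:

    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    EnvironmentSettings settings = EnvironmentSettings.newInstance()
            .inBatchMode()
            .build();
    // Throws at creation time; batch programs go through
    // TableEnvironment.create(settings) instead (see Dawid's reply above).
    StreamTableEnvironment tEnv = StreamTableEnvironment.create(env, settings);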

Does Flink support TFRecordFileOutputFormat?

2020-07-10 Thread 殿李
Hi, Does Flink support a TFRecordFileOutputFormat? I can't find the relevant information in the documentation. As far as I know, Spark supports it. Best regards, Peidian Li
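
Flink does not appear to ship a TFRecord output format, so one route is a custom FileOutputFormat. A hedged, unvetted sketch of the TFRecord framing (length, masked CRC of the length, payload, masked CRC of the payload); it uses java.util.zip.CRC32C, which requires Java 9+:

    import org.apache.flink.api.common.io.FileOutputFormat;

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.ByteOrder;
    import java.util.zip.CRC32C;

    public class TFRecordOutputFormat extends FileOutputFormat<byte[]> {

        // TFRecord's masked CRC32C: ((crc >> 15) | (crc << 17)) + 0xa282ead8
        private static int maskedCrc(byte[] data) {
            CRC32C crc = new CRC32C();
            crc.update(data, 0, data.length);
            int c = (int) crc.getValue();
            return ((c >>> 15) | (c << 17)) + 0xa282ead8;
        }

        private static byte[] leLong(long v) {
            return ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN)
                    .putLong(v).array();
        }

        private static byte[] leInt(int v) {
            return ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN)
                    .putInt(v).array();
        }

        @Override
        public void writeRecord(byte[] record) throws IOException {
            byte[] len = leLong(record.length); // uint64 length, little-endian
            stream.write(len);
            stream.write(leInt(maskedCrc(len)));
            stream.write(record);
            stream.write(leInt(maskedCrc(record)));
        }
    }

A DataSet<byte[]> of serialized tf.Example records could then be written with dataSet.output(format) after calling setOutputFilePath on the format instance.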