Re: some basic questions

2020-01-18 Thread godfrey he
hi kant, a "FULL OUTER JOIN" job will generate retract messages, so toRetractStream is required to guarantee correctness. I think it's better to use StreamExecutionEnvironment.execute, because you have converted the Table to a DataStream. kant kodali wrote on Sun, Jan 19, 2020 at 11:59 AM: > Hi Godfrey, > > I w...
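A minimal sketch of the conversion godfrey describes, assuming `tEnv` is the StreamTableEnvironment, `env` the StreamExecutionEnvironment, and `joined` the Table produced by the FULL OUTER JOIN (all names illustrative):

    // each element is a Tuple2<Boolean, Row>: f0 == true is an insert (accumulate)
    // message, f0 == false retracts a previously emitted row
    DataStream<Tuple2<Boolean, Row>> retractStream =
            tEnv.toRetractStream(joined, Row.class);
    retractStream.print();

    // execute on the StreamExecutionEnvironment, since the Table
    // was converted to a DataStream
    env.execute("simple job");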

Re: some basic questions

2020-01-18 Thread kant kodali
Hi Godfrey, I was just clicking the run button in my IDE and it doesn't really show me errors, so I used the command-line flink run and that shows me what the error is. It tells me I need to change to toRetractStream(), and both StreamExecutionEnvironment and StreamTableEnvironment .execute seem to work...

Re: Location of flink-s3-fs-hadoop plugin (Flink 1.9.0 on EMR)

2020-01-18 Thread Yang Wang
I think this exception is not because the Hadoop version isn't high enough. It seems that the "s3" URI scheme could not be recognized by `S3FileSystemFactory`, so it falls back to the `HadoopFsFactory`. Could you share the debug-level jobmanager/taskmanager logs so that we could confirm whether the...
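For reference, Flink 1.9's plugin mechanism expects the S3 filesystem jar in its own subdirectory under plugins/ rather than in lib/; a sketch of the layout (version and paths illustrative for a 1.9.0 install):

    flink-1.9.0/
        plugins/
            s3-fs-hadoop/
                flink-s3-fs-hadoop-1.9.0.jar   (copied from opt/)

With the jar in place, "s3://" URIs should be picked up by the S3 filesystem factory instead of falling back to `HadoopFsFactory`.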

Re: some basic questions

2020-01-18 Thread kant kodali
Hi Godfrey, Thanks a lot for your response. I just tried it with env.execute("simple job") but I still get the same error message. Kant On Sat, Jan 18, 2020 at 6:26 PM godfrey he wrote: > hi kant, > > > 1) The Documentation says full outer join is supported however the below > code just exits...

Re: some basic questions

2020-01-18 Thread godfrey he
hi kant, > 1) The Documentation says full outer join is supported however the below code just exits with value 1. No error message. if you have converted the Table to a DataStream, please execute it with StreamExecutionEnvironment (call env.execute("simple job")) > 2) If I am using a blink planner sh...

Does Flink 1.9 support create or replace views syntax in raw sql?

2020-01-18 Thread kant kodali
Hi All, does Flink 1.9 support CREATE OR REPLACE VIEW syntax in raw SQL, like Spark Streaming does? Thanks!

Re: ValueState with pure Java class keeping lists/map vs ListState/MapState, which one is a recommended way?

2020-01-18 Thread David Anderson
[Note that this question is better suited for the user mailing list than dev.] In general, using ListState and MapState is recommended rather than ValueState<List<T>> or ValueState<Map<K, V>>. Some of the state backends are able to optimize common access patterns for ListState and MapState in ways that are not...
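A minimal sketch of the recommended pattern, assuming a keyed stream of Tuple2<String, String> pairs where f0 is the key and f1 a sub-key to be counted (names illustrative):

    import org.apache.flink.api.common.state.MapState;
    import org.apache.flink.api.common.state.MapStateDescriptor;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
    import org.apache.flink.util.Collector;

    // counts occurrences of a sub-key (f1) within each key (f0)
    public class SubKeyCounter extends KeyedProcessFunction<String, Tuple2<String, String>, Long> {

        // RocksDB stores each MapState entry as its own key-value pair, so a single
        // entry can be read or updated without touching the rest of the map;
        // a ValueState<Map<String, Long>> is (de)serialized in full on every access
        private transient MapState<String, Long> counts;

        @Override
        public void open(Configuration parameters) {
            counts = getRuntimeContext().getMapState(
                    new MapStateDescriptor<>("counts", String.class, Long.class));
        }

        @Override
        public void processElement(Tuple2<String, String> in, Context ctx, Collector<Long> out)
                throws Exception {
            Long current = counts.get(in.f1);
            long updated = (current == null ? 0L : current) + 1;
            counts.put(in.f1, updated);
            out.collect(updated);
        }
    }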

Re: Flink ParquetAvroWriters Sink

2020-01-18 Thread Arvid Heise
Hi Anuj, I think that there may be a fundamental misunderstanding about the role of a schema registry in Kafka, so let me first clarify that. In each Avro/Parquet file, all records have the same schema. The schema is stored within the file, such that we can always retrieve the writer schema for the...
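For context, the sink under discussion can be sketched as below: every file this sink rolls is written with the one fixed Avro schema passed in, which Parquet embeds in the file footer (path and schema are placeholders):

    import org.apache.avro.Schema;
    import org.apache.avro.generic.GenericRecord;
    import org.apache.flink.core.fs.Path;
    import org.apache.flink.formats.parquet.avro.ParquetAvroWriters;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;

    public class ParquetSinkSketch {
        // attaches a bulk-encoded Parquet sink; all records written by it
        // share the single `schema` given here
        public static void attach(DataStream<GenericRecord> records, Schema schema) {
            StreamingFileSink<GenericRecord> sink = StreamingFileSink
                    .forBulkFormat(new Path("s3://bucket/events"),
                                   ParquetAvroWriters.forGenericRecord(schema))
                    .build();
            records.addSink(sink);
        }
    }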

Re: Flink ParquetAvroWriters Sink

2020-01-18 Thread aj
Thanks, Arvid. I do not fully understand the above approach, so currently I am thinking to go with the envelope approach that you suggested. One more question: I do not want to keep the schema in my consumer project, even if it's a single envelope schema. I want it to be fetched from the schema registry...
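A hedged sketch of that idea: fetch the latest envelope schema from the Schema Registry at job startup instead of bundling an .avsc file, then use it as the Avro reader schema. The subject name follows the Confluent "<topic>-value" convention; topic and URL are placeholders, and this assumes the Confluent registry client and flink-avro-confluent-registry are on the classpath:

    import io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient;
    import org.apache.avro.Schema;
    import org.apache.avro.generic.GenericRecord;
    import org.apache.flink.formats.avro.registry.confluent.ConfluentRegistryAvroDeserializationSchema;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

    import java.util.Properties;

    public class RegistryBackedSource {
        public static FlinkKafkaConsumer<GenericRecord> create(
                String topic, String registryUrl, Properties kafkaProps) throws Exception {
            // fetch the current schema for the topic's value subject at startup,
            // so the consumer project does not need to bundle the schema
            CachedSchemaRegistryClient client = new CachedSchemaRegistryClient(registryUrl, 100);
            String latest = client.getLatestSchemaMetadata(topic + "-value").getSchema();
            Schema readerSchema = new Schema.Parser().parse(latest);

            // per-record writer schemas are still resolved against the registry by id
            return new FlinkKafkaConsumer<>(
                    topic,
                    ConfluentRegistryAvroDeserializationSchema.forGeneric(readerSchema, registryUrl),
                    kafkaProps);
        }
    }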

Re: Flink ParquetAvroWriters Sink

2020-01-18 Thread Arvid Heise
(Re-added user mailing list) Hi Anuj, since I'd still recommend going with distinct sources/sinks, let me try to solve your issues in this mail. If that doesn't work out, I'd address your concerns about the envelope format later. In Flink, you can have several subtopologies in the same application...
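A minimal sketch of that point, with fromElements standing in for the real Kafka sources (names illustrative):

    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class MultiTopologyJob {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            // two disjoint subtopologies; in practice each would read its own
            // Kafka topic and write to its own sink
            DataStream<String> a = env.fromElements("a1", "a2");
            a.print("pipeline-a");

            DataStream<String> b = env.fromElements("b1", "b2");
            b.print("pipeline-b");

            // both subtopologies are submitted together as one job graph
            env.execute("multi-topology job");
        }
    }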

Re: Does Flink Metrics provide information about each record inserted into the database

2020-01-18 Thread Flavio Pompermaier
What about using an accumulator? Does it work for your needs? On Sat, 18 Jan 2020, 10:03 Soheil Pourbafrani wrote: > Hi, > > I'm using Flink to insert some processed records into the database. I need > to have some aggregated information about records inserted into the > database so far. For...
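A sketch of the accumulator idea, assuming a RichSinkFunction performs the inserts (names illustrative; the actual JDBC code is elided):

    import org.apache.flink.api.common.accumulators.LongCounter;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;
    import org.apache.flink.types.Row;

    public class CountingDbSink extends RichSinkFunction<Row> {
        private final LongCounter inserted = new LongCounter();

        @Override
        public void open(Configuration parameters) {
            getRuntimeContext().addAccumulator("records-inserted", inserted);
        }

        @Override
        public void invoke(Row value, Context context) {
            // ... perform the actual database INSERT here ...
            inserted.add(1L);
        }
    }

    // after the job finishes, read the total from the JobExecutionResult:
    // long total = env.execute("insert job").getAccumulatorResult("records-inserted");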

some basic questions

2020-01-18 Thread kant kodali
Hi All, 1) The Documentation says a full outer join is supported, however the below code just exits with value 1 and no error message. import org.apache.flink.api.common.serialization.SimpleStringSchema; import org.apache.flink.streaming.api.datastream.DataStream; import org.apache.flink.streaming.api...
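A self-contained sketch of the pattern in question, with the fix from the replies above (toRetractStream plus env.execute); table and field names are illustrative, and the Flink 1.9 blink-planner APIs are assumed:

    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.Table;
    import org.apache.flink.table.api.java.StreamTableEnvironment;
    import org.apache.flink.types.Row;

    public class FullOuterJoinSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            EnvironmentSettings settings = EnvironmentSettings.newInstance()
                    .useBlinkPlanner().inStreamingMode().build();
            StreamTableEnvironment tEnv = StreamTableEnvironment.create(env, settings);

            DataStream<Tuple2<Integer, String>> left =
                    env.fromElements(Tuple2.of(1, "a"), Tuple2.of(2, "b"));
            DataStream<Tuple2<Integer, String>> right =
                    env.fromElements(Tuple2.of(2, "x"), Tuple2.of(3, "y"));
            tEnv.registerDataStream("left_t", left, "l_id, l_val");
            tEnv.registerDataStream("right_t", right, "r_id, r_val");

            Table joined = tEnv.sqlQuery(
                    "SELECT * FROM left_t FULL OUTER JOIN right_t ON l_id = r_id");

            // full outer joins produce updates, hence toRetractStream
            // (toAppendStream would fail)
            tEnv.toRetractStream(joined, Row.class).print();

            // without this call the program only builds the graph and exits
            env.execute("simple job");
        }
    }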

Does Flink Metrics provide information about each record inserted into the database

2020-01-18 Thread Soheil Pourbafrani
Hi, I'm using Flink to insert some processed records into the database. I need to have some aggregated information about the records inserted into the database so far. For example, for a specific column value, I need to know how many records have been inserted. Can I use the Flink Metrics to provide t...
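One possible sketch using Flink's metric system: a Counter registered in a RichSinkFunction, with one child counter per distinct column value (input type and names illustrative). Note that such counters are reported per subtask and reset on job restart:

    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.metrics.Counter;
    import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

    import java.util.HashMap;
    import java.util.Map;

    // f0 carries the column value of interest, f1 the payload to insert
    public class MetricReportingSink extends RichSinkFunction<Tuple2<String, String>> {
        private transient Counter totalInserted;
        private transient Map<String, Counter> perValue;

        @Override
        public void open(Configuration parameters) {
            totalInserted = getRuntimeContext().getMetricGroup().counter("recordsInserted");
            perValue = new HashMap<>();
        }

        @Override
        public void invoke(Tuple2<String, String> record, Context context) {
            // ... perform the actual database INSERT here ...
            totalInserted.inc();
            // one child counter per distinct column value, exposed under a
            // user-scoped metric group "columnValue" -> <value>
            perValue.computeIfAbsent(record.f0, v ->
                    getRuntimeContext().getMetricGroup()
                            .addGroup("columnValue", v).counter("inserted"))
                    .inc();
        }
    }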