Hi Fanbin,
TableEnvironment is the unified entry point for batch and streaming in the Blink planner.
Use: TableEnvironment.create(fsSettings)
We are continuing to improve TableEnvironment to support more features.
[1]
https://ci.apache.org/projects/flink/flink-docs-master/dev/table/common.html#create-a-tableenvironment
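A minimal sketch of that entry point (Blink planner, batch mode here; inStreamingMode() is the streaming counterpart):

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class UnifiedTableEnvExample {
    public static void main(String[] args) {
        // Blink planner settings; use inStreamingMode() instead for streaming jobs.
        EnvironmentSettings fsSettings = EnvironmentSettings.newInstance()
                .useBlinkPlanner()
                .inBatchMode()
                .build();

        // One entry point for both modes, as described above.
        TableEnvironment tEnv = TableEnvironment.create(fsSettings);
    }
}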
Be
Hi,
My app creates the source from a JDBC InputFormat, runs some SQL, and
prints the output. But the source terminates itself after the query is done. Is
there any way to keep the source running?
sample code:
val env = StreamExecutionEnvironment.getExecutionEnvironment
val settings = EnvironmentSettings.
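(Not the original code, just a minimal sketch of such a setup for context; the driver, URL, query and row schema below are placeholders.)

import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.io.jdbc.JDBCInputFormat;
import org.apache.flink.api.java.typeutils.RowTypeInfo;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.types.Row;

public class JdbcSourceSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // JDBCInputFormat is a bounded input: once the query result has been
        // fully read, the source finishes, which is why the job terminates.
        JDBCInputFormat jdbcInput = JDBCInputFormat.buildJDBCInputFormat()
                .setDrivername("org.postgresql.Driver")            // placeholder
                .setDBUrl("jdbc:postgresql://localhost:5432/db")   // placeholder
                .setQuery("SELECT id, name FROM my_table")         // placeholder
                .setRowTypeInfo(new RowTypeInfo(Types.INT, Types.STRING))
                .finish();

        DataStream<Row> rows = env.createInput(jdbcInput);
        rows.print();
        env.execute("jdbc-source-sketch");
    }
}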
Hi,
Currently, "StreamTableEnvironment can not run in batch mode for now,
please use TableEnvironment."
Are there any plans on the batch/streaming unification roadmap to let
StreamTableEnvironment be used for both streamingMode and batchMode?
Thanks,
Fanbin
Hi,
I use a FlinkKinesisConsumer in a Flink job to deserialize Kinesis events.
Flink: 1.9.2
Avro: 1.9.2
The serDe class is like:
public class ManagedSchemaKinesisPayloadSerDe
implements KinesisSerializationSchema,
KinesisDeserializationSchema {
private static final String REGISTRY_END
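(The class above is truncated; as a point of reference only, here is a minimal, hypothetical skeleton of those two interfaces working on plain Strings, with all of the Avro / schema-registry handling left out.)

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.connectors.kinesis.serialization.KinesisDeserializationSchema;
import org.apache.flink.streaming.connectors.kinesis.serialization.KinesisSerializationSchema;

public class StringKinesisSerDe
        implements KinesisSerializationSchema<String>, KinesisDeserializationSchema<String> {

    @Override
    public String deserialize(byte[] recordValue, String partitionKey, String seqNum,
                              long approxArrivalTimestamp, String stream, String shardId)
            throws IOException {
        // A real implementation would resolve the writer schema here.
        return new String(recordValue, StandardCharsets.UTF_8);
    }

    @Override
    public TypeInformation<String> getProducedType() {
        return Types.STRING;
    }

    @Override
    public ByteBuffer serialize(String element) {
        return ByteBuffer.wrap(element.getBytes(StandardCharsets.UTF_8));
    }

    @Override
    public String getTargetStream(String element) {
        // null falls back to the producer's default stream.
        return null;
    }
}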
I think so too. But I was wondering whether this was the consumer or the actual Kafka
broker. This error was displayed on the Flink task node where the task was
running; the brokers looked fine at the time.
I have about a dozen topics, all single-partition except one which
has 18. So I really doubt the
Hi,
any thoughts about this one?
Regards,
Krzysztof
Hi all,
I was trying to build Flink on my local machine and these two unit tests
are failing.
[ERROR] Errors:
[ERROR]   FileUtilsTest.testCompressionOnRelativePath:261->verifyDirectoryCompression:440 » NoSuchFile
[ERROR]   FileUtilsTest.testDeleteDirectoryConcurrently » FileSystem /var/folders/x9/
Cool; @aniket and @dagang,
As someone who hasn't dug into the code of either (will go through your
recording) -- might you share any thoughts on differences between:
https://github.com/googlecloudplatform/flink-on-k8s-operator
and
https://github.com/lyft/flinkk8soperator
??
Also, for those in Ba
Hi Arvid,
Thanks for your response. I think I did not word my question properly.
I wanted to confirm that if the data is distributed to more than one
partition then the ordering cannot be maintained (which is documented).
According to your response, I understand that if I set the parallelism to the number
o
Thanks so much Timo, got it working now. All down to my lack of Java skill.
Many thanks,
Chris Stevens
Head of Research & Development
+44 7565 034 595
On Wed, 19 Feb 2020 at 15:12, Timo Walther wrote:
> Hi Chris,
>
> your observation is right. By `new Sensor() {}` instead of just `new
> Sensor
Hey Timo,
Thanks for the assignment link! Looks like most of my issues can be solved
by getting better acquainted with Java file APIs and not in Flink-land.
Best,
Austin
On Wed, Feb 19, 2020 at 6:48 AM Timo Walther wrote:
> Hi Austin,
>
> the StreamingFileSink allows bucketing the output data
Yes, I create it the way you mentioned.
From: Yun Gao [mailto:yungao...@aliyun.com]
Sent: Tuesday, 18 February 2020 10:12
To: Gobbi, Jacopo-XT; user
Subject: [External] Re: Flink's Either type information
Hi Jacopo,
Could you also provide how the KeyedBroadcastProcessFunction is
Thanks, Timo. I have not used or explored the Table API until now. I have used
the DataSet and DataStream APIs only.
I will read about the Table API.
On Wed, Feb 19, 2020 at 4:33 PM Timo Walther wrote:
> Hi Anuj,
>
> another option would be to use the new Hive connectors. Have you looked
> into those? Th
Thanks, Rafi. I will try this, but yes, if partitioning is not possible
then I will also have to look for some other solution.
On Wed, Feb 19, 2020 at 3:44 PM Rafi Aroch wrote:
> Hi Anuj,
>
> It's been a while since I wrote this (Flink 1.5.2). There could be a
> better/newer way, but this is how I read
Thanks for the update! Since we are still in the planning stage, I will try to
find another way to achieve what we are trying to do in the meantime, and I'll
keep an eye on that Jira. Two workarounds I thought of are to either match
the parallelism of the source to the partition count, or, since
Hi Chris,
your observation is right. By `new Sensor() {}` instead of just `new
Sensor()` you are creating an anonymous non-static class that references
the outer method and class.
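(A minimal illustration with a stand-in Sensor POJO, not the actual class:)

public class SensorExample {

    // Stand-in POJO: public no-arg constructor and public fields.
    public static class Sensor {
        public String id;
        public double value;
        public Sensor() {}
    }

    void build() {
        // Anonymous subclass: its runtime class is SensorExample$1, an inner
        // class tied to the enclosing SensorExample, so Flink's type extractor
        // no longer sees a plain Sensor POJO and falls back to generic serialization.
        Sensor anonymous = new Sensor() {};

        // Plain instantiation: the runtime class is exactly Sensor,
        // which can be analyzed as a POJO.
        Sensor pojo = new Sensor();

        System.out.println(anonymous.getClass().getName()); // SensorExample$1
        System.out.println(pojo.getClass().getName());      // SensorExample$Sensor
    }

    public static void main(String[] args) {
        new SensorExample().build();
    }
}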
If you check your logs, there might also be a reason why your POJO is
used as a generic type. I assume because y
Hi guys,
Thanks for your answers and sorry for the late reply.
My use case is:
I receive some events on one stream; each event can contain:
- 1 field category
- 1 field subcategory
- 1 field category AND 1 field subcategory
Events are matched against rules which can contain:
Hi,
I now realize that you are using the batch API, and I gave you an answer
for the streaming API :(
The mapPartition function also has a close() method, which you can use to
implement the same pattern.
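(A minimal sketch of that pattern with a RichMapPartitionFunction from the DataSet API; the resource being opened and closed here is just a placeholder.)

import org.apache.flink.api.common.functions.RichMapPartitionFunction;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.util.Collector;

public class ResourceMapPartition extends RichMapPartitionFunction<String, String> {

    private transient AutoCloseable resource;

    @Override
    public void open(Configuration parameters) throws Exception {
        // Placeholder for whatever needs to be set up per task.
        resource = () -> { };
    }

    @Override
    public void mapPartition(Iterable<String> values, Collector<String> out) {
        for (String value : values) {
            out.collect(value.toUpperCase());
        }
    }

    @Override
    public void close() throws Exception {
        // Called once the task is done with its partition.
        if (resource != null) {
            resource.close();
        }
    }
}

It is attached as usual with dataSet.mapPartition(new ResourceMapPartition()).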
With a JVM Shutdown hook, you are assuming that the TaskManager is shutting
down at the end of
Thanks again Timo, I hope I replied correctly this time.
As per my previous message, the Sensor class is a very simple POJO type (I
think).
When the serialization trace talks about PGSql stuff, it makes me think that
something from my operator is being included in serialization, not just the
Sensor
I have to correct myself. DataStream respects
ExecutionConfig.enableObjectReuse, which takes effect in the form of creating
different Outputs in the OperatorChain. This is also in line with the
different behaviour you are observing.
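(For reference, enabling the flag is just:)

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ObjectReuseSketch {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // With reuse enabled, chained operators may hand the same record instance
        // downstream instead of a copy, so user functions must not cache or
        // mutate inputs after emitting them.
        env.getConfig().enableObjectReuse();
    }
}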
Concerning your initial question Theo, you could do the following i
Hi Hemant,
Flink passes your configurations to the Kafka consumer, so you could check
if you can subscribe to only one partition there.
However, I would discourage that approach. I don't see the benefit compared to just
subscribing to the topic entirely and having dedicated processing for the
different devi
Hi Chris,
[forwarding the private discussion to the mailing list again]
first of all, are you sure that your Sensor class is either a top-level
class or a static inner class? It seems there is way more stuff
in it (maybe included by accident transitively?), such as:
org.apache.loggin
Hi Andreas,
you are right, currently the Row type only supports accessing fields by
index. Usually, we recommend working fully in the Table API. There you can
access structured type fields by name (`SELECT row.field.field` or
`'row.get("field").get("field")`) and additional utilities such as
`fla
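(A small, made-up illustration of the index-based access outside the Table API:)

import org.apache.flink.types.Row;

public class RowIndexAccess {
    public static void main(String[] args) {
        // Hypothetical nested row: field 0 is itself a Row, field 1 a timestamp.
        Row inner = Row.of("sensor-1", 42.0);
        Row outer = Row.of(inner, 1582113600000L);

        // Outside the Table API, nested fields can only be reached by position.
        Object id = ((Row) outer.getField(0)).getField(0);
        System.out.println(id); // sensor-1
    }
}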
Hi,
would Apache Avro be an option for you? It is currently still
the best-supported format when it comes to schema upgrades, as far as I
know. Maybe Gordon (in CC) can give you some additional hints.
Regards,
Timo
On 18.02.20 10:38, ApoorvK wrote:
I have some case class which have
Hi Chris,
it seems there are fields serialized into state that actually don't
belong there. You should aim to treat Sensor as a POJO instead of a Kryo
generic serialized black-box type.
Furthermore, it seems that fields such as
"org.apache.logging.log4j.core.layout.AbstractCsvLayout" should no
Hi Austin,
the StreamingFileSink allows bucketing the output data.
This should help for your use case:
https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/streamfile_sink.html#bucket-assignment
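(A minimal sketch of such a bucketed row-format sink; the output path and the hourly bucket format are placeholders:)

import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;
import org.apache.flink.streaming.api.functions.sink.filesystem.bucketassigners.DateTimeBucketAssigner;

public class BucketedSinkSketch {

    // Writes each record as a line and puts files into hourly buckets
    // derived from processing time; attach it with stream.addSink(buildSink()).
    static StreamingFileSink<String> buildSink() {
        return StreamingFileSink
                .forRowFormat(new Path("/tmp/output"), new SimpleStringEncoder<String>("UTF-8"))
                .withBucketAssigner(new DateTimeBucketAssigner<>("yyyy-MM-dd--HH"))
                .build();
    }
}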
Regards,
Timo
On 19.02.20 01:00, Austin Cawley-Edwards wrote:
Following up on th
Hi Theo,
there are a lot of performance improvements that Flink could do, but they
would further complicate the interfaces and API. They would require deep
knowledge from users about when it is safe for the runtime to reuse objects and
when not.
The Table/SQL API of Flink uses a lot of these optimizat
Hi Anuj,
another option would be to use the new Hive connectors. Have you looked
into those? They might work on SQL-internal data types, which is why you
would need to use the Table API then.
Maybe Bowen in CC can help you here.
Regards,
Timo
On 19.02.20 11:14, Rafi Aroch wrote:
Hi Anuj,
I
Hi Anuj,
It's been a while since I wrote this (Flink 1.5.2). There could be a better/newer
way, but this is how I read & write Parquet with hadoop-compatibility:
// imports
> import org.apache.avro.generic.GenericRecord;
> import org.apache.flink.api.java.hadoop.mapreduce.HadoopInputFormat;
>
impo
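(The snippet above is truncated; for context, here is a minimal sketch of that hadoop-compatibility pattern, not the original code — the input path is a placeholder and all Avro schema handling is omitted.)

import org.apache.avro.generic.GenericRecord;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.hadoop.mapreduce.HadoopInputFormat;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.parquet.avro.AvroParquetInputFormat;

public class ParquetReadSketch {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        Job job = Job.getInstance();
        FileInputFormat.addInputPath(job, new Path("/tmp/input-parquet")); // placeholder

        HadoopInputFormat<Void, GenericRecord> inputFormat = new HadoopInputFormat<>(
                new AvroParquetInputFormat<GenericRecord>(), Void.class, GenericRecord.class, job);

        // Records arrive as (Void, GenericRecord) tuples; the value holds the row.
        DataSet<Tuple2<Void, GenericRecord>> records = env.createInput(inputFormat);
        records.first(10).print();
    }
}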
Thanks a lot for reporting this issue.
Which version of Flink are you using?
I checked the code of the Kinesis ShardConsumer (the current version
though), and I found that exceptions from the ShardConsumer are properly
forwarded to the lower level runtime.
Did you check the *.out files of the Tas
I have the same experience as Eleanore.
When enabling object reuse, I saw a significant performance improvement, and
in my profiling session I saw that a lot of serialization/deserialization
was no longer performed.
That’s why my question originally aimed a bit further: It’s good th