Re: Python UDF from Java

2020-04-30 Thread jincheng sun
Thanks Flavio and Thanks Marta, That's a good question as many user want to know that! CC to user-zh mailing list :) Best, Jincheng - Twitter: https://twitter.com/sunjincheng121 - Flavio Pompermaier 于2020年5月1日周五 上午7:04写道: > Yes, that's awesome! I think this would be

Re: Python UDF from Java

2020-04-30 Thread Flavio Pompermaier
Yes, that's awesome! I think this would be a really attractive feature to promote the usage of Flink. Thanks Marta, Flavio On Fri, May 1, 2020 at 12:26 AM Marta Paes Moreira wrote: > Hi, Flavio. > > Extending the scope of Python UDFs is described in FLIP-106 [1, 2] and is > planned for the upco

Re: Python UDF from Java

2020-04-30 Thread Marta Paes Moreira
Hi, Flavio. Extending the scope of Python UDFs is described in FLIP-106 [1, 2] and is planned for the upcoming 1.11 release, according to Piotr's last update. Hope this addresses your question! Marta [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-106%3A+Support+Python+UDF+in+SQL+Fun

Python UDF from Java

2020-04-30 Thread Flavio Pompermaier
Hi to all, is it possible to run a Python UDF from a Java job (using Table API or SQL)? Is there any reference? Best, Flavio

Re: Does it make sense to use init containers for job upgrades in kubernetes

2020-04-30 Thread bbaj...@gmail.com
Thnx all: 1) for now, we will try with inhouse kubernetes, and see how it goes. 2) Till, cheers, I'll give a stab, though likely I'll end up with an operator or some other workflow tool ( I've gotten multiple weird looks when I mentioned init container approach at work; I was mostly curios at this

Re: Wait for cancellation event with CEP

2020-04-30 Thread Till Rohrmann
Hi Maxim, I think your problem should be solvable with the CEP library: public static void main(String[] args) throws Exception { // set up the streaming execution environment final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(); DataStrea

Re: Does it make sense to use init containers for job upgrades in kubernetes

2020-04-30 Thread Till Rohrmann
Hi Barisa, from what you've described I believe it could work. But I never tried it out. Maybe you could report back once you tried it out. I believe it would be interesting to hear your experience with this approach. One thing to note is that the approach hinges on the fact that the older JobMan

Re: join in sql without time interval

2020-04-30 Thread Jark Wu
Yes. Flink Table&SQL uses something like that but more lower API called `TwoInputStreamOperator`, you can see: org.apache.flink.table.runtime.operators.join.stream.StreamingJoinOperator And state ttl in TableConfig can take effect on such join query. Best, Jark On Thu, 30 Apr 2020 at 22:35, lec

Re: join in sql without time interval

2020-04-30 Thread lec ssmi
Thanks, but is the bottom layer of the table API really implemented like this? Konstantin Knauf 于 2020年4月30日周四 22:02写道: > Hi Lec Ssmi, > > yes, Dastream#connect on two streams both keyed on the productId with a > KeyedCoProcessFunction is the way to go. > > Cheers, > > Konstantin > > On Thu, Apr

Re: Using logicalType in the Avro table format

2020-04-30 Thread Arvid Heise
Okay bummer, but not completely unexpected. The conversions should be automatically compiled into SpecificRecords. I'm not sure how the Table API is doing it internally; I just saw SpecificRecord in your stacktrace and figured to try it out. On Thu, Apr 30, 2020 at 3:35 PM Gyula Fóra wrote: > H

Re: join in sql without time interval

2020-04-30 Thread Konstantin Knauf
Hi Lec Ssmi, yes, Dastream#connect on two streams both keyed on the productId with a KeyedCoProcessFunction is the way to go. Cheers, Konstantin On Thu, Apr 30, 2020 at 11:10 AM lec ssmi wrote: > Maybe, the connect method? > > lec ssmi 于2020年4月30日周四 下午3:59写道: > >> Hi: >> As the following

Re: Does it make sense to use init containers for job upgrades in kubernetes

2020-04-30 Thread Alexander Fedulov
Hi Barisa, it seems that there is no immediate answer to your concrete question here, so I wanted to ask you back a more general question: did you consider using the Community Edition of Ververica Platform for your purposes [1]

Re: Using logicalType in the Avro table format

2020-04-30 Thread Gyula Fóra
Hi Arvid! I tried it with Avro 1.9.2, and it lead to the same error. Seems like Avro cannot find the conversion class between LocalDateTime and timestamp-millis. Not sure how exactly this works, maybe we need to set the conversions ourselves? Thanks! Gyula On Thu, Apr 30, 2020 at 12:53 PM Arvid

Re: Using Stateful Functions within a Flink pipeline

2020-04-30 Thread Annemarie Burger
Hi Igal, Thanks for your responses. Regarding "having a pre-processing job that does whatever transformations necessary with the DataStream and outputs to Kafka / Kinesis, and then having a separate StateFun deployment that consumes from that transformed Kafka / Kinesis topic." I was wondering how

Re: multiple joins in one job

2020-04-30 Thread Benchao Li
Hi lec, AFAIK, time attribute will be preserved after time interval join. Could you share your DDL and SQL queries with us? lec ssmi 于2020年4月30日周四 下午5:48写道: > Hi: >I need to join multiple stream tables using time interval join. The > problem is that the time attribute will disappear aft

Re: Using logicalType in the Avro table format

2020-04-30 Thread Arvid Heise
Hi Gyula, it may still be worth to try to upgrade to Avro 1.9.2 (can never hurt) and see if this solves your particular problem. The code path in GenericDatumWriter is taking the conversion path, so it might just work. Of course that depends on the schema being correctly translated to a specific r

Wait for cancellation event with CEP

2020-04-30 Thread Maxim Parkachov
Hi everyone, I need to implement following functionality. There is a kafka topic where "forward" events are coming and in the same topic there are "cancel" events. For each "forward" event I need to wait 1 minute for possible "cancel" event. I can uniquely match both events. If "cancel" event come

Re: Configuring taskmanager.memory.task.off-heap.size in Flink 1.10

2020-04-30 Thread Jiahui Jiang
I see I see. Thank you so much! From: Xintong Song Sent: Wednesday, April 29, 2020 11:22 PM To: Jiahui Jiang Cc: user@flink.apache.org Subject: Re: Configuring taskmanager.memory.task.off-heap.size in Flink 1.10 That's pretty much it. I'm not very familiar with

Re: "Fill in" notification messages based on event time watermark

2020-04-30 Thread Manas Kale
Hi Timo and Piotrek, Thank you for the suggestions. I have been trying to set up unit tests at the operator granularity, and the blog post's testHarness examples certainly help a lot in this regard. I understood my problem - an upstream session window operator can only report the end of the sessio

multiple joins in one job

2020-04-30 Thread lec ssmi
Hi: I need to join multiple stream tables using time interval join. The problem is that the time attribute will disappear after the jon , and pure sql cannot declare the time attribute field again . So, to make is success, I need to insert the last result of join to kafka ,and consume it

Re: join in sql without time interval

2020-04-30 Thread lec ssmi
Maybe, the connect method? lec ssmi 于2020年4月30日周四 下午3:59写道: > Hi: > As the following sql: > >SELECT * FROM Orders INNER JOIN Product ON Orders.productId = > Product.id > > If we use datastream API instead of sql, how should it be achieved? > Because the APIs in DataStream only have W

Re: doing demultiplexing using Apache flink

2020-04-30 Thread Alexander Fedulov
This too, should be possible. Flink uses `StreamingFileSink` to transfer data to S3 [1 ]. You can pass it your custom bucket assigner [2

Re: Using logicalType in the Avro table format

2020-04-30 Thread Gyula Fóra
Hi! @Arvid: We are using Avro 1.8 I believe but this problem seems to come from the flink side as Dawid mentioned. @Dawid: Sounds like a reasonable explanation, here are the actual queries to reproduce within the SQL client/table api: CREATE TABLE source_table ( int_field INT, ti

Re: Savepoint memory overhead

2020-04-30 Thread Lasse Nedergaard
Hi Thanks for the reply. The link you provide make us thinking of some old rocksdb cfg. We was still using and it could cause our container killing problems so I will do a test without specific rocksdb cfg. But we also see RocksDbExceptions “cannot allocate memory” while appending to a file.

Re: Using logicalType in the Avro table format

2020-04-30 Thread Dawid Wysakowicz
Hi Gyula, I have not verified it locally yet, but I think you are hitting yet another problem of the unfinished migration from old TypeInformation based type system to the new type system based on DataTypes. As far as I understand the problem the information about the bridging class (java.sql.Time

join in sql without time interval

2020-04-30 Thread lec ssmi
Hi: As the following sql: SELECT * FROM Orders INNER JOIN Product ON Orders.productId = Product.id If we use datastream API instead of sql, how should it be achieved? Because the APIs in DataStream only have Window Join and Interval Join,the official website says that to solve the above

Re: Using logicalType in the Avro table format

2020-04-30 Thread Arvid Heise
Hi Gyula, can you please check which Avro version you are using? Avro only supports Java 8 time (java.time.LocalDateTime) after 1.9.2. Before that everything was hardcoded to joda time. However, I'm not entirely sure where the Java 8 time is coming in your example, as I'm not familiar with the t

Re: doing demultiplexing using Apache flink

2020-04-30 Thread Arvid Heise
Hi Dhurandar, if you use KafkaSerializationSchema [1], you can create a producer record, where you explicitly set the output topic. The topic can be arbitrarily calculated. You pass it while constructing the sink: stream.addSink(new FlinkKafkaProducer( topic, serSchema, // <--