Re: History Server Not Showing Any Jobs - File Not Found?

2020-04-21 Thread Chesnay Schepler
Which Flink version are you using? Have you checked the history server logs after enabling debug logging? On 21/04/2020 17:16, Hailu, Andreas [Engineering] wrote: Hi, I’m trying to set up the History Server, but none of my applications are showing up in the Web UI. Looking at the console, I s

Re: LookupableTableSource from kafka consumer

2020-04-21 Thread Danny Chan
We usually implement a LookupableTableSource based on k-v store data sources such as ES, HBase, and Redis. For Kafka, what we usually do is a regular stream join [1]. [1] https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/table/streaming/joins.html#regular-joins Best, Dann
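A minimal sketch in Java of such a regular join with the Flink 1.10 Table API. The table names ("Orders", "Products") and columns are hypothetical and assumed to have been registered beforehand, e.g. as Kafka-backed tables via DDL:

```java
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.java.StreamTableEnvironment;

public class RegularJoinSketch {
    // A regular join keeps both input sides in Flink state, unlike a point lookup
    // against a k-v store; "Orders" and "Products" are placeholder table names.
    public static Table joinOrdersWithProducts(StreamTableEnvironment tEnv) {
        return tEnv.sqlQuery(
            "SELECT o.order_id, o.amount, p.price " +
            "FROM Orders AS o " +
            "JOIN Products AS p ON o.product_id = p.product_id");
    }
}
```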

Reading from sockets using dataset api

2020-04-21 Thread Kaan Sancak
Hi, I have been running some experiments on large graph data; the smallest graph I have been using has around 70 billion edges. I have a graph generator, which generates the graph in parallel and feeds it to the running system. However, it takes a lot of time to read the edges, because even though th

Two questions about Async

2020-04-21 Thread Stephen Connolly
1. On Flink 1.10, when I look at the topology overview, the AsyncFunctions show non-zero values for Bytes Received, Records Received, and Bytes Sent, but Records Sent is always 0... yet the next step in the topology shows approximately the same Bytes Received as the async step sent (modulo minor delays) and a non-ze

Re: Checkpoint Error Because "Could not find any valid local directory for s3ablock-0001"

2020-04-21 Thread Lu Niu
Cool, thanks! On Tue, Apr 21, 2020 at 4:51 AM Robert Metzger wrote: > I'm not aware of anything. I think the presto s3 file system is generally > the recommended S3 FS implementation. > > On Mon, Apr 13, 2020 at 11:46 PM Lu Niu wrote: > >> Thank you both. Given the debug overhead, I might just

Re: Debug Slowness in Async Checkpointing

2020-04-21 Thread Lu Niu
Hi Robert, Thanks for replying. To improve observability, do you think we should expose more metrics in checkpointing? For example, in incremental checkpointing, the time spent on uploading SST files? https://github.com/apache/flink/blob/5b71c7f2fe36c760924848295a8090898cb10f15/flink-state-backends/

Re: Latency tracking together with broadcast state can cause job failure

2020-04-21 Thread Yun Tang
Hi Lasse, Really sorry for missing your reply. I'll run your project and find the root cause during my daytime. And thanks to @Robert Metzger for the kind reminder. Best, Yun Tang From: Robert Metzger Sent: Tuesday, April 21, 2020 20:01 To: Las

How to use OpenTSDB as Source?

2020-04-21 Thread Lucas Kinne
Hey guys, in a university project we are storing our collected sensor data in an OpenTSDB database. I am now trying to use this database as a source in Apache Flink, but I can't seem to figure out how to do it. I have seen that there is no existing connector for this Databas

KeyedStream and chained forward operators

2020-04-21 Thread Cliff Resnick
I'm running a massive DataStream job that sifts files from S3 by timestamp. The basic job is: FileMonitor -> ContinuousFileReader -> MultipleFileOutputSink. The MultipleFileOutputSink sifts data based on timestamp into date-hour directories. It's a lot of data, so I'm using high parallelism, but I want t

Re: Enable custom REST APIs in Flink

2020-04-21 Thread Oytun Tez
I had some back and forth about this last year; I'll forward the discussion email chain to you privately (it was in this mailing list). Basically, the idea was to make *DispatcherRestEndpoint* and/or *WebMonitorExtension* more accessible so we can extend them. It didn't look like too much work on Flink

History Server Not Showing Any Jobs - File Not Found?

2020-04-21 Thread Hailu, Andreas [Engineering]
Hi, I'm trying to set up the History Server, but none of my applications are showing up in the Web UI. Looking at the console, I see that all of the calls to /overview return the following 404 response: {"errors":["File not found."]}. I've set up my configuration as follows: JobManager Archive
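For reference, a minimal flink-conf.yaml sketch for the History Server; the paths and port below are placeholders, not the values from this particular setup (those are cut off above):

```yaml
# Directory into which the JobManager uploads archives of completed jobs
jobmanager.archive.fs.dir: hdfs:///completed-jobs/
# Directory (or comma-separated list) the History Server polls for new archives
historyserver.archive.fs.dir: hdfs:///completed-jobs/
# Poll interval in milliseconds
historyserver.archive.fs.refresh-interval: 10000
# Address and port of the History Server web UI
historyserver.web.address: 0.0.0.0
historyserver.web.port: 8082
```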

Re: Enable custom REST APIs in Flink

2020-04-21 Thread Jeff Zhang
I know some users do the same thing in Spark. Usually the service runs on the Spark driver side. But Flink is different from Spark: the Spark driver is equal to the Flink client + the Flink job manager. I don't think we currently allow running any user code in the job manager. So allowing a user-defined service in the job m

Re: Enable custom REST APIs in Flink

2020-04-21 Thread Oytun Tez
Definitely, this is for me about making Flink an "application framework" rather than solely a "dataflow framework". -- Oytun Tez M O T A W O R D | CTO & Co-Founder oy...@motaword.com On Tue, Apr 21, 2020 at 11:07 AM Flavio Pompermaier

Re: Enable custom REST APIs in Flink

2020-04-21 Thread Flavio Pompermaier
In my mind the user APIs could run anywhere, but the simplest thing is to make them available in the Job Manager (where the other REST APIs live). They could become a very simple but powerful way to add valuable services to Flink without adding useless complexity to the overall architecture for jus

Re: Enable custom REST APIs in Flink

2020-04-21 Thread Jeff Zhang
Hi Flavio, I am curious to know where the service runs. Do you create this service in a UDF and run it in the TM? Flavio Pompermaier wrote on Tuesday, April 21, 2020 at 8:30 PM: > Hi to all, > many times it happens that we use Flink as a broker towards the data layer > but we need to be able to get some specific info from the

Re: Enable custom REST APIs in Flink

2020-04-21 Thread Oytun Tez
I would LOVE this. We had to hack our way a lot to achieve something similar. @Flavio, we basically added a new entrypoint to the same codebase and ran that separately in its own container. -- Oytun Tez M O T A W O R D | CTO & Co-Founder oy...@motaword.com

Re: Change to StreamingFileSink in Flink 1.10

2020-04-21 Thread Leonard Xu
Thanks @Seth Wiesman. Ah, I just found I used 1.10-SNAPSHOT locally, so I cannot reproduce the bug. @Averell, you can use casts for now and wait for the 1.10.1 release, which will come out soon. Best, Leonard > On April 21, 2020, at 22:03, Seth Wiesman wrote: > > Hi All, > > There is a bug in the builder that prev

Re: Change to StreamingFileSink in Flink 1.10

2020-04-21 Thread Seth Wiesman
Hi All, There is a bug in the builder that prevents it from compiling in Scala due to differences in type inference between Java and Scala [1]. It has already been resolved for 1.10.1 and 1.11. In the meantime, just go ahead and use casts or construct the object in a Java class. Seth [1] https://i
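As an illustration of the "construct the object in a Java class" workaround, a sketch assuming Flink 1.10.0 and mirroring the row-format sink from Averell's snippet; the output path is a placeholder, and the resulting factory method can then be called from the Scala job:

```java
import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;
import org.apache.flink.streaming.api.functions.sink.filesystem.rollingpolicies.DefaultRollingPolicy;

public class StringSinkFactory {
    // The builder chain compiles fine from Java, so building the sink here
    // sidesteps the Scala type-inference issue in 1.10.0.
    public static StreamingFileSink<String> create(String outputPath) {
        return StreamingFileSink
                .<String>forRowFormat(new Path(outputPath), new SimpleStringEncoder<String>())
                .withRollingPolicy(DefaultRollingPolicy.builder().build())
                .build();
    }
}
```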

JDBC Table and parameters provider

2020-04-21 Thread Flavio Pompermaier
Hi all, we have a use case where we have a prepared statement that we parameterize using a custom parameters provider (similar to what happens in testJDBCInputFormatWithParallelismAndNumericColumnSplitting[1]). How can we handle this using the JDBC table API? What should we do to handle such a use
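For context, a rough Java sketch of the pattern that test exercises: a parameterized query whose '?' placeholders are filled per parallel split by a custom ParameterValuesProvider. Driver, URL, credentials, query, and ranges below are placeholders; how to express the same thing through the JDBC Table API is exactly what this thread asks.

```java
import java.io.Serializable;

import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
import org.apache.flink.api.java.io.jdbc.JDBCInputFormat;
import org.apache.flink.api.java.io.jdbc.split.ParameterValuesProvider;
import org.apache.flink.api.java.typeutils.RowTypeInfo;

public class ParameterizedJdbcRead {

    // Each row of bind values becomes one input split, read by one parallel instance.
    static class RangeParametersProvider implements ParameterValuesProvider {
        @Override
        public Serializable[][] getParameterValues() {
            return new Serializable[][] {
                {0, 499},
                {500, 999}
            };
        }
    }

    public static JDBCInputFormat buildInputFormat() {
        return JDBCInputFormat.buildJDBCInputFormat()
                .setDrivername("org.postgresql.Driver")
                .setDBUrl("jdbc:postgresql://localhost:5432/mydb")
                .setUsername("user")
                .setPassword("secret")
                .setQuery("SELECT id, title FROM books WHERE id BETWEEN ? AND ?")
                .setRowTypeInfo(new RowTypeInfo(BasicTypeInfo.INT_TYPE_INFO, BasicTypeInfo.STRING_TYPE_INFO))
                .setParametersProvider(new RangeParametersProvider())
                .finish();
    }
}
```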

Re: Flink incremental checkpointing - how long does data is kept in the share folder

2020-04-21 Thread Shachar Carmeli
Hi Yun, First of all, sorry for the naming mistake, it was a typo. How did I judge that they are related to a specific checkpoint? I judged by removing the files and restarting the job and seeing if it fails in the code below. I missed the privateState, which is why I was missing files. What about recov

Re: Change to StreamingFileSink in Flink 1.10

2020-04-21 Thread Leonard Xu
Hi, Averell I found you're using Scala, so I reproduced your case locally in Scala 2.11.12 with Flink 1.10.0 and it works too. From your picture it's weird that the line `.withBucketAssigner(new DateTimeBucketAssigner)` is hinted as `Any`; it should be `RowFormatBuilder`, otherwise you cannot call `#build

Enable custom REST APIs in Flink

2020-04-21 Thread Flavio Pompermaier
Hi to all, many times it happens that we use Flink as a broker towards the data layer, but we need to be able to get some specific info from the data sources we use (e.g. get triggers and relationships via JDBC). The quick and dirty way of achieving this is to run a Flink job that calls another ser

Re: Passing checkpoint lock object to StreamSourceContexts.getSourceContext after StreamTask.getCheckpointLock deprecation

2020-04-21 Thread Arvid Heise
Hi Yuval, sorry for the late response on the mailing list. I'm assuming your stackoverflow post is referring to the same question and would just cross-reference it here. [1] [1] https://stackoverflow.com/questions/61141240/streamtask-getcheckpointlock-deprecation-and-custom-flink-sources On Thu

Re: Latency tracking together with broadcast state can cause job failure

2020-04-21 Thread Robert Metzger
Hey Lasse, has the problem been resolved? (I'm also responding to this to make sure the thread gets attention again :) ) Best, Robert On Wed, Apr 1, 2020 at 10:03 PM Lasse Nedergaard < lassenedergaardfl...@gmail.com> wrote: > Hi > > I have attached a simple project with a test that reproduce t

Re: LookupableTableSource from kafka consumer

2020-04-21 Thread Jark Wu
Hey, You can take JDBCTableSource [1] as an example of how to implement a LookupableTableSource. However, I'm not sure how to support lookup for Kafka, because AFAIK, Kafka doesn't have the ability to look up by key. Best, Jark [1]: https://github.com/apache/flink/blob/master/flink-connectors/

Re: Checkpoint Error Because "Could not find any valid local directory for s3ablock-0001"

2020-04-21 Thread Robert Metzger
I'm not aware of anything. I think the presto s3 file system is generally the recommended S3 FS implementation. On Mon, Apr 13, 2020 at 11:46 PM Lu Niu wrote: > Thank you both. Given the debug overhead, I might just try out presto s3 > file system then. Besides that presto s3 file system doesn't

LookupableTableSource from kafka consumer

2020-04-21 Thread Clay Teeter
Hey, does anyone have any examples that I can use to create a LookupableTableSource from a Kafka topic? Thanks! Clay

Re: Change to StreamingFileSink in Flink 1.10

2020-04-21 Thread Averell
Hello Leonard, Sivaprasanna, But my code was working fine with Flink v1.8. I also tried with a simple String DataStream and got the same error: StreamingFileSink .forRowFormat(new Path(path), new SimpleStringEncoder[String]()) .withRollingPolicy(DefaultRollingPolicy.b

Re: Unable to unmarshall response (com.ctc.wstx.stax.WstxInputFactory cannot be cast to javax.xml.stream.XMLInputFactory)

2020-04-21 Thread Fu, Kai
Hi, I’m using Flink 1.8 with JDK 8. -- Best wishes Fu Kai From: Chesnay Schepler Date: Tuesday, April 21, 2020 at 5:15 PM To: "Fu, Kai" , "user@flink.apache.org" Subject: RE: [EXTERNAL] Unable to unmarshall response (com.ctc.wstx.stax.WstxInputFactory cannot be cast to javax.xml.stream.XMLIn

Re: Suppressing illegal Access Warnings

2020-04-21 Thread Chesnay Schepler
My bad, I should've scrolled down further. While this probably doesn't affect Flink, I would generally recommend not doing this kind of reflection. On 21/04/2020 10:55, Zahid Rahman wrote: I have included source code for the class and method as I have used it in WordCount.jav

Re: Unable to unmarshall response (com.ctc.wstx.stax.WstxInputFactory cannot be cast to javax.xml.stream.XMLInputFactory)

2020-04-21 Thread Chesnay Schepler
Which Flink version are you using? On 21/04/2020 11:11, Fu, Kai wrote: Hi, I’m running Flink application on AWS Kinesis Flink platform to read a kinesis stream from another account with assumed role, while I’m getting exception like below. But it works when I’m running the application local

Unable to unmarshall response (com.ctc.wstx.stax.WstxInputFactory cannot be cast to javax.xml.stream.XMLInputFactory)

2020-04-21 Thread Fu, Kai
Hi, I'm running a Flink application on the AWS Kinesis Flink platform to read a Kinesis stream from another account with an assumed role, but I'm getting an exception like the one below. It works when I'm running the application locally, and I've given all the related roles admin permission. Could anyone help wh

Re: Change to StreamingFileSink in Flink 1.10

2020-04-21 Thread Sivaprasanna
I agree with Leonard. I have just tried the same in Scala 2.11 with Flink 1.10.0 and it works just fine. Cheers, Sivaprasanna On Tue, Apr 21, 2020 at 12:53 PM Leonard Xu wrote: > Hi, Averell > > I guess it's neither `#withRollingPolicy` nor `#withBucketAssigner`; it > may be caused by generics typ

Re: Suppressing illegal Access Warnings

2020-04-21 Thread Zahid Rahman
I have included source code for the class and method, as I have used it in WordCount.java, already in the email. Here is another copy: import java.lang.reflect.Field; import java.lang.reflect.Method; final class DisableAccessWarning { public static void disableAccessWarnings() { try {
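For completeness, the commonly circulated body of this trick that matches the imports and signature shown above; the implementation below is an assumption, not necessarily Zahid's exact code, and it is precisely the kind of reflection Chesnay advises against:

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

final class DisableAccessWarning {
    public static void disableAccessWarnings() {
        try {
            // Grab the sun.misc.Unsafe singleton via reflection
            Class<?> unsafeClass = Class.forName("sun.misc.Unsafe");
            Field field = unsafeClass.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            Object unsafe = field.get(null);

            Method putObjectVolatile = unsafeClass.getDeclaredMethod(
                    "putObjectVolatile", Object.class, long.class, Object.class);
            Method staticFieldOffset = unsafeClass.getDeclaredMethod(
                    "staticFieldOffset", Field.class);

            // Null out the static logger used for "illegal reflective access" warnings
            Class<?> loggerClass = Class.forName("jdk.internal.module.IllegalAccessLogger");
            Field loggerField = loggerClass.getDeclaredField("logger");
            Long offset = (Long) staticFieldOffset.invoke(unsafe, loggerField);
            putObjectVolatile.invoke(unsafe, loggerClass, offset, null);
        } catch (Exception ignored) {
            // best effort; if anything fails the warning simply remains
        }
    }
}
```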

Re: Flink upgrade to 1.10: function

2020-04-21 Thread Jark Wu
Thanks Danny! On Tue, 21 Apr 2020 at 16:24, Danny Chan wrote: > JSON_VALUE was coded into the parser, which always parses it as the > builtin operator, so there is no chance to override it yet. > > I have filed an issue [1] to track this and hope we can resolve it in the > next Calcite releas

Re: Flink upgrade to 1.10: function

2020-04-21 Thread Danny Chan
JSON_VALUE was coded into the parser, which always parses it as the builtin operator, so there is no chance to override it yet. I have filed an issue [1] to track this and hope we can resolve it in the next Calcite release. [1] https://issues.apache.org/jira/browse/CALCITE-3943 Best, Danny

Re: Suppressing illegal Access Warnings

2020-04-21 Thread Chesnay Schepler
I do not know where this function comes from (DisableAccessWarning().disableAccessWarnings()), so we can't be sure. On 21/04/2020 00:27, Zahid Rahman wrote: Hi, I was getting these warnings; I think these are due to a certain version of Maven libraries which is impacting Java frameworks ev

Re: FlinkKafakaProducer with Confluent SchemaRegistry and KafkaSerializationSchema

2020-04-21 Thread Anil K
Thanks Fabian, I ended up using something like below. public class GenericSerializer implements KafkaSerializationSchema { private final SerializationSchema valueSerializer; private final String topic; public GenericSerializer(String topic, Schema schemaValue, String schemaRegistryUrl) {
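A hedged sketch of how the truncated class above might continue. To keep it self-contained, the constructor here takes an already-built, schema-registry-aware SerializationSchema instead of the (topic, Schema, registry URL) signature shown, so the registry wiring is elided; only the KafkaSerializationSchema plumbing is shown.

```java
import org.apache.avro.generic.GenericRecord;
import org.apache.flink.api.common.serialization.SerializationSchema;
import org.apache.flink.streaming.connectors.kafka.KafkaSerializationSchema;
import org.apache.kafka.clients.producer.ProducerRecord;

public class GenericSerializer implements KafkaSerializationSchema<GenericRecord> {

    private final SerializationSchema<GenericRecord> valueSerializer;
    private final String topic;

    public GenericSerializer(String topic, SerializationSchema<GenericRecord> valueSerializer) {
        this.topic = topic;
        this.valueSerializer = valueSerializer; // e.g. a registry-aware Avro serializer
    }

    @Override
    public ProducerRecord<byte[], byte[]> serialize(GenericRecord element, Long timestamp) {
        // Delegate Avro/registry handling to the wrapped SerializationSchema and
        // only decide the target topic (and, if needed, key/partition) here.
        return new ProducerRecord<>(topic, valueSerializer.serialize(element));
    }
}
```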

Re: FlinkKafakaProducer with Confluent SchemaRegistry and KafkaSerializationSchema

2020-04-21 Thread Fabian Hueske
Thanks for sharing your solution, Anil! Cheers, Fabian On Tue., Apr. 21, 2020 at 09:35, Anil K wrote: > Thanks Fabian, > > I ended up using something like below. > > public class GenericSerializer implements > KafkaSerializationSchema { > > private final SerializationSchema valueSerialize

Re: Change to StreamingFileSink in Flink 1.10

2020-04-21 Thread Leonard Xu
Hi, Averell I guess it's neither `#withRollingPolicy` nor `#withBucketAssigner`; it may be caused by generics: your Encoder's element type (IN) does not match the BucketAssigner's element type (IN), or you lost the generic type information when instantiating them. Could you post more of the code?

Re: Checkpoints for kafka source sometimes get 55 GB size (instead of 2 MB) and flink job fails during restoring from such checkpoint

2020-04-21 Thread Yun Tang
Hi Oleg, If the "stateNameToPartitionOffsets" in the abnormal checkpoints (sub-state > 1GB) is "Kinesis-Stream-Shard-State" [1] instead of "topic-partition-offset-states" [2], I would doubt your description. The _metadata tells us that it is generated from Kinesis instead of Kafka, and the offsets re