Re: Reading from sockets using dataset api

2020-04-24 Thread Arvid Heise
Hi Kaan, sorry, I haven't considered I/O as the bottleneck. I thought a bit more about your issue and came to a rather simple solution. How about you open a socket on each of your generator nodes? Then you configure your analysis job to read from each of these sockets with a separate source and u

Re: JDBC table api questions

2020-04-24 Thread Flavio Pompermaier
Ok, but how can views then be retrieved via the Table API? If they are returned as Table objects, is there a way to mark them read-only? Because VIEWS in Flink SQL are Flink views, so how can I query JDBC views? On Fri, Apr 24, 2020 at 4:22 AM Zhenghua Gao wrote: > FLINK-16471 introduce a JDBCCatalog, w

Re: Flink

2020-04-24 Thread Konstantin Knauf
Hi Navneeth, I think there might be some misunderstanding here. Let me try to clarify. 1) The so-called native Kubernetes support [1], which was added as an experimental feature in Flink 1.10, is not used by Ververica Platform CE nor by the Lyft K8s Operator as far as I am aware. 2) The nativ

Re: Flink 1.10 Out of memory

2020-04-24 Thread Stephan Ewen
@Xintong - out of curiosity, where do you see that this tries to fork a process? I must be overlooking something, I could only see the native method call. On Fri, Apr 24, 2020 at 4:53 AM Xintong Song wrote: > @Stephan, > I don't think so. If JVM hits the direct memory limit, you should see the >

Re: Reading from sockets using dataset api

2020-04-24 Thread Kaan Sancak
Yes, that sounds like a great idea and actually that's what I am trying to do. > Then you configure your analysis job to read from each of these sockets with > a separate source and union them before feeding them to the actual job? Before trying to open the sockets on the slave nodes, first I h

Re: Flink 1.10 Out of memory

2020-04-24 Thread Xintong Song
I might be wrong about how JNI works. Isn't a native method always executed in another process? I was searching for the java error message "Cannot allocate memory", and it seems this happens when JVM cannot allocate memory from the OS. Given the exception is thrown from calling a native method, I

Re: Reading from sockets using dataset api

2020-04-24 Thread Arvid Heise
Hm, I had assumed sockets work the other way around (pulling, like URLInputStream, instead of listening). I'd go with providing the data on a port on each generator node, and then reading from that in multiple sources. I think the best solution is to implement a custom InputFormat and then use readInp
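The setup Arvid describes, each generator node serving its data on a port and the analysis job reading every port as a separate source before unioning, can be sketched with plain Java sockets. This is a toy stand-in for Flink sources (the ports, class name, and data are made up), not the actual InputFormat implementation:

```java
import java.io.*;
import java.net.*;
import java.util.*;

public class SocketFeed {
    // Generator side: serve pre-generated lines on a port, then close.
    static void serve(ServerSocket server, List<String> lines) {
        new Thread(() -> {
            try (Socket client = server.accept();
                 PrintWriter out = new PrintWriter(client.getOutputStream(), true)) {
                for (String line : lines) out.println(line);
            } catch (IOException e) { throw new UncheckedIOException(e); }
        }).start();
    }

    // Analysis side: one "source" pulls all lines from one generator's socket.
    static List<String> readAll(String host, int port) throws IOException {
        List<String> result = new ArrayList<>();
        try (Socket s = new Socket(host, port);
             BufferedReader in = new BufferedReader(new InputStreamReader(s.getInputStream()))) {
            String line;
            while ((line = in.readLine()) != null) result.add(line);
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Two ephemeral local ports stand in for two generator nodes.
        ServerSocket a = new ServerSocket(0);
        ServerSocket b = new ServerSocket(0);
        serve(a, Arrays.asList("a1", "a2"));
        serve(b, Arrays.asList("b1"));
        // Read each source separately, then union the results.
        List<String> union = new ArrayList<>();
        union.addAll(readAll("localhost", a.getLocalPort()));
        union.addAll(readAll("localhost", b.getLocalPort()));
        a.close();
        b.close();
        System.out.println(union);
    }
}
```

In a real job, each `readAll` would be a separate parallel source instance reading from one generator's port.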

Re: Flink 1.10 Out of memory

2020-04-24 Thread Stephan Ewen
I think native methods are not in a forked process. It is just a malloc() call that failed, probably an I/O buffer or so. This might mean that there really is no native memory available any more, meaning the process has hit its limit. In any case, a bit more JVM overhead should solve this. On Fri,

Re: Flink 1.10 Out of memory

2020-04-24 Thread Xintong Song
True. Thanks for the clarification. Thank you~ Xintong Song On Fri, Apr 24, 2020 at 5:21 PM Stephan Ewen wrote: > I think native methods are not in a forked process. It is just a malloc() > call that failed, probably an I/O buffer or so. > This might mean that there really is no native memor

Re: Handling stale data enrichment

2020-04-24 Thread Vinay Patil
Hi Konstantin, Thank you for your answer. Yes, we have timestamps in the subscription stream >the disadvantage that you do not make any progress until you see fresh subscription data. Is this the desired behavior for your use case? No, this is not acceptable. Reason being the subscription data m

Re: Flink 1.10 Out of memory

2020-04-24 Thread Zahid Rahman
https://youtu.be/UEkjRN8jRx4 22:10 - one option is to reduce Flink managed memory from the default 70% to maybe 50%; - this error could also be caused by missing memory; - by a programmer maintaining a local list and overusing user-allocated memory through heavy processing; - or usin

Re: Checkpoint Error Because "Could not find any valid local directory for s3ablock-0001"

2020-04-24 Thread Robert Metzger
Thanks for opening the ticket. I've asked a committer who knows the streaming sink well to take a look at the ticket. On Fri, Apr 24, 2020 at 6:47 AM Lu Niu wrote: > Hi, Robert > > BTW, I did some field study and I think it's possible to support streaming > sink using presto s3 filesystem. I thi

Re: Flink 1.10 Out of memory

2020-04-24 Thread Stephan Ewen
@zahid I would kindly ask you to rethink your approach to posting here. Wanting to help answer questions is appreciated, but what you post is always completely disconnected from the actual issue. The questions here usually go beyond the obvious and beyond what a simple Stack Overflow search yiel

Re: Fault tolerance in Flink file Sink

2020-04-24 Thread Dawid Wysakowicz
Hi Eyal, First of all I would say a local filesystem is not the right choice for what you are trying to achieve. I don't think you can achieve a true exactly-once policy in this setup. Let me elaborate why, and clarify a bit how the StreamingFileSink works.  The interesting bit is how it behaves
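The lifecycle Dawid is describing can be sketched with plain file operations: records first go to a hidden "in-progress" file that readers never see, and only once the checkpoint completes is the file atomically renamed to its final, visible name. This is a simplified illustration (the naming scheme is an assumption, and the real StreamingFileSink also has a "pending" stage between the two), not the sink's actual implementation:

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.Arrays;
import java.util.List;

public class TwoPhaseFileWrite {
    // Phase 1: write records to a hidden "in-progress" file (not visible to readers).
    static Path writeInProgress(Path dir, String name, List<String> records) throws IOException {
        Path inProgress = dir.resolve("." + name + ".inprogress");
        Files.write(inProgress, records);
        return inProgress;
    }

    // Phase 2: on checkpoint completion, atomically rename to the final, visible name.
    static Path commit(Path inProgress, Path dir, String name) throws IOException {
        return Files.move(inProgress, dir.resolve(name), StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("sink");
        Path tmp = writeInProgress(dir, "part-0-0", Arrays.asList("r1", "r2"));
        // If the job fails here, only the hidden in-progress file exists;
        // on restore it can simply be discarded, so no half-written data is exposed.
        Path finalFile = commit(tmp, dir, "part-0-0");
        System.out.println(Files.readAllLines(finalFile));
    }
}
```

This also shows why a local filesystem breaks the guarantee: after a failover, the task may restart on a different machine that cannot see, and therefore cannot discard or commit, the in-progress files left behind.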

Re: Fault tolerance in Flink file Sink

2020-04-24 Thread Dawid Wysakowicz
Forgot to cc Kostas On 23/04/2020 12:11, Eyal Pe'er wrote: > > Hi all, > I am using Flink streaming with Kafka consumer connector > (FlinkKafkaConsumer) and file Sink (StreamingFileSink) in a cluster > mode with exactly once policy. > > The file sink writes the files to the local disk. > > I’ve no

Re: Flink Forward 2020 Recorded Sessions

2020-04-24 Thread Sivaprasanna
Cool. Thanks for the information. On Fri, 24 Apr 2020 at 11:20 AM, Marta Paes Moreira wrote: > Hi, Sivaprasanna. > > The talks will be up on Youtube sometime after the conference ends. > > Today, the starting schedule is different (9AM CEST / 12:30PM IST / 3PM > CST) and more friendly to Europe,

JDBC error on numeric conversion (because of DecimalType MIN_PRECISION)

2020-04-24 Thread Flavio Pompermaier
Hi to all, I was doing a test against Postgres and I get an error because the JDBC connector tries to create a DecimalType with precision 0 (min = 1). Should DecimalType.MIN_PRECISION be lowered to 0, or should the NUMERIC type of JDBC tables be mapped in some other way? The Table was created with the fol
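For context, an unconstrained Postgres NUMERIC column reports precision 0 through JDBC metadata, which collides with DecimalType's minimum of 1. One way to handle it is to map the unconstrained case to the widest supported decimal instead of passing 0 through. The helper below is a hypothetical sketch (the `mapNumeric` name and the 38/18 fallback are assumptions modeled on the Spark fix for the analogous problem, SPARK-26538), not Flink's actual connector code:

```java
public class NumericMapping {
    static final int MAX_PRECISION = 38; // widest decimal supported by Flink's DecimalType

    // Map a JDBC-reported NUMERIC precision/scale into a valid DecimalType range.
    // Postgres reports precision 0 for an unconstrained NUMERIC column.
    static int[] mapNumeric(int reportedPrecision, int reportedScale) {
        if (reportedPrecision == 0) {
            // Unconstrained column: fall back to a wide default instead of failing.
            return new int[] {MAX_PRECISION, 18};
        }
        return new int[] {Math.min(reportedPrecision, MAX_PRECISION), reportedScale};
    }

    public static void main(String[] args) {
        int[] unconstrained = mapNumeric(0, 0);   // NUMERIC with no declared precision
        int[] declared = mapNumeric(10, 2);       // NUMERIC(10, 2)
        System.out.println(unconstrained[0] + "," + unconstrained[1]);
        System.out.println(declared[0] + "," + declared[1]);
    }
}
```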

[ANNOUNCE] Development progress of Apache Flink 1.11

2020-04-24 Thread Piotr Nowojski
Hi community, It has been more than 6 weeks since the previous announcement and as we are approaching the expected feature freeze we would like to share the Flink 1.11 status update with you. Initially we were aiming for the feature freeze to happen in late April (now), however it was recently pr

Re: Question about Scala Case Class and List in Flink

2020-04-24 Thread Averell
Hi Timo, This is my case class: case class Box[T](meta: Metadata, value: T) { def function1: A=>B = {...} def method2(...):A = {...} } However, I still get that warning "Class class data.package$Box cannot be used as a POJO type because not all fields are valid POJO fields, and mus

Re: K8s native - checkpointing to S3 with RockDBStateBackend

2020-04-24 Thread Averell
Thank you Yun Tang. Building my own docker image as suggested solved my problem. However, I don't understand why I need that while I already had that s3-hadoop jar included in my uber jar? Thanks. Regards, Averell -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.co

Re: [ANNOUNCE] Development progress of Apache Flink 1.11

2020-04-24 Thread Till Rohrmann
Thanks for the update Piotr. Cheers, Till On Fri, Apr 24, 2020 at 4:42 PM Piotr Nowojski wrote: > Hi community, > > It has been more than 6 weeks since the previous announcement and as we are > approaching the expected feature freeze we would like to share the Flink > 1.11 status update with yo

checkpointing opening too many file

2020-04-24 Thread ysnakie
Hi everyone We have a Flink job that writes files to different directories on HDFS. It opens many files due to its high parallelism. I also found that when using the RocksDB state backend, even more files are open during checkpointing.  We use YARN to schedu

Re: K8s native - checkpointing to S3 with RockDBStateBackend

2020-04-24 Thread David Magalhães
I think the classloaders for the uber jar and Flink are different. Not sure if this is the right explanation, but that is why you need to add flink-s3-fs-hadoop inside the plugins folder in the cluster. On Fri, Apr 24, 2020 at 4:07 PM Averell wrote: > Thank you Yun Tang. > Building my own docke

RE: History Server Not Showing Any Jobs - File Not Found?

2020-04-24 Thread Hailu, Andreas
I'm having a further look at the code in HistoryServerStaticFileServerHandler - is there an assumption about where overview.json is supposed to be located? // ah From: Hailu, Andreas [Engineering] Sent: Wednesday, April 22, 2020 1:32 PM To: 'Chesnay Schepler' ; Hailu, Andreas [Engineering] ; us

Re: Flink 1.10 Out of memory

2020-04-24 Thread Zahid Rahman
> "a simple Stack Overflow search yields." Actually it wasn't Stack Overflow but a video I saw, presented by Robert Metzger of the Apache Flink org. Your mind must have been fogged up with the thought of another email, not the contents of mine. He explained the very many solutions

Re: JDBC error on numeric conversion (because of DecimalType MIN_PRECISION)

2020-04-24 Thread Flavio Pompermaier
I think I hit the same problem of SPARK-26538 ( https://github.com/apache/spark/pull/23456). I've handled the case in the same manner in my PR for FLINK-17356 ( https://github.com/apache/flink/pull/11906) On Fri, Apr 24, 2020 at 4:28 PM Flavio Pompermaier wrote: > Hi to all, > I was doing a test

Re: Flink 1.10 Out of memory

2020-04-24 Thread Som Lima
@Zahid what the Apache folks mean is: don't be like Jesse Anderson, who recommended Flink on the basis that Apache only uses maps, as seen in the video, while Flink uses ValueState and State in the Streaming API. Although it appears Jesse Anderson only looked as deep as the DataStream hello world. You are required

ArrayIndexoutofBoundsException.

2020-04-24 Thread Zahid Rahman
@Stephan Ewen. That was the other response I gave. I have thought about it really hard, as per your request. Dr. Noureddin Sadawi shows how to handle that exception: https://youtu.be/c7rsWQvpw4k @ 7:06.

Re: checkpointing opening too many file

2020-04-24 Thread Congxian Qiu
Hi If there are indeed so many files that need to be uploaded to HDFS, then currently we do not have any solution to limit the number of open files. There is an issue [1] that aims to fix this problem, and a PR for it; maybe you can try the attached PR to see if it solves your problem. [1] https://issues.apache.org/ji
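Until a fix like the linked issue lands, one generic workaround is to bound the number of simultaneously open files yourself by putting a semaphore around the upload tasks. This is an illustrative sketch (the cap, pool size, and file names are made up), not Flink's checkpoint upload code:

```java
import java.util.*;
import java.util.concurrent.*;

public class BoundedUploads {
    public static void main(String[] args) throws Exception {
        int maxOpenFiles = 4;                    // hypothetical cap on open file handles
        Semaphore permits = new Semaphore(maxOpenFiles);
        ExecutorService pool = Executors.newFixedThreadPool(16);
        List<Future<String>> results = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            final int id = i;
            results.add(pool.submit(() -> {
                permits.acquire();               // blocks once maxOpenFiles are in flight
                try {
                    // open the state file and upload it to HDFS here
                    return "file-" + id;
                } finally {
                    permits.release();           // file handle closed, free a slot
                }
            }));
        }
        int done = 0;
        for (Future<String> f : results) { f.get(); done++; }
        pool.shutdown();
        System.out.println(done);
    }
}
```

The thread pool keeps uploads parallel while the semaphore caps how many files are open at once, independent of the pool size.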

Re: Debug Slowness in Async Checkpointing

2020-04-24 Thread Congxian Qiu
Hi If the bottleneck is the upload part, have you tried uploading files using multiple threads [1]? [1] https://issues.apache.org/jira/browse/FLINK-11008 Best, Congxian Lu Niu wrote on Fri, Apr 24, 2020 at 12:38 PM: > Hi, Robert > > Thanks for replying. Yeah. After I added monitoring on the above path, it > sh