Re: build.gradle troubles with IntelliJ

2022-01-20 Thread Nico Kruber
.gradle.api.internal.artifacts.dsl.dependencies.DefaultDependencyHandler. > > I tried scala 2.11 and 1.14 SNAPSHOT but nothing works. > > When I look at > https://repository.apache.org/content/repositories/snapshots/org/apache/flin > k/flink-streaming-java.. everything is available

Re: [ANNOUNCE] RocksDB Version Upgrade and Performance

2021-08-04 Thread Nico Kruber
en one of the > > committers can find the relevant patch from RocksDB master and backport > > it. > > That isn't the greatest user experience. > > > > Because of those disadvantages, we would prefer to do the upgrade to the > > newer RocksDB version

Re: Connecting to MINIO Operator/Tenant via SSL

2021-05-31 Thread Nico Kruber
dshakeContext.java:422) > ~[?:1.8.0_292] > at sun.security.ssl.TransportContext.dispatch(TransportContext.java:182) > ~[?:1.8.0_292] > at sun.security.ssl.SSLTransport.decode(SSLTransport.java:152) > ~[?:1.8.0_292] > at sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.java:

Re: [1.9.2] Flink SSL on YARN - NoSuchFileException

2021-04-21 Thread Nico Kruber
> > Your Personal Data: We may collect and process information about you that > may be subject to data protection laws. For more information about how we > use and disclose your personal data, how we protect your information, our > legal basis to use your information, your rights

Re: Password usage in ssl configuration

2020-11-12 Thread Nico Kruber
file will contain the cleartext > passwords for keystore and truststore files. Suppose if any attacker gains > access to this configuration file, using these passwords keystore and > truststore files can be read. What is the community approach to protect > these passwords ? > > Re

Re: Flink 1.8.3 GC issues

2020-09-10 Thread Nico Kruber
What looks a bit strange to me is that with a running job, the SystemProcessingTimeService should actually not be collected (since it is still in use)! My guess is that something is indeed happening during that time frame (maybe job restarts?) and I would propose to check your logs for anything

Re: Example flink run with security options? Running on k8s in my case

2020-08-27 Thread Nico Kruber
ed security back > on and did the curl, using this: > > openssl pkcs12 -passin pass:OhQYGhmtYLxWhnMC -in > /etc/flink-secrets/flink-tls-keystore.key -out rest.pem -nodes > > curl --cacert rest.pem tls-flink-cluster-1-11-jobmanager:8081 > > curl --cacert rest.pem --cert res

Re: Example flink run with security options? Running on k8s in my case

2020-08-26 Thread Nico Kruber
Hi Adam, the flink binary will pick up any configuration from the flink-conf.yaml of its directory. If that is the same as in the cluster, you wouldn't have to pass most of your parameters manually. However, if you prefer not having a flink-conf.yaml in place, you could remove the security.ssl.i

Re: Failed to fetch BLOB - IO Exception

2018-10-23 Thread Nico Kruber
>> org.apache.flink.runtime.blob.BlobServer.getFileInternal(BlobServer.java:521) >> at >> org.apache.flink.runtime.blob.BlobServerConnection.get(BlobServerConnection.java:231) >> at >> org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnecti

Re: Filescheme GS not found sometimes - inconsistent exceptions for reading from GCS

2018-09-13 Thread Nico Kruber
; I will try setting the temporary directory to something other than the > default, can't hurt. > > Thanks for the help, > Encho > > On Thu, Sep 13, 2018 at 3:41 PM Nico Kruber <mailto:n...@data-artisans.com>> wrote: > > Hi Encho, > the SpillingAdapt

Re: Filescheme GS not found sometimes - inconsistent exceptions for reading from GCS

2018-09-13 Thread Nico Kruber
; org.apache.flink.runtime.operators.sort.UnilateralSortMerger$ReadingThread.go(UnilateralSortMerger.java:1066) > at > org.apache.flink.runtime.operators.sort.UnilateralSortMerger$ThreadBase.run(UnilateralSortMerger.java:827) > > > Any ideas why the behaviour is not determinis

Re: Multi-tenancy environment with mutual auth

2018-07-16 Thread Nico Kruber
; like Flink inter-communications doesnt support SSL mutual auth. Any > plans/ways to support it? We are going to have to create DMZ for each > tenant without that, not preferable of course. > > > - Ashish -- Nico Kruber | Software Engineer data Artisans Follow us @dataArtisans

Re: Flink job hangs using rocksDb as backend

2018-07-11 Thread Nico Kruber
/runtime.io>.StreamInputProcessor$ForwardingValveOutputHandler.handleWatermark(StreamInputProcessor.java:288) > > org.apache.flink.streaming.runtime.streamstatus.StatusWatermarkValve.findAndOutputNewMinWatermarkAcrossAlignedChannels(StatusWatermarkValve.java:189) > > org.apa

Re: PartitionNotFoundException after deployment

2018-06-05 Thread Nico Kruber
link.runtime.io > > <http://org.apache.flink.runtime.io>.network.partition.consumer.LocalInputChannel$1.run(LocalInputChannel.java:159) > >>    at java.util.TimerThread.mainLoop(Timer.java:555) > >>    at java.util.TimerThread.run(Timer.java:505) > >

Re: Batch job stuck in Canceled state in Flink 1.5

2018-05-22 Thread Nico Kruber
JM and impacted TM. > > Job ID 390a96eaae733f8e2f12fc6c49b26b8b > > -- > Thanks, > Amit > > On Thu, May 3, 2018 at 8:31 PM, Nico Kruber wrote: >> Also, please have a look at the other TaskManagers' logs, in particular >> the one that is running the operat

Re: Batch job stuck in Canceled state in Flink 1.5

2018-05-03 Thread Nico Kruber
runtime.rpc.akka.FencedAkkaRpcActor.handleMessage(FencedAkkaRpcActor.java:69) > >> >> > at > >> >> > > >> >> > > >> >> > > > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$onReceive$1(AkkaRpcActor.java

Re: Unsure how to further debug - operator threads stuck on java.lang.Thread.State: WAITING

2018-04-24 Thread Nico Kruber
runtime.operators.hash.MutableHashTable.pro >>>> >>>> <http://tors.hash.MutableHashTable.pro>cessProbeIter(MutableHashTable.java:505) >>>>   at >>>> >>>> org.apache.flink.runtime.operators.hash.Mut

Re: Reg. the checkpointing mechanism

2018-04-06 Thread Nico Kruber
e JobManager logs, I see multiple entries saying "Checkpoint > triggered". > I would like to clarify whether the JobManager is triggering the > checkpoint every 200 ms? Or does the JobManager only initiate the > checkpointing and the TaskManager do it on its own? > > R

Re: How does setMaxParallelism work

2018-03-30 Thread Nico Kruber
ink, where I start with parallelism of > (for example) 1, but Flink notices I have high volume of data to process, and > automatically increases parallelism of a running job? > > Thanks, > Alex > > -Original Message- > From: Nico Kruber [mailto:n...@data-artis

Re: All sub-tasks stuck in CREATED state after "BlobServerConnection: GET operation failed"

2018-03-29 Thread Nico Kruber
doesn't either > restart or make the YARN app exit as failed? > > > > My launch command is basically: > > flink-${FLINK_VERSION}/bin/flink run -m yarn-cluster -yn > ${NODE_COUNT} -ys ${SLOT_COUNT} -yjm >

Re: Master and Slave files

2018-03-28 Thread Nico Kruber
ey only used by bash scripts? > > Alex -- Nico Kruber | Software Engineer data Artisans Follow us @dataArtisans -- Join Flink Forward - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Data Artisans GmbH | Stresemannstr. 121A,10963 Berlin, Germany data Artisans, I

Re: How does setMaxParallelism work

2018-03-28 Thread Nico Kruber
28/03/18 13:21, Data Engineer wrote: > Agreed. But how did Flink decide that it should allot 1 subtask? Why not > 2 or 3? > I am trying to understand the implications of using setMaxParallelism vs > setParallelism > > On Wed, Mar 28, 2018 at 2:58 PM, Nico Kruber <mailto:n

Re: timeWindow emits records before window ends?

2018-03-28 Thread Nico Kruber
t; aggFunction=nextgen.McdrAggregator@7d7758be}, EventTimeTrigger(), > WindowedStream.aggregate(WindowedStream.java:752)) -> Map > >   > > With “duration” 42s and “records sent” 689516. > >   > > I expected no records would be sent out until 18 ms elapse. > >   > &g

Re: How does setMaxParallelism work

2018-03-28 Thread Nico Kruber
am not able to figure out how Flink decides the >> parallelism value. >> For instance, if I setMaxParallelism to 3, I see that for my job, >> there is only 1 subtask that is created. How did Flink decide that >> 1 subtask was enough? >> >>

Re: Incremental checkpointing performance

2018-03-23 Thread Nico Kruber
l checkpoint must copy all current data. > If, between two checkpoints, you write more data than the contents of > the database, e.g. by updating a key multiple times, you may indeed have > more data to store. Judging from the state sizes you gave, this is > probably not the case. >

Re: Flink web UI authentication

2018-03-19 Thread Nico Kruber
Hi Sampath, aside from allowing only certain origins via the configuration parameter "web.access-control-allow-origin", I am not aware of anything like username/password authentication. Chesnay (cc'd) may know more about future plans. You can, however, wrap a proxy like squid around the web UI if y

Re: Flink kafka connector with JAAS configurations crashed

2018-03-19 Thread Nico Kruber
Hi, I'm no expert on Kafka here, but as the tasks are run on the worker nodes (where the TaskManagers are run), please double-check whether the file under /data/apps/spark/kafka_client_jaas.conf on these nodes also contains the same configuration as on the node running the JobManager, i.e. an appro

Re: Submiting jobs via UI/Rest API

2018-03-19 Thread Nico Kruber
Thanks for reporting these issues, 1. This behaviour is actually intended since we do not spawn any thread that is waiting for the job completion (which may or may not occur eventually). Therefore, the web UI always submits jobs in detached mode and you could not wait for job completion anyway. An

Re: Calling close() on Failure

2018-03-19 Thread Nico Kruber
Hi Gregory, I tried to reproduce the behaviour you described but in my case (Flink 1.5-SNAPSHOT, using the SocketWindowWordCount adapted to let the first flatmap be a RichFlatMapFunction with a close() method), the close() method was actually called on the task manager I did not kill. Since the clo

Re: CsvSink

2018-03-19 Thread Nico Kruber
Hi Karim, when I was trying to reproduce your code, I got an exception with the name 'table' being used - by replacing it and completing the job with some input, I did see the csv file popping up. Also, the job was crashing when the file 1.txt already existed. The code I used (running Flink 1.5-SN

Re: flink on mesos

2018-03-19 Thread Nico Kruber
Can you elaborate a bit more on what is not working? (please provide a log file or the standard output/error). Also, can you try a newer flink checkount? The start scripts have been merged into a single one for 'flip6' and 'old' - I guess, mesos-appmaster.sh should be the right script for you now.

Re: Incremental checkpointing performance

2018-03-19 Thread Nico Kruber
Hi Miyuru, Indeed, the behaviour you observed sounds strange and kind of go against the results Stefan presented in [1]. To see what is going on, can you also share your changes to Flink's configuration, i.e. flink-conf.yaml? Let's first make sure you're really comparing RocksDBStateBackend with v

Re: Flink is looking for Kafka topic "n/a"

2018-03-08 Thread Nico Kruber
I think, I found a code path (race between threads) that may lead to two markers being in the list. I created https://issues.apache.org/jira/browse/FLINK-8896 to track this and will have a pull request ready (probably) today. Nico On 07/03/18 10:09, Mu Kong wrote: > Hi Gordon, > > Thanks for y

Re: Akka wants to connect with username "flink"

2018-03-06 Thread Nico Kruber
Hi Lukas, those are akka-internal names that you don't have to worry about. It looks like your TaskManager cannot reach the JobManager. Is 'jobmanager.rpc.address' configured correctly on the TaskManager? And is it reachable by this name? Is port 6123 allowed through the firewall? Are you sure th

Re: bin/start-cluster.sh won't start jobmanager on master machine

2018-03-06 Thread Nico Kruber
Hi Yesheng, `nohup /bin/bash -l bin/jobmanager.sh start cluster ...` looks a bit strange since it should (imho) be an absolute path towards flink. What you could do to diagnose further, is to try to run the ssh command manually, i.e. figure out what is being executed by calling bash -x ./bin/start

Re: Kafka offset auto-commit stops after timeout

2018-03-06 Thread Nico Kruber
Hi Edward, looking through the Kafka code, I do see a path where they deliberately do not want recursive retries, i.e. if the coordinator is unknown. It seems like you are getting into this scenario. I'm no expert on Kafka and therefore I'm not sure on the implications or ways to circumvent/fix th

Re: akka.remote.ReliableDeliverySupervisor Temporary failure in name resolution

2018-03-06 Thread Nico Kruber
Hi Miki, I'm no expert on the Kubernetes part, but could that be related to https://github.com/kubernetes/kubernetes/issues/6667? I'm not sure this is an Akka issue: if it cannot communicate with some address it basically blocks it from further connection attempts for a given time (here 5 seconds)

Re: Flink is looking for Kafka topic "n/a"

2018-03-06 Thread Nico Kruber
Hi Mu, which version of flink are you using? I checked the latest branches for 1.2 - 1.5 to look for findLeaderForPartitions at line 205 in Kafka08Fetcher but they did not match. From what I can see in the code, there is a MARKER partition state with topic "n/a" but that is explicitly removed from

Re: Which test cluster to use for checkpointing tests?

2018-03-06 Thread Nico Kruber
issue, though I prefer to wait > for ongoing changes in the network model and FLIP-6 to be finalised to apply > this change properly (are they?). > > Paris > >> On 6 Mar 2018, at 10:51, Nico Kruber wrote: >> >> Hi Ken, >> sorry, I was mislead by the fac

Re: Which test cluster to use for checkpointing tests?

2018-03-06 Thread Nico Kruber
Hi Ken, sorry, I was mislead by the fact that you are using iterations and those were only documented for the DataSet API. Running checkpoints with closed sources sounds like a more general thing than being part of the iterations rework of FLIP-15. I couldn't dig up anything on jira regarding this

Re: logging question

2018-02-28 Thread Nico Kruber
apache.log4j.ConsoleAppender > log4j.appender.console.Target=System.out > log4j.appender.console.layout=org.apache.log4j.PatternLayout > log4j.appender.console.layout.ConversionPattern=%d{-MM-dd > HH:mm:ss,SSS} %-5p %-60c %x - %m%n >   > # Suppress the irrelevant (wrong) warnings > log4j.logger.org.jboss.netty.chan

Re: Which test cluster to use for checkpointing tests?

2018-02-28 Thread Nico Kruber
Nico [1] https://lists.apache.org/thread.html/3121ad01f5adf4246aa035dfb886af534b063963dee0f86d63b675a1@1447086324@%3Cuser.flink.apache.org%3E On 26/02/18 22:55, Ken Krugler wrote: > Hi Nico, > >> On Feb 26, 2018, at 9:41 AM, Nico Kruber > <mailto:n...@data-artisa

Re: Which test cluster to use for checkpointing tests?

2018-02-26 Thread Nico Kruber
Hi Ken, LocalFlinkMiniCluster should run checkpoints just fine. It looks like it was attempting to even create one but could not finish. Maybe your program was not fully running yet? Can you tell us a little bit more about your set up and how you configured the LocalFlinkMiniCluster? Nico On 23/

Re: TaskManager crashes with PageRank algorithm in Gelly

2018-02-26 Thread Nico Kruber
Hi, without knowing Gelly here, maybe it has to do something with cleaning up the allocated memory as mentioned in [1]: taskmanager.memory.preallocate: Can be either of true or false. Specifies whether task managers should allocate all managed memory when starting up. (DEFAULT: false). When taskma

Re: Caused by: java.lang.ClassNotFoundException: org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase

2018-02-26 Thread Nico Kruber
Judging from the code, you should separate different jars with a colon ":", i.e. "—addclasspath jar1:jar2" Nico On 26/02/18 10:36, kant kodali wrote: > Hi Gordon, > > Thanks for the response!! How do I add multiple jars to the classpaths? > Are they separated by a semicolon and still using one

Re: Unrecoverable job failure after Json parse error?

2018-01-16 Thread Nico Kruber
o have a solution in hands. > Thanks again. > Adrian >   > > - Original message - > From: Nico Kruber > To: Adrian Vasiliu , user@flink.apache.org > Cc: > Subject: Re: Unrecoverable job failure after Json parse error? > Date: Tue, Jan 16,

Re: Low throughput when trying to send data with Sockets

2018-01-16 Thread Nico Kruber
w is that my source is not parallel, so when I am > increasing parallelism, system's throughput falls. > > Is opening multiple sockets a quick solution to make the source parallel? > > G. > > 2018-01-16 10:51 GMT+00:00 Nico Kruber <mailto:n...@data-artisans.com>&g

Re: Low throughput when trying to send data with Sockets

2018-01-16 Thread Nico Kruber
Hi George, I suspect issuing a read operation for every 68 bytes incurs too much overhead to perform as you would like it to. Instead, create a bigger buffer (64k?) and extract single events from sub-regions of this buffer instead. Please note, however, that then the first buffer will only be proce

Re: Unrecoverable job failure after Json parse error?

2018-01-16 Thread Nico Kruber
Hi Adrian, couldn't you solve this by providing your own DeserializationSchema [1], possibly extending from JSONKeyValueDeserializationSchema and catching the error there? Nico [1] https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/connectors/kafka.html#the-deserializationschema On

Re: Parallel stream consumption

2018-01-16 Thread Nico Kruber
Just found a nice (but old) blog post that explains Flink's integration with Kafka: https://data-artisans.com/blog/kafka-flink-a-practical-how-to I guess, the basics are still valid Nico On 16/01/18 11:17, Nico Kruber wrote: > Hi Jason, > I'd suggest to start with [1] and [2]

Re: Parallel stream consumption

2018-01-16 Thread Nico Kruber
Hi Jason, I'd suggest to start with [1] and [2] for getting the basics of a Flink program. The DataStream API basically wires operators together with streams so that whatever stream gets out of one operator is the input of the next. By connecting both functions to the same Kafka stream source, your

Re: logging question

2018-01-16 Thread Nico Kruber
Just a guess, but probably our logging initialisation changes the global log level (see conf/log4j.properties). DataStream.collect() executes the program along with creating a local Flink "cluster" (if you are testing locally / in an IDE) and initializing logging, among other things. Please commen

Re: How to get automatic fail over working in Flink

2018-01-16 Thread Nico Kruber
Hi James, In this scenario, with the restart strategy set, the job should restart (without YARN/Mesos) as long as you have enough slots available. Can you check with the web interface on http://:8081/ that enough slots are available after killing one TaskManager? Can you provide JobManager and Ta

Re: Failed to serialize remote message [class org.apache.flink.runtime.messages.JobManagerMessages$JobFound

2018-01-16 Thread Nico Kruber
IMHO, this looks like a bug and it makes sense that you only see this with an HA setup: The JobFound message contains the ExecutionGraph which, however, does not implement the Serializable interface. Without HA, when browsing the web interface, this message is (probably) not serialized since it is

Re: Exception on running an Elasticpipe flink connector

2018-01-04 Thread Nico Kruber
Hi Vipul, Yes, this looks like a problem with a different netty version being picked up. First of all, let me advertise Flink 1.4 for this since there we properly shade away our netty dependency (on version 4.0.27 atm) so you (or in this case Elasticsearch) can rely on your required version. Since

Re: ElasticSearch Connector for version 6.x and scala 2.11

2018-01-04 Thread Nico Kruber
Actually, Flink's netty dependency (4.0.27) is shaded away into the "org.apache.flink.shaded.netty4.io.netty" package now (since version 1.4) and should thus not clash anymore. However, other netty versions may come into play from the job itself or from the integration of Hadoop's classpath (if ava

Re: Testing Flink 1.4 unable to write to s3 locally

2018-01-03 Thread Nico Kruber
Hi Kyle, except for putting the jar into the lib/ folder and setting up credentials, nothing else should be required [1]. The S3ErrorResponseHandler class itself is in the jar, as you can see with jar tf flink-s3-fs-presto-1.4.0.jar | grep org.apache.flink.fs.s3presto.shaded.com.amazonaws.services

Re: BackPressure handling

2018-01-02 Thread Nico Kruber
Hi Vishal, let me already point you towards the JIRA issue for the credit-based flow control: https://issues.apache.org/jira/browse/FLINK-7282 I'll have a look at the rest of this email thread tomorrow... Regards, Nico On 02/01/18 17:52, Vishal Santoshi wrote: > Could you please point me to any

Re: S3 Access in eu-central-1

2018-01-02 Thread Nico Kruber
Sorry for the late response, but I finally got around adding this workaround to our "common issues" section with PR https://github.com/apache/flink/pull/5231 Nico On 29/11/17 09:31, Ufuk Celebi wrote: > Hey Dominik, > > yes, we should definitely add this to the docs. > > @Nico: You recently upd

Re: Flink 1.4 with cassandra-connector: Shading error

2017-12-19 Thread Nico Kruber
Hi Dominik, nice assessment of the issue: in the version of the cassandra-driver we use there is even a comment about why: try { // prevent this string from being shaded Class.forName(String.format("%s.%s.channel.Channel", "io", "netty")); shaded = false; } catch (ClassNotFoundException e)

Re: Flink 1.4.0 can not override JAVA_HOME for single-job deployment on YARN

2017-12-14 Thread Nico Kruber
Hi, are you running Flink in an JRE >= 8? We dropped Java 7 support for Flink 1.4. Nico On 14/12/17 12:35, 杨光 wrote: > Hi, > I am usring flink single-job mode on YARN. After i upgrade flink > verson from 1.3.2 to 1.4.0, the parameter > "yarn.taskmanager.env.JAVA_HOME" doesn’t work as before. >

Re: ProgramInvocationException: Could not upload the jar files to the job manager / No space left on device

2017-12-14 Thread Nico Kruber
Hi Regina, judging from the exception you posted, this is not about storing the file in HDFS, but a step before that where the BlobServer first puts the incoming file into its local file system in the directory given by the `blob.storage.directory` configuration property. If this property is not se

Re: AW: Blob server not working with 1.4.0.RC2

2017-12-06 Thread Nico Kruber
exposing 6124 > for the JobManager BLOB server. Thanks > > Bernd > > -Ursprüngliche Nachricht- > Von: Nico Kruber [mailto:n...@data-artisans.com] > Gesendet: Montag, 4. Dezember 2017 14:17 > An: Winterstein, Bernd; user@flink.apache.org > Betreff: Re: Blob s

Re: Checkpoint expired before completing

2017-12-04 Thread Nico Kruber
> Stephan > > > On Fri, Dec 1, 2017 at 10:10 PM, Steven Wu <mailto:stevenz...@gmail.com>> wrote: > > Here is the checkpoint config. no concurrent checkpoints > with 2 minute checkpoint interval and timeout. > >

Re: Blob server not working with 1.4.0.RC2

2017-12-04 Thread Nico Kruber
Hi Bernd, thanks for the report. I tried to reproduce it locally but both a telnet connection to the BlobServer as well as the BLOB download by the TaskManagers work for me. Can you share your configuration that is causing the problem? You could also try increasing the log level to DEBUG and see if

Re: Checkpoint expired before completing

2017-12-01 Thread Nico Kruber
Hi Steven, by default, checkpoints time out after 10 minutes if you haven't used CheckpointConfig#setCheckpointTimeout() to change this timeout. Depending on your checkpoint interval, and your number of concurrent checkpoints, there may already be some other checkpoint processes running while you

Re: Flink 1.2.0->1.3.2 TaskManager reporting to JobManager

2017-11-28 Thread Nico Kruber
Hi Regina, can you explain a bit more on what you are trying to do and how this is set up? I quickly tried to reproduce locally by starting a cluster and could not see this behaviour. Also, can you try to increase the loglevel to INFO and see whether you see anything suspicious in the logs? Nico

Re: Correlation between data streams/operators and threads

2017-11-22 Thread Nico Kruber
t is used, and 3 are free. > > c) Attached > > d) Attached > > e) I'll try the debug mode in Eclipse. > > Thanks, > Shailesh > > On Fri, Nov 17, 2017 at 1:52 PM, Nico Kruber wrote: > > regarding 3. > > a) The taskmanager logs are missing, are th

Re: Getting java.lang.ClassNotFoundException: for protobuf generated class

2017-11-22 Thread Nico Kruber
gt; > If jar -tf target/program.jar | grep MeasurementTable shows the class is > present, are there other dependencies missing? You may need to add runtime > dependencies into your pom or gradle.build file. > > On Tue, Nov 21, 2017 at 2:28 AM, Nico Kruber wrote: > > Hi S

Re: Getting java.lang.ClassNotFoundException: for protobuf generated class

2017-11-21 Thread Nico Kruber
Hi Shankara, sorry for the late response, but honestly, I cannot think of a reason that some of your program's classes (using only a single jar file) are found some others are not, except for the class not being in the jar. Or there's some class loader issue in the Flink Beam runner (which I fin

Re: Correlation between data streams/operators and threads

2017-11-17 Thread Nico Kruber
regarding 3. a) The taskmanager logs are missing, are there any? b) Also, the JobManager logs say you have 4 slots available in total - is this enough for your 5 devices scenario? c) The JobManager log, however, does not really reveal what it is currently doing, can you set the log level to DEBUG

Re: Flink takes too much memory in record serializer.

2017-11-14 Thread Nico Kruber
We're actually also trying to have the serializer stateless in future and may be able to remove the intermediate serialization buffer which is currently growing on heap before we copy the data into the actual target buffer. This intermediate buffer grows and is pruned after serialization if it i

Re: JobManager shows TaskManager was lost/killed while TaskManger Process is still running and the network is OK.

2017-11-13 Thread Nico Kruber
>From what I read in [1], simply add JVM options to env.java.opts as you would when you start a Java program yourself, so setting "-XX:+UseG1GC" should enable G1. Nico [1] https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/ config.html#common-options On Friday, 15 September 2017

Re: Streaming : a way to "key by partition id" without redispatching data

2017-11-13 Thread Nico Kruber
Hi Gwenhaël, several functions in Flink require keyed streams because they manage their internal state by key. These keys, however, should be independent of the current execution and its parallelism so that checkpoints may be restored to different levels of parallelism (for re-scaling, see [1]).

Re: Flink HA Zookeeper Connection Timeout

2017-11-13 Thread Nico Kruber
Hi Sathya, have you checked this yet? https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/ jobmanager_high_availability.html I'm no expert on the HA setup, have you also tried Flink 1.3 just in case? Nico On Wednesday, 8 November 2017 04:02:47 CET Sathya Hariesh Prakash (sathypra)

Re: AvroParquetWriter may cause task managers to get lost

2017-11-13 Thread Nico Kruber
Hi Ivan, sure, the more work you do per record, the slower the sink will be. However, this should not influence (much) the liveness checks inside flink. Do you get some meaningful entries in the TaskManagers' logs indicating the problem? I'm no expert on Avro and don't know how much actual work

Re: Getting java.lang.ClassNotFoundException: for protobuf generated class

2017-11-13 Thread Nico Kruber
Hi Shankara, can you give us some more details, e.g. - how do you run the job? - how do you add/include the jar with the missing class? - is that jar file part of your program's jar or separate? - is the missing class, i.e. "com.huawei.ccn.intelliom.ims.MeasurementTable $measurementTable" (an inner

Re: ResultPartitionMetrics

2017-11-08 Thread Nico Kruber
Hi Aitozi, the difference is the scope: the normal metrics (without taskmanager.net.detailed-metrics) reflect _all_ buffers of a task while the detailed statistics are more fine-grained and give you statistics per input (or output) gate - the "total" there reflects the fact that each gate has mu

Re: Negative values using latency marker

2017-11-06 Thread Nico Kruber
> > Thanks and regards, > Tovi > -----Original Message- > From: Nico Kruber [mailto:n...@data-artisans.com] > Sent: יום ו 03 נובמבר 2017 15:22 > To: user@flink.apache.org > Cc: Sofer, Tovi [ICG-IT] > Subject: Re: Negative values using latency marker > > Hi To

Re: Incremental checkpointing documentation

2017-11-03 Thread Nico Kruber
Hi Elias, let me answer the questions to the best of my knowledge, but in general I think this is as expected. (Let me give a link to the docs explaining the activation [1] for other readers first.) On Friday, 3 November 2017 01:11:52 CET Elias Levy wrote: > What is the interaction of incrementa

Re: Making external calls from a FlinkKafkaPartitioner

2017-11-03 Thread Nico Kruber
Hi Ron, imho your code should be fine (except for a potential visibility problem on the changes of the non-volatile partitionMap member, depending on your needs). The #open() method should be called (once) for each sink initialization (according to the javadoc) and then you should be fine with t

Re: Negative values using latency marker

2017-11-03 Thread Nico Kruber
Hi Tovi, if I see this correctly, the LatencyMarker gets its initial timstamp during creation at the source and the latency is reported as a metric at a sink by comparing the initial timestamp with the current time. If the clocks between the two machines involved diverge, e.g. the sinks clock fa

Re: Building scala examples

2017-09-27 Thread Nico Kruber
to create and run my own streaming program. They only contain java > compiled class, if I am not mistaken. > > Let me try to create a scala example with similar build procedure. > > Thanks! > > > On Mon, Sep 25, 2017 at 10:41 PM, Nico Kruber > > wrote: > > Hi

Re: Can i use Avro serializer as default instead of kryo serializer in Flink for scala API ?

2017-09-25 Thread Nico Kruber
Hi Shashank, enabling Avro as the default de/serializer for Flink should be as simple as the following, according to [1] val env = StreamExecutionEnvironment.getExecutionEnvironment env.getConfig.enableForceAvro() I am, however, no expert on this and the implications regarding the use of Avro f

Re: Building scala examples

2017-09-25 Thread Nico Kruber
Hi Michael, from what I see, Java and Scala examples reside in different packages, e.g. * org.apache.flink.streaming.scala.examples.async.AsyncIOExample vs. * org.apache.flink.streaming.examples.async.AsyncIOExample A quick run on the Flink 1.3. branch revealed flink-examples- streaming_2.10-1.3-S

Re: Using HiveBolt from storm-hive with Flink-Storm compatibility wrapper

2017-09-25 Thread Nico Kruber
Hi Federico, I also did not find any implementation of a hive sink, nor much details on this topic in general. Let me forward this to Timo and Fabian (cc'd) who may know more. Nico On Friday, 22 September 2017 12:14:32 CEST Federico D'Ambrosio wrote: > Hello everyone, > > I'd like to use the H

Re: History Server

2017-09-25 Thread Nico Kruber
Hi Elias, in theory, it could be integrated into a single web interface, but this was not done so far. I guess the main reason for keeping it separate was probably to have a better separation of concerns as the history server is actually independent of the current JobManager execution and merely

Re: high-availability.jobmanager.port vs jobmanager.rpc.port

2017-09-25 Thread Nico Kruber
Hi Elias, indeed that looks strange but was introduced with FLINK-3172 [1] with an argument about using the same configuration key (as opposed to having two different keys as mentioned) starting at https://issues.apache.org/jira/browse/FLINK-3172? focusedCommentId=15091940#comment-15091940 Nico

Re: Question about concurrent checkpoints

2017-09-21 Thread Nico Kruber
On Thursday, 21 September 2017 20:08:01 CEST Narendra Joshi wrote: > Nico Kruber writes: > > according to [1], even with asynchronous state snapshots (see [2]), a > > checkpoint is only complete after all sinks have received the barriers and > > all (asynchronous) snapshot

Re: Savepoints and migrating value state data types

2017-09-21 Thread Nico Kruber
Hi Marc, I assume you have set a UID for your CoProcessFunction as described in [1]? Also, can you provide the Flink version you are working with and the serializer you are using? If you have the UID set, your strategy seems to be the same as proposed by [2]: "Although it is not possible to chan

Re: on Wikipedia Edit Stream example

2017-09-21 Thread Nico Kruber
Hi Haibin, if you execute the program as in the Wiki edit example [1] from mvn as given or from the IDE, a local Flink environment will be set up which is not accessible form the outside by default. This is done by the call to StreamExecutionEnvironment.getExecutionEnvironment(); which also works

Re: Question about concurrent checkpoints

2017-09-21 Thread Nico Kruber
Hi Narendra, according to [1], even with asynchronous state snapshots (see [2]), a checkpoint is only complete after all sinks have received the barriers and all (asynchronous) snapshots have been processed. Since, if the number of concurrent checkpoints is 0, no checkpoint barriers will be emit

Re: Rest API cancel-with-savepoint: 404s when passing path as target-directory

2017-09-20 Thread Nico Kruber
Hi Emily, I'm not familiar with the details of the REST API either but if this is a problem with the proxy, maybe it is already interpreting the encoded URL and passes it on un-encoded - have you tried encoding the path again? That is, encoding the percent-signs: http:// {ip}:20888/proxy/applic

Re: FLINK-6117 issue work around

2017-09-06 Thread Nico Kruber
I looked at the commit you cherry-picked and nothing in there explains the error you got. This rather sounds like something might be mixed up between (remaining artefacts of) flink 1.3 and 1.2. Can you verify that nothing of your flink 1.3 tests remains, e.g. running JobManager or TaskManager i

Re: Sink -> Source

2017-09-01 Thread Nico Kruber
Hi Philipp, afaik, Flink doesn't offer this out-of-the-box. You could either hack something as suggested or use Kafka to glue different jobs together. Both may affect exactly/at-least once guarantees, however. Also refer to https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/connector

Re: BlobCache and its functioning

2017-08-31 Thread Nico Kruber
709181065c6710c2252a5846f361ad68 > from localhost/127.0.0.1:43268 > 2017-08-30 16:01:58,162 DEBUG org.apache.flink.runtime.blob. > BlobClient - GET content addressable BLOB > 6bde2f7a709181065c6710c2252a5846f361ad68 from /127.0.0.1:35743 > > 3) There actually wa

Re: BlobCache and its functioning

2017-08-31 Thread Nico Kruber
Hi Federico, 1) Which version of Flink are you using? 2) Can you also share the JobManager log? 3) Why do you think, Flink is stuck at the BlobCache? Is it really blocked, or do you still have CPU load? Can you post stack traces of the TaskManager (TM) and JobManager processes when you think they

Re: Great number of jobs and numberOfBuffers

2017-08-31 Thread Nico Kruber
oup, filter, for each "batch". And if we start the flink > app on a whole week of data, we will have to start (24 * 7) batches. > Parallelism has the default value except for the output writers (32 and 4) > in order to limit the numbers of files on HDFS. > > > -

Re: Stand alone blob server

2017-08-23 Thread Nico Kruber
Hi Anugrah, you can track the progress at the accompanying jira issue: https://issues.apache.org/jira/browse/FLINK-6916 Currently, roughly half of the tasks are done with a few remaining in PR review stage. Note that the actual implementation differs a bit from what was proposed in FLIP 19 thoug

Re: Flink multithreading, CoFlatMapFunction, CoProcessFunction, internal state

2017-08-21 Thread Nico Kruber
n, for maintaining custom states of my program logic I guess I > cannot use it. > > > Thank you, > Chao > > On 08/16/2017 03:31 AM, Nico Kruber wrote: > > Hi Chao, > > > > 1) regarding the difference of CoFlatMapFunction/CoProcessFunction, let me > > quote the

  1   2   >