Tracing and Flink

2020-08-14 Thread Aaron Levin
is, I'd love to hear from you! Thank you! Best, Aaron Levin

Re: map JSON to scala case class & off-heap optimization

2020-07-09 Thread Aaron Levin
Hi Georg, you can try using the circe library for this, which has a way to automatically generate JSON decoders for Scala case classes. As was mentioned earlier, Flink does not come packaged with JSON-decoding generators for Scala as Spark does. On Thu, Jul 9, 2020 at 4:45 PM Georg Heiler wr

Re: Does anyone have an example of Bazel working with Flink?

2020-06-18 Thread Aaron Levin
Hi Austin, In our experience, `rules_scala` and `rules_java` are enough for us at this point. It's entirely possible I'm not thinking far enough into the future, though, so don't take our lack of investment as a sign it's not worth investing in :) Best, Aaron Levin On Thu

Re: Does anyone have an example of Bazel working with Flink?

2020-06-11 Thread Aaron Levin
Hi Dan, We use Bazel to compile our Flink applications. We're using "rules_scala" ( https://github.com/bazelbuild/rules_scala) to manage the dependencies and produce jars. We haven't had any issues. However, I have found that sometimes it's difficult to figure out exactly what Flink target or depe

ListState with millions of elements

2020-04-08 Thread Aaron Levin
alize fetching from RocksDB and deserializing will be costly when hitting a key with a list of a million elements, but is there anything else we should consider? Thanks! Best, Aaron Levin

Re: Expected behaviour when changing operator parallelism but starting from an incremental checkpoint

2020-03-13 Thread Aaron Levin
c guarantees of our job. Hard failure in the cases where you cannot change parallelism would be the desired outcome imo. Thank you! [0] https://cwiki.apache.org/confluence/display/FLINK/FLIP-47%3A+Checkpoints+vs.+Savepoints Best, Aaron Levin On Fri, Mar 13, 2020 at 9:08 AM Piotr Nowojski wrote:

Expected behaviour when changing operator parallelism but starting from an incremental checkpoint

2020-03-12 Thread Aaron Levin
://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/upgrading.html Aaron Levin

Re: [DISCUSS] Change default for RocksDB timers: Java Heap => in RocksDB

2020-01-17 Thread Aaron Levin
+1. I personally found it a little confusing when I discovered I had to configure this after already choosing RocksDB as a backend. Also very strongly in favour of "safe and scalable" as the default. Best, Aaron Levin On Fri, Jan 17, 2020 at 4:41 AM Piotr Nowojski wrote: > +1

Re: What happens to a Source's Operator State if it stops being initialized and snapshotted? Accidentally exponential?

2019-12-04 Thread Aaron Levin
s restore the state and clear it on all operators and not > reference it again. I know this feels like a workaround but I have no better > idea at the moment. > > Cheers, > Gyula > > On Wed, Nov 27, 2019 at 6:08 PM Aaron Levin wrote: >> >> Hi, >> >>

Re: What happens to a Source's Operator State if it stops being initialized and snapshotted? Accidentally exponential?

2019-11-27 Thread Aaron Levin
tasks[1] > > [1] > https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/state.html#using-managed-operator-state > Best, > Congxian > > > On Wed, Nov 27, 2019 at 3:55 AM, Aaron Levin wrote: >> >> Hi, >> >> Some context: after a refactoring, we were unabl

What happens to a Source's Operator State if it stops being initialized and snapshotted? Accidentally exponential?

2019-11-26 Thread Aaron Levin
ppropriately in `snapshotState` (we, uh, already learned that lesson :D) Best, Aaron Levin

Re: Property based testing

2019-09-18 Thread Aaron Levin
ource and the collection sink * make a ScalaCheck property assertion based on the input collection and output collection. Possible to wrap all that in a single method in Scala. LMK if you have any more questions or any of this was not clear! (note: not sure how to do this in Java). Best, Aaron
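On the Java question: the steps above (generate an input collection, run it through the job, assert a property relating input and output) can be approximated without ScalaCheck. What follows is only a minimal hand-rolled sketch. It assumes the Flink job (collection source, operators, collection sink) has been wrapped behind a plain `Function` from input list to output list; `PipelineProperty`, `randomInput`, and `holdsForAll` are hypothetical names, not part of Flink or any testing library, and there is no shrinking of failing inputs as a real property-testing library would provide.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.BiPredicate;
import java.util.function.Function;

// Hypothetical helper: a hand-rolled stand-in for a property-testing library.
final class PipelineProperty {

    /** Generates a random list of between 0 and maxSize integers. */
    static List<Integer> randomInput(Random rng, int maxSize) {
        int size = rng.nextInt(maxSize + 1);
        List<Integer> input = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            input.add(rng.nextInt());
        }
        return input;
    }

    /**
     * Runs the pipeline on `trials` random inputs and checks the property on
     * each (input, output) pair. Returns false on the first falsifying input.
     * ASSUMPTION: `pipeline` wraps the job under test (source -> operators -> sink).
     */
    static <O> boolean holdsForAll(
            Function<List<Integer>, List<O>> pipeline,
            BiPredicate<List<Integer>, List<O>> property,
            int trials,
            long seed) {
        Random rng = new Random(seed); // fixed seed keeps failures reproducible
        for (int t = 0; t < trials; t++) {
            List<Integer> input = randomInput(rng, 50);
            List<O> output = pipeline.apply(input);
            if (!property.test(input, output)) {
                System.out.println("Falsified by input: " + input);
                return false;
            }
        }
        return true;
    }
}
```

A property such as "the job emits exactly one record per input record" would then be passed as `(in, out) -> in.size() == out.size()`.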

Re: Update Checkpoint and/or Savepoint Timeout of Running Job without restart?

2019-08-19 Thread Aaron Levin
e.org/jira/browse/FLINK-9465 > Best, > Congxian > > > On Sat, Aug 17, 2019 at 12:37 AM, Aaron Levin wrote: > >> Hello, >> >> Question: Is it possible to update the checkpoint and/or savepoint >> timeout of a running job without restarting it? If not, is this something >

Update Checkpoint and/or Savepoint Timeout of Running Job without restart?

2019-08-16 Thread Aaron Levin
checkpoint or savepoint to succeed, and then change the settings back. Best, Aaron Levin

Re: Graceful Task Manager Termination and Replacement

2019-07-24 Thread Aaron Levin
> Hi, >> >> Maybe region restart strategy can help. It restarts minimum required >> tasks. Note that it’s recommended to use only after 1.9 release, see [1], >> unless you’re running a stateless job. >> >> [1] https://issues.apache.org/jira/browse/FLINK-10712

Graceful Task Manager Termination and Replacement

2019-07-11 Thread Aaron Levin
me for the job. I'd love to decrease this downtime if at all possible. Thanks! Any insight is appreciated! Best, Aaron Levin

Re: `env.java.opts` not persisting after job canceled or failed and then restarted

2019-01-29 Thread Aaron Levin
wasn't for you filing the ticket. Thank you so much! [0] https://github.com/pantsbuild/jarjar Aaron Levin On Mon, Jan 28, 2019 at 4:16 AM Ufuk Celebi wrote: > Hey Aaron, > > sorry for the late reply (again). > > (1) I think that your final result is in line with what I have > repro

Re: `env.java.opts` not persisting after job canceled or failed and then restarted

2019-01-25 Thread Aaron Levin
Sent from my iPhone > > On Jan 25, 2019, at 7:12 AM, Aaron Levin wrote: > > Hi Ufuk, > > Update: I've pinned down the issue. It's multiple classloaders loading > `libhadoop.so`: > > ``` > failed to load native hadoop with error: java.lang.UnsatisfiedLinkError: > Native Librar

Re: `env.java.opts` not persisting after job canceled or failed and then restarted

2019-01-24 Thread Aaron Levin
I can put a jar with `org.apache.hadoop.common.io.compress` in `/flink/install/lib` and then remove it from my jar. It's not an ideal solution but I can't think of anything else. Best, Aaron Levin On Thu, Jan 24, 2019 at 12:18 PM Aaron Levin wrote: > Hi Ufuk, > > I'm starting to believe the bug i

Re: `env.java.opts` not persisting after job canceled or failed and then restarted

2019-01-24 Thread Aaron Levin
lib/native/libhadoop.so /usr/lib/ $ java -jar lib_test_deploy.jar hadoop java.library.path=/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib Attempting to load hadoop Successfully loaded ``` Any ideas? On Wed, Jan
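For anyone wanting to run a similar check, a small standalone loader is enough to confirm whether the JVM can find a native library on `java.library.path`. This is a minimal sketch and not the contents of `lib_test_deploy.jar`; the class name `NativeLoadCheck` and the method `tryLoad` are hypothetical. It uses only `System.getProperty` and `System.loadLibrary` from the standard library.

```java
// Hypothetical test harness; NOT the original lib_test_deploy.jar.
final class NativeLoadCheck {

    /**
     * Tries to load a native library by its short name (e.g. "hadoop" maps to
     * libhadoop.so on Linux). Returns true on success, false if the JVM
     * cannot find or link it.
     */
    static boolean tryLoad(String name) {
        try {
            System.loadLibrary(name);
            return true;
        } catch (UnsatisfiedLinkError e) {
            System.out.println("Failed to load " + name + ": " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        // Default library name is an assumption chosen to match the thread.
        String name = args.length > 0 ? args[0] : "hadoop";
        System.out.println("java.library.path=" + System.getProperty("java.library.path"));
        System.out.println("Attempting to load " + name);
        if (tryLoad(name)) {
            System.out.println("Successfully loaded");
        }
    }
}
```

A check like this runs in a single classloader, so it can succeed even when the same library fails to load inside a job whose classes are loaded by several classloaders.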

Re: `env.java.opts` not persisting after job canceled or failed and then restarted

2019-01-23 Thread Aaron Levin
(however, I'm going to investigate this further as I might not have done it correctly). Best, Aaron Levin On Wed, Jan 23, 2019 at 3:18 PM Aaron Levin wrote: > Hi Ufuk, > > Two updates: > > 1. As suggested in the ticket, I naively copied every `.so` in > `hadoop-3.0.0/lib/na

Re: `env.java.opts` not persisting after job canceled or failed and then restarted

2019-01-23 Thread Aaron Levin
019-01-23 19:52:33.081904] at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300) [2019-01-23 19:52:33.081946] at org.apache.flink.runtime.taskmanager.Task.run(Task.java:704) [2019-01-23 19:52:33.081967] at java.lang.Thread.run(Thread.java:748) On Tue, Jan 22, 2019 at 2:31 PM Aaron Levin wrote:

Re: `env.java.opts` not persisting after job canceled or failed and then restarted

2019-01-22 Thread Aaron Levin
ppy libs - I'll get clarification soon). 3. I'm looking into including hadoop's snappy libs in my jar and we'll see if that resolves the problem. Thanks again for your help! Best, Aaron Levin On Tue, Jan 22, 2019 at 10:47 AM Aaron Levin wrote: > Hey, > > Thanks so mu

Re: `env.java.opts` not persisting after job canceled or failed and then restarted

2019-01-22 Thread Aaron Levin
Hey, Thanks so much for the help! This is awesome. I'll start looking into all of this right away and report back. Best, Aaron Levin On Mon, Jan 21, 2019 at 5:16 PM Ufuk Celebi wrote: > Hey Aaron, > > sorry for the late reply. > > (1) I think I was able to reproduce thi

`env.java.opts` not persisting after job canceled or failed and then restarted

2019-01-17 Thread Aaron Levin
access our files in S3. We do not use the `bundled-with-hadoop` distribution of Flink. Best, Aaron Levin

Re: [Flink 1.7.0] always got 10s ask timeout exception when submitting job with checkpoint via REST

2019-01-10 Thread Aaron Levin
encounter these errors. I believe they may also impact our ability to deploy (as we get a timeout when submitting the job programmatically). I'd love to see a solution to this if one exists! Best, Aaron Levin On Thu, Jan 10, 2019 at 2:58 PM Steven Wu wrote: > We are trying out Flink 1.

Re: Error with custom `SourceFunction` wrapping `InputFormatSourceFunction` (Flink 1.6)

2018-11-12 Thread Aaron Levin
Hi Aljoscha, Thanks! I will look into this. Best, Aaron Levin On Fri, Nov 9, 2018 at 5:01 AM, Aljoscha Krettek wrote: > Hi, > > I think for this case a model that is similar to how the Streaming File > Source works should be good. You can have a look at > ContinuousFileMonitor

Re: Error with custom `SourceFunction` wrapping `InputFormatSourceFunction` (Flink 1.6)

2018-11-01 Thread Aaron Levin
most of the implementation of InputFormatSourceFunction so I could insert Watermarks between splits. I'd love any suggestions around improving this! Best, Aaron Levin On Thu, Nov 1, 2018 at 10:41 AM, Aljoscha Krettek wrote: > Hi Aaron, > > I'd like to take a step

Re: Error with custom `SourceFunction` wrapping `InputFormatSourceFunction` (Flink 1.6)

2018-11-01 Thread Aaron Levin
Hey Friends! Last ping and I'll move this over to a ticket. If anyone can provide any insight or advice, that would be helpful! Thanks again. Best, Aaron Levin On Fri, Oct 26, 2018 at 9:55 AM, Aaron Levin wrote: > Hey, > > Not sure how convo threading works on this list, so in

Re: Error with custom `SourceFunction` wrapping `InputFormatSourceFunction` (Flink 1.6)

2018-10-26 Thread Aaron Levin
tmp } } } } On Wed, Oct 24, 2018 at 3:54 AM, Dawid Wysakowicz wrote: > Hi Aaron, > > Could you share the code of your custom function? > > I am also adding Aljoscha and Kostas to cc, who should be more helpful on > that topic. > > Best, > > Dawid >

Re: Error with custom `SourceFunction` wrapping `InputFormatSourceFunction` (Flink 1.6)

2018-10-24 Thread Aaron Levin
t == null && !hasNext) { throw new NoSuchElementException() } val tmp: InputSplit = nextSplit nextSplit = null tmp } } } } Best, Aaron Levin On Wed, Oct 24, 2018 at 8:00 AM, Kien Truong wrote: > Hi, > > Since InputForma
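The fragment above is the tail of a one-element lookahead iterator: `hasNext` fetches and caches the next split, and `next` hands out the cached value and clears it. A minimal Java sketch of that pattern follows. It is not the code from the thread or from Flink; `LookaheadIterator` is a hypothetical name, and the `Supplier` that returns `null` when exhausted is an assumed stand-in for whatever provides the input splits.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Supplier;

// Hypothetical illustration of the lookahead pattern; not Flink's implementation.
final class LookaheadIterator<T> implements Iterator<T> {
    // ASSUMPTION: the source returns null once there are no more elements.
    private final Supplier<T> source;
    private T next;            // cached element, or null if none is cached
    private boolean exhausted; // set once the source has returned null

    LookaheadIterator(Supplier<T> source) {
        this.source = source;
    }

    @Override
    public boolean hasNext() {
        if (exhausted) {
            return false;
        }
        if (next != null) {
            return true; // already fetched, do not pull from the source again
        }
        next = source.get();
        if (next == null) {
            exhausted = true;
            return false;
        }
        return true;
    }

    @Override
    public T next() {
        // Mirrors the fragment: fail only if nothing is cached and nothing remains.
        if (next == null && !hasNext()) {
            throw new NoSuchElementException();
        }
        T tmp = next;
        next = null; // clear the cache so the following call fetches a new element
        return tmp;
    }
}
```

Caching in `hasNext` means repeated `hasNext` calls never consume more than one element from the source.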

Error with custom `SourceFunction` wrapping `InputFormatSourceFunction` (Flink 1.6)

2018-10-19 Thread Aaron Levin
related to that casting call? If so, would y'all be open to a PR that adds an interface one can extend to set the input format in the stream graph? Or is there a preferred way of achieving this? Thanks! Aaron Levin [0] https://github.com/apache/flink/blob/release-1.6/flink-streaming-jav

Re: Scala 2.12 Support

2018-08-16 Thread Aaron Levin
nce to the fact that 1.6 had come out (or where they got that information). I know a few people have cited the ticket and concluded "not clear what's going on with Scala 2.12 support." If you have the bandwidth, a note from you or anyone else would be helpful! Thanks again! Best

Scala 2.12 Support

2018-08-15 Thread Aaron Levin
org/jira/browse/FLINK-7811 [1] https://issues.apache.org/jira/browse/SPARK-14540 Best, Aaron Levin