'll just serialize an integer id. So the
> amount of data being transferred goes down drastically.
>
> The disableAutoTypeRegistration flag is ignored in the DataStream API at
> the moment.
>
>
> On Thu, Feb 23, 2017 at 7:00 PM, Dmitry Golubets
> wrote:
>
>> Hi
e classes? If so, what are
> you using for doing that?
>
> Regards,
> Robert
>
> On Fri, Feb 17, 2017 at 9:17 PM, Dmitry Golubets
> wrote:
>
>> Hi Daniel,
>>
>> I've implemented a macro that generates message pack serializers in our
>> codebase.
>
izer(..)
> .
>
> I'm interested on knowing what have you done there for a boost of about
> 50% .
>
> Some small or simple example would be very nice.
>
> Thank you very much in advance.
>
> Kind Regards,
>
> Daniel Santos
>
> On 02/17/2017 12:43 PM, Dmitry Go
> Till
>
>
> On Fri, Feb 17, 2017 at 12:38 PM, Dmitry Golubets
> wrote:
>
>> Hi,
>>
>> I was using ```cs.knownDirectSubclasses``` recursively to find and
>> register subclasses, which may have resulted in order mess.
>> Later I changed that to
>
Hi,
My streaming job cannot benefit much from parallelization, unfortunately.
So I'm looking for things I can tune in Flink to make it process a
sequential stream faster.
So far in our current engine based on Akka Streams (non distributed ofc) we
have 20k msg/sec.
Ported to Flink I'm getting 14k so
ing the savepoint? Flink upgrade, Job upgrade, changing Kryo version,
> changing order in which you register Kryo serialisers?
>
> Best,
> Aljoscha
>
> On Fri, 10 Feb 2017 at 18:26 Dmitry Golubets wrote:
>
>> The docs say that it may improve performance.
>>
>>
Hi,
Can I force Flink to use Akka 2.4 (recompile if needed)?
Is it going to misbehave in a subtle way?
Best regards,
Dmitry
> implementation as default.
>
> However, I’m not sure of the plans in exposing this to the user and making
> it configurable.
> Looping in Stefan (in cc) who mostly worked on this part and see if he can
> provide more info.
>
> - Gordon
>
> On February 14, 2017 at 2:3
Hi,
It looks impossible to implement a keyed state with operator state now.
I know it sounds like "just use a keyed state", but the latter requires
updating it on every value change, as opposed to operator state, and thus can
be expensive (especially if you have to deal with mutable structures inside
w
The docs say that it may improve performance.
How true is that when custom serializers are provided?
There is also 'disableAutoTypeRegistration' method in the config class,
implying Flink registers types automatically.
So, given that I have a hierarchy:
trait A
class B extends A
class C extends A
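A minimal sketch of registering such a hierarchy up front, assuming the Scala DataStream API; the point of registration is that Kryo can then write a small integer id for each class instead of its full name:

```scala
import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment

trait A
class B extends A
class C extends A

object RegisterHierarchy {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    // Register the concrete classes, not the trait: Kryo resolves
    // serializers by the runtime class of each record.
    env.registerType(classOf[B])
    env.registerType(classOf[C])
  }
}
```

Note that the registration order determines the integer ids, so it should stay stable across job versions if savepoint compatibility matters.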
Hi,
I need to re-create a Kafka topic when a job is started in "clean" mode.
I can do it, but I'm not sure if I do it in the right place.
Is it fine to put this kind of code in the "main"?
Then it's called on every job submit.
But how do I detect whether a job is being started from a savepoint?
Or is
Update: I've now used the 1.1.3 versions as in the example in the docs and it
works!
Looks like there is an incompatibility with the latest logback.
Best regards,
Dmitry
On Wed, Feb 8, 2017 at 3:20 PM, Dmitry Golubets wrote:
> Hi Robert,
>
> After reading that link I've adde
ink/flink-docs-release-1.2/monitoring/best_
> practices.html#use-logback-when-running-flink-on-a-cluster
>
> On Tue, Feb 7, 2017 at 1:07 PM, Dmitry Golubets
> wrote:
>
>> Hi,
>>
>> documentation says: "Users willing to use logback instead of log4j can
>
Hi,
documentation says: "Users willing to use logback instead of log4j can just
exclude log4j (or delete it from the lib/ folder)."
But then Flink just doesn't start. I added logback-classic 1.10 to its lib
folder, but still get NoClassDefFoundError:
ch/qos/logback/core/joran/spi/JoranException
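That missing class (`JoranException`) lives in logback-core, so adding logback-classic alone is not enough. A sketch of the full jar swap, assuming a standard Flink distribution under `$FLINK_HOME`; the version numbers and download paths are placeholders:

```shell
# Replace log4j with logback in Flink's lib/ folder (a sketch; adjust
# versions and paths to your setup).
cd "$FLINK_HOME/lib"
rm -f log4j-*.jar slf4j-log4j12-*.jar
cp /path/to/logback-core-1.1.3.jar .
cp /path/to/logback-classic-1.1.3.jar .
# Optional: route legacy log4j calls from dependencies to slf4j/logback.
cp /path/to/log4j-over-slf4j-1.7.7.jar .
```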
ree to report it here.
>
> The PRs will be merged later today.
>
>
> On Mon, Feb 6, 2017 at 4:41 PM, Dmitry Golubets
> wrote:
> > Hi guys,
> >
> > I would appreciate if someone could explain to me what's the difference
> > between those two.
> >
>
Hi guys,
I would appreciate if someone could explain to me what's the difference
between those two.
The current description refers to "dynamic scaling", and yet I can't find
anything about it in Flink's docs.
Best regards,
Dmitry
> passing-them-around-in-your-flink-application
>
> On Thu, Jan 26, 2017 at 5:38 PM, Dmitry Golubets
> wrote:
>
>> Hi,
>>
>> Is there a place for user defined configuration settings?
>> How to read them?
>>
>> Best regards,
>> Dmitry
>>
>
>
Hi,
Is there a place for user defined configuration settings?
How to read them?
Best regards,
Dmitry
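For what it's worth, one common pattern is Flink's `ParameterTool`. A sketch, assuming settings are passed as `--key value` program arguments (the key names here are made up):

```scala
import org.apache.flink.api.java.utils.ParameterTool

object ReadSettings {
  def main(args: Array[String]): Unit = {
    // e.g. run with: --kafka.topic events --lookback.days 7
    val params = ParameterTool.fromArgs(args) // or ParameterTool.fromPropertiesFile(...)
    val topic    = params.get("kafka.topic", "events") // with a default
    val lookback = params.getInt("lookback.days", 7)
    println(s"topic=$topic lookback=$lookback")
  }
}
```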
aybe you can resolve the
> issue on your side for now.
> I've filed a JIRA for this issue: https://issues.apache.
> org/jira/browse/FLINK-5661
>
>
>
> On Wed, Jan 25, 2017 at 8:24 PM, Dmitry Golubets
> wrote:
>
>> I've built the latest Flink from source and it
I've built the latest Flink from source and it seems that the httpclient
dependency from flink-mesos is not shaded. It causes trouble with the latest
AWS SDK.
Do I build it wrong or is it a known problem?
Best regards,
Dmitry
Hi,
I've just added my custom MsgPack serializers hoping to see a performance
increase. I covered all data types between chains.
However, this Kryo method still takes a lot of CPU: IdentityObjectIntMap.get.
Is there something else that should be configured?
Or is there no way to get away from Kryo ove
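One thing worth checking (a sketch, with `MyEvent`/`MsgPackSerializer` as placeholder names): a custom Kryo serializer only helps if Flink knows to use it, and even then records still go through Kryo's registration lookup. Registering the type together with its serializer at least avoids the per-record class-name path; fully escaping Kryo would require a custom `TypeInformation`/`TypeSerializer` instead.

```scala
import com.esotericsoftware.kryo.{Kryo, Serializer}
import com.esotericsoftware.kryo.io.{Input, Output}
import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment

// Placeholder type and serializer -- stand-ins for the MsgPack ones above.
case class MyEvent(id: Long, payload: String)

class MsgPackSerializer extends Serializer[MyEvent] {
  override def write(kryo: Kryo, out: Output, e: MyEvent): Unit = {
    out.writeLong(e.id); out.writeString(e.payload)
  }
  override def read(kryo: Kryo, in: Input, cls: Class[MyEvent]): MyEvent =
    MyEvent(in.readLong(), in.readString())
}

val env = StreamExecutionEnvironment.getExecutionEnvironment
env.getConfig.registerTypeWithKryoSerializer(classOf[MyEvent], classOf[MsgPackSerializer])
```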
Hi,
I'm looking for the right way to do the following scheme:
1. Read data
2. Split it into partitions for parallel processing
3. In every partition, group data into N-element batches
4. Process these batches
My first attempt was:
*dataStream.keyBy(_.key).countWindow(..)*
But countWindow groups by
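Independent of Flink, the intended semantics of steps 2-4 can be sketched in plain Scala (the names here are made up); in the job itself, step 2 corresponds to the keyBy/partitioning and steps 3-4 to the windowing:

```scala
case class Event(key: Int, value: Int)

// Steps 2-4 in miniature: partition by a hash of the key, cut each
// partition into n-element batches, and "process" each batch (here: sum it).
def batchAndProcess(events: Seq[Event], partitions: Int, n: Int): Map[Int, Seq[Int]] =
  events
    .groupBy(e => math.abs(e.key.hashCode) % partitions) // step 2
    .map { case (p, es) =>
      p -> es.grouped(n).map(_.map(_.value).sum).toSeq   // steps 3 and 4
    }
```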
> Overall, that seemed the more scalable design to us.
> Can your use case follow a similar approach?
>
> Stephan
>
>
>
> On Tue, Jan 17, 2017 at 10:57 AM, Dmitry Golubets
> wrote:
>
>> Hi Timo,
>>
>> I don't have any key to join on, so I'm
> your own operator. That depends on your use case though.
>
> You can maintain backpressure by using Flink's operator state. But did you
> also thought about a Window Join instead?
>
> I hope that helps.
>
> Timo
>
>
>
>
> Am 17/01/17 um 00:20 s
Hi,
there are only *two* interfaces defined at the moment:
*OneInputStreamOperator*
and
*TwoInputStreamOperator.*
Is there any way to define an operator with arbitrary number of inputs?
Another concern is how to maintain *backpressure* in the operator.
Let's say I read events from two Kafka s
a is
> serialized into a fixed number of buffers instead of being put on the JVM
> heap.
>
> Best, Fabian
>
> 2017-01-16 14:21 GMT+01:00 Dmitry Golubets :
>
>> Hi Ufuk,
>>
>> Do you know what's the reason for serialization of data between different
>>
Hi Ufuk,
Do you know the reason for serializing data between different
threads?
Also, thanks for the link!
Best regards,
Dmitry
On Mon, Jan 16, 2017 at 1:07 PM, Ufuk Celebi wrote:
> Hey Dmitry,
>
> this is not possible if I'm understanding you correctly.
>
> A task chain is execut
Hi,
Let's say we have multiple subtask chains and all of them are executing in
the same task manager slot (i.e. in the same JVM).
What's the point in serializing data between them?
Can it be disabled?
The reason I want to keep different chains is that some subtasks should be
executed in parallel to
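As a side note (a sketch of the relevant knobs, not a full answer): records that stay inside one operator chain are passed as object references and not serialized; serialization happens at chain boundaries. Two settings worth knowing:

```scala
import org.apache.flink.streaming.api.scala._

val env = StreamExecutionEnvironment.getExecutionEnvironment
// Skip defensive copies between chained operators -- safe only if your
// functions never cache or mutate input records after emitting them.
env.getConfig.enableObjectReuse()
// Conversely, chaining can be controlled per operator, e.g.:
// someStream.map(f).startNewChain()
```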