Hi everyone,
I want to develop a custom task scheduler in Flink. Specifically, I want to
control the scheduling of the components of a topology onto specific hosts.
Where should I start? Is there an example program or tutorial about this?
Thanks!
Best regards.
Shuhao Zhang (Tony)
The output is the timestamps of the events as strings. (For convenience,
the payload of each event is exactly its timestamp.) As soon as the folding
of a time window finishes, the code prints "# events in this window" to
mark the end of the window.
The 10s windows should be [19:10:40, 1
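For context, a minimal sketch of what such a test job could look like (this
is a reconstruction, not the code from this thread; the socket source, the
window size, and the job name are assumptions):

    import org.apache.flink.api.common.functions.FoldFunction;
    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.windowing.time.Time;

    public class WindowFoldTest {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

            // Each line read from the socket is an event whose payload is
            // its own timestamp string.
            env.socketTextStream("localhost", 9999)
                .timeWindowAll(Time.seconds(10))
                // Concatenate the events of one window...
                .fold("", new FoldFunction<String, String>() {
                    @Override
                    public String fold(String acc, String event) {
                        return acc + event + "\n";
                    }
                })
                // ...and append the end-of-window marker.
                .map(new MapFunction<String, String>() {
                    @Override
                    public String map(String window) {
                        return window + "# events in this window";
                    }
                })
                .print();

            env.execute("window-fold-test");
        }
    }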
Thank you Jamie and Ufuk both for such helpful answers!
I will continue to explore my options and eagerly await out-of-the-box
Mesos support.
Ryan
On Mon, Jul 4, 2016 at 5:05 AM Ufuk Celebi wrote:
> On Fri, Jul 1, 2016 at 3:41 PM, Ryan Crumley wrote:
> > Questions:
> > 1. Is this a viable app
Could you please elaborate a bit on what exactly the output means and how
you derive that events are leaking into the previous window?
On Mon, 4 Jul 2016 at 13:20 Yukun Guo wrote:
> Thanks for the information. Strange enough, after I set the time
> characteristic to EventTime, the events are lea
I have sent you my code in a separate email; I hope you can solve my issue.
Thanks a lot
Biplob
Thanks for the information. Strangely enough, after I set the time
characteristic to EventTime, the events are leaking into the previous
window:
...
Mon, 04 Jul 2016 19:10:49 CST
Mon, 04 Jul 2016 19:10:50 CST # ?
Mon, 04 Jul 2016 19:10:50 CST
Mon, 04 Jul 2016 19:10:50 CST
Mon, 04 Jul 2016 19:10:50 CST
Can you share the complete program with me? Then I would look into it.
It could be that you are counting in a wrong way. The iteration should
definitely consume all of the initial input at least once.
On Mon, Jul 4, 2016 at 12:07 PM, Biplob Biswas wrote:
> Can anyone check this once, and help me out with this
Can anyone check this once, and help me out with this?
I would be really obliged.
--
View this message in context:
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Data-point-goes-missing-within-iteration-tp7776p7795.html
Sent from the Apache Flink User Mailing List archiv
Sorry, I wanted to write Kryo, but I'm on my mobile.
On 4 Jul 2016 12:34 p.m., "Flavio Pompermaier" wrote:
> Because I don't see any good reason for that...maybe also all keyo
> serialization errors that I have from time to time could be symptomatic of
> some other error in how Flink manage t
Because I don't see any good reason for that... maybe also all the keyo
serialization errors that I get from time to time could be symptomatic of
some other error in how Flink manages the internal buffers... but this
is just another personal guess of mine.
On 4 Jul 2016 12:29 p.m., "Ufuk Celebi"
I also have a lot of use cases where caching a dataset would definitely be
useful... maybe using Alluxio (the new name of Tachyon) and writing
intermediate results to an in-memory fs could be better than re-reading
the input source over and over for the moment... What do you think?
On 4 Jul 2016 12:25
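To make the caching idea concrete, a sketch under some assumptions: the
alluxio:// paths presume the Alluxio (Tachyon) filesystem client is on
Flink's classpath and configured, and the filter stands in for an expensive
pipeline:

    import org.apache.flink.api.common.functions.FilterFunction;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.core.fs.FileSystem;

    public class CacheIntermediate {
        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

            // Stand-in for an expensive pipeline whose result we want to reuse.
            DataSet<String> expensive = env.readTextFile("hdfs:///input")
                .filter(new FilterFunction<String>() {
                    @Override
                    public boolean filter(String line) {
                        return !line.isEmpty();
                    }
                });

            // Materialize the intermediate result once into the in-memory fs...
            expensive.writeAsText("alluxio:///cache/intermediate",
                FileSystem.WriteMode.OVERWRITE);
            env.execute("materialize intermediate result");

            // ...then later jobs re-read it instead of recomputing.
            DataSet<String> cached = env.readTextFile("alluxio:///cache/intermediate");
            cached.first(10).print();
        }
    }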
It's not possible to tell. You would have to look into the logs of the
job manager to check what happened. The task manager that was not killed
could have re-connected to the job manager if it was restarted quickly
after the failure. Why do you think that the task manager would
influence the job result though
Nested iterations are not supported via a "native iteration" operator.
There is no way to avoid the for loop at the moment.
I think it's not possible to tell why the results are wrong from the
code snippet. How do you propagate the counts back? In general, I
expect this program to perform very badly
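To illustrate the for-loop workaround, a rough sketch (the data, the step
function, and the round counts are placeholders, not taken from this
thread); each pass appends another native iterate() to the plan
sequentially, since nesting them is not supported:

    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.operators.IterativeDataSet;

    public class UnrolledOuterLoop {
        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
            DataSet<Long> current = env.fromElements(1L, 2L, 3L);

            // Outer "iteration" unrolled into the plan with a plain for loop.
            for (int round = 0; round < 3; round++) {
                IterativeDataSet<Long> loop = current.iterate(10); // inner native iteration
                DataSet<Long> stepped = loop.map(new MapFunction<Long, Long>() {
                    @Override
                    public Long map(Long value) {
                        return value + 1; // placeholder step function
                    }
                });
                current = loop.closeWith(stepped);
            }
            current.print();
        }
    }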
No, I haven't.
I fear that the unkilled task manager could have been the cause of this
problem. The last time I ran the job, I discovered that on some nodes there
were zombie task managers that weren't terminated during stop-cluster.
What do you think? What happens in this situation? Old task managers are
On Wed, Jun 29, 2016 at 9:19 PM, Bajaj, Abhinav wrote:
> Is there a plan to add the job id or name to the logs?
This is now part of the YARN client output and should be part of the
1.1 release.
Regarding your other question: in standalone mode, you have to
manually make sure to not submit mult
Judging from the stack trace the state should be part of the operator
state and not the partitioned RocksDB state. If you have implemented
the Checkpointed interface anywhere, that would be a good place to
pinpoint the anonymous class. Is it possible to share the job code?
– Ufuk
On Fri, Jul 1, 2
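For reference, a named (non-anonymous) implementation of the Checkpointed
interface in the 1.0.x API might look like the sketch below; the counting
logic is purely illustrative:

    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.streaming.api.checkpoint.Checkpointed;

    // Giving the function a proper name (instead of an anonymous class)
    // makes it easy to pinpoint which operator state fails in a stack trace.
    public class CountingMap extends RichMapFunction<String, String>
            implements Checkpointed<Long> {

        private long count;

        @Override
        public String map(String value) {
            count++;
            return value;
        }

        @Override
        public Long snapshotState(long checkpointId, long checkpointTimestamp) {
            return count; // stored as non-partitioned operator state
        }

        @Override
        public void restoreState(Long state) {
            count = state;
        }
    }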
I guess Aljoscha was referring to whether you also have broadcast
input or something like it?
On Fri, Jul 1, 2016 at 7:05 PM, Flavio Pompermaier wrote:
> what do you mean exactly?
>
> On 1 Jul 2016 18:58, "Aljoscha Krettek" wrote:
>>
>> Hi,
>> do you have any data in the coGroup/groupBy operat
On Fri, Jul 1, 2016 at 3:41 PM, Ryan Crumley wrote:
> Questions:
> 1. Is this a viable approach? Any pitfalls to be aware of?
The major pitfall would be future migrations as outlined by Jamie.
> 2. What is the correct term for this deployment mode? Single node
> standalone? Local?
I would say s
If you just re-submit the job without a savepoint, the Kafka consumer
will by default start processing from the latest offset and the
operators will be in an empty state. It should be possible to add a
feature to Flink which allows turning the latest checkpoint into a
savepoint, from which you then
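As a sketch of the manual workflow in the meantime (topic, group id, and
paths are placeholders; FlinkKafkaConsumer09 assumes the Kafka 0.9
connector):

    import java.util.Properties;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09;
    import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

    public class KafkaJob {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
            env.enableCheckpointing(5000); // checkpoints are discarded on cancel

            Properties props = new Properties();
            props.setProperty("bootstrap.servers", "localhost:9092");
            props.setProperty("group.id", "my-group");

            env.addSource(new FlinkKafkaConsumer09<>(
                    "events", new SimpleStringSchema(), props))
                .print();

            // To keep offsets and state across a re-submit, trigger a
            // savepoint first (bin/flink savepoint <jobId>) and resume
            // with bin/flink run -s <savepointPath>.
            env.execute("kafka job");
        }
    }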
Hi,
I have a Flink program which outputs wrong results once I set the
parallelism to a value larger than 1.
If I run the program with parallelism 1, everything works fine.
The algorithm works on one input dataset, which is iteratively
split until the desired output split size is reached
You can also have a look at the YARN client logs, which should print which
JARs are uploaded. The container logs should also log the class path.
On Sun, Jul 3, 2016 at 6:04 PM, Jamie Grier wrote:
> Hi Bruce,
>
> I just spun up an EMR cluster and tried this out. Hadoop 2.7.2 and Flink
> 1.0.3.
Hi,
I think it should be as simple as setting event time as the stream time
characteristic:
env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)
The problem is that .timeWindow(Time.seconds(10)) will use processing time
if you don't specify a time characteristic. You can enforce using an
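A minimal event-time sketch along those lines (using the 1.1-style
watermark assigner; the socket source and the millisecond-timestamp payload
are assumptions):

    import org.apache.flink.api.common.functions.FoldFunction;
    import org.apache.flink.streaming.api.TimeCharacteristic;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.AssignerWithPeriodicWatermarks;
    import org.apache.flink.streaming.api.watermark.Watermark;
    import org.apache.flink.streaming.api.windowing.time.Time;

    public class EventTimeWindows {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
            env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

            env.socketTextStream("localhost", 9999)
                // Payload of each event is assumed to be its timestamp in ms.
                .assignTimestampsAndWatermarks(
                    new AssignerWithPeriodicWatermarks<String>() {
                        private long maxTimestamp = Long.MIN_VALUE;

                        @Override
                        public long extractTimestamp(String element,
                                long previousElementTimestamp) {
                            long ts = Long.parseLong(element);
                            maxTimestamp = Math.max(maxTimestamp, ts);
                            return ts;
                        }

                        @Override
                        public Watermark getCurrentWatermark() {
                            return new Watermark(maxTimestamp);
                        }
                    })
                .timeWindowAll(Time.seconds(10))
                // Count the events per 10s event-time window.
                .fold(0L, new FoldFunction<String, Long>() {
                    @Override
                    public Long fold(Long count, String event) {
                        return count + 1;
                    }
                })
                .print();

            env.execute("event-time windows");
        }
    }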