Re: unsubscribe

2023-01-15 Thread Cristian Constantinescu
Please email user-unsubscr...@flink.apache.org as described here: https://flink.apache.org/community.html On Sun, Jan 15, 2023 at 4:31 AM Saver Chia wrote: > unsubscribe >

Re: unsubscribe

2023-01-15 Thread Cristian Constantinescu
Please email user-unsubscr...@flink.apache.org as described here: https://flink.apache.org/community.html On Sun, Jan 15, 2023 at 8:55 AM jay green wrote: > unsubscribe >

Re: Basic questions about resuming stateful Flink jobs

2022-02-16 Thread Cristian Constantinescu
them to the PipelineOptions once you build that object from args. I've never used the Flink libs, just the runner, but from [1] and [3] it looks like you can configure things in code if you prefer that. Hope it helps, Cristian [1] https://beam.apache.org/documentation/runners/flink/ [2] ht

EOS from checkpoints doesn't seem to work

2022-01-28 Thread Cristian Constantinescu
ure out what's going on? I'm sure it's a user mistake, but I'm unsure how to debug it. Cheers, Cristian

[Flink SQL] Insert query fails for partitioned table

2021-02-01 Thread Cristian Cioriia
ver:port’, 'properties.group.id' = 'GroupId', 'scan.startup.mode' = 'earliest-offset', 'format' = 'json' ); [5] [ERROR] Could not execute SQL statement. Reason: org.apache.flink.sql.parser.impl.ParseException: Encountered "METADATA&

info about flinkml

2020-09-14 Thread Cristian Lorenzetto
Hi, I'm evaluating adopting Flink instead of Spark as a data mining processor. I knew of FlinkML for this purpose, but I can't find it in the latest release. Why? Can you suggest the best way forward? -- Cristian Lorenzetto Direzione ICT e Agenda Digitale U.O. Demand, Progettazione e Sviluppo Software Tel: 041

Re: Checkpoint metadata deleted by Flink after ZK connection issues

2020-09-08 Thread Cristian
> be cleaned up. I think you need to check the job restart strategy you have > set. For example, the following > configuration will make the Flink cluster terminated after 10 attempts. > > restart-strategy: fixed-delay > restart-strategy.fixed-delay.attempts: 10 > > > Bes

Re: Checkpoint metadata deleted by Flink after ZK connection issues

2020-09-08 Thread Cristian
near. Flinkpocalypse! On Tue, Sep 8, 2020, at 5:54 AM, Robert Metzger wrote: > Thanks a lot for reporting this problem here Cristian! > > I am not super familiar with the involved components, but the behavior you > are describing doesn't sound right to me. > Which entrypoint ar

Re: Checkpoint metadata deleted by Flink after ZK connection issues

2020-09-07 Thread Cristian
That's an excellent question. I can't explain that. All I know is this: - the job was upgraded and resumed from a savepoint - After hours of working fine, it failed (like it shows in the logs) - the Metadata was cleaned up, again as shown in the logs - because I run this in Kubernetes, the conta

Re: Checkpoint metadata deleted by Flink after ZK connection issues

2020-09-05 Thread Cristian
b from a checkpoint using - - fromSavepoint? On Sat, Sep 5, 2020, at 2:05 AM, Husky Zeng wrote: > > Hi Cristian, > > From this code , we could see that the Exception or Error was ignored in > dispatcher.shutDownCluster(applicationStatus) . > > `` > org.apache.flink.runt

Re: Checkpoint metadata deleted by Flink after ZK connection issues

2020-09-04 Thread Cristian
we would have been in serious trouble. On Fri, Sep 4, 2020, at 9:12 PM, Qingdong Zeng wrote: > Hi Cristian, > > In the log, we can see it went to the method > shutDownAsync(applicationStatus,null,true); > > `` > 2020-09

Checkpoint metadata deleted by Flink after ZK connection issues

2020-09-04 Thread Cristian
Hello guys. We run a stand-alone cluster that runs a single job (if you are familiar with the way Ververica Platform runs Flink jobs, we use a very similar approach). It runs Flink 1.11.1 straight from the official docker image. Usually, when our jobs crash for any reason, they will resume from

Re: Dump snapshot of big table in real time using StreamingFileSink

2019-01-18 Thread Cristian C
ay, maybe Kostas can add more since he wrote the StreamingFileSink >> code. I've cc'd him directly. >> >> -Jamie >> >> >> On Fri, Jan 18, 2019 at 9:44 AM Cristian C >> wrote: >> >>> Well, the problem is that, conceptually, the way I'

Re: Dump snapshot of big table in real time using StreamingFileSink

2019-01-18 Thread Cristian C
Well, the problem is that, conceptually, the way I'm trying to approach this is ok. But in practice, it has some edge cases. So back to my original premise: if both the trigger and the checkpoint happen around the same time, there is a chance that the streaming file sink rolls the bucket BEFORE it ha

Get watermark metric as a delta of current time

2019-01-14 Thread Cristian
Hello. Flink emits watermark metrics (currentWatermark) as a Unix timestamp, which is useful in some context but troublesome for others. For instance, when sending data to Datadog, there is no way to meaningfully see or act upon this metric, because there is no support for timestamps. A more u
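A minimal sketch of the delta the message asks about, assuming the currentWatermark value is a Unix timestamp in milliseconds and that Long.MIN_VALUE means no watermark has been emitted yet. The function name watermark_lag_ms is hypothetical; it is not part of Flink or Datadog.

```python
import time
from typing import Optional

# Assumption: Long.MIN_VALUE is reported before any watermark has been emitted.
NO_WATERMARK = -(2**63)


def watermark_lag_ms(current_watermark_ms: int,
                     now_ms: Optional[int] = None) -> Optional[int]:
    """Return how far the watermark trails wall-clock time, in milliseconds."""
    if current_watermark_ms == NO_WATERMARK:
        return None  # no watermark yet, so there is no meaningful lag
    if now_ms is None:
        now_ms = int(time.time() * 1000)  # current wall-clock time in ms
    return now_ms - current_watermark_ms
```

A gauge built on such a delta stays small while the job keeps up and grows when event time falls behind, which is easier to alert on than a raw timestamp.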

Re: Flink application does not scale as expected, please help!

2018-06-18 Thread Ovidiu-Cristian MARCU
Hi all, Allow me to add some comments/questions on this issue that is very interesting. According to documentation [1] the pipeline example assumes the source is running with the same parallelism as successive map operator and the workflow optimizes to collocate source and map tasks if possible.

Re: parallelism for window operations

2017-01-27 Thread Ovidiu-Cristian MARCU
10:43, Ovidiu-Cristian MARCU > wrote: > > Thank you, Fabian! > > It works, what I did and results, as an example for other users: > Total slots occupied are 7 (not sure how to check that Source + Flat Map are > in the same slot, assumed slot S1 will be that; also

Re: Monitoring REST API

2016-12-21 Thread Ovidiu-Cristian MARCU
Hi Lydia, I have used sar monitoring (sar -u -n DEV -p -d -r 1) and plotted the average over multiple nodes. 1)So for each node you can collect the sar output, and obtain for example: Linux 3.2.0-4-amd64 (parasilo-4.rennes.grid5000.fr) 2016-01-27 _x86_64_(16 CPU) 12:54:09

Re: Parameters to Control Intra-node Parallelism

2016-07-13 Thread Ovidiu-Cristian MARCU
s case) > than what's suggested in Flink (#slots-per-TM^2 * #TMs * 4, which would be > 12*12*32*4 = 18432). Otherwise, it would throw me the not enough buffers > error. > > Thank you, > Saliya > > > > On Tue, Jul 12, 2016 at 7:39 AM, Ovidiu-Cristian MARCU &
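The arithmetic quoted in this message can be checked with a short sketch. The formula (#slots-per-TM^2 * #TMs * 4) and the values 12 and 32 come from the message itself; the function name suggested_network_buffers is hypothetical.

```python
def suggested_network_buffers(slots_per_tm: int, num_tms: int) -> int:
    """Apply the rule of thumb quoted above: slots-per-TM^2 * TMs * 4."""
    return slots_per_tm ** 2 * num_tms * 4


# Values from the message: 12 slots per task manager, 32 task managers.
buffers = suggested_network_buffers(12, 32)
print(buffers)
```

With 12 slots and 32 task managers this gives 12*12*32*4 = 18432, matching the figure in the message.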

Re: Parameters to Control Intra-node Parallelism

2016-07-12 Thread Ovidiu-Cristian MARCU
Hi, Can you post your configuration parameters (exclude default settings) and cluster description? Best, Ovidiu > On 11 Jul 2016, at 17:49, Saliya Ekanayake wrote: > > Thank you Greg, I'll check if this was the cause for my TMs to disappear. > > On Mon, Jul 11, 2016 at 11:34 AM, Greg Hogan <

Re: Optimizations not performed - please confirm

2016-06-29 Thread Ovidiu-Cristian MARCU
s are done in the > Table API/SQL that will be be released in an updated version in 1.1. > > Cheers, > Aljoscha > > +Timo, Explicitly adding Timo > > On Tue, 28 Jun 2016 at 21:41 Ovidiu-Cristian MARCU > mailto:ovidiu-cristian.ma...@inria.fr>> > wrote: > Hi

Optimizations not performed - please confirm

2016-06-28 Thread Ovidiu-Cristian MARCU
Hi, The optimizer internals described in this document [1] are probably not up-to-date. Can you please confirm if this is still valid: “The following optimizations are not performed Join reordering (or operator reordering in general): Joins / Filters / Reducers are not re-ordered in Flink. This

Re: Flink Version 1.1

2016-05-18 Thread Ovidiu-Cristian MARCU
Hi, We are also very interested in the future SQL (SQL on Streaming) support in the next release (even if it is partial work that works :) ) Thank you! Best, Ovidiu > On 18 May 2016, at 14:42, Stephan Ewen wrote: > > Hi! > > That question is coming up more and more. > I think we should start

What / Where / When / How questions in Spark 2.0 ?

2016-05-16 Thread Ovidiu-Cristian MARCU
Hi, We can see in [2] many interesting (and expected!) improvements (promises) like extended SQL support, unified API (DataFrames, DataSets), improved engine (Tungsten relates to ideas from modern compilers and MPP databases - similar to Flink [3]), structured streaming etc. It seems we somehow

Hash tables - joins, cogroup, deltaIteration

2016-04-18 Thread Ovidiu-Cristian MARCU
Hi, Can you please confirm if there is any update regarding the hash tables use cases, as in [1] it is specified that Hash tables are used in Joins and for the Solution set in iterations (pending work to use them for grouping/aggregations)? I am interested in the pending work progress and also

Re: Flink performance pre-packaged vs. self-compiled

2016-04-14 Thread Ovidiu-Cristian MARCU
Hi, Your assumption about the TeraSort use case for eastcirclek's implementation may be incorrect. How many times did you run your program? It would be helpful to give more details about your experiment, in terms of configuration and dataset size. Best, Ovidiu > On 14 Apr 2016, at 17:14, Rob

Re: Not enough free slots to run the job

2016-03-21 Thread Ovidiu-Cristian MARCU
: It depends. In the example above, the job would restart. As long as there > are enough slots available, jobs will restart. > > > On Mon, Mar 21, 2016 at 3:30 PM, Ovidiu-Cristian MARCU > mailto:ovidiu-cristian.ma...@inria.fr>> > wrote: > Hi Robert, > > I am not

Re: Not enough free slots to run the job

2016-03-21 Thread Ovidiu-Cristian MARCU
remaining slots. > That's why the spare slots approach is currently the only way to go. > > Regards, > Robert > > On Fri, Mar 18, 2016 at 1:30 PM, Ovidiu-Cristian MARCU > mailto:ovidiu-cristian.ma...@inria.fr>> > wrote: > Hi, > > For the situation w

Re: off-heap size feature request

2016-03-19 Thread Ovidiu-Cristian MARCU
> the parameters to configure the amount of managed memory > (taskmanager.memory.size, taskmanager.memory.fraction) are valid for on and > off-heap memory. > > Have you tried these parameters and didn't they work as expected? > > Best, Fabian > >

Re: off-heap size feature request

2016-03-18 Thread Ovidiu-Cristian MARCU
the overall process size will be roughly > 4GB. The parameter name "taskmanager.heap.mb" is a bit confusing in case of > off-heap memory usage, because it does not define this size of the heap but > of the overall process. > > Hope this helps, > Fabian > > > > 2016

Not enough free slots to run the job

2016-03-18 Thread Ovidiu-Cristian MARCU
Hi, For the situation where a program specifies a maximum parallelism (so it is supposed to use all available task slots), it is possible that one of the task managers is not registered for various reasons. In this case the job will fail for not enough free slots to run the job. For m

off-heap size feature request

2016-03-16 Thread Ovidiu-Cristian MARCU
Hi, Is it possible to add a parameter off-heap.size for the task manager off-heap memory [1]? It is not possible to limit the off-heap memory size, at least I found nothing in the documentation. [1] https://ci.apache.org/projects/flink/flink-docs-release-1.0/setup/config.html#managed-memory

Re: Memory ran out PageRank

2016-03-16 Thread Ovidiu-Cristian MARCU
ely. Can you confirm this? > > The solution set for delta iterations is currently implemented as an > in-memory hash table that works on managed memory segments, but is not > spillable. > > – Ufuk > > On Mon, Mar 14, 2016 at 6:30 PM, Ovidiu-Cristian MARCU > wrote: >

Re: Memory ran out PageRank

2016-03-14 Thread Ovidiu-Cristian MARCU
Correction: successfully CC I am running is on top of your friend, Spark :) Best, Ovidiu > On 14 Mar 2016, at 20:38, Ovidiu-Cristian MARCU > wrote: > > Yes, largely different. I was expecting for the solution set to be spillable. > This is somehow very hard limitation, the lay

Re: Memory ran out PageRank

2016-03-14 Thread Ovidiu-Cristian MARCU
h table that works on managed memory segments, but is not > spillable. > > – Ufuk > > On Mon, Mar 14, 2016 at 6:30 PM, Ovidiu-Cristian MARCU > wrote: >> >> This problem is surprising as I was able to run PR and CC on a larger graph >> (2bil edges) but with thi

Re: Memory ran out PageRank

2016-03-14 Thread Ovidiu-Cristian MARCU
that? > > Cheers, > Martin > > > On 14.03.2016 17:55, Ovidiu-Cristian MARCU wrote: >> Thank you for this alternative. >> I don’t understand how the workaround will fix this on systems with limited >> memory and maybe larger graph. >> >> Running Conn

Re: Memory ran out PageRank

2016-03-14 Thread Ovidiu-Cristian MARCU
x/flink-user/201508.mbox/%3CCAELUF_ByPAB%2BPXWLemPzRH%3D-awATeSz4sGz4v9TmnvFku3%3Dx3A%40mail.gmail.com%3E > > On 14.03.2016 16:55, Ovidiu-Cristian MARCU wrote: >> Hi, >> >> While running PageRank on a synthetic graph I run into this problem: >&

Memory ran out PageRank

2016-03-14 Thread Ovidiu-Cristian MARCU
Hi, While running PageRank on a synthetic graph I run into this problem: Any advice on how should I proceed to overcome this memory issue? IterationHead(Vertex-centric iteration (org.apache.flink.graph.library.PageRank$VertexRankUpdater@7712cae0 | org.apache.flink.graph.library.PageRank$RankMe

Re: Batch Processing Fault Tolerance (DataSet API)

2016-02-22 Thread Ovidiu-Cristian MARCU
d or exist as a PR [1]. So we hope to complete the partial > backtracking soon. > > [1] https://github.com/apache/flink/pull/640 > > Cheers, > Till > > On Mon, Feb 22, 2016 at 6:00 PM, Ovidiu-Cristian MARCU >

Batch Processing Fault Tolerance (DataSet API)

2016-02-22 Thread Ovidiu-Cristian MARCU
Hi In case of failure of a node what does it mean 'Fault tolerance for programs in the DataSet API works by retrying failed executions’ [1] ? -work already done by the rest of the nodes is not lost, only work of the lost node is recomputed, job execution will continue or -entire job execution is

Re: Apache Flink Web Dashboard - Completed Job history

2015-12-16 Thread Ovidiu-Cristian MARCU
correct me if I am wrong. > > -Matthias > > On 12/16/2015 03:16 PM, Ufuk Celebi wrote: >> >>> On 16 Dec 2015, at 15:00, Ovidiu-Cristian MARCU >>> wrote: >>> >>> Hi >>> >>> If I restart the Flink I don’t see anymore the hi

Apache Flink Web Dashboard - Completed Job history

2015-12-16 Thread Ovidiu-Cristian MARCU
Hi, If I restart Flink, I no longer see the history of the completed jobs. Is this a missing feature, or what should I do to see the completed job list history? Best regards, Ovidiu

Features with major priority/future release/s

2015-12-07 Thread Ovidiu-Cristian MARCU
Hi, Can you try to describe what is planned for the future releases and eventually link the Jira issues/bugs to it? Some very important features have a Major priority, like: [1] Add a SQL API (on top of Table API) [2] Add KMeans clustering algorithm to ML Library (kmeans ++ & ||) [3] Create eva

Re: flink connectors

2015-11-27 Thread Ovidiu-Cristian MARCU
Hi, The main question here is why the distribution release doesn’t contain the connector dependencies. It is fair to say that it does not have to (which connector to include or all). So just like Spark does, Flink offers binary distribution for hadoop only without considering other dependencies

Re: Apache Flink on Hadoop YARN using a YARN Session

2015-11-20 Thread Ovidiu-Cristian MARCU
asticity you mentioned. > > Yes, resource elasticity in Flink will mitigate such issues. We would be able > to respond to YARN's preemption requests if jobs with higher priorities are > requesting additional resources. > > On Fri, Nov 20, 2015 at 2:07 PM, Ovidiu-Cristian

Re: Apache Flink on Hadoop YARN using a YARN Session

2015-11-20 Thread Ovidiu-Cristian MARCU
'll > see what we can do. > > Regards, > Robert > > > > On Fri, Nov 20, 2015 at 1:24 PM, Ovidiu-Cristian MARCU > mailto:ovidiu-cristian.ma...@inria.fr>> > wrote: > Hi, > > The link to FAQ > (https://ci.apache.org/projects/flink/f

Re: Apache Flink on Hadoop YARN using a YARN Session

2015-11-20 Thread Ovidiu-Cristian MARCU
> > In general, we recommend to start a YARN session per program. You can also > directly submit a Flink program to YARN. > > Where did you find the link to the FAQ? The link on the front page is > working: http://flink.apache.org/faq.html <http://flink.apache.org/faq.html

Apache Flink on Hadoop YARN using a YARN Session

2015-11-20 Thread Ovidiu-Cristian MARCU
Hi, I am currently interested in experimenting with Flink on Hadoop YARN. I am working from the documentation we have here: https://ci.apache.org/projects/flink/flink-docs-release-0.10/setup/yarn_setup.html

Re: Creating a representative streaming workload

2015-11-16 Thread Ovidiu-Cristian MARCU
Regarding Flink vs Spark / Storm you can check here: http://www.sparkbigdata.com/102-spark-blog-slim-baltagi/14-results-of-a-benchmark-between-apache-flink-and-apache-spark