Strange behavior on filter, group and reduce DataSets

2018-03-16 Thread simone
jsonSubjects.writeAsText("/tmp/subjects/", WriteMode.OVERWRITE); //env.execute("JSON generation"); What is the problem? Did I make some mistake in the filtering, grouping or reducing logic? Thanks in advance, Simone.

Re: Strange behavior on filter, group and reduce DataSets

2018-03-16 Thread simone
Sorry, I translated the code into pseudocode too fast. That is indeed an equals. On 16/03/2018 15:58, Kien Truong wrote: Hi, Just a guess, but string comparison in Java should use the equals method, not the == operator. Regards, Kien On 3/16/2018 9:47 PM, simone wrote: subject.getField
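Kien's point is easy to demonstrate in isolation (plain Java, no Flink dependency; the string values are made up for illustration):

```java
public class StringCompare {
    public static void main(String[] args) {
        // Two strings with identical content but distinct object identities.
        String a = new String("someValue");
        String b = new String("someValue");

        System.out.println(a == b);      // false: == compares references
        System.out.println(a.equals(b)); // true: equals compares content
    }
}
```

This is why a filter written with `==` can silently drop every record: each deserialized field is a fresh `String` instance.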

Re: Strange behavior on filter, group and reduce DataSets

2018-03-19 Thread simone
Hi Fabian, This simple code reproduces the behavior -> https://github.com/xseris/Flink-test-union Thanks, Simone. On 19/03/2018 15:44, Fabian Hueske wrote: Hmmm, I still don't see the problem. IMO, the result should be correct for both plans. The data is replicated, filtered, redu

Re: Strange behavior on filter, group and reduce DataSets

2018-03-21 Thread simone
Hi all, an update: following Stephan's directives on how to diagnose the issue, making Person immutable, the problem does not occur. Simone. On 20/03/2018 20:20, Stephan Ewen wrote: To diagnose that, can you please check the following:   - Change the Person data type to be immutable (final
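A minimal sketch of the immutable shape Stephan suggested (the field names here are assumptions, not taken from the thread; the point is final fields and copy-on-write updates, so that object reuse in the runtime cannot mutate instances shared between operators):

```java
// Immutable value type: all fields final, no setters, state fixed at construction.
public final class Person {
    private final String name;
    private final int age;

    public Person(String name, int age) {
        this.name = name;
        this.age = age;
    }

    public String getName() { return name; }
    public int getAge() { return age; }

    // A "mutation" returns a new instance instead of changing this one.
    public Person withAge(int newAge) {
        return new Person(name, newAge);
    }
}
```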

Re: Strange behavior on filter, group and reduce DataSets

2018-03-26 Thread simone
Hi Fabian, any update on this? Did you fix it? Best, Simone. On 22/03/2018 00:24, Fabian Hueske wrote: Hi, That was a bit too early. I found an issue with my approach. Will come back once I solved that. Best, Fabian 2018-03-21 23:45 GMT+01:00 Fabian Hueske <mailto:fhue...@gmail.

Problem with Kafka Consumer

2017-05-16 Thread simone
: messageStream.keyBy(0).windowAll(GlobalWindows.create()).trigger(PurgingTrigger.of(CountTrigger.of(N))).apply(new RowToQuery()); How is it possible? Are there any properties on the Consumer to be set in order to process more data? Thanks, Simone.

Re: Problem with Kafka Consumer

2017-05-16 Thread simone
calling an external service. Thanks for your help, Simone. On 16/05/2017 14:01, Kostas Kloudas wrote: Hi Simone, I suppose that you use messageStream.keyBy(…).window(…) right? .windowAll() is not applicable to keyedStreams. Some follow up questions are: In your logs, do you see any error

Re: Problem with Kafka Consumer

2017-05-18 Thread simone
Hi Kostas, As suggested I switched to version 1.3-SNAPSHOT and the project ran without any problem. I will keep you informed if any other issue occurs. Thanks again for the help. Cheers, Simone. On 16/05/2017 16:36, Kostas Kloudas wrote: Hi Simone, Glad I could help ;) Actually it would

Compilation error while instancing FlinkKafkaConsumer082

2016-02-10 Thread Simone Robutti
Hello, the compiler has been raising an error since I added this line to the code val testData=streamEnv.addSource(new FlinkKafkaConsumer082[String]("data-input",new SimpleStringSchema(),kafkaProp)) Here is the error: Error:scalac: Class org.apache.flink.streaming.api.checkpoint.CheckpointN

Java Maps and Type Information

2016-03-01 Thread Simone Robutti
Hello, to my knowledge it is not possible to use a java.util.Map, for example, in a FlatMapFunction. Is that correct? It gives a type error at runtime and it doesn't work even with explicit TypeInformation hints. Is there any way to make it work? Thanks, Simone
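The root cause here is Java type erasure: at runtime a `java.util.Map<String, Integer>` instance is just a raw `HashMap`, so the type arguments cannot be recovered by reflection on the instance alone. The anonymous-subclass trick below — the same mechanism behind Flink's `TypeHint` — captures them in the generic superclass signature (plain Java sketch, no Flink dependency):

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.HashMap;

public class ErasureDemo {
    public static void main(String[] args) {
        // The type arguments of a plain instance are erased: only the raw class survives.
        HashMap<String, Integer> plain = new HashMap<>();
        System.out.println(plain.getClass()); // no <String, Integer> here

        // An anonymous subclass records the actual arguments in its superclass signature.
        HashMap<String, Integer> captured = new HashMap<String, Integer>() {};
        ParameterizedType t = (ParameterizedType) captured.getClass().getGenericSuperclass();
        for (Type arg : t.getActualTypeArguments()) {
            System.out.println(arg.getTypeName()); // java.lang.String, java.lang.Integer
        }
    }
}
```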

Re: Java Maps and Type Information

2016-03-01 Thread Simone Robutti
42 GMT+01:00 Aljoscha Krettek : > Hi, > what kind of program are you writing? I just wrote a quick example using > the DataStream API where I’m using Map> as > the output type of one of my MapFunctions. > > Cheers, > Aljoscha > > On 01 Mar 2016, at 16:33, Simone Robutti &

Re: Java Maps and Type Information

2016-03-02 Thread Simone Robutti
able method found for returns(java.lang.Class) method Should I open an issue? 2016-03-01 21:45 GMT+01:00 Simone Robutti : > I tried to simplify it to the bones but I'm actually defining a custom > MapFunction,java.util.Map> that > even with a simple identity function fails at runtime givi

Re: Flink Checkpoint on yarn

2016-03-19 Thread Simone Robutti
the "flink run" process alive so that may be the problem. We just noticed a few minutes ago. If the problem persists, I will eventually come back with a full log. Thanks for now, Simone 2016-03-16 18:04 GMT+01:00 Ufuk Celebi : > Hey Simone, > > from the logs it looks like mult

Re: Flink Checkpoint on yarn

2016-03-19 Thread Simone Robutti
e an idea why they are out of order? Maybe something > is mixed up in the way we gather the logs and we only think that > something is wrong because of this. > > > On Wed, Mar 16, 2016 at 6:11 PM, Simone Robutti > wrote: > > I didn't resubmitted the job. Also the jobs are

Re: Flink Checkpoint on yarn

2016-03-19 Thread Simone Robutti
's something wrong with my configuration because otherwise this really looks like a bug. Thanks in advance, Simone 2016-03-16 18:55 GMT+01:00 Simone Robutti : > Actually the test was intended for a single job. The fact that there are > more jobs is unexpected and it will be the first thin

Re: Flink Checkpoint on yarn

2016-03-19 Thread Simone Robutti
ts > > You can share the complete job manager log file as well if you like. > > – Ufuk > > On Wed, Mar 16, 2016 at 2:50 PM, Simone Robutti > wrote: > > Hello, > > > > I'm testing the checkpointing functionality with hdfs as a backend. > > >

Flink Checkpoint on yarn

2016-03-19 Thread Simone Robutti
Hello, I'm testing the checkpointing functionality with hdfs as a backend. For what I can see, it uses different checkpointing files and resumes the computation from different points, not from the latest available. This is to me an unexpected behaviour. I log every second, for every worker, a c

Connecting to a remote jobmanager - problem with Akka remote

2016-03-22 Thread Simone Robutti
Hello, we are trying to set up our system to do remote debugging through Intellij. Flink is running on a yarn long running session. We are launching Flink's CliFrontend with the following parameters: > run -m **::48252 /Users//Projects/flink/build-target/examples/batch/WordCount.jar The error r

Re: Flink ML 1.0.0 - Saving and Loading Models to Score a Single Feature Vector

2016-03-29 Thread Simone Robutti
To my knowledge there is nothing like that. PMML is not supported in any form and there's no custom saving format yet. If you really need a quick and dirty solution, it's not that hard to serialize the model into a file. 2016-03-28 17:59 GMT+02:00 Sourigna Phetsarath : > Flinksters, > > Is there

How to test serializability of a Flink job

2016-04-05 Thread Simone Robutti
to write tests that verify that a class or a job could be successfully serialized. Thanks, Simone
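A minimal sketch of such a test using only `java.io` (this checks plain Java serializability, which is necessary but not sufficient — Flink's closure cleaner and its own serialization stack do more, so treat this as a first-line unit test):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializabilityCheck {
    /** Serializes and deserializes an object, throwing if either step fails. */
    @SuppressWarnings("unchecked")
    public static <T extends Serializable> T roundTrip(T obj) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (T) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // A lambda cast to Serializable — the typical shape of a Flink user function.
        Runnable fn = (Runnable & Serializable) () -> System.out.println("ok");
        Runnable copy = (Runnable) roundTrip((Serializable) fn);
        copy.run();
    }
}
```

In a test suite one would call `roundTrip` on each user function and assert it does not throw.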

Explanation on limitations of the Flink Table API

2016-04-21 Thread Simone Robutti
ething. My requirement is to be able to read a CSV file and manipulate it reading the field names from the file and inferring data types. Thanks, Simone

Re: Explanation on limitations of the Flink Table API

2016-04-21 Thread Simone Robutti
again, Simone 2016-04-21 14:41 GMT+02:00 Fabian Hueske : > Hi Simone, > > in Flink 1.0.x, the Table API does not support reading external data, > i.e., it is not possible to read a CSV file directly from the Table API. > Tables can only be created from DataSet or DataStream which

Create a cluster inside Flink

2016-04-28 Thread Simone Robutti
format of this software (this probably won't be much of a problem, I will write wrappers or at worst replicate the data). Any suggestion is welcome. Thanks, Simone

Creating a custom operator

2016-04-29 Thread Simone Robutti
nvoke(Method.java:606) at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140) I looked at the location of the error but it's not clear to me how to make my operator recognizable from the optimizer. Thanks, Simone

Re: Creating a custom operator

2016-05-03 Thread Simone Robutti
contribution to Flink. So I would like to know if our understanding is correct and custom runtime operators are not supposed to be implemented outside of Flink. Thanks, Simone 2016-04-29 21:32 GMT+02:00 Fabian Hueske : > Hi Simone, > > the GraphCreatingVisitor transforms the common oper

Re: Creating a custom operator

2016-05-03 Thread Simone Robutti
ty and possibilities to achieve an ideal solution. 2016-05-03 13:29 GMT+02:00 Fabian Hueske : > Hi Simone, > > you are right, the interfaces you extend are not considered to be public, > user-facing API. > Adding custom operators to the DataSet API touches many parts of the > s

Bug while using Table API

2016-05-04 Thread Simone Robutti
Hello, while trying my first job using the Table API I got blocked by this error: Exception in thread "main" java.lang.NoSuchFieldError: RULES at org.apache.flink.api.table.plan.rules.FlinkRuleSets$.(FlinkRuleSets.scala:148) at org.apache.flink.api.table.plan.rules.FlinkRuleSets$.(FlinkRuleSets.s

Re: Bug while using Table API

2016-05-04 Thread Simone Robutti
Here is the code: package org.example import org.apache.flink.api.scala._ import org.apache.flink.api.table.TableEnvironment object Job { def main(args: Array[String]) { // set up the execution environment val env = ExecutionEnvironment.getExecutionEnvironment val tEnv = TableEnvir

Re: Creating a custom operator

2016-05-09 Thread Simone Robutti
working on this integration or at least support us in the development. We are waiting for them to start a discussion and as soon as we have a clearer idea on how to proceed, we will validate it with the stuff you just said. Your confidence in Flink's operators gives us hope to achieve a

Re: Using ML lib SVM with Java

2016-05-09 Thread Simone Robutti
To my knowledge FlinkML does not support a unified API and most things must be used exclusively with Scala DataSets. 2016-05-09 13:31 GMT+02:00 Malte Schwarzer : > Hi folks, > > I tried to get the FlinkML SVM running - but it didn't really work. The > SVM.fit() method requires a DataSet paramete

Re: Using FlinkML algorithms in Streaming

2016-05-11 Thread Simone Robutti
Actually model portability and persistence is a serious limitation to practical use of FlinkML in streaming. If you know what you're doing, you can write a blunt serializer for your model, write it to a file and rebuild the model stream-side with the deserialized information. I tried it for an SVM mo

Re: Bug while using Table API

2016-05-12 Thread Simone Robutti
Ok, I tested it and it works on the same example. :) 2016-05-11 12:25 GMT+02:00 Vasiliki Kalavri : > Hi Simone, > > Fabian has pushed a fix for the streaming TableSources that removed the > Calcite Stream rules [1]. > The reported error does not appear anymore with the curren

Merging sets with common elements

2016-05-25 Thread Simone Robutti
tributed environment. Any suggestion? Thanks, Simone

Re: Merging sets with common elements

2016-05-25 Thread Simone Robutti
full connection of all the elements of said set , you can see the result as the connected components of the graph. 2016-05-25 11:42 GMT+02:00 Till Rohrmann : > Hi Simone, > > could you elaborate a little bit on the actual operation you want to > perform. Given a data set {(1, {1,2
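Simone's reformulation — merge input sets that share at least one element, i.e. find the connected components of the graph where each set fully connects its members — can be sketched locally with a union-find structure. This is a single-process sketch, not the distributed solution (in Flink, Gelly's connected-components algorithm would be the natural fit):

```java
import java.util.*;

public class MergeSets {
    private final Map<Integer, Integer> parent = new HashMap<>();

    private int find(int x) {
        parent.putIfAbsent(x, x);
        int root = parent.get(x);
        if (root != x) {
            root = find(root);
            parent.put(x, root); // path compression
        }
        return root;
    }

    private void union(int a, int b) {
        parent.put(find(a), find(b));
    }

    /** Merges input sets that share at least one element. */
    public static Collection<Set<Integer>> merge(List<Set<Integer>> sets) {
        MergeSets uf = new MergeSets();
        for (Set<Integer> s : sets) {
            if (s.isEmpty()) continue;
            Iterator<Integer> it = s.iterator();
            int first = it.next();
            while (it.hasNext()) {
                uf.union(first, it.next()); // link all members of one input set
            }
        }
        // Group every element under its component's root.
        Map<Integer, Set<Integer>> components = new HashMap<>();
        for (Set<Integer> s : sets) {
            for (int e : s) {
                components.computeIfAbsent(uf.find(e), k -> new TreeSet<>()).add(e);
            }
        }
        return components.values();
    }

    public static void main(String[] args) {
        List<Set<Integer>> input = Arrays.asList(
                new HashSet<>(Arrays.asList(1, 2)),
                new HashSet<>(Arrays.asList(2, 3)),
                new HashSet<>(Arrays.asList(4, 5)));
        // {1,2} and {2,3} share element 2 and merge into {1,2,3}; {4,5} stays apart.
        System.out.println(merge(input));
    }
}
```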

Re: sparse matrix

2016-05-30 Thread Simone Robutti
Hello, right now Flink's local matrices are rather raw and for this kind of usage, you should rely on Breeze. If you need to perform operations, slicing in this case, they are a better option if you don't want to reimplement everything. In case you already developed against Flink's matrices, ther

Re: env.fromElements produces TypeInformation error

2016-06-04 Thread Simone Robutti
I'm not sure if this is the solution and I don't have the possibility to try right now, but you should move the case class "State" definition outside the abstract class. 2016-06-04 17:34 GMT+02:00 Dan Drewes : > > Hi, > > compiling the code: > > def minimize(f: DF, init: T): T = { > > //create

How to use properly the function: withTimestampAssigner((event, timestamp) ->..

2020-11-06 Thread Simone Cavallarin
Hi, I'm taking the timestamp from the event payload that I'm receiving from Kafka. I'm struggling to get the time and I'm confused about how I should use the function ".withTimestampAssigner()". I'm receiving an error on event.getTime() that is telling me: "cannot resolve method "Get Time" in "Obj

Re: How to use properly the function: withTimestampAssigner((event, timestamp) ->..

2020-11-08 Thread Simone Cavallarin
Hi Till, That's great! thank you so much!!! I have spent one week on this. I'm so relieved! Cheers s From: Till Rohrmann Sent: 06 November 2020 17:56 To: Simone Cavallarin Cc: user@flink.apache.org ; Aljoscha Krettek Subject: Re: How to use pr

How to use EventTimeSessionWindows.withDynamicGap()

2020-11-12 Thread Simone Cavallarin
Hi All, I'm trying to use EventTimeSessionWindows.withDynamicGap in my application. I have understood that the gap is computed dynamically by a function on each element. What I should be able to obtain is a Flink application that can automatically manage the windows based on the frequency of th

Re: How to use EventTimeSessionWindows.withDynamicGap()

2020-11-12 Thread Simone Cavallarin
ord for that). So if your timeout increases as time goes on your successive sessions will just get bigger and bigger. Best, Aljoscha On 12.11.20 15:56, Simone Cavallarin wrote: > Hi All, > > I'm trying to use EventTimeSessionWindows.withDynamicGap in my application. I > have understood

Re: How to use EventTimeSessionWindows.withDynamicGap()

2020-11-13 Thread Simone Cavallarin
+user@ From: Simone Cavallarin Sent: 13 November 2020 16:46 To: Aljoscha Krettek Subject: Re: How to use EventTimeSessionWindows.withDynamicGap() Hi Aljoscha, When you said: You could use a stateful operation (like a ProcessFunction) to put a dynamic "

Re: How to use EventTimeSessionWindows.withDynamicGap()

2020-11-14 Thread Simone Cavallarin
back on point 1 and be used on the windowing process) * (m.. okay now completely lost...) Thanks s ____ From: Simone Cavallarin Sent: 13 November 2020 16:55 To: Aljoscha Krettek Cc: user Subject: Re: How to use EventTimeSessionWindows.

Re: How to use EventTimeSessionWindows.withDynamicGap()

2020-11-17 Thread Simone Cavallarin
to just use a stateful ProcessFunction to not have to deal with the somewhat finicky setup of the stateful extractor just to force it into the requirements of the session windows API. Best, Aljoscha On 14.11.20 10:50, Simone Cavallarin wrote: > Hi Aljoscha, > > I found a similar question of mine

Re: How to use EventTimeSessionWindows.withDynamicGap()

2020-11-19 Thread Simone Cavallarin
Many thanks for the Help!! Simone From: Aljoscha Krettek Sent: 19 November 2020 11:46 To: user@flink.apache.org Subject: Re: How to use EventTimeSessionWindows.withDynamicGap() On 17.11.20 17:37, Simone Cavallarin wrote: > Hi, > > I have been worki

Print on screen DataStream content

2020-11-23 Thread Simone Cavallarin
Hi All, On my code I have a DataStream that I would like to access. I need to understand what I'm getting for each transformation to check if the data that I'm working on makes sense. How can I print to the console or get a file (csv, txt) for the variables: "stream", "enriched" and "result"?

Re: Print on screen DataStream content

2020-11-24 Thread Simone Cavallarin
0 10:01 To: Pankaj Chand Cc: Austin Cawley-Edwards ; Simone Cavallarin ; user@flink.apache.org Subject: Re: Print on screen DataStream content When Flink is running on a cluster, `DataStream#print()` prints to files in the log directory. Regards, David On Tue, Nov 24, 2020 at 6:03 AM Pa

Re: Print on screen DataStream content

2020-11-24 Thread Simone Cavallarin
2550-4e16-b381-554f86d3812f] Thanks From: Timo Walther Sent: 24 November 2020 11:50 To: user@flink.apache.org Subject: Re: Print on screen DataStream content Hi Simone, if you are just executing DataStream pipelines locally in your IDE while prototyping. You should be able to use `DataS

Re: Print on screen DataStream content

2020-11-24 Thread Simone Cavallarin
ok, thanks you all for the help! s From: David Anderson Sent: 24 November 2020 15:16 To: Simone Cavallarin Cc: user@flink.apache.org Subject: Re: Print on screen DataStream content Simone, What you want to do is to override the toString() method on Event so

Re: Process windows not firing - > Can it be a Watermak issue?

2020-12-02 Thread Simone Cavallarin
> Thanks I didn't know that. noted! Why StreamExecutionEnvironment.setStreamTimeCharacteristic(TimeCharacteristic.EventTime) is so important? Thanks s From: David Anderson Sent: 02 December 2020 16:01 To: Simone Cavallarin Cc: user Subject:

Re: Process windows not firing - > Can it be a Watermak issue?

2020-12-02 Thread Simone Cavallarin
Hi Till Super, understood! I will also read the website with the link that you provided me. Thanks and have a nice eve. best s From: Till Rohrmann Sent: 02 December 2020 17:44 To: Simone Cavallarin Cc: user Subject: Re: Process windows not firing - >

How to Implement a simple boolean .trigger()

2020-12-13 Thread Simone Cavallarin
using trigger(...). And: https://gist.github.com/mxm/c5831ead9c9d9ad68731f5f2f3793154 But still... Some help would be really appreciated! Thanks! Simone

Re: Storing JPMML-Model Object as a Variable Closure?

2016-09-05 Thread Simone Robutti
I think you could make use of this small component I've developed: https://gitlab.com/chobeat/Flink-JPMML It's specifically for using JPMML on Flink. Maybe there's too much stuff for what you need but you could reuse the code of the Operator to do what you need. 2016-09-05 14:11 GMT+02:00 Bauss,

Re: Storing JPMML-Model Object as a Variable Closure?

2016-09-05 Thread Simone Robutti
control of that part. 2016-09-05 15:24 GMT+02:00 Bauss, Julian : > Hi Simone, > > > > that sounds promising! > > Unfortunately your link leads to a 404 page. > > > > Best Regards, > > > > Julian > > > > *Von:* Simone Robutti [mailto:simone.r

Re: Storing JPMML-Model Object as a Variable Closure?

2016-09-05 Thread Simone Robutti
r of instances was minimal but this is still extremely experimental so take it with a grain of salt. I believe that this is highly dependent on the expected size of the PMML models though. 2016-09-05 16:33 GMT+02:00 Bauss, Julian : > Hi Simone, > > > > thank you for your feedback!

Discard message LeaderSessionMessage(null,ConnectionTimeout)

2016-09-13 Thread Simone Robutti
know what could be the cause of the error, if it is related to the issue I'm facing on the flink-kafka connector and maybe if it's due to the upgrade to the version 0.10. I know the producer is still in a PR but it showed to work properly in a local environment. Thank you, Simone

Merge the states of different partition in streaming

2016-09-27 Thread Simone Robutti
what I need? Is there a smarter way to count distinct evolving elements by their status in a stream? Mind that the original source of events is updates to the status of an element, and the requirement is that I want to count only the latest status available. Thank you in advance, Simone
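The core of the requirement — count elements per status, where each new event overwrites an element's previous status — can be sketched with two maps. This is a single-process sketch of the keyed-state logic (in Flink this bookkeeping would live in keyed `ValueState`, with the retraction emitted downstream); the ids and status names are invented for illustration:

```java
import java.util.HashMap;
import java.util.Map;

public class LatestStateCounter {
    private final Map<String, String> latestStatus = new HashMap<>(); // element id -> current status
    private final Map<String, Long> countByStatus = new HashMap<>();  // status -> #elements in it

    /** Applies a status update, keeping the per-status counts consistent. */
    public void update(String id, String newStatus) {
        String old = latestStatus.put(id, newStatus);
        if (old != null) {
            // The element left its previous status: retract it from that count.
            countByStatus.merge(old, -1L, Long::sum);
        }
        countByStatus.merge(newStatus, 1L, Long::sum);
    }

    public long count(String status) {
        return countByStatus.getOrDefault(status, 0L);
    }

    public static void main(String[] args) {
        LatestStateCounter c = new LatestStateCounter();
        c.update("order-1", "OPEN");
        c.update("order-2", "OPEN");
        c.update("order-1", "CLOSED"); // order-1 moves OPEN -> CLOSED
        System.out.println(c.count("OPEN"));   // 1
        System.out.println(c.count("CLOSED")); // 1
    }
}
```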

Re: Merge the states of different partition in streaming

2016-09-28 Thread Simone Robutti
Solved. Probably there was an error in the way I was testing. Also I simplified the job and it works now. 2016-09-27 16:01 GMT+02:00 Simone Robutti : > Hello, > > I'm dealing with an analytical job in streaming and I don't know how to > write the last part. > > Actu

Counting latest state of stateful entities in streaming

2016-09-29 Thread Simone Robutti
. Thank you in advance, Simone

Re: Counting latest state of stateful entities in streaming

2016-09-30 Thread Simone Robutti
ouldn't make it run yet but for what I got, this is slightly different from what I need. 2016-09-30 10:04 GMT+02:00 Fabian Hueske : > Hi Simone, > > I think I have a solution for your problem: > > val s: DataStream[(Long, Int, ts)] = ??? // (id, state, time) > > val

Re: SVM classification problem.

2016-10-02 Thread Simone Robutti
No, you don't get 100% accuracy in this case. You don't even want that; it would be a severe case of overfitting. You would have that only if your dataset is linearly separable or separable with a finely tuned kernel, but in that case SVM would be an overkill and more traditional met