Hello Christophe,
That's very interesting. I've been working with MOA/SAMOA recently and was
considering whether we could create some easy integration with Flink.
I have a Master's student this year who could do some work on this;
hopefully we can create something interesting there.
Regards,
Theodore
Hello Fabio,
What you describe sounds very possible. The easiest way to do it would be
to save your incoming data in HDFS, as you already do if I understand
correctly, and then use the batch ALS algorithm [1] to create your
recommendations from the static data, which you could do at regular interva
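A minimal sketch of what that could look like with the batch ALS recommender in FlinkML; the HDFS paths, the CSV layout, and the parameter values below are placeholders:

import org.apache.flink.api.scala._
import org.apache.flink.ml.recommendation.ALS

val env = ExecutionEnvironment.getExecutionEnvironment

// Ratings stored as (userId, itemId, rating) rows; path and format are placeholders
val ratings = env.readCsvFile[(Int, Int, Double)]("hdfs:///ratings.csv")

val als = ALS()
  .setIterations(10)
  .setNumFactors(10)
  .setBlocks(100)

// Train on the static data accumulated so far
als.fit(ratings)

// Score the (user, item) pairs you want recommendations for
val candidates = env.readCsvFile[(Int, Int)]("hdfs:///candidates.csv")
val recommendations = als.predict(candidates)

recommendations.writeAsCsv("hdfs:///recommendations")
env.execute("batch-als-recommendations")

You could then schedule a job like this at whatever interval suits the freshness you need.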
Hello all,
I'm sure you've considered this already, but what this data does not
include is all the potential future users,
i.e. slower-moving organizations (banks etc.) that could still be on Java 7.
Whether those are relevant is up for debate.
Cheers,
Theo
On Thu, Mar 23, 2017 at 12:14 PM, Ro
Hello all,
I've started thinking about online learning in Flink and one of the issues
that has come
up in other frameworks is the ability to prioritize "control" over "data"
events in iterations.
To give an example, say we develop an ML model that ingests events in
parallel, performs
an aggregati
Hello Mäki,
I think what you would like to do is train a model using the batch API, and use
the Flink streaming API as a way to serve your model and make predictions.
While we don't have an integrated way to do that in FlinkML currently, I
definitely think that's possible. I know Marton Balassi has been
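A rough sketch of that pattern (this is not an integrated FlinkML feature): train a linear model with the batch API, collect its weights, and apply them in a streaming map. The paths, the host/port, and the CSV parsing below are placeholders.

import org.apache.flink.ml.common.{LabeledVector, WeightVector}
import org.apache.flink.ml.math.DenseVector
import org.apache.flink.ml.regression.MultipleLinearRegression

object TrainThenServe {

  // Batch part: train the model and materialize the learned weights
  def trainModel(): WeightVector = {
    import org.apache.flink.api.scala._
    val env = ExecutionEnvironment.getExecutionEnvironment
    val training = env.readTextFile("hdfs:///training.csv").map { line =>
      val cols = line.split(',').map(_.toDouble)
      LabeledVector(cols.head, DenseVector(cols.tail))
    }
    val mlr = MultipleLinearRegression().setIterations(10).setStepsize(0.5)
    mlr.fit(training)
    mlr.weightsOption.get.collect().head // collect() triggers the batch job
  }

  // Streaming part: serve the trained weights on incoming feature vectors
  def main(args: Array[String]): Unit = {
    val WeightVector(weights, intercept) = trainModel()

    import org.apache.flink.streaming.api.scala._
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.socketTextStream("input-host", 9999)
      .map { line =>
        val x = line.split(',').map(_.toDouble)
        // simple linear prediction: w . x + b
        x.indices.map(i => x(i) * weights(i)).sum + intercept
      }
      .print()
    env.execute("model-serving")
  }
}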
cution of the Connected functions (map1/map2 in this case) are not
> affected by the timestamps. In other words it is pretty much arbitrary
> which input arrives at the CoMapFunction first.
>
> So I think you did everything correctly.
>
> Gyula
>
> Theodore Vasiloudis ezt í
Hello all,
I was playing around with the IncrementalLearningSkeleton example and I
had a couple of questions regarding the behavior of connected streams.
In the example the elements are assigned timestamps, and there is a stream,
model, that produces
Double[] elements by ingesting and process
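For reference, a minimal sketch of the connect/CoMap pattern the example is built around; the sources and the way the "model" is represented are placeholders, and, as Gyula's reply points out, nothing coordinates which input a CoMapFunction sees first.

import org.apache.flink.streaming.api.functions.co.CoMapFunction
import org.apache.flink.streaming.api.scala._

val env = StreamExecutionEnvironment.getExecutionEnvironment

val model: DataStream[Array[Double]] = env.fromElements(Array(0.0, 1.0), Array(0.5, 0.5))
val data: DataStream[Double] = env.fromElements(1.0, 2.0, 3.0)

data.connect(model)
  .map(new CoMapFunction[Double, Array[Double], Double] {
    // Whichever model element arrived last is the "current" model; arrival
    // order is not coordinated with the assigned timestamps.
    private var current: Array[Double] = Array(0.0, 0.0)

    override def map1(value: Double): Double =
      current(0) + current(1) * value // predict with the latest model

    override def map2(newModel: Array[Double]): Double = {
      current = newModel
      Double.NaN // placeholder output for model-update events
    }
  })
  .print()

env.execute("connected-streams-sketch")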
Hello all,
I was preparing an exercise for some Master's students and went through
running the Java
quickstart setup [1] again to verify that everything works as expected.
I ran into a problem when running from within IDEA; we've encountered this
in the past during trainings.
While the quickstart gui
Hello Kursat,
We don't have a multi class classifier in FlinkML currently.
Regards,
Theodore
--
Sent from a mobile device. May contain autocorrect errors.
On Oct 19, 2016 12:33 AM, "Kürşat Kurt" wrote:
> Hi;
>
>
> I am trying to learn Flink Ml lib.
>
> Where can i find detailed multiclass cl
That is my bad; I must have been testing against a private branch when
writing the guide. The SVM as it stands only has a predict operation for
Vector, not LabeledVector.
IMHO, I would like to have a predict operation for LabeledVector for all
predictors (that would just call the existing Vector pred
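A small sketch of what that means in practice for now: strip the labels off before calling predict, since it only accepts plain Vectors. The paths and parameter values are placeholders.

import org.apache.flink.api.scala._
import org.apache.flink.ml.MLUtils
import org.apache.flink.ml.classification.SVM
import org.apache.flink.ml.math.Vector

val env = ExecutionEnvironment.getExecutionEnvironment

val training = MLUtils.readLibSVM(env, "hdfs:///train.libsvm")
val test = MLUtils.readLibSVM(env, "hdfs:///test.libsvm")

val svm = SVM()
  .setBlocks(env.getParallelism)
  .setIterations(100)

svm.fit(training)

// predict works on Vector, so drop the labels first
val testVectors: DataSet[Vector] = test.map(_.vector)
val predictions = svm.predict(testVectors)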
Hello Kursat,
As noted in the documentation, the SVM implementation is for binary
classification only for the time being.
Regards,
Theodore
--
Sent from a mobile device. May contain autocorrect errors.
On Oct 13, 2016 8:53 PM, "Kürşat Kurt" wrote:
> Hi;
>
>
>
> I am trying to classify docume
Have you tried profiling the application to see where most of the time is
spent during the runs?
If most of the time is spent reading in the data, maybe any difference
between the two methods is being obscured.
--
Sent from a mobile device. May contain autocorrect errors.
On Sep 6, 2016 4:55 PM,
Hello Dan,
Are you broadcasting the 85GB of data then? I don't see why you wouldn't
store that file on HDFS so it's accessible by your workers.
If you have the full code available somewhere we might be able to help
better.
For L-BFGS you should only be broadcasting the model (i.e. the weight
ve
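To illustrate, a hedged sketch of broadcasting only the small weight vector while the large input stays on HDFS and is read in parallel by the workers; the names, sizes, and per-point computation are placeholders.

import org.apache.flink.api.common.functions.RichMapFunction
import org.apache.flink.api.scala._
import org.apache.flink.configuration.Configuration

val env = ExecutionEnvironment.getExecutionEnvironment

// The big input stays on HDFS and is read in parallel
val points = env.readTextFile("hdfs:///data.csv").map(_.split(',').map(_.toDouble))

// Only the small model (weight vector) is broadcast to every task
val weights = env.fromElements(Array.fill(10)(0.0))

val scored = points
  .map(new RichMapFunction[Array[Double], Double] {
    private var w: Array[Double] = _

    override def open(parameters: Configuration): Unit = {
      w = getRuntimeContext.getBroadcastVariable[Array[Double]]("weights").get(0)
    }

    // Placeholder for the real per-point work (e.g. a loss or gradient term)
    override def map(x: Array[Double]): Double =
      x.zip(w).map { case (xi, wi) => xi * wi }.sum
  })
  .withBroadcastSet(weights, "weights")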
like it creates multiple copies per co-map operation. I
>> use the keyed version to match side inputs with the data.
>>
>> Sent from my iPhone
>>
>> On Aug 5, 2016, at 12:36 PM, Theodore Vasiloudis <
>> theodoros.vasilou...@gmail.com> wrote:
>>
>>
uld have used side-inputs.
>
> Sameer
>
>
>
>
> On Thu, Aug 4, 2016 at 8:56 PM, Theodore Vasiloudis <
> theodoros.vasilou...@gmail.com> wrote:
>
>> Hello all,
>>
>> for a prototype we are looking into we would like to read a big matrix
>> from HDF
Hello all,
For a prototype we are looking into, we would like to read a big matrix from
HDFS and, for every element that comes in a stream of vectors, do one
multiplication with the matrix. The matrix should fit in the memory of one
machine.
We can read in the matrix using a RichMapFunction, but tha
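A rough sketch of that pattern, assuming the matrix can be loaded once per task in open(); the path and the dense CSV layout are placeholders, and for brevity this reads from a local file rather than going through the HDFS client.

import org.apache.flink.api.common.functions.RichMapFunction
import org.apache.flink.configuration.Configuration
import org.apache.flink.streaming.api.scala._
import scala.io.Source

class MatrixMultiplier(matrixPath: String)
    extends RichMapFunction[Array[Double], Array[Double]] {

  private var matrix: Array[Array[Double]] = _

  override def open(parameters: Configuration): Unit = {
    // Loaded once per parallel task instance; assumes the matrix fits in memory
    matrix = Source.fromFile(matrixPath).getLines()
      .map(_.split(',').map(_.toDouble)).toArray
  }

  // Multiply every incoming vector with the matrix (row-by-row dot products)
  override def map(vector: Array[Double]): Array[Double] =
    matrix.map(row => row.zip(vector).map { case (m, v) => m * v }.sum)
}

val env = StreamExecutionEnvironment.getExecutionEnvironment
env.socketTextStream("input-host", 9999)        // placeholder source
  .map(_.split(',').map(_.toDouble))
  .map(new MatrixMultiplier("/path/to/matrix.csv"))
  .print()
env.execute("streaming-matrix-multiply")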
Hello Malte,
As Simone said, there is currently no Java support for FlinkML, unfortunately.
Regards,
Theodore
On Mon, May 9, 2016 at 3:05 PM, Simone Robutti wrote:
> To my knowledge FlinkML does not support an unified API and most things
> must be used exclusively with Scala Datasets.
>
> 2016-0
Hello all,
you can find my slides on Large-Scale Machine Learning with FlinkML here
(from SICS Data Science day and FOSDEM 2016):
http://www.slideshare.net/TheodorosVasiloudis/flinkml-large-scale-machine-learning-with-apache-flink
Best,
Theodore
On Mon, Apr 4, 2016 at 3:19 PM, Rubén Casado
wrot
> at
> org.apache.flink.runtime.io.network.api.reader.MutableRecordReader.next(MutableRecordReader.java:34)
> at
> org.apache.flink.runtime.operators.util.ReaderIterator.next(ReaderIterator.java:59)
> at
> org.apache.flink.runtime.operators.sort.UnilateralSortMer
s.java:52)
> at com.esotericsoftware.kryo.Kryo.writeObjectOrNull(Kryo.java:577)
> at
> com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:68)
> ... 9 more
>
On Wed, Jan 20, 2016 at 9:45 PM, Stephan Ewen wrote:
> Can you again po
ependencies are used.
>
> Alternatively, you could compile an example program with example input
> data which can reproduce the problem. Then I could also take a look at it.
>
> Cheers,
> Till
>
>
> On Wed, Jan 20, 2016 at 5:58 PM, Theodore Vasiloudis <
> theodoros.vas
ll request (
> https://github.com/apache/flink/pull/1528) or from this branch (
> https://github.com/StephanEwen/incubator-flink kryo) and see if that
> fixes it?
>
>
> Thanks,
> Stephan
>
>
>
>
>
> On Wed, Jan 20, 2016 at 3:33 PM, Theodore Vasiloudis <
> t
tion of readLibSVM is what's wrong here.
I've tried the new version committed recently by Chiwan, but I still get the
same error.
I'll see if I can spot a bug in readLibSVM.
On Wed, Jan 20, 2016 at 1:43 PM, Theodore Vasiloudis <
theodoros.vasilou...@gmail.com> wrote:
> It
apReferenceResolver" - there
> should be no reference resolution during serialization / deserialization.
>
> Can you try what happens when you explicitly register the type
> SparseVector at the ExecutionEnvironment?
>
> Stephan
>
>
> On Wed, Jan 20, 2016 at 11:24 AM, The
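For completeness, a small sketch of the registration Stephan suggests, done on the environment before building the job:

import org.apache.flink.api.scala._
import org.apache.flink.ml.math.{DenseVector, SparseVector}

val env = ExecutionEnvironment.getExecutionEnvironment

// Explicitly register the FlinkML vector types with the serialization stack
env.registerType(classOf[SparseVector])
env.registerType(classOf[DenseVector])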
Hello all,
I'm trying to run a job using FlinkML and I'm confused about the source of
an error.
The job reads a libSVM formatted file and trains an SVM classifier on it.
I've tried this with small datasets and everything works out fine.
When trying to run the same job on a large dataset (~11GB
2 Dresden
> E-Mail: d...@se.inf.tu-dresden.de
>
> On Wed, Oct 28, 2015 at 3:50 PM, Theodore Vasiloudis <
> theodoros.vasilou...@gmail.com> wrote:
>
>> Your build.sbt seems correct.
>> It might be that you are missing some basic imports.
>>
>> In your code
Your build.sbt seems correct.
It might be that you are missing some basic imports.
In your code have you imported
import org.apache.flink.api.scala._
?
On Tue, Oct 27, 2015 at 8:45 PM, Vasiliki Kalavri wrote:
> Hi Do,
>
> I don't really have experience with sbt, but one thing that might caus
This sounds similar to this problem:
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-ML-as-Dependency-td1582.html
The reason is (quoting Till, replace gradle with sbt here):
the flink-ml pom contains as a dependency an artifact with artifactId
> breeze_${scala.binary.ver
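As an illustration only (not necessarily the fix applied in that thread), a common sbt workaround for an unresolvable transitive artifact is to exclude it and add the library explicitly; the versions below are placeholders, and the flink-ml artifact name may carry a Scala suffix depending on the Flink version.

// build.sbt (sketch)
libraryDependencies ++= Seq(
  ("org.apache.flink" % "flink-ml" % "0.9.1")
    .exclude("org.scalanlp", "breeze_${scala.binary.version}"),
  "org.scalanlp" %% "breeze" % "0.11.2"
)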
>
> You could generate your own case classes which have more than the 22
> fields, though.
Actually, that is not possible with case classes in Scala 2.10; you would
have to use a normal class if you have more than 22 fields.
This constraint was removed in Scala 2.11.
On Wed, Oct 14, 2015 at 11:42 AM, T
Hello Trevor,
I assume you are using the MultipleLinearRegression class in a manner similar
to our examples, i.e.:

// Create multiple linear regression learner
val mlr = MultipleLinearRegression()
  .setIterations(10)
  .setStepsize(0.5)
  .setConvergenceThreshold(0.001)

// Obtain training and testing data set
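For reference, the corresponding example in the FlinkML documentation continues roughly like this (trainingDS and testingDS stand in for your own data sets):

val trainingDS: DataSet[LabeledVector] = ...
val testingDS: DataSet[Vector] = ...

// Fit the linear model to the provided data
mlr.fit(trainingDS)

// Calculate the predictions for the test data
val predictions = mlr.predict(testingDS)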