wait for "writeAsText" to finish

2017-04-20 Thread Lydia
delay the execution of the Map-Job until the file is written completely? Right now I am doing it the following way: val written = result.writeAsText("…") if (written.getDataSet.count() > 0) { ...do Map-Job... } Thanks in advance! Best regards, Lydia
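One way to read this thread's question (a sketch against the classic DataSet API, not a verified build; `outputPath` and the job names are placeholders): sinks such as writeAsText are lazy, so the file only exists once the environment has executed the plan.

```java
// Sketch: writeAsText() only registers a sink; nothing is on disk until execute().
result.writeAsText(outputPath, FileSystem.WriteMode.OVERWRITE);
env.execute("write phase"); // blocks until the sink has finished writing

// Work submitted after execute() returns is guaranteed to see the complete file:
DataSet<String> written = env.readTextFile(outputPath);
// ... run the Map-Job on `written` here ...
```

Counting the sink's DataSet, as in the quoted snippet, re-executes part of the plan instead of waiting for the write to finish.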

Monitoring REST API

2016-12-21 Thread Lydia Ickler
or e.g. CPU or Network? Thanks in advance! Lydia

Monitoring Flink on Yarn

2016-12-19 Thread Lydia Ickler
gards, Lydia

multiple k-means in parallel

2016-11-27 Thread Lydia Ickler
Hi, I want to run k-means with different k in parallel. So each worker should calculate its own k-means. Is that possible? If I do a map on a list of integers to then apply k-means, I get the following error: "Task not serializable". I am looking forward to your answers! Lydia
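A plain-Java sketch of the usual cause and fix for this error (class and helper names here are illustrative, not from the thread): the mapped function is typically an anonymous or inner class that captures its enclosing, non-serializable object; a static nested class holding only serializable fields can be shipped to the workers.

```java
import java.io.*;

public class SerializableFunctions {
    // "Task not serializable" usually means the UDF drags in its enclosing,
    // non-serializable object. A static nested class holding only serializable
    // fields (here: the k for this k-means run) avoids that capture.
    static class KMeansForK implements Serializable {
        final int k;
        KMeansForK(int k) { this.k = k; }
    }

    // Round-trip helper: if this succeeds, the runtime can ship the function.
    static byte[] serialize(Object o) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
                out.writeObject(o);
            }
            return bos.toByteArray();
        } catch (IOException e) {
            throw new RuntimeException("not serializable: " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serialize(new KMeansForK(3)).length > 0); // prints true
    }
}
```

In the mailing-list scenario, each integer k would be mapped to such a self-contained function object rather than to a closure over the surrounding driver class.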

Write matrix/vector

2016-05-29 Thread Lydia Ickler
Hi, I would like to know how to write a Matrix or Vector (Dense/Sparse) to file? Thanks in advance! Best regards, Lydia

sparse matrix

2016-05-29 Thread Lydia Ickler
supported in that case. 2. I want to drop/delete one specific row and column from the matrix and therefore also reduce the dimension. What is the smartest way to do so? Thanks in advance! Lydia

Re: Scatter-Gather Iteration aggregators

2016-05-13 Thread Lydia Ickler
regator.reset(); aggregator.aggregate(sum); } setNewVertexValue(sum); } } > On 13.05.2016 at 09:25, Vasiliki Kalavri wrote: > > Hi Lydia, > > an iteration aggregator combines all aggregates globally once per superstep > and makes them available in the *next* superste

Re: Scatter-Gather Iteration aggregators

2016-05-12 Thread Lydia Ickler
016 at 08:04, Vasiliki Kalavri wrote: > > Hi Lydia, > > registered aggregators through the ScatterGatherConfiguration are accessible > both in the VertexUpdateFunction and in the MessageFunction. > > Cheers, > -Vasia. > > On 12 May 2016 at 20:08, Lydia Ickler

Scatter-Gather Iteration aggregators

2016-05-12 Thread Lydia Ickler
Hi, I have a question regarding the Aggregators of a Scatter-Gather Iteration. Is it possible to have a global aggregator that is accessible in VertexUpdateFunction() and MessagingFunction() at the same time? Thanks in advance, Lydia
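For reference, a sketch of how a shared aggregator might be registered, based on Gelly's scatter-gather configuration as it existed around Flink 1.x (the aggregator name "sum" and the Messenger/Updater class names are placeholders; exact signatures should be checked against the Gelly docs):

```java
// Sketch: register the aggregator once in the iteration configuration ...
ScatterGatherConfiguration parameters = new ScatterGatherConfiguration();
parameters.registerAggregator("sum", new DoubleSumAggregator());

// ... then it is visible from both UDFs of the iteration.
// Inside the VertexUpdateFunction or MessagingFunction:
//   DoubleSumAggregator agg = getIterationAggregator("sum");
//   agg.aggregate(value);                                    // contribute this superstep
//   DoubleValue prev = getPreviousIterationAggregate("sum"); // read last superstep's total

graph.runScatterGatherIteration(new Messenger(), new Updater(), maxIterations, parameters);
```

As the quoted reply notes, the aggregated value only becomes visible in the *next* superstep.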

normalize vertex values

2016-05-12 Thread Lydia Ickler
Hi all, if I have a Graph g and I would like to normalize all vertex values by the absolute max of all vertex values, which API function would I choose? Thanks in advance! Lydia
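The computational core of this normalization, sketched in plain Java (class and method names are mine): in Gelly this would correspond to aggregating the absolute maximum over g.getVertices() and then rescaling each value, e.g. via mapVertices.

```java
import java.util.Arrays;

public class NormalizeByAbsMax {
    // Divide every vertex value by the absolute maximum over all values.
    static double[] normalize(double[] vertexValues) {
        double absMax = 0.0;
        for (double v : vertexValues) {
            absMax = Math.max(absMax, Math.abs(v));
        }
        if (absMax == 0.0) {
            return vertexValues.clone(); // all zeros: nothing to scale
        }
        double[] out = new double[vertexValues.length];
        for (int i = 0; i < vertexValues.length; i++) {
            out[i] = vertexValues[i] / absMax;
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(normalize(new double[]{2.0, -4.0, 1.0})));
        // prints [0.5, -1.0, 0.25]
    }
}
```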

Re: Find differences

2016-04-07 Thread Lydia Ickler
Never mind! I figured it out with groupBy and reduceGroup. Sent from my iPhone > On 07.04.2016 at 11:51, Lydia Ickler wrote: > > Hi, > > If I have 2 DataSets A and B of type Tuple3, how would > I get a subset of A (based on the fields (0,1)) that does not occur in B

Find differences

2016-04-07 Thread Lydia Ickler
Hi, if I have 2 DataSets A and B of type Tuple3, how would I get a subset of A (based on the fields (0,1)) that does not occur in B? Is there maybe an already implemented method? Best regards, Lydia Sent from my iPhone
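The semantics being asked for is an anti-join on the composite key (field 0, field 1). A plain-Java sketch of exactly that (class and method names are mine); in the DataSet API the same logic is expressed as a.coGroup(b).where(0,1).equalTo(0,1), emitting A's rows only when B's side of the group is empty.

```java
import java.util.*;

public class AntiJoin {
    // Keep the rows of A whose key (row[0], row[1]) never occurs in B.
    static List<int[]> minusOnKey(List<int[]> a, List<int[]> b) {
        Set<String> keysInB = new HashSet<>();
        for (int[] row : b) {
            keysInB.add(row[0] + "," + row[1]);
        }
        List<int[]> result = new ArrayList<>();
        for (int[] row : a) {
            if (!keysInB.contains(row[0] + "," + row[1])) {
                result.add(row);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<int[]> a = Arrays.asList(new int[]{1, 1, 10}, new int[]{2, 2, 20});
        List<int[]> b = Collections.singletonList(new int[]{1, 1, 99});
        for (int[] row : minusOnKey(a, b)) {
            System.out.println(Arrays.toString(row)); // prints [2, 2, 20]
        }
    }
}
```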

varying results: local VS cluster

2016-04-04 Thread Lydia Ickler
Hi all, I have an issue regarding execution on 1 machine VS 5 machines. If I execute the following code the results are not the same though I would expect them to be since the input file is the same. Do you have any suggestions? Thanks in advance! Lydia ExecutionEnvironment env

Re: wait until BulkIteration finishes

2016-03-31 Thread Lydia Ickler
Hi, thanks for your reply! Could you please give me an example for the close() step? I can’t find an example online, only for open(). There I can "save" my new result? Best regards, Lydia > On 31.03.2016 at 18:16, Stephan Ewen wrote: > > Hi Lydia! > > The same function

Re: wait until BulkIteration finishes

2016-03-31 Thread Lydia Ickler
Hi Till, thanks for your reply! Is there a way to store intermediate results of the bulk iteration to use then in the next iteration, except the data set one sends already by default? Best regards, Lydia > On 31.03.2016 at 12:01, Till Rohrmann wrote: > > Hi Lydia, > >

wait until BulkIteration finishes

2016-03-31 Thread Lydia Ickler
Hi all, is there a way to tell the program that it should wait until the BulkIteration finishes before the rest of the program is executed? Best regards, Lydia

BulkIteration and BroadcastVariables

2016-03-30 Thread Lydia Ickler
code below I would like to store all newEigenValue. Unfortunately I didn’t find a way to do so. Is it possible to set/change BroadcastVariables? Or is it only possible to "get" them? Thanks in advance! Lydia //read input file DataSet> matrixA = readMatrix(env, input); //initial: //Approxim

for loop slow

2016-03-26 Thread Lydia Ickler
Hi, I have an issue with a for-loop. If I set the maximal iteration number i to more than 3 it gets stuck and I cannot figure out why. With 1, 2 or 3 it runs smoothly. I attached the code below and marked the loop with //PROBLEM. Thanks in advance! Lydia package

Re: normalizing DataSet with cross()

2016-03-22 Thread Lydia Ickler
s > well, I would assume. > > > On Tue, Mar 22, 2016 at 3:15 PM, Lydia Ickler <mailto:ickle...@googlemail.com>> wrote: > Hi Till, > > maybe it is doing so because I rewrite the ds in the next step again and then > the working steps get mixed? > I am reading

Re: normalizing DataSet with cross()

2016-03-22 Thread Lydia Ickler
Hi Till, maybe it is doing so because I rewrite the ds in the next step again and then the working steps get mixed? I am reading the data from a local .csv file with readMatrix(env, "filename"). See code below. Best regards, Lydia //read input file DataSet> ds = readMatrix(en

normalizing DataSet with cross()

2016-03-22 Thread Lydia Ickler
ode below) Any suggestions? Thanks in advance! Lydia ds.cross(ds.aggregate(Aggregations.MAX, 2)).map(new normalizeByMax()); public static final class normalizeByMax implements MapFunction, Tuple3>, Tuple3> { public Tuple3 map( Tuple2, Tupl

Help with DeltaIteration

2016-03-19 Thread Lydia Ickler
Hi, I have a question regarding the Delta Iteration. I basically want to iterate as long as the former and the new calculated set are different. Stop if they are the same. Right now I get a result set that has entries with duplicate "row" indices which should not be the case. I guess I am doing

MatrixMultiplication

2016-03-14 Thread Lydia Ickler
to tackle the problem with the Gelly API? (Since the matrix is an adjacency matrix). And if so how would you tackle it? Thanks in advance and best regards, Lydia package de.tuberlin.dima.aim3.assignment3; import org.apache.flink.api.common.functions.MapFunction; import

filter dataset

2016-02-29 Thread Lydia Ickler
to send a parameter to make the filter more flexible? What would be the smartest way to do so? Best regards, Lydia
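The usual answer, sketched in plain Java (class name and threshold are mine): give the filter function a constructor parameter. In Flink the class would implement FilterFunction<T>, be serializable, and be passed as ds.filter(new ThresholdFilter(5.0)); alternatively, a RichFilterFunction can read parameters from a Configuration in open().

```java
public class ThresholdFilter {
    // The parameter is a plain field, set once via the constructor and
    // serialized together with the function object.
    private final double threshold;

    ThresholdFilter(double threshold) {
        this.threshold = threshold;
    }

    boolean filter(double value) {
        return value > threshold;
    }

    public static void main(String[] args) {
        ThresholdFilter f = new ThresholdFilter(5.0);
        System.out.println(f.filter(7.0)); // prints true
        System.out.println(f.filter(3.0)); // prints false
    }
}
```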

DistributedMatrix in Flink

2016-02-04 Thread Lydia Ickler
? Does it fill the "content" of v into the variable rows? And another question: What is the function treeAggregate doing? And how would you tackle a "copy" of that in Flink? Thanks in advance! Best regards, Lydia private[flink] def multiplyGramianMatrixBy(v: DenseVector[Double]): DenseVector

Re: cluster execution

2016-02-01 Thread Lydia Ickler
xD… a simple "hdfs dfs -chmod -R 777 /users" fixed it! > On 01.02.2016 at 12:17, Till Rohrmann wrote: > > Hi Lydia, > > It looks like that. I guess you should check your hdfs access rights. > > Cheers, > Till > > On Mon, Feb 1, 2016

Re: cluster execution

2016-02-01 Thread Lydia Ickler
Till Rohrmann wrote: > > Hi Lydia, > > what do you mean with master? Usually when you submit a program to the > cluster and don’t specify the parallelism in your program, then it will be > executed with the parallelism.default value as parallelism. You can specify > the value in

cluster execution

2016-01-28 Thread Lydia Ickler
ter executes in parallel. Any suggestions? Best regards, Lydia DataSet> matrixA = readMatrix(env, input); DataSet> initial = matrixA.groupBy(0).sum(2); //normalize by maximum value initial = initial.cross(initial.max(2)).map(new normalizeByMax()); matrixA.join(initial).where(1).equalTo(0)

Re: rowmatrix equivalent

2016-01-26 Thread Lydia Ickler
Hi Till, maybe I will do that :) If I have some other questions I will let you know! Best regards, Lydia > On 24.01.2016 at 17:33, Till Rohrmann wrote: > > Hi Lydia, > > Flink does not come with a distributed matrix implementation as Spark does it > with the RowMatrix,

Re: MatrixMultiplication

2016-01-25 Thread Lydia Ickler
Hi Till, thanks for your reply :) Yes, it finished after ~27 minutes… Best regards, Lydia > On 25.01.2016 at 14:27, Till Rohrmann wrote: > > Hi Lydia, > > Since matrix multiplication is O(n^3), I would assume that it would simply > take 1000 times longer than the multipl

MatrixMultiplication

2016-01-25 Thread Lydia Ickler
Hi, I want to do a simple MatrixMultiplication and use the following code (see bottom). For matrices of 50x50 or 100x100 it is no problem. But already with matrices of 1000x1000 it does not work anymore and gets stuck in the joining part. What am I doing wrong? Best regards, Lydia package

rowmatrix equivalent

2016-01-24 Thread Lydia Ickler
Hi all, this is maybe a stupid question, but what within Flink is the equivalent to Spark's RowMatrix? Thanks in advance, Lydia

Re: eigenvalue solver

2016-01-12 Thread Lydia Ickler
Hi Till, thanks for the paper link! Do you maybe have a code snippet in mind from BLAS, Breeze, or Spark to start from? Best regards, Lydia Sent from my iPhone > On 12.01.2016 at 10:46, Till Rohrmann wrote: > > Hi Lydia, > > there is no Eigenvalue solver incl

eigenvalue solver

2016-01-12 Thread Lydia Ickler
Hi, I wanted to know if there are any implementations yet within the Machine Learning Library or generally that can efficiently solve eigenvalue problems in Flink? Or if not do you have suggestions on how to approach a parallel execution maybe with BLAS or Breeze? Thanks in advance! Lydia
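A common starting point when no eigensolver is available is power iteration for the dominant eigenvalue; a plain-Java sketch (class and method names are mine). Each step is a matrix-vector multiply plus a normalization, which maps naturally onto a join/groupBy-sum in the DataSet API, or onto Gelly when the matrix is read as an adjacency matrix.

```java
public class PowerIteration {
    // Repeatedly apply A to a start vector and normalize; the scaling factor
    // converges to the dominant eigenvalue (for matrices where power iteration applies).
    static double dominantEigenvalue(double[][] a, int iterations) {
        int n = a.length;
        double[] v = new double[n];
        java.util.Arrays.fill(v, 1.0);
        double eigenvalue = 0.0;
        for (int it = 0; it < iterations; it++) {
            double[] av = new double[n];
            for (int i = 0; i < n; i++) {
                for (int j = 0; j < n; j++) {
                    av[i] += a[i][j] * v[j];
                }
            }
            double norm = 0.0;
            for (double x : av) {
                norm = Math.max(norm, Math.abs(x)); // infinity norm
            }
            for (int i = 0; i < n; i++) {
                v[i] = av[i] / norm;
            }
            eigenvalue = norm;
        }
        return eigenvalue;
    }

    public static void main(String[] args) {
        double[][] a = {{2.0, 0.0}, {0.0, 1.0}};
        System.out.println(dominantEigenvalue(a, 50)); // prints 2.0
    }
}
```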

Re: writeAsCsv

2015-10-07 Thread Lydia Ickler
ok, thanks! :) I will try that! > On 07.10.2015 at 21:35, Lydia Ickler wrote: > > Hi, > > stupid question: Why is this not saved to file? > I want to transform an array to a DataSet but the Graph stops at collect(). > > //Transform Spectrum to DataSet > List

writeAsCsv

2015-10-07 Thread Lydia Ickler
Hi, stupid question: Why is this not saved to file? I want to transform an array to a DataSet but the Graph stops at collect(). //Transform Spectrum to DataSet List> dataList = new LinkedList>(); double[][] arr = filteredSpectrum.getAs2DDoubleArray(); for (int i=0;i
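The likely explanation, sketched against the classic DataSet API (unverified; `dataSet`, `outputPath`, and the job name are placeholders): collect() triggers execution immediately and pulls the result to the client, while writeAsCsv() only registers a sink that runs on the next env.execute().

```java
// collect() is eager: it executes the plan and returns a local List.
List<Tuple2<Double, Double>> local = dataSet.collect();

// writeAsCsv() is lazy: it registers a sink, so an explicit execute() is needed.
dataSet.writeAsCsv(outputPath);
env.execute("write csv"); // without this, nothing is written to file
```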

source binary file

2015-10-06 Thread Lydia Ickler
Hi, how would I read a BinaryFile from HDFS with the Flink Java API? I can only find the Scala way… All the best, Lydia

Re: data flow example on cluster

2015-10-02 Thread Lydia Ickler
Thanks, Till! I used the ALS from FlinkML and it works :) Best regards, Lydia > On 02.10.2015 at 14:14, Till Rohrmann wrote: > > Hi Lydia, > > I think the APIs of the versions 0.8-incubating-SNAPSHOT and 0.10-SNAPSHOT > are not compatible. Thus, it’s not just simply settin

Re: data flow example on cluster

2015-10-02 Thread Lydia Ickler
Then I exported it to the cluster that runs with 0.10-SNAPSHOT. > On 02.10.2015 at 12:15, Stephan Ewen wrote: > > @Lydia Did you create your POM files for your job with an 0.8.x quickstart? > > Can you try to simply re-create your project's POM files with a new > quickstar

Re: data flow example on cluster

2015-10-02 Thread Lydia Ickler
> On 02.10.2015 at 11:55, Lydia Ickler wrote: > > 0.10-SNAPSHOT

Re: data flow example on cluster

2015-10-02 Thread Lydia Ickler
atest stable release (0.9.1) for > your flink job and on the cluster. > > On Fri, Oct 2, 2015 at 11:55 AM, Lydia Ickler <mailto:ickle...@googlemail.com>> wrote: > Hi, > > but inside the pom of flink-job is the flink version set to 0.8 > > 0.8-incuba

Re: data flow example on cluster

2015-10-02 Thread Lydia Ickler
tween the Flink version you've used to > compile your job and the Flink version installed on the cluster. > > Maven automagically pulls newer 0.10-SNAPSHOT versions every time you're > building your job. > > On Fri, Oct 2, 2015 at 11:45 AM, Lydia Ickler <mailto:ic

Re: data flow example on cluster

2015-10-02 Thread Lydia Ickler
: java.lang.NoSuchMethodError: org.apache.flink.api.scala.typeutils.CaseClassTypeInfo.(Ljava/lang/Class;Lscala/collection/Seq;Lscala/collection/Seq;)V I guess something like the flink-scala-0.10-SNAPSHOT.jar is missing. How can I add that to the path? Best regards, Lydia

DataSet transformation

2015-10-01 Thread Lydia Ickler
Hi all, so I have a case class Spectrum(mz: Float, intensity: Float) and a DataSet[Spectrum] to read my data in. Now I want to know if there is a smart way to transform my DataSet into a two-dimensional array? Thanks in advance, Lydia
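The usual answer is to bring the DataSet to the client with collect() (which returns a local List) and reshape locally. A plain-Java sketch of that reshaping step, with float[2] pairs standing in for the case class Spectrum(mz, intensity) and all names mine:

```java
import java.util.*;

public class SpectrumToArray {
    // Turn the collected list of (mz, intensity) pairs into a 2D array.
    static float[][] toArray(List<float[]> spectra) {
        float[][] out = new float[spectra.size()][2];
        for (int i = 0; i < spectra.size(); i++) {
            out[i][0] = spectra.get(i)[0]; // mz
            out[i][1] = spectra.get(i)[1]; // intensity
        }
        return out;
    }

    public static void main(String[] args) {
        // In Flink this list would come from ds.collect().
        List<float[]> collected = Arrays.asList(new float[]{1.5f, 10f}, new float[]{2.5f, 20f});
        float[][] arr = toArray(collected);
        System.out.println(arr.length + " rows"); // prints 2 rows
    }
}
```

Note that collect() pulls all data to one machine, so this only works for DataSets that fit in client memory.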

error message

2015-09-30 Thread Lydia Ickler
Hi, what jar am I missing ? The error is: Exception in thread "main" java.lang.NoSuchMethodError: org.apache.flink.api.scala.ExecutionEnvironment.readCsvFile$default$4()Z

data flow example on cluster

2015-09-29 Thread Lydia Ickler
Hi all, I want to run the data-flow WordCount example on a Flink cluster. The local execution with "mvn exec:exec -Dinput=kinglear.txt -Doutput=wordcounts.txt" is already working. What is the command to execute it on the cluster? Best regards, Lydia
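For the record, cluster submission normally goes through the `flink run` client rather than `mvn exec:exec`. A sketch, assuming a standard Flink distribution; the jar name, main class, and paths are placeholders:

```shell
# Submit the packaged job to a running Flink cluster.
./bin/flink run -c org.example.WordCount target/wordcount-job.jar \
  --input hdfs:///data/kinglear.txt --output hdfs:///data/wordcounts.txt
```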

Re: HBase issue

2015-09-24 Thread Lydia Ickler
I am really trying to get HBase to work... Is there maybe a tutorial for all the config files? Best regards, Lydia > On 23.09.2015 at 17:40, Maximilian Michels wrote: > > In the issue, it states that it should be sufficient to append the > hbase-protocol.jar file to the Hado

Re: HBase issue

2015-09-24 Thread Lydia Ickler
Hi, I tried that but unfortunately it still gets stuck at the second split. Can it be that I have set something in my configurations wrong? In Hadoop? Or Flink? The strange thing is that the HBaseWriteExample works great! Best regards, Lydia > On 23.09.2015 at 17:40, Maximilian Mich

Re: no valid hadoop home directory can be found

2015-09-23 Thread Lydia Ickler
oop.home.dir’ (-Dhadoop.home.dir=…) > > – Ufuk > >> On 23 Sep 2015, at 12:43, Lydia Ickler wrote: >> >> Hi all, >> >> I get the following error message that no valid hadoop home directory can be >> found when trying to initialize the HBase configuration. >

no valid hadoop home directory can be found

2015-09-23 Thread Lydia Ickler
Hi all, I get the following error message that no valid hadoop home directory can be found when trying to initialize the HBase configuration. Where would I specify that path? 12:41:02,043 INFO org.apache.flink.addons.hbase.TableInputFormat - Initializing HBaseConfiguration 12:41

HBase issue

2015-09-22 Thread Lydia Ickler
down. Does anyone has an idea why this is happening? Best regards, Lydia 22:28:10,178 DEBUG org.apache.flink.runtime.operators.DataSourceTask - Opening input split Locatable Split (2) at [grips5:60020]: DataSource (at createInput(ExecutionEnvironment.java:502

Re: Job stuck at "Assigning split to host..."

2015-07-27 Thread Lydia Ickler
Hi Ufuk, yes, I figured out that the HMaster of HBase did not start properly! Now everything is working :) Thanks for your help! Best regards, Lydia > On 27.07.2015 at 11:45, Ufuk Celebi wrote: > > Any update on this Lydia? > > On 23 Jul 2015, at 16:38, Ufu

Re: Job stuck at "Assigning split to host..."

2015-07-23 Thread Lydia Ickler
Hi Ufuk, no, I don’t mind! Where would I change the log level? Best regards, Lydia > On 23.07.2015 at 14:41, Ufuk Celebi wrote: > > Hey Lydia, > > it looks like the HBase client is losing its connection to HBase. Before > that, everything seems to be working just fine (X

Job stuck at "Assigning split to host..."

2015-07-23 Thread Lydia Ickler
Hi, I am trying to read data from a HBase Table via the HBaseReadExample.java Unfortunately, my run gets always stuck at the same position. Do you guys have any suggestions? In the master node it says: 14:05:04,239 INFO org.apache.flink.runtime.jobmanager.JobManager - Received job bb9

Re: HBase on 4 machine cluster - OutOfMemoryError

2015-07-18 Thread Lydia Ickler
ently the RPC message is very > large. > > Is the data that you request in one row? > > On 18.07.2015 00:50, "Lydia Ickler" <mailto:ickle...@googlemail.com>> wrote: > Hi all, > > I am trying to read a data set from HBase within a cluster application

HBase on 4 machine cluster - OutOfMemoryError

2015-07-17 Thread Lydia Ickler
space and then it just closes the ZooKeeper… Do you have a suggestion how to avoid this OutOfMemoryError? Best regards, Lydia

DataSet Conversion

2015-07-13 Thread Lydia Ickler
Hi guys, is it possible to convert a Java DataSet to a Scala DataSet? Right now I get the following error: Error:(102, 29) java: incompatible types: 'org.apache.flink.api.java.DataSet cannot be converted to org.apache.flink.api.scala.DataSet' Thanks in advance, Lydia

HBase & Machine Learning

2015-07-11 Thread Lydia Ickler
there should be a more sophisticated way. Do you have a code snippet or an idea how to do so? Many thanks in advance and best regards, Lydia