delay the execution of the Map-Job until the
file is written completely?
Right now I am doing it the following way:
val written = result.writeAsText("…")
if (written.getDataSet.count() > 0) { ...do Map-Job... }
Thanks in advance!
Best regards,
Lydia
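A minimal sketch of one way to enforce this ordering in the Java API: run the write as its own job (env.execute() blocks until the sink has finished), then start the map job on the written file. Paths, job names, and the identity map are illustrative; the usual Flink Java API imports are assumed.

ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

env.readTextFile("hdfs:///tmp/input")
        .writeAsText("hdfs:///tmp/intermediate");
env.execute("write phase"); // blocks until the file is completely written

env.readTextFile("hdfs:///tmp/intermediate")
        .map(new MapFunction<String, String>() {
            public String map(String line) { return line; } // ...do Map-Job...
        })
        .print(); // print() triggers execution of the second job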
or e.g. CPU or Network?
Thanks in advance!
Lydia
Best regards,
Lydia
Hi,
I want to run k-means with different k in parallel.
So each worker should calculate its own k-means. Is that possible?
If I do a map over a list of integers and then apply k-means inside it, I get the
following error:
Task not serializable
I am looking forward to your answers!
Lydia
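For what it's worth, "Task not serializable" usually means the function captures a non-serializable enclosing object. A sketch under that assumption (inside a static main, so the anonymous class captures nothing); KMeans.run(k) is a hypothetical placeholder for the actual clustering call:

ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
DataSet<Integer> ks = env.fromElements(2, 3, 4, 5);

// the function must only reference serializable state
DataSet<Tuple2<Integer, Double>> costPerK = ks.map(
        new MapFunction<Integer, Tuple2<Integer, Double>>() {
            public Tuple2<Integer, Double> map(Integer k) {
                double cost = KMeans.run(k); // hypothetical: one k-means per k
                return new Tuple2<>(k, cost);
            }
        });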
Hi,
I would like to know how to write a Matrix or Vector (dense/sparse) to a file.
Thanks in advance!
Best regards,
Lydia
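In case it helps: any DataSet can be written with writeAsText or writeAsCsv, so one option is to keep the matrix in (row, col, value) form. A minimal sketch, with an illustrative path and the usual Flink Java API imports assumed:

ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
DataSet<Tuple3<Integer, Integer, Double>> matrix = env.fromElements(
        new Tuple3<>(0, 0, 1.0), new Tuple3<>(0, 1, 2.0), new Tuple3<>(1, 1, 3.0));
matrix.writeAsCsv("/tmp/matrix.csv", "\n", ",");
env.execute(); // sinks only run when the program is executed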
supported in that
case.
2. I want to drop/delete one specific row and column from the matrix and
thereby also reduce the dimension.
What is the smartest way to do so?
Thanks in advance!
Lydia
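One possible approach for point 2, assuming the matrix lives in a DataSet of (row, col, value) tuples called matrix, with r and c the row/column to drop (illustrative values): filter them out, then shift the higher indices down by one so the dimension shrinks.

final int r = 2;
final int c = 2;
DataSet<Tuple3<Integer, Integer, Double>> reduced = matrix
        .filter(new FilterFunction<Tuple3<Integer, Integer, Double>>() {
            public boolean filter(Tuple3<Integer, Integer, Double> t) {
                return t.f0 != r && t.f1 != c; // drop row r and column c
            }
        })
        .map(new MapFunction<Tuple3<Integer, Integer, Double>, Tuple3<Integer, Integer, Double>>() {
            public Tuple3<Integer, Integer, Double> map(Tuple3<Integer, Integer, Double> t) {
                int row = t.f0 > r ? t.f0 - 1 : t.f0; // close the gap left by row r
                int col = t.f1 > c ? t.f1 - 1 : t.f1; // same for column c
                return new Tuple3<>(row, col, t.f2);
            }
        });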
aggregator.reset();
aggregator.aggregate(sum);
}
setNewVertexValue(sum);
}
}
> On 13.05.2016 at 09:25, Vasiliki Kalavri wrote:
>
> Hi Lydia,
>
> an iteration aggregator combines all aggregates globally once per superstep
> and makes them available in the *next* superstep.
2016 at 08:04, Vasiliki Kalavri wrote:
>
> Hi Lydia,
>
> registered aggregators through the ScatterGatherConfiguration are accessible
> both in the VertexUpdateFunction and in the MessageFunction.
>
> Cheers,
> -Vasia.
>
> On 12 May 2016 at 20:08, Lydia Ickler
Hi,
I have a question regarding the Aggregators of a Scatter-Gather Iteration.
Is it possible to have a global aggregator that is accessible in
VertexUpdateFunction() and MessagingFunction() at the same time?
Thanks in advance,
Lydia
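Following the reply above, a sketch of how the registration and lookup could fit together (method order as in the 1.0-era Gelly API; the graph, value types, iteration count, and aggregator name are illustrative):

ScatterGatherConfiguration config = new ScatterGatherConfiguration();
config.registerAggregator("value.sum", new DoubleSumAggregator());

Graph<Long, Double, Double> result = graph.runScatterGatherIteration(
        new VertexUpdateFunction<Long, Double, Double>() {
            public void updateVertex(Vertex<Long, Double> vertex, MessageIterator<Double> messages) {
                DoubleSumAggregator agg = getIterationAggregator("value.sum");
                agg.aggregate(vertex.getValue()); // contributes to the global sum
                setNewVertexValue(vertex.getValue());
            }
        },
        new MessagingFunction<Long, Double, Double, Double>() {
            public void sendMessages(Vertex<Long, Double> vertex) {
                // the same named aggregator is visible here as well
                DoubleSumAggregator agg = getIterationAggregator("value.sum");
                sendMessageToAllNeighbors(vertex.getValue());
            }
        },
        10, config);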
Hi all,
If I have a Graph g,
and I would like to normalize all vertex values by the absolute max of all
vertex values, which API function should I choose?
Thanks in advance!
Lydia
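There seems to be no single API call for this; one way is to go through the vertex DataSet: take the absolute max, cross it with the vertices, and rebuild the graph. A sketch assuming Graph<Long, Double, Double> g, an ExecutionEnvironment env, and the usual Gelly imports:

DataSet<Tuple1<Double>> maxAbs = g.getVertices()
        .map(new MapFunction<Vertex<Long, Double>, Tuple1<Double>>() {
            public Tuple1<Double> map(Vertex<Long, Double> v) {
                return new Tuple1<>(Math.abs(v.getValue()));
            }
        })
        .max(0); // absolute max over all vertex values

DataSet<Vertex<Long, Double>> normalized = g.getVertices()
        .cross(maxAbs)
        .map(new MapFunction<Tuple2<Vertex<Long, Double>, Tuple1<Double>>, Vertex<Long, Double>>() {
            public Vertex<Long, Double> map(Tuple2<Vertex<Long, Double>, Tuple1<Double>> in) {
                return new Vertex<>(in.f0.getId(), in.f0.getValue() / in.f1.f0);
            }
        });

Graph<Long, Double, Double> result = Graph.fromDataSet(normalized, g.getEdges(), env);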
Never mind! I figured it out with groupBy and
reduceGroup.
Sent from my iPhone
> On 07.04.2016 at 11:51, Lydia Ickler wrote:
>
> Hi,
>
> If I have two DataSets A and B of type Tuple3, how would
> I get the subset of A (based on fields (0,1)) that does not occur in B?
Hi,
If I have two DataSets A and B of type Tuple3, how would I
get the subset of A (based on fields (0,1)) that does not occur in B?
Is there maybe an already implemented method?
Best regards,
Lydia
Sent from my iPhone
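An alternative to groupBy/reduceGroup is an anti-join via coGroup on the two key fields: emit an A record only when no B record shares its (0,1) key. A sketch assuming both sets are DataSet<Tuple3<Integer, Integer, Double>>, with the usual imports:

DataSet<Tuple3<Integer, Integer, Double>> onlyInA = A
        .coGroup(B).where(0, 1).equalTo(0, 1)
        .with(new CoGroupFunction<Tuple3<Integer, Integer, Double>,
                Tuple3<Integer, Integer, Double>,
                Tuple3<Integer, Integer, Double>>() {
            public void coGroup(Iterable<Tuple3<Integer, Integer, Double>> aRecords,
                                Iterable<Tuple3<Integer, Integer, Double>> bRecords,
                                Collector<Tuple3<Integer, Integer, Double>> out) {
                if (!bRecords.iterator().hasNext()) { // key absent from B
                    for (Tuple3<Integer, Integer, Double> a : aRecords) {
                        out.collect(a);
                    }
                }
            }
        });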
Hi all,
I have an issue regarding execution on 1 machine VS 5 machines.
If I execute the following code, the results are not the same, though I would
expect them to be since the input file is the same.
Do you have any suggestions?
Thanks in advance!
Lydia
ExecutionEnvironment env
Hi,
thanks for your reply!
Could you please give me an example of the close() step? I can't find an
example online, only for open().
Is that where I can "save" my new result?
Best regards,
Lydia
> On 31.03.2016 at 18:16, Stephan Ewen wrote:
>
> Hi Lydia!
>
> The same function
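A sketch of the open()/close() pair on a RichMapFunction, with an illustrative local file as the place to "save" results (usual Flink and java.io imports assumed); close() runs once per parallel task instance after its last record:

public static class SavingMapper extends RichMapFunction<Double, Double> {
    private transient BufferedWriter writer;

    @Override
    public void open(Configuration parameters) throws Exception {
        // called once per task, before any map() call
        writer = new BufferedWriter(new FileWriter("/tmp/partial-results.txt"));
    }

    @Override
    public Double map(Double value) throws Exception {
        writer.write(value.toString());
        writer.newLine();
        return value;
    }

    @Override
    public void close() throws Exception {
        writer.close(); // called once per task, after the last map() call
    }
}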
Hi Till,
thanks for your reply!
Is there a way to store intermediate results of the bulk iteration for use
in the next iteration, besides the data set that is passed along by default?
Best regards,
Lydia
> On 31.03.2016 at 12:01, Till Rohrmann wrote:
>
> Hi Lydia,
>
>
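One workaround that is sometimes used: make the extra state part of the iterated data set itself, e.g. tag the records so the payload and the intermediate values travel through the bulk iteration together. A toy sketch (env is an ExecutionEnvironment; tags and update rules are illustrative):

IterativeDataSet<Tuple2<String, Double>> loop = env
        .fromElements(new Tuple2<>("data", 1.0), new Tuple2<>("aux", 0.0))
        .iterate(10);
DataSet<Tuple2<String, Double>> next = loop.map(
        new MapFunction<Tuple2<String, Double>, Tuple2<String, Double>>() {
            public Tuple2<String, Double> map(Tuple2<String, Double> t) {
                // update "data" and "aux" records differently each superstep
                return new Tuple2<>(t.f0, t.f0.equals("data") ? t.f1 * 2 : t.f1 + 1);
            }
        });
DataSet<Tuple2<String, Double>> result = loop.closeWith(next);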
Hi all,
is there a way to tell the program that it should wait until the BulkIteration
has finished before the rest of the program is executed?
Best regards,
Lydia
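For context: in the batch API, operators that consume the result of closeWith() only receive data once the iteration has terminated, so the ordering is implicit; calling env.execute() (or collect()/print()) additionally blocks the driver until the whole job is done. A toy sketch:

IterativeDataSet<Long> loop = env.fromElements(0L).iterate(10);
DataSet<Long> next = loop.map(new MapFunction<Long, Long>() {
    public Long map(Long v) { return v + 1; }
});
DataSet<Long> result = loop.closeWith(next); // downstream sees this only after the loop ends
result.writeAsText("/tmp/iteration-result"); // illustrative path
env.execute("iteration job"); // blocks until the job, including the iteration, finishes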
code below I would like to store all newEigenValue.
Unfortunately I didn’t find a way to do so.
Is it possible to set/change broadcast variables? Or is it only possible to
"get" them?
Thanks in advance!
Lydia
//read input file
DataSet<Tuple3<Integer, Integer, Double>> matrixA = readMatrix(env, input);
//initial:
//Approxim
Hi,
I have an issue with a for-loop.
If I set the maximal iteration number i to more than 3, it gets stuck, and I
cannot figure out why.
With 1, 2 or 3 it runs smoothly.
I attached the code below and marked the loop with //PROBLEM.
Thanks in advance!
Lydia
package
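A frequent cause of this pattern: each pass of a plain Java for-loop appends another copy of the operators to the execution plan, so the plan grows with i; Flink's native iterate() keeps the plan constant. A toy sketch of the replacement (the update step is illustrative):

IterativeDataSet<Double> loop = env.fromElements(1.0).iterate(50); // i becomes the max iteration count
DataSet<Double> step = loop.map(new MapFunction<Double, Double>() {
    public Double map(Double v) { return v / 2 + 1; } // stand-in for the loop body
});
loop.closeWith(step).print();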
s
> well, I would assume.
>
>
> On Tue, Mar 22, 2016 at 3:15 PM, Lydia Ickler wrote:
> Hi Till,
>
> maybe it is doing so because I rewrite the ds in the next step again and then
> the working steps get mixed?
> I am reading
Hi Till,
maybe it is doing so because I rewrite the ds in the next step again, and then
the working steps get mixed?
I am reading the data from a local .csv file with readMatrix(env, "filename").
See code below.
Best regards,
Lydia
//read input file
DataSet<Tuple3<Integer, Integer, Double>> ds = readMatrix(en
code below)
Any suggestions?
Thanks in advance!
Lydia
ds.cross(ds.aggregate(Aggregations.MAX, 2)).map(new normalizeByMax());

public static final class normalizeByMax implements
        MapFunction<Tuple2<Tuple3<Integer, Integer, Double>, Tuple3<Integer, Integer, Double>>,
                Tuple3<Integer, Integer, Double>> {
    public Tuple3<Integer, Integer, Double> map(
            Tuple2<Tuple3<Integer, Integer, Double>, Tupl
Hi,
I have a question regarding the Delta Iteration.
I basically want to iterate as long as the former and the newly calculated sets
are different, and stop if they are the same.
Right now I get a result set that has entries with duplicate "row" indices,
which should not be the case.
I guess I am doing
to tackle the problem with the Gelly API (since the
matrix is an adjacency matrix)? And if so, how would you tackle it?
Thanks in advance and best regards,
Lydia
package de.tuberlin.dima.aim3.assignment3;
import org.apache.flink.api.common.functions.MapFunction;
import
to send a parameter to make the filter more flexible?
What would be the smartest way to do so?
Best regards,
Lydia
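A sketch of the usual pattern: pass the parameter through the filter's constructor so the same class can be reused with different values (the tuple type and threshold semantics are illustrative):

public static class ThresholdFilter implements FilterFunction<Tuple3<Integer, Integer, Double>> {
    private final double threshold;

    public ThresholdFilter(double threshold) {
        this.threshold = threshold; // shipped to the workers with the function object
    }

    @Override
    public boolean filter(Tuple3<Integer, Integer, Double> t) {
        return t.f2 > threshold;
    }
}

// usage: ds.filter(new ThresholdFilter(0.5));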
? Does it fill the "content" of v into the
variable rows?
And another question:
What does the function treeAggregate do? And how would you tackle a "copy" of
that in Flink?
Thanks in advance!
Best regards,
Lydia
private[flink] def multiplyGramianMatrixBy(v: DenseVector[Double]):
DenseVector
xD… a simple "hdfs dfs -chmod -R 777 /users" fixed it!
> On 01.02.2016 at 12:17, Till Rohrmann wrote:
>
> Hi Lydia,
>
> It looks like that. I guess you should check your HDFS access rights.
>
> Cheers,
> Till
>
> On Mon, Feb 1, 2016
Till Rohrmann :
>
> Hi Lydia,
>
> what do you mean by master? Usually when you submit a program to the
> cluster and don’t specify the parallelism in your program, then it will be
> executed with the parallelism.default value as parallelism. You can specify
> the value in
ter executes in parallel.
Any suggestions?
Best regards,
Lydia
DataSet<Tuple3<Integer, Integer, Double>> matrixA = readMatrix(env, input);
DataSet<Tuple3<Integer, Integer, Double>> initial = matrixA.groupBy(0).sum(2);
//normalize by maximum value
initial = initial.cross(initial.max(2)).map(new normalizeByMax());
matrixA.join(initial).where(1).equalTo(0)
Hi Till,
maybe I will do that :)
If I have some other questions I will let you know!
Best regards,
Lydia
> On 24.01.2016 at 17:33, Till Rohrmann wrote:
>
> Hi Lydia,
>
> Flink does not come with a distributed matrix implementation as Spark does
> with the RowMatrix,
Hi Till,
thanks for your reply :)
Yes, it finished after ~27 minutes…
Best regards,
Lydia
> On 25.01.2016 at 14:27, Till Rohrmann wrote:
>
> Hi Lydia,
>
> Since matrix multiplication is O(n^3), I would assume that it would simply
> take 1000 times longer than the multiplication
Hi,
I want to do a simple matrix multiplication and use the following code (see bottom).
For 50x50 or 100x100 matrices it is no problem, but already with 1000x1000
matrices it does not work anymore and gets stuck in the joining part.
What am I doing wrong?
Best regards,
Lydia
package
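For reference, the usual join-based formulation for (row, col, value) matrices, assuming matrixA and matrixB are DataSet<Tuple3<Integer, Integer, Double>>: join A's column index with B's row index, multiply, then sum the partial products per output cell. For large inputs, join hints (e.g. joinWithHuge/joinWithTiny) can matter a lot in exactly this step.

DataSet<Tuple3<Integer, Integer, Double>> product = matrixA
        .join(matrixB).where(1).equalTo(0) // a.col == b.row
        .with(new JoinFunction<Tuple3<Integer, Integer, Double>,
                Tuple3<Integer, Integer, Double>,
                Tuple3<Integer, Integer, Double>>() {
            public Tuple3<Integer, Integer, Double> join(
                    Tuple3<Integer, Integer, Double> a,
                    Tuple3<Integer, Integer, Double> b) {
                return new Tuple3<>(a.f0, b.f1, a.f2 * b.f2); // partial product for cell (i, k)
            }
        })
        .groupBy(0, 1).sum(2); // sum the partial products per cell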
Hi all,
this is maybe a stupid question, but what in Flink is the equivalent of
Spark's RowMatrix?
Thanks in advance,
Lydia
Hi Till,
Thanks for the paper link!
Do you maybe have a code snippet in mind from BLAS, Breeze, or Spark to
start from?
Best regards,
Lydia
Sent from my iPhone
> On 12.01.2016 at 10:46, Till Rohrmann wrote:
>
> Hi Lydia,
>
> there is no eigenvalue solver included
Hi,
I wanted to know whether there are any implementations yet, within the Machine
Learning Library or in general, that can efficiently solve eigenvalue problems
in Flink.
Or if not do you have suggestions on how to approach a parallel execution maybe
with BLAS or Breeze?
Thanks in advance!
Lydia
ok, thanks! :)
I will try that!
> On 07.10.2015 at 21:35, Lydia Ickler wrote:
>
> Hi,
>
> stupid question: Why is this not saved to a file?
> I want to transform an array to a DataSet, but the graph stops at collect().
>
> //Transform Spectrum to DataSet
> List
Hi,
stupid question: Why is this not saved to a file?
I want to transform an array to a DataSet, but the graph stops at collect().
//Transform Spectrum to DataSet
List<Tuple2<Double, Double>> dataList = new LinkedList<Tuple2<Double, Double>>();
double[][] arr = filteredSpectrum.getAs2DDoubleArray();
for (int i=0;i
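A sketch of the fromCollection route, under the assumption that the spectrum rows are pairs of doubles; note that sinks like writeAsCsv only run when env.execute() is called, whereas collect() and print() trigger execution themselves:

List<Tuple2<Double, Double>> dataList = new LinkedList<Tuple2<Double, Double>>();
double[][] arr = filteredSpectrum.getAs2DDoubleArray(); // from the snippet above
for (int i = 0; i < arr.length; i++) {
    dataList.add(new Tuple2<Double, Double>(arr[i][0], arr[i][1]));
}
DataSet<Tuple2<Double, Double>> spectrum = env.fromCollection(dataList);
spectrum.writeAsCsv("/tmp/spectrum.csv"); // illustrative path
env.execute(); // without this, the CSV sink never runs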
Hi,
how would I read a binary file from HDFS with the Flink Java API?
I can only find the Scala way…
All the best,
Lydia
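One option in the Java API, assuming the file was written in Flink's block-based binary format: subclass BinaryInputFormat, implement deserialize(), and pass it to env.readFile(). The record type and path are illustrative:

public static class DoubleInputFormat extends BinaryInputFormat<Double> {
    @Override
    protected Double deserialize(Double reuse, DataInputView dataInput) throws IOException {
        return dataInput.readDouble(); // one record per serialized double
    }
}

// usage:
// DataSet<Double> data = env.readFile(new DoubleInputFormat(), "hdfs:///path/data.bin");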
Thanks, Till!
I used the ALS from FlinkML and it works :)
Best regards,
Lydia
> On 02.10.2015 at 14:14, Till Rohrmann wrote:
>
> Hi Lydia,
>
> I think the APIs of the versions 0.8-incubating-SNAPSHOT and 0.10-SNAPSHOT
> are not compatible. Thus, it's not just simply setting
Then I exported it to the cluster that runs with 0.10-SNAPSHOT.
> On 02.10.2015 at 12:15, Stephan Ewen wrote:
>
> @Lydia Did you create your POM files for your job with an 0.8.x quickstart?
>
> Can you try to simply re-create your project's POM files with a new
> quickstart
> On 02.10.2015 at 11:55, Lydia Ickler wrote:
>
> 0.10-SNAPSHOT
latest stable release (0.9.1) for
> your flink job and on the cluster.
>
> On Fri, Oct 2, 2015 at 11:55 AM, Lydia Ickler wrote:
> Hi,
>
> but inside the pom of flink-job the flink version is set to 0.8
>
> 0.8-incubating
between the Flink version you've used to
> compile your job and the Flink version installed on the cluster.
>
> Maven automagically pulls newer 0.10-SNAPSHOT versions every time you're
> building your job.
>
> On Fri, Oct 2, 2015 at 11:45 AM, Lydia Ickler
:
java.lang.NoSuchMethodError:
org.apache.flink.api.scala.typeutils.CaseClassTypeInfo.<init>(Ljava/lang/Class;Lscala/collection/Seq;Lscala/collection/Seq;)V
I guess something like the flink-scala-0.10-SNAPSHOT.jar is missing.
How can I add that to the path?
Best regards,
Lydia
Hi all,
so I have a case class Spectrum(mz: Float, intensity: Float)
and a DataSet[Spectrum] to read my data in.
Now I want to know if there is a smart way to transform my DataSet into a
two-dimensional array.
Thanks in advance,
Lydia
Hi,
what jar am I missing?
The error is:
Exception in thread "main" java.lang.NoSuchMethodError:
org.apache.flink.api.scala.ExecutionEnvironment.readCsvFile$default$4()Z
Hi all,
I want to run the data-flow Wordcount example on a Flink Cluster.
The local execution with "mvn exec:exec -Dinput=kinglear.txt
-Doutput=wordcounts.txt" is already working.
What is the command to execute it on the cluster?
Best regards,
Lydia
I am really trying to get HBase to work...
Is there maybe a tutorial for all the config files?
Best regards,
Lydia
> On 23.09.2015 at 17:40, Maximilian Michels wrote:
>
> In the issue, it states that it should be sufficient to append the
> hbase-protocol.jar file to the Hadoop
Hi, I tried that, but unfortunately it still gets stuck at the second split.
Can it be that I have set something wrong in my configuration? In Hadoop? Or
Flink?
The strange thing is that the HBaseWriteExample works great!
Best regards,
Lydia
> On 23.09.2015 at 17:40, Maximilian Michels wrote:
'hadoop.home.dir' (-Dhadoop.home.dir=…)
>
> – Ufuk
>
>> On 23 Sep 2015, at 12:43, Lydia Ickler wrote:
>>
>> Hi all,
>>
>> I get the following error message that no valid hadoop home directory can be
>> found when trying to initialize the HBase configuration.
>
Hi all,
I get the following error message that no valid hadoop home directory can be
found when trying to initialize the HBase configuration.
Where would I specify that path?
12:41:02,043 INFO org.apache.flink.addons.hbase.TableInputFormat
- Initializing HBaseConfiguration
12:41
down.
Does anyone have an idea why this is happening?
Best regards,
Lydia
22:28:10,178 DEBUG org.apache.flink.runtime.operators.DataSourceTask
- Opening input split Locatable Split (2) at [grips5:60020]: DataSource (at
createInput(ExecutionEnvironment.java:502
Hi Ufuk,
yes, I figured out that the HMaster of hbase did not start properly!
Now everything is working :)
Thanks for your help!
Best regards,
Lydia
> On 27.07.2015 at 11:45, Ufuk Celebi wrote:
>
> Any update on this Lydia?
>
> On 23 Jul 2015, at 16:38, Ufu
Hi Ufuk,
no, I don’t mind!
Where would I change the log level?
Best regards,
Lydia
> On 23.07.2015 at 14:41, Ufuk Celebi wrote:
>
> Hey Lydia,
>
> it looks like the HBase client is losing its connection to HBase. Before
> that, everything seems to be working just fine (X
Hi,
I am trying to read data from an HBase table via HBaseReadExample.java.
Unfortunately, my run always gets stuck at the same position.
Do you guys have any suggestions?
In the master node it says:
14:05:04,239 INFO org.apache.flink.runtime.jobmanager.JobManager
- Received job bb9
ently the RPC message is very
> large.
>
> Is the data that you request in one row?
>
> On 18.07.2015 at 00:50, "Lydia Ickler" wrote:
> Hi all,
>
> I am trying to read a data set from HBase within a cluster application
space
and then it just closes the ZooKeeper…
Do you have a suggestion for how to avoid this OutOfMemoryError?
Best regards,
Lydia
Hi guys,
is it possible to convert a Java DataSet to a Scala DataSet?
Right now I get the following error:
Error:(102, 29) java: incompatible types:
org.apache.flink.api.java.DataSet cannot be converted to
org.apache.flink.api.scala.DataSet
Thanks in advance,
Lydia
there should be a more sophisticated way.
Do you have a code snippet or an idea of how to do so?
Many thanks in advance and best regards,
Lydia