Hi Till,
you're right, my implementation wouldn't scale well for a very large number of
features. Thank you for that hint! However, I'm not using that many features,
so this shouldn't be the cause of the strange behaviour.
Yes, the 30 minutes is the time for all jobs together. It's the time di
Hi Dan,
first a general remark: I fear that your L-BFGS implementation is not well
suited for large-scale problems. You might want to take a look at [1].
In the case of the while loop solution you're actually executing n jobs,
with n being the number of iterations. Thus, you have to add the executio
Usually, the while loop solution should perform much worse, since with each
new iteration it will re-execute all previous iteration steps without
persisting the intermediate results. Thus, it should have quadratic
complexity in terms of iteration step operations instead of linear
complexity. Addit
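For illustration, here is a minimal sketch of the two variants in the Scala
DataSet API (data, initialWeights, step and numIterations are hypothetical
placeholders, not code from your job):

import org.apache.flink.api.scala._

// Variant 1: driver-side while loop. Each time the loop forces execution
// (e.g. weights.collect() to check convergence), Flink launches a new job
// whose plan contains all previous steps again -> quadratic total work.
var weights: DataSet[Array[Double]] = initialWeights
var i = 0
while (i < numIterations) {
  weights = step(data, weights) // the plan grows by one step per round
  i += 1
}

// Variant 2: native bulk iteration. A single job; the runtime feeds the
// result of each superstep into the next without re-executing the past.
val trained = initialWeights.iterate(numIterations) { w =>
  step(data, w)
}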
Have you tried profiling the application to see where most of the time is
spent during the runs?
If most of the time is spent reading in the data, any difference between
the two methods may be obscured.
Hi Dan,
Flink currently allocates each task slot an equal portion of managed
memory. I don't know the best way to count task slots.
https://ci.apache.org/projects/flink/flink-docs-master/concepts/index.html#workers-slots-resources
If you assign the TaskManagers less memory, then Linux will use the me
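For reference, the relevant settings live in flink-conf.yaml; a sketch with
example values only, not recommendations:

# flink-conf.yaml
taskmanager.numberOfTaskSlots: 8   # managed memory is split evenly across slots
taskmanager.heap.mb: 32768         # total JVM heap per TaskManager
taskmanager.memory.fraction: 0.7   # share of the free heap used as managed memory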
Hi,
I am not broadcasting the data but the model, i.e. the weight vector
contained in the "State".
You are right, it would be better for the implementation with the while
loop to have the data on HDFS. But that's exactly the point of my
question: Why are the Flink Iterations not faster if you do
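Roughly, the broadcast looks like this (a sketch; LabeledPoint,
computeGradient, data and model are placeholders for my actual code):

import org.apache.flink.api.scala._
import org.apache.flink.api.common.functions.RichMapFunction
import org.apache.flink.configuration.Configuration

// Only the small weight vector travels to every task; the large
// training DataSet stays partitioned across the cluster.
val gradients = data.map(new RichMapFunction[LabeledPoint, Array[Double]] {
  private var weights: Array[Double] = _

  override def open(parameters: Configuration): Unit = {
    // one copy of the model per task, fetched from the broadcast set
    weights = getRuntimeContext
      .getBroadcastVariable[Array[Double]]("model")
      .get(0)
  }

  override def map(p: LabeledPoint): Array[Double] =
    computeGradient(weights, p) // placeholder gradient helper
}).withBroadcastSet(model, "model")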
Hello Dan,
are you broadcasting the 85 GB of data then? I don't get why you wouldn't
store that file on HDFS so it's accessible by your workers.
If you have the full code available somewhere we might be able to help
better.
For L-BFGS you should only be broadcasting the model (i.e. the weight
ve
Hi Greg,
thanks for your response!
I just had a look and realized that it's only about 85 GB of data. Sorry
about the wrong information.
It's read from a CSV file on the master node's local file system. The 8
nodes have more than 40 GB of available memory each, and since the data is
equally di
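For reference, this is roughly how the input is read; switching to HDFS
would only change the path scheme (a sketch with example paths and a
made-up record type):

import org.apache.flink.api.scala._

val env = ExecutionEnvironment.getExecutionEnvironment

// file:// paths must be readable on every worker (e.g. via NFS),
// otherwise only a single node can act as the source.
val localData = env.readCsvFile[(Int, Double)]("file:///data/train.csv")

// hdfs:// lets every TaskManager read its own splits in parallel.
val distData = env.readCsvFile[(Int, Double)]("hdfs:///data/train.csv")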
Hi Dan,
Where are you reading the 200 GB "data" from? How much memory per node? If
the DataSet is read from a distributed filesystem, and if Flink must spill
to disk during the iterations, then I wouldn't expect much difference. About
how many iterations are run in the 30 minutes? I don't know that this i