>> > > > > > In the Flink framework I would map this to a LabeledVector (y,
>> > > > > > DenseVector(x)). (I don't want to use the id as a feature)
>> > > > > >
>> > > > > > When I apply finally the predict() method I get a LabeledVector
>> > > > > > (y_predicted, DenseVector(x)).
>> > > > > >
>> > > > > > Now my problem is that I would like to plot the predicted target
>> > > > > > value against time.
>> > > > > >
>> > > > > > What I have to do now is:
>> > > > > >
>> > > > > > a = predictedDataSet.map ( LabeledVector => Tuple2(x,y_p))
>> > > > > > b = originalDataSet.map("id, x1, x2, ..., xn, y" => Tuple2(x,id))
>> > > > > >
>> > > > > > a.join(b).where("x").equalTo("x") { (a,b) => (id, y_p) }
>> > > > > >
>> > > > > > This is really a cumbersome process for such a simple thing. Is
>> > > > > > there any approach which makes this simpler? If not, can we
>> > > > > > extend the ML API to allow ids?
>> > > > > >
>> > > > > > Best regards,
>> > > > > > Felix
>> > > > > >
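For reference, a hedged sketch in the Flink Scala API of the workaround described above. The Row case class, the attachIds helper, and joining on the first feature value are illustrative assumptions, not an existing API; joining on a raw feature value is also fragile when values repeat, which is part of what makes the detour cumbersome.

import org.apache.flink.api.scala._
import org.apache.flink.ml.common.LabeledVector

case class Row(id: Long, features: Array[Double], y: Double)

def attachIds(predicted: DataSet[LabeledVector],
              original: DataSet[Row]): DataSet[(Long, Double)] = {
  // (x, y_predicted), keyed by the feature value used for the join
  val a = predicted.map(lv => (lv.vector(0), lv.label))
  // (x, id) from the original data
  val b = original.map(r => (r.features(0), r.id))

  a.join(b).where(0).equalTo(0) { (pred, orig) => (orig._2, pred._2) }
}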
--
Mikio Braun - http://blog.mikiobraun.de, http://twitter.com/mikiobraun
>> ...multiple learning engines at the same time with different learning rates is
>> pretty plausible.
>>
>> Also, using something like adagrad will knock down high learning rates very
>> quickly if you get a nearly divergent step. This can make initially high
>> learning rates quite plausible.
>>
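For illustration, a minimal AdaGrad sketch in plain Scala (not FlinkML's optimizer): the per-coordinate accumulation of squared gradients shrinks the effective step size, which is why a near-divergent step quickly knocks a large initial learning rate down.

object AdagradSketch {
  def step(weights: Array[Double],
           gradient: Array[Double],
           sqGradSum: Array[Double],    // running sum of squared gradients
           learningRate: Double,
           eps: Double = 1e-8): Unit = {
    var i = 0
    while (i < weights.length) {
      sqGradSum(i) += gradient(i) * gradient(i)
      weights(i) -= learningRate / (math.sqrt(sqGradSum(i)) + eps) * gradient(i)
      i += 1
    }
  }

  def main(args: Array[String]): Unit = {
    val w = Array(0.0, 0.0)
    val acc = Array(0.0, 0.0)
    // a huge gradient in the first step is immediately damped for later steps
    step(w, Array(100.0, 0.1), acc, learningRate = 1.0)
    step(w, Array(100.0, 0.1), acc, learningRate = 1.0)
    println(w.mkString(", "))
  }
}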
--
Mikio Braun - http://blog.mikiobraun.de, http://twitter.com/mikiobraun
>> ...and adadelta. All are pretty
>> easy to implement.
>>
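For comparison, a hedged AdaDelta sketch in plain Scala, following Zeiler's update rule; the constants and naming are illustrative. It replaces the explicit learning rate with decaying averages of squared gradients and squared updates.

object AdadeltaSketch {
  def step(weights: Array[Double],
           gradient: Array[Double],
           avgSqGrad: Array[Double],      // decaying average of squared gradients
           avgSqUpdate: Array[Double],    // decaying average of squared updates
           rho: Double = 0.95,
           eps: Double = 1e-6): Unit = {
    var i = 0
    while (i < weights.length) {
      avgSqGrad(i) = rho * avgSqGrad(i) + (1 - rho) * gradient(i) * gradient(i)
      val update = math.sqrt(avgSqUpdate(i) + eps) / math.sqrt(avgSqGrad(i) + eps) * gradient(i)
      avgSqUpdate(i) = rho * avgSqUpdate(i) + (1 - rho) * update * update
      weights(i) -= update
      i += 1
    }
  }
}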
>> Here is some visualization of various methods that provides some insights:
>> http://imgur.com/a/Hqolp
>>
>> Vowpal wabbit has some tricks that allow very large initial learning rates
>> to be used w
We should probably look into this nevertheless. Requiring full grid search for
a simple algorithm like mlr sounds like overkill.
Have you written down the math of your implementation somewhere?
-M
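As a rough illustration of what "full grid search" means here, a small self-contained Scala sketch; trainAndScore is a hypothetical stand-in for fitting MLR with the given step size and iteration count and returning a validation error.

object GridSearchSketch {
  def gridSearch(stepSizes: Seq[Double],
                 iterations: Seq[Int],
                 trainAndScore: (Double, Int) => Double): (Double, Int) = {
    val scored = for { s <- stepSizes; n <- iterations } yield ((s, n), trainAndScore(s, n))
    scored.minBy(_._2)._1
  }

  def main(args: Array[String]): Unit = {
    // toy scoring function so the sketch runs on its own; in practice this would
    // fit MultipleLinearRegression and return a held-out squared error
    val toyScore = (s: Double, n: Int) => math.abs(s - 0.01) + 1.0 / n
    println(gridSearch(Seq(0.001, 0.01, 0.1), Seq(10, 100), toyScore))
  }
}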
- Original Message -
From: "Till Rohrmann"
Sent: 02.06.2015 11:31
To: "de
Mikio Braun created FLINK-2117:
--
Summary: Add a set of data generators
Key: FLINK-2117
URL: https://issues.apache.org/jira/browse/FLINK-2117
Project: Flink
Issue Type: Improvement
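As an illustration of the kind of generator FLINK-2117 asks for, a hedged sketch that produces synthetic linear-regression data as FlinkML LabeledVectors; the function name and parameters are assumptions, not the eventual API. A real generator would likely build on env.generateSequence rather than materializing the points on the client, but fromCollection keeps the sketch short.

import org.apache.flink.api.scala._
import org.apache.flink.ml.common.LabeledVector
import org.apache.flink.ml.math.DenseVector

import scala.util.Random

object DataGeneratorSketch {
  def linearData(env: ExecutionEnvironment,
                 numPoints: Int,
                 weights: Array[Double],
                 noise: Double = 0.1,
                 seed: Long = 42L): DataSet[LabeledVector] = {
    val rnd = new Random(seed)
    val points = for (_ <- 0 until numPoints) yield {
      val x = Array.fill(weights.length)(rnd.nextGaussian())
      val y = x.zip(weights).map { case (xi, wi) => xi * wi }.sum + noise * rnd.nextGaussian()
      LabeledVector(y, DenseVector(x))
    }
    env.fromCollection(points)
  }
}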
Mikio Braun created FLINK-2116:
--
Summary: Make pipeline extension require less coding
Key: FLINK-2116
URL: https://issues.apache.org/jira/browse/FLINK-2116
Project: Flink
Issue Type
ising as L1 is not differentiable everywhere and you'd have to
use different regularizations...
So it probably makes sense to separate the loss from the cost function
(which is then only defined by the model and the loss function), and
have the regularization extra.
-M
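A hedged sketch of the separation suggested above: the loss only sees prediction and label, while the regularization penalty is a separate component the solver applies to the weights. The trait and method names are assumptions, not the final FlinkML interfaces.

import org.apache.flink.ml.common.WeightVector

trait PartialLoss {
  def loss(prediction: Double, label: Double): Double
  def derivative(prediction: Double, label: Double): Double
}

trait RegularizationPenalty {
  // applied after the gradient step; an L1 penalty would implement this as a
  // proximal/shrinkage step, since it is not differentiable at zero
  def regularize(weights: WeightVector,
                 effectiveStepSize: Double,
                 regularizationConstant: Double): WeightVector

  def regularizationValue(weights: WeightVector): Double
}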
--
Mikio Braun - http://blog.mikiobraun.de, http://twitter.com/mikiobraun
> ...te: Double, regularizationConstant: Double): WeightVector
>
> def regularizationValue(weightVector: WeightVector): Double
> }
>
> Both approaches are semantically equivalent. I have no strong preference for
> either of them. What do you think is the better approach?
>
>
> On Thu
...SGD to train on practically anything
which has a data-dependent gradient.
What do you think?
-M
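To make the point concrete, a plain-Scala sketch (not FlinkML's solver) of an SGD driver that only needs a data-dependent gradient and can therefore train anything that supplies one.

object SgdSketch {
  def sgd[D](data: Seq[D],
             init: Array[Double],
             gradient: (Array[Double], D) => Array[Double],
             stepSize: Double,
             epochs: Int): Array[Double] = {
    val w = init.clone()
    for (_ <- 0 until epochs; point <- data) {
      val g = gradient(w, point)
      var i = 0
      while (i < w.length) { w(i) -= stepSize * g(i); i += 1 }
    }
    w
  }

  def main(args: Array[String]): Unit = {
    // least squares on (x, y) pairs: gradient of (w*x - y)^2 / 2 is (w*x - y) * x
    val data = Seq((1.0, 2.0), (2.0, 4.0), (3.0, 6.0))
    val w = sgd[(Double, Double)](data, Array(0.0),
      (w, p) => Array((w(0) * p._1 - p._2) * p._1), stepSize = 0.05, epochs = 100)
    println(w(0)) // converges towards 2.0
  }
}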
On Thu, May 28, 2015 at 4:03 PM, Mikio Braun wrote:
> Oh wait, continuing to type. I accidentally sent out the message too early.
>
> On Thu, May 28, 2015 at 4:03 PM, Mikio Braun
> wrote:
>>
Oh wait, continuing to type. I accidentally sent out the message too early.
On Thu, May 28, 2015 at 4:03 PM, Mikio Braun wrote:
> Hi Till and Theodore,
>
> I think the code is cleaned up a lot now, introducing the
> mapWithBcVariable helped a lot.
>
> I also get that the goal
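A hedged usage sketch of the mapWithBcVariable helper mentioned above; the exact signature is assumed here (each element is paired with the single value of a one-element broadcast DataSet, via the org.apache.flink.ml package object), so treat it as illustration rather than the definitive API.

import org.apache.flink.api.scala._
import org.apache.flink.ml._

object BcVariableSketch {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    val data: DataSet[Double] = env.fromElements(1.0, 2.0, 3.0, 4.0)
    // a one-element DataSet playing the role of broadcast model/state
    val mean: DataSet[Double] = data.reduce(_ + _).map(_ / 4)

    // center every element by the broadcast mean without writing the usual
    // withBroadcastSet / RichMapFunction boilerplate by hand
    val centered = data.mapWithBcVariable(mean) { (x, m) => x - m }
    centered.print()
  }
}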
>>> ...encapsulate the prediction function as part of the
>>> loss function and also add the regularization function to it. This would
>>> simplify the code of SGD. A possible interface for a loss function could
>>> look like
>>>
>>> trait LossFunction {
>>>
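The message is truncated at this point. Purely as a sketch, one way the combined interface could look, with the method names and the use of LabeledVector/WeightVector as assumptions:

import org.apache.flink.ml.common.{LabeledVector, WeightVector}

trait LossFunction {
  // loss (including the regularization term) for a single data point
  def lossValue(dataPoint: LabeledVector, weights: WeightVector): Double

  // gradient (including the regularization term) with respect to the weights
  def gradient(dataPoint: LabeledVector, weights: WeightVector): WeightVector
}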