Thanks dbtsai for the info.
Are you using a case pattern match here, i.e.:
case (response, vec) => ?
Also, what library do I need to import to use .toBreeze?
Thanks,
tri
-----Original Message-----
From: [email protected] [mailto:[email protected]]
Sent: Friday, December 12, 2014 3:27 PM
To: Bui, Tri
Cc: [email protected]
Subject: Re: Do I need to apply feature scaling via StandardScaler for LBFGS for Linear Regression?
You can do something like the following.
val rddVector = input.map {
  case (response, vec) =>
    val newVec = MLUtils.appendBias(vec)
    // appendBias added a trailing 1.0; overwrite it with the response so
    // the scaler standardizes the features and the response together
    newVec.toBreeze(newVec.size - 1) = response
    newVec
}
val scalerWithResponse = new StandardScaler(true, true).fit(rddVector)
val trainingData = scalerWithResponse.transform(rddVector).map { x =>
  // the last element holds the scaled response; the rest are the features
  (x(x.size - 1), Vectors.dense(x.toArray.slice(0, x.size - 1)))
}
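
These are the imports the snippet above relies on (Spark 1.x MLlib). Note that Vector.toBreeze has been package-private in some Spark versions, so it may not be callable from user code outside Spark's own packages:

import org.apache.spark.mllib.feature.StandardScaler
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.util.MLUtils
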
Sincerely,
DB Tsai
-------------------------------------------------------
My Blog: https://www.dbtsai.com
LinkedIn: https://www.linkedin.com/in/dbtsai
On Fri, Dec 12, 2014 at 12:23 PM, Bui, Tri <[email protected]> wrote:
> Thanks for the info.
>
> How do I use StandardScaler() to scale the example data (10246.0,[14111.0,1.0])?
>
> Thx
> tri
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]]
> Sent: Friday, December 12, 2014 1:26 PM
> To: Bui, Tri
> Cc: [email protected]
> Subject: Re: Do I need to apply feature scaling via StandardScaler for LBFGS for Linear Regression?
>
> It seems that your response is not scaled, which will cause issues in LBFGS.
> Typically, people train linear regression with zero-mean/unit-variance
> features and response, without training the intercept. Since the response is
> zero-mean, the intercept will always be zero. When you convert the
> coefficients back to the original space from the scaled space, the intercept
> can be computed as w0 = <y> - \sum_n <x_n> w_n, where <y> and <x_n> are the
> averages of the response and of column n, respectively.
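>
> As a rough sketch of that formula (the values and names below are purely
> illustrative, not Spark API, and it assumes the coefficients have already
> been converted back to the original space):
>
> // Hypothetical per-column averages <x_n> and response average <y>:
> val featureMeans = Array(14111.0, 1.0)
> val responseMean = 10246.0
> // Hypothetical original-space coefficients w_n:
> val wOriginal = Array(0.7, 0.3)
> // w0 = <y> - \sum_n <x_n> w_n
> val w0 = responseMean - featureMeans.zip(wOriginal).map { case (m, w) => m * w }.sum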
>
> Sincerely,
>
> DB Tsai
> -------------------------------------------------------
> My Blog: https://www.dbtsai.com
> LinkedIn: https://www.linkedin.com/in/dbtsai
>
>
> On Fri, Dec 12, 2014 at 10:49 AM, Bui, Tri <[email protected]>
> wrote:
>> Thanks for the confirmation.
>>
>> FYI, the code below works for a similar dataset, but with the feature
>> magnitude changed, LBFGS converged to the right weights.
>>
>> For example, a time-sequential feature with values 1, 2, 3, 4, 5 would
>> generate the error, while the sequential feature values 14111, 14112,
>> 14113, 14115 would converge to the right weights. Why?
>>
>> Below is the code applying StandardScaler() to the sample data
>> (10246.0,[14111.0,1.0]):
>>
>> val scaler1 = new StandardScaler().fit(train.map(x => x.features))
>> val train1 = train.map(x => (x.label, scaler1.transform(x.features)))
>>
>> But I keep getting the error: "value features is not a member of (Double,
>> org.apache.spark.mllib.linalg.Vector)"
>>
>> Should my feature vector be .toInt instead of Double?
>>
>> Also, shouldn't org.apache.spark.mllib.linalg.Vector in the error have an
>> "s", to match the imported library org.apache.spark.mllib.linalg.Vectors?
>>
>> Thanks
>> Tri
>>
>>
>>
>>
>>
>> -----Original Message-----
>> From: [email protected] [mailto:[email protected]]
>> Sent: Friday, December 12, 2014 12:16 PM
>> To: Bui, Tri
>> Cc: [email protected]
>> Subject: Re: Do I need to apply feature scaling via StandardScaler for LBFGS for Linear Regression?
>>
>> You need to apply the StandardScaler yourself to help convergence; LBFGS
>> just takes whatever objective function you provide, without doing any
>> scaling. I would like to provide a LinearRegressionWithLBFGS that does the
>> scaling internally in the near future.
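>>
>> A minimal sketch of doing that scaling yourself (Spark 1.x MLlib, assuming
>> the parsedata LabeledPoint RDD from the first message below):
>>
>> import org.apache.spark.mllib.feature.StandardScaler
>>
>> // standardize features to zero mean and unit variance before LBFGS
>> val scaler = new StandardScaler(withMean = true, withStd = true)
>>   .fit(parsedata.map(_.features))
>> val scaled = parsedata.map(p => (p.label, scaler.transform(p.features)))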
>>
>> Sincerely,
>>
>> DB Tsai
>> -------------------------------------------------------
>> My Blog: https://www.dbtsai.com
>> LinkedIn: https://www.linkedin.com/in/dbtsai
>>
>>
>> On Fri, Dec 12, 2014 at 8:49 AM, Bui, Tri
>> <[email protected]> wrote:
>>> Hi,
>>>
>>>
>>>
>>> Trying to use LBFGS as the optimizer, do I need to implement feature
>>> scaling via StandardScaler, or does LBFGS do it by default?
>>>
>>>
>>>
>>> The following code generated the error "Failure again! Giving up and
>>> returning. Maybe the objective is just poorly behaved?".
>>>
>>>
>>>
>>> import org.apache.spark.mllib.linalg.Vectors
>>> import org.apache.spark.mllib.optimization.{LBFGS, LeastSquaresGradient, SquaredL2Updater}
>>> import org.apache.spark.mllib.regression.LabeledPoint
>>> import org.apache.spark.mllib.util.MLUtils
>>>
>>> val data = sc.textFile("file:///data/Train/final2.train")
>>> val parsedata = data.map { line =>
>>>   val partsdata = line.split(',')
>>>   LabeledPoint(partsdata(0).toDouble,
>>>     Vectors.dense(partsdata(1).split(' ').map(_.toDouble)))
>>> }
>>>
>>> // append a bias term to each feature vector
>>> val train = parsedata.map(x => (x.label, MLUtils.appendBias(x.features))).cache()
>>>
>>> val numCorrections = 10
>>> val convergenceTol = 1e-4
>>> val maxNumIterations = 50
>>> val regParam = 0.1
>>> val initialWeightsWithIntercept = Vectors.dense(new Array[Double](2))
>>>
>>> val (weightsWithIntercept, loss) = LBFGS.runLBFGS(
>>>   train,
>>>   new LeastSquaresGradient(),
>>>   new SquaredL2Updater(),
>>>   numCorrections,
>>>   convergenceTol,
>>>   maxNumIterations,
>>>   regParam,
>>>   initialWeightsWithIntercept)
>>>
>>>
>>>
>>> Did I implement LBFGS for linear regression via "LeastSquaresGradient()"
>>> correctly?
>>>
>>>
>>>
>>> Thanks
>>>
>>> Tri
>>