> PCA will not 'improve' clustering per se but can make it faster.
> You may want to specify what you are actually trying to optimize.
>
>
> On Tue, Aug 9, 2016, 03:23 Rohit Chaddha
> wrote:
>
>> I would rather have fewer features to make better inferences on t
> ...more dominant in your classification, you can then run your
> model again with the smaller set of features.
> The two approaches are quite different: what I'm suggesting involves
> training (supervised learning) in the context of a target function; with
> SVD you are doing unsupervis
>
> >> Great question Rohit. I am in my early days of ML as well and it would
> be
> >> great if we get some idea on this from other experts on this group.
> >>
> >> I know we can reduce dimensions by using PCA, but I think that does not
> >> allow us
I have a data-set where each data-point has 112 factors.
I want to remove the factors which are not relevant, say reducing from these
112 factors to 20, and then do clustering of the data-points using these
20 factors.
How do I do this, and how do I figure out which 20 of the factors are
useful for the clustering?
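A minimal sketch of one way to do this with Spark ML's PCA in Java (the DataFrame df and the input column name "features" are assumptions, and note that PCA yields 20 derived components rather than 20 of the original factors):

import org.apache.spark.ml.feature.PCA;
import org.apache.spark.ml.feature.PCAModel;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

// Assumes df has a Vector column "features" holding the 112 factors.
PCA pca = new PCA()
    .setInputCol("features")
    .setOutputCol("pcaFeatures")
    .setK(20);                       // keep 20 components
PCAModel pcaModel = pca.fit(df);
Dataset<Row> reduced = pcaModel.transform(df);  // adds a "pcaFeatures" column

The clustering can then run on "pcaFeatures" instead of the raw 112 factors.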
The predict method takes a Vector object
I am unable to figure out how to create this Spark Vector object to get
predictions from my model.
Does anyone have some code in Java for this?
Thanks
Rohit
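A minimal sketch, assuming an mllib model whose predict takes an
org.apache.spark.mllib.linalg.Vector; the feature values here are made up:

import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.linalg.Vectors;

// Build a dense vector of feature values, in the same order used for training.
Vector features = Vectors.dense(0.5, 1.2, 3.4);
// model: e.g. an mllib regression or clustering model loaded earlier.
double prediction = model.predict(features);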
-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Running org.apache.spark.api.java.OptionalSuite
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.052 sec - in org.apache.spark.api.java.OptionalSuite
Running o
I have a custom object called A and a corresponding Dataset<A>.
When I call the datasetA.show() method, I get the following:
+---+---+----+------+---+
| id| da|like|values|uid|
+---+---+----+------+---+
|A.toString()...|
|A.toString().
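A sketch of how show() is normally driven by the encoder: if A is registered
as a Java bean, the columns come from its getters rather than from toString().
The spark session and the listOfA collection are assumptions:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;

// Encode A as a bean so each property (id, da, like, values, uid) becomes a column.
Dataset<A> datasetA = spark.createDataset(listOfA, Encoders.bean(A.class));
datasetA.show();  // one column per bean property, not A.toString()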
After looking at the comments, I am not sure what the proposed fix is.
On Fri, Jul 29, 2016 at 12:47 AM, Sean Owen wrote:
> Ah, right. This wasn't actually resolved. Yeah your input on 15899
> would be welcome. See if the proposed fix helps.
>
> On Thu, Jul 28, 2016 at 11:52
On Fri, Jul 29, 2016 at 12:06 AM, Rohit Chaddha
wrote:
> I am simply trying to do
> session.read().json("file:///C:/data/a.json");
>
> in 2.0.0-preview it was working fine with
> sqlContext.read().json("C:/data/a.json");
>
>
> -Rohit
>
> On Fri, J
> ...certainly be an absolute
> URI with an absolute path. What exactly is your input value for this
> property?
>
> On Thu, Jul 28, 2016 at 11:28 AM, Rohit Chaddha
> wrote:
> > Hello Sean,
> >
> > I have tried both file:/ and file:///
> > But it does not work and gives the same error
Hello Sean,
I have tried both file:/ and file:///
But it does not work and gives the same error.
-Rohit
On Thu, Jul 28, 2016 at 11:51 PM, Sean Owen wrote:
> IIRC that was fixed, in that this is actually an invalid URI. Use
> file:/C:/... I think.
>
> On Thu, Jul 28, 2016 at 10:
My bad. Please ignore this question.
I accidentally reverted to sparkContext, which caused the issue.
On Thu, Jul 28, 2016 at 11:36 PM, Rohit Chaddha
wrote:
> In Spark 2.0 there is an additional parameter of type ClassTag in the
> broadcast method of the sparkContext
>
> What is this variable, and how do I do a broadcast now?
In Spark 2.0 there is an additional parameter of type ClassTag in the
broadcast method of the sparkContext.
What is this variable, and how do I do a broadcast now?
Here is my existing code with 2.0.0-preview:
Broadcast<...> b = jsc.broadcast(u.collectAsMap());
What changes need to be made in 2.0 for this?
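A sketch of the fix implied above: the Scala SparkContext's broadcast needs an
explicit ClassTag, while JavaSparkContext supplies it for you, so wrapping the
session's context keeps the old call working (the map's key/value types here
are stand-ins):

import java.util.Map;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.broadcast.Broadcast;

// Wrap the Scala SparkContext; JavaSparkContext.broadcast handles the ClassTag.
JavaSparkContext jsc = new JavaSparkContext(session.sparkContext());
Broadcast<Map<String, String>> b = jsc.broadcast(u.collectAsMap());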
I upgraded from 2.0.0-preview to 2.0.0
and I started getting the following error:
Caused by: java.net.URISyntaxException: Relative path in absolute URI:
file:C:/ibm/spark-warehouse
Any ideas how to fix this?
-Rohit
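A workaround that has been suggested for this error is to set
spark.sql.warehouse.dir to a proper file:/// URI when building the session; a
sketch, reusing the path from the stack trace above:

import org.apache.spark.sql.SparkSession;

// Point the warehouse dir at an absolute file URI so it parses on Windows.
SparkSession session = SparkSession.builder()
    .master("local[*]")
    .config("spark.sql.warehouse.dir", "file:///C:/ibm/spark-warehouse")
    .getOrCreate();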
It is present in mllib, but I don't seem to find it in the ml package.
Any suggestions, please?
-Rohit
Hi Krishna,
Great! I had no idea about this. I tried your suggestion of using
na.drop() and got an RMSE of 1.5794048211812495.
Any suggestions on how this can be reduced and the model improved?
Regards,
Rohit
On Mon, Jul 25, 2016 at 4:12 AM, Krishna Sankar wrote:
> Thanks Nick. I also ran into t
Great, thanks to both of you. I was struggling with this issue as well.
-Rohit
On Mon, Jul 25, 2016 at 4:12 AM, Krishna Sankar wrote:
> Thanks Nick. I also ran into this issue.
> VG, one workaround is to drop the NaNs from the predictions (df.na.drop()) and
> then use the dataset for the evaluator. In
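A sketch of that workaround in Java; the model, test set, and column names are
assumptions:

import org.apache.spark.ml.evaluation.RegressionEvaluator;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

// Drop rows whose prediction is NaN (e.g. ALS cold-start users/items),
// then compute RMSE on the remaining rows.
Dataset<Row> predictions = model.transform(testData).na().drop();
double rmse = new RegressionEvaluator()
    .setMetricName("rmse")
    .setLabelCol("rating")
    .setPredictionCol("prediction")
    .evaluate(predictions);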