Hi All,
I'm trying a simple K-Means example as per the website:
val parsedData = data.map(s => Vectors.dense(s.split(',').map(_.toDouble)))
but first I want to write a Java-based validation method so that
missing values are omitted or replaced with 0.
public RDD<Vector> prepareKMeans(JavaRDD<String> data)
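A minimal sketch of what such a method might look like (the expected field
count, the comma delimiter, and zeroing out unparsable values are my
assumptions, not from the original message):

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.linalg.Vectors;
import org.apache.spark.rdd.RDD;

// Hypothetical sketch: drop rows that don't have the expected number of
// fields and replace missing or non-numeric values with 0.0.
static RDD<Vector> prepareKMeans(JavaRDD<String> data, int expectedFields) {
    return data
        .map(s -> s.split(",", -1))                      // keep trailing empty fields
        .filter(fields -> fields.length == expectedFields)
        .map(fields -> {
            double[] values = new double[fields.length];
            for (int i = 0; i < fields.length; i++) {
                try {
                    values[i] = Double.parseDouble(fields[i].trim());
                } catch (NumberFormatException e) {
                    values[i] = 0.0;                     // missing/malformed value
                }
            }
            return Vectors.dense(values);
        })
        .rdd();
}

KMeans.train(prepareKMeans(data, 2), k, maxIterations) would then run on the
cleaned vectors.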
> ...filter out anything shorter than 2, and convert the rest to dense
> vectors? ...In fact, if you're expecting length exactly 2, you might want
> to filter ==2...
>
>
> On Thu, Jan 8, 2015 at 10:58 AM, Devl Devel
> wrote:
>
>> Hi All,
>>
>> I'm trying a simple K-Means example as
ala#L134
>
> That retags RDDs which were created from Java to prevent the exception
> you're running into.
>
> Hope this helps!
> Joseph
>
> On Thu, Jan 8, 2015 at 12:48 PM, Devl Devel
> wrote:
>
>> Thanks for the suggestion, can anyone offer any advice o
Thanks, that helps a bit, at least with the NaN, but the MSE is still very
high even with that step size and 10k iterations:
training Mean Squared Error = 3.3322561285919316E7
Does this method need, say, 100k iterations?
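For reference, a rough sketch of how that training MSE can be computed with
the Java API (the helper and its names are hypothetical):

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.mllib.regression.LabeledPoint;
import org.apache.spark.mllib.regression.LinearRegressionModel;

// Hypothetical helper: mean of the squared prediction errors over the training set.
static double trainingMSE(LinearRegressionModel model, JavaRDD<LabeledPoint> training) {
    JavaRDD<Double> squaredErrors = training.map(p -> {
        double err = model.predict(p.features()) - p.label();
        return err * err;
    });
    return squaredErrors.reduce((a, b) -> a + b) / training.count();
}

Note that the MSE is in squared units of the label, so a large label scale
alone makes it look huge; for SGD, feature scaling usually matters more than
the raw iteration count.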
On Thu, Jan 15, 2015 at 5:42 PM, Robin East wrote:
> -dev, +user
>
> You
SPARK-5273
On Thu, Jan 15, 2015 at 8:23 PM, Joseph Bradley
wrote:
> It looks like you're training on the non-scaled data but testing on the
> scaled data. Have you tried this training & testing on only the scaled
> data?
>
> On Thu, Jan 15, 2015 at 10:42 AM, Devl Devel wrote:
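A hedged sketch of the suggestion above - fit a StandardScaler on the
training features and use the same scaler for both training and test data
(the method, iteration count, and step size are illustrative, not from the
thread):

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.mllib.feature.StandardScaler;
import org.apache.spark.mllib.feature.StandardScalerModel;
import org.apache.spark.mllib.regression.LabeledPoint;
import org.apache.spark.mllib.regression.LinearRegressionModel;
import org.apache.spark.mllib.regression.LinearRegressionWithSGD;

// Hypothetical sketch: scale training and test data with the SAME scaler,
// then both train and evaluate on the scaled data only.
static LinearRegressionModel trainOnScaledData(JavaRDD<LabeledPoint> training,
                                               JavaRDD<LabeledPoint> test) {
    // Fit the scaler on the training features only (withMean=true needs dense vectors).
    StandardScalerModel scaler =
        new StandardScaler(true, true).fit(training.map(LabeledPoint::features).rdd());

    JavaRDD<LabeledPoint> scaledTraining =
        training.map(p -> new LabeledPoint(p.label(), scaler.transform(p.features())));
    JavaRDD<LabeledPoint> scaledTest =
        test.map(p -> new LabeledPoint(p.label(), scaler.transform(p.features())));

    // Iteration count and step size here are placeholders.
    LinearRegressionModel model =
        LinearRegressionWithSGD.train(scaledTraining.rdd(), 100, 0.1);

    // Evaluate on the scaled test set, not the raw one.
    double testMSE = scaledTest
        .map(p -> { double e = model.predict(p.features()) - p.label(); return e * e; })
        .reduce((a, b) -> a + b) / scaledTest.count();
    System.out.println("test MSE = " + testMSE);
    return model;
}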
Hi Spark Developers,
First, apologies if this doesn't belong on this list, but the
comments/praise are relevant to all developers. This is just a small note
about what we really like about Spark; I/we don't mean to start a whole
long discussion thread in this forum, just to share our positive
experience.
Hi Dmitriy,
Thanks for the input. As per my previous email, I think it would be good to
have a bridge project that, for example, creates an IgniteFS RDD, similar to
the JDBC or HDFS ones, from which we can extract blocks and populate RDD
partitions. I'll post this proposal on your list.
Thanks
Devl
O
Hey All,
start-slaves.sh and stop-slaves.sh use SSH to connect to the remote worker
hosts. Are there alternative methods to do this without SSH?
For example, using:
./bin/spark-class org.apache.spark.deploy.worker.Worker spark://IP:PORT
is fine, but there is no way to kill the Worker without using SSH.
Hi All
We are having some trouble with:
sparkConf.set("spark.driver.userClassPathFirst","true");
sparkConf.set("spark.executor.userClassPathFirst","true");
and would appreciate some independent verification. The issue comes down to
this:
Spark 1.3.1 with Hadoop 2.6 is deployed on the cluster. In my
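Not from the original message, but one way to independently verify which jar
the executors actually load a contested class from (the class name passed in
is a placeholder) is something like:

import java.security.CodeSource;
import java.util.Arrays;
import org.apache.spark.api.java.JavaSparkContext;

// Hypothetical check: report which jar a given class is loaded from inside a
// task, to see whether spark.executor.userClassPathFirst is taking effect.
static void reportClassOrigin(JavaSparkContext sc, String className) {
    sc.parallelize(Arrays.asList(1), 1)
      .map(i -> {
          CodeSource src = Class.forName(className).getProtectionDomain().getCodeSource();
          return className + " loaded from "
              + (src == null ? "<bootstrap/unknown>" : src.getLocation());
      })
      .collect()
      .forEach(System.out::println);
}

Running the body of that lambda directly in the driver shows the driver-side
ordering for spark.driver.userClassPathFirst.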
Hi All,
I've built Spark 1.5.0 with Hadoop 2.6 from a fresh download:
build/mvn -Phadoop-2.6 -Dhadoop.version=2.6.0 -DskipTests clean package
When I try to run SparkR, it launches plain R without the Spark add-ons:
./bin/sparkR --master local[*]
Picked up JAVA_TOOL_OPTIONS: -javaagent:/usr/shar
Hi
So far I've been managing to build Spark from source, but since a change in
spark-streaming-flume I have no idea how to generate classes (e.g.
SparkFlumeProtocol) from the Avro schema.
I have used sbt to run avro:generate (from the top-level Spark dir), but it
produces nothing - it just says:
>
> Sent from my iPhone
>
> > On Aug 11, 2014, at 8:32 AM, Hari Shreedharan
> wrote:
> >
> > Just running sbt compile or assembly should generate the sources.
> >
> >> On Monday, August 11, 2014, Devl Devel
> wrote:
> >>
> >> Hi
> >>
>
When compiling the master checkout of Spark, the IntelliJ build fails
with:
Error:(45, 8) not found: value $scope
^
which is caused by HTML elements in classes like HistoryPage.scala:
val content =
...
How can I compile these classes that have HTML node elements?