Thank you all very much for your responses.
We are going to test these recommendations.
Adnan, regarding the HDFS URI, this is actually how we are already
accessing the file system. It was simply removed from the post.
Thank you,
Asaf
On Thu, Apr 10, 2014 at 5:33 PM, Sha
On Sat, Apr 12, 2014 at 9:19 AM, ge ko wrote:
> Hi,
>
> I'm wondering why the master is registering itself at startup, exactly 3
> times (same number as the number of workers). Log excerpt:
> ""
> 2014-04-11 21:08:15,363 INFO akka.event.slf4j.Slf4jLogger: Slf4jLogger
> started
> 2014-04-11 21:08:1
Hi Tom,
Thank you very much for your detailed explanation. I think it is very
helpful to me.
On Sat, Apr 12, 2014 at 1:06 PM, Tom V wrote:
> The last writer is suggesting using the triangle inequality to cut down
> the search space. If c is the centroid of cluster C, then the closest any
> p
The last writer is suggesting using the triangle inequality to cut down the
search space. If c is the centroid of cluster C, then the closest any
point in C is to x is ||x-c|| - r(C), where r(C) is the (precomputed)
radius of the cluster---the distance of the farthest point in C to c.
Whether you
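A rough sketch of the bound described above, in plain Python. This is not code from the thread: the helper names (dist, build_clusters, nearest_neighbor) are made up for illustration, and Euclidean distance is an assumption. A cluster is skipped whenever ||x-c|| - r(C) is already no better than the best distance found so far.

```python
import math

def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_clusters(points, centroids):
    """Assign each point to its nearest centroid and precompute r(C),
    the distance from the centroid to its farthest member."""
    clusters = [{"c": c, "members": [], "r": 0.0} for c in centroids]
    for p in points:
        k = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        clusters[k]["members"].append(p)
        clusters[k]["r"] = max(clusters[k]["r"], dist(p, centroids[k]))
    return clusters

def nearest_neighbor(x, clusters):
    """Exact nearest neighbour of x, skipping clusters that cannot win.
    Returns (point, distance, number_of_clusters_skipped)."""
    best, best_d, skipped = None, math.inf, 0
    # Visiting the closest centroids first tends to tighten best_d early.
    for cl in sorted(clusters, key=lambda cl: dist(x, cl["c"])):
        # Triangle inequality: no member of C is closer to x than this.
        lower_bound = dist(x, cl["c"]) - cl["r"]
        if lower_bound >= best_d:
            skipped += 1
            continue
        for p in cl["members"]:
            d = dist(x, p)
            if d < best_d:
                best, best_d = p, d
    return best, best_d, skipped
```

The result is the same as a brute-force scan; the bound only saves distance computations, and how much it saves depends on how tight the clusters are.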
Hi Guillaume,
This sounds like a good idea to me. I am a newbie here. Could you further
explain how you will determine which clusters to keep? According to the
distance between each element and each cluster center?
Will you keep several clusters for each element for searching nearest
neighbours? Thank
In Spark release 0.7.1, I added support for running multiple worker processes
on a single slave machine. I built it for performance testing multiple
workers on a single machine in standalone mode.
Set the following in conf/spark-env.sh and bounce your cluster:
export SPARK_WORKER_INSTANCES=3
Th
Thanks, Prabeesh. I figured it out. The java file did conflict with the scala
file. Thanks for the hint.
Jmaes
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Compile-SimpleApp-scala-encountered-error-please-can-any-one-help-tp4160p4168.html
Sent from the A
prabeesh
Thanks for the reply. By one copy of SimpleApp.scala, do you mean one copy
of this .scala file? I only have one in a newly created test project. I do
have one copy of SimpleApp.java but in a different directory
(src/main/java), .scala file is in src/main/scala directory. Will java and
scal
Hi,
I'm doing this here for multiple tens of millions of elements (and
the goal is to reach multiple billions), on a relatively small
cluster (7 nodes 4 cores 32GB RAM). We use multiprobe KLSH. All you
have to do is run a Kmeans on your data, then compute the distanc
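One possible reading of the k-means bucketing idea, sketched in plain Python. Everything here is an assumption for illustration, not the poster's implementation: the cluster centers are taken as already computed, the helper names are hypothetical, and "multiprobe" is interpreted as searching the few buckets whose centers are closest to the query.

```python
import math
from collections import defaultdict

def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_centers(p, centers, probes):
    """Indices of the `probes` centers closest to p, nearest first."""
    order = sorted(range(len(centers)), key=lambda i: dist(p, centers[i]))
    return order[:probes]

def build_buckets(points, centers, probes=1):
    """Map each center index to the indices of the points hashed to it.
    With probes > 1 a point is stored in several buckets."""
    buckets = defaultdict(list)
    for idx, p in enumerate(points):
        for k in nearest_centers(p, centers, probes):
            buckets[k].append(idx)
    return buckets

def approx_nearest(x, points, centers, buckets, probes=2):
    """Approximate nearest neighbour: exact search restricted to the
    buckets of the `probes` centers closest to x. Returns a point index,
    or None if those buckets are empty."""
    candidates = set()
    for k in nearest_centers(x, centers, probes):
        candidates.update(buckets.get(k, []))
    if not candidates:
        return None
    return min(candidates, key=lambda i: dist(x, points[i]))
```

Raising probes trades speed for recall: more buckets are scanned per query, so a true neighbour sitting just across a cluster boundary is less likely to be missed.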
Hi Reza,
Thank you for your information. I will try it.
On Fri, Apr 11, 2014 at 11:21 PM, Reza Zadeh wrote:
> Hi Xiaoli,
>
> There is a PR currently in progress to allow this, via the sampling scheme
> described in this paper: stanford.edu/~rezab/papers/dimsum.pdf
>
> The PR is at https://git
Hi,
I'm starting to use Spark and have installed Spark within CDH5 using
Cloudera Manager.
I set up one master (hadoop-pg-5) and 3 workers (hadoop-pg-7[-8,-9]).
Master WebUI looks good, all workers seem to be registered.
If I open "spark-shell" and try to execute the wordcount example, the
executio
Hi,
I'm wondering why the master is registering itself at startup, exactly 3
times (same number as the number of workers). Log excerpt:
""
2014-04-11 21:08:15,363 INFO akka.event.slf4j.Slf4jLogger: Slf4jLogger
started
2014-04-11 21:08:15,478 INFO Remoting: Starting remoting
2014-04-11 21:08:15,838