Hi all,
Can somebody point me to the implementation of predict() in
LogisticRegressionModel of Spark MLlib? I could find a predictPoint() in the
class LogisticRegressionModel, but where is predict()?
Thanks & Regards, Meethu M
Hi,
We are using Mesos fine-grained mode because it lets multiple instances of
Spark share machines, with each application getting resources allocated
dynamically. Thanks & Regards, Meethu M
On Wednesday, 4 November 2015 5:24 AM, Reynold Xin
wrote:
If you are using Spark with M
Hi,
Please refer to the Java code examples at
http://spark.apache.org/docs/latest/ml-guide.html#example-pipeline. You can
add new stages to the pipeline as shown in that example. Thanks & Regards,
Meethu M
On Monday, 12 October 2015 1:52 PM, Nethaji Chandrasiri
wrote:
Hi,
Are th
Try coalesce(1) before writing. Thanks & Regards, Meethu M
On Tuesday, 15 September 2015 6:49 AM, java8964
wrote:
g 12, 2015 at 3:08 PM, Burak Yavuz wrote:
Are you running from master? Could you delete line 222 of
make-distribution.sh? We updated when we build sparkr.zip. I'll submit a fix
for it for 1.5 and master.
Burak
On Wed, Aug 12, 2015 at 3:31 AM, MEETHU MATHEW wrote:
Hi, I am trying to create
Hi,
Try using coalesce(1) before calling saveAsTextFile(). Thanks & Regards,
Meethu M
On Wednesday, 5 August 2015 7:53 AM, Brandon White
wrote:
What is the best way to make saveAsTextFile save as only a single file?
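A minimal PySpark sketch of the coalesce(1) suggestion above (the HDFS paths are placeholders); coalesce(1) merges all partitions into one, so saveAsTextFile writes a single part file, which must fit on a single executor:

from pyspark import SparkContext

sc = SparkContext(appName="single-file-output")
rdd = sc.textFile("hdfs:///data/input")  # placeholder input path
# One partition in, one part file out.
rdd.coalesce(1).saveAsTextFile("hdfs:///data/output-single")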
Hi,
I am getting the assertion error while trying to run build/sbt unidoc, the same
as you described in "Building scaladoc using "build/sbt unidoc" failure". Could
you tell me how you got it working?
Try using coalesce. Thanks & Regards,
Meethu M
On Wednesday, 3 June 2015 11:26 AM, "ÐΞ€ρ@Ҝ (๏̯͡๏)"
wrote:
I am running a series of Spark functions with 9000 executors and it is resulting
in 9000+ files, which is exceeding the namespace file count quota.
How can Spark be configured to
hreads within
> a function or do you want to run multiple jobs using multiple threads? I am
> wondering why the Python thread module can't be used? Or have you already given
> it a try?
>
> On 18 May 2015 16:39, "MEETHU MATHEW" wrote:
>>
>> Hi Akhil,
>>
>>
Hi,
I think you can't supply an initial set of centroids to KMeans. Thanks &
Regards,
Meethu M
On Friday, 15 May 2015 12:37 AM, Suman Somasundar
wrote:
Hi,
I want to run a definite number of iterations in KMeans. There is a command
line argument to set maxIterations, but even if I
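A minimal PySpark sketch of capping the iteration count in MLlib k-means (the input path, parsing, and k are placeholders); maxIterations bounds the number of iterations, though the algorithm may converge and stop earlier:

from pyspark import SparkContext
from pyspark.mllib.clustering import KMeans

sc = SparkContext(appName="kmeans-iterations")
# Placeholder input: one whitespace-separated numeric vector per line.
data = sc.textFile("hdfs:///data/points.txt").map(
    lambda line: [float(x) for x in line.split()])
model = KMeans.train(data, k=3, maxIterations=20)  # at most 20 iterations
print(model.clusterCenters)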
you happened to have a look at the Spark Job Server? Someone wrote a
Python wrapper around it; give it a try.
Thanks
Best Regards
On Thu, May 14, 2015 at 11:10 AM, MEETHU MATHEW wrote:
Hi all,
Quote "Inside a given Spark application (SparkContext instance), multiple
parallel job
Hi all,
Quote "Inside a given Spark application (SparkContext instance), multiple
parallel jobs can run simultaneously if they were submitted from separate
threads. "
How to run multiple jobs in one SparkContext using separate threads in PySpark?
I found some examples in Scala and Java, but co
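A minimal sketch of the threads approach in PySpark (the paths and the job body are placeholders): each thread calls an action on the one shared SparkContext, and the scheduler runs those jobs concurrently:

import threading
from pyspark import SparkContext

sc = SparkContext(appName="multi-job")

def count_lines(path):
    # Each action submitted here becomes a separate job in the shared context.
    print("%s: %d" % (path, sc.textFile(path).count()))

threads = [threading.Thread(target=count_lines, args=(p,))
           for p in ["hdfs:///data/a.txt", "hdfs:///data/b.txt"]]  # placeholder paths
for t in threads:
    t.start()
for t in threads:
    t.join()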
Hi all,
I started spark-shell in spark-1.3.0 and did some actions. The UI was showing 8
cores under the Running Applications tab. But when I exited the spark-shell
using exit, the application is moved to the Completed Applications tab and the
number of cores is 0. Again when I exited the spark-shell
Hi,
I am trying to run the examples of Spark (master branch from git) from
IntelliJ (14.0.2) but facing errors. These are the steps I followed:
1. git clone the master branch of Apache Spark.
2. Build it using mvn -DskipTests clean install
3. In IntelliJ select Import Project and choose the POM.xml
Hi,
I am not able to read from HDFS (Intel Distribution of Hadoop, Hadoop version
1.0.3) from spark-shell (Spark version 1.2.1). I built Spark using the command
mvn -Dhadoop.version=1.0.3 clean package, started spark-shell, and read an HDFS
file using sc.textFile(), and the exception is
WARN
Hi,
Try this: change spark-mllib to spark-mllib_2.10.
libraryDependencies ++= Seq(
  "org.apache.spark" % "spark-core_2.10" % "1.1.1",
  "org.apache.spark" % "spark-mllib_2.10" % "1.1.1"
)
Thanks & Regards,
Meethu M
On Friday, 12 December 2014 12:22 PM, amin mohebbi
wrote:
I'm trying to bu
ovember 2014 2:39 PM, MEETHU MATHEW
wrote:
Hi, I have a similar problem. I modified the code in mllib and examples. I did
mvn install -pl mllib
mvn install -pl examples
But when I run the program in examples using run-example, the older version of
mllib (before the changes were ma
Hi, I have a similar problem. I modified the code in mllib and examples. I did
mvn install -pl mllib
mvn install -pl examples
But when I run the program in examples using run-example, the older version of
mllib (before the changes were made) is getting executed. How to get the changes
made in mllib
Hi,
I was also trying ISpark, but I couldn't even start the notebook. I am getting
the following error.
ERROR:tornado.access:500 POST /api/sessions (127.0.0.1) 10.15ms
referer=http://localhost:/notebooks/Scala/Untitled0.ipynb
How did you start the notebook?
Thanks & Regards,
Meethu M
O
Hi,
This question was asked earlier and I did it in the way specified. I am
getting java.lang.ClassNotFoundException.
Can somebody explain all the steps required to build a Spark app using IntelliJ
(latest version), starting from creating the project to running it? I searched a
lot but couldn't
Hi all,
My code was working fine in Spark 1.0.2, but after upgrading to 1.1.0, it is
throwing exceptions and tasks are failing.
The code contains some map and filter transformations followed by groupByKey
(reduceByKey in another version of the code). What I could find out is that the
code works fine un
Try to set --total-executor-cores to limit how many total cores it can use.
Thanks & Regards,
Meethu M
On Thursday, 2 October 2014 2:39 AM, Akshat Aranya wrote:
I guess one way to do so would be to run >1 worker per node, like say, instead
of running 1 worker and giving it 8 cores, you c
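For a PySpark application, the same cap can also be set programmatically through spark.cores.max, the property that --total-executor-cores sets on a standalone cluster (the app name and the value 8 below are placeholders):

from pyspark import SparkConf, SparkContext

# spark.cores.max limits the total number of cores the application may claim
# across the whole cluster.
conf = SparkConf().setAppName("capped-cores").set("spark.cores.max", "8")
sc = SparkContext(conf=conf)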
Hi all,
I need the k-means code written against PySpark for some testing purposes.
Can somebody tell me the difference between these two files:
spark-1.0.1/examples/src/main/python/kmeans.py and
spark-1.0.1/python/pyspark/mllib/clustering.py
Thanks & Regards,
Meethu M
,")
((fileds(0),fileds(1)), fileds(2).toDouble)
})
val d2 = d1.reduceByKey(_+_)
d2.foreach(println)
2014-08-28 20:04 GMT+08:00 MEETHU MATHEW :
Hi all,
>
>
>I have an RDD which has values in the format "id,date,cost".
>
>
>I want to group the elements
Hi all,
I have an RDD which has values in the format "id,date,cost".
I want to group the elements based on the id and date columns and get the sum
of the cost for each group.
Can somebody tell me how to do this?
Thanks & Regards,
Meethu M
Hi,
Please try changing the worker memory such that worker memory > executor
memory.
Thanks & Regards,
Meethu M
On Friday, 22 August 2014 5:18 PM, Yadid Ayzenberg wrote:
Hi all,
I have a spark cluster of 30 machines, 16GB / 8 cores on each running in
standalone mode. Previously my
Hi ,
How to increase the heap size?
What is the difference between spark executor memory and heap size?
Thanks & Regards,
Meethu M
On Monday, 18 August 2014 12:35 PM, Akhil Das
wrote:
I believe spark.shuffle.memoryFraction is the one you are looking for.
spark.shuffle.memoryFraction
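A minimal PySpark sketch of setting both properties (the values are placeholders): spark.executor.memory is the JVM heap given to each executor, and spark.shuffle.memoryFraction (Spark 1.x) is the fraction of that heap used for shuffle buffers:

from pyspark import SparkConf, SparkContext

conf = (SparkConf()
        .setAppName("memory-tuning")
        .set("spark.executor.memory", "4g")            # JVM heap per executor
        .set("spark.shuffle.memoryFraction", "0.4"))   # Spark 1.x: shuffle share of that heap
sc = SparkContext(conf=conf)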
Hi all,
Sorry for taking up this topic again; I am still confused about this.
I set SPARK_DAEMON_JAVA_OPTS="-XX:+UseCompressedOops -Xmx8g"
When I run my application, I got the following line in the logs.
Spark Command: java -cp
::/usr/local/spark-1.0.1/conf:/usr/local/spark-1.0.1/assembly
Hi,
Instead of spark://10.1.3.7:7077, use spark://vmsparkwin1:7077. Try this:
$ ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master
> spark://vmsparkwin1:7077 --executor-memory 1G --total-executor-cores 2
> ./lib/spark-examples-1.0.0-hadoop2.2.0.jar 10
Thanks & Regards,
Meethu
17, 2014 at 1:35 PM, MEETHU MATHEW wrote:
>
> Hi all,
>
>
>I just upgraded to spark 1.0.1. In spark 1.0.0 when I start Ipython notebook
>using the following command,it used to come in the running applications tab in
>master:8080 web UI.
>
>
>IPYTHON_OPTS="noteboo
Hi all,
I just upgraded to Spark 1.0.1. In Spark 1.0.0, when I started the IPython
notebook using the following command, it used to appear under the Running
Applications tab in the master:8080 web UI.
IPYTHON_OPTS="notebook --pylab inline" $SPARK_HOME/bin/pyspark
But now when I run it, it is not getting listed
Hi all,
I want to know how collect() works and how it is different from take(). I am
just reading a 330 MB file which has 4.3 million rows with 13 columns and
calling take(430) to save the result to a variable. But the same is not working
with collect(). So is there any difference in the operation of the two?
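A minimal sketch of the difference (the path is a placeholder): take(n) returns only the first n rows and typically scans only as many partitions as it needs, while collect() ships every row to the driver, so the whole dataset must fit in driver memory:

from pyspark import SparkContext

sc = SparkContext(appName="take-vs-collect")
rdd = sc.textFile("hdfs:///data/file.csv")   # placeholder for the 330 MB input

head = rdd.take(430)         # first 430 rows only; light on driver memory
everything = rdd.collect()   # ALL rows pulled to the driver; can exhaust driver memory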
set SPARK_PUBLIC_DNS or something of
that kind? This error suggests the worker is trying to bind a server on the
master's IP, which clearly doesn't make sense
On Mon, Jun 30, 2014 at 11:59 PM, MEETHU MATHEW wrote:
Hi,
>
>
>I did netstat -na | grep 192.168.125.174 and its
which is just a list of your hosts, one per
line. Then you can just use "./sbin/start-slaves.sh" to start the worker on all
of your machines.
Note that this is already setup correctly if you're using the spark-ec2 scripts.
On Tue, Jul 1, 2014 at 5:53 AM, MEETHU MATHEW w
128m -Dspark.akka.logLifecycleEvents=true -Xms512m -Xmx512m
org.apache.spark.deploy.worker.Worker spark://x.x.x.174:7077
Thanks
Best Regards
On Tue, Jul 1, 2014 at 6:08 PM, MEETHU MATHEW wrote:
>
> Hi ,
>
>
>I am using Spark Standalone mode with one master and 2 slaves.I am not able
>to start the wor
Hi ,
I am using Spark standalone mode with one master and 2 slaves. I am not able to
start the workers and connect them to the master using
./bin/spark-class org.apache.spark.deploy.worker.Worker spark://x.x.x.174:7077
The log says
Exception in thread "main" org.jboss.netty.channel.ChannelE
khil Das wrote:
Are you sure you have this ip 192.168.125.174 bind for that machine? (netstat
-na | grep 192.168.125.174)
Thanks
Best Regards
On Mon, Jun 30, 2014 at 5:34 PM, MEETHU MATHEW wrote:
Hi all,
>
>
>I reinstalled Spark, rebooted the system, but still I am not able
Hi,
Try setting --driver-java-options with spark-submit or set
spark.executor.extraJavaOptions in spark-defaults.conf.
Thanks & Regards,
Meethu M
On Monday, 30 June 2014 1:28 PM, hansen wrote:
Hi,
When I send the following statements in spark-shell:
val file =
sc.textFile("hdfs://names
m -Xmx512m
org.apache.spark.deploy.worker.Worker spark://master:7077
Can somebody suggest a solution?
Thanks & Regards,
Meethu M
On Friday, 27 June 2014 4:28 PM, MEETHU MATHEW wrote:
Hi,
Yes, I tried setting another port also, but the same problem.
master is set in /etc/hosts
Thanks & Regards,
Meethu
74:0 :/
Check the IP address of that master machine (ifconfig); it looks like the IP
address has been changed (hoping you are running these machines on a LAN).
Thanks
Best Regards
On Fri, Jun 27, 2014 at 12:00 PM, MEETHU MATHEW wrote:
Hi all,
>
>
>My Spark(Standalone mode) was runn
Hi all,
My Spark (standalone mode) was running fine till yesterday. But now I am getting
the following exception when I run start-slaves.sh or start-all.sh:
slave3: failed to launch org.apache.spark.deploy.worker.Worker:
slave3: at
java.util.concurrent.ThreadPoolExecutor$Worker.run(Thre
Hi all,
I have a doubt regarding the options in spark-env.sh. I set the following
values in the file on the master and the 2 workers:
SPARK_WORKER_MEMORY=7g
SPARK_EXECUTOR_MEMORY=6g
SPARK_DAEMON_JAVA_OPTS+="-Dspark.akka.timeout=30
-Dspark.akka.frameSize=1 -Dspark.blockManagerHeartBeatMs=80
ffle.spill to false?
>
>
>
>2014-06-17 5:59 GMT-07:00 MEETHU MATHEW :
>
>
>
>>
>> Hi all,
>>
>>
>>I want to do a recursive leftOuterJoin between an RDD (created from file)
>>with 9 million rows(size of the file is 100MB) and 30 other RDDs(c
Hi Jianshi,
I have used wildcard characters (*) in my program and it worked.
My code was like this:
b = sc.textFile("hdfs:///path to file/data_file_2013SEP01*")
Thanks & Regards,
Meethu M
On Wednesday, 18 June 2014 9:29 AM, Jianshi Huang
wrote:
It would be convenient if Spark's textFi
Hi all,
I want to do a recursive leftOuterJoin between an RDD (created from a file)
with 9 million rows (the size of the file is 100 MB) and 30 other RDDs (created
from 30 different files, one in each iteration of a loop) varying from 1 to 6
million rows. When I run it for 5 RDDs, it runs successfully in
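For reference, a minimal PySpark sketch of the pattern described above (the paths, the key extraction, and the loop bound are placeholders): the base RDD is repeatedly left-outer-joined against a smaller keyed RDD built inside each loop iteration:

from pyspark import SparkContext

sc = SparkContext(appName="iterative-left-outer-join")

def keyed(path):
    # Placeholder layout: the first comma-separated field is the join key.
    return sc.textFile(path).map(lambda line: (line.split(",")[0], line))

result = keyed("hdfs:///data/base.csv")      # the ~9 million row RDD
for i in range(30):                          # one smaller RDD per iteration
    result = result.leftOuterJoin(keyed("hdfs:///data/part_%d.csv" % i))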
4 AM, MEETHU MATHEW wrote:
> Hi,
> I am getting ArrayIndexOutOfBoundsException while reading from bz2 files in
> HDFS.I have come across the same issue in JIRA at
> https://issues.apache.org/jira/browse/SPARK-1861, but it seems to be
> resolved. I have tried the workaround suggested(S
Akhil Das wrote:
Can you paste the piece of code!?
Thanks
Best Regards
On Mon, Jun 9, 2014 at 5:24 PM, MEETHU MATHEW wrote:
Hi,
>I am getting ArrayIndexOutOfBoundsException while reading from bz2 files in
>HDFS.I have come across the same issue in JIRA at
>https://issues.apac
Hi,
I am getting ArrayIndexOutOfBoundsException while reading from bz2 files in
HDFS. I have come across the same issue in JIRA at
https://issues.apache.org/jira/browse/SPARK-1861, but it seems to be resolved.
I have tried the workaround suggested (SPARK_WORKER_CORES=1), but it is still
showing err
Hi,
I want to know how I can stop a running SparkContext in a proper way so that
next time when I start a new SparkContext, the web UI can be launched on the
same port 4040. Now, when I quit the job using Ctrl+Z, the new SparkContexts are
launched on new ports.
I have the same problem with the IPython notebook.
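A minimal sketch of shutting a context down cleanly so that port 4040 is freed for the next application (the job itself is a placeholder); calling sc.stop() rather than suspending the process with Ctrl+Z releases the executors and the web UI port:

from pyspark import SparkContext

sc = SparkContext(appName="clean-shutdown")
try:
    sc.parallelize(range(100)).count()   # placeholder job
finally:
    sc.stop()   # releases executors and frees the 4040 web UI port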
Hi ,
I am currently using Spark 0.9 configured with a Hadoop 1.2.1 cluster. What
should I do if I want to upgrade it to Spark 1.0.0? Do I need to download the
latest version, replace the existing Spark with the new one, and make the
configuration changes again from scratch, or is there any oth