use yarn :)
"spark-submit --master yarn"
On Sun, Jan 15, 2017 at 7:55 PM, Darren Govoni wrote:
> So what was the answer?
>
> Original message
> From: Andrew Holway
> Date: 1/1
Darn. I didn't respond to the list. Sorry.
On Sun, Jan 15, 2017 at 5:29 PM, Marco Mistroni wrote:
> Thanks Neil. I followed the original suggestion from Andrew and everything is
> working fine now
> kr
>
> On Sun, Jan 15, 2017 at 4:27 PM, Neil Jonkers wrote:
>
>> Hello,
>>
>> Can you drop the url:
Hey,
I am making some calls with Boto3 in my PySpark job, which works fine in
master=local mode, but when I switch to master=yarn I get
"NoCredentialsError: Unable to locate credentials", which is a bit annoying
as I cannot work out why!
I have been running this application fine on Mesos and
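A likely cause, sketched below: in master=local everything runs on the machine
that has ~/.aws/credentials, but YARN executors run on other nodes (often as
another user), so boto3's credential chain finds nothing there. A minimal
workaround is to hand the credentials to the client explicitly; the bucket and
key names here are illustrative, not from the original mail.

import boto3

def fetch_object(bucket, key, access_key, secret_key):
    # Build the client with explicit credentials instead of relying on the
    # credential chain of whichever node the task happens to land on.
    s3 = boto3.client("s3",
                      aws_access_key_id=access_key,
                      aws_secret_access_key=secret_key)
    return s3.get_object(Bucket=bucket, Key=key)["Body"].read()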
> .add("timezone", StringType).add("day", StringType)
> .add("minute", StringType)
>
> val jsonContentWithSchema = sqlContext.jsonRDD(jsonRdd, schema)
> println(s"- And the Json withSchema has
> ${jsonConten
n a spark code
>
> 2 - Try to replace your distributedJsonRead. Instead of reading from s3,
> generate a string out of a snippet of your JSON object.
>
> 3 - Spark can read data from s3 as well, just do a
> sc.textFile('s3://) ==> http://www.sparktutorials.net/r
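To make point 3 concrete, a minimal sketch (the bucket and path are
hypothetical, and s3a:// assumes the hadoop-aws jars and AWS credentials are
configured on the cluster):

rdd = sc.textFile("s3a://my-bucket/path/to/data.json")
print(rdd.take(5))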
py4j.protocol.Py4JError: An error occurred while calling
o33.__getnewargs__. Trace:
py4j.Py4JException: Method __getnewargs__([]) does not exist
at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:318)
at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:326)
at py4j.Ga
Hi,
Can anyone tell me what is causing this error?
Spark 2.0.0
Python 2.7.5
df = sqlContext.createDataFrame(foo, schema)
https://gist.github.com/mooperd/368e3453c29694c8b2c038d6b7b4413a
Traceback (most recent call last):
File "/home/centos/fun-functions/spark-parrallel-read-from-s3/tick.py",
li
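An aside, not confirmed from the gist: this particular Py4JException usually
appears when a SparkContext or sqlContext is captured in a closure that Spark
tries to pickle for the executors, roughly like this hypothetical reproduction:

rdd = sc.parallelize([1, 2, 3])
# The lambda drags sqlContext (a JVM-backed object) into a serialized task,
# and pickling it triggers the __getnewargs__ probe that Py4J rejects.
rdd.map(lambda x: sqlContext.createDataFrame([(x,)])).count()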
I'm getting this error trying to build Spark on CentOS 7. It is not googling
very well:
[error] (tags/compile:compileIncremental) java.io.IOException: Cannot run
program
"/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.111-1.b15.el7_2.x86_64/bin/javac"
(in directory "/home/spark/spark"): error=2, No such fil
That's around 750 MB/s, which seems quite respectable even in this day and age!
How many and what kind of disks do you have attached to your nodes? What
are you expecting?
On Tue, Nov 8, 2016 at 11:08 PM, Elf Of Lothlorein
wrote:
> Hi
> I am trying to save a RDD to disk and I am using the
> saveAs
something that could be accomplished
with Shiny Server, for instance?
Thanks,
Andrew Holway
I think running it on a Mesos cluster could give you better control over
this kind of thing.
On Fri, Nov 4, 2016 at 7:41 AM, blazespinnaker
wrote:
> Is there a good method / discussion / documentation on how to sandbox a
> spark
> executor? Assume the code is untrusted and you don't want it to
Sorry: Spark 2.0.0
On Tue, Nov 1, 2016 at 10:04 AM, Andrew Holway <
andrew.hol...@otternetworks.de> wrote:
> Hello,
>
> I've been getting pretty serious with DC/OS which I guess could be
> described as a somewhat polished distribution of Mesos. I'm not sure how
Hello,
I've been getting pretty serious with DC/OS which I guess could be
described as a somewhat polished distribution of Mesos. I'm not sure how
relevant DC/OS is to this problem.
I am using this pyspark program to test the Cassandra connection:
http://bit.ly/2eWAfxm (github)
I can see that the df
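For reference, a minimal sketch of that kind of connection test (the format
string is the spark-cassandra-connector one; the host, keyspace, and table
names are assumptions, not taken from the linked program):

from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("cassandra-test")
         .config("spark.cassandra.connection.host", "cassandra.example.com")
         .getOrCreate())

df = (spark.read
      .format("org.apache.spark.sql.cassandra")
      .options(keyspace="test_ks", table="test_table")
      .load())
df.show()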
Hi,
I am having a hard time getting to the bottom of this problem. I'm really
not sure where to start with it. Everything works fine in local mode.
Cheers,
Andrew
[testing@instance-16826 ~]$ /opt/mapr/spark/spark-1.5.2/bin/spark-submit
--num-executors 21 --executor-cores 5 --master yarn-client
Hello,
I'm not sure how appropriate job postings are to a user group.
We're getting deep into Spark and are looking for some talent in our Kochi
office.
http://bit.ly/Spark-Eng - Apache Spark Engineer / Architect - Kochi
http://bit.ly/Spark-Dev - Lead Apache Spark Developer - Kochi
Sorry for th
>
> df <- read.df(sqlContext, source="jdbc",
> url="jdbc:mysql://hostname:3306?user=user&password=pass",
> dbtable="database.table")
>
I got a bit further but am now getting the following error. This error is
being thrown without the database being touched. I tested this by making
the database unavailable.
I'm managing to read data via JDBC using the following, but I can't work out
how to write something back to the database.
df <- read.df(sqlContext, source="jdbc",
url="jdbc:mysql://hostname:3306?user=user&password=pass",
dbtable="database.table")
Does this functionality exist in 1.5.2?
Thanks,
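For comparison, the PySpark side of the write looks like the sketch below;
whether SparkR 1.5.2 exposes an equivalent is exactly the open question, so
treat this as the Python API only (url and table are the same placeholders as
above):

df.write.jdbc(url="jdbc:mysql://hostname:3306/database?user=user&password=pass",
              table="table", mode="append")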
Is there a data frame operation to do this?
+---+---+---+---+
| A | B | C | D |
+---+---+---+---+
| 1 | 2 | 3 | 4 |
| 5 | 6 | 7 | 8 |
+---+---+---+---+

+---+---+---+---+
| A | B | C | D |
+---+---+---+---+
| 3 | 5 | 6 | 8 |
| 0 | 0 | 0 | 0 |
+---+---+---+---+

+---+---+---+---+
| A | B | C | D |
+---+---+---+---+
| 8 | 8 | 8 | 8 |
| 1 | 1 | 1 | 1 |
+---+---+---+---+
Concatenated together to make this.
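If the goal is row-wise concatenation, unionAll is the usual data frame
operation; a sketch, with df1, df2, df3 standing for the three frames above:

combined = df1.unionAll(df2).unionAll(df3)
combined.show()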
Hello,
I would like to make a list of files (parquet or json) in a specific
HDFS directory with python so I can do some logic on which files to
load into a dataframe.
Any ideas?
Thanks,
Andrew
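One common approach, sketched under the assumption that a SparkContext (sc) is
already available: reach through py4j to the JVM's Hadoop FileSystem API. The
directory path is hypothetical.

jvm = sc._jvm
fs = jvm.org.apache.hadoop.fs.FileSystem.get(sc._jsc.hadoopConfiguration())
statuses = fs.listStatus(jvm.org.apache.hadoop.fs.Path("/user/andrew/data"))
paths = [s.getPath().toString() for s in statuses]
# Keep only the parquet files, then feed the survivors to the reader.
parquet_files = [p for p in paths if p.endswith(".parquet")]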
P.S. We are working with Python.
On Thu, Jan 21, 2016 at 8:24 PM, Andrew Holway
wrote:
> Hello,
>
> I am importing this data from HDFS into a data frame with
> sqlContext.read.json().
>
> {"a": 42, "a": 56, "Id": "621368e2f829f230", "smunkId
Hello,
I am importing this data from HDFS into a data frame with
sqlContext.read.json().
{"a": 42, "a": 56, "Id": "621368e2f829f230", "smunkId":
"CKm26sDMucoCFReRGwodbHAAgw", "popsicleRange": "17610", "time":
"2016-01-20T23:59:53+00:00"}
I want to do some date/time operations on this json data b
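A sketch of the date/time side, assuming the data has been read as described
(the cast relies on the "time" field being ISO-8601, as in the sample record;
the HDFS path is hypothetical):

from pyspark.sql.functions import col, hour

df = sqlContext.read.json("hdfs:///path/to/data")
# Cast the ISO-8601 string to a proper timestamp, then derive fields from it.
df = df.withColumn("ts", col("time").cast("timestamp"))
df.select(hour(col("ts")).alias("hour")).show()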