Hi Marco,
I was not able to find out what was causing the problem, but a "git stash"
seems to have fixed it :/
Thanks for your help... :)
On Mon, Nov 28, 2016 at 10:50 PM, Marco Mistroni
wrote:
Hi Andrew,
sorry, but to me it seems S3 is the culprit.
I have downloaded your JSON file and stored it locally. Then I wrote this
simple app (a subset of what you have in your GitHub; sorry, I am a little
bit rusty on how to create a new column out of existing ones) which
basically reads the JSON file.
It's in Sc
I extracted out the boto bits and tested them in vanilla Python on the
nodes. I am pretty sure that the data from S3 is OK. I've applied a public
policy to the bucket s3://time-waits-for-no-man. There is a publicly available object
here: https://s3-eu-west-1.amazonaws.com/time-waits-for-no-man/1973-01-1
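Independent of Spark, a quick way to double-check that an object really is publicly readable is to fetch its public URL anonymously. A minimal sketch, assuming the path-style URL form used above; the region, bucket, and key below are placeholders, not the real object (the key in this thread looks truncated):

```python
def public_s3_url(region, bucket, key):
    """Build a path-style public S3 URL, as used in the thread."""
    return "https://s3-{}.amazonaws.com/{}/{}".format(region, bucket, key)

def fetch_public(url):
    """Fetch the object anonymously; an HTTP 403 here means it is not public."""
    try:
        from urllib.request import urlopen   # Python 3
    except ImportError:
        from urllib2 import urlopen          # Python 2.7, as in the thread
    return urlopen(url).read()

# Example (placeholders, not fetched here):
# body = fetch_public(public_s3_url("eu-west-1", "my-bucket", "my-key"))
```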
Hi
Pickle errors normally point to a serialisation issue. I am suspecting
something wrong with your S3 data, but it's just a wild guess...
Is your S3 object publicly available?
A few suggestions to nail down the problem:
1 - try to see if you can read your object from S3 using the boto3 library
'offline',
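Suggestion 1 could be sketched as below. The bucket name is taken from the thread, but the object key is a placeholder, and the newline-delimited-JSON check is an assumption about the file's layout:

```python
import json

def check_json_lines(text):
    """Parse each non-empty line as JSON; return (ok_count, bad_line_numbers).

    Assumes the file is newline-delimited JSON, which may not match the
    actual layout of the file discussed in the thread."""
    ok, bad = 0, []
    for lineno, line in enumerate(text.splitlines(), 1):
        if not line.strip():
            continue
        try:
            json.loads(line)
            ok += 1
        except ValueError:
            bad.append(lineno)
    return ok, bad

def read_offline(bucket, key):
    """Read the object with boto3 'offline', i.e. outside of Spark."""
    import boto3  # requires credentials, or a public object
    body = boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"].read()
    return check_json_lines(body.decode("utf-8"))

# e.g. read_offline("time-waits-for-no-man", "some-object-key")  # key is a placeholder
```

If every line parses cleanly here, the data itself is probably fine and the problem is on the Spark side.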
I get a slightly different error when not specifying a schema:
Traceback (most recent call last):
File "/home/centos/fun-functions/spark-parrallel-read-from-s3/tick.py",
line 61, in
df = sqlContext.createDataFrame(foo)
File
"/usr/hdp/2.5.0.0-1245/spark2/python/lib/pyspark.zip/pyspark/sql/co
Hi,
Can anyone tell me what is causing this error?
Spark 2.0.0
Python 2.7.5
df = sqlContext.createDataFrame(foo, schema)
https://gist.github.com/mooperd/368e3453c29694c8b2c038d6b7b4413a
Traceback (most recent call last):
File "/home/centos/fun-functions/spark-parrallel-read-from-s3/tick.py",
li
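The gist itself is not quoted in the thread, so the sketch below is only an assumed shape of a "parallel read from S3" script: the field names, bucket layout, object keys, and the to_row helper are all hypothetical. The point it illustrates is the likely failure mode: when createDataFrame gets no schema, Spark samples the rows to infer one, so inconsistently shaped rows break inference and pickling; emitting fixed-shape tuples and passing an explicit StructType avoids that.

```python
import json

def to_row(record):
    """Coerce one parsed JSON object to a fixed-width tuple of strings.

    Field names are hypothetical -- the real schema in the gist is unknown.
    A consistent tuple shape is what lets createDataFrame with an explicit
    schema avoid sampling-based inference errors."""
    ts, price = record.get("timestamp"), record.get("price")
    return (None if ts is None else str(ts),
            None if price is None else str(price))

def build_dataframe(spark, keys, bucket):
    """Assumed pattern behind the script name: fetch each S3 key on the
    executors with boto3, then build a DataFrame with an explicit schema."""
    from pyspark.sql.types import StructType, StructField, StringType

    def fetch(key):
        import boto3  # imported on the executor, not pickled from the driver
        body = boto3.client("s3").get_object(Bucket=bucket,
                                             Key=key)["Body"].read()
        return [to_row(json.loads(line))
                for line in body.decode("utf-8").splitlines() if line.strip()]

    schema = StructType([
        StructField("timestamp", StringType(), True),  # hypothetical fields
        StructField("price", StringType(), True),
    ])
    rows = spark.sparkContext.parallelize(keys).flatMap(fetch)
    return spark.createDataFrame(rows, schema)

# e.g. build_dataframe(spark, ["some-key"], "time-waits-for-no-man")
```

Importing boto3 inside the function that runs on the executors also sidesteps pickling the client itself, which is one common source of the pickle errors mentioned earlier in the thread.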