Re: Error in show()

2018-09-06 Thread Apostolos N. Papadopoulos
Can you isolate the row that is causing the problem? I mean, start using show(31) up to show(60). Perhaps this will help you to understand the problem. Regards, Apostolos On 07/09/2018 01:11 AM, dimitris plakas wrote: Hello everyone, I am new in PySpark and I am facing an issue. Let me expl
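
A minimal sketch of that row-bisection idea (df stands in for the questioner's dataframe, which is an assumption here; the first n at which show(n) fails points at the offending row):

```
# Hypothetical sketch: increase the number of displayed rows until show()
# fails; show(n) forces evaluation of the first n rows, so the first
# failing n identifies the bad row.
for n in range(31, 61):
    try:
        df.show(n, True)
    except Exception as err:
        print("show({}) failed, so row {} is likely the culprit: {}".format(n, n, err))
        break
```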

Re: [External Sender] Re: How to make pyspark use custom python?

2018-09-06 Thread Femi Anthony
Are you sure that pyarrow is deployed on your slave hosts? If not, you will either have to get it installed, or ship it along when you call spark-submit by zipping it up and specifying the zip file to be shipped with the --py-files zipfile.zip option. A quick check would be to ssh to a slave host,
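
A sketch of that spark-submit invocation (deps.zip and my_app.py are hypothetical names; note that pyarrow ships native extensions, so installing it on each host via pip or conda tends to be more reliable than shipping a zip):

```
zip -r deps.zip pyarrow/
spark-submit --py-files deps.zip my_app.py
```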

Re: How to make pyspark use custom python?

2018-09-06 Thread mithril
The whole content in `spark-env.sh` is:

```
SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=10.104.85.78:2181,10.104.114.131:2181,10.135.2.132:2181 -Dspark.deploy.zookeeper.dir=/spark"
PYSPARK_PYTHON="/usr/local/miniconda3/bin/python"
```

I ran `/usr/l
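
If the spark-env.sh setting is not being picked up, the interpreter can also be pinned per job through configuration (spark.pyspark.python and spark.pyspark.driver.python exist since Spark 2.1; my_app.py is a hypothetical script):

```
spark-submit \
  --conf spark.pyspark.python=/usr/local/miniconda3/bin/python \
  --conf spark.pyspark.driver.python=/usr/local/miniconda3/bin/python \
  my_app.py
```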

Error in show()

2018-09-06 Thread dimitris plakas
Hello everyone, I am new in PySpark and I am facing an issue. Let me explain what exactly the problem is. I have a dataframe and I apply a map() function to it (dataframe2 = dataframe1.rdd.map(custom_function)), then dataframe = sqlContext.createDataFrame(dataframe2). When I have dataframe.show(30, True
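
A hedged, self-contained sketch of that pipeline (custom_function here is a stand-in for the real one, and SparkSession replaces the older sqlContext). Note that map() takes the function itself rather than the result of calling it, and that show() is what finally triggers the lazy map(), so that is where a bad row will surface:

```
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

def custom_function(row):
    # placeholder transformation; the original function is not shown
    return (row["id"], row["value"] * 2)

dataframe1 = spark.createDataFrame([(1, 1.0), (2, 2.0)], ["id", "value"])
dataframe2 = dataframe1.rdd.map(custom_function)  # pass the function, do not call it
dataframe = spark.createDataFrame(dataframe2, ["id", "value"])
dataframe.show(30, True)  # evaluation happens here, so errors surface here
```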

Re: getting error: value toDF is not a member of Seq[columns]

2018-09-06 Thread Mich Talebzadeh
Ok, somehow this worked! // Save prices to MongoDB collection val document = sparkContext.parallelize((1 to 1).map(i => Document.parse(s"{key:'$key',ticker:'$ticker',timeissued:'$timeissued',price:$price,CURRENCY:'$CURRENCY',op_type:$op_type,op

Re: How to make pyspark use custom python?

2018-09-06 Thread Patrick McCarthy
It looks like for whatever reason your cluster isn't using the python you distributed, or said distribution doesn't contain what you think. I've used the following with success to deploy a conda environment to my cluster at runtime: https://henning.kropponline.de/2016/09/24/running-pyspark-with-co
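
That approach generally boils down to something like the following on YARN (a sketch with hypothetical names): the zipped conda environment is shipped with --archives and the workers are pointed at the unpacked interpreter:

```
spark-submit \
  --master yarn \
  --archives my_conda_env.zip#ENV \
  --conf spark.pyspark.python=./ENV/bin/python \
  my_app.py
```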

Re: CBO not working for Parquet Files

2018-09-06 Thread emlyn
rajat mishra wrote: > When I try to compute the statistics for a query where the partition column > is in the where clause, the statistics returned contain only sizeInBytes > and not the row count. We are also having the same issue. We have our data in partitioned Parquet files and were hoping
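
For reference, a hedged sketch of computing fuller statistics from PySpark (the table and partition names are hypothetical; partition-level statistics require Spark 2.3+):

```
# Table-level stats give the CBO row counts, not just sizeInBytes.
spark.sql("ANALYZE TABLE my_db.my_table COMPUTE STATISTICS")
# Spark 2.3+ also supports partition-level statistics.
spark.sql("ANALYZE TABLE my_db.my_table PARTITION (dt='2018-09-06') COMPUTE STATISTICS")
```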

Re: getting error: value toDF is not a member of Seq[columns]

2018-09-06 Thread Mich Talebzadeh
Thanks. If you define the columns class as below:

scala> case class columns(KEY: String, TICKER: String, TIMEISSUED: String, PRICE: Double)
defined class columns

scala> var df = Seq(columns("key", "ticker", "timeissued", 1.23f)).toDF
df: org.apache.spark.sql.DataFrame = [KEY: string, TICKER: string .

Re: getting error: value toDF is not a member of Seq[columns]

2018-09-06 Thread Jungtaek Lim
This code works with Spark 2.3.0 via spark-shell.

scala> case class columns(KEY: String, TICKER: String, TIMEISSUED: String, PRICE: Float)
defined class columns

scala> import spark.implicits._
import spark.implicits._

scala> var df = Seq(columns("key", "ticker", "timeissued", 1.23f)).toDF
18/09/

Re: getting error: value toDF is not a member of Seq[columns]

2018-09-06 Thread Mich Talebzadeh
I am trying to understand why Spark cannot convert simple comma-separated columns to a DF. I did a test: I took one line of print output and stored it as a one-liner CSV file, as below.

var allInOne = key + "," + ticker + "," + timeissued + "," + price
println(allInOne)

cat crap.csv
6e84b11d-cb03-44c0-aab6-37e06e06c

Unsubscribe

2018-09-06 Thread Anu B Nair
Hi, I have tried every possible way to unsubscribe from this group. Can anyone help? -- Anu