I am trying to overwrite a Spark dataframe using the following option, but I
am not successful:
spark_df.write.format('com.databricks.spark.csv').option("header",
"true",mode='overwrite').save(self.output_file_path)
the mode='overwrite' argument does not take effect
--
Warm regards,
Devesh.
Hi,
Can we create dummy variables for categorical variables in SparkR, like we
do using the "dummies" package in R?
--
Warm regards,
Devesh.
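SparkR 1.x has no direct equivalent of the "dummies" package; the usual workaround (discussed later in this thread) is to build the indicator columns yourself with withColumn/ifelse. For comparison, on the Python side the same one-hot encoding is a one-liner with pandas once the data is on the driver (a sketch with made-up data):

```python
import pandas as pd

df = pd.DataFrame({"Species": ["setosa", "versicolor", "setosa"]})

# One 0/1 indicator column per category level
dummies = pd.get_dummies(df["Species"], prefix="Species")
print(dummies.columns.tolist())  # ['Species_setosa', 'Species_versicolor']
```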
Hi,
I want to compute the average of the numerical columns in the iris dataset
using SparkR:
Sys.setenv('SPARKR_SUBMIT_ARGS'='"--packages"
"com.databricks:spark-csv_2.10:1.3.0" "sparkr-shell"')
library(SparkR)
sc <- sparkR.init(master="local", sparkHome =
"/Users/devesh/Downloads/spark-1.4.1-bin-hadoop2.6",
sparkPackages="com.databricks:spark-csv_2.10:1.3.0")
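In SparkR the per-column average would be something like collect(agg(df, avg(df$Sepal_Length))) (hedged, I am not running SparkR here). The same computation in plain pandas, with a tiny iris-like frame of made-up values:

```python
import pandas as pd

# A tiny iris-like frame (illustrative values only, not the real dataset)
iris = pd.DataFrame({
    "Sepal_Length": [5.1, 4.9, 6.3],
    "Sepal_Width":  [3.5, 3.0, 3.3],
    "Species":      ["setosa", "setosa", "virginica"],
})

# Average of every numeric column; the string column is skipped
means = iris.mean(numeric_only=True)
print(means["Sepal_Length"])
```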
Hi,
I have applied the following code to the airquality dataset available in R,
which has some missing values. I want to omit the rows which have NAs:
library(SparkR)
Sys.setenv('SPARKR_SUBMIT_ARGS'='"--packages"
"com.databricks:spark-csv_2.10:1.2.0" "sparkr-shell"')
sc <- sparkR.init("local",sparkHo
> It is possible that createDataFrame converts R's NA values to null, so dropna()
> works with that. But perhaps read.df() does not convert R NAs to null, as
> those are most likely interpreted as strings when they come in from the
> csv. Just a guess, can anyone confirm?
>
> Deb
>
>> filtered_aq <- filter(aq, aq$Ozone != "NA" & aq$Solar_R != "NA")
>> head(filtered_aq)
>>
>> Perhaps it would be better to have an option for read.df to convert any
>> "NA" it encounters into null types, like createDataFrame does for NAs, and
> then one would be able to use dropna() etc.
>
>
>
> On Mon, Jan 25, 2016 at 3:24 AM, Devesh Raj Singh
> wrote:
>
>> Hi,
>>
>> Yes you are right.
>>
>> I think the problem is with the reading of the csv files. read.df is not
>> c
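The guess in this thread matches the usual behaviour: createDataFrame carries R's NA through as null, while a raw CSV read leaves you with the literal string "NA". pandas illustrates exactly the distinction being asked for; its read_csv maps "NA" to a real missing value by default, after which dropna() works (a self-contained sketch with made-up airquality-like data):

```python
import io
import pandas as pd

csv_text = "Ozone,Solar_R\n41,190\nNA,118\n12,NA\n"

# read_csv converts the literal string "NA" into a real missing value (NaN)
aq = pd.read_csv(io.StringIO(csv_text))

# dropna() removes every row containing at least one missing value
clean = aq.dropna()
print(len(clean))  # only the first row survives
```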
Hi,
I want to merge 2 dataframes in SparkR column-wise, similar to cbind in R. We
have "unionAll" for rbind but could not find anything for cbind in SparkR
--
Warm regards,
Devesh.
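Spark has no direct cbind because rows in a distributed dataframe have no stable order; the usual workaround is to give both frames an explicit row index and join on it (in Spark you would generate the index with monotonically_increasing_id() in recent versions, or zipWithIndex on the RDD). A pandas sketch of the index-and-join idea, assuming both frames have the same number of rows:

```python
import pandas as pd

a = pd.DataFrame({"x": [1, 2, 3]})
b = pd.DataFrame({"y": ["a", "b", "c"]})

# Give both frames an explicit row id, then join on it; the same trick
# works in Spark with a generated id column plus a join
a["row_id"] = range(len(a))
b["row_id"] = range(len(b))

combined = a.merge(b, on="row_id").drop(columns="row_id")
print(combined.columns.tolist())  # ['x', 'y']
```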
Hi,
I am trying to create dummy variables in SparkR by creating new columns for
categorical variables, but it is not appending the columns:
df <- createDataFrame(sqlContext, iris)
class(dtypes(df))
cat.column<-vector(mode="character",length=nrow(df))
cat.column<-collect(select(df,df$Species))
lev<-length(levels(as.factor(unlist(cat.column))))
Franc Carter wrote:
>
> I had problems doing this as well - I ended up using 'withColumn', it's
> not particularly graceful but it worked (1.5.2 on AWS EMR)
>
> cheers
>
> On 3 February 2016 at 22:06, Devesh Raj Singh
> wrote:
>
>> Hi,
>>
>> i am try
(SPARK-12225) which is still under
> discussion. If you desire this feature, you could comment on it.
>
>
>
> *From:* Franc Carter [mailto:franc.car...@gmail.com]
> *Sent:* Wednesday, February 3, 2016 7:40 PM
> *To:* Devesh Raj Singh
> *Cc:* user@spark.apache.org
> *Subject:* Re: sparkR not abl
Hi,
I have written a code to create dummy variables in sparkR
df <- createDataFrame(sqlContext, iris)
class(dtypes(df))
cat.column<-vector(mode="character",length=nrow(df))
cat.column<-collect(select(df,df$Species))
lev<-length(levels(as.factor(unlist(cat.column))))
for (j in 1:lev){
dummy.df
Hi,
I am using Spark 1.5.1
When I do this
df <- createDataFrame(sqlContext, iris)
#creating a new column for category "Setosa"
df$Species1<-ifelse((df)[[5]]=="setosa",1,0)
head(df)
output: new column created
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 5.1
>
> When calling createDataFrame on iris, the “.” character in column names
> will be replaced with “_”.
>
> It seems that when you create a DataFrame from the CSV file, the “.”
> character in column names is still there.
>
>
>
> *From:* Devesh Raj Singh [mailto:raj.dev
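The renaming behaviour described in this thread is easy to normalise yourself, so that both code paths see the same column names: replace "." with "_" right after reading. A pandas sketch of the idea (in Spark you would do the same with withColumnRenamed or toDF):

```python
import pandas as pd

df = pd.DataFrame({"Sepal.Length": [5.1], "Sepal.Width": [3.5]})

# Normalise column names the same way createDataFrame does: "." -> "_"
df = df.rename(columns=lambda c: c.replace(".", "_"))
print(df.columns.tolist())  # ['Sepal_Length', 'Sepal_Width']
```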
Hi,
I want to read a spark dataframe using python, convert it to a pandas
dataframe, and then (after doing some data analysis) convert the pandas
dataframe back to a spark dataframe. Please suggest.
--
Warm regards,
Devesh.
Hi,
I want to read CSV file in pyspark
I am running pyspark on pycharm
I am trying to load a csv using pyspark
import os
import sys
os.environ['SPARK_HOME']="/Users/devesh/Downloads/spark-1.5.1-bin-hadoop2.6"
sys.path.append("/Users/devesh/Downloads/spark-1.5.1-bin-hadoop2.6/python/")
# Now we
Hi,
I have imported a spark csv dataframe in python, read the spark data, then
converted the dataframe to a pandas dataframe using toPandas().
I want to convert the pandas dataframe back to spark csv and write the csv
to a location.
Please suggest
--
Warm regards,
Devesh.
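For the writing half, pandas can write the CSV directly from the driver with to_csv (a sketch; the file path is illustrative). If the data is large, you would instead convert back to a Spark dataframe with createDataFrame and use its distributed writer as earlier in this thread:

```python
import os
import tempfile
import pandas as pd

pdf = pd.DataFrame({"id": [1, 2], "val": ["a", "b"]})

out_path = os.path.join(tempfile.gettempdir(), "pandas_out.csv")
pdf.to_csv(out_path, index=False)  # index=False keeps the row index out of the file

with open(out_path) as f:
    print(f.readline().strip())  # id,val
```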
above solution you can read CSV directly into a dataframe as
> well.
>
> Regards,
> Gourav
>
> On Tue, Feb 23, 2016 at 12:03 PM, Devesh Raj Singh wrote:
>
>> Hi,
>>
>> I have imported spark csv dataframe in python and read the spark data the
>> conve