I guess this is related to https://issues.apache.org/jira/browse/SPARK-11976
When calling createDataFrame on iris, the “.” character in column names is replaced with “_”. It seems that when you create a DataFrame from the CSV file, the “.” character in the column names is still there.

From: Devesh Raj Singh [mailto:raj.deves...@gmail.com]
Sent: Friday, February 5, 2016 2:44 PM
To: user@spark.apache.org
Cc: Sun, Rui
Subject: different behavior while using createDataFrame and read.df in SparkR

Hi,

I am using Spark 1.5.1.

When I do this:

    df <- createDataFrame(sqlContext, iris)

    # creating a new column for category "Setosa"
    df$Species1 <- ifelse((df)[[5]] == "setosa", 1, 0)
    head(df)

output: new column created

      Sepal.Length Sepal.Width Petal.Length Petal.Width Species
    1          5.1         3.5          1.4         0.2  setosa
    2          4.9         3.0          1.4         0.2  setosa
    3          4.7         3.2          1.3         0.2  setosa
    4          4.6         3.1          1.5         0.2  setosa
    5          5.0         3.6          1.4         0.2  setosa
    6          5.4         3.9          1.7         0.4  setosa

But when I save the iris dataset as a CSV file, then read it and convert it to a SparkR DataFrame:

    df <- read.df(sqlContext, "/Users/devesh/Github/deveshgit2/bdaml/data/iris/",
                  source = "com.databricks.spark.csv",
                  header = "true", inferSchema = "true")

and then try to create the new column:

    df$Species1 <- ifelse((df)[[5]] == "setosa", 1, 0)

I get the error below:

    16/02/05 12:11:01 ERROR RBackendHandler: col on 922 failed
    Error in select(x, x$"*", alias(col, colName)) :
      error in evaluating the argument 'col' in selecting a method for function 'select': Error in invokeJava(isStatic = FALSE, objId$id, methodName, ...) :
      org.apache.spark.sql.AnalysisException: Cannot resolve column name "Sepal.Length" among (Sepal.Length, Sepal.Width, Petal.Length, Petal.Width, Species);
            at org.apache.spark.s

--
Warm regards,
Devesh.
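
If the dots in the column names are indeed what breaks the CSV case, one possible workaround is to rename the columns so they no longer contain “.” before adding the new column. The sketch below is only an illustration: `sanitize_names` is a hypothetical helper (not part of SparkR), it is demonstrated on the local iris data.frame, and the commented-out SparkR lines at the end are an assumption that has not been verified against Spark 1.5.1.

```r
# Hypothetical helper (not part of SparkR): replace every "." in a vector
# of column names with "_", mirroring what createDataFrame appears to do.
sanitize_names <- function(col_names) {
  # fixed = TRUE treats "." as a literal dot rather than a regex wildcard
  gsub(".", "_", col_names, fixed = TRUE)
}

# Demonstrated on the local iris data.frame; no Spark is needed for this part.
local_iris <- iris
names(local_iris) <- sanitize_names(names(local_iris))
print(names(local_iris))

# Assumption, not verified against Spark 1.5.1: the same renaming might be
# applied to the DataFrame returned by read.df before adding the column:
#   names(df) <- sanitize_names(names(df))
#   df$Species1 <- ifelse(df$Species == "setosa", 1, 0)
```

Referring to the column by its name (Species) rather than by position is a design choice here; Species contains no dot, so it is unaffected by the renaming.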