Re: SparkR DataFrame Column Casts esp. from CSV Files

2015-06-03 Thread Shivaram Venkataraman
at 12:55 PM >> To: "shiva...@eecs.berkeley.edu" >> Cc: Aleksander Eskilson, "dev@spark.apache.org" >> >> Subject: Re: SparkR DataFrame Column Casts esp. from CSV Files >> >> Yes, spark-csv does not infer types yet, but it is planned to be

Re: SparkR DataFrame Column Casts esp. from CSV Files

2015-06-03 Thread Shivaram Venkataraman
om: Hossein Falaki > Date: Wednesday, June 3, 2015 at 12:55 PM > To: "shiva...@eecs.berkeley.edu" > Cc: Aleksander Eskilson, "dev@spark.apache.org" > > Subject: Re: SparkR DataFrame Column Casts esp. from CSV Files > > Yes, spark-csv does not infer ty

Re: SparkR DataFrame Column Casts esp. from CSV Files

2015-06-03 Thread Eskilson,Aleksander
eskil...@cerner.com>, "dev@spark.apache.org" <dev@spark.apache.org> Subject: Re: SparkR DataFrame Column Casts esp. from CSV Files Yes, spark-csv does not infer types yet, but it is planned to be implemented soon. To work around th

Re: SparkR DataFrame Column Casts esp. from CSV Files

2015-06-03 Thread Hossein Falaki
Yes, spark-csv does not infer types yet, but it is planned to be implemented soon. To work around the current limitations (of spark-csv and SparkR), you can specify the schema in read.df() to get your desired types from spark-csv. For example: myschema <- structType(structField("id", "integer"

Re: SparkR DataFrame Column Casts esp. from CSV Files

2015-06-03 Thread Reynold Xin
schema after loading a DF. > > Thanks, > Alek > > > From: Shivaram Venkataraman > Reply-To: "shiva...@eecs.berkeley.edu" > Date: Wednesday, June 3, 2015 at 12:29 PM > To: Aleksander Eskilson > Cc: "dev@spark.apache.org", "hoss...@databricks.com"

Re: SparkR DataFrame Column Casts esp. from CSV Files

2015-06-03 Thread Eskilson,Aleksander
g>" <dev@spark.apache.org>, "hoss...@databricks.com" <hoss...@databricks.com> Subject: Re: SparkR DataFrame Column Casts esp. from CSV Files cc Hossein, who knows more about the spark-csv options. You are right that the defa

Re: SparkR DataFrame Column Casts esp. from CSV Files

2015-06-03 Thread Shivaram Venkataraman
cc Hossein, who knows more about the spark-csv options. You are right that the default CSV reader options end up creating all columns as string. I know that the JSON reader infers the schema [1], but I don't know if the CSV reader has any options to do that. Regarding the SparkR syntax to cast colum

SparkR DataFrame Column Casts esp. from CSV Files

2015-06-03 Thread Eskilson,Aleksander
It appears that casting columns remains a bit of a trick in Spark’s DataFrames. This is an issue because tools like spark-csv will set column types to String by default and will not attempt to infer types. Although spark-csv supports specifying types for columns in its options, it’s not clear h