createDataFrame question

2016-02-09 Thread jdkorigan
Hi, I would like to transform my RDD to a sql.dataframe.DataFrame. Is there a conversion that does the job, or what would be the easiest way to do it?

def ConvertVal(iter):
    # some code
    return sqlContext.createDataFrame(Row("val1", "val2", "val3", "val4"))

rdd = sc.textFile("").mapPartitions(ConvertVal)

Re: createDataFrame question

2016-02-09 Thread jdkorigan
When using this function:

rdd = sc.textFile("").mapPartitions(ConvertVal).toDF()

I get an exception whose last line is:

TypeError: 'JavaPackage' object is not callable

Since my function's return value is already a DataFrame, maybe there is a way to access this type from my rdd?

Re: createDataFrame question

2016-02-09 Thread jdkorigan
The correct way is just to remove "sqlContext.createDataFrame", and everything works correctly:

def ConvertVal(iter):
    # some code
    yield Row("val1", "val2", "val3", "val4")

rdd = sc.textFile("").mapPartitions(ConvertVal).toDF()
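Spark aside, the reason yield (or returning a list) matters is the mapPartitions contract: the function receives an iterator over one partition's records and must return an iterable of output records. A minimal pure-Python sketch of that contract, using namedtuple as a stand-in for pyspark.sql.Row and a hypothetical map_partitions helper (no SparkContext is assumed here):

```python
from collections import namedtuple
from itertools import chain

# Stand-in for pyspark.sql.Row (assumption: not the real Spark class).
Row = namedtuple("Row", ["val1", "val2", "val3", "val4"])

def convert_val(iterator):
    """Mimics ConvertVal: consume a partition's lines, emit Row records."""
    for line in iterator:
        parts = line.split(",")
        yield Row(*parts[:4])

def map_partitions(partitions, func):
    """Hypothetical local stand-in for RDD.mapPartitions: apply func to
    each partition's iterator and flatten the per-partition results."""
    return list(chain.from_iterable(func(iter(p)) for p in partitions))

partitions = [["a,b,c,d"], ["e,f,g,h"]]
rows = map_partitions(partitions, convert_val)
# Each element is one Row record, the shape toDF() expects in real Spark.
```

Had convert_val used return instead of yield, map_partitions would iterate over the Row tuple itself and flatten it into individual field strings, which is why toDF() then fails to infer a schema.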

DataFrame and char encoding

2016-02-22 Thread jdkorigan
Hi, I'm trying to find a solution to display strings with accents or special characters (Latin-1). Is there a way to create a DataFrame with a specific character encoding?

fields = [StructField("subject", StringType(), True)]
schema = StructType(fields)
DF = sqlContext.createDataFrame(RDD, schema)
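There is no encoding option on createDataFrame itself; DataFrame strings are unicode, so the usual fix is to decode the Latin-1 bytes when reading the source (e.g. read with sc.textFile(path, use_unicode=False) and apply an explicit .decode('latin-1') in a map step). A minimal pure-Python sketch of just the decoding step (the raw bytes here are an assumed sample):

```python
# Raw Latin-1 bytes as they might come off disk (assumed sample data).
raw = b"caf\xe9,r\xe9sum\xe9"

# Decoding with the wrong codec rejects the accented bytes ...
try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    print("utf-8 fails:", exc.reason)

# ... while an explicit latin-1 decode recovers them.
text = raw.decode("latin-1")
print(text)  # café,résumé
```

Once every field is a proper unicode string, the StructType schema above works unchanged; the display problem is an input-decoding issue, not a DataFrame one.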