I have the perfect counterexample: some of our data scientists prototype in Python while the production code is written in Scala. But I get your point; as a matter of fact, I realised a little while after posting this that the toDF method takes parameters. However, toDF still requires you either to go from a List to an RDD, or to create a throwaway DataFrame and call toDF on it, which re-creates a complete data structure. I just feel that createDataFrame(_: Seq) is not really useful as it is, because I think there are practically no circumstances where you'd want to create a DataFrame without column names.
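To make the two workarounds concrete, here is a minimal sketch, assuming a Spark 1.3-era SQLContext running in local mode. The object name ToDFSketch and the application name are made up for illustration, and the data is the sample from my original message; this is only meant to show the shape of the calls, not a recommended pattern.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical standalone wrapper, only here so the snippet is self-contained.
object ToDFSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: local master, purely for illustration.
    val conf = new SparkConf().setAppName("toDF-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._ // brings toDF into scope for RDDs of tuples

    val data = List(("Alice", 1), ("Wonderland", 0))

    // Workaround 1: go from the List to an RDD first, then name the columns.
    val viaRdd = sc.parallelize(data).toDF("name", "score")

    // Workaround 2: build a DataFrame with default column names (_1, _2),
    // then call toDF on it to get a second DataFrame with the real names.
    val viaRename = sqlContext.createDataFrame(data).toDF("name", "score")

    viaRdd.printSchema()
    viaRename.printSchema()

    sc.stop()
  }
}
```

Both routes end up with the same column names, but neither is as direct as passing the names to createDataFrame, which is what the Python API allows.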
I'm not suggesting that an n-th overloaded method be created, but rather that the signature of the existing method be changed to accept an optional Seq of column names.

Regards,

Olivier.

On Sun, May 3, 2015 at 07:44, Reynold Xin <r...@databricks.com> wrote:

> Part of the reason is that it is really easy to just call toDF on Scala,
> and we already have a lot of createDataFrame functions.
>
> (You might find some of the cross-language differences confusing, but I'd
> argue most real users just stick to one language, and developers or
> trainers are the only ones that need to constantly switch between
> languages).
>
> On Sat, May 2, 2015 at 11:05 AM, Olivier Girardot <
> o.girar...@lateral-thoughts.com> wrote:
>
>> Hi everyone,
>> SQLContext.createDataFrame has different behaviour in Scala or Python :
>>
>> >>> l = [('Alice', 1)]
>> >>> sqlContext.createDataFrame(l).collect()
>> [Row(_1=u'Alice', _2=1)]
>> >>> sqlContext.createDataFrame(l, ['name', 'age']).collect()
>> [Row(name=u'Alice', age=1)]
>>
>> and in Scala :
>>
>> scala> val data = List(("Alice", 1), ("Wonderland", 0))
>> scala> sqlContext.createDataFrame(data, List("name", "score"))
>> <console>:28: error: overloaded method value createDataFrame with
>> alternatives: ... cannot be applied to ...
>>
>> What do you think about allowing in Scala too to have a Seq of column
>> names
>> for the sake of consistency ?
>>
>> Regards,
>>
>> Olivier.