Actually you can use collect_list("a") for this — note it is an aggregate function in org.apache.spark.sql.functions, not a method on DataFrame.
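For reference, a minimal sketch of how collect_list is typically applied (the DataFrame name `df`, column name `"a"`, and String element type are assumptions, not from the thread):

```scala
import org.apache.spark.sql.functions.collect_list

// Aggregate the entire column "a" into a single array-valued row,
// then pull that array back to the driver. Assumes a DataFrame `df`
// with a string column named "a".
val values: Seq[String] = df.agg(collect_list("a")).first().getSeq[String](0)
```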

2015-12-25 16:00 GMT+08:00 Jeff Zhang <zjf...@gmail.com>:

> You can use udf to convert one column for array type. Here's one sample
>
> import org.apache.spark.{SparkConf, SparkContext}
> import org.apache.spark.sql.SQLContext
> import org.apache.spark.sql.functions.expr
>
> val conf = new SparkConf().setMaster("local[4]").setAppName("test")
> val sc = new SparkContext(conf)
> val sqlContext = new SQLContext(sc)
> import sqlContext.implicits._
> sqlContext.udf.register("f", (a: String) => Array(a, a))
> val df1 = Seq(
>   (1, "jeff", 12),
>   (2, "andy", 34),
>   (3, "pony", 23),
>   (4, "jeff", 14)
> ).toDF("id", "name", "age")
>
> val df2 = df1.withColumn("name", expr("f(name)"))
> df2.printSchema()
> df2.show()
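If the goal is simply to pull one column's values into a local array, as the original question asks, a UDF is not strictly needed. A sketch, assuming the `df1` defined above with its string column `"name"`:

```scala
// Select the single column, convert to an RDD[Row], extract the
// string value from each Row, and collect the results to the driver.
val names: Array[String] = df1.select("name").rdd.map(_.getString(0)).collect()
```

Note that collect() brings all values to the driver, so this is only appropriate when the column fits in driver memory.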
>
>
> On Fri, Dec 25, 2015 at 3:44 PM, zml张明磊 <mingleizh...@ctrip.com> wrote:
>
>> Thanks, Jeff. It's not about choosing some columns of a Row. I want to
>> select all the data in one column and convert it to an Array. Do you see
>> what I mean?
>>
>>
>>
>> In other words: I want to select all the data in that column, based on
>> the column name, and then put it into an array.
>>
>>
>>
>>
>>
>> *From:* Jeff Zhang [mailto:zjf...@gmail.com]
>> *Sent:* December 25, 2015 15:39
>> *To:* zml张明磊
>> *Cc:* dev@spark.apache.org
>> *主题:* Re: How can I get the column data based on specific column name
>> and then stored these data in array or list ?
>>
>>
>>
>> Not sure what you mean. Do you want to choose some columns of a Row and
>> convert them to an Array ?
>>
>>
>>
>> On Fri, Dec 25, 2015 at 3:35 PM, zml张明磊 <mingleizh...@ctrip.com> wrote:
>>
>>
>>
>> Hi,
>>
>>
>>
>> I am new to Scala and Spark, and I am trying to find a DataFrame API that
>> solves the problem described in the subject line. So far I have only found
>> *DataFrame.col(colName: String): Column*, which returns a Column object,
>> not its contents. An API like *Column.toArray: Type* would be enough for
>> me, but DataFrame doesn't provide one. How can I achieve this ?
>>
>>
>>
>> Thanks,
>>
>> Minglei.
>>
>>
>>
>>
>>
>> --
>>
>> Best Regards
>>
>> Jeff Zhang
>>
>
>
>
> --
> Best Regards
>
> Jeff Zhang
>