Use df.selectExpr to evaluate complex expressions (instead of just column
names).
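
For example, a quick sketch against the df1 defined in your message below
(untested, but this is the idea):

df1.selectExpr("f1a[0]").show()      // first struct in the array
df1.selectExpr("f1a[0].f2").show()   // a field inside that struct

// The Column API should work as well:
df1.select(df1("f1a").getItem(0)).show()

Since selectExpr runs its arguments through the SQL expression parser, the
same indexed/dotted syntax should extend to deeper nesting, e.g. an
expression like "a[0].b[1].c" on a suitably nested schema (field names
hypothetical).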

On Thu, May 5, 2016 at 11:53 AM, Xinh Huynh <xinh.hu...@gmail.com> wrote:

> Hi,
>
> I am having trouble accessing an array element in JSON data with a
> dataframe. Here is the schema:
>
> val json1 = """{"f1":"1", "f1a":[{"f2":"2"}]}"""
> val rdd1 = sc.parallelize(List(json1))
> val df1 = sqlContext.read.json(rdd1)
> df1.printSchema()
>
> root
>  |-- f1: string (nullable = true)
>  |-- f1a: array (nullable = true)
>  |    |-- element: struct (containsNull = true)
>  |    |    |-- f2: string (nullable = true)
>
> I would expect to be able to select the first element of "f1a" this way:
> df1.select("f1a[0]").show()
>
> org.apache.spark.sql.AnalysisException: cannot resolve 'f1a[0]' given
> input columns f1, f1a;
>
> This is with Spark 1.6.0.
>
> Please help. A follow-up question is: can I access arbitrary levels of
> nesting (e.g., an array of struct of array of struct)?
>
> Thanks,
> Xinh
>