I have a Dataset<String> ds whose rows are JSON strings.
*Sample JSON rows (this is just an example of the data in the dataset)*
[
  {"name": "foo", "address": {"state": "CA", "country": "USA"},
   "docs": [{"subject": "english", "year": 2016}]},
  {"name": "bar", "address": {"state": "OH", "country": "USA"},
   "docs": [{"subject": "math", "year": 2017}]}
]
ds.printSchema()
root
|-- value: string (nullable = true)
Now I want to convert it into the following dataset using Spark 2.2.0:
name  | address                           | docs
------|-----------------------------------|----------------------------------------
"foo" | {"state": "CA", "country": "USA"} | [{"subject": "english", "year": 2016}]
"bar" | {"state": "OH", "country": "USA"} | [{"subject": "math", "year": 2017}]
Preferably in Java, but Scala is also fine as long as the functions are
available in the Java API.
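For context, a minimal sketch of the direction I've been looking at, assuming DataFrameReader.json accepts a Dataset<String> in Spark 2.2 and infers the nested schema from the data itself (the class name and inline sample rows below are just for illustration):

```java
import java.util.Arrays;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class JsonParseSketch {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("json-parse-sketch")
                .master("local[*]")
                .getOrCreate();

        // Stand-in for the Dataset<String> described above,
        // one JSON object per row
        Dataset<String> ds = spark.createDataset(Arrays.asList(
                "{\"name\": \"foo\", \"address\": {\"state\": \"CA\", \"country\": \"USA\"}, "
                        + "\"docs\": [{\"subject\": \"english\", \"year\": 2016}]}",
                "{\"name\": \"bar\", \"address\": {\"state\": \"OH\", \"country\": \"USA\"}, "
                        + "\"docs\": [{\"subject\": \"math\", \"year\": 2017}]}"),
                Encoders.STRING());

        // Parse the JSON strings into a structured Dataset<Row>;
        // the nested address struct and docs array should be inferred
        Dataset<Row> parsed = spark.read().json(ds);
        parsed.printSchema();
        parsed.show(false);

        spark.stop();
    }
}
```

I'm unsure whether this is the idiomatic way to get the name / address / docs columns shown above, or whether an explicit schema with from_json would be preferable.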