Looks like the exception was caused by resolved.get(prefix ++ a) returning
None:

        a => StructField(a.head, resolved.get(prefix ++ a).get, nullable = true)

There are three occurrences of resolved.get() in createSchema() - a None
result should be handled better in each of those places.
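
For example (just a sketch, not a patch against the actual JsonRDD code - the
helper name and signature below are my own assumptions), each lookup could
report which key failed to resolve instead of throwing a bare
NoSuchElementException:

        import org.apache.spark.sql.types.{DataType, StructField}

        // Hypothetical helper: look up a field's type and fail with a message
        // naming the unresolved field, instead of calling Option.get.
        def resolvedType(resolved: Map[Seq[String], DataType],
                         prefix: Seq[String],
                         a: Seq[String]): DataType =
          resolved.getOrElse(prefix ++ a,
            sys.error(s"Cannot resolve type for field '${(prefix ++ a).mkString(".")}'"))

        // The closure above would then become:
        //   a => StructField(a.head, resolvedType(resolved, prefix, a), nullable = true)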

My two cents.

On Wed, May 27, 2015 at 1:46 PM, Michael Stone <mst...@mathom.us> wrote:

> On Wed, May 27, 2015 at 01:13:43PM -0700, Ted Yu wrote:
>
>> Can you tell us a bit more about the schema of your JSON?
>>
>
> It's fairly simple, consisting of 22 fields with values that are mostly
> strings or integers, except that some of the fields are objects with HTTP
> header/value pairs. I'd guess it's something in those latter fields that is
> causing the problems. The data is 800M rows that I didn't create in the
> first place, and I'm in the process of making a simpler test case. What I
> was mostly wondering is whether there's an obvious mechanism that I'm just
> missing to get jsonRDD to report more information about which specific rows
> it's having problems with.
>
>> You can find sample JSON in
>> sql/core/src/test/scala/org/apache/spark/sql/json/TestJsonData.scala
>>
>
> I know jsonRDD works in general; I've used it before without problems. It
> even works on subsets of this data.
>
> Mike Stone
>
