Hi,

I'm processing JSON stored in a text file using DataFrames. Right now I'm 
trying to figure out how to access a particular value within the rows of my 
DataFrame when I only know the field name, not the field's position in the 
schema.

I noticed that row.schema and row.dtypes give me information about the 
auto-generated schema, but I cannot see a straightforward path from there to 
the value. I'm trying to create a PairRDD out of this.

Is there an easy way to find a field's position from its field name (the 
key it had in the JSON)?

So this:

import org.apache.spark.sql.SQLContext

val sqlContext = new SQLContext(sc)
val rawIncRdd = sc.textFile("hdfs://1.2.3.4:8020/user/hadoop/incidents/unstructured/inc-0-500.txt")
val df = sqlContext.jsonRDD(rawIncRdd)
// Access the value by its position in the inferred schema
df.foreach(line => println(line.getString(0)))


would turn into something like this:

val sqlContext = new SQLContext(sc)
val rawIncRdd = sc.textFile("hdfs://1.2.3.4:8020/user/hadoop/incidents/unstructured/inc-0-500.txt")
val df = sqlContext.jsonRDD(rawIncRdd)
df.foreach(line => println(line.getString("field_name")))
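
The closest thing I can sketch from the schema is a lookup along these lines. 
It is only a rough sketch: it assumes df is the DataFrame built above, that 
df.schema.fieldNames lists the columns in the same order as the values in each 
row, and that the field holds a string. fieldIndexOf is a hypothetical helper 
name of my own, not part of Spark.

```scala
import org.apache.spark.sql.{DataFrame, Row}

// Hypothetical helper (not a Spark API): resolve a field's position
// from the inferred schema by its name.
def fieldIndexOf(df: DataFrame, name: String): Int = {
  val idx = df.schema.fieldNames.indexOf(name)
  require(idx >= 0, s"no field named '$name' in the schema")
  idx
}

// Resolve the index once on the driver, then reuse it inside the closures.
val idx = fieldIndexOf(df, "field_name")
df.foreach(row => println(row.getString(idx)))

// Keyed pairs: (value of "field_name", full row)
val pairs = df.rdd.map((row: Row) => (row.getString(idx), row))
```

Newer Spark releases also appear to offer row.getAs[String]("field_name") and 
row.fieldIndex("field_name"), which would make the helper unnecessary if they 
are available in the version in use.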

Thanks for any advice.
