Hi, I'm processing JSON that I have in a text file using DataFrames. Right now I'm trying to figure out a way to access a certain value within the rows of my DataFrame when I only know the field name and not the field's position in the schema.
I noticed that row.schema and row.dtypes give me information about the auto-generated schema, but I cannot see a straightforward path from there. I'm trying to create a PairRDD out of this. Is there an easy way to figure out a field's position from its field name (the key it had in the JSON)?

So this:

    import org.apache.spark.sql.SQLContext

    val sqlContext = new SQLContext(sc)
    val rawIncRdd = sc.textFile("hdfs://1.2.3.4:8020/user/hadoop/incidents/unstructured/inc-0-500.txt")
    val df = sqlContext.jsonRDD(rawIncRdd)
    // current approach: read the value by its position in the schema
    df.foreach(line => println(line.getString(0)))

would turn into something like this:

    val sqlContext = new SQLContext(sc)
    val rawIncRdd = sc.textFile("hdfs://1.2.3.4:8020/user/hadoop/incidents/unstructured/inc-0-500.txt")
    val df = sqlContext.jsonRDD(rawIncRdd)
    // what I'd like to write: read the value by its field name instead
    df.foreach(line => println(line.getString("field_name")))

Thanks for the advice.