Is it possible to reference a column from a SchemaRDD using the column's
name instead of its number?
For example, let's say I've created a SchemaRDD from an Avro file:
val sqlContext = new SQLContext(sc)
import sqlContext._
val caper=sqlContext.avroFile("hdfs://localhost:9000/sma/raw_avro/caper
ase Row(ranId: String) => }
On Mon, Feb 16, 2015 at 8:54 AM, Eric Bell <e...@ericjbell.com> wrote:
Is it possible to reference a column from a SchemaRDD using the
column's name instead of its number?
For example, let's say I've created a S
I'm a Spark newbie working on his first attempt to write an ETL
program. I could use some feedback to make sure I'm on the right path.
I've written a basic proof of concept that runs without errors and seems
to work, although I might be missing some issues when this is actually
run on more t
Thanks Charles. I just realized a few minutes ago that I neglected to
show the step where I generated the key on the person ID. Thanks for the
pointer on the HDFS URL.
Next step is to process data from multiple RDDs. My data originates from
7 tables in a MySQL database. I used sqoop to create