How to retrieve the value from sql.row by column name

2015-02-16 Thread Eric Bell
Is it possible to reference a column from a SchemaRDD using the column's name instead of its number? For example, let's say I've created a SchemaRDD from an avro file: val sqlContext = new SQLContext(sc) import sqlContext._ val caper=sqlContext.avroFile("hdfs://localhost:9000/sma/raw_avro/caper

Re: How to retrieve the value from sql.row by column name

2015-02-16 Thread Eric Bell
ase Row(ranId: String) => } On Mon, Feb 16, 2015 at 8:54 AM, Eric Bell <e...@ericjbell.com> wrote: Is it possible to reference a column from a SchemaRDD using the column's name instead of its number? For example, let's say I've created a S

Spark newbie desires feedback on first program

2015-02-16 Thread Eric Bell
I'm a Spark newbie working on my first attempt to write an ETL program. I could use some feedback to make sure I'm on the right path. I've written a basic proof of concept that runs without errors and seems to work, although I might be missing some issues when this is actually run on more t

Re: Spark newbie desires feedback on first program

2015-02-16 Thread Eric Bell
Thanks Charles. I just realized a few minutes ago that I neglected to show the step where I generated the key on the person ID. Thanks for the pointer on the HDFS URL. Next step is to process data from multiple RDDs. My data originates from 7 tables in a MySQL database. I used Sqoop to create