BTW, is this documented? It seems like a potential issue.

Dr Mich Talebzadeh
LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com


On 25 April 2016 at 18:47, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> cheers
>
> Dr Mich Talebzadeh
>
> On 25 April 2016 at 18:35, Michael Armbrust <mich...@databricks.com> wrote:
>
>> When you define a class inside a method, it implicitly holds a pointer to
>> the enclosing scope of that method. Spark does not have access to that
>> scope, which makes it hard (impossible?) for us to construct new instances
>> of the class.
>>
>> So, define the classes you plan to use with Spark at the top level.
>>
>> On Mon, Apr 25, 2016 at 9:36 AM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I notice that, building with sbt, it works if I define my case class
>>> *outside of the main method*, like below:
>>>
>>> import org.apache.spark.{SparkConf, SparkContext}
>>> import org.apache.spark.sql.functions.col
>>> import org.apache.spark.sql.hive.HiveContext
>>>
>>> case class Accounts(TransactionDate: String, TransactionType: String,
>>>   Description: String, Value: Double, Balance: Double,
>>>   AccountName: String, AccountNumber: String)
>>>
>>> object Import_nw_10124772 {
>>>   def main(args: Array[String]) {
>>>     val conf = new SparkConf().
>>>       setAppName("Import_nw_10124772").
>>>       setMaster("local[12]").
>>>       set("spark.driver.allowMultipleContexts", "true").
>>>       set("spark.hadoop.validateOutputSpecs", "false")
>>>     val sc = new SparkContext(conf)
>>>     // Create sqlContext based on HiveContext
>>>     val sqlContext = new HiveContext(sc)
>>>     import sqlContext.implicits._
>>>     println("\nStarted at")
>>>     sqlContext.sql("SELECT FROM_unixtime(unix_timestamp(), 'dd/MM/yyyy HH:mm:ss.ss')").collect.foreach(println)
>>>     //
>>>     // Get a DF first based on the Databricks CSV library; ignore the
>>>     // column heading because of a column called "Type"
>>>     //
>>>     val df = sqlContext.read.format("com.databricks.spark.csv").
>>>       option("inferSchema", "true").
>>>       option("header", "true").
>>>       load("hdfs://rhes564:9000/data/stg/accounts/nw/10124772")
>>>     //df.printSchema
>>>     //
>>>     val a = df.filter(col("Date") > "").map(p =>
>>>       Accounts(p(0).toString, p(1).toString, p(2).toString,
>>>         p(3).toString.toDouble, p(4).toString.toDouble,
>>>         p(5).toString, p(6).toString))
>>>
>>> However, if I put that case class inside the main method, it throws a
>>> "No TypeTag available for Accounts" error.
>>>
>>> Apparently, when the case class is defined inside the method that uses
>>> it, it is not fully defined at that point.
>>>
>>> Is this a bug within Spark?
>>>
>>> Thanks
>>>
>>> Dr Mich Talebzadeh
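
To make the point above concrete, here is a minimal sketch, assuming Spark 1.x with spark-hive on the classpath; the names Account and TypeTagExample are illustrative and not from the thread. The top-level case class compiles and runs, while the commented-out variant defined inside main() reproduces the "No TypeTag available" error from the original post.

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

// Defined at the top level: the compiler can materialise a TypeTag, so
// Spark can derive the DataFrame schema and construct instances.
case class Account(name: String, balance: Double)

object TypeTagExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("TypeTagExample").setMaster("local[*]"))
    val sqlContext = new HiveContext(sc)
    import sqlContext.implicits._

    // Compiles and runs: Account is a top-level case class.
    sc.parallelize(Seq(Account("current", 100.0))).toDF().show()

    // Moving the case class inside main() breaks compilation:
    //   case class LocalAccount(name: String, balance: Double)
    //   sc.parallelize(Seq(LocalAccount("current", 100.0))).toDF()
    //   => error: No TypeTag available for LocalAccount
    sc.stop()
  }
}

The commented-out variant fails because the implicit TypeTag cannot be materialised for a class that is local to a method, which is the compile-time symptom of the scoping issue Michael describes above.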