BTW, is this documented? It seems like a potential issue.

Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com



On 25 April 2016 at 18:47, Mich Talebzadeh <mich.talebza...@gmail.com>
wrote:

> cheers
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
>
>
> http://talebzadehmich.wordpress.com
>
>
>
> On 25 April 2016 at 18:35, Michael Armbrust <mich...@databricks.com>
> wrote:
>
>> When you define a class inside of a method, it implicitly has a pointer
>> to the outer scope of the method.  Spark doesn't have access to this scope,
>> so this makes it hard (impossible?) for us to construct new instances of
>> that class.
>>
>> So, define your classes that you plan to use with Spark at the top level.
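>>
>> As a minimal sketch of that layout (illustrative Record class and Spark
>> 1.x-era API; these names are not from the original code):
>>
>> import org.apache.spark.{SparkConf, SparkContext}
>> import org.apache.spark.sql.SQLContext
>>
>> // Top-level case class: Spark can obtain a TypeTag for it
>> case class Record(id: Int, name: String)
>>
>> object TopLevelExample {
>>   def main(args: Array[String]) {
>>     val sc = new SparkContext(new SparkConf().setAppName("TopLevelExample").setMaster("local[2]"))
>>     val sqlContext = new SQLContext(sc)
>>     import sqlContext.implicits._
>>     // toDF() needs a TypeTag for Record; this compiles because Record is top level
>>     val df = sc.parallelize(Seq(Record(1, "a"), Record(2, "b"))).toDF()
>>     df.show()
>>   }
>> }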
>>
>> On Mon, Apr 25, 2016 at 9:36 AM, Mich Talebzadeh <
>> mich.talebza...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I notice that, when building with sbt, if I define my case class *outside of
>>> the main method* as below, it works:
>>>
>>>
>>> case class Accounts(TransactionDate: String, TransactionType: String,
>>>                     Description: String, Value: Double, Balance: Double,
>>>                     AccountName: String, AccountNumber: String)
>>>
>>> object Import_nw_10124772 {
>>>   def main(args: Array[String]) {
>>>     val conf = new SparkConf().
>>>                  setAppName("Import_nw_10124772").
>>>                  setMaster("local[12]").
>>>                  set("spark.driver.allowMultipleContexts", "true").
>>>                  set("spark.hadoop.validateOutputSpecs", "false")
>>>     val sc = new SparkContext(conf)
>>>     // Create sqlContext based on HiveContext
>>>     val sqlContext = new HiveContext(sc)
>>>     import sqlContext.implicits._
>>>     val HiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
>>>     println("\nStarted at")
>>>     sqlContext.sql("SELECT FROM_unixtime(unix_timestamp(), 'dd/MM/yyyy HH:mm:ss.ss')").collect.foreach(println)
>>>     //
>>>     // Get a DF first based on the Databricks CSV library; ignore the column
>>>     // heading because of a column called "Type"
>>>     //
>>>     val df = sqlContext.read.format("com.databricks.spark.csv").
>>>                option("inferSchema", "true").
>>>                option("header", "true").
>>>                load("hdfs://rhes564:9000/data/stg/accounts/nw/10124772")
>>>     //df.printSchema
>>>     //
>>>     val a = df.filter(col("Date") > "").map(p =>
>>>       Accounts(p(0).toString, p(1).toString, p(2).toString, p(3).toString.toDouble,
>>>                p(4).toString.toDouble, p(5).toString, p(6).toString))
>>>
>>>
>>> However, if I put that case class inside the main method, it throws a
>>> "No TypeTag available for Accounts" error.
>>>
>>> Apparently, when a case class is defined inside the method in which it is
>>> used, it is not fully defined at that point.
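>>>
>>> A stripped-down sketch of the failing layout (assuming the resulting RDD is
>>> later converted with toDF, which is where a TypeTag for Accounts is needed):
>>>
>>> object Import_nw_10124772 {
>>>   def main(args: Array[String]) {
>>>     // Declared inside main, the case class is tied to the method's scope,
>>>     // so the compiler reports "No TypeTag available for Accounts" wherever
>>>     // Spark needs a TypeTag for it (e.g. at toDF/createDataFrame)
>>>     case class Accounts(TransactionDate: String, TransactionType: String,
>>>                         Description: String, Value: Double, Balance: Double,
>>>                         AccountName: String, AccountNumber: String)
>>>     // ... build the RDD "a" as above, then for example:
>>>     // val accountsDF = a.toDF()   // <- fails to compile here
>>>   }
>>> }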
>>>
>>> Is this a bug within Spark?
>>>
>>> Thanks
>>>
>>>
>>>
>>>
>>> Dr Mich Talebzadeh
>>>
>>>
>>>
>>> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>>
>>>
>>>
>>> http://talebzadehmich.wordpress.com
>>>
>>>
>>>
>>
>>
>
