Ryan, you are right. That was issue. It works now. Thanks.
On Thu, Mar 23, 2017 at 8:26 PM, Ryan <[email protected]> wrote:
> you should import either spark.implicits or sqlContext.implicits, not
> both. Otherwise the compiler will be confused about two implicit
> transformations
>
> following code works for me, spark version 2.1.0
>
> object Test {
> def main(args: Array[String]) {
> val spark = SparkSession
> .builder
> .master("local")
> .appName(getClass.getSimpleName)
> .getOrCreate()
> import spark.implicits._
> val df = Seq(TeamUser("t1", "u1", "r1")).toDF()
> df.printSchema()
> spark.close()
> }
> }
>
> case class TeamUser(teamId: String, userId: String, role: String)
>
>
> On Fri, Mar 24, 2017 at 5:23 AM, shyla deshpande <[email protected]
> > wrote:
>
>> I made the code even more simpler still getting the error
>>
>> error: value toDF is not a member of Seq[com.whil.batch.Teamuser]
>> [ERROR] val df = Seq(Teamuser("t1","u1","r1")).toDF()
>>
>> object Test {
>> def main(args: Array[String]) {
>> val spark = SparkSession
>> .builder
>> .appName(getClass.getSimpleName)
>> .getOrCreate()
>> import spark.implicits._
>> val sqlContext = spark.sqlContext
>> import sqlContext.implicits._
>> val df = Seq(Teamuser("t1","u1","r1")).toDF()
>> df.printSchema()
>> }
>> }
>> case class Teamuser(teamid:String, userid:String, role:String)
>>
>>
>>
>>
>> On Thu, Mar 23, 2017 at 1:07 PM, Yong Zhang <[email protected]> wrote:
>>
>>> Not sure I understand this problem, why I cannot reproduce it?
>>>
>>>
>>> scala> spark.version
>>> res22: String = 2.1.0
>>>
>>> scala> case class Teamuser(teamid: String, userid: String, role: String)
>>> defined class Teamuser
>>>
>>> scala> val df = Seq(Teamuser("t1", "u1", "role1")).toDF
>>> df: org.apache.spark.sql.DataFrame = [teamid: string, userid: string ... 1
>>> more field]
>>>
>>> scala> df.show
>>> +------+------+-----+
>>> |teamid|userid| role|
>>> +------+------+-----+
>>> | t1| u1|role1|
>>> +------+------+-----+
>>>
>>> scala> df.createOrReplaceTempView("teamuser")
>>>
>>> scala> val newDF = spark.sql("select teamid, userid, role from teamuser")
>>> newDF: org.apache.spark.sql.DataFrame = [teamid: string, userid: string ...
>>> 1 more field]
>>>
>>> scala> val userDS: Dataset[Teamuser] = newDF.as[Teamuser]
>>> userDS: org.apache.spark.sql.Dataset[Teamuser] = [teamid: string, userid:
>>> string ... 1 more field]
>>>
>>> scala> userDS.show
>>> +------+------+-----+
>>> |teamid|userid| role|
>>> +------+------+-----+
>>> | t1| u1|role1|
>>> +------+------+-----+
>>>
>>>
>>> scala> userDS.printSchema
>>> root
>>> |-- teamid: string (nullable = true)
>>> |-- userid: string (nullable = true)
>>> |-- role: string (nullable = true)
>>>
>>>
>>> Am I missing anything?
>>>
>>>
>>> Yong
>>>
>>>
>>> ------------------------------
>>> *From:* shyla deshpande <[email protected]>
>>> *Sent:* Thursday, March 23, 2017 3:49 PM
>>> *To:* user
>>> *Subject:* Re: Converting dataframe to dataset question
>>>
>>> I realized, my case class was inside the object. It should be defined
>>> outside the scope of the object. Thanks
>>>
>>> On Wed, Mar 22, 2017 at 6:07 PM, shyla deshpande <
>>> [email protected]> wrote:
>>>
>>>> Why userDS is Dataset[Any], instead of Dataset[Teamuser]? Appreciate your
>>>> help. Thanks
>>>>
>>>> val spark = SparkSession
>>>> .builder
>>>> .config("spark.cassandra.connection.host", cassandrahost)
>>>> .appName(getClass.getSimpleName)
>>>> .getOrCreate()
>>>>
>>>> import spark.implicits._
>>>> val sqlContext = spark.sqlContext
>>>> import sqlContext.implicits._
>>>>
>>>> case class Teamuser(teamid:String, userid:String, role:String)
>>>> spark
>>>> .read
>>>> .format("org.apache.spark.sql.cassandra")
>>>> .options(Map("keyspace" -> "test", "table" -> "teamuser"))
>>>> .load
>>>> .createOrReplaceTempView("teamuser")
>>>>
>>>> val userDF = spark.sql("SELECT teamid, userid, role FROM teamuser")
>>>>
>>>> userDF.show()
>>>>
>>>> val userDS:Dataset[Teamuser] = userDF.as[Teamuser]
>>>>
>>>>
>>>
>>
>