Re: Trying to run SparkSQL over Spark Streaming

2015-01-22 Thread nirandap
Hi, I'm also trying to use the insertInto method, but I end up getting the "assertion error". Is there any workaround for this? rgds

Re: Trying to run SparkSQL over Spark Streaming

2014-08-28 Thread praveshjain1991
Thanks for the reply. Sorry I could not ask more earlier. Trying to use a Parquet file is not working at all.
case class Rec(name: String, pv: Int)
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.createSchemaRDD
val d1 = sc.parallelize(Array(("a", 10), ("b", 3))).map(e => Rec(e._1

Re: Trying to run SparkSQL over Spark Streaming

2014-08-27 Thread Zhan Zhang
I think ExistingRDD is currently not supported. But ParquetRelation is supported, so you can probably try this as a workaround:
case logical.InsertIntoTable(table: ParquetRelation, partition, child, overwrite) => InsertIntoParquetTable(table, planLater(child), overwrite) :: Nil
example:

Re: Trying to run SparkSQL over Spark Streaming

2014-08-26 Thread praveshjain1991
Thanks for the reply. Yes, it doesn't seem doable straight away. Someone suggested this: For each of your streams, first create an empty RDD that you register as a table, obtaining an empty table. For your example, let's say you call it "allTeenagers". Then, for each of your queries, use SchemaRDD'

Re: Trying to run SparkSQL over Spark Streaming

2014-08-25 Thread Tobias Pfeiffer
Hi again,
On Tue, Aug 26, 2014 at 10:13 AM, Tobias Pfeiffer wrote:
> On Mon, Aug 25, 2014 at 7:11 PM, praveshjain1991 <praveshjain1...@gmail.com> wrote:
>> "If you want to issue an SQL statement on streaming data, you must have both
>> the registerAsTable() and the sql() call *within*

Re: Trying to run SparkSQL over Spark Streaming

2014-08-25 Thread Tobias Pfeiffer
Hi,
On Mon, Aug 25, 2014 at 7:11 PM, praveshjain1991 wrote:
> "If you want to issue an SQL statement on streaming data, you must have both
> the registerAsTable() and the sql() call *within* the foreachRDD(...) block,
> or -- as you experienced -- the table name will be unknown"
>
> Since

Re: Trying to run SparkSQL over Spark Streaming

2014-08-25 Thread praveshjain1991
Hi, Thanks for your help the other day. I had one more question regarding the same. "If you want to issue an SQL statement on streaming data, you must have both the registerAsTable() and the sql() call *within* the foreachRDD(...) block, or -- as you experienced -- the table name will be unknown"

RE: Trying to run SparkSQL over Spark Streaming

2014-08-20 Thread Shao, Saisai
To: u...@spark.incubator.apache.org
Subject: Re: Trying to run SparkSQL over Spark Streaming
Oh right. Got it. Thanks. I also found this link in that discussion: https://github.com/thunderain-project/StreamSQL Does this provide more features than Spark?

Re: Trying to run SparkSQL over Spark Streaming

2014-08-20 Thread praveshjain1991
Oh right. Got it. Thanks. I also found this link in that discussion: https://github.com/thunderain-project/StreamSQL Does this provide more features than Spark?

Re: Trying to run SparkSQL over Spark Streaming

2014-08-20 Thread Tobias Pfeiffer
Hi,
On Thu, Aug 21, 2014 at 3:11 PM, praveshjain1991 wrote:
> The part that you mentioned "the variable `result` is of type
> DStream[Row]. That is, the meta-information from the SchemaRDD is lost and,
> from what I understand, there is then no way to learn about the column names
> of the

Re: Trying to run SparkSQL over Spark Streaming

2014-08-20 Thread praveshjain1991
Hi, Thanks for the reply and the link. It's working now. From the discussion on the link, I understand that there are some shortcomings when using SQL over streaming. The part that you mentioned "the variable `result` is of type DStream[Row]. That is, the meta-information from the SchemaRDD

Re: Trying to run SparkSQL over Spark Streaming

2014-08-20 Thread Tobias Pfeiffer
Hi,
On Thu, Aug 21, 2014 at 2:19 PM, praveshjain1991 wrote:
> Using Spark SQL with batch data works fine so I'm thinking it has to do with
> how I'm calling streamingcontext.start(). Any ideas what is the issue? Here
> is the code:
Please have a look at http://apache-spark-user-list.100