Hi,
I'm also trying to use the insertInto method, but I end up getting an
"assertion error".
Is there any workaround for this?
rgds
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Trying-to-run-SparkSQL-over-Spark-Streaming-tp12530p21316.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
Thanks for the reply. Sorry I could not follow up on this earlier.
Trying to use a Parquet file is not working at all.
case class Rec(name: String, pv: Int)
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.createSchemaRDD
val d1 = sc.parallelize(Array(("a", 10), ("b", 3))).map(e => Rec(e._1, e._2))
I think ExistingRDD is not currently supported, but ParquetRelation is
supported, so you could probably try this as a workaround:
case logical.InsertIntoTable(table: ParquetRelation, partition, child, overwrite) =>
  InsertIntoParquetTable(table, planLater(child), overwrite) :: Nil
example:
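As a rough sketch of that workaround, assuming the Spark 1.x SQLContext API (the path and table name below are hypothetical, not from the thread):

```scala
// Sketch only: assumes an existing SparkContext `sc` and Spark 1.x APIs.
case class Rec(name: String, pv: Int)
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.createSchemaRDD

// Save a seed RDD as a Parquet file, then re-read it so the registered
// table is backed by ParquetRelation rather than ExistingRDD.
sc.parallelize(Seq(Rec("a", 10))).saveAsParquetFile("/tmp/recs.parquet")
sqlContext.parquetFile("/tmp/recs.parquet").registerAsTable("recs")

// insertInto should now be planned as InsertIntoParquetTable (above).
sc.parallelize(Seq(Rec("b", 3))).insertInto("recs")
```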
Thanks for the reply.
Yeah, it doesn't seem doable straight away. Someone suggested this:
"For each of your streams, first create an empty RDD that you register as a
table, obtaining an empty table. For your example, let's say you call it
"allTeenagers".
Then, for each of your queries, use SchemaRDD'
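That suggestion could be sketched roughly like this, assuming the Spark 1.x streaming and SQL APIs; the `people` stream and the case class are hypothetical, and note that insertInto on an RDD-backed table may still hit the assertion error mentioned at the top of the thread:

```scala
// Sketch only: assumes SparkContext `sc`, SQLContext `sqlContext`,
// and a DStream[Person] named `people` (all hypothetical).
case class Person(name: String, age: Int)
import sqlContext.createSchemaRDD

// Register an empty RDD once, so "allTeenagers" exists before any batch.
sc.parallelize(Seq.empty[Person]).registerAsTable("allTeenagers")

// Per batch, register the batch RDD and append the query result.
people.foreachRDD { rdd =>
  rdd.registerAsTable("people")
  sqlContext.sql("SELECT name, age FROM people WHERE age BETWEEN 13 AND 19")
    .insertInto("allTeenagers")
}
```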
Hi again,
On Tue, Aug 26, 2014 at 10:13 AM, Tobias Pfeiffer wrote:
>
> On Mon, Aug 25, 2014 at 7:11 PM, praveshjain1991 <
> praveshjain1...@gmail.com> wrote:
>>
>> "If you want to issue an SQL statement on streaming data, you must have
>> both
>> the registerAsTable() and the sql() call *within*
Hi,
On Mon, Aug 25, 2014 at 7:11 PM, praveshjain1991
wrote:
>
> "If you want to issue an SQL statement on streaming data, you must have
> both
> the registerAsTable() and the sql() call *within* the foreachRDD(...)
> block,
> or -- as you experienced -- the table name will be unknown"
>
> Since
Hi,
Thanks for your help the other day. I had one more question regarding the
same.
"If you want to issue an SQL statement on streaming data, you must have both
the registerAsTable() and the sql() call *within* the foreachRDD(...) block,
or -- as you experienced -- the table name will be unknown"
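As a sketch of the pattern being described, assuming the Spark 1.x streaming and SQL APIs (the `wordCounts` DStream and the case class are hypothetical):

```scala
// Sketch only: both registerAsTable() and sql() live inside foreachRDD,
// so the table name is defined each time the query runs on a batch.
case class Word(text: String, count: Int)
import sqlContext.createSchemaRDD

wordCounts.foreachRDD { rdd =>
  rdd.map { case (w, c) => Word(w, c) }.registerAsTable("words")
  val result = sqlContext.sql("SELECT text, count FROM words ORDER BY count DESC")
  result.collect().foreach(println)
}
```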
To: u...@spark.incubator.apache.org
Subject: Re: Trying to run SparkSQL over Spark Streaming
Oh right. Got it. Thanks
Also found this link on that discussion:
https://github.com/thunderain-project/StreamSQL
Does this provide more features than Spark?
Hi,
On Thu, Aug 21, 2014 at 3:11 PM, praveshjain1991
wrote:
>
> The part that you mentioned, "the variable `result` is of type
> DStream[Row]. That is, the meta-information from the SchemaRDD is lost and,
> from what I understand, there is then no way to learn about the column
> names of the
Hi
Thanks for the reply and the link. It's working now.
From the discussion on the link, I understand that there are some
shortcomings while using SQL over streaming.
The part that you mentioned, "the variable `result` is of type
DStream[Row]. That is, the meta-information from the SchemaRDD
Hi,
On Thu, Aug 21, 2014 at 2:19 PM, praveshjain1991
wrote:
>
> Using Spark SQL with batch data works fine so I'm thinking it has to do
> with
> how I'm calling StreamingContext.start(). Any ideas what is the issue? Here
> is the code:
>
Please have a look at
http://apache-spark-user-list.100