Re: Spark Newbie question

2019-07-11 Thread infa elance
Thanks, Jerry, for the clarification.
Ajay
On Thu, Jul 11, 2019 at 12:48 PM Jerry Vinokurov wrote:
> Hi Ajay,
>
> When a Spark SQL statement references a table, that table has to be
> "registered" first. Usually the way this is done is by reading in a
> DataFrame, then calling the createOrRepla

Re: Spark Newbie question

2019-07-11 Thread Jerry Vinokurov
Hi Ajay, When a Spark SQL statement references a table, that table has to be "registered" first. Usually this is done by reading in a DataFrame, then calling createOrReplaceTempView (or one of a few other functions) on that DataFrame, with the argument being the name under which yo

Re: Spark Newbie question

2019-07-11 Thread infa elance
Sorry, I guess I hit the send button too soon. This question is regarding a Spark standalone cluster. My understanding is that Spark is an execution engine and not a storage layer. Spark processes data in memory, but when someone refers to a Spark table created through Spark SQL (DataFrame/RDD), what exactly

Spark Newbie question

2019-07-11 Thread infa elance
This is a standalone Spark cluster. My understanding is that Spark is an execution engine and not a storage layer. Spark processes data in memory, but when someone refers to a Spark table created through Spark SQL (DataFrame/RDD), what exactly are they referring to? Could it be a Hive table? If yes, is it the same

Re: A spark newbie question

2015-01-04 Thread Sanjay Subramanian
rning process :-) Plus, IMHO, if you are planning on learning Spark, I would say YES to Scala and NO to Java. Yes, it's a different paradigm, but having been a Java and Hadoop programmer for many years, I am excited to learn Scala as the language and use Spark. It's exciting. Regards, Sanjay From: Aniket Bh

Re: A spark newbie question

2015-01-04 Thread Aniket Bhatnagar
Go through the Spark API documentation. Basically, you have to group by (date, message_type) and then do a count.
On Sun, Jan 4, 2015, 9:58 PM Dinesh Vallabhdas wrote:
> A spark cassandra newbie question. Thanks in advance for the help.
> I have a cassandra table with 2 columns message_timestamp(t
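The "group by (date, message_type), then count" step suggested above can be sketched without Spark at all. The following is a minimal plain-Python illustration of the aggregation logic only, not Spark or Cassandra code; the function name `count_by_date_and_type` is hypothetical, and the four rows are the complete sample rows quoted in the original question. In Spark the same shape would be expressed as a `groupBy` on the date and message type followed by `count`.

```python
from collections import Counter
from datetime import datetime

# Sample rows taken from the original question: (message_timestamp, message_type).
rows = [
    ("2014-06-25 12:01:39", "START"),
    ("2014-06-25 12:02:39", "START"),
    ("2014-06-25 12:02:39", "PAUSE"),
    ("2014-06-25 14:02:39", "STOP"),
]


def count_by_date_and_type(rows):
    """Count rows per (date, message_type), discarding the time of day."""
    counts = Counter()
    for timestamp, message_type in rows:
        # Truncate the timestamp to its calendar date to form the grouping key.
        day = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S").date()
        counts[(day.isoformat(), message_type)] += 1
    return counts


if __name__ == "__main__":
    # Print one line per group: date, message type, count.
    for (day, message_type), n in sorted(count_by_date_and_type(rows).items()):
        print(day, message_type, n)
```

The key design point is that the grouping key is the date derived from the timestamp, not the raw timestamp, so two "START" rows one minute apart fall into the same group.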

A spark newbie question on summary statistics

2015-01-04 Thread anondin
OP 1
2014-06-25 PAUSE 1
2014-06-27 START 2
2014-06-27 STOP 1
2014-06-27 PAUSE 1
2014-06-27 REWIND 2
2014-06-27 RESTART 1
I'm not proficient in Scala and would like to use Java.

A spark newbie question

2015-01-04 Thread Dinesh Vallabhdas
A Spark Cassandra newbie question. Thanks in advance for the help. I have a Cassandra table with 2 columns, message_timestamp (timestamp) and message_type (text). The data is of the form
2014-06-25 12:01:39 "START"
2014-06-25 12:02:39 "START"
2014-06-25 12:02:39 "PAUSE"
2014-06-25 14:02:39 "STOP"
2014