From what I have read on Spark SQL, you need to already have a DataFrame which you can then query, e.g. select * from myDataframe where <conditions>, where the DataFrame is backed by a Hive table, an Avro file, etc.
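To make that concrete, here is a rough sketch of what I mean by querying an existing DataFrame (assuming Spark 2.x, the external spark-avro data source, and a made-up path and column name; Spark 2.4+ has an "avro" format built in):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("QueryExisting").getOrCreate()

// Load Avro files into a DataFrame (path and column name are illustrative).
val df = spark.read.format("com.databricks.spark.avro").load("/data/events")

// Give it a name so it can be queried with SQL.
df.createOrReplaceTempView("myDataframe")

spark.sql("SELECT * FROM myDataframe WHERE event_date >= '2016-01-01'").show()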
What if you want to create a DataFrame on the fly from your underlying data, based on input parameters passed into your job? I.e.:

1. Read my data files (e.g. Avro) into a DataFrame, depending on what arguments are passed (e.g. a date range).
2. Perform map / mapPartitions / filter / groupBy operations on that DataFrame to create a new DataFrame.
3. Output this new DataFrame.

I can see how to do this in a standard Spark application run via spark-submit (sketched below), but what if I want to use one of the myriad tools (Tableau, Qlik, etc.) that are SparkSQL compliant and run my job from there? Is there a way I can do:

select * from functions_on_dataframe_which_output_dataframe(dataframe_built_from_input_arguments)

Appreciate any help.
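Here is a rough sketch of the spark-submit version of steps 1-3, plus the closest thing I have found so far for exposing the result to JDBC/ODBC tools: starting the Thrift server inside the same application with HiveThriftServer2.startWithContext so the temp view is visible to clients. The paths, column names, and aggregation are all made up for illustration, and it needs the spark-hive-thriftserver dependency; I am not sure this is the intended approach:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2

object ParameterisedJob {
  def main(args: Array[String]): Unit = {
    // 0. Input parameters passed into the job, e.g. a date range.
    val Array(startDate, endDate) = args

    val spark = SparkSession.builder()
      .appName("ParameterisedJob")
      .enableHiveSupport()
      .getOrCreate()

    // 1. Read the Avro files into a DataFrame, restricted by the arguments.
    val raw = spark.read
      .format("com.databricks.spark.avro")
      .load("/data/events")
      .filter(s"event_date BETWEEN '$startDate' AND '$endDate'")

    // 2. map / filter / groupBy etc. to build a new DataFrame.
    val result = raw.groupBy("user_id").count()

    // 3. Register the result so SQL clients can see it, and start the
    //    Thrift JDBC/ODBC server in-process so Tableau/Qlik/beeline can
    //    connect and run: SELECT * FROM job_result
    result.createOrReplaceTempView("job_result")
    HiveThriftServer2.startWithContext(spark.sqlContext)

    // Keep the application alive while clients query the view.
    Thread.sleep(Long.MaxValue)
  }
}

As far as I can tell, the temp view only lives as long as this application, so a longer-lived alternative would presumably be to write the result out (e.g. saveAsTable) and point the BI tool at a standalone Thrift server. But neither approach obviously lets me pass the input arguments in from the tool itself, which is really my question.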