Re: Question related to Spark SQL

2015-02-11 Thread VISHNU SUBRAMANIAN
I didn't mean that. When you try the above approach, only one client will have access to the cached data. But when you expose your data through a Thrift server the case is quite different: in that case all requests go to the Thrift server, and Spark is able to take advantage of the cached data…
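The setup described above can be sketched roughly as follows, using the Thrift server scripts that ship with the Spark distribution (the master URL, port, and table name `events` are hypothetical placeholders):

```shell
# Start the Thrift JDBC/ODBC server bundled with Spark; it holds a
# single long-lived SparkContext that all clients share
./sbin/start-thriftserver.sh --master spark://master:7077

# Any client can connect over JDBC via beeline and cache a table;
# because every connection goes through the same server, the cached
# table is visible to all subsequent clients
./bin/beeline -u jdbc:hive2://localhost:10000 -e "CACHE TABLE events;"

# A second, independent client now benefits from the same cache
./bin/beeline -u jdbc:hive2://localhost:10000 -e "SELECT COUNT(*) FROM events;"
```

This contrasts with running separate Spark applications, where each application has its own SparkContext and therefore its own private cache.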

Re: Question related to Spark SQL

2015-02-11 Thread VISHNU SUBRAMANIAN
Hi Ashish, To answer your question, I assume that you are planning to process data and cache it in memory. If you are using the Thrift server that comes with Spark, then you can query on top of it, and multiple applications can use the cached data, as internally all the requests go to the Thrift server…

Re: Question related to Spark SQL

2015-02-11 Thread Arush Kharbanda
I am implementing this approach currently:
1. Create data tables in spark-sql and cache them.
2. Configure the Hive metastore to read the cached tables and share the same metastore as spark-sql (you get the Spark caching advantage).
3. Run Spark code to fetch from the cached tables. In the Spark co…
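A minimal sketch of the steps above (table name `events` and schema are hypothetical; step 2 amounts to pointing both services at the same `hive-site.xml`):

```shell
# 1. Create a table in the spark-sql shell and cache it
./bin/spark-sql -e "CREATE TABLE events (id INT, ts STRING); CACHE TABLE events;"

# 2. Share the metastore: both spark-sql and the Thrift server read
#    conf/hive-site.xml, so the same metastore connection settings
#    (e.g. javax.jdo.option.ConnectionURL) make the tables visible to both

# 3. Fetch from the cached tables, either from Spark code or over JDBC
./bin/beeline -u jdbc:hive2://localhost:10000 -e "SELECT COUNT(*) FROM events;"
```

Note that caching is per SparkContext, so the caching advantage applies to queries served by the same long-running server, not to a separately launched Spark job.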

Question related to Spark SQL

2015-02-11 Thread Ashish Mukherjee
Hi, I am planning to use Spark for a web-based ad-hoc reporting tool over massive data-sets on S3. Real-time queries with filters, aggregations and joins could be constructed from UI selections. Online documentation seems to suggest that SharkQL is deprecated and users should move away from it. I u…