Hi Ashish,

To answer your question, I assume that you are planning to process the
data and cache it in memory. If you use the Thrift server that ships with
Spark, you can run queries against that cached data. Multiple applications
can share the cached data, because internally all of their requests go to
the same Thrift server.

Spark exposes the Hive query language (HiveQL) and lets you access its data
through Spark, so you can consider using HiveQL for querying.
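
As a rough sketch of how dynamic queries from UI selections could reach the
Thrift server, something like the following might work. It is only an
illustration under assumptions: the table and column names (sales, region,
revenue, year) are made up, build_query and run_query are hypothetical
helpers, and run_query assumes the third-party PyHive package plus a Thrift
server listening on localhost:10000.

```python
import re

# Only plain identifiers are accepted for table and column names, so that
# UI input cannot inject arbitrary SQL into the generated statement.
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
AGGS = {"SUM", "AVG", "COUNT", "MIN", "MAX"}


def _ident(name):
    if not IDENT.fullmatch(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def build_query(table, group_by, agg, measure, filters):
    """Return (sql, params) for a single-table aggregate query.

    Filter values are not pasted into the SQL text; they are returned
    separately as pyformat-style parameters (%(name)s).
    """
    agg = agg.upper()
    if agg not in AGGS:
        raise ValueError(f"unsupported aggregate: {agg!r}")

    dims = [_ident(c) for c in group_by]
    select_parts = dims + [f"{agg}({_ident(measure)}) AS value"]
    sql = f"SELECT {', '.join(select_parts)} FROM {_ident(table)}"

    params = {}
    clauses = []
    for i, (col, value) in enumerate(sorted(filters.items())):
        key = f"p{i}"
        clauses.append(f"{_ident(col)} = %({key})s")
        params[key] = value
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    if dims:
        sql += f" GROUP BY {', '.join(dims)}"
    return sql, params


def run_query(sql, params, host="localhost", port=10000):
    """Send the query to a Thrift server (assumed host/port) via PyHive."""
    from pyhive import hive  # third-party; imported lazily on purpose

    conn = hive.connect(host=host, port=port)
    try:
        cur = conn.cursor()
        cur.execute(sql, params)
        return cur.fetchall()
    finally:
        conn.close()
```

For example, build_query("sales", ["region"], "sum", "revenue",
{"year": 2014}) produces a SELECT with a WHERE clause on year and a GROUP BY
on region, which run_query would then pass to the server. If the table has
been cached beforehand (for instance with a CACHE TABLE statement issued
through the same server), every application connecting this way reads from
the same cached data.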

Thanks,
Vishnu

On Wed, Feb 11, 2015 at 4:12 PM, Ashish Mukherjee <
ashish.mukher...@gmail.com> wrote:

> Hi,
>
> I am planning to use Spark for a Web-based ad hoc reporting tool on massive
> data sets on S3. Real-time queries with filters, aggregations and joins
> could be constructed from UI selections.
>
> Online documentation seems to suggest that SharkQL is deprecated and users
> should move away from it.  I understand Hive is generally not used for
> real-time querying and for Spark SQL to work with other data stores, tables
> need to be registered explicitly in code. This would not be
> suitable for a dynamic query construction scenario.
>
> For a real-time, dynamic querying scenario like mine, what is the proper
> tool to be used with Spark SQL?
>
> Regards,
> Ashish
>
