I am looking for a similar solution, more aligned to a data scientist group. The concern I have is about supporting complex aggregations at runtime.

Thanks
Deepak

On Nov 12, 2017 12:51, "ashish rawat" <dceash...@gmail.com> wrote:

> Hello Everyone,
>
> I was trying to understand if anyone here has tried a data warehouse
> solution using S3 and Spark SQL. Out of multiple possible options
> (redshift, presto, hive etc), we were planning to go with Spark SQL, for
> our aggregates and processing requirements.
>
> If anyone has tried it out, would like to understand the following:
>
>    1. Is Spark SQL and UDF, able to handle all the workloads?
>    2. What user interface did you provide for data scientist, data
>    engineers and analysts
>    3. What are the challenges in running concurrent queries, by many
>    users, over Spark SQL? Considering Spark still does not provide spill to
>    disk, in many scenarios, are there frequent query failures when executing
>    concurrent queries
>    4. Are there any open source implementations, which provide something
>    similar?
>
>
> Regards,
> Ashish
>