Ok, so the latency problem is caused by my using SQL as the source? How about CSV, Hive, or another source?
On Tue, Dec 1, 2015 at 9:18 PM, Mark Hamstra <[email protected]> wrote:

> It is not designed for interactive queries.
>
> You might want to ask the designers of Spark, Spark SQL, and particularly
> some things built on top of Spark (such as BlinkDB) about their intent with
> regard to interactive queries. Interactive queries are not the only
> designed use of Spark, but it is going too far to claim that Spark is not
> designed at all to handle interactive queries.
>
> That being said, I think that you are correct to question the wisdom of
> expecting lowest-latency query response from Spark using SQL (sic,
> presumably an RDBMS is intended) as the datastore.
>
> On Tue, Dec 1, 2015 at 4:05 PM, Jörn Franke <[email protected]> wrote:
>
>> Hmm, it will never be faster than SQL if you use SQL as the underlying
>> storage. Spark is (currently) an in-memory batch engine for iterative
>> machine learning workloads. It is not designed for interactive queries.
>> Currently Hive is going in the direction of interactive queries.
>> Alternatives are Phoenix on HBase or Impala.
>>
>> On 01 Dec 2015, at 21:58, Andrés Ivaldi <[email protected]> wrote:
>>
>> Yes,
>> The use case would be:
>> Have Spark in a service (I didn't investigate this yet); through API calls
>> to this service we perform some aggregations over data in SQL. We are
>> already doing this with an internal development.
>>
>> Nothing complicated; for instance, a table with Product, Product Family,
>> cost, price, etc. Columns acting as dimensions and measures.
>>
>> I want to use Spark to query that table and perform a kind of rollup, with
>> cost as the measure and Product, Product Family as the dimensions.
>>
>> Only 3 columns; it takes about 20s to perform that query and the
>> aggregation, while the query directly against the database with a GROUP BY
>> on those columns takes about 1s.
>>
>> regards
>>
>> On Tue, Dec 1, 2015 at 5:38 PM, Jörn Franke <[email protected]> wrote:
>>
>>> Can you elaborate more on the use case?
>>>
>>> > On 01 Dec 2015, at 20:51, Andrés Ivaldi <[email protected]> wrote:
>>> >
>>> > Hi,
>>> >
>>> > I'd like to use Spark to perform some transformations over data stored
>>> > in SQL, but I need low latency. I'm doing some tests, and I've run into
>>> > Spark context creation and data queries over SQL taking too long.
>>> >
>>> > Any ideas for speeding up the process?
>>> >
>>> > regards.
>>> >
>>> > --
>>> > Ing. Ivaldi Andres
>>
>> --
>> Ing. Ivaldi Andres

--
Ing. Ivaldi Andres
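For reference, the rollup described in the thread (cost as the measure; Product and Product Family as the dimensions) can be sketched in plain Python. The data below is hypothetical, standing in for the SQL table mentioned above; in Spark itself the multi-level grouping would be done natively on a DataFrame with `rollup(...)` followed by an aggregation.

```python
from collections import defaultdict

# Hypothetical rows standing in for the table from the thread:
# (product, product_family, cost)
rows = [
    ("A1", "Widgets", 10.0),
    ("A2", "Widgets", 5.0),
    ("B1", "Gadgets", 7.5),
]

def rollup(rows):
    """Aggregate cost at three levels, like SQL's GROUP BY ... WITH ROLLUP:
    per (product, family), per family, and the grand total."""
    by_product = defaultdict(float)   # finest level: (product, family)
    by_family = defaultdict(float)    # subtotal per product family
    total = 0.0                       # grand total
    for product, family, cost in rows:
        by_product[(product, family)] += cost
        by_family[family] += cost
        total += cost
    return by_product, by_family, total

by_product, by_family, total = rollup(rows)
print(by_family["Widgets"])  # 15.0
print(total)                 # 22.5
```

Note that Spark's 20s vs the database's 1s in the thread is dominated not by this aggregation but by pulling the whole table over JDBC into Spark before grouping, whereas the direct SQL query lets the database aggregate in place.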
