It's really only mildly interactive. When I used Presto+Hive in the past (just as a consumer, not an admin), it seemed to be able to provide answers within ~2 minutes even for fairly large data sets. I'm hoping I can get a similar level of responsiveness with Spark.
Thanks, Sonal! I'll take a look at the example log processor and see what I can come up with.

-Mat

matschaffer.com

On Mon, May 23, 2016 at 3:08 PM, Jörn Franke <[email protected]> wrote:

> Do you want to replace ELK by Spark? Depending on your queries you could
> do as you proposed. However, many of the text analytics queries will
> probably be much faster on ELK. If your queries are more interactive and
> not about batch processing then it does not make so much sense. I am not
> sure why you plan to use Presto.
>
> On 23 May 2016, at 07:28, Mat Schaffer <[email protected]> wrote:
>
> I'm curious about trying to use spark as a cheap/slow ELK
> (ElasticSearch,Logstash,Kibana) system. Thinking something like:
>
> - instances rotate local logs
> - copy rotated logs to s3
>   (s3://logs/region/grouping/instance/service/*.logs)
> - spark to convert from raw text logs to parquet
> - maybe presto to query the parquet?
>
> I'm still new on Spark though, so thought I'd ask if anyone was familiar
> with this sort of thing and if there are maybe some articles or documents I
> should be looking at in order to learn how to build such a thing. Or if
> such a thing even made sense.
>
> Thanks in advance, and apologies if this has already been asked and I
> missed it!
>
> -Mat
>
> matschaffer.com
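
For the S3 layout proposed in the original message (s3://logs/region/grouping/instance/service/*.logs), one piece that could be sketched independently of Spark is turning each object path into fields that might later serve as Parquet partition columns. The following is only a rough illustration in plain Python: the `LogKey` type, the function names, the sample path, and the choice of region/grouping/service as partition columns are all assumptions, not something settled in this thread.

```python
import re
from typing import NamedTuple, Optional


class LogKey(NamedTuple):
    """Fields recovered from a log object's S3 path (hypothetical type)."""
    region: str
    grouping: str
    instance: str
    service: str
    filename: str


# Layout taken from the thread: s3://logs/region/grouping/instance/service/*.logs
KEY_PATTERN = re.compile(
    r"^s3://logs/"
    r"(?P<region>[^/]+)/"
    r"(?P<grouping>[^/]+)/"
    r"(?P<instance>[^/]+)/"
    r"(?P<service>[^/]+)/"
    r"(?P<filename>[^/]+\.logs)$"
)


def parse_log_key(uri: str) -> Optional[LogKey]:
    """Return the path fields, or None if the URI does not fit the layout."""
    match = KEY_PATTERN.match(uri)
    if match is None:
        return None
    return LogKey(**match.groupdict())


def partition_path(key: LogKey) -> str:
    """Build a Hive-style key=value directory path.

    Which fields to partition on is an assumption; instance is left out
    here only to keep the number of directories small.
    """
    return f"region={key.region}/grouping={key.grouping}/service={key.service}"


if __name__ == "__main__":
    # The sample path below is made up for illustration.
    key = parse_log_key("s3://logs/us-east-1/web/i-0abc/nginx/access.logs")
    if key is not None:
        print(partition_path(key))
```

Hive-style key=value directories are the layout Spark's partitioned Parquet output uses and that Hive-compatible query engines can read, so a mapping like this might be a reasonable starting point for the "raw text to parquet" step.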
