On Tue, Oct 11, 2016 at 11:02 AM, Shivaram Venkataraman <shiva...@eecs.berkeley.edu> wrote:
>
> Could you expand a little bit more on stability? Is it just bursty
> workloads in terms of peak vs. average throughput? Also what level of
> latencies do you find users care about? Is it on the order of 2-3
> seconds vs. 1 second vs. 100s of milliseconds?
>
Regarding stability, I've seen two levels of concrete requirements. The first is "don't bring down my Spark cluster". That is to say, regardless of the input data rate, Spark shouldn't thrash or crash outright. Processing may lag behind the data arrival rate, but the cluster should stay up and remain fully functional.

The second level is "don't bring down my application". A common use for streaming systems is to handle heavyweight computations that are part of a larger application, like a web application, a mobile app, or a plant control system. For example, an online application for car insurance might need to do some pretty involved machine learning to produce an accurate quote and suggest good upsells to the customer. If the heavyweight portion times out, the whole application times out, and you lose a customer.

In terms of bursty vs. non-bursty, the "don't bring down my Spark cluster" case is more about handling bursts, while the "don't bring down my application" case is more about delivering acceptable end-to-end response times under typical load.

Regarding latency: One group I talked to mentioned requirements in the 100-200 msec range, driven by the need to display a web page on a browser or mobile device. Another group in the Internet of Things space mentioned times ranging from 5 seconds to 30 seconds throughout the conversation. But most people I've talked to have been pretty vague about specific numbers.

My impression is that these groups are not motivated by anxiety about meeting a particular latency target for a particular application. Rather, they want to make low latency the norm so that they can stop having to think about latency. Today, low latency is a special requirement of special applications, but that policy imposes a lot of hidden costs. IT architects have to spend time estimating the latency requirements of every application and lobbying for special treatment when those requirements are strict. Managers have to spend time engineering business processes around latency. Data scientists have to spend time packaging up models and negotiating how those models will be shipped over to the low-latency serving tier. And customers who are accustomed to Google and smartphones end up with an experience that is functional but unsatisfying.

It's best to think of latency as a sliding scale. A given level of latency imposes a given level of cost enterprise-wide. Someone who is making a decision on middleware policy will balance this cost against other costs: How much does it cost to deploy the middleware? How much does it cost to train developers to use the system? The winner will be the system that minimizes the overall cost.

Fred