For "small" failures (local failures on a single node, such as socket disconnections, disk read errors, or out-of-memory conditions), I've used Byteman before: http://byteman.jboss.org/

On Tue, Oct 4, 2016 at 5:46 PM, Joel Koshy <jjkosh...@gmail.com> wrote:

> Hi Gwen,
>
> > I've also seen suggestions of using Jepsen for fault injection, but
> > I'm not familiar with this framework.
> >
> > What do you guys think? Write our own failure injection? or write
> > Kafka tests in Jepsen?
> >
>
> This would definitely add a lot of value and save a lot on release
> validation overheads. I have heard of Jepsen (via the blog), but haven't
> used it. At LinkedIn a couple of infra teams have been using Simoorg
> <https://github.com/linkedin/simoorg> which being python-based would
> perhaps be easier to use for system test writers than Clojure (under
> Jepsen). The Ambry <https://github.com/linkedin/ambry> project at LinkedIn
> uses it extensively (and I think has added several more failure scenarios
> which don't seem to be reflected in the github repo). Anyway, I think we
> should at least enumerate what we want to test and evaluate the
> alternatives before reinventing.
>
> Thanks,
>
> Joel
>