Hi Colin,

The Kibosh code is just a README for now. Is it going to be published soon?

Tim

On Tue, Aug 22, 2017 at 11:44 AM, Colin McCabe <cmcc...@apache.org> wrote:
> Hi all,
>
> I've been working on a fault injector for Apache Kafka.  The general
> idea is to create faults such as network partitions or disk failures,
> and see what happens in the cluster.  The fault injector can run as part
> of a ducktape system test, or standalone.
>
> The fault injector has two processes: a coordinator, and an agent.  The
> agent process is responsible for actually implementing the faults.  For
> example, it might run iptables, send signals to processes, generate a
> lot of load, or do something else to disrupt the computer it is running
> on.  We run an agent process on each node where we would like to
> potentially inject faults.  So it will run alongside the brokers,
> ZooKeeper nodes, etc.
>
> The coordinator process is responsible for communicating with the agent
> processes and for scheduling faults.  For example, the coordinator can
> be instructed to create a fault immediately on several nodes.  Or it can
> be instructed to create faults over time, based on a pseudorandom seed.
> Both the coordinator and the agent expose a REST interface that accepts
> objects serialized via JSON.
>
> I think two kinds of faults will be especially interesting: network
> faults, and disk errors.  Simulating network faults in a Linux
> environment is relatively straightforward using iptables.  Disk errors
> are tougher to simulate, but I have written a FUSE filesystem to do
> this.  The filesystem essentially simulates a bind mount in most cases,
> but it can take a JSON specification telling it to inject certain
> faults.  (Disk errors seem especially relevant to the ongoing work on
> JBOD.)
>
> Although it's not a user-visible component, I think having a fault
> injector will be really great for Kafka users.  It will really help us
> stress test Kafka in more situations.  I'm going to post some patches in
> a day or two-- it would be great to get some feedback.  Check out
> https://cwiki.apache.org/confluence/display/KAFKA/Fault+Injection
>
> best,
> Colin
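
For anyone following the thread, here is a rough sketch of what "create faults over time, based on a pseudorandom seed" could look like, with the result serialized to JSON as the quoted message describes. It is only an illustration under stated assumptions: the fault kinds, the JSON field names, and the functions build_schedule and to_json are all hypothetical and are not taken from the actual fault injector or its REST interface.

```python
import json
import random

# Hypothetical fault kinds and JSON field names, for illustration only;
# they are not taken from the actual fault injector.
FAULT_KINDS = ["network_partition", "disk_error"]


def build_schedule(seed, nodes, count, horizon_ms):
    """Return a list of fault specs that depends only on the arguments."""
    # A private generator keeps the schedule reproducible for a given seed
    # and leaves the global random state untouched.
    rng = random.Random(seed)
    faults = []
    for _ in range(count):
        faults.append({
            "node": rng.choice(nodes),
            "kind": rng.choice(FAULT_KINDS),
            "start_ms": rng.randrange(horizon_ms),
            "duration_ms": rng.randrange(1000, 10000),
        })
    # Order by start time so the faults can be dispatched in sequence.
    faults.sort(key=lambda fault: fault["start_ms"])
    return faults


def to_json(faults):
    """Serialize a schedule as a JSON object with a single 'faults' list."""
    return json.dumps({"faults": faults}, sort_keys=True)


if __name__ == "__main__":
    schedule = build_schedule(42, ["broker1", "broker2", "zk1"], 3, 60000)
    print(to_json(schedule))
```

The point of seeding is that a failing run can be replayed: the same seed, node list, and parameters give back the same schedule, so a fault sequence that exposed a bug can be generated again.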
