We use Kafka as a durable buffer for third-party event traffic, where it acts as the event source in a lambda architecture. We want exactly-once semantics and we are close, though we can still lose messages when aggregating for Hadoop. To really tie this all together, I think there should be an Apache project that implements a proper three-phase distributed transaction capability, which the Kafka and Hadoop communities could build together. The paper linked below looks promising: its protocol takes 3 RTTs, but it is non-blocking. This could become part of a new consumer API at some point.
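To make the "3 RTT" cost concrete, here is a rough sketch of the textbook three-phase commit message flow (can-commit, pre-commit, do-commit). It is not the protocol from the paper, and it is not an existing Kafka or Hadoop API: the `Participant` class, the `three_phase_commit` function and the participant names are all hypothetical. It also leaves out the timeouts and recovery logic that a real non-blocking protocol depends on.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class State(Enum):
    INIT = "init"
    READY = "ready"                # voted yes in phase 1
    PRECOMMITTED = "precommitted"  # acknowledged phase 2
    COMMITTED = "committed"
    ABORTED = "aborted"


@dataclass
class Participant:
    """Hypothetical transaction participant, e.g. a Kafka consumer or an HDFS writer."""
    name: str
    will_vote_yes: bool = True
    state: State = State.INIT

    def can_commit(self) -> bool:
        # Phase 1: vote on whether this participant is able to commit.
        self.state = State.READY if self.will_vote_yes else State.ABORTED
        return self.will_vote_yes

    def pre_commit(self) -> bool:
        # Phase 2: acknowledge that every participant voted yes.
        self.state = State.PRECOMMITTED
        return True

    def do_commit(self) -> bool:
        # Phase 3: make the change durable.
        self.state = State.COMMITTED
        return True

    def abort(self) -> None:
        self.state = State.ABORTED


def three_phase_commit(participants: List[Participant]) -> Tuple[bool, int]:
    """Run the three phases in order; return (committed, phase round trips used)."""
    round_trips = 1
    votes = [p.can_commit() for p in participants]  # ask everyone, then decide
    if not all(votes):
        for p in participants:
            p.abort()
        return False, round_trips

    round_trips += 1
    for p in participants:
        p.pre_commit()

    round_trips += 1
    for p in participants:
        p.do_commit()
    return True, round_trips
```

With two participants that both vote yes, the coordinator goes through all three phases, which is where the three round trips come from. A single "no" vote in the first phase aborts everyone after one round trip.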
http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=1703048

Regards,
Rob