Hello,

We have been working on a distributed data proxy for Cassandra. A data proxy combines a proxy with a cache and also takes care of data consistency and cache invalidation for inserts and updates. In addition, the data proxy is distributed using consistent hashing, and the data proxy nodes gossip with each other so that each cached entry lives on exactly one node and stays consistent. Finally, we have also implemented our data proxy on an FPGA-based accelerator to achieve lower latency and higher throughput.
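To make the idea concrete, here is a minimal Python sketch of consistent-hash routing with invalidate-on-write. It is an illustration only and not our actual implementation: the names `HashRing`, `ProxyCluster`, and `vnodes` are hypothetical, a plain dict stands in for Cassandra, and the gossip protocol and the FPGA path are left out entirely.

```python
import bisect
import hashlib


class HashRing:
    """Consistent-hash ring mapping keys to proxy nodes (hypothetical sketch)."""

    def __init__(self, nodes, vnodes=64):
        # Each node gets several virtual tokens so keys spread more evenly.
        ring = sorted(
            (self._token(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._tokens = [token for token, _ in ring]
        self._owners = [node for _, node in ring]

    @staticmethod
    def _token(value):
        # First 8 bytes of SHA-256 as an integer position on the ring.
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def owner(self, key):
        # First token clockwise from the key's position, wrapping around.
        idx = bisect.bisect_right(self._tokens, self._token(key))
        return self._owners[idx % len(self._owners)]


class ProxyCluster:
    """Routes each key to one proxy node's cache (hypothetical sketch)."""

    def __init__(self, nodes, backend):
        self.ring = HashRing(nodes)
        self.caches = {node: {} for node in nodes}
        self.backend = backend  # assumption: a dict standing in for Cassandra

    def read(self, key):
        cache = self.caches[self.ring.owner(key)]
        if key not in cache:
            # Cache miss: fetch from the backing store and remember it.
            cache[key] = self.backend[key]
        return cache[key]

    def write(self, key, value):
        # Write to the backing store, then invalidate the owner's cached copy.
        self.backend[key] = value
        self.caches[self.ring.owner(key)].pop(key, None)


if __name__ == "__main__":
    cluster = ProxyCluster(["proxy-a", "proxy-b", "proxy-c"], {"user:1": "alice"})
    print(cluster.read("user:1"))
    cluster.write("user:1", "bob")
    print(cluster.read("user:1"))
```

Because every key has a single owner on the ring, a write only has to invalidate one cache in this sketch. Handling nodes joining or leaving would need the gossip layer, which is not shown here.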
We have a blog post with more details about our technology and initial results here: https://www.reniac.com/2018/04/10/turbocharging-your-cassandra-db-with-reniac-data-proxy/

In brief, the main highlights of our results are a latency reduction of roughly 9X-10X and a throughput increase of 3X-4X compared to baseline Cassandra. We are now exploring other workloads to benchmark with our engine, so we would be interested to hear what kind of benchmarking setup you would like to see us use.

Thanks,
Chidamber