Ahoy the list. I am evaluating Cassandra in the context of using it as a storage back end for the Titan graph database.
We’ll have several nodes in the cluster. However, one of our requirements is that data has to be loaded into and stored on a specific node, and only on that node. It also cannot be replicated around the system, at least not persistently on disk; we will of course make copies in memory and on the wire as we access remote nodes. These requirements are non-negotiable.

We understand that this is essentially the opposite of what Cassandra is designed for, and that we’re giving up all the scalability and robustness, but is it technically possible?

First, I would need to create a custom partitioner. Is there any tutorial on that? I see a few “you don’t need to” threads, but I do.

Second, how easy is it to have Cassandra not replicate data between nodes in a cluster? I’m not seeing an obvious configuration option for that, presumably because it obviates much of the point of using Cassandra, but again, we’re working within some rather unfortunate constraints.

Any hints or suggestions would be most gratefully received.

Kind regards,
-Colin MacDonald-