Hi everybody,

We are working on porting some life science applications to Cassandra, but we have run into its limits when handling very large queries. Our queries are usually multiget_slice calls: many rows, each with many columns.
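To make the access pattern concrete, here is roughly what one of our calls looks like through the raw Thrift API. The keyspace "lifescience", the column family "experiments", the row keys and the 10 000-column slice are just placeholders; the point is that a single call asks one coordinator for many wide rows at once.

    import java.nio.ByteBuffer;
    import java.util.Arrays;
    import java.util.List;
    import java.util.Map;

    import org.apache.cassandra.thrift.Cassandra;
    import org.apache.cassandra.thrift.ColumnOrSuperColumn;
    import org.apache.cassandra.thrift.ColumnParent;
    import org.apache.cassandra.thrift.ConsistencyLevel;
    import org.apache.cassandra.thrift.SlicePredicate;
    import org.apache.cassandra.thrift.SliceRange;
    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.transport.TFramedTransport;
    import org.apache.thrift.transport.TSocket;
    import org.apache.thrift.transport.TTransport;

    public class MultigetExample {
        public static void main(String[] args) throws Exception {
            // Framed Thrift connection to a single node, which acts as the coordinator.
            TTransport transport = new TFramedTransport(new TSocket("localhost", 9160));
            transport.open();
            Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
            client.set_keyspace("lifescience"); // placeholder keyspace name

            // Many row keys in one request... (in practice, far more than two)
            List<ByteBuffer> keys = Arrays.asList(
                    ByteBuffer.wrap("row-0001".getBytes("UTF-8")),
                    ByteBuffer.wrap("row-0002".getBytes("UTF-8")));

            // ...and a wide column slice for each row.
            SlicePredicate predicate = new SlicePredicate()
                    .setSlice_range(new SliceRange(
                            ByteBuffer.wrap(new byte[0]), // start: beginning of the row
                            ByteBuffer.wrap(new byte[0]), // finish: end of the row
                            false,                        // not reversed
                            10000));                      // columns per row (placeholder)

            // The coordinator fans the request out to the replicas and returns
            // the whole result map in one response.
            Map<ByteBuffer, List<ColumnOrSuperColumn>> result = client.multiget_slice(
                    keys,
                    new ColumnParent("experiments"),      // placeholder column family
                    predicate,
                    ConsistencyLevel.ONE);

            System.out.println("rows returned: " + result.size());
            transport.close();
        }
    }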
We have seen the system get slower and slower, until the entry point node crashes, as we increase the number of rows and columns requested in a single query. Where does this limit come from?

Taking a quick look at the code, it seems the entry point is stressed because it has to keep all the responses in memory: only after it has received all the responses from the other nodes does it resolve the conflicts between the different versions and send the result to the client.

Would it not be possible to start sending the response to the client before all of them have been received? For instance, this might speed up consistency level ONE queries.

Has anyone worked on this? Is this really the reason for the drop in performance?

Thank you,
Cesare Cugnasco