On Mar 22, 2011, at 5:09 AM, aaron morton wrote:

> 1) You should use nodes with the same capacity (CPU, RAM, HDD); Cassandra
> assumes they are all equal.

Care to elaborate? While equal nodes will certainly make life easier, I would have thought that the dynamic snitch would take care of performance differences, and that manual assignment of token ranges can yield any data distribution. Obviously, a node that holds twice as much data will probably get twice the load. But if that is no problem ...

Where does Cassandra assume that all nodes are equal?

Cheers,
Daniel

>
> 2) Not sure what exactly would happen. I am guessing either the node would
> shut down or writes would eventually block, probably the former. If the node
> was up, read performance may suffer (if more writes were being sent in).
> If you really want to know more, let me know and I may find time to dig into
> it.
>
> Also, a node is responsible for storing its own token range and acting as a
> replica for other token ranges. So reducing the token range may not have a
> dramatic effect on the storage requirements.
>
> Hope that helps.
> Aaron
>
> On 22 Mar 2011, at 09:50, Jonathan Colby wrote:
>
>>
>> This is a two-part question ...
>>
>> 1. If you have Cassandra nodes with different-sized hard disks, how do you
>> deal with assigning the token ring such that the nodes with larger disks get
>> more data? In other words, given equally distributed token ranges, when
>> the smaller-disk nodes run out of space, the larger-disk nodes will still
>> have unused capacity. Or is installing a mixed-hardware cluster a no-no?
>>
>> 2. What happens when a Cassandra node runs out of disk space for its data
>> files? Does it continue serving the data while not accepting new data? Or
>> does the node break and require manual intervention?
>>
>> This info has eluded me elsewhere.
>> Jon
>
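
For what it's worth, the manual token assignment discussed in this thread could be sketched roughly as below. This is only an illustration under stated assumptions: it assumes the RandomPartitioner with a token space of 0 to 2**127, it assumes each node owns the range from the previous node's token (exclusive) up to its own token (inclusive), and it only sizes the primary range. As Aaron points out, a node also stores replicas of other ranges, so actual disk usage will not follow these shares exactly. The function names and the example capacities are made up for the illustration.

```python
from fractions import Fraction

# Assumption: RandomPartitioner token space of 0 .. 2**127.
RING_SIZE = 2 ** 127


def weighted_tokens(capacities):
    """Return one token per node so that each node's primary range is
    proportional to its capacity (integers, e.g. disk size in GB).

    Assumed ownership rule: node i owns (tokens[i-1], tokens[i]], and
    node 0 owns the wrap-around range that ends at token 0.
    """
    if not capacities or any(c <= 0 for c in capacities):
        raise ValueError("capacities must be a non-empty list of positive integers")
    total = sum(capacities)
    tokens = [0]  # node 0 sits at token 0 and takes the wrap-around range
    cumulative = 0
    for cap in capacities[1:]:
        cumulative += cap
        tokens.append(RING_SIZE * cumulative // total)
    return tokens


def primary_shares(tokens):
    """Return the fraction of the ring each node owns as its primary range."""
    if len(tokens) == 1:
        return [Fraction(1)]  # a single node owns the whole ring
    return [
        Fraction((tokens[i] - tokens[i - 1]) % RING_SIZE, RING_SIZE)
        for i in range(len(tokens))
    ]


if __name__ == "__main__":
    # Hypothetical cluster: two 500 GB nodes and one 1000 GB node.
    tokens = weighted_tokens([500, 500, 1000])
    for token, share in zip(tokens, primary_shares(tokens)):
        print(token, float(share))
```

Each resulting value would then be set as that node's `initial_token` in cassandra.yaml before it first joins the ring. With equal capacities the function reduces to the usual evenly spaced tokens, i * 2**127 / N.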