>> Since you’re calling them thin, I’m thinking that they’re probably
>> E3.S. U.3 is the size of a conventional 2.5” SFF SSD or HDD.
>
> Hrm, my terminology is probably confusing. According to the specs of
> the servers, they are U.3 slots.
Ah. I forget sometimes that there are both 7mm and 15mm drive heights.

>> Understandable, but you might think in terms of percentage. If you
>> add four HDD OSDs to each node, with 8 per NVMe offload device, that
>> device is the same overall percentage of the cluster as what you have
>> today.
>
> But I also think of it in terms of re-setting up four OSDs as opposed
> to eight :-)

Honestly that is one reason I recommend that people find a way to
deploy all-NVMe chassis, which can even cost LESS than HDD.

>> so if you suffer a power outage you may be in a world of hurt.
>
> But only if 3+ nodes lose power/get "rudely" rebooted first, correct?

If you have 5 nodes with R3 pools, losing power to 2 of them could
result in data being unavailable without manual intervention. Loss of
3+ could result in permanent data loss. (The first sketch at the end of
this message shows the arithmetic.) Data in flight can easily be lost
or corrupted. I’ve experienced this myself when testing resilience by
dropping power to an entire rack at once. Which is also a terrific and
terrifying way to expose flaws in expensive RAID HBAs, against which
I’ve ranted on this list for years.

> Just bringing this back to my original question: since we have the room
> to add up to four more HDDs to each of our existing 5 nodes, if we
> wanted to add an additional 20 HDDs altogether, is there any real
> performance difference between adding them to the existing nodes or by
> adding 5 more nodes?

This is Ceph, so the answer is It Depends. If your nodes have
sufficient RAM, CPU, and networking, there might not be a measurable
difference. More nodes would have the advantage of each node being a
smaller blast radius in percentage terms (each of 5 nodes is 20% of the
cluster; each of 10, only 10%), and would also give you the potential
for using more-advantageous EC profiles should you wish in the future,
since with a host failure domain a profile wants at least k+m hosts
(second sketch below).

> I could see that there might be, as by adding more nodes, the IOPS are
> spread across a bigger footprint, and less likely to saturate the
> bandwidth

Network? Depends on your links. It’s harder to saturate a network with
HDDs than with SSDs, especially NVMe, but with, say, a 1 GbE network
without proper bonding your nodes could conceivably saturate the links
(third sketch below). Denser nodes with as many as 180 OSDs each (I’ve
seen it proposed) or a more modest number of NVMe SSDs can easily
saturate even faster network links, especially if their hash policies
aren’t ideal. Dense HDD nodes can also saturate HBAs, backplanes, and
expanders.

> , as opposed to being more concentrated, but then I am not
> 100% sure that it works that way? Maybe it just matters more that
> there are more spinners available to increase the total IOPS?

With modern rotational media (I doubt we have drum OSDs but that’d be
really cool) that’s often the case: IOPS are limited by the interface
and by rotational/seek latency.

A rough rule of thumb is 2 vcores / threads per HDD OSD, though a 1TB
HDD OSD and a 30TB OSD might have different demands. CPU-limited
systems would tend to benefit from more nodes, as would those with
limited physmem. The latter of course is often easier to augment; the
default osd_memory_target is 4GB, so 6GB per OSD + other daemons + OS
overhead is a good target. For NVMe OSDs, depending on who you ask,
plan on 4-6 vcores / threads each. The last sketch below runs those
numbers for a node.
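First sketch: a quick combinatorial back-of-the-envelope (plain Python;
the numbers are assumptions, and it pretends CRUSH places PGs uniformly
across host failure domains) for why losing 2 of 5 hosts takes R3 PGs
inactive and losing 3 can lose data outright:

    from math import comb

    hosts, size, min_size = 5, 3, 2     # assumed: 5 hosts, R3 pools

    for failed in range(1, hosts + 1):
        total = comb(hosts, size)       # possible 3-host placements
        # PGs with fewer than min_size surviving replicas stop serving I/O
        inactive = sum(comb(failed, down) * comb(hosts - failed, size - down)
                       for down in range(size - min_size + 1, size + 1))
        # PGs whose entire acting set is on failed hosts have nothing left
        all_down = comb(failed, size)
        print(f"{failed} host(s) down: {inactive/total:.0%} of PGs inactive,"
              f" {all_down/total:.0%} with every replica down")

With min_size=2, roughly 30% of PGs go inactive the moment the second
host drops, and a third host down leaves some PGs with no replica at
all. That is the world of hurt, before counting any in-flight writes.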
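Second sketch: why more hosts opens up better EC profiles. Usable
capacity is k/(k+m), and with a host failure domain you want at least
k+m hosts; the profiles here are common examples, not a recommendation:

    # name -> (k, m); replica 3 expressed as 1 data chunk + 2 copies
    profiles = {"replica 3": (1, 2), "EC 4+2": (4, 2), "EC 6+2": (6, 2)}

    for name, (k, m) in profiles.items():
        print(f"{name}: needs >= {k + m} hosts,"
              f" {k / (k + m):.0%} of raw capacity usable")

Five hosts cap you at something like 3+2; ten let you run 6+2 or wider.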
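Third sketch: the link-saturation arithmetic. The 200 MB/s per spinner
is an assumed best-case sequential rate; real mixed workloads run far
lower, so treat this as the worst case for the network:

    hdd_bytes_s = 200e6                     # assumed streaming rate per HDD

    for link_gbit in (1, 10, 25):
        link_bytes_s = link_gbit * 1e9 / 8  # line rate, ignoring overhead
        print(f"{link_gbit:>2} GbE: {link_bytes_s / hdd_bytes_s:.1f}"
              f" streaming HDDs fill the link")

A single busy spinner can outrun 1 GbE, and replication traffic on the
back side only makes it worse.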
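Last sketch: the rule-of-thumb budget above, applied per node. The OSD
counts and the 16 GB base figure for the OS and non-OSD daemons are
illustrative assumptions, not your actual layout:

    def node_budget(hdd_osds, threads_per_osd=2, gb_per_osd=6, base_gb=16):
        """Threads and RAM a node should have for this many HDD OSDs."""
        return hdd_osds * threads_per_osd, hdd_osds * gb_per_osd + base_gb

    for osds in (4, 8):    # e.g. four HDDs per node today, eight after
        threads, ram = node_budget(osds)
        print(f"{osds} HDD OSDs: ~{threads} threads, ~{ram} GB RAM")

If your existing boxes already have that headroom, stacking four more
spinners on each is fine; if not, that is the argument for more nodes.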