hi Moïn,

a few suggestions, based on my experience:

1. The max size for a GOOD QUALITY 7200 RPM spinning SATA/SAS HDD is 4 TB.
Anything larger will ruin your performance, unless you do pure archiving of
files (written once, "never" touched again). If you have 8 TB HDDs, fill
them to at most 50%.

2. Use an SSD cache tier, with SSDs that can sustain continuous IO
operations. Depending on the size of that cache tier, you might be able to
use more than 4 TB per 7200 RPM spinning HDD.

3. Of course, though that is already quite standard: use SSDs for the
journals and metadata (take care to use the right SSDs for that). Look at
https://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/
to get an idea of what I mean.
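The idea behind that post is simply to measure how many synchronous 4k
writes a candidate SSD can sustain before you buy a stack of them. A sketch
with fio (replace /dev/sdX with a blank test device; this writes to it
directly):

  fio --filename=/dev/sdX --direct=1 --sync=1 --rw=write --bs=4k \
      --numjobs=1 --iodepth=1 --runtime=60 --time_based \
      --group_reporting --name=journal-test

Datacenter-grade SSDs keep their IOPS up in this test, while many consumer
drives collapse, which is exactly what you want to find out before using
them as journals.

For the cache tier in point 2, the wiring itself is only a handful of
commands once you have an SSD-backed pool; the real work is the sizing and
eviction tuning. A minimal sketch, with made-up pool names:

  # put the SSD pool "cache" in front of the HDD-backed pool "data"
  ceph osd tier add data cache
  ceph osd tier cache-mode cache writeback
  ceph osd tier set-overlay data cache

  # give it a hit set and a hard size limit, so a single huge file
  # cannot churn the whole tier
  ceph osd pool set cache hit_set_type bloom
  ceph osd pool set cache target_max_bytes 1099511627776   # 1 TiB, adjust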
Good luck!

--
Mit freundlichen Gruessen / Best regards

Oliver Dzombic
IP-Interactive

mailto:i...@ip-interactive.de

Address:
IP Interactive UG ( haftungsbeschraenkt )
Zum Sonnenberg 1-3
63571 Gelnhausen

HRB 93402, Amtsgericht Hanau
Managing director: Oliver Dzombic

Tax no.: 35 236 3622 1
VAT ID: DE274086107

On 30.06.2016 at 10:34, m.da...@bluewin.ch wrote:
> Thank you all for your prompt answers.
>
>> firstly, wall of text, makes things incredibly hard to read.
>> Use paragraphs/returns liberally.
>
> I actually made sure to use paragraphs. For some reason, the formatting
> was removed.
>
>> Is that your entire experience with Ceph, ML archives and docs?
>
> Of course not, I have already been through the whole documentation many
> times. It's just that I couldn't really decide between the choices I was
> given.
>
>> What's an "online storage"?
>> I assume you're talking about what is commonly referred to as "cloud
>> storage".
>
> I try not to use the term "cloud", but if you must, then yes, that's the
> idea behind it. Basically an online hard disk.
>
>> 10MB is not a small file in my book, 1-4KB (your typical mail) are
>> small files.
>> How much data (volume/space) are you looking at initially and within a
>> year of deployment?
>
> 10MB is small compared to the larger files, but it is indeed bigger than
> the smaller, IOPS-intensive files (like the emails you pointed out).
>
> Right now there are two servers, each with 12x8TB. I expect growth of
> about the same size every 2-3 months.
>
>> What usage patterns are you looking at, expecting?
>
> Since my customers will put their files on this "cloud", it's generally
> write once, read many (or at least more reads than writes).
> As they will most likely store private documents, with some bigger files
> too, the smaller files are predominant.
>
>> That's quite the blanket statement and sounds like it's from a sales
>> brochure.
>> SSDs for OSD journals are always a good idea.
>> Ceph scales first and foremost by adding more storage nodes and OSDs.
>
> What I meant by scaling is that as the number of customers grows, so does
> the number of small files, and in order to have decent performance at
> that point, SSDs are a must. I can add many OSDs, but if they are all
> struggling with IOPS then it's no use (except for having more space).
>
>> Are we talking about existing HW or what you're planning?
>
> That is existing hardware. Given the high capacity of the drives, I went
> with a more powerful CPU to avoid future headaches.
>
>> Also, avoid large variations in your storage nodes if at all possible,
>> especially in your OSD sizes.
>
> Say I have two nodes, one with 12 OSDs and the other with 24. All drives
> are the same size. Would that cause any issue (except for the failure
> domain)?
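A quick way to see what such a mixed 12/24-OSD layout does in practice is
to compare the CRUSH weights and per-OSD utilisation once both nodes are
in (read-only commands):

  ceph osd tree      # weight hierarchy per host and OSD
  ceph osd df tree   # utilisation and PG count per OSD, grouped by host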
> I think it is clear that native calls are the way to go, even the docs
> point you in that direction. Now the issue is that the clients need to
> have a file/directory structure.
>
> The access topology is as follows:
>
> Customer <-> customer application <-> server application <-> Ceph cluster
>
> The customer has to be able to create directories, as with an FTP server
> for example. Using CephFS would make this task very easy, though at the
> expense of some performance.
> With native calls, since everything is considered an object, it gets
> trickier to provide this feature. Perhaps some naming scheme would make
> this possible.
>
> Kind regards,
>
> Moïn Danai.
>
> ----Original Message----
> From: ch...@gol.com
> Date: 27/06/2016 - 02:45 (CEST)
> To: ceph-users@lists.ceph.com
> Cc: m.da...@bluewin.ch
> Subject: Re: [ceph-users] Ceph for online file storage
>
> Hello,
>
> firstly, wall of text, makes things incredibly hard to read.
> Use paragraphs/returns liberally.
>
> Secondly, what Yang wrote.
>
> More inline.
>
> On Sun, 26 Jun 2016 18:30:35 +0000 (GMT+00:00) m.da...@bluewin.ch wrote:
>
>> Hi all,
>> After a quick review of the mailing list archive, I have a question
>> that is left unanswered:
>
> Is that your entire experience with Ceph, ML archives and docs?
>
>> Is Ceph suitable for online file storage, and if yes, shall I use
>> RGW/librados or CephFS?
>
> What's an "online storage"?
> I assume you're talking about what is commonly referred to as "cloud
> storage".
> Which also typically tends to use HTTP and S3, and thus RGW would be the
> classic fit.
>
> But that's up to you really.
>
> For example, OwnCloud (and thus NextCloud) can use Ceph RGW as a storage
> backend.
>
>> The typical workload here is mostly small files, 50kB-10MB, and some
>> bigger ones, 100MB+ up to 4TB max (roughly a 70/30 split).
>
> 10MB is not a small file in my book, 1-4KB (your typical mail) are small
> files.
> How much data (volume/space) are you looking at initially and within a
> year of deployment?
>
> What usage patterns are you looking at, expecting?
>
>> Caching with SSDs is critical in achieving scalable performance as OSD
>> hosts increase (and files as well).
>
> That's quite the blanket statement and sounds like it's from a sales
> brochure.
> SSDs for OSD journals are always a good idea.
> Ceph scales first and foremost by adding more storage nodes and OSDs.
>
> SSD-based cache tiers (quite a different beast to journals) can help,
> but that's highly dependent on your usage patterns as well as correct
> sizing and configuration of the cache pool.
>
> For example, one of your 4TB files above could potentially wreak havoc
> with a cache pool of similar size.
>
>> OSD nodes have between 12 and 48 8TB drives.
>
> Are we talking about existing HW or what you're planning?
> 12 OSDs per node are a good start and what I aim for usually; 24 are
> feasible if you have some idea what you're doing.
> More than 24 OSDs per node requires quite a bit of insight and
> significant investment in CPU and RAM. Tons of threads about this here.
>
> Read the current thread "Dramatic performance drop at certain number of
> objects in pool" for example.
>
> Also, avoid large variations in your storage nodes if at all possible,
> especially in your OSD sizes.
>
> Christian
>
>> If using CephFS, the hierarchy would include alphabet letters at the
>> root and then a user's directory in the appropriate subfolder.
>> With native calls, I'm not quite sure how to retrieve file A from
>> user A and not user B. Note that the software which processes user
>> data is written in Java and deployed on multiple client-facing
>> servers, so rados integration should be easy.
>>
>> Kind regards, Moïn Danai.
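On the naming question in the quoted mail: RADOS itself has no directories,
so one straightforward approach is to encode the user and the virtual path
into the object name and treat that purely as a prefix convention. A rough
sketch with the rados CLI (pool, user and file names are made up; the same
scheme applies to librados calls from the Java application):

  # store user A's file under an object name that encodes user + "path"
  rados -p userdata put "userA/documents/report.pdf" ./report.pdf

  # read it back; user B's data simply lives under a different prefix
  rados -p userdata get "userA/documents/report.pdf" /tmp/report.pdf

  # emulate a directory listing by filtering on the prefix (client-side)
  rados -p userdata ls | grep '^userA/documents/'

If per-user object counts grow large, that client-side listing becomes the
painful part, which is one more argument for the RGW/S3 route Christian
mentioned.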