There are likely simpler answers if you want to tier entire buckets, but it sounds like you are hosting one or more filesystems on NetApp and want to tier them. It would be nice to have NetApp running Ceph as a block store, but I don't think CRUSH is sophisticated enough to migrate components of a filesystem pool based on the ages of the files/directories in them. For one thing, I'm not sure the PGs in a pool can or should be aware of such details, and you could run into problems with fragments of a file landing in different PGs, where much of each PG holds data that has not aged out. So I'm not optimistic about that concept.

What that suggests to me is that you might use an overlay filesystem, where the different tiers overlay each other to present a unified filesystem image. This is precisely what containers do, although much of their goal is simply optimising shared image layers. A variation of this is Copy-on-Write (COW), but what you want is more like the reverse.
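To make the overlay idea concrete, here is a minimal Python sketch of how a unified view could resolve a path across tiers. It is only an illustration of the lookup order, not how any particular overlay filesystem is implemented; the function name `resolve` and the tier directories are hypothetical.

```python
from pathlib import Path


def resolve(rel_path, tiers):
    """Return the first existing copy of rel_path, or None.

    tiers is an ordered list of root directories, highest priority first
    (for example the fast tier, then the archive tier).
    """
    for tier in tiers:
        candidate = Path(tier) / rel_path
        if candidate.exists():
            return candidate
    return None
```

A call such as `resolve("projects/report.dat", ["/mnt/fast", "/mnt/archive"])` would return the fast-tier copy if there is one and fall through to the archive copy otherwise (both mount points are made up for the example).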

At any rate, a frontend overlay filesystem with NetApp overlaying a secondary Ceph system seems like a likely solution. Then all you'd need would be a mechanism to move aged-out resources. That might even be a good use of rsync.
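As a rough sketch of what such a mover could look like, the Python below walks the fast tier, picks regular files whose last access time is older than a cutoff, and copies them to the same relative path on the slow tier before removing the original. The names `find_aged_files` and `migrate_aged_files` and the directory layout are assumptions for illustration. It also assumes access times are actually being recorded on the source, and that whatever overlay sits on top tolerates files moving underneath it, which would need checking for the specific implementation. It leaves no stub behind.

```python
import os
import shutil
import stat
import time
from pathlib import Path


def find_aged_files(root, max_age_days, now=None):
    """Yield regular files under root not accessed within max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = Path(dirpath) / name
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode) and st.st_atime < cutoff:
                yield path


def migrate_aged_files(hot_root, cold_root, max_age_days, now=None, dry_run=True):
    """Move aged files from hot_root to cold_root, keeping relative paths.

    Returns the relative paths selected. With dry_run=True nothing is changed.
    """
    selected = []
    for src in find_aged_files(hot_root, max_age_days, now):
        rel = src.relative_to(hot_root)
        dst = Path(cold_root) / rel
        if not dry_run:
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)  # copies data plus timestamps
            src.unlink()            # remove the original only after the copy
        selected.append(rel)
    return selected
```

Running it first with the default `dry_run=True` gives a list of what would move without touching anything. At petabyte scale you would want verification of each copy before the unlink and some form of checkpointing, which this sketch does not attempt.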

   Tim

On 5/4/25 10:20, sacawulu wrote:
Hi all,

We're exploring solutions to offload large volumes of data (on the order of petabytes) from our NetApp all-flash storage to our more cost-effective, HDD-based Ceph storage cluster, based on criteria such as: last access time older than X years.

Ideally, we would like to leave behind a 'stub' or placeholder file on the NetApp side to preserve the original directory structure and potentially enable some sort of transparent access or recall if needed. This kind of setup is commonly supported by solutions like DataCore/FileFly, but as far as we can tell, FileFly doesn’t support Ceph as a backend and instead favors its own Swarm object store.

Has anyone here implemented a similar tiering/archive/migration solution involving NetApp and Ceph?

We’re specifically looking for:

*    Enterprise-grade tooling

*    Stub file support or similar metadata-preserving offload

*    Support and reliability (given the scale, we can’t afford data loss or inconsistency)

*    Either commercial or well-supported open source solutions

Any do’s/don’ts, war stories, or product recommendations would be greatly appreciated. We’re open to paying for software or services if it brings us the reliability and integration we need.

Thanks in advance!

MJ
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
