The intent behind the purge is the US administration’s assault on science. Linas doesn’t *want* to delete anything; he wants to preserve the data in the hope of a better future.
> On Apr 8, 2025, at 9:28 AM, Alex Gorbachev <a...@iss-integration.com> wrote:
>
> Hi Linas,
>
> Is the purging of this data mainly due to cost concerns? If the goal is
> purely preservation of data, the likely cheapest and least
> maintenance-intensive way of doing this is a large-scale tape archive.
> Such archives (purely based on a Google search) exist at LLNL and OU, and
> there is a TAPAS service from SpectraLogic.
>
> I would imagine questions would arise about custody of the data, legal
> implications, etc. The easiest is for the organization already hosting the
> data to just preserve it by archiving, and thereby claim a significant cost
> reduction.
>
> --
> Alex Gorbachev
>
> On Sun, Apr 6, 2025 at 11:08 PM Linas Vepstas <linasveps...@gmail.com>
> wrote:
>
>> OK what you will read below might sound insane but I am obliged to ask.
>>
>> There are 275 petabytes of NIH data at risk of being deleted. Cancer
>> research, medical data, HIPAA type stuff. Currently unclear where it's
>> located, how it's managed, who has access to what, but let's ignore
>> that for now. It's presumably splattered across data centers, cloud,
>> AWS, supercomputing labs, who knows. Everywhere.
>>
>> I'm talking to a biomed person in Australia that uses NCBI data
>> daily, she's in talks w/ Australian govt to copy and preserve the
>> datasets they use. Some multi-petabytes of stuff. I don't know.
>>
>> While bouncing around tech ideas, IPFS and Ceph came up. My experience
>> with IPFS is that it's not a serious contender for anything. My
>> experience with Ceph is that it's more-or-less A-list.
>>
>> OK. So here's the question: is it possible to (has anyone tried) set
>> up an internet-wide Ceph cluster? Ticking off the typical checkboxes
>> for "decentralized storage"? Stuff like: internet connections need to
>> be encrypted. Connections go down, come back up. Slow. Sure, national
>> labs may have multi-terabit fiber, but little itty-bitty participants
>> trying to contribute a small collection of disks to a large pool might
>> only have a gigabit connection, of which maybe 10% is "usable".
>> Barely. So, a hostile networking environment.
>>
>> Is this like, totally insane, run away now, can't do that, it won't
>> work idea, or is there some glimmer of hope?
>>
>> Am I misunderstanding something about IPFS that merits taking a second
>> look at it?
>>
>> Is there any other way of getting scalable reliable "decentralized"
>> internet-wide storage?
>>
>> I mean, yes, of course, the conventional answer is that it could be
>> copied to AWS or some national lab or two somewhere in the EU or Aus
>> or UK or wherever. That's the "obvious" answer. I'm looking for a
>> non-obvious answer, an IPFS-like thing, but one that actually works.
>> Could it work?
>>
>> -- Linas
>>
>> --
>> Patrick: Are they laughing at us?
>> Sponge Bob: No, Patrick, they are laughing next to us.
>> _______________________________________________
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io