On Sat, 24 Oct 2009, Junhao wrote:

> On 10/22/2009 01:23 AM, da...@lang.hm wrote:
>> On Thu, 22 Oct 2009, Junhao wrote:
>>
>>> Hi!
>>>
>>> At my workplace, I am in charge of data storage for my research group.
>>> These files are placed in a *NIX file server, and users authentication
>>> is through my corporate AD. Files are owned by individual users; other
>>> users from the same group can only read the files. As primary research
>>> data files, we basically expect these to be available forever.
>>>
>>> This system has worked well till several of my colleagues left. Their
>>> user accounts were promptly deleted from the corporate AD, creating a
>>> situation where their files are owned by invalid/unknown users.
>>>
>>> My workplace does not have a policy to handle this situation, so I am
>>> wondering how everyone handles this age-old problem. Any advice?
>>
>> I see this as your real problem, the issue of the files and their
>> ownership is a symptom of the problem.
>>
>> I would lock the user for some period of time, then archive the
>> files/e-mail/etc for some period of time, then delete them.
>>
>> time periods need to be decided by someone who can take the blame if
>> they are too short and you delete something the company needs, or if
>> they are too long and leave stuff around to complicate e-discovery
>> requests.
>>
>> David Lang
>
> The catch is that I can't delete these files. As primary/raw research
> data, the time periods to publication of research papers are measured in
> years. Even after publication, we are expected to keep these data for
> validation by third-parties or even release into the public domain.

if the files can be identified as something to release, then you move
them from being owned by a user to being owned by a system account (say
the webserver user, as they are probably going to be made available
through the web).

if you are keeping them around because of publication, then you probably
want to keep them owned by the real user and just have that account
continue to exist, but locked, until you no longer need the files.

the time periods that I mention above can be years if needed.

> It is really madness (to me, at least). And we are starting to face
> problems with long term data storage. But I digress...

sorry, no sympathy from me here.

when you can buy a system off-the-shelf (siliconmechanics.com) with 24x
2TB drives for ~$12K, and other folks are building systems that hold 48x
2TB drives cheaply (I seem to remember seeing some hosting company that
is doing ~$150K per petabyte of storage with this approach), the cost of
archival storage, even redundant across multiple machines in addition to
RAID within each machine, should not be a significant factor.

now, high performance storage can be FAR more expensive, but I'm not
talking about that here. I am talking about data that you need to have
accessible, but don't really expect many people to access.

David Lang
_______________________________________________
Discuss mailing list
Discuss@lopsa.org
http://lopsa.org/cgi-bin/mailman/listinfo/discuss
This list provided by the League of Professional System Administrators
 http://lopsa.org/
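
Before deciding whether files get re-owned by a system account or kept
under a locked user, it helps to know which files are already owned by
accounts that no longer resolve. The following is only a minimal sketch
of that inventory step, not something taken from the thread. The names
find_orphaned_files and uid_is_known and the path /srv/research are
hypothetical, and it assumes the file server resolves AD accounts through
its normal name service, so that a deleted account shows up as an unknown
UID.

```python
import os
import pwd


def uid_is_known(uid):
    """Return True if the UID resolves to an account via the system name service."""
    try:
        pwd.getpwuid(uid)
        return True
    except KeyError:
        # getpwuid raises KeyError when no account entry exists for this UID.
        return False


def find_orphaned_files(root, is_known=uid_is_known):
    """Yield (path, uid) for each file or directory under root with an unknown owner."""
    cache = {}  # uid -> bool, so each UID is looked up only once
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat so that symlinks are reported by their own owner.
                uid = os.lstat(path).st_uid
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            if uid not in cache:
                cache[uid] = is_known(uid)
            if not cache[uid]:
                yield path, uid


# Hypothetical usage (read-only, changes nothing on disk):
#   for path, uid in find_orphaned_files("/srv/research"):
#       print(uid, path)
```

The sketch only reports; whether the listed files are then handed to a
system account or the original account is recreated in a locked state is
the policy decision discussed above.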