On Sat, 24 Oct 2009, Junhao wrote:

> On 10/22/2009 01:23 AM, da...@lang.hm wrote:
>> On Thu, 22 Oct 2009, Junhao wrote:
>>
>>> Hi!
>>>
>>> At my workplace, I am in charge of data storage for my research group.
>>> These files are placed in a *NIX file server, and users authentication
>>> is through my corporate AD. Files are owned by individual users; other
>>> users from the same group can only read the files. As primary research
>>> data files, we basically expect these to be available forever.
>>>
>>> This system has worked well till several of my colleagues left. Their
>>> user accounts were promptly deleted from the corporate AD, creating a
>>> situation where their files are owned by invalid/unknown users.
>>>
>>> My workplace does not have a policy to handle this situation, so I am
>>> wondering how everyone handles this age-old problem. Any advice?
>>
>> I see this as your real problem, the issue of the files and their
>> ownership is a symptom of the problem.
>>
>> I would lock the user for some period of time, then archive the
>> files/e-mail/etc for some period of time, then delete them.
>>
>> time periods need to be decided by someone who can take the blame if
>> they are too short and you delete something the company needs, or if
>> they are too long and leave stuff around to complicate e-discovery
>> requests.
>>
>> David Lang
>
> The catch is that I can't delete these files. As primary/raw research
> data, the time periods to publication of research papers are measured in
> years. Even after publication, we are expected to keep these data for
> validation by third-parties or even release into the public domain.

if the files can be identified as something to release, then you change 
their ownership from the individual user to a system account (say the 
webserver user, as they are probably going to be made available through 
the web)

if you are keeping them around due to publication, then you probably want 
to keep them owned by the real user and just have that account continue to 
exist, but locked, until you no longer need the files. the time periods 
that I mention above can be years if needed.
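
Before choosing between the two options above, it helps to know which
files are already orphaned. What follows is only a minimal sketch, not
something from this thread: it assumes a Unix file server where a deleted
AD account no longer resolves through the system's name service, and the
names find_orphaned_files, uid_is_known and the /srv/research path are
hypothetical.

```python
import os
import pwd
from typing import Callable, Iterator, Tuple


def uid_is_known(uid: int) -> bool:
    """Return True if the uid still resolves to an account on this host."""
    try:
        pwd.getpwuid(uid)
        return True
    except KeyError:
        # No passwd/NSS entry: the account was deleted or never existed here.
        return False


def find_orphaned_files(
    root: str, is_known: Callable[[int], bool] = uid_is_known
) -> Iterator[Tuple[str, int]]:
    """Yield (path, uid) for every entry under root whose owner is unknown."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            if not is_known(st.st_uid):
                yield path, st.st_uid


if __name__ == "__main__":
    # Hypothetical data root; replace with the real export path.
    for path, uid in find_orphaned_files("/srv/research"):
        print(f"{uid}\t{path}")
```

The resulting list can then be handed to whoever decides whether each
file goes to a system account or stays with a locked user account.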

> It is really madness (to me, at least). And we are starting to face
> problems with long term data storage. But I digress...

sorry, no sympathy from me here. when you can buy a system off-the-shelf 
(siliconmechanics.com) with 24x 2TB drives for ~$12K, and other folks are 
building systems that hold 48x 2TB drives in them cheap (I seem to 
remember seeing some hosting company that is doing ~$150K per petabyte of 
storage with this approach), the cost of archival storage, even redundant 
across multiple machines in addition to raid within each machine, should 
not be a significant factor.

now, high performance storage can be FAR more expensive, I'm not talking 
about that here. I am talking about data that you need to have accessible, 
but don't really expect many people to access.
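
To put the figures above on one scale, here is a rough back-of-the-envelope
sketch. It uses only the prices quoted in this message (2009 numbers),
counts raw capacity with no allowance for raid overhead, and assumes
1 petabyte = 1000 TB. The function name cost_per_tb and the copies
parameter are illustrative, not anything from a vendor.

```python
def cost_per_tb(
    system_price_usd: float,
    drive_count: int,
    drive_size_tb: float,
    copies: int = 1,
) -> float:
    """Dollars per raw TB of data when it is kept on `copies` identical systems."""
    raw_tb = drive_count * drive_size_tb
    # Each extra copy means buying another whole system for the same data.
    return system_price_usd * copies / raw_tb


if __name__ == "__main__":
    # 24x 2TB off-the-shelf box at ~$12K, single copy.
    print(cost_per_tb(12_000, 24, 2))
    # Same box, data mirrored across two machines.
    print(cost_per_tb(12_000, 24, 2, copies=2))
    # ~$150K per petabyte, assuming 1 PB = 1000 TB.
    print(150_000 / 1000)
```

That works out to about $250 per raw TB for the single box, about $500
per TB when mirrored across two machines, and about $150 per TB for the
petabyte-scale figure.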

David Lang
_______________________________________________
Discuss mailing list
Discuss@lopsa.org
http://lopsa.org/cgi-bin/mailman/listinfo/discuss
This list provided by the League of Professional System Administrators
 http://lopsa.org/
