Yedidya Bar-david wrote:
>
> Hi
>
> Omer Efraim wrote:
> > point to the same actual data). This technology actually seeks
> > out duplicate files and saves space by storing them only
> > once - it's completely transparent (unless you count your
> > system crashing while it's looking for duplicate files as not
> > transparent - but with MS it usually is; you can never
> > know why NT crashes :), and it's actually a darn good idea.
>
> I will only consider it an innovation if it has much better features:
> e.g. copy-on-write if it made a single copy without letting the user
> know or something, doing it in real time and not every day/week/whatever
> (and I have no idea how that can be done), safely keeping the differences
> between nearly-identical files (nothing like soft links, more like CVS
> or something) without admin intervention, stuff like that.
> If all you need is a program that finds duplicate files, to run from
> crontab, a quick search in Debian's Packages file reveals (didn't I
> already say I love Debian?):
> perforate - Utilities to save disk space
> fdupes - Identifies duplicate files residing within given directories.
>
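BTW, the basic trick those tools use is simple enough to sketch: group
files by size first, then hash only the candidates that share a size.
The Python below is just a rough illustration of the idea, not what
fdupes actually does:

    import hashlib, os, sys
    from collections import defaultdict

    def file_hash(path, chunk=1 << 16):
        """MD5 of a file, read in chunks so big AVIs don't eat RAM."""
        h = hashlib.md5()
        with open(path, 'rb') as f:
            while True:
                block = f.read(chunk)
                if not block:
                    break
                h.update(block)
        return h.hexdigest()

    # Files of different sizes can't be duplicates, so bucket by size.
    by_size = defaultdict(list)
    for root, dirs, files in os.walk(sys.argv[1]):
        for name in files:
            path = os.path.join(root, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Hash only the size collisions and report identical contents.
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue
        by_hash = defaultdict(list)
        for p in paths:
            by_hash[file_hash(p)].append(p)
        for dups in by_hash.values():
            if len(dups) > 1:
                print('duplicates:', dups)
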
The Win2k thing does this in real time, without user intervention
(except for pressing the reset button when the server crashes). Forget
about the implementation; the idea in itself is good, and (AFAIK) has
not been done before (as opposed to the rest of MS's "Freedom To
Innovate" page *snort*, which mentioned something suspiciously
resembling anti-aliasing as an MS technology. I don't have much
confidence in MS's ability to produce good products - I've only seen
one or two, and one was the calculator - but just about anybody can
have a good idea). MS stuff almost never lets the user know - after
all, the user might panic if he gets an informational message. Finding
duplicate files is one thing; actually storing just one instance is
something else. If you just symlink some files to a single file, and
the owner of that file deletes it, you're screwed. What this thingy
does is take care of all of that (again, if you ever manage to find
that service among the 300,000 services that Win2k installs for you).
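For the record, plain Unix hardlinks already get you part of the way:
the data stays around until the last link to it is gone, so nobody gets
screwed by a delete. What you lose is copy-on-write - write through one
link and you've changed them all - which is presumably the part the
Win2k thing adds. A toy Python demo of the difference (made-up
filenames):

    import os, tempfile

    d = tempfile.mkdtemp()
    orig = os.path.join(d, 'baby.avi')        # the "owner's" copy
    with open(orig, 'w') as f:
        f.write('shared data')

    sym = os.path.join(d, 'baby-sym.avi')
    hard = os.path.join(d, 'baby-hard.avi')
    os.symlink(orig, sym)    # points at the *name* of the file
    os.link(orig, hard)      # points at the *data*; link count is now 2

    os.remove(orig)          # the owner deletes his copy

    print(os.path.exists(sym))   # False - dangling symlink, you're screwed
    print(open(hard).read())     # 'shared data' - lives until last link goes
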
> >
> > I'm not too sure it's very useful though; these days, when everyone
> > has their own workstation, duplicate files on the servers are not
> > very common any more (and even if they are, I'd imagine they wouldn't
> > be very big - you'd only have one ISO image of Slack).
>
> Once, when the disk of the home directories at work got full, I made
> a list of big files, sorted by size, to find dups, and found something
> like 5 copies of the dancing baby animation (I guess many saw it), an
> AVI of more than 10MB. And it was a disk for ~70 users.
> I know of a place that made a file server dedicated to all people's MP3s,
> to prevent duplicates. Disks are cheap today, but not meaningless; you
> have more GBs to back up, to restore (often when time is critical), ...
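(Making that kind of list is a few lines of Python, incidentally -
roughly, with the 10MB threshold hardcoded:

    import os, sys

    # Walk the tree given on the command line and print every file
    # over 10MB, biggest first - then eyeball the list for dups.
    big = []
    for root, dirs, files in os.walk(sys.argv[1]):
        for name in files:
            path = os.path.join(root, name)
            if os.path.isfile(path):
                size = os.path.getsize(path)
                if size > 10 * 1024 * 1024:
                    big.append((size, path))

    for size, path in sorted(big, reverse=True):
        print('%10d  %s' % (size, path))

)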
Good point. Still, I wouldn't expect HUGE savings (on their webpage they
say 80-90% - bah), and it's not really an enterprise-useful technology.
Huge companies (hopefully) won't have a single file server for tons
of users, and the big guns (and by that I mean servers that store
lots of data) will usually be DB servers (by which I mean mail
servers, DB backends etc.). Might be good for ISPs and webhosting
companies, but that would mean having to run NT - whoops, Win2k - which
is probably bad for ISPs.
--
"You will now die. Make whatever rituals are necessary for your species."
- Ur-Quan, Kohr-Ah