Hello all --
As the world continues to ramp up its use of virtual machines, storage and
backup of virtual machine files is becoming quite an interesting problem.
The major virtual machine systems, such as those by VMware (e.g., VMware
Fusion, which runs on Mac OS X and is, if I'm not mistaken, similar to
VMware Workstation), offer useful features such as snapshots and rollbacks.
One consequence of keeping many virtual machine snapshots around on a file
system is that the virtual machine *image* files on the host OS's filesystem
can easily become quite large, relatively speaking (it would be easy, for
example, to have multiple virtual machines whose files on the host OS's
filesystem run well into multiple gigabytes). I have also noticed that merely
booting up a virtual machine changes its (relatively large) *image* file,
even if the actual changes within the virtual machine were scant.
Given this context and Bacula, from a file system standpoint, backing up
differentials or incrementals of these large image files on a regular basis
could easily become problematic. The issue is perhaps not so much Bacula
Volumes (whether tape, optical disc, hard drive, etc.), since one might argue
that storage is cheap and Kryder's Law [1] marches on, but rather network
bandwidth (where distributed backups are leveraged, which is one of Bacula's
greatest strengths): moving gigabyte-scale files can be a problem. Even
Amazon, which sells its S3 storage service, has recently offered a beta of
its new AWS Import/Export service ("ship us that disk!"):
http://aws.amazon.com/importexport/
http://aws.typepad.com/aws/2009/05/send-us-that-data.html
*AWS Import/Export: Ship Us That Disk!*
>
> Since station wagons and tapes are both on the verge of obsolescence,
> others have updated this nugget of wisdom to reference DVDs and Boeing 747s.
> Hard drives are getting bigger more rapidly than internet connections are
> getting faster. It is now relatively easy to create a collection of data so
> large that it cannot be uploaded to offsite storage (e.g. Amazon S3) in a
> reasonable amount of time. Media files, corporate backups, data collected
> from scientific experiments, and potential AWS Public Data Sets are now at
> this point. Our customers in the scientific space routinely create terabyte
> data sets from individual experiments.
>
This brings me to a question: what about a future version of Bacula that
could perform block-level differential and incremental backups? That way,
if, say, a 4 GB file (representing a virtual machine, for example) had only
a small number of disk-level blocks that changed, only those blocks would
need to be backed up relative to the initial Full backup. A rough sketch of
the idea follows.
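
To make this concrete, here is a minimal Python sketch of one way block-level
change detection could work. This is only an illustration of the general
technique, not how Bacula works today; the 64 KB block size, the SHA-1
digests, and the command-line usage are all my own assumptions:

import hashlib
import sys

BLOCK_SIZE = 64 * 1024  # assumed block size; a real implementation would tune this

def block_digests(path):
    """Hash a file in fixed-size blocks; one SHA-1 digest per block."""
    digests = []
    with open(path, "rb") as f:
        while True:
            block = f.read(BLOCK_SIZE)
            if not block:
                break
            digests.append(hashlib.sha1(block).hexdigest())
    return digests

def changed_blocks(old, new):
    """Indices of blocks that differ from the digests taken at Full backup time."""
    changed = [i for i in range(min(len(old), len(new))) if old[i] != new[i]]
    changed.extend(range(len(old), len(new)))  # blocks appended since the Full
    return changed

if __name__ == "__main__":
    # Usage: compare two copies of an image, e.g. before and after booting a VM.
    old, new = block_digests(sys.argv[1]), block_digests(sys.argv[2])
    diff = changed_blocks(old, new)
    print("%d of %d blocks changed" % (len(diff), len(new)))

The same comparison could presumably be done against a digest manifest saved
at Full backup time, so that only the changed blocks (plus the small manifest)
would ever need to cross the network on an Incremental run.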
I imagine one response might be to just install Bacula in every virtual
machine ever created, but that's not practical. Seeing that Amazon is trying
to solve the problem of backups and bandwidth, it strikes me that Bacula
could help scratch this itch as well.
Cheers,
-hydro
[1] http://en.wikipedia.org/wiki/Mark_Kryder