-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello,

On 06/08/2014 12:09 PM, Steven Haigh wrote:
> I do believe this is one of the biggest shortcomings of Bacula... The
> fact it is job based vs file based removes a lot of flexibility.

This difference from TSM is in fact one of the great accidental
inventions of Bacula. In one extreme example, there is a large European
site running 12 instances of Tivoli. These 12 instances are needed
because Tivoli requires so many resources to do the backup (a *very*
big data volume, but not a huge number of clients). Bacula can do the
same thing with 1 instance of the Bacula Director, because Bacula is
*far* more efficient in dealing with individual files than TSM is. As a
result, I don't see this design difference as a shortcoming but rather
as a major advantage.

>
> If I understand things properly, for a VirtualFull will:
> 1) Require all volumes as stated below, and;
> 2) Require enough space to write the entire backup out again; and
> 3) is unable to keep a copy of a file forever if it is never changed.

For point 1), note that it requires only the Volumes on which some file
is uniquely stored. That is, suppose Incremental JobId 10 is one of the
Incremental jobs: if every file that JobId 10 wrote has been modified
since then, that JobId will not be re-read, and thus, provided there
are no other jobs on it, its Volume will probably not be needed.
Another way of stating it is that Bacula only needs the Volumes
associated with the last time any given file was changed.

>
>
> Instead, after the purge date, the file is deleted and retransferred -
> unless it is done by a VirtualFull - which still has the problems of #1
> and #2 above.

I guess I could consider #1 a problem. However, #2 is certainly not a
disadvantage. It is *exactly* what is desired, because we want to be
able to free up the old fragmented backup storage and thus the old
Volumes.

>
>
> As such, I'm not sure that I can easily achieve my goals with Bacula.
> I'm still not exactly sure as to what my other alternatives are as yet.
>
> I currently have an rsync going between hosts and create a copy of
> backups with hard links to minimise space used but still get a
> consistent view of each host (rotating daily). Maybe coupling this with
> a filesystem that supports compression would assist in making more space
> available for backups...
>
> As I've been used to TSM for so long (many years now!), I got used to
> how it works - and I'm having trouble moving on! :)

In general, Bacula can do everything that TSM can (the overall
architecture is very similar). However, as with every other backup
product, each has its own way of accomplishing certain tasks. To
succeed, you have to learn the Bacula way of doing things and not try
to force it to fit a preconceived way of doing things.

>
>
>
> On 08/06/14 19:32, Kern Sibbald wrote:
>>
>> Hello,
>>
>> To do a VirtualFull you do need to have all backups since the last Full
>> or VirtualFull available.
>>
>> I recommend against production use of SQLite, unless you have less than
>> 10 machines.
>>
>> Normally there is no reason why an instance of MySQL/PostgreSQL cannot
>> be put in a VM that is running Bacula -- I do that all the time.
>>
>> Best regards,
>> Kern
>>
>> On 06/08/2014 11:04 AM, Steven Haigh wrote:
>>> Hi all,
>>
>>> The one thing I can see tripping me up is that from what I understand,
>>> for a VirtualFull I will need access to ALL jobs since the last
>>> VirtualFull. In the case of a removable eSATA drive that won't be online
>>> all the time, I can't guarantee that access will be available to that
>>> drive.
>>
>>> At this stage, I have been using SQLite - simply to keep the entire
>>> system contained. I do have a MySQL server available - but the idea is
>>> to keep the backup system contained to a single VM.
>>
>>> The ponderings of which direction to go is difficult :)
>>
>>> On 08/06/14 18:31, Kern Sibbald wrote:
>>>>
>>>> Hello,
>>>>
>>>> I cannot help you with your overall design because I am more effective
>>>> writing new code than helping with implementations (very important).
>>>> However a couple of points, which are my personal opinions:
>>>>
>>>> 1. Your setup is medium size and would work fine with MySQL, but if you
>>>> can accept a short term learning curve and would like long term peace of
>>>> mind, I would use PostgreSQL to avoid performance problems later. It is
>>>> harder to set up correctly and tune in the beginning (performance is
>>>> pretty bad with the out of the box PostgreSQL configuration), but
>>>> long term it performs in big installations *much* better than MySQL.
>>>>
>>>> 2. You need to carefully set up incrementals forever, but Bacula has
>>>> supported that feature from the beginning and if you take the time to
>>>> understand and use Virtual Full jobs and accurate backups (at least once
>>>> a week), Incrementals forever can be much more efficient compared to
>>>> normal backups. If you don't use accurate mode (at least occasionally)
>>>> and VirtualFulls, stay away from incrementals forever.
>>>>
>>>> 3. I also recommend using the Bacula "virtual" autochanger for disk
>>>> based systems. It is very robust and simple, but there is not a lot of
>>>> documentation on it.
>>>>
>>>> Best regards,
>>>> Kern
>>>>
>>>> On 06/08/2014 05:00 AM, Steven Haigh wrote:
>>>>> Hi guys,
>>>>
>>>>> So I'm starting from scratch again with my bacula config. I thought I'd
>>>>> try to get some pointers before I dive in head first again.
>>>>
>>>>> My setup consists of multiple virtual machines. Some over GigE, some
>>>>> over an ADSL connection (6000/800kbit). My aim is to transfer as little
>>>>> as possible over the ADSL connection - but enough to be able to restore
>>>>> if required.
>>>>
>>>>> I would like to use some local disk storage (say 40Gb), and have the
>>>>> rest go to a removable external eSATA drive. I'm thinking this could
>>>>> mainly be done via job migration when the internal storage starts to get
>>>>> full.
>>>>
>>>>> As some insight, my current setup has ~167 daily incremental backups and
>>>>> has used under 11Gb of space on the 'on disk' volumes. The amount of
>>>>> data changed per day isn't really huge.
>>>>
>>>>> Some more specific questions:
>>>>> 1) I want to try and avoid vchanger and use something that can use the
>>>>> eSATA drive properly - grow the number of volumes automatically to fill
>>>>> the entire eSATA drive. Bonus points for being able to just plug in a
>>>>> new eSATA drive and expand further.
>>>>
>>>>> 2) From my previous posts, I heard that daily incrementals forever may
>>>>> be a bad idea - the whole job based backup vs the file based backup that
>>>>> I'm used to with TSM. What would be the suggested route for backups
>>>>> being done? I obviously don't want to do Full backups over the ADSL
>>>>> connection every week / month.
>>>>
>>>>> 3) I'm starting from scratch with Bacula v7 and all systems are a
>>>>> mixture of RHEL6 and Fedora 20. Are there any gotchas I should be aware
>>>>> of straight up?
>>>>
>>>>> 4) Any general comments? :)
>>>>
>>
>>
>>
>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iEYEARECAAYFAlOVcpEACgkQNgfoSvWqwEjGlwCfemhJwKSwcLHv/N7tWTQ51mKd
BCgAoLLKq+JneBNHISOMu4EKEemAx0oT
=ensI
-----END PGP SIGNATURE-----

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
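
P.S. The rule from point 1) in the reply above (a VirtualFull only needs the jobs that hold the most recent copy of some file) can be modelled in a few lines of Python. This is only a rough sketch for illustration: the `jobs_needed` helper, the JobIds and the file lists are all made up, it ignores deleted files (which accurate mode tracks), and it is not meant to describe how Bacula implements this internally.

```python
def jobs_needed(jobs):
    """Return the sorted JobIds that hold the newest copy of at least one file.

    `jobs` is a list of (job_id, files) tuples, oldest first, where `files`
    is the set of paths that job wrote. (Hypothetical model, not Bacula code.)
    """
    latest = {}  # path -> JobId of the most recent job that wrote it
    for job_id, files in jobs:
        for path in files:
            latest[path] = job_id  # a later job supersedes an earlier one
    return sorted(set(latest.values()))


# Hypothetical chain: one Full followed by three Incrementals.
chain = [
    (9, {"/etc/hosts", "/etc/passwd", "/var/log/messages"}),  # Full
    (10, {"/var/log/messages"}),                              # Incremental
    (11, {"/var/log/messages", "/etc/passwd"}),               # Incremental
    (12, {"/var/log/messages"}),                              # Incremental
]

print(jobs_needed(chain))  # [9, 11, 12]
```

In this made-up chain, everything JobId 10 wrote was rewritten by a later job, so JobId 10 (and a Volume holding nothing else) would not have to be read, while the Full is still needed because /etc/hosts never changed.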