On 27/05/2020 15:13, Gary R. Schmidt wrote:
> On 27/05/2020 23:17, Alan Brown wrote:
>>
>> Bacula DOES NOT LIKE and does not handle network interruptions _at all_
>> if backups are in progress. This _will_ cause backups to abort - and
>> these aborted backups are _not_ resumable
>>
>> Similarly, if there's any kind of disruption between the director and
>> database, the only fix is to restart the director
>>
>> Opinion: I know bugs aren't sexy to work on but these need fixing, not
>> being brushed off. This is the difference between LAN-quality and actual
>> Enterprise grade software.
>>
> I do not consider these to be bugs - they aren't simple errors where
> someone made a mistake or used the wrong sized variable - they require
> a large amount of re-design and reimplementation of Bacula's
> communication modules, and the scheduler, and no doubt other bits to
> go away.

Nonetheless they need to be done. There are a lot of assumptions made
about networks that simply do not hold true, or that only work at
SOHO/SMB scale.

> Bacula started life twenty years ago, and the environment has changed
> since then, and, while Bacula has kept up with some things, disk as
> a target rather than tape, frex, something like re-startable jobs is,
> as I have said, not just an extension or addition to what is there,
> but a big change to a large part of Bacula.

Restarting is there for stopped jobs already. The question is how much
work is needed to extend that to aborted or errored jobs.

> And, from the commercial stand-point, that the changes could be made
> without interrupting the existing income stream.

There's a "cost of not implementing". I'm facing pressure to replace
Bacula and this is pointed to as one of the reasons - bear in mind we're
a paying customer who would go away if this isn't sorted.

> Then there's the projected time-line before it could be released?

You can't project that if it's not even on your TODO list, and right now
it keeps being swept into the "WON'T DO" basket.

> I don't want to think about that, Bacula is fragile as it is, ripping
> it apart and stitching it back together would be a massive task!

This is exactly why I _do_ want to think about it. This is _where_ it's
fragile and what most fundamentally needs fixing.

Enterprise software needs to be robust.
Bacula is not - in extremely critical areas.

"If carpenters built buildings the way programmers write programs, the
first woodpecker that came along would destroy civilization."

> And Bacula does not have that capability, not in the OSS space nor in
> the Enterprise space.
>
> All the above said, I think that re-startable jobs would be a great
> enhancement for Bacula, but how often and for how long does it try by
> default before giving up? :->
>

Restartable, or reconnecting? (And why not just set defaults - then let
the users decide on #attempts/timeouts?)

The single most fragile part of Bacula: if the database connection
glitches for _any_ reason, the only solution is to restart the entire
program - and you lose _everything_ that was underway at the time.

As I said, that includes using a high availability database (PostgreSQL,
etc). As soon as heads are switched there's a necessary glitch in the
connection.

Database connections are _supposed_ to be stateless. Bacula breaks that
and as such it's a fundamental bug, whether by design or not.

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users