Hello,

Arno Lehmann wrote:
> Hello,
>
> 16.07.2007 13:46, Kern Sibbald wrote:
>> On Monday 16 July 2007 13:17, Arno Lehmann wrote:
>>> Hello,
>>>
> ...
>>>> Yes, either a kernel problem or a hardware problem seems the most likely. We
>>>> cannot exclude a Bacula bug, but the finger is pointing to the CPU/hardware.
>>> Well, this is problematic... Alfredo gave good reasons to assume that
>>> it's not purely hardware/OS related. Basically, the problem occurs
>>> when he runs certain jobs.
>> I didn't see that, but then I am no longer receiving any email from the
>> bacula-users list.
>
> Yes. I know, but it's hard moderating a discussion across two separate
> mailing lists :-)
>
>>> I guess that the interworking of DIR, SD, catalog database, and OS
>>> might trigger some sort of resource exhaustion, but debugging this is
>>> beyond my abilities :-)
>> Or as I mentioned, it could be that Bacula is self destructing ...
>>
>>>> I recommend shutting down your machine, rebooting it, running memtest, and if
>>>> all is OK, restarting Bacula and see what happens.
>>> Fortunately, that's not my machine :-)
>>>
>>> Unfortunately, my backup server is dying, but I know and understand
>>> that problem :-(
>> If you and he *really* think it is a Bacula bug, I'd *strongly* recommend that
>> he upgrade to the latest 2.1.26 beta version.

I think it is a Bacula bug. In my first mail I thought the problem appeared once bacula-dir had been running for two weeks, but last Wednesday my server rebooted because of the UPS and bacula-dir was restarted, and on Sunday it was blocked again. My job, file and volume retention periods are all set to 14 days (coincidentally, two weeks). From my (limited) experience, if there is a bug in version 2.0.3, it is possible that the bacula-dir 2.1.23 source code still contains it, so it may be more useful to look for the bug now and, if it is there, fix it before the developers publish the final 2.1.x version.
But this is only my opinion; I don't know anything about the Bacula source code or what changed between 2.0 and 2.1.x. If it is a hardware or CPU problem on my side, why does everything work fine for two weeks? Another thing: I'm not sure, but before I ran into this problem I had set the job and file retention periods to different values (file retention shorter than job retention) and had no problems. Then my customer asked me to set job and file retention to the same value; I did so, and the problems started.

If I reboot the server (or simply restart bacula-dir), we'll lose the chance to run tests on the system for two weeks. So it would be better if someone could tell me which tests I should run. I'm interested in helping the developers resolve this problem, because I like this project very much and want to keep using it for my customers. However, I am reluctant to install a beta version of bacula-dir on a 16-server backup system, because it could introduce new problems... but if it's the only solution...
> Ok... Alfredo, I suggest you try this. Upgrade and see what happens.
> If the problem occurs again after two weeks, use the (then-generated)
> trace file and strace output to file a bug report.
>
> Also, I recommend to closely watch the mailing list for possible
> updates of the code... it would not be uncommon that some interesting
> bugs are found shortly after the final beta or the release version is
> published.
>
>> IMO (aside from the Win32
>> testing problem -- the old FD daemons do not need to be upgraded) it is ready
>> for production use, and I've knocked off 3 or 4 memory overrun problems --
>> particularly one in PostgreSQL. So before declaring it a bug, it is
>> important to reproduce it on 2.1.26 or later.
>
> Good... I send this to the users list, too, so Alfredo can notice it :-)
>
> Arno
>
> PS: And in case the select() fails again like this, make sure you
> subscribe to the developers list, too - this bug might be one needing
> some communication between the developers and the one with the
> reproducible problem.

--
Alfredo Marchini
IT Consultant
P.IVA: 05649240487
CF: MRCLRD81R07D612B
Via Imbriani, 66
50019 Sesto Fiorentino (FI)
Tel. +39 393 9566375
E-Mail: [EMAIL PROTECTED]

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users