On 2005-12-14 22:23, Kern Sibbald wrote:
Hi Kern.

On Wednesday 14 December 2005 22:13, Jonas Mixter wrote:
On 2005-12-14 17:55, Attila Fülöp wrote:
Jonas Mixter wrote:
On 2005-12-14 10:29, Jonas Mixter wrote:

Hi!

Yesterday I upgraded my director, fd and sd from bacula 1.36.2 (which comes with Debian sarge) to 1.38.2. I have one machine running the director (with a connected tape-station) and one machine with a lot of disks and a storage daemon (this should later on be placed elsewhere, hence the separation of the daemons). The director is named merry-dir and the storage daemon is named pippin-sd. Before the upgrade I could back up, for example, the catalog to a file storage on the machine with just the sd. I could also restore files from the backup.

After the upgrade, I could back up just as before. The job exits OK and I could see that the files that hold the backup are growing. But I cannot restore... When running a restore I could mark the files I want and the job is visible underneath "Running Jobs" in bconsole. The job is "waiting for Client merry-fd to connect to Storage File" and after a while the job times out.

Here's the output when trying to restore a file to merry-fd (running on the same server as the director):

14-Dec 09:03 merry-dir: Start Restore Job RestoreFiles.2005-12-14_09.03.22
14-Dec 09:13 merry-fd: RestoreFiles.2005-12-14_09.03.22 Fatal error: Authorization key rejected by Storage daemon. Please see http://www.bacula.org/rel-manual/faq.html#AuthorizationErrors for help.
14-Dec 09:13 merry-fd: RestoreFiles.2005-12-14_09.03.22 Fatal error: Failed to authenticate Storage daemon.
14-Dec 09:13 merry-dir: RestoreFiles.2005-12-14_09.03.22 Fatal error: Socket error on Storage command: ERR=No data available
14-Dec 09:13 merry-dir: RestoreFiles.2005-12-14_09.03.22 Error: Bacula 1.38.2 (20Nov05): 14-Dec-2005 09:13:30
  JobId:                  512
  Job:                    RestoreFiles.2005-12-14_09.03.22
  Client:                 merry-fd
  Start time:             14-Dec-2005 09:03:25
  End time:               14-Dec-2005 09:13:30
  Files Expected:         1
  Files Restored:         0
  Bytes Restored:         0
  Rate:                   0.0 KB/s
  FD Errors:              0
  FD termination status:
  SD termination status:  Waiting on FD
  Termination:            *** Restore Error ***

14-Dec 09:13 merry-dir: RestoreFiles.2005-12-14_09.03.22 Error: Bacula 1.38.2 (20Nov05): 14-Dec-2005 09:13:30
  JobId:                  512
  Job:                    RestoreFiles.2005-12-14_09.03.22
  Client:                 merry-fd
  Start time:             14-Dec-2005 09:03:25
  End time:               14-Dec-2005 09:13:30
  Files Expected:         1
  Files Restored:         0
  Bytes Restored:         0
  Rate:                   0.0 KB/s
  FD Errors:              1
  FD termination status:
  SD termination status:  Waiting on FD
  Termination:            *** Restore Error ***

Note the job fails twice... Is that really correct?

I could back up and restore without any problems to a local file on the server running the director (tested only once though). I could check the status of the storage daemon from the director, and also do backups, so the passwords should be OK, right?

I've also sometimes seen this error:

14-Dec 09:43 merry-sd: Job RestoreFiles.2005-12-14_09.43.50 waiting to reserve a device.

but I don't really know what it means. In a previous post I saw that Kern asked about the /lib/tls directory when presented with this error. I do have /lib/tls, but I start all my daemons with the start script that bacula provides. In that script the "LD_ASSUME_KERNEL=2.4.19" variable is exported. (If running kernel 2.4! Is /lib/tls a problem with kernel 2.6 too?) I use kernel 2.4.27 on the server with the director, and 2.6.8 on the sd server. /lib/tls is there on both servers, which are running Debian sarge.

Should "merry-sd" really be involved?
I've selected the storage on pippin-sd when running "restore" in bconsole. Could it be that the director/fd is trying to restore from the wrong sd? Shouldn't the job then fail with a "volume not found" error, ask me to mount a volume, or similar?

If I try to cancel a job, the director from time to time complains that it cannot find the job even though it's listed under running jobs.

*cancel
Automatically selected Job: JobId=515 Job=RestoreFiles.2005-12-14_10.18.20
Confirm cancel (yes/no): yes
3902 Job RestoreFiles.2005-12-14_10.18.20 not found.

The job is marked "has been canceled" in bconsole anyway. I don't remember if I got errors like this before the upgrade.

What could be wrong in my setup? I've been banging my head against the wall for quite a few hours now and I'm out of ideas.

Best regards, Jonas Mixter

Hi!

I've been doing some more tests today. There seem to be no problems at all restoring from backups stored on tape. If I copy the entire backup file from the server running only the sd to the server running bacula-dir, I could restore files without any problems.

Does anyone have any suggestions on where to continue troubleshooting?

Best regards, Jonas Mixter

Authorization key rejected by Storage daemon. Please see http://www.bacula.org/rel-manual/faq.html#AuthorizationErrors for help.
14-Dec 09:13 merry-fd: RestoreFiles.2005-12-14_09.03.22 Fatal error: Failed to authenticate Storage daemon.

Either you have a password mismatch or some firewall in between.

Hi!

Thank you for your answer. I'm afraid this is not the problem though, even if it seems obvious. There is no firewall between (or on) the machines. They are connected to the very same network switch and I have no iptables or similar activated. The passwords seem to match when I review the config files and I have no problem using "status storage" or running backups _to_ this storage daemon. I shouldn't be able to do that if the passwords weren't matching, right?
I only get this problem when trying to restore _from_ the storage daemon. It also seems more like a timeout (10 minutes between the start of the job and the error) than an actual authorization error to me. Am I wrong?

Most likely the error message is correct, but for a slightly more subtle reason -- I suspect that either your Director and the Client don't resolve the SD address to the same IP or, more likely, the SD has crashed.

Thank you for taking the time to review my problem.

I have an IP address specified in my bacula-dir.conf, so I don't think there should be any resolve problem. The server running the storage daemon has no entry in the DNS system at the moment. Could that be a problem even though I specify an IP as the address for the server?

The storage daemon must be up and running, since I could do a second backup right after the failed restore. (Just tested to be sure.)

Could I start any of the daemons in some kind of debug mode? Would that do me any good?

I've included the parts of my configuration files that I think are relevant below. Could anyone on the list spot an error? (The commented-out lines about different ports are for a future stunnel, but at the moment I use the standard bacula ports for the sd.)
Best regards, Jonas Mixter

From the storage daemon:

Storage {                             # definition of myself
  Name = pippin-sd
  #SDPort = 59103                     # Director's port
  SDPort = 9103                       # Director's port
  WorkingDirectory = "/var/bacula/working"
  Pid Directory = "/var/run"
  Maximum Concurrent Jobs = 20
}

Director {
  Name = merry-dir
  Password = "mypassword"
}

Device {
  Name = FileStorage
  Media Type = File
  Archive Device = /backup
  LabelMedia = yes;                   # lets Bacula label unlabeled media
  Random Access = Yes;
  AutomaticMount = yes;               # when device opened, read it
  RemovableMedia = no;
  AlwaysOpen = no;
}

From the director's config file:

# Definition of file storage device
Storage {
  Name = File
  Address = my-ip-for-the-storage-daemon-server
  #Address = merry.jamtport.se        # There is an stunnel here that encrypts the traffic and sends it to pippin
  SDPort = 9103
  #SDPort = 9104
  Password = "mypassword, same as above"   # password for Storage daemon
  Device = "FileStorage"                   # must be same as Device in Storage daemon
  Media Type = "File"                      # must be same as MediaType in Storage daemon
}
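One way to test the resolve/reachability suspicion from the client side is sketched below. It rests on the assumption that, during a restore, the file daemon opens its own connection to the storage daemon using the Address and SDPort from the director's Storage resource, so the check has to be run on the machine where merry-fd lives. The function names resolve_ipv4 and can_connect are made up for this illustration and are not part of Bacula; the script only tests name resolution and a plain TCP connect, it does not speak the Bacula protocol or check any passwords.

```python
import socket


def resolve_ipv4(host):
    """Return the sorted IPv4 addresses that `host` resolves to on this machine.

    Running this on both the director and the client machine shows whether
    they agree on the storage daemon's address.
    """
    infos = socket.getaddrinfo(host, None, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For example, calling can_connect("my-ip-for-the-storage-daemon-server", 9103) on the client machine (with the placeholder replaced by the real address from the Storage resource) should return True if the storage daemon's port is reachable from there. A False result would point towards a network or address problem rather than a password mismatch.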
- Re: [Bacula-users] Backup to file OK, but restore fails Jonas Mixter