On 2025-09-03 14:04, Martin Simmons wrote:
On Wed, 3 Sep 2025 12:55:51 -0400, Gary Dale said:
On 2025-09-03 11:07, Josh Fisher via Bacula-users wrote:
On 9/2/25 17:03, Gary Dale wrote:
On 2025-09-02 12:16, Gary Dale wrote:
On 2025-09-02 12:04, Gary Dale wrote:
When I run bacula-dir with -d 100 then try the connection to my
client using bconsole, I get the output below, which shows that the
address translation is working. The fd is running on the client
machine and the name and password match. I've listed the server as
a director authorized to contact the fd. And yes, I did restart the
fd. No, there is no firewall. Yes, I can do things like ping the
client workstation.
...
OK, I've got the server bconsole to connect, but it's not actually
doing anything AFAICT. The director shows the backup job running for
a while but nothing gets backed up. Eventually the job stops.
I can back up the server, but the workstation gives these messages
when I run it:
02-Sep 16:51 TheLibrarian-dir JobId 7: No prior or suitable Full
backup found in catalog. Doing FULL backup.
02-Sep 16:51 TheLibrarian-dir JobId 7: Start Backup JobId 7,
Job=<WorkstationBackup>.2025-09-02_16.51.38_19
02-Sep 16:51 TheLibrarian-dir JobId 7: Connected to Storage "File1"
at 127.0.0.1:9103 with TLS
02-Sep 16:51 TheLibrarian-dir JobId 7: Created new Volume="Vol-0001",
Pool="File", MediaType="File1" in catalog.
02-Sep 16:51 TheLibrarian-dir JobId 7: Using Device "FileChgr1-Dev1"
to write.
02-Sep 16:51 TheLibrarian-dir JobId 7: Connected to Client "<client
name>-fd" at <client FQDN>:9102 with TLS
02-Sep 16:51 TheLibrarian-dir: ABORTING via segfault due to ERROR in
bnet_server.c:135
Cannot bind port 9101: ERR=Address already in use.
What are the versions of the daemons? The bacula-dir and bacula-sd
daemons must be the same version, and the bacula-fd client must NOT be
newer than the server daemons.
The workstation client is installed from the Forky repository and is
15.0.3-5.
The server version is from Trixie and is 15.0.3-3.
My understanding of the numbering is that the numbers after the "-" are
build numbers - they don't include feature changes. In fact, changes in the
first (.0) and second (.3) sub-version numbers shouldn't break compatibility.
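
For anyone wanting to check the rule quoted above mechanically, here is a minimal Python sketch. It assumes Debian-style version strings without an epoch, and it assumes (as suggested above, not confirmed) that the part after the last "-" is a packaging revision that can be ignored. The function names are made up for illustration and are not part of Bacula.

```python
def upstream_version(package_version: str) -> tuple:
    """Return the upstream part of a Debian-style version as a tuple of ints.

    Assumption: no epoch prefix, and everything after the last "-" is the
    packaging revision, e.g. "15.0.3-5" has upstream version 15.0.3.
    """
    upstream = package_version.rpartition("-")[0] or package_version
    return tuple(int(part) for part in upstream.split("."))


def versions_compatible(dir_version: str, sd_version: str, fd_version: str) -> bool:
    """Apply the rule from the thread: dir and sd must match, fd must not be newer."""
    dir_v = upstream_version(dir_version)
    sd_v = upstream_version(sd_version)
    fd_v = upstream_version(fd_version)
    return dir_v == sd_v and fd_v <= dir_v


if __name__ == "__main__":
    # Server daemons from Trixie, client from Forky, as reported in this thread.
    print(versions_compatible("15.0.3-3", "15.0.3-3", "15.0.3-5"))
```

With the versions reported here, the upstream parts are identical, so under these assumptions the version rule is not what is failing.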
An interesting side note: I changed the autochanger address to the
server address last night, and it's no longer preventing me from
communicating with the clients. That is, I can use bconsole status
client to reach both the server and the workstation.
However, if I use bconsole status network, I can only get the status of
the server. If I try the workstation, bconsole never responds.
>>>>>>>>>>>>>>>>>>>>>>>>>>
*status
Status available for:
1: Director
2: Storage
3: Client
4: Scheduled
5: Network
6: All
Select daemon type for status (1-6): 5
The defined Client resources are:
1: TheLibrarian-fd
2: workstation-fd
Select Client (File daemon) resource (1-2): 1
Automatically selected Storage: File1
Connecting to Storage File1 at 127.0.0.1:9103
Connecting to Client TheLibrarian-fd at localhost:9102
Running network test between Client=TheLibrarian-fd and Storage=File1
with 52.42 MB ...
2000 OK FD wrote bytes=52428800 to SD duration=77ms write_speed=677.3 MB/s
2000 OK FD read bytes=52428800 from SD duration=93ms read_speed=566.5 MB/s
2000 OK packets=10 duration=1ms rtt=0.09ms min=0.04ms max=0.16ms
*status
Status available for:
1: Director
2: Storage
3: Client
4: Scheduled
5: Network
6: All
Select daemon type for status (1-6): 5
The defined Client resources are:
1: TheLibrarian-fd
2: workstation-fd
Select Client (File daemon) resource (1-2): 2
Automatically selected Storage: File1
Connecting to Storage File1 at 127.0.0.1:9103
Connecting to Client workstation-fd at workstation.<FQDN>:9102
<<<<<<<<<<<<<<<<<<<<<<<<<
This hangs because the client workstation-fd is trying to connect to the
storage daemon on 127.0.0.1:9103. That is exactly why you shouldn't use
localhost (or 127.0.0.1) in the config files.
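
A rough Python sketch of how one might look for such entries: it scans configuration text for Address directives that point at a loopback address. The regular expression is a simplification of the real config syntax, the function names are hypothetical, and hostnames are not resolved, so treat it as a starting point and not a complete parser.

```python
import ipaddress
import re

# Simplified pattern for lines such as:  Address = 127.0.0.1
ADDRESS_LINE = re.compile(r'^\s*Address\s*=\s*"?([^"\s;#]+)"?', re.IGNORECASE)


def is_loopback(host: str) -> bool:
    """Return True for "localhost" or a literal loopback IP address."""
    if host.lower() == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal; other hostnames are not resolved in this sketch.
        return False


def find_loopback_addresses(config_text: str) -> list:
    """Return (line number, value) pairs for Address directives using loopback."""
    hits = []
    for lineno, line in enumerate(config_text.splitlines(), start=1):
        match = ADDRESS_LINE.match(line)
        if match and is_loopback(match.group(1)):
            hits.append((lineno, match.group(1)))
    return hits
```

Any hit in a resource whose address is handed to a remote client (the Storage resource in bacula-dir.conf, for example) is worth replacing with a name or address the client can actually reach.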
Which config files? The only one that has a warning is the autochanger
section in bacula-dir.conf. Everywhere else, 127.0.0.1 is in place with
no comments about not using it. The server address only works in the
autochanger section for getting the client-fd status for both the server
and workstation.
However, the sd and network statuses still never return any
information. I have to kill the bconsole process to regain control. When
I use 127.0.0.1, I can at least connect to the sd and get the network
status for the server. This allows server backups to take place.
And I am still getting errors like:
03-Sep 12:32 TheLibrarian-dir JobId 14: shell command: run AfterJob
"/etc/bacula/scripts/delete_catalog_backup"
03-Sep 12:32 TheLibrarian-dir: Warning: Cannot bind port 9101:
ERR=Address already in use: Retrying ...
03-Sep 12:31 TheLibrarian-dir JobId 0: Error: Director's connection to
SD for this Job was lost.
03-Sep 12:33 TheLibrarian-dir: ABORTING via segfault due to ERROR in
bnet_server.c:135
Cannot bind port 9101: ERR=Address already in use.
03-Sep 12:34 TheLibrarian-dir: Warning: Cannot bind port 9101:
ERR=Address already in use: Retrying ...
03-Sep 12:35 TheLibrarian-dir: ABORTING via segfault due to ERROR in
bnet_server.c:135
The only time it should be reaching bnet_server.c:135 for port 9101 is when
starting the bacula-dir while another service is using port 9101 (e.g.
bacula-dir is already running).
Is this message in the syslog/systemd-journal?
__Martin
You're right. Somehow I had two copies of bacula-director running. I
killed them both then started a new one. That actually explains why I
was apparently able to use the server address - the bacula-director
was still using 127.0.0.1.
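
For the record, a quick way to see whether something already holds the director port before starting bacula-dir is to try binding it. This is a minimal Python sketch, not a Bacula tool; the function name is made up, and a failed bind can also have other causes (for example lingering connections on the port), so a True result only means "probably in use".

```python
import socket


def port_in_use(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if binding host:port fails, i.e. the port looks taken."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically "Address already in use", as in the director log above.
            return True
        return False


if __name__ == "__main__":
    # 9101 is the director port from the log messages in this thread.
    print("port 9101 in use:", port_in_use(9101))
```

If this reports True while you believe the director is stopped, a second bacula-dir process is the first thing to look for.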
_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users