On 12/12/22 10:47, Marco Gaiarin wrote:
I'm using 'vchanger' script defining some virtual changer.

I don't see that Dir is reporting any errors invoking vchanger, no timeouts or vchanger errors, but are there any vchanger processes still running? The vchanger processes should be very short lived. In

Are you sure that the filesystem is mounted at the mount point pointed to by /var/spool/vchanger/SDPVE2RDX/0 ? Versions of vchanger before 1.0.3 used the nohup command in udev scripts which does not work as expected when invoked by udev and can cause the udev auto-mounting to fail.

Another problem with versions before 1.0.3 is that the locking used to serializes concurrent vchanger processes had a race condition that could prevent a vchanger instance from running and cause a LOAD or UNLOAD command to fail, although that should be logged as a timeout error by bacula-dir. As a diagnostic aid, you can turn off this behavior in the vchanger config by setting bconsole="". That will prevent vchanger from invoking bconsole at all and eliminate the possibility of the race condition.

Also, you have tried using 'umount' and 'update slots' in bconsole, but did you try the 'mount' command? It is the mount command that would cause bacula-dir to choose a volume and invoke vchanger to load it.



Sometimes, and i've looked at but found nothing in logs, the SD 'stalled';
typical situation:

*status storage=SDPVE2RDX
Connecting to Storage daemon SDPVE2RDX at sdpve2.sd.lnf.it:9103

sdpve2-sd Version: 9.4.2 (04 February 2019) x86_64-pc-linux-gnu debian 10.5
Daemon started 19-Sep-22 13:01. Jobs: run=90, running=1.
  Heap: heap=139,264 smbytes=799,681 max_bytes=1,701,085 bufs=209 max_bufs=375
  Sizes: boffset_t=8 size_t=8 int32_t=4 int64_t=8 mode=0,0 newbsr=0
  Res: ndevices=3 nautochgr=1

Running Jobs:
Writing: Incremental Backup job Sdinny JobId=2263 Volume=""
     pool="SDPVE2RDXPool" device="RDXStorage0" (/var/spool/vchanger/SDPVE2RDX/0)
     spooling=0 despooling=0 despool_wait=0
     Files=0 Bytes=0 AveBytes/sec=0 LastBytes/sec=0
     FDReadSeqNo=6 in_msg=6 out_msg=4 fd=5
Writing: Full Backup job SDPVE2-VMs JobId=2274 Volume=""
     pool="SDPVE2RDXPool" device="RDXStorage0" (/var/spool/vchanger/SDPVE2RDX/0)
     spooling=0 despooling=0 despool_wait=0
     Files=0 Bytes=0 AveBytes/sec=0 LastBytes/sec=0
     FDReadSeqNo=6 in_msg=6 out_msg=5 fd=7
====

Jobs waiting to reserve a drive:
====

Terminated Jobs:
  JobId  Level    Files      Bytes   Status   Finished        Name
===================================================================
   2120  Incr      7,388    168.4 M  OK       01-Dec-22 23:01 Sdinny
   2137  Full    620,898    179.7 G  OK       02-Dec-22 22:07 Sdinny
   2147  Incr        949    93.80 M  OK       03-Dec-22 23:02 Sdinny
   2158  Full          6    46.26 G  OK       04-Dec-22 02:05 SDPVE2-VMs
   2168  Incr        542    93.40 M  OK       04-Dec-22 23:02 Sdinny
   2185  Incr      8,063    227.5 M  OK       05-Dec-22 23:02 Sdinny
   2202  Incr      8,497    257.1 M  OK       06-Dec-22 23:06 Sdinny
   2219  Incr      9,638    228.3 M  OK       07-Dec-22 23:02 Sdinny
   2236  Incr        986    93.80 M  OK       08-Dec-22 23:02 Sdinny
   2253  Full          0         0   Error    09-Dec-22 20:01 Sdinny
====

Device status:
Autochanger "RDXAutochanger" with devices:
    "RDXStorage0" (/var/spool/vchanger/SDPVE2RDX/0)
    "RDXStorage1" (/var/spool/vchanger/SDPVE2RDX/1)
    "RDXStorage2" (/var/spool/vchanger/SDPVE2RDX/2)

Device File: "RDXStorage0" (/var/spool/vchanger/SDPVE2RDX/0) is not open.
    Device is being initialized.
    Drive 0 is not loaded.
==

Device File: "RDXStorage1" (/var/spool/vchanger/SDPVE2RDX/1) is not open.
    Drive 1 is not loaded.
==

Device File: "RDXStorage2" (/var/spool/vchanger/SDPVE2RDX/2) is not open.
    Drive 2 is not loaded.
==
====

Used Volume status:
Reserved volume: SDPVE2RDX_0002_0004 on File device "RDXStorage0" 
(/var/spool/vchanger/SDPVE2RDX/0)
     Reader=0 writers=0 reserves=2 volinuse=1 worm=0
====

Attr spooling: 0 active jobs, 0 bytes; 86 total jobs, 178,259,171 max bytes.
====


eg, there are jobs stalled waiting a volume, in director:

Running Jobs:
Console connected at 12-Dec-22 14:31
  JobId  Type Level     Files     Bytes  Name              Status
======================================================================
   2263  Back Incr          0         0  Sdinny            is running
   2274  Back Full          0         0  SDPVE2-VMs        is running
====

but seems that the virtual changer is still on the 'Reserved volume:
SDPVE2RDX_0002_0004'.

If i try to 'umount', 'update slots', ... nothing changed. Volumes status
seems OK:

*list media pool=SDPVE2RDXPool
Using Catalog "BaculaLNF"
+---------+---------------------+-----------+---------+-----------------+----------+--------------+---------+------+-----------+-----------+---------+----------+---------------------+-----------+
| mediaid | volumename          | volstatus | enabled | volbytes        | 
volfiles | volretention | recycle | slot | inchanger | mediatype | voltype | 
volparts | lastwritten         | expiresin |
+---------+---------------------+-----------+---------+-----------------+----------+--------------+---------+------+-----------+-----------+---------+----------+---------------------+-----------+
|      45 | SDPVE2RDX_0000_0000 | Used      |       1 |     427,788,267 |       
 0 |    1,728,000 |       1 |    0 |         0 | RDX       |       1 |        1 
| 2022-11-28 23:02:11 |   541,022 |
|      46 | SDPVE2RDX_0000_0001 | Used      |       1 | 226,047,710,801 |       
52 |    1,728,000 |       1 |    0 |         0 | RDX       |       1 |        0 
| 2022-11-27 23:02:48 |   454,659 |
|      47 | SDPVE2RDX_0000_0002 | Used      |       1 |     618,329,415 |       
 0 |    1,728,000 |       1 |    0 |         0 | RDX       |       1 |        0 
| 2022-11-30 23:02:05 |   713,816 |
|      48 | SDPVE2RDX_0000_0003 | Used      |       1 |     168,777,663 |       
 0 |    1,728,000 |       1 |    0 |         0 | RDX       |       1 |        0 
| 2022-12-01 23:01:32 |   800,183 |
|      49 | SDPVE2RDX_0000_0004 | Error     |       1 |               1 |       
 0 |    1,728,000 |       1 |    0 |         0 | RDX       |       1 |        0 
| 2022-11-10 23:01:47 |         0 |
|      50 | SDPVE2RDX_0000_0005 | Error     |       1 |               1 |       
 0 |    1,728,000 |       1 |    0 |         0 | RDX       |       1 |        0 
|                     |         0 |
|      51 | SDPVE2RDX_0000_0006 | Error     |       1 |               1 |       
 0 |    1,728,000 |       1 |    0 |         0 | RDX       |       1 |        0 
|                     |         0 |
|      52 | SDPVE2RDX_0000_0007 | Disabled  |       1 |             234 |       
 0 |    1,728,000 |       1 |    0 |         0 | RDX       |       1 |        0 
|                     |         0 |
|      53 | SDPVE2RDX_0000_0008 | Disabled  |       1 |             234 |       
 0 |    1,728,000 |       1 |    0 |         0 | RDX       |       1 |        0 
|                     |         0 |
|      54 | SDPVE2RDX_0000_0009 | Disabled  |       1 |             234 |       
 0 |    1,728,000 |       1 |    0 |         0 | RDX       |       1 |        1 
| 2022-08-25 23:01:40 |         0 |
|      75 | SDPVE2RDX_0001_0000 | Purged    |       1 |     370,671,659 |       
 0 |    1,728,000 |       1 |    1 |         1 | RDX       |       1 |        0 
| 2022-11-24 23:02:21 |   195,432 |
|      76 | SDPVE2RDX_0001_0001 | Purged    |       1 |               1 |       
 0 |    1,728,000 |       1 |    2 |         1 | RDX       |       1 |        0 
| 2022-11-03 23:02:29 |         0 |
|      77 | SDPVE2RDX_0001_0002 | Used      |       1 |  46,981,141,220 |       
10 |    1,728,000 |       1 |    3 |         1 | RDX       |       1 |        0 
| 2022-11-21 23:02:13 |         0 |
|      78 | SDPVE2RDX_0001_0003 | Recycle   |       1 |               1 |       
 0 |    1,728,000 |       1 |    4 |         1 | RDX       |       1 |        0 
| 2022-11-18 21:58:19 |         0 |
|      79 | SDPVE2RDX_0001_0004 | Used      |       1 |   1,473,650,131 |       
 0 |    1,728,000 |       1 |    5 |         1 | RDX       |       1 |        0 
| 2022-11-23 23:02:08 |   109,019 |
|      80 | SDPVE2RDX_0001_0005 | Purged    |       1 |               1 |       
 0 |    1,728,000 |       1 |    6 |         1 | RDX       |       1 |        0 
|                     |         0 |
|      81 | SDPVE2RDX_0001_0006 | Purged    |       1 |               1 |       
 0 |    1,728,000 |       1 |    7 |         1 | RDX       |       1 |        0 
|                     |         0 |
|      82 | SDPVE2RDX_0001_0007 | Disabled  |       1 |             234 |       
 0 |    1,728,000 |       1 |    8 |         1 | RDX       |       1 |        0 
|                     |         0 |
|      83 | SDPVE2RDX_0001_0008 | Disabled  |       1 |             234 |       
 0 |    1,728,000 |       1 |    9 |         1 | RDX       |       1 |        0 
|                     |         0 |
|      84 | SDPVE2RDX_0001_0009 | Disabled  |       1 |             234 |       
 0 |    1,728,000 |       1 |   10 |         1 | RDX       |       1 |        1 
| 2022-10-12 23:01:43 |         0 |
|     166 | SDPVE2RDX_0002_0001 | Used      |       1 |  46,482,814,763 |       
10 |    1,728,000 |       1 |    0 |         0 | RDX       |       1 |        0 
| 2022-12-04 23:02:35 | 1,059,446 |
|     167 | SDPVE2RDX_0002_0002 | Used      |       1 |     227,968,691 |       
 0 |    1,728,000 |       1 |    0 |         0 | RDX       |       1 |        0 
| 2022-12-05 23:02:10 | 1,145,821 |
|     168 | SDPVE2RDX_0002_0003 | Used      |       1 |     486,470,204 |       
 0 |    1,728,000 |       1 |    0 |         0 | RDX       |       1 |        0 
| 2022-12-07 23:02:05 | 1,318,616 |
|     169 | SDPVE2RDX_0002_0004 | Used      |       1 |      93,909,936 |       
 0 |    1,728,000 |       1 |    0 |         0 | RDX       |       1 |        0 
| 2022-12-08 23:02:14 | 1,405,025 |
|     170 | SDPVE2RDX_0002_0005 | Error     |       1 |               1 |       
 0 |    1,728,000 |       1 |    0 |         0 | RDX       |       1 |        0 
|                     |         0 |
|     171 | SDPVE2RDX_0002_0006 | Error     |       1 |               1 |       
 0 |    1,728,000 |       1 |    0 |         0 | RDX       |       1 |        0 
|                     |         0 |
|     172 | SDPVE2RDX_0002_0007 | Disabled  |       1 |             234 |       
 0 |    1,728,000 |       1 |    0 |         0 | RDX       |       1 |        0 
|                     |         0 |
|     173 | SDPVE2RDX_0002_0008 | Disabled  |       1 |             234 |       
 0 |    1,728,000 |       1 |    0 |         0 | RDX       |       1 |        0 
|                     |         0 |
|     174 | SDPVE2RDX_0002_0009 | Disabled  |       1 |             234 |       
 0 |    1,728,000 |       1 |    0 |         0 | RDX       |       1 |        1 
| 2022-09-23 21:53:46 |         0 |
|     295 | SDPVE2RDX_0002_0000 | Used      |       1 | 179,877,194,043 |       
41 |    1,728,000 |       1 |    0 |         0 | RDX       |       1 |        0 
| 2022-12-02 21:58:51 |   882,822 |
+---------+---------------------+-----------+---------+-----------------+----------+--------------+---------+------+-----------+-----------+---------+----------+---------------------+-----------+

they are in slots.

The only way i've found to 'cure' this is restart the storage daemon,
loosing of course the running jobs.


How can i fix this?! Thanks.



_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users

Reply via email to