Dear Bill, please find my response inline, further down:

> On 2. Mar 2023, at 17:33, Bill Arlofski via Bacula-users 
> <bacula-users@lists.sourceforge.net> wrote:
> 
> On 3/1/23 16:22, Justin Case wrote:
>> What I actually did was to select them in Baculum and purged them. Then they 
>> were in state “Purged”. Then next time a volume from this pool was needed, 
>> Baculum grabbed the first one and used it without problems.
> 
> Hello Justin,
> 
> Well this does not make sense to me from what I understand from your initial 
> description of the issue of volumes in error.
> 
> I will do a couple tests to be sure, but if the scenario is like:
> 
> - New volume needed
> - Director creates one in the catalog
> - SD is instructed to create same volume on disk
> - Disk Volume creation fails (due to unmounted filesystem, permissions issue, 
> etc)
> - Director marks the volume in Error status (well, Read-Only in my tests 
> here) in the catalog and it is no longer eligible for any write/truncate 
> actions by the Director, nor the SD.
> 
> In the scenario above, we have an "orphaned" volume in the catalog, and no 
> file volume on disk. Nothing except deleting this volume from the catalog can 
> really be done.
> 
> If the volstatus is manually set to Append, Recycle, or Purged, the next time 
> any action is attempted by the SD, it will fail since the file volume does 
> not exist to read.
> 
> When the SD attempts to open a volume, the first thing the SD does is attempt 
> to read its label. Of course this would fail on a non-existent file volume, 
> and the Director would reset the volume to Error (Read-Only) in the catalog.
> 
> BUT... As I re-read your initial email, I see that the scenario I described 
> above is not what happened. :)
> 
> In your case, the file volume did exist on disk, but at the time of the 
> error, the SD could not access it (disk not mounted, NFS/iSCSI/FC/CIFS 
> problems, etc). It seems that later, as you described, this partition and the 
> volumes on it became available. This is why the purge worked for you.
> 

I am sorry, but you are mistaken. I triple-checked that the volume did NOT exist 
on disk. When I checked, the disk was mounted. The volume limit of the pool was 
not reached yet. The disk showed as 100% full, although there were still some 
hundred GB free on it; this, however, seems to have been the root cause. I have 
not changed any permissions and it works smoothly again.

Here is the full log for one of the volumes set to status “Error”:

Created new Volume="vol-2667", Pool="client1", MediaType="file" in catalog.
Warning: label.c:404 Open File device "storagedev1" (/mnt/storage1) Volume 
"vol-2667" failed: ERR=file_dev.c:189 Could not 
open(/mnt/storage1/vol-2667,CREATE_READ_WRITE,0640): ERR=No space left on device
Warning: label.c:404 Open File device "storagedev1" (/mnt/storage1) Volume 
"vol-2667" failed: ERR=file_dev.c:189 Could not 
open(/mnt/storage1/vol-2667,CREATE_READ_WRITE,0640): ERR=No space left on device
Warning: mount.c:216 Open of File device "storagedev1" (/mnt/storage1) Volume 
"vol-2667" failed: ERR=file_dev.c:189 Could not 
open(/mnt/storage1/vol-2667,CREATE_READ_WRITE,0640): ERR=No space left on device
Marking Volume "vol-2667" in Error in Catalog.
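
To tell the labeling failures apart from the later mount failures, each warning 
can be split into its source file, open mode and OS error. A minimal sketch in 
Python, assuming the message format shown in this thread; the regular expression, 
the field names and the `classify` helper are illustrative and not part of Bacula:

```python
import re
from typing import Optional

# Matches SD warnings of the form quoted in this thread.
# The group names are labels chosen for this sketch only.
WARNING_RE = re.compile(
    r'Warning: (?P<source>\w+\.c):(?P<line>\d+) .*?'
    r'open\((?P<path>[^,]+),(?P<mode>\w+),(?P<perm>\d+)\): ERR=(?P<error>.+)$'
)


def classify(message: str) -> Optional[dict]:
    """Return source file, open mode and OS error of one SD warning, or None."""
    # Mail clients wrap long log lines, so collapse all whitespace first.
    flat = " ".join(message.split())
    match = WARNING_RE.search(flat)
    return match.groupdict() if match else None
```

For the first warning above this yields source `label.c`, mode 
`CREATE_READ_WRITE` and the error `No space left on device`, so the failure is 
in creating the file, not in reading an existing one.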


> Notice in your initial message you say:
> ----8<----
> today I had the situation that bacula tried to create a new disk file volumes 
> and the creation failed
> ----8<----
> 
> and you show the log message:
> ----8<----
> Warning: mount.c:216 Open of File device "storagedev1" (/mnt/storage1) Volume 
> "vol-2675" failed: ERR=file_dev.c:189 Could not 
> open(/mnt/storage1/vol-2675,OPEN_READ_WRITE,0640): ERR=No such file or 
> directory
> ----8<----
> 
> 
> Notice this is a *MOUNT* action, not a *LABEL* action. It is the SD trying to 
> open and read the volume's label.

Sorry, that log line was missing because the logs for the volume were somewhat 
scattered; there were also requests for labeling a volume. I went back in the 
logs to find a clearer block for a volume with a lower media ID (see above).

> In my testing here, I changed the ownership/permissions on all of my disk 
> volumes to `root:root 0750` so that the SD running as `bacula` could not 
> access them.
> 
> When I run a job, I see what I expected when the SD tries to 'open' 
> (OPEN_READ_WRITE) an existing volume (exactly as in your case):
> ----8<----
> Warning: mount.c:216 Open of File device "FileChgr1-Dev2" 
> (/opt/comm-bacula-mysql/archive) Volume "Vol-0009" failed: ERR=file_dev.c:189 
> Could not open(/opt/comm-bacula-mysql/archive/Vol-0009,OPEN_READ_WRITE,0640): 
> ERR=Permission denied
> ----8<----
> 
> Then, it is marked as 'Read-Only' in the catalog:
> ----8<----
> Marking Volume "Vol-0009" Read-Only in Catalog
> ----8<----
> 
> Note: I could have sworn I remember that this scenario used to cause them to 
> get marked in 'Error'. Maybe Eric can comment on this. Did something change 
> in this scenario?
> 
> 
> Then, a new file volume is created in the catalog, and the SD fails to 
> *LABEL* it:
> ----8<----
> Created new Volume="Vol-0012", Pool="File", MediaType="File1" in catalog.
> 
> Warning: label.c:404 Open File device "FileChgr1-Dev2" 
> (/opt/comm-bacula-mysql/archive) Volume "Vol-0012" failed: ERR=file_dev.c:189 
> Could not 
> open(/opt/comm-bacula-mysql/archive/Vol-0012,CREATE_READ_WRITE,0640): 
> ERR=Permission denied
> ----8<----
> 
> Note the 'label.c' and "CREATE_READ_WRITE" here.
> 
> 
> So, in your scenario, I am pretty sure of two things:
> 
> 1. You had a temporary disk access outage of some type when the SD tried
>   to open an existing Volume "vol-2675"

Nope, disk was full. Not really totally full, but df shows 100% full, with still 
some hundred GB writeable, and I was able to create files on that disk.
Not sure how Bacula determines whether the disk is full. In the shell I was able 
to create files, not sure why the SD was unable to do so.
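
One way to narrow this down is to compare what the filesystem reports for 
privileged and unprivileged processes, since "No space left on device" can also 
come from exhausted inodes or from blocks reserved for root. A minimal sketch, 
assuming Python is available in the SD container; `space_report` is a 
hypothetical helper, and whether reserved blocks or inodes played a role here 
is an unverified assumption:

```python
import os


def space_report(path: str) -> dict:
    """Summarise block and inode availability of the filesystem holding path."""
    st = os.statvfs(path)
    return {
        # Free space including blocks reserved for privileged processes.
        "free_bytes_total": st.f_bfree * st.f_frsize,
        # Free space an unprivileged process may actually use.
        "free_bytes_unprivileged": st.f_bavail * st.f_frsize,
        # Free inodes available to an unprivileged process.
        "free_inodes_unprivileged": st.f_favail,
        "total_inodes": st.f_files,
    }
```

Calling `space_report("/mnt/storage1")` as the user the SD runs as would show 
whether the unprivileged figures are at or near zero while the total is not.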

I freed space by truncating purged volumes.

> 2. The SD was trying to open an existing file volume, not trying to label a 
> new one.

Yes, but that was not the whole story: it tried to create it earlier and failed. 
It tried to label it earlier and failed….

> Hope this helps.

Yes, kind of… really kind of you to test this! Highly appreciated. I am sorry 
that this would possibly not have been necessary if you had had the other log 
lines.

> Also, if you have logs that prove something other than what I described 
> above, I would love to see them.

You got them now.

> Especially if that volume was marked in "Error" status, because mine get 
> marked in "Read-Only" status using 13.0.2, and this may also confirm that 
> something has changed in this area of the code as I suspect.  There has been 
> a lot of work around security, encryption, marking volumes immutable or 
> append only in recent versions of Bacula Enterprise, and it would not 
> surprise me if some of this did indeed change some behaviors.

Your test case is different from the situation I had here.

Still unclear to me is why CREATE during labeling failed, as I was able to 
manually create files in the shell (in the docker container where the SD lives).
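
A shell test can differ from what the SD does in user, flags and file mode, so 
it may help to repeat the create with the values from the log. A minimal 
sketch, assuming that CREATE_READ_WRITE corresponds to O_CREAT | O_RDWR (an 
assumption, not confirmed from the Bacula source); `try_create` and the probe 
file name are hypothetical, and O_EXCL is added so that an existing file is 
never touched:

```python
import errno
import os


def try_create(directory: str, name: str = "probe-volume") -> str:
    """Create and remove a probe file with mode 0640; return 'ok' or the error."""
    path = os.path.join(directory, name)
    try:
        # Assumed equivalent of CREATE_READ_WRITE with mode 0640 from the log.
        # O_EXCL makes the call fail instead of reusing an existing file.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_RDWR, 0o640)
    except OSError as exc:
        code = errno.errorcode.get(exc.errno, str(exc.errno))
        return f"{code}: {exc.strerror}"
    os.close(fd)
    os.remove(path)  # clean up the probe file again
    return "ok"
```

Run as the SD's user inside the container, `try_create("/mnt/storage1")` 
returns either `ok` or the errno name, for example a string starting with 
`ENOSPC`.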

Thank you so much and all the best,
 j/c

> 
> 
> Best regards,
> Bill
> 
> -- 
> Bill Arlofski
> w...@protonmail.com
> 
> _______________________________________________
> Bacula-users mailing list
> Bacula-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/bacula-users
