Dear Bill,

please find my response inline, further down:

> On 2. Mar 2023, at 17:33, Bill Arlofski via Bacula-users
> <bacula-users@lists.sourceforge.net> wrote:
>
> On 3/1/23 16:22, Justin Case wrote:
>> What I actually did was to select them in Baculum and purged them. Then they
>> were in state “Purged”. Then next time a volume from this pool was needed,
>> Baculum grabbed the first one and used it without problems.
>
> Hello Justin,
>
> Well this does not make sense to me from what I understand from your initial
> description of the issue of volumes in error.
>
> I will do a couple tests to be sure, but if the scenario is like:
>
> - New volume needed
> - Director creates one in the catalog
> - SD is instructed to create same volume on disk
> - Disk Volume creation fails (due to unmounted filesystem, permissions issue,
>   etc)
> - Director marks the volume in Error status (well, Read-Only in my tests
>   here) in the catalog and it is no longer eligible for any write/truncate
>   actions by the Director, nor the SD.
>
> In the scenario above, we have an "orphaned" volume in the catalog, and no
> file volume on disk. Nothing except deleting this volume from the catalog can
> really be done.
>
> If the volstatus is manually set to Append, Recycle, or Purged, the next time
> any action is attempted by the SD, it will fail since the file volume does
> not exist to read.
>
> When the SD attempts to open a volume, the first thing the SD does is attempt
> to read its label. Of course this would fail on a non-existent file volume,
> and the Director would reset the volume to Error (Read-Only) in the catalog.
>
> BUT... As I re-read your initial email, I see that the scenario I described
> above is not what happened. :)
>
> In your case, the file volume did exist on disk, but at the time of the
> error, the SD could not access it (disk not mounted, NFS/iSCSI/FC/CIFS
> problems, etc). It seems that later, as you described this partition and the
> volumes on it became available. This is why the purge worked for you.
>
I am sorry, but you are mistaken. I triple-checked that the volume did NOT exist on disk. When I checked, the disk was mounted. The volume limit of the pool had not been reached yet. The disk showed as 100% full, although some hundred GB were still free on it; that, however, seems to have been the root cause. I have not changed any permissions, and it works smoothly again.

Here is the full log for one of the volumes set to status “Error”:

Created new Volume="vol-2667", Pool=“client1", MediaType=“file" in catalog.

Warning: label.c:404 Open File device “storagdev1" (/mnt/storage1) Volume "vol-2667" failed: ERR=file_dev.c:189 Could not open(/mnt/storage1/vol-2667,CREATE_READ_WRITE,0640): ERR=No space left on device

Warning: label.c:404 Open File device “storagedev1" (/mnt/storage1) Volume “vol-2667" failed: ERR=file_dev.c:189 Could not open(/mnt/storage1/vol-2667,CREATE_READ_WRITE,0640): ERR=No space left on device

Warning: mount.c:216 Open of File device “storagedev1" (/mnt/storage1) Volume "vol-2667" failed: ERR=file_dev.c:189 Could not open(/mnt/storage1/vol-2667,CREATE_READ_WRITE,0640): ERR=No space left on device

Marking Volume "vol-2667" in Error in Catalog.

> Notice in your initial message you say:
> ----8<----
> today I had the situation that bacula tried to create a new disk file volumes
> and the creation failed
> ----8<----
>
> and you show the log message:
> ----8<----
> Warning: mount.c:216 Open of File device “storagedev1" (/mnt/storage1) Volume
> "vol-2675" failed: ERR=file_dev.c:189 Could not
> open(/mnt/storage1/vol-2675,OPEN_READ_WRITE,0640): ERR=No such file or
> directory
> ----8<----
>
>
> Notice this is a *MOUNT* action, not a *LABEL* action. It is the SD trying to
> open and read the volume's label.

Sorry, that log line was missing because the logs for the volume were rather scattered, as there were also requests for labeling a volume. I went back in the logs to find a clearer block for a volume with a lower media ID.
(See above.)

> In my testing here, I changed the ownership/permissions on all of my disk
> volumes to `root:root 0750` so that the SD running as `bacula` could not
> access them.
>
> When I run a job, I see what I expected when the SD tries to 'open'
> (OPEN_READ_WRITE) an existing volume (exactly as in your case):
> ----8<----
> Warning: mount.c:216 Open of File device "FileChgr1-Dev2"
> (/opt/comm-bacula-mysql/archive) Volume "Vol-0009" failed: ERR=file_dev.c:189
> Could not open(/opt/comm-bacula-mysql/archive/Vol-0009,OPEN_READ_WRITE,0640):
> ERR=Permission denied
> ----8<----
>
> Then, it is marked as 'Read-Only' in the catalog:
> ----8<----
> Marking Volume "Vol-0009" Read-Only in Catalog
> ----8<----
>
> Note: I could have sworn I remember that this scenario used to cause them to
> get marked in 'Error'. Maybe Eric can comment on this. Did something change
> in this scenario?
>
>
> Then, a new file volume is created in the catalog, and the SD fails to
> *LABEL* it:
> ----8<----
> Created new Volume="Vol-0012", Pool="File", MediaType="File1" in catalog.
>
> Warning: label.c:404 Open File device "FileChgr1-Dev2"
> (/opt/comm-bacula-mysql/archive) Volume "Vol-0012" failed: ERR=file_dev.c:189
> Could not
> open(/opt/comm-bacula-mysql/archive/Vol-0012,CREATE_READ_WRITE,0640):
> ERR=Permission denied
> ----8<----
>
> Note the 'label.c' and "CREATE_READ_WRITE" here.
>
>
> So, in your scenario, I am pretty sure of two things:
>
> 1. You had a temporary disk access outage of some type when the SD tried
> to open an existing Volume "vol-2675"

Nope, the disk was full. Not really totally full: df showed 100% full, yet some hundred GB were still writeable, and I was able to create files on that disk. I am not sure how Bacula determined that the disk was full. In the shell I was able to create files; I am not sure why the SD was unable to do so. I freed space by truncating purged volumes.

> 2. The SD was trying to open an existing file volume, not trying to label a
> new one.
Yes, but that was not the whole story: it tried to create the volume earlier and failed. It tried to label it earlier and failed as well.

> Hope this helps.

Yes, kind of. It was really kind of you to test this, highly appreciated! I am sorry that this might not have been necessary if you had had the other log lines.

> Also, if you have logs that prove something other than what I described
> above, I would love to see them.

You have them now.

> Especially if that volume was marked in "Error" status, because mine get
> marked in "Read-Only" status using 13.0.2, and this may also confirm that
> something has changed in this area of the code as I suspect. There has been
> a lot of work around security, encryption, marking volumes immutable or
> append only in recent versions of Bacula Enterprise, and it would not
> surprise me if some of this did indeed change some behaviors.

Your test case is different from the situation I had here. What is still unclear to me is why the CREATE during labeling failed, as I was able to create files manually in the shell (in the Docker container where the SD lives).

Thank you so much and all the best,
j/c

>
>
> Best regards,
> Bill
>
> --
> Bill Arlofski
> w...@protonmail.com
>
> _______________________________________________
> Bacula-users mailing list
> Bacula-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/bacula-users
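
P.S. On the open question of why the CREATE failed while files could still be created from the shell: "No space left on device" appears to be the operating system's message for ENOSPC, passed through from the failed open call, so the SD probably did not measure the free space itself. One possible explanation, which is an assumption and not something the logs confirm, is that only root-reserved blocks or no free inodes were left, so that a root shell could still create files while an unprivileged SD process could not. A minimal Python sketch to check this on the storage path follows; `space_report` and `diagnose` are hypothetical helper names, not part of Bacula.

```python
import os


def space_report(path):
    """Return free blocks and inodes for the filesystem that holds `path`."""
    st = os.statvfs(path)
    return {
        # Free space including the blocks reserved for root.
        "free_bytes_root": st.f_bfree * st.f_frsize,
        # Free space available to unprivileged processes.
        "free_bytes_user": st.f_bavail * st.f_frsize,
        # Total and free inodes; some filesystems report 0 total inodes.
        "total_inodes": st.f_files,
        "free_inodes_user": st.f_favail,
    }


def diagnose(report):
    """Name a likely reason why an unprivileged create would hit ENOSPC."""
    if report["total_inodes"] > 0 and report["free_inodes_user"] == 0:
        return "no free inodes"
    if report["free_bytes_user"] == 0 and report["free_bytes_root"] > 0:
        return "only root-reserved blocks left"
    if report["free_bytes_root"] == 0:
        return "no free blocks"
    return "no shortage visible right now"
```

Calling `diagnose(space_report("/mnt/storage1"))` inside the container, as the same user the SD runs as, would show whether one of these cases applies. It only reflects the state at the time of the call, so it cannot prove what happened when the volume was marked in Error.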