Thank you to Jonathan, David, and Martin for their helpful replies. They
helped me find a solution that worked for me.

*A question, upfront: Is there any documentation on wild directives and
the rules for how they pattern match? The information I found in the
manual was vague and at times incorrect.*

*The rest of this email is mostly an 'after action' report, discussing what
I found and what I did to solve problems.*

For me, this was mostly a test exercise to better understand wild
statements. My actual use case involves errors on 3 test files in a
OneDrive account that is currently signed out and that the user doesn't
need. I could have simply deleted the files and moved on.

I found that I only got errors about these 3 files in initial full backups
(makes sense: they didn't change, so incremental / differential backups
wouldn't have picked them up again). For this reason I could only test my
exclude wild statements with a full backup.
*I REALLY should have made a test job with a more limited test fileset
targeting the user directory and excluding a number of irrelevant user
directories a lot sooner.* I'd say I did the first 3/4 of the testing over
a couple of *weeks* using full backups against the normal job / fileset,
and the last 1/4 of testing in the last few *hours* with a test job using
a modified, more limited fileset. *If I had made this test job / fileset
sooner, I would have saved a lot of time and effort wasted waiting for
full jobs to complete.* I knew I could do this all along, but somehow I
didn't prioritize it properly and paid the price. I kept thinking "It'll
just be one more run, I'll check it in the morning." Nope.
I put my test job and fileset in a file created by the bacula user
(/opt/bacula/etc/TESTONLY-cad2.conf), and referenced it from the
bacula-dir.conf file using a line like
      @/opt/bacula/etc/TESTONLY-cad2.conf
I figured this would make it easier to clean up any conf changes
afterwards, with less risk that random changes I made would impact
anything other than the test job and fileset. To clean up, just delete all
test jobs from within Bacularis, delete the test conf file, remove the
reference to this file from bacula-dir.conf, and reload the director.
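The file-level part of that cleanup could be scripted roughly as follows.
This is only a sketch: remove_test_include is a hypothetical helper name
(not part of Bacula), it assumes the include line sits alone on its own
line with no leading whitespace, and it leaves deleting the test jobs in
Bacularis and reloading the director to be done separately.

```shell
#!/bin/sh
# Hypothetical helper (not part of Bacula): drop the "@<test conf>" include
# line from the director config and delete the test conf file.
# Usage: remove_test_include /path/to/bacula-dir.conf /path/to/test.conf
remove_test_include() {
  rti_dir_conf=$1
  rti_test_conf=$2
  rti_tmp=$(mktemp) || return 1

  # -v keeps non-matching lines; -x and -F match the whole line literally.
  # grep exits 1 when it prints nothing, which is not an error here.
  grep -vxF "@${rti_test_conf}" "$rti_dir_conf" > "$rti_tmp" \
    || [ $? -eq 1 ] || return 1

  # Keep a backup, then overwrite in place so ownership and mode are kept.
  cp -p "$rti_dir_conf" "${rti_dir_conf}.bak" || return 1
  cat "$rti_tmp" > "$rti_dir_conf" || return 1
  rm -f "$rti_tmp" "$rti_test_conf"
}
```

A reload (echo reload | bconsole) is still needed afterwards for the
director to drop the test job and fileset.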

I want to emphasize something: As I'm spending this morning doing final
tests and composing this email, I've run a lot of fast bacula jobs to test
various fileset changes. I found some obvious errors, and as I'm testing
and describing what I found, I occasionally find something where my
thinking was incorrect and I had reached the wrong conclusion. The big
lesson here is that long drawn out testing processes where you start a
job, then come back hours or days later don't let you (or at least me)
focus on the issue as a whole. Finding ways to shorten the testing time
and speed up the process can bring benefits beyond just achieving a faster
result: it can also result in better focus and more accurate conclusions.

I found that the following would match all OneDrive folders, including
OneDrive 'work' accounts in the format "OneDrive - Company Name":
      wilddir = "*:/Users/*/OneDrive*"
Crucially, this same statement with a / on the end would NOT work
(example: wilddir = "*:/Users/*/OneDrive*/"). This was confusing, since the
bacula manual section on filesets says:
"When using wild-cards or regular expressions, directory names are always
terminated with a slash (/) and filenames have no trailing slash."
*My guess is that real paths like "C:/This/is/a/directory" are being
returned without a / on the end, and as such didn't match a pattern with a
trailing slash. Maybe the manual is incorrect, or this behavior only
applies to Windows clients?* I haven't tested further, but the trailing
slash gave me a lot of trouble and mistaken conclusions for a while.
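The guess above can be illustrated with a small shell sketch. It uses the
shell's own glob matching as a stand-in, which is an assumption: Bacula's
matcher may differ in details, so this only shows why a pattern ending in /
cannot match a path string that has no trailing slash. The helper name
matches is made up for this example.

```shell
#!/bin/sh
# Illustration only: shell glob matching stands in for Bacula's matcher.
# matches PATH PATTERN -> exit status 0 if the whole PATH fits the PATTERN.
matches() {
  case $1 in
    $2) return 0 ;;   # unquoted $2 is treated as a glob pattern
    *)  return 1 ;;
  esac
}

# A directory path as it might be handed to the matcher: no trailing slash.
path="C:/Users/Dtr02060719/OneDrive - The delegates"

for pattern in '*:/Users/*/OneDrive*' \
               '*:/Users/*/OneDrive*/' \
               '*:/Users/*/OneDrive/'; do
  if matches "$path" "$pattern"; then
    echo "match:    $pattern"
  else
    echo "no match: $pattern"
  fi
done
# Prints:
#   match:    *:/Users/*/OneDrive*
#   no match: *:/Users/*/OneDrive*/
#   no match: *:/Users/*/OneDrive/
```

If the path string did end in a slash, the second pattern would match it,
which is what the manual's wording would lead one to expect.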

As Martin said, wild expressions don't appear to match any VSS paths. I
needed to use normal system paths.

*It was important to do science when testing this: that is to say, I had to
document what I had tried and the results. Otherwise it was too confusing.*

With so many iterative tests of the config file changes, I found it useful
to use a one-liner that reloaded the bacula director config, grepped the
test conf file for lines matching 'OneDrive', and launched the test job via
bconsole. I would copy the grep output showing which wild statements I had
defined into a text editor, note the job number reported by bconsole when
the job launched, and then inspect the job log for that job in Bacularis.
I'd type the results under the wild statements I'd used, make some changes
to the conf file, then repeat. The grep step let me confirm that I had
properly saved the conf file, and that I really was applying the changes I
thought I'd made.
For reference, this was my one-liner (a single line; mail clients may wrap
it):
echo reload | bconsole && grep -i onedrive /opt/bacula/etc/TESTONLY-cad2.conf && echo -e "run job=Backup-TESTONLY-delegates-cad2-job\ny\nquit\n" | bconsole

And here is sample output:
[gerber@td-bacula ~]$ echo reload | bconsole && grep -i onedrive /opt/bacula/etc/TESTONLY-cad2.conf && echo -e "run job=Backup-TESTONLY-delegates-cad2-job\ny\nquit\n" | bconsole
Connecting to Director td-bacula:9101
1000 OK: 10002 td-bacula-dir Version: 15.0.2 (21 March 2024)
Enter a period to cancel a command.
reload
  Fileset = "Windows-stupid-onedrive-test-fs"
  Name = "Windows-stupid-onedrive-test-fs"
#      wilddir = "*:/Users/*/OneDrive*/"  # doesn't work. No trailing slashes allowed on directory matches in wild statements, under windows anyhow. *nix untested.
#      wilddir = "*:/Users/*/OneDrive*"   # works
#      wilddir = "*:/Users/*/OneDrive/"    # doesn't work. Doesn't match my folder path.
#      wilddir = "C:/Users/Dtr02060719/OneDrive - The delegates/"   # doesn't work
#      wilddir = "C:/Users/Dtr02060719/OneDrive - The delegates"   # works
#      wilddir = "C:/Users/Dtr02060719/OneDrive - The delegates/*"   # doesn't work. Can't specify a file mask in a wilddir statement.
#      wildfile = "C:/Users/Dtr02060719/OneDrive - The delegates/*"   # works.
#      wilddir = "C:/Users/Dtr02060719/OneDrive/"   # doesn't work. Doesn't match my paths.
#      wildfile = "C:/Users/Dtr02060719/OneDrive - The delegates/Doc1.docx"   # works
#      wildfile = "C:/Users/Dtr02060719/OneDrive - The delegates/Doc2.docx"   # works
#      wildfile = "C:/Users/Dtr02060719/OneDrive - The delegates/Labeled Photos.docx"   # works
Connecting to Director td-bacula:9101
1000 OK: 10002 td-bacula-dir Version: 15.0.2 (21 March 2024)
Enter a period to cancel a command.
run job=Backup-TESTONLY-delegates-cad2-job
Using Catalog "MyCatalog"
Run Backup job
JobName:  Backup-TESTONLY-delegates-cad2-job
Level:    Full
Client:   delegates-cad2-fd
FileSet:  Windows-stupid-onedrive-test-fs
Pool:     Synology-Local-Full (From Job resource)
Storage:  Synology-Local (From Pool resource)
When:     2025-05-29 14:11:48
Priority: 9
OK to run? (Yes/mod/no): y
Job queued. JobId=569
quit
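
The reload / grep / run loop above, together with the note-taking, could be
wrapped in a small script along these lines. This is a sketch under stated
assumptions, not what was actually used: run_test and the BCONSOLE,
TEST_CONF, TEST_JOB and NOTES variables are hypothetical names, and it
relies on bconsole printing a line containing "JobId=" when the job is
queued, as in the sample output.

```shell
#!/bin/sh
# Sketch: reload the director, record the current wild statements and the
# JobId of the launched test job in a notes file.
# All four variables are assumed names; override them as needed.
BCONSOLE=${BCONSOLE:-bconsole}
TEST_CONF=${TEST_CONF:-/opt/bacula/etc/TESTONLY-cad2.conf}
TEST_JOB=${TEST_JOB:-Backup-TESTONLY-delegates-cad2-job}
NOTES=${NOTES:-$HOME/wild-tests.txt}

run_test() {
  # Reload the director so the edited fileset is picked up.
  echo reload | "$BCONSOLE" >/dev/null || return 1
  {
    echo "=== $(date '+%Y-%m-%d %H:%M:%S')"
    # Record which wild statements are currently in the test conf.
    grep -i onedrive "$TEST_CONF"
    # Launch the job and keep only the line that carries the JobId.
    printf 'run job=%s\ny\nquit\n' "$TEST_JOB" | "$BCONSOLE" | grep 'JobId='
    echo "result: (fill in after checking the job log)"
  } >> "$NOTES"
}
```

Calling run_test after each conf edit appends one dated entry to the notes
file, so the patterns tried and the matching JobId stay together.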


Ultimately, I used the following wildfile patterns to match just the 3
problem files. If my users add another OneDrive account or more files, then
I'll get error emails alerting me to this fact, and I'll either educate my
users or build a solution to back up OneDrive specifically (perhaps using
local sync, or rclone).
      wildfile = "C:/Users/Dtr02060719/OneDrive - The delegates/Doc1.docx"   # works
      wildfile = "C:/Users/Dtr02060719/OneDrive - The delegates/Doc2.docx"   # works
      wildfile = "C:/Users/Dtr02060719/OneDrive - The delegates/Labeled Photos.docx"   # works

These were the highlights of my test results. Some of the wild patterns
would have matched file or directory structures that didn't exist in my
case. "Doesn't work" in this case means "didn't match my 3 trouble files".
"Works" means the pattern did match, and the exclude statement it was part
of excluded the problem files.

#      wilddir = "*:/Users/*/OneDrive*/"  # doesn't work. No trailing slashes allowed on directory matches in wild statements, under windows anyhow. *nix untested.
#      wilddir = "*:/Users/*/OneDrive*"   # works
#      wilddir = "*:/Users/*/OneDrive/"    # doesn't work. Doesn't match my folder path.
#      wilddir = "C:/Users/Dtr02060719/OneDrive - The delegates/"   # doesn't work
#      wilddir = "C:/Users/Dtr02060719/OneDrive - The delegates"   # works
#      wilddir = "C:/Users/Dtr02060719/OneDrive - The delegates/*"   # doesn't work. Can't specify a file mask in a wilddir statement.
#      wildfile = "C:/Users/Dtr02060719/OneDrive - The delegates/*"   # works.
#      wilddir = "C:/Users/Dtr02060719/OneDrive/"   # doesn't work. Doesn't match my paths.
      wildfile = "C:/Users/Dtr02060719/OneDrive - The delegates/Doc1.docx"   # works
      wildfile = "C:/Users/Dtr02060719/OneDrive - The delegates/Doc2.docx"   # works
      wildfile = "C:/Users/Dtr02060719/OneDrive - The delegates/Labeled Photos.docx"   # works

Regards,
Robert Gerber
402-237-8692
r...@craeon.net
