At first I was somewhat confused about what the current behaviour is, by this thread, by the documentation, and by my experience with the file matching algorithms used by other programs such as the various shells.
Judging from your example below, I believe you may be confused about the current behaviour as well. The current code allows "*" to match any character INCLUDING the "/". This means that, given the pattern "*kern" and using WildFile, the following files would match:

    /xxx/kern
    /xxx/abc_kern
    /xxx/yyy/kern

But the following would not, unless an asterisk is added to the end of the pattern or WildDir or Wild is used:

    /xxx/kern/yyy
    /xxx/abc_kern/yyy

If you wanted "*kern" to match either as part of the path or as a file, then you would have to use Wild rather than WildFile. The directory would then be matched when it is processed as the tree is descended.

Perhaps most worrisome of all is that the following would match the pattern "/test*.dat":

    /test/archive/important/file1.dat

Or, given the pattern "/xxx/yyy/*.c", both of the following would match:

    /xxx/yyy/abc.c
    /xxx/yyy/backup_copy/abc.c

This is how the code works today. I think these last two cases are counter-intuitive and likely to cause someone problems at some point. Off the top of my head, I can't think of a case where it would be desirable behaviour.

With this understanding of the current behaviour, I believe it is important to make FNM_FILE_NAME work properly and to specify it. This would just require adding one missing check for a "/" in fnmatch() when FNM_FILE_NAME is set. At present, specifying this flag makes little difference to how the code behaves, and it certainly doesn't behave the way it is documented in GNU or POSIX.

However, fixing this "bug" would mean that the pattern "*.tmp" would no longer match anything, since absolute paths are always supplied and "*" could no longer match the "/" characters in them. Hence my original suggestion. While you could add a new keyword for matching against just the base name, you would either have to leave the above behaviour unfixed or break existing configurations.

As far as the code example is concerned, I wouldn't implement it that way in production code either. I just included it to illustrate what I was suggesting using the minimum changes required.
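
To make the two behaviours concrete, here is a minimal sketch. It assumes the system's POSIX fnmatch(3) rather than Bacula's own code, and it uses the POSIX name FNM_PATHNAME (glibc offers FNM_FILE_NAME as a synonym). The names wild_match and wild_match_auto are hypothetical and exist only for this illustration; the second one combines the "/" check with the basename suggestion, so it is not the diff quoted below.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not Bacula code: returns 1 if path matches
 * pattern, 0 otherwise. flags is passed straight to fnmatch(3). */
int wild_match(const char *pattern, const char *path, int flags)
{
    assert(pattern != NULL && path != NULL);
    return fnmatch(pattern, path, flags) == 0;
}

/* Hypothetical helper sketching the suggestion: a pattern containing
 * "/" is matched against the full path, any other pattern against the
 * basename only. FNM_PATHNAME stops "*" from matching "/". */
int wild_match_auto(const char *pattern, const char *path)
{
    const char *slash;

    assert(pattern != NULL && path != NULL);
    if (strchr(pattern, '/') == NULL) {
        slash = strrchr(path, '/');
        if (slash != NULL)
            path = slash + 1;   /* step past the last "/" */
    }
    return fnmatch(pattern, path, FNM_PATHNAME) == 0;
}
```

With flags set to 0, wild_match("/test*.dat", "/test/archive/important/file1.dat", 0) reports a match, which is the worrisome case above; with FNM_PATHNAME it does not, but then "*.tmp" stops matching any absolute path. wild_match_auto keeps "*.tmp" working by comparing it against the basename only.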
In production code I would only scan the filename and patterns once.

While you could use regex, the patterns would be much more complicated and harder for the average user, who is familiar with using the shell, to understand. There is a reason why programs that deal with just filenames use the glob(7) and fnmatch(3) patterns.

I think the reason that only one user has mentioned it in the last 5 years is due to a few factors: the code mostly works as expected, I believe that Wild patterns are primarily used for exclusions, and most users don't examine which files are being excluded until they aren't there when they try to restore.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Kern Sibbald
Sent: Sunday, April 16, 2006 10:19 AM
To: Robert Nelson
Cc: 'Martin Simmons'; [EMAIL PROTECTED]; bacula-users@lists.sourceforge.net
Subject: Re: [Bacula-devel] Surprise bug + Scratch pool algorithm

On Sunday 16 April 2006 11:52, Robert Nelson wrote:
> But I think the behaviour would be very intuitive. If you look at how
> it is used below in the example from the Windows FileSet I think it is
> fairly obvious. I also think it is much clearer and easier to
> maintain than the corresponding regex would be. The code change was
> minimal and didn't require any modification of the buffers themselves.
> Here is the diff for
> WildFile:

Well, I am sorry, but I don't agree with you on the point of it being very intuitive. With your change, I would no longer be able to do something based on a part of the path -- for example, suppose I want to compress all files where any part of the path has kern in the name. I can do it with

WildFile = "*kern"

With your suggestion, that would only match the filename part and would never match against something in the path.

In addition, if I did implement it, I wouldn't do it as in the code below because that code has a huge performance penalty especially if you are dealing with 3 million files.
I leave it to you to work out why.

After a good deal of thought, IMO the correct way to solve the problem is with regex, or possibly if it is really necessary with another directive that explicitly lets the user match against only the filename part rather than the full path, but I don't think a new directive will really be necessary since no one has asked for it in the 5 years it has been programmed.

>
>     } else {
>        for (k=0; k<fo->wildfile.size(); k++) {
> -         if (fnmatch((char *)fo->wildfile.get(k), ff->fname, fnmode|ic) == 0) {
> +         const char *pattern = (const char *)fo->wildfile.get(k);
> +         const char *fname;
> +
> +         if (strchr(pattern, '/') != NULL || (fname = strrchr(ff->fname, '/')) == NULL)
> +            fname = ff->fname;
> +         else
> +            fname++;
> +
> +         if (fnmatch(pattern, fname, fnmode|ic) == 0) {
>              if (ff->flags & FO_EXCLUDE) {
>
> -----Original Message-----
> From: Kern Sibbald [mailto:[EMAIL PROTECTED]
> Sent: Sunday, April 16, 2006 1:18 AM
> To: Robert Nelson
> Cc: 'Martin Simmons'; [EMAIL PROTECTED];
> bacula-users@lists.sourceforge.net
> Subject: Re: [Bacula-devel] Surprise bug + Scratch pool algorithm
>
> On Sunday 16 April 2006 00:21, Robert Nelson wrote:
> > Couldn't you handle both cases transparently. If the pattern has a
> > "/" in it then pass the full name, otherwise just pass the basename
> > to fnmatch().
> >
> > That way you get both behaviours without breaking existing examples
> > and configs.
> >
> > Ironically the Windows example FileSet in the manual expects the
> > above behaviour since it has both
> >
> > WildFile = "[A-Z]:/WINNT/system32/dhcp/tmp.edb"
> > And
> > WildFile = "*.tmp"
>
> That is an interesting idea, but probably not something I would do,
> because it makes matching more complicated by altering the input data
> (filenames) depending on the pattern.
>
> Tar has a similar feature, and I doubt that many on this list know
> about it or that anyone on this list can explain exactly how it works.
>
> Since wild-cards are terribly incomplete, the solution to the
> limitations users will have with wild-cards is to use Bacula's regular
> expressions, which are now implemented (experimentally) in Win32 in
> version 1.38.8. The only problem with the Win32 regex is that it is
> untested and it does not have an "ignore case", which I will probably
> add in a future version.
>
> > -----Original Message-----
> > From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED] On Behalf Of Kern
> > Sibbald
> > Sent: Monday, April 10, 2006 5:09 AM
> > To: Martin Simmons
> > Cc: [EMAIL PROTECTED];
> > bacula-users@lists.sourceforge.net
> > Subject: Re: [Bacula-devel] Surprise bug + Scratch pool algorithm
> >
> > On Monday 10 April 2006 13:15, Martin Simmons wrote:
> > > >>>>> On Mon, 10 Apr 2006 12:22:59 +0200, Kern Sibbald
> > > >>>>> <[EMAIL PROTECTED]>
> > > >>>>> said:
> > > >
> > > > Hello,
> > > >
> > > > It seems that it is becoming more frequent (probably because of
> > > > the increasing number of Bacula users) that users submit support
> > > > questions to the bugs database. This morning a user submitted a
> > > > bug stating that the WildFile option was broken. Normally, I
> > > > would have dismissed this as a support problem because most of
> > > > us realize that wild-cards and regexes are awfully tricky.
> > > >
> > > > However, this user presented a *really* simple case with debug
> > > > output, so I took a look at it, and surprise both WildFile and
> > > > RegexFile are broken because they match against the full path
> > > > and filename rather than just the filename.
> > > >
> > > > I wonder how many users have torn out their hair trying to
> > > > figure out why WildFile or RegexFile didn't work :-(
> > >
> > > Are you really sure that is a bug?
> > > I think the word "filename" in
> > > the documentation is ambiguous, but when it says "No directories
> > > will be matched by this directive" it does not mean that the
> > > matching is performed only on the basename part.
> > >
> > > The examples in "A Windows Example FileSet" are also written to
> > > assume that WildFile compares the whole name.
> > >
> > > The current behaviour is very useful because it allows files in
> > > selected directories to be matched, without accidentally matching
> > > subdirectories (as Wild will do).
> >
> > After a little more thought about this, I'm not so sure I should
> > change the behavior. It is not what I had originally intended (I
> > didn't program it), but to change it now, given all the examples in
> > the doc would create a number of problems.
> >
> > I think the best solution is to ensure that the documentation is
> > extremely clear, then if there is really a demand, implement a new
> > option such as WildFilename that matches against only the filename
> > (basename).
> >
> > --
> > Best regards,
> >
> > Kern
> >
> >   (">
> >   /\
> >   V_V
> >
> > _______________________________________________
> > Bacula-devel mailing list
> > [EMAIL PROTECTED]
> > https://lists.sourceforge.net/lists/listinfo/bacula-devel
>
> --
> Best regards,
>
> Kern
>
>   (">
>   /\
>   V_V

--
Best regards,

Kern

  (">
  /\
  V_V

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users