Dumb glob question
I've run into an issue with glob and matching filenames with brackets '[]' in them. The problem comes when I'm using part of such a filename as the path I'm passing to glob. Here's a trimmed down dumb example. Let's say I have a directory with the following files in it.

    foo.par2
    foo.vol0+1.par2
    foo.vol1+1.par2
    zzz [foo].par2
    zzz [foo].vol0+1.par2
    zzz [foo].vol1+1.par2

While processing one of the files I want to do certain things in batch so I've been using glob as a means to get all of the files in a set. The following code will print the filenames for parity volumes in each set while working with the base checksum, unless there are brackets in the name.

    re2 = re.compile(r'vol', re.IGNORECASE)
    for nuke in glob.glob('*.par2'):
        if not re2.search(nuke):
            list = glob.glob(nuke[:-5]+'*vol*')
            for name in list: print os.path.join(os.getcwd(),name)

I'm sure there is something obvious I'm missing. I figured I could use something like re.escape on the trimmed filename for matching but that hasn't worked either. Using win32api.FindFiles instead of glob works but I'd obviously rather do it the _right_ way and have it work properly in *nix too.

--
http://mail.python.org/mailman/listinfo/python-list
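
For readers puzzling over why the bracketed names fail: glob-style patterns treat '[...]' as a character class, so '[foo]' in a pattern matches a single 'f' or 'o' rather than the literal text. Here is a minimal sketch of that behaviour using fnmatch directly (the module that, per a later reply in this thread, glob relies on). It is written for Python 3, and the name 'zzz f.vol0+1.par2' is a made-up illustration, not one of the files listed above.

```python
import fnmatch

# Pattern built the same way as in the post: base name + '*vol*'.
# "[foo]" here is a character class matching ONE character, 'f' or 'o'.
pattern = "zzz [foo]" + "*vol*"

# The real file name contains literal brackets, so it does not match.
literal_match = fnmatch.fnmatchcase("zzz [foo].vol0+1.par2", pattern)

# A hypothetical name with a single 'f' in that position does match.
class_match = fnmatch.fnmatchcase("zzz f.vol0+1.par2", pattern)

print(literal_match, class_match)  # prints: False True
```

This would explain the empty list: the pattern is valid, it just describes different names than the ones on disk.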
Re: Dumb glob question
"[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote in comp.lang.python: > code like below willprint all files ending on 'par2', except tose not > containong 'vol' from the 5th position. is that what you need? > -import glob > -for nuke in glob.glob(r"""c:\temp\*.par2"""): > -try: > -nuke.index('vol', 5) > -print nuke > -except ValueError, e: > -print e Not quite. I'm sorry my example wasn't very clear. While working with any single file I need to be able to build a list of all the other files in a particular set. Basically I just need globbing of the base filename. glob.glob(basename+'.*some_extension') So if I was working with 'foo.par2' at the moment... glob.glob(filename[:-5]+'.*par2') would catch all of the files belonging to the set including 'foo.par2' 'foo.vol0+1.par2' 'foo.vol1+1.par2' etc. This works great (as expected) until you are working with a filename with brackets '[]' in it. Then glob just returns an empty list. So if I happen to be processing 'foo [bar].par2' glob.glob(filename[:-5]+'.*par2') doesn't return anything. Using win32api.FindFiles(filename[:-5]+'.*par2') works perfectly, but I don't want to rely on win32api functions. I hope that made more sense :). -- http://mail.python.org/mailman/listinfo/python-list
Re: Dumb glob question
Michael Hoffman <[EMAIL PROTECTED]> wrote in comp.lang.python:

> Python Dunce wrote:
>
>> So if I happen
>> to be processing 'foo [bar].par2'
>>
>>     glob.glob(filename[:-5]+'.*par2')
>>
>> doesn't return anything. Using
>> win32api.FindFiles(filename[:-5]+'.*par2') works perfectly, but I don't
>> want to rely on win32api functions. I hope that made more sense :).
>
> If you look in the source for glob.py, you will find that it calls the
> fnmatch module, and this is the docstring for fnmatch.translate():
>
>     """Translate a shell PATTERN to a regular expression.
>
>     There is no way to quote meta-characters.
>     """
>
> So you cannot do what you want with glob.
>
> You can replace [] with ? in your glob string, if you are sure that
> there won't be other characters there. That's a bit of a hack, and I
> wouldn't do it.
>
> In my mind it would probably be best to do:
>
>     re_vol = re.compile(re.escape(startpart) + ".*vol.*")
>     lst = [filename for filename in os.listdir(".")
>            if re_vol.match(filename)]
>
> I changed "list" to "lst" because the former shadows a built-in.

Thanks, that should do the trick! I had tried basically the same thing once but I was getting back empty lists. I think it was just a brain fart involving a case-sensitive regex that didn't match the files I was testing it on :/.
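
The os.listdir plus re.escape suggestion quoted above can be sketched as a small self-contained Python 3 script. The function name volumes_for, the temp directory, and the sample file names are assumptions for illustration. The re.IGNORECASE flag is also an addition, included only because the reply mentions a case-sensitivity slip; drop it if exact case matters.

```python
import os
import re
import tempfile


def volumes_for(startpart, directory="."):
    """List names in `directory` that begin with `startpart` and contain 'vol'."""
    # re.escape() makes brackets, '+' and '.' in the base name literal.
    re_vol = re.compile(re.escape(startpart) + r".*vol.*", re.IGNORECASE)
    # re.match anchors at the start of the name, so startpart must be a prefix.
    return sorted(name for name in os.listdir(directory) if re_vol.match(name))


# Demo in a temporary directory with made-up files.
with tempfile.TemporaryDirectory() as tmp:
    samples = [
        "zzz [foo].par2",
        "zzz [foo].VOL0+1.par2",
        "zzz [foo].vol1+1.par2",
        "foo.vol0+1.par2",
    ]
    for name in samples:
        open(os.path.join(tmp, name), "w").close()
    volumes = volumes_for("zzz [foo]", tmp)

print(volumes)
```

Because this never hands the base name to glob, brackets in it are not an issue, and nothing here depends on win32api, so it should behave the same on Windows and *nix.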