On 30 September 2012 23:08, Arnaud Delobelle <arno...@gmail.com> wrote:
> On 30 September 2012 02:27, Kevin Anthony <kevin.s.anth...@gmail.com> wrote:
> > I have a list of filenames, and i need to find files with the same name,
> > different extensions, and split that into tuples. does anyone have any
> > suggestions on an easy way to do this that isn't O(n^2)?
>
> >>> import os, itertools
> >>> filenames = ["foo.png", "bar.csv", "foo.html", "bar.py"]
> >>> dict((key, tuple(val)) for key, val in
> itertools.groupby(sorted(filenames), lambda f: os.path.splitext(f)[0]))
> {'foo': ('foo.html', 'foo.png'), 'bar': ('bar.csv', 'bar.py')}

That seems wasteful: the sort alone is O(n log n).

I've seen this pattern a lot. Surely there should be an object for this...

>>> filenames = ["foo.png", "bar.csv", "foo.html", "bar.py"]
>>> import os
>>> from collections import defaultdict
>>> grouped = defaultdict(list)
>>> for file in filenames:
...     splitname = os.path.splitext(file)
...     grouped[splitname[0]].append(splitname[1])
...
>>> grouped
defaultdict(<class 'list'>, {'foo': ['.png', '.html'], 'bar': ['.csv', '.py']})

This should be near enough O(n) time. Pah, it's not like you need to
optimize this anyway!
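For what it's worth, here is a minimal sketch along the same single-pass lines
(the helper name group_by_stem and its variables are only illustrative, not
from either post) that keeps the full filenames and returns the tuples the
original question asked for; the printed key order assumes a modern Python 3,
where dicts preserve insertion order:

import os
from collections import defaultdict

def group_by_stem(filenames):
    """Map each stem ('foo') to a tuple of the filenames that share it."""
    # Single O(n) pass, no sorting needed.
    grouped = defaultdict(list)
    for name in filenames:
        stem = os.path.splitext(name)[0]
        grouped[stem].append(name)
    # Convert the lists to tuples, since the question asked for tuples.
    return {stem: tuple(names) for stem, names in grouped.items()}

print(group_by_stem(["foo.png", "bar.csv", "foo.html", "bar.py"]))
# -> {'foo': ('foo.png', 'foo.html'), 'bar': ('bar.csv', 'bar.py')}
#    (exact key order may differ on older Pythons, where dicts are unordered)

Same O(n) behaviour as the defaultdict loop above, but it produces the same
shape of result as the sorted/groupby version without the sort.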
-- 
http://mail.python.org/mailman/listinfo/python-list