On 30 September 2012 23:08, Arnaud Delobelle <arno...@gmail.com> wrote:
> On 30 September 2012 02:27, Kevin Anthony <kevin.s.anth...@gmail.com> wrote:
> > I have a list of filenames, and i need to find files with the same name,
> > different extensions, and split that into tuples. does anyone have any
> > suggestions on an easy way to do this that isn't O(n^2)?
>
> >>> import os, itertools
> >>> filenames = ["foo.png", "bar.csv", "foo.html", "bar.py"]
> >>> dict((key, tuple(val)) for key, val in
> itertools.groupby(sorted(filenames), lambda f: os.path.splitext(f)[0]))
> {'foo': ('foo.html', 'foo.png'), 'bar': ('bar.csv', 'bar.py')}

That seems wasteful: the sort alone is O(n log n).

I've seen this pattern a lot. Surely there should be an object for this...

>>> filenames = ["foo.png", "bar.csv", "foo.html", "bar.py"]
>>> import os
>>> from collections import defaultdict
>>> grouped = defaultdict(list)
>>> for file in filenames:
...     splitname = os.path.splitext(file)
...     grouped[splitname[0]].append(splitname[1])
...
>>> grouped
defaultdict(<class 'list'>, {'foo': ['.png', '.html'], 'bar': ['.csv', '.py']})

This should be near enough O(n) time. Pah, it's not like you need to
optimize this anyway!
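For what it's worth, here is a minimal sketch along the same single-pass lines
(the helper name group_by_stem and its variables are only illustrative, not
from either post) that keeps the full filenames and returns the tuples the
original question asked for; the printed key order assumes a modern Python 3,
where dicts preserve insertion order:

import os
from collections import defaultdict

def group_by_stem(filenames):
    """Map each stem ('foo') to a tuple of the filenames that share it."""
    # Single O(n) pass, no sorting needed.
    grouped = defaultdict(list)
    for name in filenames:
        stem = os.path.splitext(name)[0]
        grouped[stem].append(name)
    # Convert the lists to tuples, since the question asked for tuples.
    return {stem: tuple(names) for stem, names in grouped.items()}

print(group_by_stem(["foo.png", "bar.csv", "foo.html", "bar.py"]))
# -> {'foo': ('foo.png', 'foo.html'), 'bar': ('bar.csv', 'bar.py')}
#    (exact key order may differ on older Pythons, where dicts are unordered)

Same O(n) behaviour as the defaultdict loop above, but it produces the same
shape of result as the sorted/groupby version without the sort.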
-- 
http://mail.python.org/mailman/listinfo/python-list