Re: Finding duplicate file names and modifying them based on elements of the path

2012-07-20 Thread MRAB
On 20/07/2012 04:07, larry.mart...@gmail.com wrote: [snip] Also, in make_dir5_key the format specifier for strftime should be %y%m%d so they sort properly. Correct. I realised that only some time later, after I'd turned off my computer for the night. :-( -- http://mail.python.org/mailman/listi
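
For anyone skimming the thread, here is a rough sketch of why the field order in the strftime format matters for sorting. The dates are made up for illustration, and make_dir5_key itself is not visible in this snippet, so nothing below reproduces it; it only compares a year-month-day key with a month-day-year key.

```python
from datetime import date

# Made-up sample dates, for illustration only.
dates = [date(2012, 7, 19), date(2011, 12, 1), date(2012, 1, 30)]

# Year-month-day strings sort lexicographically in chronological order.
ymd_keys = sorted(d.strftime("%y%m%d") for d in dates)

# Month-day-year strings do not: the month dominates the string comparison.
mdy_keys = sorted(d.strftime("%m%d%y") for d in dates)

print(ymd_keys)
print(mdy_keys)
```

With the month first, the 2011 date ends up last even though it is the earliest, which is presumably the kind of mis-ordering the correction is about.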

Re: Finding duplicate file names and modifying them based on elements of the path

2012-07-20 Thread Paul Rudin
"larry.mart...@gmail.com" writes: > It seems that if you do a list(group) you have consumed the list. This > screwed me up for a while, and seems very counter-intuitive. You've consumed the *group* which is an iterator, in order to construct a list from its elements. Sorry if this is excessively

Re: Finding duplicate file names and modifying them based on elements of the path

2012-07-20 Thread Paul Rubin
"larry.mart...@gmail.com" writes: > It seems that if you do a list(group) you have consumed the list. This > screwed me up for a while, and seems very counter-intuitive. Yes, that is correct, you have to carefully watch where the stuff in the iterators is getting consumed, including when there ar

Re: Finding duplicate file names and modifying them based on elements of the path

2012-07-20 Thread Peter Otten
larry.mart...@gmail.com wrote: > It seems that if you do a list(group) you have consumed the list. This > screwed me up for a while, and seems very counter-intuitive. Many itertools functions work that way. It allows you to iterate over the items even if there is more data than fits into memory.
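
A minimal sketch of the behaviour being discussed, using made-up data (the word list and variable names are only for illustration). Each group that itertools.groupby yields is an iterator, so a second list(group) finds nothing left to read.

```python
from itertools import groupby

# Made-up sample data, already sorted; groupby only merges adjacent equal keys.
words = ["apple", "avocado", "banana", "blueberry", "cherry"]

first_pass = []
second_pass = []
for key, group in groupby(words, key=lambda w: w[0]):
    first_pass.append((key, list(group)))   # this exhausts the group iterator
    second_pass.append((key, list(group)))  # nothing is left to iterate over

print(first_pass)
print(second_pass)
```

If the items are needed more than once, the usual workaround is to keep the list from the first call and reuse it.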

Re: Finding duplicate file names and modifying them based on elements of the path

2012-07-19 Thread larry.mart...@gmail.com
On Jul 19, 7:01 pm, "larry.mart...@gmail.com" wrote: > On Jul 19, 3:32 pm, MRAB wrote: > > On 19/07/2012 20:06, larry.mart...@gmail.com wrote: > > > > On Jul 19, 1:02 pm, "Prasad, Ramit" wrote: > > >> > > I am making the assumption that you intend to collapse the directory > >

Re: Finding duplicate file names and modifying them based on elements of the path

2012-07-19 Thread larry.mart...@gmail.com
On Jul 19, 1:43 pm, Paul Rubin wrote: > "larry.mart...@gmail.com" writes: > > Thanks for the reply Paul. I had not heard of itertools. It sounds > > like just what I need for this. But I am having 1 issue - how do you > > know how many items are in each group? > > Simplest is: > > for key, grou

Re: Finding duplicate file names and modifying them based on elements of the path

2012-07-19 Thread larry.mart...@gmail.com
On Jul 19, 3:32 pm, MRAB wrote: > On 19/07/2012 20:06, larry.mart...@gmail.com wrote: > > On Jul 19, 1:02 pm, "Prasad, Ramit" wrote: > >> > > I am making the assumption that you intend to collapse the directory > >> > > tree and store each file in the same directory, otherwise I can

Re: Finding duplicate file names and modifying them based on elements of the path

2012-07-19 Thread larry.mart...@gmail.com
On Jul 19, 1:56 pm, Paul Rubin wrote: > "larry.mart...@gmail.com" writes: > > You can't do a len on the iterator that is returned from groupby, and > > I've tried to do something with imap or defaultdict, but I'm not > > getting anywhere. I guess I can just make 2 passes through the data, >

Re: Finding duplicate file names and modifying them based on elements of the path

2012-07-19 Thread MRAB
On 19/07/2012 20:06, larry.mart...@gmail.com wrote: On Jul 19, 1:02 pm, "Prasad, Ramit" wrote: > > I am making the assumption that you intend to collapse the directory > > tree and store each file in the same directory, otherwise I can't think > > of why you need to do this. > Hi Simon, thanks

Re: Finding duplicate file names and modifying them based on elements of the path

2012-07-19 Thread Paul Rubin
"larry.mart...@gmail.com" writes: > You can't do a len on the iterator that is returned from groupby, and > I've tried to do something with imap or defaultdict, but I'm not > getting anywhere. I guess I can just make 2 passes through the data, > the first time getting counts. Or am I missing

Re: Finding duplicate file names and modifying them based on elements of the path

2012-07-19 Thread Paul Rubin
"larry.mart...@gmail.com" writes: > Thanks for the reply Paul. I had not heard of itertools. It sounds > like just what I need for this. But I am having 1 issue - how do you > know how many items are in each group? Simplest is: for key, group in groupby(xs, lambda x:(x[-1],x[4],x[5])): gs
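
The snippet above is cut off, so the following is only a sketch of one common way to get group sizes, not a reconstruction of the code in the message. It assumes hypothetical (name, category) records rather than the path tuples used in the thread: each group is materialized into a list once, and len is taken on that list.

```python
from itertools import groupby

# Hypothetical records: (file name, category) pairs, not the thread's real data.
records = [("a.txt", "x"), ("c.txt", "y"), ("b.txt", "x")]

# groupby only groups adjacent items, so sort by the same key first.
records.sort(key=lambda r: r[1])

sizes = {}
for key, group in groupby(records, key=lambda r: r[1]):
    members = list(group)        # materialize once, then reuse the list
    sizes[key] = len(members)    # len() works on the list, not on the iterator

print(sizes)
```

This keeps one group in memory at a time, which is usually acceptable unless a single group is very large.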

Re: Finding duplicate file names and modifying them based on elements of the path

2012-07-19 Thread larry.mart...@gmail.com
On Jul 18, 4:49 pm, Paul Rubin wrote: > "larry.mart...@gmail.com" writes: > > I have an interesting problem I'm trying to solve. I have a solution > > almost working, but it's super ugly, and know there has to be a > > better, cleaner way to do it. ... > > > My solution involves multiple maps and

Re: Finding duplicate file names and modifying them based on elements of the path

2012-07-19 Thread larry.mart...@gmail.com
On Jul 19, 1:02 pm, "Prasad, Ramit" wrote: > > > I am making the assumption that you intend to collapse the directory > > > tree and store each file in the same directory, otherwise I can't think > > > of why you need to do this. > > > Hi Simon, thanks for the reply. It's not quite this - what I a

RE: Finding duplicate file names and modifying them based on elements of the path

2012-07-19 Thread Prasad, Ramit
> > I am making the assumption that you intend to collapse the directory > > tree and store each file in the same directory, otherwise I can't think > > of why you need to do this. > > Hi Simon, thanks for the reply. It's not quite this - what I am doing > is creating a zip file with relative path

Re: Finding duplicate file names and modifying them based on elements of the path

2012-07-19 Thread larry.mart...@gmail.com
On Jul 18, 4:49 pm, Paul Rubin wrote: > "larry.mart...@gmail.com" writes: > > I have an interesting problem I'm trying to solve. I have a solution > > almost working, but it's super ugly, and know there has to be a > > better, cleaner way to do it. ... > > > My solution involves multiple maps and

Re: Finding duplicate file names and modifying them based on elements of the path

2012-07-19 Thread larry.mart...@gmail.com
On Jul 18, 6:36 pm, Simon Cropper wrote: > On 19/07/12 08:20, larry.mart...@gmail.com wrote: > > I have an interesting problem I'm trying to solve. I have a solution > > almost working, but it's super ugly, and know there has to be a > > better, cleaner way to do it. > > > I have

Re: Finding duplicate file names and modifying them based on elements of the path

2012-07-18 Thread Simon Cropper
On 19/07/12 08:20, larry.mart...@gmail.com wrote: I have an interesting problem I'm trying to solve. I have a solution almost working, but it's super ugly, and know there has to be a better, cleaner way to do it. I have a list of path names that have this form: /dir0/dir1/dir2/dir3/dir4/dir5/di

Re: Finding duplicate file names and modifying them based on elements of the path

2012-07-18 Thread Paul Rubin
"larry.mart...@gmail.com" writes: > I have an interesting problem I'm trying to solve. I have a solution > almost working, but it's super ugly, and know there has to be a > better, cleaner way to do it. ... > > My solution involves multiple maps and multiple iterations through the > data. How woul

Finding duplicate file names and modifying them based on elements of the path

2012-07-18 Thread larry.mart...@gmail.com
I have an interesting problem I'm trying to solve. I have a solution almost working, but it's super ugly, and know there has to be a better, cleaner way to do it. I have a list of path names that have this form: /dir0/dir1/dir2/dir3/dir4/dir5/dir6/file I need to find all the file names (basename
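
The problem statement is truncated here, so this is only a sketch of the first step it mentions: finding file names that occur more than once in a list of paths of the stated form. The sample paths, file names, and variable names are invented for illustration, and the renaming rule is not attempted because it is not visible in the snippet.

```python
import posixpath
from collections import defaultdict

# Invented sample paths following the /dir0/.../dir6/file form from the post.
paths = [
    "/dir0/dir1/dir2/dir3/dir4/dir5/dir6/report.txt",
    "/dir0/dir1/dir2/dir3/dir4/other5/dir6/report.txt",
    "/dir0/dir1/dir2/dir3/dir4/dir5/dir6/notes.txt",
]

# Map each base name to every full path that ends with it.
by_name = defaultdict(list)
for p in paths:
    by_name[posixpath.basename(p)].append(p)

# Keep only the names that appear more than once.
duplicates = {name: ps for name, ps in by_name.items() if len(ps) > 1}

print(duplicates)
```

A single pass with a dictionary avoids sorting the data, unlike a groupby-based approach, at the cost of holding every path in memory.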