On 19/07/12 08:20, larry.mart...@gmail.com wrote:
I have an interesting problem I'm trying to solve. I have a solution
almost working, but it's super ugly, and I know there has to be a
better, cleaner way to do it.
I have a list of path names that have this form:
/dir0/dir1/dir2/dir3/dir4/dir5/dir6/file
I need to find all the file names (basenames) in the list that are
duplicates, and for each one that is a dup, prepend dir4 to the
filename as long as the dir4/file pair is unique. If there are
multiple dir4/files in the list, then I also need to add a sequence
number based on the sorted value of dir5 (which is a date in ddMONyy
format).
For example, if my list contains:
/dir0/dir1/dir2/dir3/qwer/09Jan12/dir6/file3
/dir0/dir1/dir2/dir3/abcd/08Jan12/dir6/file1
/dir0/dir1/dir2/dir3/abcd/08Jan12/dir6/file2
/dir0/dir1/dir2/dir3/xyz/08Jan12/dir6/file1
/dir0/dir1/dir2/dir3/qwer/07Jan12/dir6/file3
Then I want to end up with:
/dir0/dir1/dir2/dir3/qwer/09Jan12/dir6/qwer_01_file3
/dir0/dir1/dir2/dir3/abcd/08Jan12/dir6/abcd_file1
/dir0/dir1/dir2/dir3/abcd/08Jan12/dir6/file2
/dir0/dir1/dir2/dir3/xyz/08Jan12/dir6/xyz_file1
/dir0/dir1/dir2/dir3/qwer/07Jan12/dir6/qwer_00_file3
My solution involves multiple maps and multiple iterations through the
data. How would you folks do this?
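For reference, the renaming rule described above can be sketched in
Python with one pass to count and one pass to rename. This is only a
sketch under stated assumptions: the function name rename_duplicates is
made up for illustration, every path is assumed to have exactly the
layout shown (dir4 and dir5 as the fifth and sixth components), and
dir5 is assumed to parse with the strptime format "%d%b%y" using
English month abbreviations.

```python
from collections import defaultdict
from datetime import datetime


def rename_duplicates(paths):
    """Return new paths with duplicated basenames renamed (hypothetical helper)."""
    records = []
    name_count = defaultdict(int)    # basename -> number of occurrences
    pair_dates = defaultdict(list)   # (dir4, basename) -> dates seen in dir5

    # First pass: split each path and collect the counts.
    for path in paths:
        parts = path.split("/")      # parts[0] is '' because of the leading '/'
        dir4, dir5, name = parts[5], parts[6], parts[-1]
        date = datetime.strptime(dir5, "%d%b%y")   # assumes ddMONyy, e.g. 09Jan12
        records.append((parts, dir4, date, name))
        name_count[name] += 1
        pair_dates[(dir4, name)].append(date)

    # Second pass: build the new names, keeping the input order.
    result = []
    for parts, dir4, date, name in records:
        new_name = name
        if name_count[name] > 1:
            dates = sorted(pair_dates[(dir4, name)])
            if len(dates) > 1:
                # Same dir4/file more than once: number by sorted dir5 date.
                new_name = "{}_{:02d}_{}".format(dir4, dates.index(date), name)
            else:
                new_name = "{}_{}".format(dir4, name)
        result.append("/".join(parts[:-1] + [new_name]))
    return result
```

Two entries that share dir4, dir5 and the file name would receive the
same sequence number here; the description above does not say what
should happen in that case, so the sketch does not handle it.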
Hi Larry,
I am assuming that you intend to collapse the directory tree and store
every file in the same directory; otherwise I can't think of why you
would need to do this.
If this is the case, then I would...
1. import all the files into an array
2. parse each path to extract the fourth-level directory name (dir4) and the base name.
3. iterate through the array again
3.1 check if base filename exists in recipient directory
3.2 if not, copy to recipient directory
3.3 if present, add the directory name to the filename, then save
3.4 create log of success or failure
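
The steps above might look roughly like this in Python. It is a rough
sketch, not a tested tool: the function name collapse and the logger
name are invented for illustration, the input is assumed to be a list
of existing file paths, and dir4 is assumed to be the fourth path
component counting from the end (dir4/dir5/dir6/file).

```python
import logging
import os
import shutil

log = logging.getLogger("collapse")   # hypothetical logger name


def collapse(paths, dest_dir):
    """Copy each file into dest_dir, prefixing dir4 when the name is taken."""
    os.makedirs(dest_dir, exist_ok=True)
    results = []                                      # (source, target or None)
    for src in paths:                                 # steps 1 and 3
        parts = os.path.normpath(src).split(os.sep)   # step 2
        dir4, name = parts[-4], parts[-1]             # assumes .../dir4/dir5/dir6/file

        target = os.path.join(dest_dir, name)         # step 3.1
        if os.path.exists(target):                    # step 3.3
            target = os.path.join(dest_dir, dir4 + "_" + name)

        if os.path.exists(target):
            # Even the prefixed name is taken: record the failure, do not overwrite.
            log.error("skipped %s: %s already exists", src, target)
            results.append((src, None))
            continue

        try:
            shutil.copy2(src, target)                 # step 3.2
        except OSError as exc:
            log.error("failed to copy %s: %s", src, exc)    # step 3.4
            results.append((src, None))
        else:
            log.info("copied %s -> %s", src, target)        # step 3.4
            results.append((src, target))
    return results
```

Note that the first file with a given base name keeps its plain name and
only later arrivals get the prefix, so the result depends on the order
of the list.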
Personally, I would not name some files abcd_file1 and leave others as
file2. If it is important enough to store a file in a separate
directory, you should also record where file2 came from. Otherwise, when
you look at your results at a later date, you will have to open file2
(which I presume records what it relates to) to figure out where it came
from. If the origin is in the name, it is easier to review.
In short, consistency is the name of the game: if you are going to do it
for some, then do it for all. It will also be easier for others to work
out later what you have done.
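
As a small illustration of the "do it for all" idea, a naming helper
could prefix every file with its dir4 name, whether or not the base name
is duplicated. The function name prefixed_name is made up, and the
sketch assumes the dir4/dir5/dir6/file layout from the example paths.

```python
def prefixed_name(path):
    """Return 'dir4_file' for a path ending in dir4/dir5/dir6/file."""
    parts = path.split("/")
    # dir4 is the fourth component from the end, the file name is the last.
    return "{}_{}".format(parts[-4], parts[-1])
```

With this, file2 from abcd becomes abcd_file2. It does not by itself
separate two files with the same dir4 and file name (such as the two
qwer/file3 entries); those would still need the date or a sequence
number in the name.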
--
Cheers Simon
Simon Cropper - Open Content Creator
Free and Open Source Software Workflow Guides
------------------------------------------------------------
Introduction http://www.fossworkflowguides.com
GIS Packages http://www.fossworkflowguides.com/gis
bash / Python http://www.fossworkflowguides.com/scripting
--
http://mail.python.org/mailman/listinfo/python-list