On 22/08/12 20:28, Pete O'Connell wrote:
Hi. The next step for me to parse the file as I want to is to change lines that look like this:

    f 21/21/21 22/22/22 24/24/23 23/23/24

into lines that look like this:

    f 21 22 23 24
In English, what is the rule you are applying here? My guess is: "Given three numbers separated by slashes, ignore the first two numbers and keep the third." E.g. "17/25/97" => 97. Am I close?
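If that guess is right, it can be checked against the sample line from the question. This is only a sketch: the name keep_last is made up for illustration, and it assumes terms are separated by whitespace.

```python
# Hypothetical check of the guessed rule against the sample line from the question.
def keep_last(term):
    # "17/25/97" -> "97"; a term with no slash, such as "f", comes back unchanged.
    return term.split("/")[-1]

before = "f 21/21/21 22/22/22 24/24/23 23/23/24"
after = " ".join(keep_last(term) for term in before.split())
print(after)  # prints: f 21 22 23 24
```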
Below is my terribly slow loop for doing this. Any suggestions about how to make this code more efficient would be greatly appreciated.
What makes you say it is "terribly slow"? Perhaps it is as fast as it could be under the circumstances. (Maybe it takes a long time because you have a lot of data, not because it is slow.)

The first lesson of programming is not to be too concerned about speed until your program is correct. Like most such guidelines, this is not entirely true -- you don't want to write code which is unnecessarily slow. But the question you should be asking is, "is it fast enough?" rather than "is it fast?".

Also, the sad truth is that Python tends to be slower than some other languages. (It's also faster than some others.) But the general process is:

1) write something that works correctly;

2) if it is too slow, try to speed it up in Python;

3) if that's still too slow, try using something like Cython or PyPy;

4) if all else fails, now that you have a working prototype, re-write it in C, Java, Lisp or Haskell.

Once they see how much more work is involved in writing fast C code, most people decide that "fast enough" is fast enough :)
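To decide whether something is fast enough, measure it rather than guess. Here is a rough sketch using the standard library's timeit module. The convert function and the sample data are made up for illustration; substitute your own code and a realistic amount of data.

```python
import timeit

def convert(line):
    """Hypothetical stand-in for the conversion being timed."""
    return " ".join(term.split("/")[-1] for term in line.split())

# Made-up data: 10,000 copies of the sample line from the question.
sample = ["f 21/21/21 22/22/22 24/24/23 23/23/24"] * 10000

# Run 10 passes over the data and report the average time per pass.
elapsed = timeit.timeit(lambda: [convert(line) for line in sample], number=10)
print("average seconds per pass:", elapsed / 10)
```

The number you get depends on your machine, so compare it with how long you are prepared to wait, not with anybody else's figures.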
with open(fileName) as lines:
    theGoodLines = [line.strip("\n") for line in lines
                    if "vn" not in line and "vt" not in line and line != "\n"]
I prefer to write code in chains of filters.

with open(fileName) as lines:
    # get rid of leading and trailing whitespace, including newlines
    lines = (line.strip() for line in lines)
    # ignore blanks
    lines = (line for line in lines if line)
    # ignore lines containing "vn" or "vt"
    theGoodLines = [line for line in lines if not ("vn" in line or "vt" in line)]

Note that only the last step is a list comprehension using [ ], the others are generator expressions using ( ) instead.

Will the above be faster than your version? I have no idea. But I think it is more readable and understandable. Some people might disagree.
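To try the chain of filters without touching your real file, here it is spelled out in full against a few made-up lines. The io.StringIO object and its contents are stand-ins I invented for illustration, and good_lines plays the role of theGoodLines.

```python
import io

# Made-up sample data standing in for the real file.
fake_file = io.StringIO(
    "v 1.0 2.0 3.0\n"
    "vn 0.0 1.0 0.0\n"
    "\n"
    "vt 0.5 0.5\n"
    "f 21/21/21 22/22/22 24/24/23 23/23/24\n"
)

# Each generator expression wraps the previous one; nothing is read
# until the list comprehension at the end pulls lines through the chain.
lines = (line.strip() for line in fake_file)
lines = (line for line in lines if line)
good_lines = [line for line in lines if not ("vn" in line or "vt" in line)]

print(good_lines)
# prints: ['v 1.0 2.0 3.0', 'f 21/21/21 22/22/22 24/24/23 23/23/24']
```

Because the generators are lazy, the final list has to be built while the file is still open, which is why that step sits inside the with block.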
for i in range(len(theGoodLines)):
    if theGoodLines[i][0] == "f":
        aGoodLineAsList = theGoodLines[i].split(" ")
        theGoodLines[i] = (aGoodLineAsList[0] + " "
                           + aGoodLineAsList[1].split("/")[-1] + " "
                           + aGoodLineAsList[2].split("/")[-1] + " "
                           + aGoodLineAsList[3].split("/")[-1] + " "
                           + aGoodLineAsList[4].split("/")[-1])
Start with a helper function:

def extract_last_item(term):
    """Extract the last item from a term like a/b/c"""
    return term.split("/")[-1]

for i, line in enumerate(theGoodLines):
    if line[0] == "f":
        terms = line.split()
        theGoodLines[i] = " ".join([extract_last_item(t) for t in terms])

See how you go with that.

-- 
Steven