[EMAIL PROTECTED] wrote:

> I have a simple assignment for school but am unsure where to go. The
> assignment is to read in a text file, split out the words and say which
> line each word appears in alphabetical order. I have the basic outline
> of the program done which is:

looks like an excellent start to me.

> def Xref(filename):
>     try:
>         fp = open(filename, "r")
>         lines = fp.readlines()
>         fp.close()
>     except:
>         raise "Couldn't read input file \"%s\"" % filename
>     dict = {}
>     for line_num in xrange(len(lines)):
>         if lines[line_num] == "": continue
>         words = lines[line_num].split()
>         for word in words:
>             if not dict.has_key(word):
>                 dict[word] = []
>             if line_num+1 not in dict[word]:
>                 dict[word].append(line_num+1)
>     return dict
>
> My question is, how do I easily parse out punctuation marks

it depends a bit on how you define the term "word". if you're using
regular text, with a limited set of punctuation characters, you can
simply do e.g.

    word = word.strip(".,!?:;")
    if not word:
        continue

inside the "for word" loop. this only removes those characters from the
start and end of each word, not from the middle, but that's probably
good enough for your task.

another, slightly more advanced approach is to use regular expressions,
such as re.findall(r"\w+", line), to get a list of all alphanumeric
"words" in the text. that has drawbacks of its own (e.g. it'll split up
words like "couldn't" and "cross-reference", unless you tweak the
regexp), and is probably overkill.

> and how do I sort the list

how to sort the dictionary when printing the cross-reference, you mean?
just use "sorted" on the dictionary; that'll get you a sorted list of
the keys.

    sorted(dict)

to avoid duplicates and simplify sorting, you probably want to normalize
the case of the words you add to the dictionary, e.g. by converting all
words to lowercase.

> if there anything else that I am doing wrong in this code

there are plenty of things that can be tweaked and tuned and written in
a slightly shorter way by an experienced Python programmer, but assuming
that this is a general programming assignment, I don't see anything
seriously "wrong" in your code (just make sure you test it on a file
that doesn't exist before you hand it in).

</F>
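
A minimal Python 3 sketch of the suggestions above: strip a small set of punctuation characters, lowercase each word, and sort the keys when printing. It takes a list of lines rather than a filename, so file handling is left out. The names xref, format_xref, PUNCTUATION and WORD_RE are made up for illustration, and the regular expression is only one possible tweak of \w+ that keeps "couldn't" and "cross-reference" whole.

```python
import re

# Characters stripped from both ends of each word (same set as in the reply).
PUNCTUATION = ".,!?:;"

# Assumed tweak of \w+: allow an apostrophe or hyphen between runs of word
# characters, so "couldn't" and "cross-reference" stay in one piece.
WORD_RE = re.compile(r"\w+(?:['-]\w+)*")


def xref(lines, use_regex=False):
    """Map each lowercased word to the list of line numbers it appears on."""
    index = {}
    for line_num, line in enumerate(lines, 1):  # line numbers start at 1
        if use_regex:
            words = WORD_RE.findall(line)
        else:
            words = [w.strip(PUNCTUATION) for w in line.split()]
        for word in words:
            word = word.lower()  # normalize case to avoid duplicate entries
            if not word:
                continue  # the token consisted of punctuation only
            numbers = index.setdefault(word, [])
            if line_num not in numbers:
                numbers.append(line_num)
    return index


def format_xref(index):
    """Return one 'word: n, n, ...' string per word, in alphabetical order."""
    return ["%s: %s" % (word, ", ".join(map(str, index[word])))
            for word in sorted(index)]


sample = ["The cat sat.", "Couldn't the cat cross-reference?", "Sat, the end!"]
for row in format_xref(xref(sample)):
    print(row)
```

On these three sample lines both tokenizers build the same index. On other text they can differ, since the regex version also drops characters such as quotes and parentheses that strip(".,!?:;") leaves in place.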