Lipska the Kat wrote:

> Greetings Pythoners
>
> A short while back I posted a message that described a task I had set
> myself. I wanted to implement the following bash shell script in Python
>
> Here's the script
>
> sort -nr $1 | head -${2:-10}
>
> this script takes a filename and an optional number of lines to display
> and sorts the lines in numerical order, printing them to standard out.
> if no optional number of lines are input the script prints 10 lines
>
> Here's the file.
>
> 50 Parrots
> 12 Storage Jars
> 6 Lemon Currys
> 2 Pythons
> 14 Spam Fritters
> 23 Flying Circuses
> 1 Meaning Of Life
> 123 Holy Grails
> 76 Secret Policemans Balls
> 8 Something Completely Differents
> 12 Lives of Brian
> 49 Spatulas
>
>
> ... and here's my very first attempt at a Python program
> I'd be interested to know what you think, you can't hurt my feelings
> just be brutal (but fair). There is very little error checking as you
> can see and I'm sure you can crash the program easily.
> 'Better' implementations most welcome
> #! /usr/bin/env python3.2
>
> import fileinput
> from sys import argv
> from operator import itemgetter
>
> l=[]
> t = tuple
> filename=argv[1]
> lineCount=10
>
> with fileinput.input(files=(filename)) as f:

Note that (filename) is not a tuple, just a string surrounded by
superfluous parens.

    >>> filename = "foo.bar"
    >>> (filename)
    'foo.bar'
    >>> (filename,)
    ('foo.bar',)
    >>> filename,
    ('foo.bar',)

You are lucky that FileInput() tests if its files argument is just a
single string.

>     for line in f:
>         t=(line.split('\t'))
>         t[0]=int(t[0])
>         l.append(t)
> l=sorted(l, key=itemgetter(0))
>
> try:
>     inCount = int(argv[2])
>     lineCount = inCount
> except IndexError:
>     #just catch the error and continue
>     None
>
> for c in range(lineCount):
>     t=l[c]
>     print(t[0], t[1], sep='\t', end='')
>

I prefer a more structured approach even for such a tiny program:

- process all commandline args
- read data
- sort
- clip extra lines
- write data

I'd break it into these functions:

def get_commandline_args():
    """Recommended library: argparse.
       Its FileType can deal with stdin/stdout.
    """

def get_quantity(line):
    return int(line.split("\t", 1)[0])

def sorted_by_quantity(lines):
    """Leaves the lines intact, so you don't have
       to reassemble them later on."""
    return sorted(lines, key=get_quantity)

def head(lines, count):
    """Have a look at itertools.islice() for a more
       general approach"""
    return lines[:count]

if __name__ == "__main__":
    # protecting the script body allows you to import
    # the script as a library into other programs
    # and reuse its functions and classes.
    # Also: play nice with pydoc. Try
    # $ python -m pydoc -w ./yourscript.py

    args = get_commandline_args()
    with args.infile as f:
        lines = sorted_by_quantity(f)
    with args.outfile as f:
        f.writelines(head(lines, args.line_count))

Note that if you want to handle large files gracefully you need to
recombine sorted_by_quantity() and head() (have a look at
heapq.nsmallest(), which was already mentioned in the other thread).
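
As a rough sketch of how the skeleton above might be filled in:
get_commandline_args() is built on argparse with FileType, as
suggested, and sorting and clipping are merged into one
heapq.nsmallest() call so that only `count` lines are kept around.
The layout of the command line, the -o/--outfile flag, and the names
head_by_quantity() and main() are assumptions made for illustration,
not part of the original script. Like the code above, it returns the
smallest quantities first; heapq.nlargest() would give the descending
order of `sort -nr`.

```python
import argparse
import heapq
import sys


def get_commandline_args(argv=None):
    """Parse infile, optional line count and optional outfile.

    The attribute names (infile, line_count, outfile) follow the
    skeleton; the command line layout itself is an assumption.
    """
    parser = argparse.ArgumentParser(
        description="Print the lines with the smallest leading number.")
    parser.add_argument("infile", type=argparse.FileType("r"),
                        help="tab-separated input file, '-' for stdin")
    parser.add_argument("line_count", nargs="?", type=int, default=10,
                        help="number of lines to print (default: 10)")
    parser.add_argument("-o", "--outfile", type=argparse.FileType("w"),
                        default=sys.stdout,
                        help="output file (default: stdout)")
    return parser.parse_args(argv)


def get_quantity(line):
    """Return the integer in the first tab-separated column."""
    return int(line.split("\t", 1)[0])


def head_by_quantity(lines, count):
    """Sort and clip in one step; accepts any iterable of lines."""
    return heapq.nsmallest(count, lines, key=get_quantity)


def main(argv=None):
    args = get_commandline_args(argv)
    with args.infile as f:
        lines = head_by_quantity(f, args.line_count)
    args.outfile.writelines(lines)
    # Close a real output file, but leave sys.stdout open.
    if args.outfile is not sys.stdout:
        args.outfile.close()
```

In a script, main() would be called under the usual
`if __name__ == "__main__":` guard, e.g. as
`python yourscript.py data.txt 5`.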