Gabriel Genellina wrote:
On Wed, 25 Jul 2007 19:14:28 -0300, James Stroud <jstr...@mbi.ucla.edu> wrote:
Daniel Nogradi wrote:
A very simple question: I currently use a cumbersome-looking way of
getting the first, second, etc. line of a text file:
to_get = [0, 3, 7, 11, 13]
got = dict((i,s) for (i,s) in enumerate(open(textfile)) if i in to_get)
print got[3]
This would probably be the best way for really big files and if you know
all of the lines you want ahead of time.
But it still has to read the complete file (although it does not keep the unwanted lines). Combining this with Paul Rubin's suggestion of itertools.islice, I think we get the best solution:
    got = dict((i,s) for (i,s) in enumerate(islice(open(textfile),max(to_get)+1)) if i in to_get)

or even faster:
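    from itertools import islice

    wanted = set([0, 3, 7, 11, 13])
    first, last = min(wanted), max(wanted)
    with open(textfile) as src:
        # islice skips the lines before `first`, so start the count there
        got = dict((i, s) for (i, s) in enumerate(
                       islice(src, first, last + 1), first)
                   if i in wanted)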
Of course that could just as efficiently create a list as a dict.
Note that using a list rather than a set for wanted takes len(wanted)
comparisons on misses, and on average about len(wanted)/2 on hits,
whereas a set (like a dict) most likely needs a single comparison
whether it is a hit or a miss.
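
For anyone who wants to try this out, here is a minimal sketch of the same idea wrapped in functions. The names pick_lines and pick_lines_as_list are made up for illustration and do not come from the thread; it also assumes 0-based line numbers and a Python recent enough that enumerate accepts a start argument (2.6 or later).

```python
from itertools import islice


def pick_lines(path, wanted):
    """Return {line_number: line} for the 0-based line numbers in wanted.

    Hypothetical helper, not part of any library.
    """
    wanted = set(wanted)  # set membership: roughly one comparison per test
    if not wanted:
        return {}
    first, last = min(wanted), max(wanted)
    with open(path) as src:
        # islice skips the lines before `first` and stops after `last`;
        # enumerate starts counting at `first` so keys are real line numbers.
        window = enumerate(islice(src, first, last + 1), first)
        return dict((i, s) for (i, s) in window if i in wanted)


def pick_lines_as_list(path, wanted):
    """Same selection, returned as a list of lines in file order."""
    got = pick_lines(path, wanted)
    return [got[i] for i in sorted(got)]
```

Line numbers past the end of the file are simply absent from the result, since islice stops quietly when the file runs out.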

--Scott David Daniels
scott.dani...@acm.org

--
http://mail.python.org/mailman/listinfo/python-list