On Nov 7, 9:47 am, Chris Rebert <c...@rebertia.com> wrote:
> On Sun, Nov 7, 2010 at 9:34 AM, chad <cdal...@gmail.com> wrote:
> > <snip>
> >
> > #!/usr/local/bin/python
> >
> > import sys
> >
> > def construct_set(data):
> >     for line in data:
> >         lines = line.splitlines()
> >         for curline in lines:
> >             if curline.strip():
> >                 key = curline.split(' ')
> >                 value = int(key[0])
> >                 yield value
> >
> > def approximate(first, second):
> >     midpoint = (first + second) / 2
> >     return midpoint
> >
> > def format(input):
> >     prev = 0
> >     value = int(input)
> >
> >     with open("/home/cdalten/oakland/freq") as f:
> >         for next in construct_set(f):
> >             if value > prev:
> >                 current = prev
> >                 prev = next
> >
> >         middle = approximate(current, prev)
> >         if middle < prev and value > middle:
> >             return prev
> >         elif value > current and current < middle:
> >             return current
> <snip>
> > The question is about the construct_set() function.
> <snip>
> > I have it yield on 'value' instead of 'curline'. Will the program
> > still read the input file named freq line by line even though I don't
> > have it yielding on 'curline'? Or since I have it yield on 'value',
> > will it read the entire input file into memory at once?
>
> The former. The yield has no effect at all on how the file is read.
> The "for line in data:" iteration over the file object is what makes
> Python read from the file line-by-line. Incidentally, the use of
> splitlines() is pointless; you're already getting single lines from
> the file object by iterating over it, so splitlines() will always
> return a single-element list.
>
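
To make the point about splitlines() concrete, here is a minimal sketch of construct_set() with that call dropped. The sample data and the io.StringIO stand-in are assumptions for illustration only, since the real contents of the freq file aren't shown in this thread, and the snippet is written for Python 3 rather than the interpreter used in the original script.

```python
import io


def construct_set(data):
    # Iterating over a file-like object hands back one line per step,
    # so there is no need to call splitlines() on each line.
    for line in data:
        if line.strip():  # skip blank lines
            key = line.split(' ')
            # The first space-separated field is taken as the integer value.
            yield int(key[0])


# Hypothetical stand-in for the freq file (made-up contents).
sample = io.StringIO("100 a\n\n200 b\n300 c\n")

values = construct_set(sample)
first = next(values)   # pulls lines only until the first value is produced
rest = list(values)    # consumes the remaining lines on demand

print(first)
print(rest)
```

The generator only asks the file object for another line when the caller asks the generator for another value, whichever variable ends up in the yield.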
But what happens if the input file is, say, 250MB? Will all 250MB be
loaded into memory at once? Just curious, because I thought maybe using
something like 'yield curline' would prevent this scenario.