On Tue, 22 Mar 2016 10:05 pm, BartC wrote:

> On 22/03/2016 01:01, Steven D'Aprano wrote:
>> On Tue, 22 Mar 2016 06:43 am, BartC wrote:
>>
>>> This code was adapted from a program that used:
>>>
>>>     readstrfile(filename)
>>>
>>> which either returned the contents of the file as a string, or 0.
>>
>> What an interesting function. And I don't mean that in a good way.
>>
>> So if it returns 0, how do you know what the problem is? Mistyped file
>> name? Permission denied? File doesn't actually exist? Disk corruption and
>> you can't open the file? Some weird OS problem where you can't *close*
>> the file? (That can actually happen, although it's never happened to me.)
>> How do you debug any problems, given only "0" as a result?
>>
>> What happens if you read (let's say) a 20GB Blu-ray disk image?
>
> I think you're making far too much of a throwaway function to grab a
> file off disk and into memory.
>
> But out of interest, how would /you/ write a function that takes a
> file-spec and turns it into an in-memory string? And what would its use
> look like?
I already told you. For a quick and dirty script where I didn't care much
about reliability, I would use:

    the_text = open(filename).read()

and leave it at that. There's a hierarchy of less- to more-reliable. Next
would be:

    with open(filename) as f:
        the_text = f.read()

which guarantees to close the file promptly. Better still would be to avoid
dealing with the entire file in one (potentially enormous) chunk, and
process it line by line:

    with open(filename) as f:
        for line in f:
            process(line)

If for some reason I *had* to process it as one big chunk of text, where I
knew that there was a chance that it could be bigger than what I could
comfortably hold in memory in one go, I would research mmap. But I don't
really know anything about how that works. I've been lucky enough to never
need to care.

Dealing with out-of-memory errors on modern OSes is one of the hardest
things to get right. In some ways, we're lucky, because the OS will try
really hard to give the illusion that you have an infinite amount of
memory. But the illusion is never perfect, and the abstraction of "virtual
memory plus real memory = infinite memory" can break down.

I once foolishly tried to create an *enormous* list, something like
[0]*10**100, and my OS very kindly started swapping applications in and
out of memory trying to free up 40 000 000 billion billion billion billion
billion billion billion billion billion petabytes of memory (estimated).

Not only did Python lock up, but so did the OS. I decided to leave it
overnight to see if it would recover, but 16 hours later it was still
locked up and frantically trying to swap memory. I'm not sure why the
OOM-Killer didn't trigger. I ended up having to do a hard power-down to
recover.

So virtual memory is a mixed blessing.

>> Pythonic code probably uses a lot of iterables:
>>
>>     for value in something:
>>         ...
>> in preference to Pascal code written in Python:
>>
>>     for index in range(len(something)):
>>         value = something[index]
>
> (Suppose you need both the value and its index in the loop? Then the
> one-line for above won't work. For example, 'something' is [10,20,30]
> and you want to print:
>
> 0: 10
> 1: 20
> 2: 30 )

for index, n in enumerate([10, 20, 30]):
    print(index, ":", n)

>> or worse:
>>
>>     index = 0
>>     while index < len(something):
>>         value = something[index]
>>         ...
>>         index += 1
>>
>> (I don't know where that while-loop idiom comes from. C? Assembly?
>> Penitent monks living in hair shirts in the desert and flogging
>> themselves with chains every single night to mortify the accursed flesh?
>> But I'm seeing it a lot in code written by beginners. I presume somebody,
>> or some book, is teaching it to them. "Learn Python The Hard Way"
>> perhaps?)
>
> Are you suggesting 'while' is not needed?

Of course not. Use while loops for when you need a while loop. But
*writing a for-loop using while* is an abuse of while.


-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list