On Feb 11, 3:39 pm, Jeremy <jlcon...@gmail.com> wrote:
> I have been using Python for several years now and have never run into
> memory errors… until now.
Yes, Python does a good job of making memory errors the least of your
worries as a programmer. Maybe it's doing too good a job...

> My Python program now consumes over 2 GB of memory and then I get a
> MemoryError. I know I am reading lots of files into memory, but not
> 2GB worth.

Do a quick calculation: how much are you leaving around after you read
in a file? Do you create an object for each line? What does that object
have associated with it? You may find that you have some strange O(N^2)
behavior regarding memory here. Oftentimes people forget that you have
to evaluate how your algorithm will run in time *and* memory.

> I thought I didn't have to worry about memory allocation
> in Python because of the garbage collector.

While it's not the #1 concern, you still have to keep track of how you
are using memory and try not to be wasteful. Use good algorithms, let
things fall out of scope, etc...

> 1. When I pass a variable to the constructor of a class does it
> copy that variable or is it just a reference/pointer? I was under the
> impression that it was just a pointer to the data.

For objects, no copy is made until you explicitly make one. That's the
general rule, and even though it isn't always correct, it is correct
enough.

> 2. When do I need to manually allocate/deallocate memory and when
> can I trust Python to take care of it?

Let things fall out of scope. If you're concerned, use `del` (Python has
no `delete`). Try to avoid using the global namespace for everything,
and try to keep your lists and dicts small.

> 3. Any good practice suggestions?

Don't read in the entire file and then process it. Try to do line-by-
line processing. Figure out what your algorithm is doing in terms of
time *and* memory. You likely have something O(N^2) or worse in memory
usage.

Don't use Python variables to store data long-term. Instead, set up a
database or a file and use that. I'd first look at using a file, then
SQLite, and then a full-fledged database like PostgreSQL.
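To illustrate the reference/pointer answer to question 1, here's a
minimal sketch (the `Holder` class is hypothetical, just for
demonstration): the constructor argument is bound by reference, and no
copy exists until you make one with the `copy` module.

```python
import copy

class Holder:
    def __init__(self, data):
        self.data = data  # binds a reference; no copy is made

items = [1, 2, 3]
h = Holder(items)
print(h.data is items)   # True: both names refer to the same list

items.append(4)
print(h.data)            # [1, 2, 3, 4]: the mutation is visible via h

h2 = Holder(copy.copy(items))  # explicit shallow copy when you want one
print(h2.data is items)  # False: now it really is a separate object
```

The same rule applies to plain assignment and to function arguments in
general, which is why a "copy" you never asked for never costs you
memory.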
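The line-by-line suggestion looks like this in practice (the function
name and the 80-character threshold are made up for the example):
iterating over the file object streams one line at a time, so memory
stays flat no matter how big the file is.

```python
def count_long_lines(path, threshold=80):
    """Count lines longer than `threshold` without loading the file."""
    long_lines = 0
    with open(path) as f:
        for line in f:  # reads one line at a time
            if len(line.rstrip("\n")) > threshold:
                long_lines += 1
    return long_lines

# The wasteful version holds every line alive at once:
#   lines = open(path).readlines()   # whole file in memory
```

If you genuinely need random access to the data, that's the point at
which a file-backed store or SQLite beats a giant in-memory list.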
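And a rough sketch of the SQLite suggestion, using the stdlib `sqlite3`
module (the table layout here is invented for the example; use a real
file path instead of `:memory:` to get the data out of your process):

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # a file path keeps data on disk
conn.execute("CREATE TABLE lines (lineno INTEGER, text TEXT)")

# Rows live in the database, not in a Python list that grows forever.
conn.executemany(
    "INSERT INTO lines VALUES (?, ?)",
    enumerate(["alpha", "beta", "gamma"]),
)
conn.commit()

n = conn.execute("SELECT COUNT(*) FROM lines").fetchone()[0]
print(n)  # 3
```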
Don't write processes that sit around for a long time unless you also
check whether the process grows in size as it runs. If it does, you need
to figure out why and stop that memory leak.

Simpler code uses less memory, not just because it is smaller, but
because you are not copying and moving data all over the place. See what
you can do to simplify your code. Maybe you'll expose the nasty O(N^2)
behavior.
--
http://mail.python.org/mailman/listinfo/python-list