Sent from my LG phone wrote:

>Today's Topics:
>   1. Re: Python use growing fast (Alice Bevan–McGregor)
>   2. Re: order of importing modules (Chris Rebert)
>   3. Re: How to Buffer Serialized Objects to Disk (MRAB)
>   4. Re: How to Buffer Serialized Objects to Disk (Chris Rebert)
>   5. Re: How to Buffer Serialized Objects to Disk (Peter Otten)
>   6. Re: Best way to automatically copy out attachments from an
>      email (Chris Rebert)
>   7. Re: Parsing string for "<verb> <noun>" (Aahz)
>   8. Re: Nested structures question (Tim Harig)
>   9. Re: How to Buffer Serialized Objects to Disk (Scott McCarty)
>On 2011-01-10 19:49:47 -0800, Roy Smith said:
>> One of the surprising (to me, anyway) uses of JavaScript is as the 
>> scripting language for MongoDB (
>I just wish they'd drop spidermonkey and go with V8 or another, faster 
>and more modern engine.  :(
>       - Alice.
>> Dan Stromberg wrote:
>>> On Tue, Jan 11, 2011 at 4:30 PM, Catherine Moroney
>>> <> wrote:
>>>> In what order does python import modules on a Linux system?  I have a
>>>> package that is both installed in /usr/lib64/python2.5/site-packages,
>>>> and a newer version of the same module in a working directory.
>>>> I want to import the version from the working directory, but when I
>>>> print module.__file__ in the interpreter after importing the module,
>>>> I get the version that's in site-packages.
>>>> I've played with the PYTHONPATH environmental variable by setting it
>>>> to just the path of the working directory, but when I import the module
>>>> I still pick up the version in site-packages.
>>>> /usr/lib64 is in my PATH variable, but doesn't appear anywhere else.  I
>>>> don't want to remove /usr/lib64 from my PATH because that will break
>>>> a lot of stuff.
>>>> Can I force python to import from my PYTHONPATH first, before looking
>>>> in the system directory?
>>> Please import sys and inspect sys.path; this defines the search path
>>> for imports.
>>> By looking at sys.path, you can see where in the search order your
>>> $PYTHONPATH is going.
>On Wed, Jan 12, 2011 at 11:07 AM, Catherine Moroney
><> wrote:
>> I've looked at my sys.path variable and I see that it has
>> a whole bunch of site-package directories, followed by the
>> contents of my $PYTHONPATH variable, followed by a list of
>> misc site-package variables (see below).
>> But, I'm curious as to where the first bunch of 'site-package'
>> entries come from.  The
>> /usr/lib64/python2.5/site-packages/pyhdfeos-1.0_r57_58-py2.5-linux-x86_64.egg
>> is not present in any of my environmental variables yet it shows up
>> as one of the first entries in sys.path.
>You probably have a .pth file somewhere that adds it (since it's an
>egg, probably site-packages/easy-install.pth).
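The check and the workaround from this thread can be sketched as follows. This is only an illustration written for Python 3; the directory `/path/to/working_dir` and the helper name `prepend_to_search_path` are placeholders, not anything from the original posts. Inserting at index 0 puts the directory ahead of every other entry, including those added by .pth files.

```python
import sys

def prepend_to_search_path(directory):
    """Put directory first in sys.path so imports find it before site-packages."""
    # Remove any existing entry so the directory is not listed twice.
    if directory in sys.path:
        sys.path.remove(directory)
    sys.path.insert(0, directory)

# Hypothetical location of the newer copy of the module.
working_dir = "/path/to/working_dir"
prepend_to_search_path(working_dir)

# Print the search order; entries are tried from top to bottom.
for position, entry in enumerate(sys.path):
    print(position, entry)
```

The insert has to run before the module is imported for the first time; printing module.__file__ afterwards confirms which copy was picked up.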
>On 12/01/2011 21:05, Scott McCarty wrote:
>> Sorry to ask this question. I have searched the list archives and googled,
>> but I don't even know what words to use to find what I am looking for. I am
>> just looking for a little kick in the right direction.
>> I have a Python based log analysis program called petit
>> ( I am trying to modify it to manage the
>> main object types to and from disk.
>> Essentially, I have one object which is a list of a bunch of "Entry"
>> objects. The Entry objects have date, time, date, etc fields which I use
>> for analysis techniques. At the very beginning I build up the list of
>> objects then would like to start pickling it while building to save
>> memory. I want to be able to process more entries than I have memory.
>> With a straight list it looks like I could build from xreadlines(), but
>> once you turn it into a more complex object, I don't quite know where to go.
>> I understand how to pickle the entire data structure, but I need
>> something that will manage the memory/disk allocation. Any thoughts?
>To me it sounds like you need to use a database.
>On Wed, Jan 12, 2011 at 1:05 PM, Scott McCarty <> wrote:
>> Sorry to ask this question. I have searched the list archives and googled, but
>> I don't even know what words to use to find what I am looking for. I am just
>> looking for a little kick in the right direction.
>> I have a Python based log analysis program called petit
>> ( I am trying to modify it to manage the main
>> object types to and from disk.
>> Essentially, I have one object which is a list of a bunch of "Entry"
>> objects. The Entry objects have date, time, date, etc fields which I use for
>> analysis techniques. At the very beginning I build up the list of objects
>> then would like to start pickling it while building to save memory. I want
>> to be able to process more entries than I have memory. With a straight list it
>> looks like I could build from xreadlines(), but once you turn it into a more
>> complex object, I don't quite know where to go.
>> I understand how to pickle the entire data structure, but I need something
>> that will manage the memory/disk allocation. Any thoughts?
>You could subclass `list` and use sys.getsizeof() to keep track
>of the size of the elements, and then start pickling them to disk once
>the total size reaches some preset limit.
>But like MRAB said, using a proper database, e.g. SQLite, wouldn't be a
>bad idea.
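A rough sketch of the size-tracking idea, written for Python 3. It wraps a list rather than subclassing `list`, and the class name `SpillingList`, the `limit_bytes` parameter and the sample entries are all made up for illustration. Note that sys.getsizeof() is shallow, so the running total is only an estimate.

```python
import os
import pickle
import sys
import tempfile

class SpillingList:
    """Buffer items in memory and pickle them to disk past a size limit."""

    def __init__(self, path, limit_bytes=1024 * 1024):
        self.path = path
        self.limit_bytes = limit_bytes
        self.buffer = []
        self.buffered_bytes = 0
        # Start with an empty spill file.
        open(path, "wb").close()

    def append(self, item):
        self.buffer.append(item)
        # Shallow size only: objects the item refers to are not counted.
        self.buffered_bytes += sys.getsizeof(item)
        if self.buffered_bytes >= self.limit_bytes:
            self.flush()

    def flush(self):
        # Append the buffered items to the spill file, then free the memory.
        with open(self.path, "ab") as out:
            for item in self.buffer:
                pickle.dump(item, out)
        self.buffer = []
        self.buffered_bytes = 0

    def __iter__(self):
        # Items already on disk come first, then those still in memory.
        with open(self.path, "rb") as instream:
            while True:
                try:
                    yield pickle.load(instream)
                except EOFError:
                    break
        for item in self.buffer:
            yield item

spill_path = os.path.join(tempfile.mkdtemp(), "entries.pickle")
entries = SpillingList(spill_path, limit_bytes=200)
for n in range(10):
    entries.append(("2011-01-12", "line %d" % n))
restored = list(entries)
```

Iteration yields the items in the order they were appended, reading one pickled item at a time from the file, so the whole list never has to be in memory at once.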
>Scott McCarty wrote:
>> Sorry to ask this question. I have searched the list archives and googled,
>> but I don't even know what words to use to find what I am looking for. I am just
>> looking for a little kick in the right direction.
>> I have a Python based log analysis program called petit (
>> I am trying to modify it to manage the main
>> object types to and from disk.
>> Essentially, I have one object which is a list of a bunch of "Entry"
>> objects. The Entry objects have date, time, date, etc fields which I use
>> for analysis techniques. At the very beginning I build up the list of
>> objects then would like to start pickling it while building to save
>> memory. I want to be able to process more entries than I have memory. With
>> a straight list it looks like I could build from xreadlines(), but once you
>> turn it into a more complex object, I don't quite know where to go.
>> I understand how to pickle the entire data structure, but I need something
>> that will manage the memory/disk allocation. Any thoughts?
>You can write multiple pickled objects into a single file:
>import cPickle as pickle
>def dump(filename, items):
>    with open(filename, "wb") as out:
>        dump = pickle.Pickler(out).dump
>        for item in items:
>            dump(item)
>def load(filename):
>    with open(filename, "rb") as instream:
>        load = pickle.Unpickler(instream).load
>        while True:
>            try:
>                item = load()
>            except EOFError:
>                break
>            yield item
>if __name__ == "__main__":
>    filename = "tmp.pickle"
>    from collections import namedtuple
>    T = namedtuple("T", "alpha beta")
>    dump(filename, (T(a, b) for a, b in zip("abc", [1,2,3])))
>    for item in load(filename):
>        print item
>To get random access you'd have to maintain a list containing the offsets of 
>the entries in the file.
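One way the offset list might look, sketched for Python 3. The function names `dump_with_offsets` and `load_at` and the sample tuples are invented for this example; the offsets are recorded with tell() just before each item is written.

```python
import os
import pickle
import tempfile

def dump_with_offsets(filename, items):
    """Pickle items one after another and return the byte offset of each."""
    offsets = []
    with open(filename, "wb") as out:
        for item in items:
            offsets.append(out.tell())
            pickle.dump(item, out)
    return offsets

def load_at(filename, offset):
    """Unpickle the single item that starts at the given byte offset."""
    with open(filename, "rb") as instream:
        instream.seek(offset)
        return pickle.load(instream)

filename = os.path.join(tempfile.mkdtemp(), "tmp.pickle")
offsets = dump_with_offsets(filename, [("a", 1), ("b", 2), ("c", 3)])
third = load_at(filename, offsets[2])
print(third)
```

The offsets list itself stays in memory, but it holds one integer per entry rather than the entry itself.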
>However, a simple database like SQLite is probably sufficient for the kind 
>of entries you have in mind, and it allows operations like aggregation, 
>sorting and grouping out of the box.
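A small sketch of the SQLite route using the `sqlite3` module from the standard library, written for Python 3. The table layout and the sample rows are assumptions made for illustration, not petit's actual Entry fields.

```python
import sqlite3

# An in-memory database keeps the sketch self-contained;
# pass a file name instead to keep the data on disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (day TEXT, host TEXT, message TEXT)")

# Hypothetical log entries.
rows = [
    ("2011-01-11", "web1", "disk full"),
    ("2011-01-12", "web1", "login failed"),
    ("2011-01-12", "web2", "login failed"),
]
conn.executemany("INSERT INTO entries VALUES (?, ?, ?)", rows)

# Grouping, counting and sorting are done by SQLite rather than in Python.
counts = conn.execute(
    "SELECT day, COUNT(*) FROM entries GROUP BY day ORDER BY day"
).fetchall()
print(counts)
conn.close()
```

Since `sqlite3` ships with Python, this adds no third-party dependency.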
>On Wed, Jan 12, 2011 at 10:59 AM, Matty Sarro <> wrote:
>> As of now here is my situation:
>> I am working on a system to aggregate IT data and logs. A number of
>> important data are gathered by a third party system. The only
>> immediate way I have to access the data is to have their system
>> automatically email me updates in CSV format every hour. If I set up a
>> mail client on the server, this shouldn't be a huge issue.
>> However, is there a way to automatically open the emails, and copy the
>> attachments to a directory based on the filename? Kind of a weird
>> project, I know. Just looking for some ideas hence posting this on two
>> lists.
>Parsing out email attachments:
>Parsing the extension from a filename:
>Retrieving email from a mail server:
>You could poll for new messages via a cron job or the `sched` module.
>Or if the messages are
>being delivered locally, you could use inotify bindings or similar to
>watch the appropriate directory for incoming mail. Integration with a
>mail server itself is also a possibility, but I don't know much about that.
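The attachment-handling step could look roughly like this in Python 3, using the standard library's `email` package (an assumption about what the stripped links above pointed to). The function name `save_attachments`, the directory layout and the sample message are invented for the example; fetching the mail from a server is not shown.

```python
import email
import os
import tempfile
from email.message import EmailMessage

def save_attachments(raw_bytes, base_dir):
    """Write each attachment under base_dir/<extension>/ and return the paths."""
    message = email.message_from_bytes(raw_bytes)
    saved = []
    for part in message.walk():
        filename = part.get_filename()
        if not filename:
            continue  # Not an attachment (for example the message body).
        # basename() guards against path components in the supplied name.
        filename = os.path.basename(filename)
        extension = os.path.splitext(filename)[1].lstrip(".").lower() or "other"
        target_dir = os.path.join(base_dir, extension)
        os.makedirs(target_dir, exist_ok=True)
        target = os.path.join(target_dir, filename)
        with open(target, "wb") as out:
            out.write(part.get_payload(decode=True))
        saved.append(target)
    return saved

# Build a small test message in place of one fetched from a mail server.
sample = EmailMessage()
sample["Subject"] = "Hourly update"
sample.set_content("See attached.")
sample.add_attachment(b"host,status\nweb1,ok\n", maintype="application",
                      subtype="octet-stream", filename="report.csv")

out_dir = tempfile.mkdtemp()
saved_paths = save_attachments(sample.as_bytes(), out_dir)
print(saved_paths)
```

A cron job or a polling loop would call something like save_attachments() on each new message.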
>In article <>,
>Daniel da Silva  <> wrote:
>>I have come across a task where I would like to scan a short 20-80
>>character line of text for instances of "<verb> <noun>". Ideally
>><verb> could be of any tense.
>In Soviet Russia, <noun> <verbs> you!
>Aahz (           <*>
>"Think of it as evolution in action."  --Tony Rand
>In case you still need help:
>- import random
>- # Set the initial values
>- the_number = random.randrange(100) + 1
>- tries = 0
>- guess = None
>- # Guessing loop: stop on a correct guess or after 7 tries
>- while guess != the_number and tries < 7:
>-     guess = int(raw_input("Take a guess: "))
>-     if guess > the_number:
>-         print "Lower..."
>-     elif guess < the_number:
>-         print "Higher..."
>-     tries += 1
>- # Did the user guess correctly or make too many guesses?
>- if guess == the_number:
>-     print "You guessed it! The number was", the_number
>-     print "And it only took you", tries, "tries!\n"
>- else:
>-     print "Wow you suck! It should only take at most 7 tries!"
>- raw_input("Press Enter to exit the program.")
>Been digging ever since I posted this. I suspected that the response might
>be to use a database. I am worried I am trying to reinvent the wheel. The
>problem is I don't want any dependencies and I also don't need persistence
>between program runs. I kind of wanted to keep the use of petit very similar
>to cat, head, awk, etc. But, that said, I have realized that if I provide the
>analysis features as an API, you might very well want persistence between
>runs.
>What about using an array inside a shelve?
>Just got done messing with this in python shell:
>import shelve
>d = shelve.open("/root/test.shelf", protocol=-1)
>d["log"] = ()
>Then, always interacting with d["log"], for example:
>for i in d["log"]:
>    print i
>I know this won't manage memory, but it will keep the footprint down right?
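On the footprint question: a shelf unpickles the whole value stored under a key each time that key is read, so d["log"] brings the entire list back into memory. A sketch of one alternative, written for Python 3, stores each entry under its own key; the numeric string keys and the sample lines are assumptions made for illustration.

```python
import os
import shelve
import tempfile

shelf_path = os.path.join(tempfile.mkdtemp(), "test.shelf")

# Writing: one key per entry, so no single value holds the whole log.
with shelve.open(shelf_path, protocol=-1) as d:
    for number, line in enumerate(["first line", "second line", "third line"]):
        d[str(number)] = line  # Shelf keys must be strings.

# Reading: each lookup unpickles only the entry that was asked for.
with shelve.open(shelf_path, protocol=-1) as d:
    total = len(d)
    lines = [d[str(number)] for number in range(total)]

print(lines)
```

The list built at the end is only there to show the result; a real run would process each entry as it is read instead of collecting them.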
