python-list-requ...@python.org wrote:
>Send Python-list mailing list submissions to
>    python-list@python.org
>
>To subscribe or unsubscribe via the World Wide Web, visit
>    http://mail.python.org/mailman/listinfo/python-list
>or, via email, send a message with subject or body 'help' to
>    python-list-requ...@python.org
>
>You can reach the person managing the list at
>    python-list-ow...@python.org
>
>When replying, please edit your Subject line so it is more specific
>than "Re: Contents of Python-list digest..."
>
>Today's Topics:
>
>   1. Re: Python use growing fast (Alice Bevan–McGregor)
>   2. Re: order of importing modules (Chris Rebert)
>   3. Re: How to Buffer Serialized Objects to Disk (MRAB)
>   4. Re: How to Buffer Serialized Objects to Disk (Chris Rebert)
>   5. Re: How to Buffer Serialized Objects to Disk (Peter Otten)
>   6. Re: Best way to automatically copy out attachments from an
>      email (Chris Rebert)
>   7. Re: Parsing string for "<verb> <noun>" (Aahz)
>   8. Re: Nested structures question (Tim Harig)
>   9. Re: How to Buffer Serialized Objects to Disk (Scott McCarty)
>
>On 2011-01-10 19:49:47 -0800, Roy Smith said:
>
>> One of the surprising (to me, anyway) uses of JavaScript is as the
>> scripting language for MongoDB (http://www.mongodb.org/).
>
>I just wish they'd drop spidermonkey and go with V8 or another, faster
>and more modern engine. :(
>
> - Alice.
>
>
>> Dan Stromberg wrote:
>>> On Tue, Jan 11, 2011 at 4:30 PM, Catherine Moroney
>>> <catherine.m.moro...@jpl.nasa.gov> wrote:
>>>>
>>>> In what order does python import modules on a Linux system? I have a
>>>> package that is both installed in /usr/lib64/python2.5/site-packages,
>>>> and a newer version of the same module in a working directory.
>>>>
>>>> I want to import the version from the working directory, but when I
>>>> print module.__file__ in the interpreter after importing the module,
>>>> I get the version that's in site-packages.
>>>>
>>>> I've played with the PYTHONPATH environment variable by setting it
>>>> to just the path of the working directory, but when I import the module
>>>> I still pick up the version in site-packages.
>>>>
>>>> /usr/lib64 is in my PATH variable, but doesn't appear anywhere else. I
>>>> don't want to remove /usr/lib64 from my PATH because that will break
>>>> a lot of stuff.
>>>>
>>>> Can I force python to import from my PYTHONPATH first, before looking
>>>> in the system directory?
>>>>
>>> Please import sys and inspect sys.path; this defines the search path
>>> for imports.
>>>
>>> By looking at sys.path, you can see where in the search order your
>>> $PYTHONPATH is going.
>>>
>On Wed, Jan 12, 2011 at 11:07 AM, Catherine Moroney
><catherine.m.moro...@jpl.nasa.gov> wrote:
>> I've looked at my sys.path variable and I see that it has
>> a whole bunch of site-package directories, followed by the
>> contents of my $PYTHONPATH variable, followed by a list of
>> misc site-package variables (see below).
><snip>
>> But, I'm curious as to where the first bunch of 'site-package'
>> entries come from. The
>> /usr/lib64/python2.5/site-packages/pyhdfeos-1.0_r57_58-py2.5-linux-x86_64.egg
>> is not present in any of my environment variables yet it shows up
>> as one of the first entries in sys.path.
>
>You probably have a .pth file somewhere that adds it (since it's an
>egg, probably site-packages/easy-install.pth).
>See http://docs.python.org/install/index.html#modifying-python-s-search-path
>
>Cheers,
>Chris
>--
>http://blog.rebertia.com
>
>
>On 12/01/2011 21:05, Scott McCarty wrote:
>> Sorry to ask this question. I have searched the list archives and googled,
>> but I don't even know what words to use to find what I am looking for. I am
>> just looking for a little kick in the right direction.
>>
>> I have a Python based log analysis program called petit
>> (http://crunchtools.com/petit). I am trying to modify it to manage the
>> main object types to and from disk.
>>
>> Essentially, I have one object which is a list of a bunch of "Entry"
>> objects. The Entry objects have date, time, date, etc. fields which I use
>> for analysis techniques. At the very beginning I build up the list of
>> objects then would like to start pickling it while building to save
>> memory. I want to be able to process more entries than I have memory.
>> With a straight list it looks like I could build from xreadlines(), but
>> once you turn it into a more complex object, I don't quite know where to go.
>>
>> I understand how to pickle the entire data structure, but I need
>> something that will manage the memory/disk allocation. Any thoughts?
>>
>To me it sounds like you need to use a database.
>
>
>On Wed, Jan 12, 2011 at 1:05 PM, Scott McCarty <scott.mcca...@gmail.com> wrote:
>> Sorry to ask this question. I have searched the list archives and googled, but
>> I don't even know what words to use to find what I am looking for. I am just
>> looking for a little kick in the right direction.
>> I have a Python based log analysis program called petit
>> (http://crunchtools.com/petit). I am trying to modify it to manage the main
>> object types to and from disk.
>> Essentially, I have one object which is a list of a bunch of "Entry"
>> objects. The Entry objects have date, time, date, etc. fields which I use for
>> analysis techniques. At the very beginning I build up the list of objects
>> then would like to start pickling it while building to save memory. I want
>> to be able to process more entries than I have memory. With a straight list it
>> looks like I could build from xreadlines(), but once you turn it into a more
>> complex object, I don't quite know where to go.
>> I understand how to pickle the entire data structure, but I need something
>> that will manage the memory/disk allocation. Any thoughts?
>
>You could subclass `list` and use sys.getsizeof()
>[http://docs.python.org/library/sys.html#sys.getsizeof ] to keep track
>of the size of the elements, and then start pickling them to disk once
>the total size reaches some preset limit.
>But like MRAB said, using a proper database, e.g. SQLite
>(http://docs.python.org/library/sqlite3.html ), wouldn't be a bad idea
>either.
>
>Cheers,
>Chris
>--
>http://blog.rebertia.com
>
>
>Scott McCarty wrote:
>
>> Sorry to ask this question. I have searched the list archives and googled,
>> but I don't even know what words to use to find what I am looking for. I am just
>> looking for a little kick in the right direction.
>>
>> I have a Python based log analysis program called petit
>> (http://crunchtools.com/petit). I am trying to modify it to manage the main
>> object types to and from disk.
>>
>> Essentially, I have one object which is a list of a bunch of "Entry"
>> objects. The Entry objects have date, time, date, etc. fields which I use
>> for analysis techniques. At the very beginning I build up the list of
>> objects then would like to start pickling it while building to save
>> memory. I want to be able to process more entries than I have memory. With
>> a straight list it looks like I could build from xreadlines(), but once you
>> turn it into a more complex object, I don't quite know where to go.
>>
>> I understand how to pickle the entire data structure, but I need something
>> that will manage the memory/disk allocation. Any thoughts?
>
>You can write multiple pickled objects into a single file:
>
>import cPickle as pickle
>
>def dump(filename, items):
>    with open(filename, "wb") as out:
>        dump = pickle.Pickler(out).dump
>        for item in items:
>            dump(item)
>
>def load(filename):
>    with open(filename, "rb") as instream:
>        load = pickle.Unpickler(instream).load
>        while True:
>            try:
>                item = load()
>            except EOFError:
>                break
>            yield item
>
>if __name__ == "__main__":
>    filename = "tmp.pickle"
>    from collections import namedtuple
>    T = namedtuple("T", "alpha beta")
>    dump(filename, (T(a, b) for a, b in zip("abc", [1,2,3])))
>    for item in load(filename):
>        print item
>
>To get random access you'd have to maintain a list containing the offsets of
>the entries in the file.
>However, a simple database like SQLite is probably sufficient for the kind
>of entries you have in mind, and it allows operations like aggregation,
>sorting and grouping out of the box.
>
>Peter
>
>
>On Wed, Jan 12, 2011 at 10:59 AM, Matty Sarro <msa...@gmail.com> wrote:
>> As of now here is my situation:
>> I am working on a system to aggregate IT data and logs. A number of
>> important data are gathered by a third party system. The only
>> immediate way I have to access the data is to have their system
>> automatically email me updates in CSV format every hour. If I set up a
>> mail client on the server, this shouldn't be a huge issue.
>>
>> However, is there a way to automatically open the emails, and copy the
>> attachments to a directory based on the filename? Kind of a weird
>> project, I know. Just looking for some ideas hence posting this on two
>> lists.
>
>Parsing out email attachments:
>http://docs.python.org/library/email.parser.html
>http://docs.python.org/library/email.message.html#module-email.message
>
>Parsing the extension from a filename:
>http://docs.python.org/library/os.path.html#os.path.splitext
>
>Retrieving email from a mail server:
>http://docs.python.org/library/poplib.html
>http://docs.python.org/library/imaplib.html
>
>You could poll for new messages via a cron job or the `sched` module
>(http://docs.python.org/library/sched.html ). Or if the messages are
>being delivered locally, you could use inotify bindings or similar to
>watch the appropriate directory for incoming mail. Integration with a
>mail server itself is also a possibility, but I don't know much about
>that.
>
>Cheers,
>Chris
>--
>http://blog.rebertia.com
>
>
>In article <0d7143ca-45cf-44c3-9e8d-acb867c52...@f30g2000yqa.googlegroups.com>,
>Daniel da Silva <ddasi...@umd.edu> wrote:
>>
>>I have come across a task where I would like to scan a short 20-80
>>character line of text for instances of "<verb> <noun>". Ideally
>><verb> could be of any tense.
>
>In Soviet Russia, <noun> <verbs> you!
>--
>Aahz (a...@pythoncraft.com) <*> http://www.pythoncraft.com/
>
>"Think of it as evolution in action." --Tony Rand
>
>
>In case you still need help:
>
>- import random
>-
>- # Set the initial values
>- the_number = random.randrange(100) + 1
>- tries = 0
>- guess = None
>-
>- # Guessing loop
>- while guess != the_number and tries < 7:
>-     guess = int(raw_input("Take a guess: "))
>-     if guess > the_number:
>-         print "Lower..."
>-     elif guess < the_number:
>-         print "Higher..."
>-     tries += 1
>-
>- # Did the user guess correctly or make too many guesses?
>- if guess == the_number:
>-     print "You guessed it! The number was", the_number
>-     print "And it only took you", tries, "tries!\n"
>- else:
>-     print "Wow you suck! It should only take at most 7 tries!"
>-
>- raw_input("Press Enter to exit the program.")
>
>
>Been digging ever since I posted this. I suspected that the response might
>be to use a database. I am worried I am trying to reinvent the wheel. The
>problem is I don't want any dependencies and I also don't need persistence
>between program runs. I kind of wanted to keep the use of petit very similar
>to cat, head, awk, etc. But, that said, I have realized that if I provide the
>analysis features as an API, you very well might want persistence between
>runs.
>
>What about using an array inside a shelve?
>
>Just got done messing with this in python shell:
>
>import shelve
>
># writeback=True so that in-place changes to d["log"] are written back
>d = shelve.open(filename="/root/test.shelf", protocol=-1, writeback=True)
>
>d["log"] = []    # a list; a tuple has no append()
>d["log"].append("test1")
>d["log"].append("test2")
>d["log"].append("test3")
>
>Then, always interacting with d["log"], for example:
>
>for i in d["log"]:
>    print i
>
>Thoughts?
>
>
>I know this won't manage memory, but it will keep the footprint down, right?
>On Wed, Jan 12, 2011 at 5:04 PM, Peter Otten <__pete...@web.de> wrote:
>
>> <snip>
>
>--
>http://mail.python.org/mailman/listinfo/python-list
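
Peter Otten's remark that random access needs a list of the entries' offsets could be sketched roughly as below. This is only an illustration under some assumptions: it is written for Python 3 rather than the Python 2 used in the thread, the names dump_with_offsets and load_at are hypothetical, and it calls pickle.dump once per item instead of sharing one Pickler as in his example, so that each record can be read back on its own.

```python
import os
import pickle
import tempfile


def dump_with_offsets(filename, items):
    """Write each item as its own pickle; return the byte offset of each record."""
    offsets = []
    with open(filename, "wb") as out:
        for item in items:
            offsets.append(out.tell())  # position where this record starts
            # A fresh pickle per item: no memo is shared between records.
            pickle.dump(item, out, protocol=pickle.HIGHEST_PROTOCOL)
    return offsets


def load_at(filename, offset):
    """Seek to a recorded offset and unpickle the single record stored there."""
    with open(filename, "rb") as instream:
        instream.seek(offset)
        return pickle.load(instream)


if __name__ == "__main__":
    # Hypothetical sample data standing in for petit's Entry objects.
    path = os.path.join(tempfile.mkdtemp(), "entries.pickle")
    entries = [(letter, number) for letter, number in zip("abc", [1, 2, 3])]
    offsets = dump_with_offsets(path, entries)
    print(load_at(path, offsets[2]))
```

Only the list of offsets has to stay in memory; the entries themselves live on disk. If a single Pickler were reused for all items, a later record could refer back to objects memoized in an earlier one, and unpickling from an arbitrary offset might then fail.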