Permission denied and lock issue with multiprocess logging
When I run the implementation of multiprocess logging through a queue handler, I get the error below. It is the same with SocketHandler as well as with a pipe handler if multiple processes are involved. I have not found any hint about how to solve this problem. Please help.

Platform: AIX
Python version: 2.6.5

sem_trywait: Permission denied
sem_post: Permission denied
sem_destroy: Permission denied
sem_wait: Permission denied
Process Process-1:
Traceback (most recent call last):
  File "/opt/freeware/lib/python2.6/multiprocessing/process.py", line 232, in _bootstrap
    self.run()
  File "/opt/freeware/lib/python2.6/multiprocessing/process.py", line 88, in run
    self._target(*self._args, **self._kwargs)
  File "mplog.py", line 108, in listener_process
    configurer()
  File "mplog.py", line 99, in listener_configurer
    h = logging.handlers.RotatingFileHandler('/tmp/mptest.log', 'a', 300, 10)
  File "/opt/freeware/lib/python2.6/logging/handlers.py", line 107, in __init__
    BaseRotatingHandler.__init__(self, filename, mode, encoding, delay)
  File "/opt/freeware/lib/python2.6/logging/handlers.py", line 59, in __init__
    logging.FileHandler.__init__(self, filename, mode, encoding, delay)
  File "/opt/freeware/lib/python2.6/logging/__init__.py", line 819, in __init__
    StreamHandler.__init__(self, self._open())
  File "/opt/freeware/lib/python2.6/logging/__init__.py", line 744, in __init__
    Handler.__init__(self)
  File "/opt/freeware/lib/python2.6/logging/__init__.py", line 605, in __init__
    _releaseLock()
  File "/opt/freeware/lib/python2.6/logging/__init__.py", line 208, in _releaseLock
    _lock.release()
  File "/opt/freeware/lib/python2.6/threading.py", line 138, in release
    raise RuntimeError("cannot release un-acquired lock")
RuntimeError: cannot release un-acquired lock
sem_trywait: Permission denied

-- http://mail.python.org/mailman/listinfo/python-list
Re: __dict__ is neato torpedo!
On Sat, 11 Jun 2011 21:28:40 -0500, Andrew Berg wrote: > On 2011.06.11 09:13 PM, Steven D'Aprano wrote: >> A second, more subtle risk: not all objects have a __dict__. But if you >> obey the rule about never updating from arbitrary objects you don't >> know, then you won't be surprised by an object with no __dict__. > > What objects don't (other than the obvious ones like strings, > dictionaries, ints and lists)? namedtuple is another common example. In pure Python, objects created using __slots__ usually don't have a __dict__. Quite possibly C extension objects. There may be others. -- Steven -- http://mail.python.org/mailman/listinfo/python-list
Re: __dict__ is neato torpedo!
On 06/11/2011 08:32 PM, Andrew Berg wrote: I'm pretty happy that I can copy variables and their value from one object's namespace to another object's namespace with the same variable names automatically: b.__dict__.update(a.__dict__) The reason I'm posting this is to ask what to watch out for when doing this. I've seen vague warnings that I don't really understand, and I'm hoping someone can enlighten me. I guess the summary is is "it does *exactly* what an Python experienced programmer would expect it to, so if things break you get to keep both pieces" (which even nicely summarizes Steven's sub-thread about objects lacking __dict__). Based on your reactions to the replies, I'd say you're sufficiently experienced in Python to have your expectations align with Python reality. -tkc -- http://mail.python.org/mailman/listinfo/python-list
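The attribute copy discussed in this thread can be sketched as follows. This is only an illustration: the class name Settings and its attributes are made up for the example. It shows one thing an experienced programmer would expect but a newcomer might not: the update copies references, so mutable values end up shared between the two objects.

```python
class Settings:
    """Hypothetical container class used only for this illustration."""
    pass


a = Settings()
a.name = "spam"
a.tags = ["x"]

b = Settings()
b.__dict__.update(a.__dict__)   # copy every instance attribute of a onto b

# The names are copied, but the values are the same objects (a shallow copy),
# so a change made through b is visible through a as well.
b.tags.append("y")
```

If independent copies are needed, copy.deepcopy(a.__dict__) could be passed to update() instead, at the cost of duplicating every value.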
Handling emails
Hello I wrote a program which was working on python 2.x. I'd like to go for newer version but I face the problem on how the emails are parsed. In particular I'd like to extract the significant parts of the headers, but the query to the servers had turned in to list of bytes. What could be a method that will parse and return the headers into ascii if I'll pass the headers as bytes. Even I don't know whether I can pass as they arrive to the program. For example if I try: import poplib.POP3 _pop= poplib.POP3(srvr) _pop.user(args[1]) _pop.pass_(args[2]) header =_pop.top(nmuid, 0) This will return a list of bytes string and I don't have idea to process them in order to have a dictionary containing 'from', 'to', 'cc', 'bcc', 'date', 'subject', 'reply-to', 'message-id' as keys. -- goto /dev/null -- http://mail.python.org/mailman/listinfo/python-list
Re: Emacs Python indention
Bastian Ballmann writes: > Hi Emacs / Python coders, > > moving a region of python code for more than one indention in Emacs is > quite annoying, cause the python-shift-left and -right functions always > loose the mark and one has to reactivate it with \C-x \C-x or > guess how many indentions one want to make and do a \C-u \C-c > > > That were the only solutions I found on the net and well both are > not very comfortable so here's a fix for that. With the following code > you can use \C-c left and right to move your Python code to the left > and to the right :) > HF > > Basti [...] Nice functions... But actually I use python-mode.el from the bzr trunk and the indentation works really nicely, with C-c > or C-c <, and doesn't lose the mark. Another nice thing is that often TAB does also the right thing, indenting or unindenting if for example we add remove one level. -- http://mail.python.org/mailman/listinfo/python-list
Re: Handling emails
On Sun, 12 Jun 2011 19:20:00 +0800, TheSaint wrote: > Hello > I wrote a program which was working on python 2.x. I'd like to go for > newer version but I face the problem on how the emails are parsed. In > particular I'd like to extract the significant parts of the headers, but > the query to the servers had turned in to list of bytes. What could be a > method that will parse and return the headers into ascii if I'll pass > the headers as bytes. Even I don't know whether I can pass as they > arrive to the program. > > For example if I try: > > import poplib.POP3 > _pop= poplib.POP3(srvr) > _pop.user(args[1]) > _pop.pass_(args[2]) > > header =_pop.top(nmuid, 0) > > This will return a list of bytes string and I don't have idea to process > them in order to have a dictionary containing 'from', 'to', 'cc', 'bcc', > 'date', 'subject', 'reply-to', 'message-id' as keys. To parse emails, you should use the email package. It already handles bytes and strings. Other than that, I'm not entirely sure I understand your problem. In general, if you have some bytes, you can decode it into a string by hand: >>> header = b'To: python-list@python.org\n' >>> s = header.decode('ascii') >>> s 'To: python-list@python.org\n' If this is not what you mean, perhaps you should give an example of what header looks like, what you hope to get, and a concrete example of how it differs in Python 3. -- Steven -- http://mail.python.org/mailman/listinfo/python-list
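A minimal sketch of the suggestion above, assuming Python 3.2 or later, where the email package provides message_from_bytes(). The helper name headers_from_top and the sample data are invented for the example; the sample only stands in for the list of bytes lines that poplib.POP3.top() returns as the second item of its result.

```python
import email


def headers_from_top(lines):
    """Build a dict of lower-cased header names from a list of bytes lines,
    such as the second item of the tuple returned by poplib.POP3.top()."""
    msg = email.message_from_bytes(b"\r\n".join(lines))
    return {name.lower(): value for name, value in msg.items()}


# Sample data standing in for `_pop.top(nmuid, 0)[1]`; not real server output.
sample = [
    b"From: alice@example.com",
    b"To: python-list@python.org",
    b"Subject: Handling emails",
    b"",
]
headers = headers_from_top(sample)
```

Headers that occur more than once (Received, for instance) keep only their last value in this dict; msg.get_all(name) would return all of them.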
Idle python32 starts now!!!!!!!!!Got it!!!!!
Ok, after reading everything I could find on Idle not starting, nothing worked. I was left on my own. So, I went into C:\Python32\Lib\idlelib through a DOS command prompt. I then executed the "idle.py" program. The messages that I received in the DOS window referred to gnuplot.ini. It complained a few times with different messages and finally Idle 3.2 started up and now works from the Start Programs menu. Anyway, just in case, I deleted gnuplot from my path. I guess I will have to try and install gnuplot again. At least if it breaks my Python, I now know how to fix it.
ftplib: Software caused connection abort, how I solved it
My program polls FTP servers at intervals for jobs to process. It runs as a service on Windows Server 2000 or 2003 :-(. About 13% of the time the retrbinary calls, and less often the nlst calls, would fail with "Software caused connection abort". I could find no relevant solution on the intertubes.

I added blocksize=2048 to the retrbinary call, which lowered the failures to about 2%, but that was still unsatisfactory for me.

When I added socket.setdefaulttimeout(60) to the setup stuff in order to solve a different problem, the connection abort errors went away completely. Even when I restored retrbinary to use the default blocksize it still worked.

HTH
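The fix described above might look roughly like this. It is a sketch, not the poster's actual program: the function name fetch and all of its parameters are placeholders. Only socket.setdefaulttimeout(60) comes from the post.

```python
import socket
from ftplib import FTP

# Default timeout (in seconds) for every socket created afterwards,
# including the control and data connections that ftplib opens.
socket.setdefaulttimeout(60)


def fetch(host, user, password, remote_name, local_path):
    """Download one file; host, credentials and names are placeholders."""
    ftp = FTP(host)
    ftp.login(user, password)
    try:
        with open(local_path, "wb") as out:
            ftp.retrbinary("RETR " + remote_name, out.write)
    finally:
        ftp.quit()
```

The call has to run before any connection is opened, because it only affects sockets created after it.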
Re: Handling emails
Steven D'Aprano wrote:

First of all: thanks for the reply.

>> header =_pop.top(nmuid, 0)

> To parse emails, you should use the email package. It already handles
> bytes and strings.

I read a lot of information this afternoon, and most of it led to errors. That could be the fault of my ignorance :)
From what I could work out, I decided to write my own code.

def msg_parser(listOfBytes):
    header = {}
    for lin in listOfBytes:
        try:
            line = lin.decode()
        except UnicodeDecodeError:
            continue
        for key in _FULLhdr:
            if key in line:
                header[key] = line
                continue
    return header

listOfBytes is the header content, which is the second item of the tuple returned by poplib.POP3.top(num_msg, how_much).
However, some lines fail to decode correctly. I can't imagine why emails don't comply with a standard.

> Other than that, I'm not entirely sure I understand your problem. In
> general, if you have some bytes, you can decode it into a string by hand:

I see. My English is not very good yet :P I'm Italian :)

> >>> header = b'To: python-list@python.org\n'
> >>> s = header.decode('ascii')
> >>> s
> 'To: python-list@python.org\n'

I know this, but it is not applicable when dealing with the entire message header and envelope.
The libraries handling emails and their headers seem very confusing to me, and I suppose I should take a different, smaller approach.

I'll try to show a header (if its content doesn't breach privacy), but the result of *_pop.top(nmuid, 0)* won't fit into your example above.

> If this is not what you mean, perhaps you should give an example of what
> header looks like

The difference is that the previous version returned text strings, and the subsequent processing is based on string manipulation.

Just to mention, my program reads headers from a POP3 or IMAP4 server and applies some regex filtering in order to remove unwanted emails from the server. All the filters treat the input as ASCII strings of characters.

I passed my modules through 2to3 for the conversion to the newer Python, but at the first run it reported that the downloaded header is not a string.

--
goto /dev/null
Re: Python Card alternatives?
On Sat, 2011-06-11 at 13:07 +, rzed wrote: > Desktop apps don't seem to be the wave of the future, but they still > serve a useful purpose today. They can be ideal for a quick database > table management screen, +1, they are perfect for that, and will be around for a *long* *long* time. And I doubt they will ever go away - the web app will change to be more desktopish. Gtk already has an experimental HTML canvas backend, GNOME3 is a canvas controlled via JavaScript, etc... > or a data entry front end for a program with > a bunch of parameters. It's not easy enough to build a quick utility > with a GUI front end, though. Wax and PythonCard (and maybe others) > tried to hit that niche, but development on both is spotty at best. > Some claim that Dabo's gui builder is a good one for this purpose, and > maybe it can be. Are there any other, better solutions? My advice is to keep it simple. Gtk/Glade works perfectly well for this purpose. The glue code required is trivial. -- http://mail.python.org/mailman/listinfo/python-list
Re: __dict__ is neato torpedo!
On 12/06/11 10:47:01, Steven D'Aprano wrote:
> On Sat, 11 Jun 2011 21:28:40 -0500, Andrew Berg wrote:
>
>> On 2011.06.11 09:13 PM, Steven D'Aprano wrote:
>>> A second, more subtle risk: not all objects have a __dict__. But if you
>>> obey the rule about never updating from arbitrary objects you don't
>>> know, then you won't be surprised by an object with no __dict__.
>>
>> What objects don't (other than the obvious ones like strings,
>> dictionaries, ints and lists)?
>
> namedtuple is another common example.
>
> In pure Python, objects created using __slots__ usually don't have a
> __dict__. Quite possibly C extension objects. There may be others.

The base class 'object' is a well-known example:

>>> ham = object()
>>> ham.spam = 'eggs'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'object' object has no attribute 'spam'
>>>

But subclasses of object do have a __dict__ by default, which is why you'd normally do:

>>> class Ham(object):
...     pass
...
>>> ham = Ham()
>>> ham.spam = 'eggs'
>>>

Here, the only reason for creating a subclass with no visible new methods or attributes is the unmentioned __dict__ attribute that Ham instances have and 'object' instances lack.

-- HansM
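The cases named in this thread can be checked directly. The sketch below assumes Python 3; the class names Slotted and Plain and the helper has_dict are invented for the illustration.

```python
class Slotted:
    __slots__ = ("x",)          # instances get no per-instance __dict__


class Plain:
    pass


def has_dict(obj):
    """True if obj carries an instance __dict__."""
    return hasattr(obj, "__dict__")


results = {
    "object()": has_dict(object()),
    "Slotted()": has_dict(Slotted()),
    "Plain()": has_dict(Plain()),
    "'text'": has_dict("text"),
}
```

A check like has_dict(a) and has_dict(b) before calling b.__dict__.update(a.__dict__) is one way to avoid being surprised by such objects.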
To install lxml (and easy_install) for Python 3 under Windows...
I had difficulty installing lxml for Python 3.1 under Windows, and took some notes as I worked through it. Here's how I finally managed it... Go to http://lxml.de/installation.html#ms-windows. Follow the link to the "binary egg distribution of lxml" here: http://cheeseshop.python.org/pypi/lxml Download the lxml-2.3-py3.1-win32.egg file. The instructions say, "Just use easy_install by following the installation instructions above." Ha! Those instructions don't work for Python 3 under Windows. The (wrong) instructions are: Get the easy_install tool [from link http://peak.telecommunity.com/DevCenter/EasyInstall ] and run the following as super-user (or administrator): easy_install lxml On MS Windows, the above will install the binary builds that we provide. If there is no binary build of the latest release yet, please search PyPI for the last release that has them and pass that version to easy_install like this: easy_install lxml==2.2.2 However, that peak.telecommunity.com link goes to an old page that doesn't support Python 3. Instead, first we must install Distribute, which is a fork of Setuptools, to get easy_install. Download distribute_setup.py from here: http://pypi.python.org/pypi/distribute#distribute-setup-py Then run distribute_setup.py Now you should have an easy_install.exe file here: C:\python31\Scripts\easy_install.exe You can run it from a Windows Command Prompt to install lxml-2.3-py3.1- win32.egg, like this: C:\python31\Scripts\easy_install.exe lxml-2.3-py3.1-win32.egg Dave Burton http://www.burtonsys.com/email/ -- http://mail.python.org/mailman/listinfo/python-list
Re: To install lxml (and easy_install) for Python 3 under Windows...
On 12-6-2011 18:38, ncdave4l...@mailinator.com wrote: > I had difficulty installing lxml for Python 3.1 under Windows, and > took some notes as I worked through it. Here's how I finally managed > it... > > [...] In cases like this, Christoph Gohlke's page with 'Unofficial Windows Binaries for Python Extension Packages' can be very handy: http://www.lfd.uci.edu/~gohlke/pythonlibs/#lxml -irmen -- http://mail.python.org/mailman/listinfo/python-list
Re: Python Card alternatives?
> Are there any other, better solutions? Others are e.g.: - Pypapi - Camelot - Kiwi - Sqlkit - Gnuenterprise etc... Sincerely, Wolfgang -- Führungskräfte leisten keine Arbeit(D'Alembert) -- http://mail.python.org/mailman/listinfo/python-list
debian defaults not up-to-date
Just finished installing Mint 10 and all has gone well. However, when I removed some applications, I received this error during the removal: INFO: using unknown version '/usr/bin/python2.7' (debian_defaults not up- to-date?) The removal proceeds without any other warnings or errors. Not sure what I should do to correct this. Your help is most appreciated... Thanks in advance -- http://mail.python.org/mailman/listinfo/python-list
Re: Unsupported operand type(s) for +: 'float' and 'tuple'
On 11.06.2011 03:02, Gabriel Genellina wrote:

> Perhaps those names make sense in your problem at hand, but usually I try
> to use more meaningful ones.

Until here we agree.

> 0 and O look very similar in some fonts.

That is right - but who would use such fonts for programming?

Thomas
Re: debian defaults not up-to-date
On Jun 12, 2011 10:32 AM, "blues2use" wrote: > > Just finished installing Mint 10 and all has gone well. However, when I > removed some applications, I received this error during the removal: > > INFO: using unknown version '/usr/bin/python2.7' (debian_defaults not up- > to-date?) > > The removal proceeds without any other warnings or errors. > > Not sure what I should do to correct this. > > Your help is most appreciated... > > Thanks in advance > -- This is a Mint question. You'll probably get better answers on their forum. It has to do with their package manager. Python just happens to be the package that's messed up. -- http://mail.python.org/mailman/listinfo/python-list
[RELEASE] Python 2.7.2
On behalf of the Python development team, I'm rosy to announce the immediate availability of Python 2.7.2.

Since the release candidate 2 weeks ago, there have been 2 changes:
1. pyexpat.__version__ has been changed to be the Python version.
2. A regression from 3.1.3 in the handling of comments in the netrc module has been resolved. (see issue #12009).

2.7.2 is the second bugfix release for the Python 2.7 series. 2.7 is the last major version of the 2.x line and will be receiving only bug fixes while new feature development focuses on 3.x.

The 2.7 series includes many features that were first released in Python 3.1. The faster io module, the new nested with statement syntax, improved float repr, set literals, dictionary views, and the memoryview object have been backported from 3.1. Other features include an ordered dictionary implementation, unittest improvements, a new sysconfig module, auto-numbering of fields in the str/unicode format method, and support for ttk Tile in Tkinter. For a more extensive list of changes in 2.7, see http://doc.python.org/dev/whatsnew/2.7.html or Misc/NEWS in the Python distribution.

To download Python 2.7.2 visit:

http://www.python.org/download/releases/2.7.1/

The 2.7.2 changelog is at:

http://hg.python.org/cpython/raw-file/eb3c9b74884c/Misc/NEWS

2.7 documentation can be found at:

http://docs.python.org/2.7/

This is a production release, please report any bugs to

http://bugs.python.org/

Enjoy and for those in the northern hemisphere, have a nice summer!

--
Benjamin Peterson
Release Manager
benjamin at python.org
(on behalf of the entire python-dev team and 2.7.2's contributors)
[RELEASED] Python 3.1.4
On behalf of the Python development team, I'm sanguine to announce a release candidate for the fourth bugfix release for the Python 3.1 series, Python 3.1.4.

Since the 3.1.4 release candidate 2 weeks ago, there have been three changes:
1. test_zipfile has been fixed on systems with an ASCII filesystem encoding.
2. pyexpat.__version__ has been changed to be the Python version.
3. A regression from 2.7.1 in the handling of comments in the netrc module has been resolved. (see issue #12009).

3.1.4 will be the last bug fix release in the 3.1 series before 3.1. After 3.1.4, 3.1 will be in security-only fix mode.

The Python 3.1 version series focuses on the stabilization and optimization of the features and changes that Python 3.0 introduced. For example, the new I/O system has been rewritten in C for speed. File system APIs that use unicode strings now handle paths with undecodable bytes in them. Other features include an ordered dictionary implementation, a condensed syntax for nested with statements, and support for ttk Tile in Tkinter. For a more extensive list of changes in 3.1, see http://doc.python.org/3.1/whatsnew/3.1.html or Misc/NEWS in the Python distribution.

This is a production release. To download Python 3.1.4 visit:

http://www.python.org/download/releases/3.1.4/

A list of changes in 3.1.4 can be found here:

http://hg.python.org/cpython/raw-file/feae9f9e9f30/Misc/NEWS

The 3.1 documentation can be found at:

http://docs.python.org/3.1

Bugs can always be reported to:

http://bugs.python.org

Enjoy and be merry!

--
Benjamin Peterson
Release Manager
benjamin at python.org
(on behalf of the entire python-dev team and 3.1.4's contributors)
Re: Python 2.7.2
On 12 June, 19:57, Benjamin Peterson wrote:

> On behalf of the Python development team, I'm rosy to announce the immediate
> availability of Python 2.7.2.

Small error: the link points to Python 2.7.1. The 2.7.2 page exists:

http://www.python.org/download/releases/2.7.2/

Updated Python 2.7.2 and 3.1.4 on my win box. Total time < 5 min. Good job. Thanks.

jmf
Re: Function declarations ?
On 2011-06-10, Asen Bozhilov wrote: > Andre Majorel wrote: > >> Is there a way to keep the definitions of the high-level >> functions at the top of the source ? I don't see a way to >> declare a function in Python. > > Languages with variable and function declarations usually use > hoisted environment. Hoisted ? With a pulley and a cable ? > JavaScript is the perfect example. Hoisted environment allows > you to use call expression before the physical declaration of > the function in the source text. The issue here is not the ability to call a function before its declaration. It's being able to do so before its definition. > Hope this helps, why Python use definitions instead of > declarations. It's not either/or. Any language has to provide a way to define functions whether or not it provides a way to declare them. Anyway, it seems the Python way to declare a function is def f (): pass Thanks everyone. -- André Majorel http://www.teaser.fr/~amajorel/ J'ai des droits. Les autres ont des devoirs. -- http://mail.python.org/mailman/listinfo/python-list
Re: [Python-Dev] [RELEASED] Python 3.1.4
2011/6/12 Paul Moore : > On 12 June 2011 18:58, Benjamin Peterson wrote: >> On behalf of the Python development team, I'm sanguine to announce a release >> candidate for the fourth bugfix release for the Python 3.1 series, Python >> 3.1.4. > > Is this actually a RC, or is that a typo? That is a typo. This is a final release! > Paul. > -- Regards, Benjamin -- http://mail.python.org/mailman/listinfo/python-list
Re: Permission denied and lock issue with multiprocess logging
On Jun 12, 8:49 am, david dani wrote:
> When i am running the implementation of multiprocess logging through
> queue handler, i get this error. It is the same with sockethandler as
> well as with pipe handler if multiprocesses are involved. I am not
> getting any hint to solve this problem. Please help to solve the
> problem.

There is an old bug on AIX which might be relevant, see

http://bugs.python.org/issue1234

It depends on how your Python was built - see the detailed comments about how Python should be configured before being built on AIX systems.

N.B. This error has nothing to do with logging - it's related to semaphore behaviour in the presence of fork(), which of course happens in multiprocessing scenarios.

Regards,

Vinay Sajip
Re: parallel computations: subprocess.Popen(...).communicate()[0] does not work with multiprocessing.Pool
In article Hseu-Ming Chen wrote: >I am having an issue when making a shell call from within a >multiprocessing.Process(). Here is the story: i tried to parallelize >the computations in 800-ish Matlab scripts and then save the results >to MySQL. The non-parallel/serial version has been running fine for >about 2 years. However, in the parallel version via multiprocessing >that i'm working on, it appears that the Matlab scripts have never >been kicked off and nothing happened with subprocess.Popen. The debug >printing below does not show up either. I obviously do not have your code, and have not even tried this as an experiment in a simplified environment, but: >import subprocess >from multiprocessing import Pool > >def worker(DBrow,config): > # run one Matlab script > cmd1 = "/usr/local/bin/matlab ... myMatlab.1.m" > subprocess.Popen([cmd1], shell=True, > stdout=subprocess.PIPE).communicate()[0] > print "this does not get printed" ... ># kick off parallel processing >pool = Pool() >for DBrow in DBrows: pool.apply_async(worker,(DBrow,config)) >pool.close() >pool.join() The multiprocessing code makes use of pipes to communicate between the various subprocesses it creates. I suspect these "extra" pipes are interfering with your subprocesses, when pool.close() waits for the Matlab script to do something with its copy of the pipes. To make the subprocess module close them -- so that Matlab does not have them in the first place and hence pool.close() cannot get stuck there -- add "close_fds=True" to the Popen() call. There could still be issues with competing wait() and/or waitpid() calls (assuming you are using a Unix-like system, or whatever the equivalent is for Windows) "eating" the wrong subprocess completion notifications, but that one is harder to solve in general :-) so if close_fds fixes things, it was just the pipes. If close_fds does not fix things, you will probably need to defer the pool.close() step until after all the subprocesses complete. 
-- In-Real-Life: Chris Torek, Wind River Systems Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603 email: gmail (figure it out) http://web.torek.net/torek/index.html -- http://mail.python.org/mailman/listinfo/python-list
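The close_fds suggestion above can be sketched like this. The helper name run_and_capture is invented, and the child command is a stand-in for the Matlab command line, which is elided in the post. Note that on POSIX systems Python 3.2 and later already default to close_fds=True, so passing it explicitly mainly matters for older interpreters such as the 2.x series used in this thread.

```python
import subprocess
import sys


def run_and_capture(argv):
    """Run argv as a child process and return its standard output as bytes.

    close_fds=True keeps the child from inheriting the parent's other open
    file descriptors, such as the pipes multiprocessing uses internally.
    """
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, close_fds=True)
    out, _ = proc.communicate()
    return out


# A stand-in for the Matlab command line, which is elided in the post.
output = run_and_capture([sys.executable, "-c", "print('done')"])
```

Passing the command as a list without shell=True also avoids an extra shell process sitting between the worker and the program it starts.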
Re: debian defaults not up-to-date
On Jun 12, 10:25 pm, blues2use wrote: > Just finished installing Mint 10 and all has gone well. However, when I > removed some applications, I received this error during the removal: > > INFO: using unknown version '/usr/bin/python2.7' (debian_defaults not up- > to-date?) > > The removal proceeds without any other warnings or errors. > > Not sure what I should do to correct this. > > Your help is most appreciated... > > Thanks in advance Yes I have also got this from time to time (On Debian in my case) Not sure what... May be better to ask on a debian-(related) list -- http://mail.python.org/mailman/listinfo/python-list
Re: Handling emails
On Sun, 12 Jun 2011 21:57:38 +0800, TheSaint wrote: > However, some line will fail to decode correctly. I can't imagine why emails > don't comply to a standard. Any headers should be in ASCII; Non-ASCII characters should be encoded using quoted-printable and/or base-64 encoding. Any message with non-ASCII characters in the headers can safely be discarded as spam (I've never seen this bug in "legitimate" email). Many MTAs will simply reject such messages. The message body can be in any encoding, or in multiple encodings (e.g. for multipart/mixed content), or none (e.g. the body may be binary data rather than text). -- http://mail.python.org/mailman/listinfo/python-list
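Headers that carry non-ASCII text in the encoded form described above (RFC 2047 "encoded words") can be turned back into readable text with the standard email.header module. A small sketch, with an invented helper name and a made-up sample subject:

```python
from email.header import decode_header, make_header


def readable_header(raw):
    """Decode RFC 2047 encoded words (=?charset?q?...?= or =?charset?b?...?=)."""
    return str(make_header(decode_header(raw)))


# Quoted-printable encoded word; the underscore stands for a space.
subject = readable_header("=?utf-8?q?Caf=C3=A9_menu?=")
```

This applies to header values that have already been decoded from bytes to str, for example the values returned by the email package's Message objects.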
Re: Handling emails
On Sun, Jun 12, 2011 at 6:46 PM, Nobody wrote: > Any message with non-ASCII characters in the headers can safely be > discarded as spam (I've never seen this bug in "legitimate" email). > Many MTAs will simply reject such messages. http://en.wikipedia.org/wiki/Email_address#Internationalization It may not yet be in common use, but tossing international e-mails is probably not a great policy going forward. The reign of ASCII is coming to an end, security concerns about unicode's complexity notwithstanding. -- http://mail.python.org/mailman/listinfo/python-list
Re: Function declarations ?
Andre Majorel wrote:
>
>Anyway, it seems the Python way to declare a function is
>
>  def f ():
>    pass

No, that DEFINES a function. There is no way to declare a function in Python. It isn't done, because it isn't necessary.

That code doesn't do what you think it does. Example:

    def f():
        pass

    g = f

    def f():
        return 3

    print g()

That prints "None". That module has two definitions of f, not one. The meaning of the name "f" changes partway through the module.

Python is not C. You need to use Python habits, not C habits. What construct led you to think you need to declare a function like that? This code, for example, works fine:

    def g():
        return f()

    def f():
        return 3

    print g()

The name "f" does not have to be defined until the function "g" is actually executed.
--
Tim Roberts, t...@probo.com
Providenza & Boekelheide, Inc.
Re: Keyboard Layout: Dvorak vs Colemak: is it Worthwhile to Improve the Dvorak Layout?
Xah Lee wrote:
>
>(a lil weekend distraction from comp lang!)
>
>in recent years, there came this Colemak layout. The guy who created
>it, Colemak, has a site, and aggressively market his layout. It's in
>linuxes distro by default, and has become somewhat popular.
>...
>If your typing doesn't come anywhere close to a data-entry clerk, then
>any layout more efficient than Dvorak is practically meaningless.

More than that, any layout "more efficient" than QWERTY is practically meaningless. The whole "intentional inefficiency" thing in the design of the QWERTY layout is an urban legend. Once your fingers have the mapping memorized, the actual order is irrelevant. Studies have shown that even a strictly alphabetical layout works perfectly well, once the typist is acclimated.
--
Tim Roberts, t...@probo.com
Providenza & Boekelheide, Inc.
Subsetting a dataset
I have a huge dataset containing millions of rows and several dozen columns in a tab delimited text file. I need to extract a small subset of rows and only three columns. One of the three columns has two word string with header “Scientific Name”. The other two columns carry numbers for Longitude and Latitude, as below. Sci Name Longitude Latitude Column4 Gen sp1 82.5 28.4 … Gen sp2 45.9 29.7 … Gen sp1 57.9 32.9 … … … … … Of the many species listed under the column “Sci Name”, I am interested in only one species which will have multiple records interspersed in the millions of rows, and I will probably have to use filename.readline() to read the rows one at a time. How would I search for a particular species in the dataset and create a new dataset for the species with only the three columns? Next, I have to create such datasets for hundreds of species. All these species are listed in another text file. There must be a way to define an iterative function that looks at one species at a time in the list of species and creates separate dataset for each species. The huge dataset contains more species than those listed in the list of my interest. I very much appreciate any help. I am a beginner in Python. So, complete code would be more helpful. - Kumar -- Section of Integrative Biology University of Texas at Austin Austin, Texas 78712, USA -- http://mail.python.org/mailman/listinfo/python-list
Re: Subsetting a dataset
On Sun, Jun 12, 2011 at 9:53 PM, Kumar Mainali wrote:
> I have a huge dataset containing millions of rows and several dozen columns
> in a tab delimited text file. I need to extract a small subset of rows and
> only three columns. One of the three columns has two word string with header
> “Scientific Name”. The other two columns carry numbers for Longitude and
> Latitude, as below.
>
> Sci Name    Longitude    Latitude    Column4
> Gen sp1     82.5         28.4        …
> Gen sp2     45.9         29.7        …
> Gen sp1     57.9         32.9        …
> …           …            …           …
>
> Of the many species listed under the column “Sci Name”, I am interested in
> only one species which will have multiple records interspersed in the
> millions of rows, and I will probably have to use filename.readline() to
> read the rows one at a time. How would I search for a particular species in
> the dataset and create a new dataset for the species with only the three
> columns?
>
> Next, I have to create such datasets for hundreds of species. All these
> species are listed in another text file. There must be a way to define an
> iterative function that looks at one species at a time in the list of
> species and creates separate dataset for each species. The huge dataset
> contains more species than those listed in the list of my interest.
>
> I very much appreciate any help. I am a beginner in Python. So, complete
> code would be more helpful.
> - Kumar

Read in the file with the list of species. For each line in that list, open an output file and put it into a dictionary where the key is the species name and the value is the file object. Then, once all your output files are created, open the second file (the big dataset). For each line in it, split the line on the tabs and check whether the species name (the first item in your sample) is in the dict. If it is, grab the values you need and write them to the corresponding file.

In rough, untested code:

    # Column positions in the data file; adjust these to match your file.
    name_col, lon_col, lat_col = 0, 1, 2

    animals = dict()
    for line in open('species_list'):
        # make a file for that animal and associate it with the name
        name = line.strip()
        animals[name] = open('%s_data.csv' % name, 'w')

    # now open the second file
    for line in open('animal_data'):
        data = line.rstrip('\n').split('\t')
        if data[name_col] in animals:
            animals[data[name_col]].write('%s\t%s\t%s\n' % (
                data[name_col], data[lat_col], data[lon_col]))

    # close the output files so everything is flushed to disk
    for f in animals.values():
        f.close()

replacing the respective file names and column numbers as appropriate, of course.

--
http://mail.python.org/mailman/listinfo/python-list
Re: Subsetting a dataset
On 6/13/2011 12:53 AM, Kumar Mainali wrote:
> I have a huge dataset containing millions of rows and several dozen columns
> in a tab delimited text file. I need to extract a small subset of rows and
> only three columns. One of the three columns has two word string with header
> “Scientific Name”. The other two columns carry numbers for Longitude and
> Latitude, as below.
>
> Sci Name    Longitude    Latitude    Column4
> Gen sp1     82.5         28.4        …
> Gen sp2     45.9         29.7        …
> Gen sp1     57.9         32.9        …
>
> Of the many species listed under the column “Sci Name”, I am interested in
> only one species which will have multiple records interspersed in the
> millions of rows, and I will probably have to use filename.readline() to
> read the rows one at a time. How would I search for a particular species in
> the dataset and create a new dataset for the species with only the three
> columns?
>
> Next, I have to create such datasets for hundreds of species. All these
> species are listed in another text file. There must be a way to define an
> iterative function that looks at one species at a time in the list of
> species and creates separate dataset for each species. The huge dataset
> contains more species than those listed in the list of my interest.

Consider using a real database program with Sci_name indexed. Then you can extract the rows for any species as needed. You should only need separate files if you want to export them or more or less permanently split the database. You could try sqlite, which comes with Python, or one of the other free database programs.

--
Terry Jan Reedy
--
http://mail.python.org/mailman/listinfo/python-list
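A minimal sketch of the sqlite suggestion, using the sqlite3 and csv modules from the standard library. It is written for Python 3, not the 2.x releases current when this thread was posted. The table name `records`, the index name, and the helper names `load_records` and `rows_for_species` are made up for illustration. It also assumes the file has one header row and that name, longitude and latitude are the first three columns, as in the sample above.

```python
import csv
import sqlite3


def load_records(conn, data_path):
    """Copy the name, longitude and latitude columns into an indexed table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records "
        "(sci_name TEXT, longitude REAL, latitude REAL)"
    )
    with open(data_path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        next(reader, None)  # skip the header row
        # Assumed layout: name, longitude, latitude are columns 0, 1, 2.
        rows = ((r[0], float(r[1]), float(r[2])) for r in reader if len(r) >= 3)
        conn.executemany("INSERT INTO records VALUES (?, ?, ?)", rows)
    # The index is what makes per-species lookups fast on millions of rows.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_sci_name ON records (sci_name)")
    conn.commit()


def rows_for_species(conn, name):
    """Return (sci_name, longitude, latitude) tuples for one species."""
    cursor = conn.execute(
        "SELECT sci_name, longitude, latitude FROM records WHERE sci_name = ?",
        (name,),
    )
    return cursor.fetchall()
```

Typical use would be to call `load_records(sqlite3.connect("species.db"), "animal_data")` once, then call `rows_for_species` for each name in the species list. Both file names here are placeholders.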
Re: Subsetting a dataset
On Sun, Jun 12, 2011 at 9:53 PM, Kumar Mainali wrote:
> I have a huge dataset containing millions of rows and several dozen columns
> in a tab delimited text file. I need to extract a small subset of rows and
> only three columns. One of the three columns has two word string with header
> “Scientific Name”. The other two columns carry numbers for Longitude and
> Latitude, as below.
>
> Sci Name    Longitude    Latitude    Column4
> Gen sp1     82.5         28.4        …
> Gen sp2     45.9         29.7        …
> Gen sp1     57.9         32.9        …
> …           …            …           …
>
> Of the many species listed under the column “Sci Name”, I am interested in
> only one species which will have multiple records interspersed in the
> millions of rows, and I will probably have to use filename.readline() to
> read the rows one at a time. How would I search for a particular species in
> the dataset and create a new dataset for the species with only the three
> columns?
>
> Next, I have to create such datasets for hundreds of species. All these
> species are listed in another text file. There must be a way to define an
> iterative function that looks at one species at a time in the list of
> species and creates separate dataset for each species. The huge dataset
> contains more species than those listed in the list of my interest.
>
> I very much appreciate any help. I am a beginner in Python. So, complete
> code would be more helpful

You could use the csv module, which has shipped with CPython since 2.3. Don't be fooled by the name: it lets you redefine the delimiter and other aspects of the format, which makes it suitable for tab-separated values as well:

http://docs.python.org/release/3.2/library/csv.html
http://docs.python.org/release/2.7.2/library/csv.html

--
http://mail.python.org/mailman/listinfo/python-list
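For illustration, here is a rough sketch of the csv approach for a single species, written for Python 3. The function name `extract_species` and its parameters are invented for this example. The header names "Sci Name", "Longitude" and "Latitude" are taken from the sample in the question and may need adjusting to match the real file.

```python
import csv


def extract_species(data_path, out_path, species,
                    name_field="Sci Name",
                    keep=("Sci Name", "Longitude", "Latitude")):
    """Write the rows for one species, keeping only the `keep` columns.

    Returns the number of rows written.
    """
    count = 0
    with open(data_path, newline="") as src, \
            open(out_path, "w", newline="") as dst:
        # delimiter="\t" switches the csv module to tab-separated values.
        reader = csv.DictReader(src, delimiter="\t")
        # extrasaction="ignore" drops every column not listed in `keep`.
        writer = csv.DictWriter(dst, fieldnames=list(keep), delimiter="\t",
                                extrasaction="ignore")
        writer.writeheader()
        for row in reader:  # rows are read one at a time, not all at once
            if row[name_field] == species:
                writer.writerow(row)
                count += 1
    return count
```

Because the reader streams the file row by row, memory use stays small even with millions of rows. Calling this once per species rereads the whole file each time, so for hundreds of species a single pass that writes to a dictionary of output files is likely to be faster.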