Re: what's this instance?
J. Peng wrote: > def safe_float(object): > try: > retval = float(object) > except (ValueError, TypeError), oops: > retval = str(oops) > return retval > > x=safe_float([1,2,3,4]) > print x > > > The code above works well.But what's the instance of "oops"? where is it > coming from? I'm totally confused on it.thanks. The line except (ValueError, TypeError), oops: will trap ValueError and TypeError exceptions. The actual exception object will be assigned to the name "oops". -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco -- http://mail.python.org/mailman/listinfo/python-list
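As a small illustration of the answer above, the following sketch shows that the name in the except clause is bound to the exception instance itself. It uses the `as` spelling, which Python 2.6 and later accept and Python 3 requires; the helper name `describe_failure` is made up for this example and does not appear in the thread.

```python
def describe_failure(value):
    """Return (exception class name, message) if float(value) fails, else None."""
    try:
        float(value)
    except (ValueError, TypeError) as oops:
        # `oops` is the exception instance that float() raised.
        return type(oops).__name__, str(oops)
    return None


print(describe_failure([1, 2, 3, 4]))  # a TypeError instance was caught
print(describe_failure("abc"))         # a ValueError instance was caught
print(describe_failure("1.5"))         # no exception, so None
```

The instance carries the message that `str(oops)` turns into text, which is what the original `safe_float` returns on failure.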
Re: is it possible to set namespace to an object.
On 21 Jan, 22:00, George Sakkis <[EMAIL PROTECTED]> wrote: > On Jan 21, 2:52 pm, glomde <[EMAIL PROTECTED]> wrote: > > > > > On 21 Jan, 20:16, George Sakkis <[EMAIL PROTECTED]> wrote: > > > > On Jan 21, 1:56 pm, glomde <[EMAIL PROTECTED]> wrote: > > > > > On 21 Jan, 18:59, Wildemar Wildenburger > > > > > <[EMAIL PROTECTED]> wrote: > > > > > glomde wrote: > > > > > > Hi, > > > > > > > is it somehow possible to set the current namespace so that is in an > > > > > > object. > > > > > > [snip] > > > > > > set namespace testObj > > > > > > Name = "Test" > > > > > > > Name would set testObj.Name to "Test". > > > > > > > [snip] > > > > > > > Is the above possible? > > > > > > Don't know, sorry. But let me ask you this: Why do you want to do > > > > > this? > > > > > Maybe there is another way to solve the problem that you want to > > > > > solve. > > > > > The reason is that I do not want to repeat myself. It is to set up XML > > > > type like > > > > trees and I would like to be able to do something like. > > > > > with ElemA(): > > > > Name = "Top" > > > > Description "Blahaha..." > > > > with ElemB(): > > > > Name = "ChildA" > > > > Description "Blahaha..." > > > > > > > > > This would be the instead of. > > > > with ElemA() as node: > > > > node.Name = "Top" > > > > node.Description "Blahaha..." > > > > with ElemB() as node: > > > > node.Name = "ChildA" > > > > node.Description "Blahaha..." > > > > > > > > > So to save typing and have something that I think looks nicer. > > > > ... and more confusing for anyone reading the code (including you > > > after a few weeks/months). If you want to save a few keystrokes, you > > > may use 'n' instead of 'node' or use an editor with easy auto > > > completion. > > > > By the way, is there any particular reason for generating the XML > > > programmatically like this ? Why not have a separate template and use > > > one of the dozen template engines to populate it ? > > > > George > > > I am not using it for XML generation. 
It was only an example. But > > the reason for using it programmatically is that you mix power > > of python with templating. Using for loops and so on. > > Any template engine worth its name supports loops. Other than that, > various engines provide different degrees of integration with Python, > from pretty limited (e.g. Django templates) to quite extensive (e.g. > Mako, Tenjin). > > > The above was only an example. And yes it might be confusing if you > > read the code. But I still want to do it, the question is it possible? > > I would be surprised if it is. Yet another idea you may want to > explore if you want that syntax so much is by (ab)using the class > statement since it introduces a new namespace: > > class ElemA: > Name = "Top" > Description "Blahaha..." > class ElemB: > Name = "ChildA" > Description "Blahaha..." > > PEP 359 would address this easily (it's actually the first use case > shown) but unfortunately it was withdrawn. > > George > > [1]http://www.python.org/dev/peps/pep-0359/#example-simple-namespaces Yes the make statement would have done it. But I realized that it might be possible if it is possible to override the __setattr__ of local. Then the enter function would set a global variable and the default setattr would set/get variables from this variable. Is this possible? -- http://mail.python.org/mailman/listinfo/python-list
Re: what's this instance?
On Tue, 22 Jan 2008 15:36:49 +0800, J. Peng wrote:

> def safe_float(object):
>     try:
>         retval = float(object)
>     except (ValueError, TypeError), oops:
>         retval = str(oops)
>     return retval
>
> x=safe_float([1,2,3,4])
> print x
>
> The code above works well.But what's the instance of "oops"? where is it
> coming from? I'm totally confused on it.thanks.

`oops` is bound to the `ValueError` or `TypeError` object if `float()` raises such an exception.

Ciao, Marc 'BlackJack' Rintsch
Re: Calculate net transfer rate without dummy file
whatazor wrote:

> Hi,
> how can I calulate transfer rate to a host , without using a file ?
> can ping module (written by Jeremy Hylton) be useful ?

You can't measure without transmitting data. It's not only the network connection between the two hosts that matters, but also the sending and receiving processes, whether they can cope with the amount of data, and so forth.

Diez
Question on sort() key function
Hello,

I have this class:

class File:
    def __init__(self):
        self.name = ''
        self.path = ''
        self.date = 0
        self.mod_date = 0
        self.keywords = []
        self.url = ''

...and after creating a list of File objects called flist, I'd like to sort it like this:

flist.sort(key=File.mod_date.toordinal)

However, Python says:

AttributeError: class File has no attribute 'mod_date'

Well if you ask me, there are many things that may be said about my File class, but the absence of the attribute 'mod_date' ain't one of them. What do you think?

And yes, this loop works fine:

for f in flist:
    print f.mod_date.isoformat()

(which IMO proves that all mod_date members are properly initialized as datetime objects).

robert
Re: Question on sort() key function
Robert Latest <[EMAIL PROTECTED]> writes: > flist.sort(key=File.mod_date.toordinal) > > However, Python says: > AttributeError: class File has no attribute 'mod_date' The attribute is on instances of File, not on the class itself. See if this works: flist.sort(key=lambda f: f.mod_date.toordinal) -- http://mail.python.org/mailman/listinfo/python-list
Re: Just for fun: Countdown numbers game solver
Hi Arnaud > I've tried a completely different approach, that I imagine as 'folding'. I > thought it would improve performance over my previous effort but extremely > limited and crude benchmarking seems to indicate disappointingly comparable > performance... I wrote a stack-based version yesterday and it's also slow. It keeps track of the stack computation and allows you to backtrack. I'll post it sometime, but it's too slow for my liking and I need to put in some more optimizations. I'm trying not to think about this problem. What was wrong with the very fast(?) code you sent earlier? Terry -- http://mail.python.org/mailman/listinfo/python-list
Re: what's this instance?
J. Peng wrote:

> def safe_float(object):
>     try:
>         retval = float(object)
>     except (ValueError, TypeError), oops:
>         retval = str(oops)
>     return retval
>
> The code above works well.

For which definition of "works well"?

This function is really ill-named: it returns either a float or a string, so it is definitely not safe:

def dosomething(x):
    return x + (x / 0.5)

x = safe_float([1,2,3,4])
# a dozen lines of code here
y = dosomething(x)

And now, have fun trying to trace the real problem... Better not to use this function at all IMHO; at least you'll get a meaningful traceback.

> But what's the instance of "oops"? where is it
> coming from? I'm totally confused on it.thanks.

See the other answers on this.
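The objection above can be reproduced with a short sketch. It is written for Python 3 (hence `as` and `print()`), and the names simply mirror the ones used in this message; nothing here is taken from the book being discussed.

```python
def safe_float(obj):
    try:
        return float(obj)
    except (ValueError, TypeError) as oops:
        return str(oops)  # the error message is returned as if it were a result


def dosomething(x):
    return x + (x / 0.5)


x = safe_float([1, 2, 3, 4])  # x is now a str, not a float

try:
    dosomething(x)
except TypeError as err:
    # The failure only surfaces here, away from the conversion that went wrong.
    print("failed later:", err)
```

Letting `float()` raise at the call site, or returning a distinct sentinel such as `None`, keeps the error close to its cause; which of the two is preferable depends on the calling code.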
Re: Question on sort() key function
Paul Rubin wrote: > The attribute is on instances of File, not on the class itself. See > if this works: > >flist.sort(key=lambda f: f.mod_date.toordinal) It doesn't throw an error any more, but neither does it sort the list. This, however, works: -- def by_date(f1, f2): return f1.mod_date.toordinal() - f2.mod_date.toordinal() flist.sort(by_date) -- So I'm sticking with it, although I sort of liked the key approach. robert -- http://mail.python.org/mailman/listinfo/python-list
Re: Trouble writing to database: RSS-reader
MRAB wrote:

> On Jan 21, 9:15 pm, Bruno Desthuilliers <[EMAIL PROTECTED]> wrote:
>> Arne wrote:
(snip)
>>> So, I shouldn't use this techinicke (probably wrong spelled)
>> May I suggest "technic" ?-)
>
> That should be "technique"; just ask a Francophone! :-)

My bad :(
Re: what's this instance?
Bruno Desthuilliers wrote:

> J. Peng wrote:
>> def safe_float(object):
>>     try:
>>         retval = float(object)
>>     except (ValueError, TypeError), oops:
>>         retval = str(oops)
>>     return retval
>
>> The code above works well.
>
> For which definition of "works well" ?

I got it from the Core Python Programming book I bought. You may ask Wesley Chun about it. :)
Re: Bug in __init__?
Bart Ogryczak wrote:

> On 2008-01-18, citizen Zbigniew Braniecki testified:
(snip usual default mutable list arg problem)
>> class A():
>>
>>    def add (self, el):
>>      self.lst.extend(el)
>>
>>    def __init__ (self, val=[]):
>>      print val
>>      self.lst = val
>
> What you want probably is:
> def __init__ (self, val=None):
>   if(val == None):

Better to use an identity test here - there's only one instance of the None object, and an identity test is faster than an equality test (one function call faster IIRC !-). Also, the parens are totally useless.

     if val is None:

>     self.lst = []
>   else:
>     from copy import copy
>     ### see also deepcopy
>     self.lst = copy(val)

What makes you think the OP wants a copy ?
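For readers who have not met the "usual default mutable list arg problem" snipped above, here is a minimal sketch of it together with the `is None` idiom recommended in this message. The class names `Shared` and `Fresh` are invented for the illustration.

```python
class Shared:
    def __init__(self, val=[]):  # one list object, created when the def is executed
        self.lst = val


class Fresh:
    def __init__(self, val=None):
        if val is None:          # identity test against the single None object
            val = []             # a new list for every instance
        self.lst = val


a, b = Shared(), Shared()
a.lst.append(1)
print(b.lst)  # b sees a's change, because both hold the same default list

c, d = Fresh(), Fresh()
c.lst.append(1)
print(d.lst)  # d is unaffected
```

Note that `Fresh` stores the caller's list as-is when one is passed in; whether a copy is wanted is exactly the open question raised at the end of the message.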
Re: HTML parsing confusion
On Jan 22, 4:31 pm, Alnilam <[EMAIL PROTECTED]> wrote:

> Sorry for the noob question, but I've gone through the documentation
> on python.org, tried some of the diveintopython and boddie's examples,
> and looked through some of the numerous posts in this group on the
> subject and I'm still rather confused. I know that there are some
> great tools out there for doing this (BeautifulSoup, lxml, etc.) but I
> am trying to accomplish a simple task with a minimal (as in nil)
> amount of adding in modules that aren't "stock" 2.5, and writing a
> huge class of my own (or copying one from diveintopython) seems
> overkill for what I want to do.
>
> Here's what I want to accomplish... I want to open a page, identify a
> specific point in the page, and turn the information there into
> plaintext. For example, on the www.diveintopython.org page, I want to
> turn the paragraph that starts "Translations are freely
> permitted" (and ends ..."let me know"), into a string variable.
>
> Opening the file seems pretty straightforward.
>
> >>> import urllib
> >>> page = urllib.urlopen("http://diveintopython.org/")
> >>> source = page.read()
> >>> page.close()
>
> gets me to a string variable consisting of the un-parsed contents of
> the page.
> Now things get confusing, though, since there appear to be several
> approaches.
> One that I read somewhere was:
>
> >>> from xml.dom.ext.reader import HtmlLib

Pardon me, but the standard issue Python 2.n (for n in range(5, 2, -1)) doesn't have an xml.dom.ext ... you must have the mega-monstrous 200-modules PyXML package installed. And you don't want the 75Kb BeautifulSoup?
Re: Question on sort() key function
Robert Latest wrote: > Paul Rubin wrote: >> The attribute is on instances of File, not on the class itself. See >> if this works: >> >>flist.sort(key=lambda f: f.mod_date.toordinal) > > It doesn't throw an error any more, but neither does it sort the list. This, > however, works: > > -- > def by_date(f1, f2): > return f1.mod_date.toordinal() - f2.mod_date.toordinal() > > flist.sort(by_date) > -- > > So I'm sticking with it, although I sort of liked the key approach. > > robert This should work then: def date_key(f): return f.mod_date.toordinal() flist.sort(key=date_key) This can also be written as flist.sort(key=lambda f: f.mod_date.toordinal()) Peter -- http://mail.python.org/mailman/listinfo/python-list
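A self-contained sketch of the key-based sort suggested above, written for Python 3. The `File` class here is a simplified stand-in that takes its values as constructor arguments, and the file names and dates are made-up sample data. It also shows `operator.attrgetter`, which works because `date` objects compare directly, so the detour through `toordinal()` is optional.

```python
import datetime
from operator import attrgetter


class File:
    def __init__(self, name, mod_date):
        self.name = name
        self.mod_date = mod_date


flist = [
    File("b.txt", datetime.date(2008, 1, 22)),
    File("a.txt", datetime.date(2007, 6, 1)),
    File("c.txt", datetime.date(2008, 1, 1)),
]

# Key function that *calls* toordinal() and so returns an int per file.
flist.sort(key=lambda f: f.mod_date.toordinal())
by_ordinal = [f.name for f in flist]

# Equivalent ordering using the date objects themselves as keys.
by_date = [f.name for f in sorted(flist, key=attrgetter("mod_date"))]

print(by_ordinal)  # ['a.txt', 'c.txt', 'b.txt']
print(by_date)     # ['a.txt', 'c.txt', 'b.txt']
```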
Re: stdin, stdout, redmon
Hello, I checked under linux and it works : text.txt : "first line of the text file second line of the text file" test.py : "import sys a = sys.stdin.readlines() x = ''.join(a) x = x.upper() sys.stdout.write(x)" >cat text.txt | python test.py But I reinstalled Python 2.5 under Windows XP and it doesn't work anyway. Can you confirm that your script works with Win XP and Python 2.5 ? Regards Rolf van de Krol a écrit : > I don't know what you did with your Python installation, but for me this > works perfectly. > > test3.py contains: > > import sys > > print sys.stdin.readlines() > > > test.txt contains: > > Testline1 > Testline2 > > > Output of 'python test3.py < test.txt' is: > > ['Testline1\n', 'Testline2'] > > > Just plain simple and just works. > > Rolf > > > > Bernard Desnoues wrote: >> Rolf van de Krol a écrit : >> >>> According to various tutorials this should work. >>> >>> >>> |import sys >>> data = sys.stdin.readlines() >>> print "Counted", len(data), "lines."| >>> >>> >>> Please use google before asking such questions. This was found with >>> only one search for the terms 'python read stdin' >>> >>> Rolf >>> >>> Bernard Desnoues wrote: >>> Hi, I've got a problem with the use of Redmon (redirection port monitor). I intend to develop a virtual printer so that I can modify data sent to the printer. Redmon send the data flow to the standard input and lauchs the Python program which send modified data to the standard output (Windows XP and Python 2.5 context). I can manipulate the standard output. "import sys sys.stdout.write(data)" it works. But how to manipulate standard input so that I can store data in a string or in an object file ? There's no "read" method. "a = sys.stdin.read()" doesn't work. "f = open(sys.stdin)" doesn't work. I don't find anything in the documentation. How to do that ? Thanks in advance. Bernard Desnoues Librarian Bibliothèque de géographie - Sorbonne >> >> Hello Rolf, >> >> I know this code because I have search a solution ! 
>> Your google code doesn't work ! No attribute "readlines". >> >> >>> import sys >> >>> data = sys.stdin.readlines() >> >> Traceback (most recent call last): >>File "", line 1, in >> data = sys.stdin.readlines() >> AttributeError: readlines -- http://mail.python.org/mailman/listinfo/python-list
Re: Question on sort() key function
Peter Otten wrote: > Robert Latest wrote: > >> Paul Rubin wrote: >>> The attribute is on instances of File, not on the class itself. See >>> if this works: >>> >>>flist.sort(key=lambda f: f.mod_date.toordinal) >> >> It doesn't throw an error any more, but neither does it sort the list. This, >> however, works: >> >> -- >> def by_date(f1, f2): >> return f1.mod_date.toordinal() - f2.mod_date.toordinal() >> >> flist.sort(by_date) >> -- >> >> So I'm sticking with it, although I sort of liked the key approach. >> >> robert > > This should work then: > > def date_key(f): > return f.mod_date.toordinal() > flist.sort(key=date_key) > > This can also be written as > > flist.sort(key=lambda f: f.mod_date.toordinal()) Well, that's almost Paul's (non-working) suggestion above, but it works because of the parentheses after toordinal. Beats me how both versions can be valid, anyway. To me it's all greek. I grew up with C function pointers, and they always work. robert -- http://mail.python.org/mailman/listinfo/python-list
Re: stdin, stdout, redmon
Bernard Desnoues wrote:

> Hello,
>
> I checked under linux and it works :
> text.txt :
> "first line of the text file
> second line of the text file"
>
> test.py :
> "import sys
> a = sys.stdin.readlines()
> x = ''.join(a)
> x = x.upper()
> sys.stdout.write(x)"
>
> >cat text.txt | python test.py
>
> But I reinstalled Python 2.5 under Windows XP and it doesn't work
> anyway. Can you confirm that your script works with Win XP and Python 2.5 ?

How are you invoking the script under WinXP? If you're using the standard file associations then stdin/stdout won't work correctly. However, they produce a specific error message:

C:\temp>type test3.py
import sys

print sys.stdin.readlines ()

C:\temp>
C:\temp>type test3.py | test3.py
Traceback (most recent call last):
  File "C:\temp\test3.py", line 3, in
    print sys.stdin.readlines ()
IOError: [Errno 9] Bad file descriptor

C:\temp>type test3.py | python test3.py
['import sys\n', '\n', 'print sys.stdin.readlines ()']

TJG
Re: stdin, stdout, redmon
On Jan 22, 8:42 pm, Bernard Desnoues <[EMAIL PROTECTED]> wrote: > Hello, > > I checked under linux and it works : > text.txt : > "first line of the text file > second line of the text file" > > test.py : > "import sys > a = sys.stdin.readlines() > x = ''.join(a) > x = x.upper() > sys.stdout.write(x)" > > >cat text.txt | python test.py > > But I reinstalled Python 2.5 under Windows XP and it doesn't work > anyway. Can you confirm that your script works with Win XP and Python 2.5 ? > > Regards > > Rolf van de Krol a écrit : > > > I don't know what you did with your Python installation, but for me this > > works perfectly. > > > test3.py contains: > > > > import sys > > > print sys.stdin.readlines() > > > > > test.txt contains: > > > > Testline1 > > Testline2 > > > > > Output of 'python test3.py < test.txt' is: > > > > ['Testline1\n', 'Testline2'] > > > > > Just plain simple and just works. > > > Rolf > > > Bernard Desnoues wrote: > >> Rolf van de Krol a écrit : > > >>> According to various tutorials this should work. > > >>> > >>> |import sys > >>> data = sys.stdin.readlines() > >>> print "Counted", len(data), "lines."| > >>> > > >>> Please use google before asking such questions. This was found with > >>> only one search for the terms 'python read stdin' > > >>> Rolf > > >>> Bernard Desnoues wrote: > > Hi, > > I've got a problem with the use of Redmon (redirection port > monitor). I intend to develop a virtual printer so that I can modify > data sent to the printer. > Redmon send the data flow to the standard input and lauchs the > Python program which send modified data to the standard output > (Windows XP and Python 2.5 context). > I can manipulate the standard output. > > "import sys > sys.stdout.write(data)" > > it works. > But how to manipulate standard input so that I can store data in a > string or in an object file ? There's no "read" method. > > "a = sys.stdin.read()" doesn't work. > "f = open(sys.stdin)" doesn't work. 
> > I don't find anything in the documentation. How to do that ? > Thanks in advance. > > Bernard Desnoues > Librarian > Bibliothèque de géographie - Sorbonne > > >> Hello Rolf, > > >> I know this code because I have search a solution ! > >> Your google code doesn't work ! No attribute "readlines". > > >> >>> import sys > >> >>> data = sys.stdin.readlines() > > >> Traceback (most recent call last): > >>File "", line 1, in > >> data = sys.stdin.readlines() > >> AttributeError: readlines Excuse me, gentlemen, may I be your referee *before* you resort to pistols at dawn? = IDLE = IDLE 1.2.1 >>> import sys >>> sys.stdin.readlines Traceback (most recent call last): File "", line 1, in sys.stdin.readlines AttributeError: readlines >>> = Command Prompt = C:\junk>python Python 2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.1310 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> import sys >>> sys.stdin.readlines >>> HTH, John -- http://mail.python.org/mailman/listinfo/python-list
Re: Question on sort() key function
On Tue, 22 Jan 2008 09:56:55 +, Robert Latest wrote: > Peter Otten wrote: >> Robert Latest wrote: >> >> This should work then: >> >> def date_key(f): >> return f.mod_date.toordinal() >> flist.sort(key=date_key) >> >> This can also be written as >> >> flist.sort(key=lambda f: f.mod_date.toordinal()) > > Well, that's almost Paul's (non-working) suggestion above, but it works > because of the parentheses after toordinal. Beats me how both versions can > be valid, anyway. > > To me it's all greek. I grew up with C function pointers, and they > always work. > > robert Suppose `func` is a C function pointer, then foo = func; and foo = func(); have different meanings. It's just the same in Python. First is the function itself, second *calls* the function. Ciao, Marc 'BlackJack' Rintsch -- http://mail.python.org/mailman/listinfo/python-list
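The distinction described above can be seen directly in a short Python 3 sketch (the dates are arbitrary sample values). It also shows what happens when the uncalled method is used as a sort key: Python 3 refuses to order bound methods and raises `TypeError`, whereas the Python 2 of this thread apparently compared them in some arbitrary order, which would match the "no error, but no sorting" report earlier in the thread.

```python
import datetime

d = datetime.date(2008, 1, 22)

method = d.toordinal    # no parentheses: the bound method object itself
number = d.toordinal()  # parentheses: the method is called and returns an int

print(callable(method), type(number).__name__)
print(method() == number)  # calling the stored method later gives the same int

dates = [datetime.date(2008, 1, 22), datetime.date(2007, 6, 1)]
try:
    sorted(dates, key=lambda x: x.toordinal)  # key is a method, not a number
    outcome = "sorted"
except TypeError:
    outcome = "TypeError"
print(outcome)
```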
Re: what's this instance?
J. Peng wrote:

> Bruno Desthuilliers wrote:
>> J. Peng wrote:
>>> def safe_float(object):
>>>     try:
>>>         retval = float(object)
>>>     except (ValueError, TypeError), oops:
>>>         retval = str(oops)
>>>     return retval
>>> The code above works well.
>> For which definition of "works well" ?
>
> I got it from Core Python Programming book I bought.You may ask it to
> Wesley Chun.:)

Ok: Mr Chun, if you hear us ?-)
possible to overide setattr in local scope?
In a class it is possible to override __setattr__, so that you can decide how the setting of variables should be handled.

Is it possible to do this outside of a class, at module level?

def mysetattr(obj, var, value):
    print "Hello"

So that

test = 5

would print

Hello
MySQLdb and DictCursor
Hi,

I'm using a MySQLdb connection with a DictCursor, and it seems to me that the wrapping into dictionaries only prepends table names to the column names when there is an actual conflict in the keys. I would like the cursor to always prepend table names, no matter what. Is this possible?

Thanks,
-Frank
Re: possible to overide setattr in local scope?
glomde wrote:

> In a class it is poosible to override setattr, so that you can decide
> how you should
> handle setting of variables.
>
> Is this possible to do outside of an class on module level.
>
> mysetattr(obj, var, value):
>   print "Hello"
>
> So that
>
> test = 5
>
> would print
> Hello

No, that's not possible. What you could do instead is create a singleton that you use to store the values in, instead of the module directly. Like this (untested):

class ModuleState(object):
    # borg pattern - why not...
    _shared_state = {}

    def __init__(self):
        # go through object.__setattr__ so our own hook is not triggered here
        object.__setattr__(self, '__dict__', ModuleState._shared_state)

    def __setattr__(self, name, value):
        print "Hello"
        # delegate to object; calling setattr(self, ...) here would recurse forever
        object.__setattr__(self, name, value)

state = ModuleState()

Then you do

state.test = 5

Diez
Re: HTML parsing confusion
On 22 Jan, 06:31, Alnilam <[EMAIL PROTECTED]> wrote: > Sorry for the noob question, but I've gone through the documentation > on python.org, tried some of the diveintopython and boddie's examples, > and looked through some of the numerous posts in this group on the > subject and I'm still rather confused. I know that there are some > great tools out there for doing this (BeautifulSoup, lxml, etc.) but I > am trying to accomplish a simple task with a minimal (as in nil) > amount of adding in modules that aren't "stock" 2.5, and writing a > huge class of my own (or copying one from diveintopython) seems > overkill for what I want to do. It's unfortunate that you don't want to install extra modules, but I'd probably use libxml2dom [1] for what you're about to describe... > Here's what I want to accomplish... I want to open a page, identify a > specific point in the page, and turn the information there into > plaintext. For example, on thewww.diveintopython.orgpage, I want to > turn the paragraph that starts "Translations are freely > permitted" (and ends ..."let me know"), into a string variable. > > Opening the file seems pretty straightforward. > > >>> import urllib > >>> page = urllib.urlopen("http://diveintopython.org/";) > >>> source = page.read() > >>> page.close() > > gets me to a string variable consisting of the un-parsed contents of > the page. Yes, there may be shortcuts that let some parsers read directly from the server, but it's always good to have the page text around, anyway. > Now things get confusing, though, since there appear to be several > approaches. 
> One that I read somewhere was: > > >>> from xml.dom.ext.reader import HtmlLib > >>> reader = HtmlLib.Reader() > >>> doc = reader.fromString(source) > > This gets me doc as > > >>> paragraphs = doc.getElementsByTagName('p') > > gets me all of the paragraph children, and the one I specifically want > can then be referenced with: paragraphs[5] This method seems to be > pretty straightforward, but what do I do with it to get it into a > string cleanly? In less sophisticated DOM implementations, what you'd do is to loop over the "descendant" nodes of the paragraph which are text nodes and concatenate them. > >>> from xml.dom.ext import PrettyPrint > >>> PrettyPrint(paragraphs[5]) > > shows me the text, but still in html, and I can't seem to get it to > turn into a string variable, and I think the PrettyPrint function is > unnecessary for what I want to do. Yes, PrettyPrint is for prettyprinting XML. You just want to visit and collect the text nodes. >Formatter seems to do what I want, > but I can't figure out how to link the "Element Node" at > paragraphs[5] with the formatter functions to produce the string I > want as output. I tried some of the htmllib.HTMLParser(formatter > stuff) examples, but while I can supposedly get that to work with > formatter a little easier, I can't figure out how to get HTMLParser to > drill down specifically to the 6th paragraph's contents. Given that you've found the paragraph above, you just need to write a recursive function which visits child nodes, and if it finds a text node then it collects the value of the node in a list; otherwise, for elements, it visits the child nodes of that element; and so on. The recursive approach is presumably what the formatter uses, but I can't say that I've really looked at it. Meanwhile, with libxml2dom, you'd do something like this: import libxml2dom d = libxml2dom.parseURI("http://www.diveintopython.org/";, html=1) saved = None # Find the paragraphs. 
for p in d.xpath("//p"): # Get the text without leading and trailing space. text = p.textContent.strip() # Save the appropriate paragraph text. if text.startswith("Translations are freely permitted") and \ text.endswith("just let me know."): saved = text break The magic part of this code which saves you from needing to write that recursive function mentioned above is the textContent property on the paragraph element. Paul [1] http://www.python.org/pypi/libxml2dom -- http://mail.python.org/mailman/listinfo/python-list
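The recursive text-collecting function described in this message can be sketched with only the standard library's `xml.dom.minidom`. This is an illustration under assumptions: the markup in `SAMPLE` is invented, and minidom only accepts well-formed XML, so real-world HTML may fail to parse without a more forgiving parser. The function name `collect_text` is likewise made up.

```python
from xml.dom import minidom


def collect_text(node):
    """Concatenate the text nodes found below `node`, in document order."""
    parts = []
    for child in node.childNodes:
        if child.nodeType == child.TEXT_NODE:
            parts.append(child.data)            # plain text: keep it
        elif child.nodeType == child.ELEMENT_NODE:
            parts.append(collect_text(child))   # element: descend into it
    return "".join(parts)


# Invented, well-formed sample markup standing in for the downloaded page.
SAMPLE = (
    "<div><p>First.</p>"
    "<p>Translations are <b>freely</b> permitted; "
    "<a href='#'>let me know</a>.</p></div>"
)

doc = minidom.parseString(SAMPLE)
paragraphs = doc.getElementsByTagName("p")
text = collect_text(paragraphs[1]).strip()
print(text)  # Translations are freely permitted; let me know.
```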
Re: assigning values in python and perl
On 2008-01-17, Steven D'Aprano <[EMAIL PROTECTED]> wrote: > On Thu, 17 Jan 2008 11:40:59 +0800, J. Peng wrote: > >> May I ask, python's pass-by-reference is passing the object's reference >> to functions, but perl, or C's pass-by-reference is passing the variable >> itself's reference to functions. So althought they're all called >> pass-by-reference,but will get different results.Is it? > > Python is not call by reference. > > Any book or person that says it is, is wrong to do so. > > Python's function call semantics are not the same as C, or Perl, or > Pascal. They are, however, similar to those of Lisp, Scheme, Emerald and > especially CLU. It is neither pass by reference, nor pass by value. I don't think it is the function call semantics that are so different as it is the assignment itself that is different. an assignment in C, doesn't bind a new object to the name, but stores new information in the object. Trying to explain the different behaviour of C and python of examples calling function that assign to a parameter, without explaining how the assignment works will IMO not give people enough to understand what is happening. -- Antoon Pardon -- http://mail.python.org/mailman/listinfo/python-list
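The point about assignment can be shown with two small functions; the names `rebind` and `mutate` are invented for this sketch. Assigning to the parameter binds the local name to a new object and leaves the caller's object alone, while calling a mutating method changes the one object that both names refer to.

```python
def rebind(param):
    param = [0]      # binds the local name to a new list; the caller is unaffected


def mutate(param):
    param.append(0)  # changes the list object shared with the caller


data = [1, 2]

rebind(data)
after_rebind = list(data)

mutate(data)
after_mutate = list(data)

print(after_rebind)  # [1, 2]
print(after_mutate)  # [1, 2, 0]
```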
Re: ctypes CDLL - which paths are searched?
Thomas Heller wrote: > Helmut Jarausch schrieb: >> Hi, >> >> how can I specify the paths to be searched for a dynamic library >> to be loaded by ctypes' CDLL class on a Linux system. >> >> Do I have to set os.environment['LD_LIBRARY_PATH'] ? >> > > ctypes passes the argument given to CDLL(path) straight to > the dlopen(3) call, so your system documentation should tell you. > Thanks, but then it's difficult to use CDLL. Setting os.environ['LD_LIBRARY_PATH'] within the script which calls CDLL is too late. What other methods are possible rather than put an explicit export LD_LIBRARY_PATH=... before running the script, if I don't want to put the dynamic library into a standard system library. Many thanks, Helmut. -- Helmut Jarausch Lehrstuhl fuer Numerische Mathematik RWTH - Aachen University D 52056 Aachen, Germany -- http://mail.python.org/mailman/listinfo/python-list
Re: pairs from a list
I suppose my question should have been, is there an obviously faster way? Anyway, of the four ways below, the first is substantially fastest. Is there an obvious reason why?

Thanks, Alan Isaac

PS My understanding is that the behavior of the last is implementation dependent and not guaranteed.

from itertools import izip, islice

def pairs1(x):
    for x12 in izip(islice(x,0,None,2),islice(x,1,None,2)):
        yield x12

def pairs2(x):
    xiter = iter(x)
    while True:
        yield xiter.next(), xiter.next()

def pairs3(x):
    for i in range( len(x)//2 ):
        yield x[2*i], x[2*i+1],

def pairs4(x):
    xiter = iter(x)
    for x12 in izip(xiter,xiter):
        yield x12
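For anyone trying these on a current interpreter, here is a hedged Python 3 sketch of three of the variants, with `zip` in place of `izip` and invented function names. Two caveats: the `while True` variant (`pairs2`) relies on `StopIteration` escaping from a generator, which Python 3.7 and later turn into a `RuntimeError`, so it is left out; and regarding the PS, the current documentation for `zip()` states that left-to-right evaluation of the iterables is guaranteed, which is what the shared-iterator form depends on. No timing claims are made here; `timeit` on your own data is the way to compare speed.

```python
from itertools import islice


def pairs_islice(x):
    # Two strided views over the same sequence, zipped together.
    return zip(islice(x, 0, None, 2), islice(x, 1, None, 2))


def pairs_index(x):
    # Plain indexing; needs len() and item access, so sequences only.
    for i in range(len(x) // 2):
        yield x[2 * i], x[2 * i + 1]


def pairs_shared_iter(x):
    # zip pulls twice from the same iterator for every pair it builds.
    it = iter(x)
    return zip(it, it)


data = list(range(7))  # odd length: the last item has no partner and is dropped
print(list(pairs_islice(data)))
print(list(pairs_index(data)))
print(list(pairs_shared_iter(data)))
```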
Re: Question on sort() key function
Robert Latest <[EMAIL PROTECTED]> writes: > >flist.sort(key=lambda f: f.mod_date.toordinal) > > It doesn't throw an error any more, but neither does it sort the list. This, > however, works: Oh, I didn't realize that toordinal was a callable, given your earlier sample. You want: flist.sort(key=lambda f: f.mod_date.toordinal() ) -- http://mail.python.org/mailman/listinfo/python-list
Re: stdin, stdout, redmon
Well, that's at least weird. I did test my code with Python 2.5 on Win XP, using the command prompt. But testing it with IDLE gives exactly the same error Bernard has. So apparently STDIN can't be accessed with IDLE. Rolf John Machin wrote: > > Excuse me, gentlemen, may I be your referee *before* you resort to > pistols at dawn? > > = IDLE = > IDLE 1.2.1 > import sys sys.stdin.readlines > > Traceback (most recent call last): > File "", line 1, in > sys.stdin.readlines > AttributeError: readlines > > > = Command Prompt = > C:\junk>python > Python 2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.1310 32 bit > (Intel)] on win32 > Type "help", "copyright", "credits" or "license" for more information. > import sys sys.stdin.readlines > > > > HTH, > John > -- http://mail.python.org/mailman/listinfo/python-list
isgenerator(...) - anywhere to be found?
For a simple greenlet/tasklet/microthreading experiment I found myself in the need to ask the question

isgenerator(v)

but didn't find any implementation in the usual suspects - builtins or inspect.

I was able to help myself out with a simple (out of my head, hope its

def isgenerator(v):
    def _g(): yield
    return type(v) == type(_g())

But I wonder why there is no such method already available?

Diez
Re: ctypes CDLL - which paths are searched?
Helmut Jarausch schrieb: > Thomas Heller wrote: >> Helmut Jarausch schrieb: >>> Hi, >>> >>> how can I specify the paths to be searched for a dynamic library >>> to be loaded by ctypes' CDLL class on a Linux system. >>> >>> Do I have to set os.environment['LD_LIBRARY_PATH'] ? >>> >> >> ctypes passes the argument given to CDLL(path) straight to >> the dlopen(3) call, so your system documentation should tell you. >> > > Thanks, > > but then it's difficult to use CDLL. Setting > os.environ['LD_LIBRARY_PATH'] within the script which > calls CDLL is too late. > What other methods are possible rather than put an explicit > export LD_LIBRARY_PATH=... > before running the script, if I don't want to put the dynamic > library into a standard system library. I guess you can also use an absolute pathname (but the dlopen(3) manpage should tell you more. I'm not too familiar with linux). Thomas -- http://mail.python.org/mailman/listinfo/python-list
Re: HTML parsing confusion
> Pardon me, but the standard issue Python 2.n (for n in range(5, 2, > -1)) doesn't have an xml.dom.ext ... you must have the mega-monstrous > 200-modules PyXML package installed. And you don't want the 75Kb > BeautifulSoup? I wasn't aware that I had PyXML installed, and can't find a reference to having it installed in pydocs. And that highlights the problem I have at the moment with using other modules. I move from computer to computer regularly, and while all have a recent copy of Python, each has different (or no) extra modules, and I don't always have the luxury of downloading extras. That being said, if there's a simple way of doing it with BeautifulSoup, please show me an example. Maybe I can figure out a way to carry the extra modules I need around with me. -- http://mail.python.org/mailman/listinfo/python-list
Re: isgenerator(...) - anywhere to be found?
on 22.01.2008 14:20 Diez B. Roggisch said the following:
>
> def isgenerator(v):
>     def _g(): yield
>     return type(v) == type(_g())
>
> But I wonder why there is no such method already available?

This tests for generator objects, and you could also use::

    return type(v) is types.GeneratorType

I think that this is pretty direct already.

I also need to test for generator functions from time to time for which I use::

    def _isaGeneratorFunction(func):
        '''Check the bitmask of `func` for the magic generator flag.'''
        return bool(func.func_code.co_flags & CO_GENERATOR)

cheers,
stefan
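For readers on a newer interpreter, the two checks discussed in this message can be sketched roughly as below. This is only a sketch and assumes Python 3, where `func_code` is spelled `__code__` and the flag is exposed as `inspect.CO_GENERATOR`; the helper names and the `countdown` example are made up for illustration. The standard library later gained `inspect.isgenerator` and `inspect.isgeneratorfunction`, which cover the same ground.

```python
import inspect
import types


def is_generator_object(obj):
    """True for the object you get by *calling* a generator function."""
    return isinstance(obj, types.GeneratorType)


def is_generator_function(func):
    """True for a function whose code object carries the generator flag."""
    code = getattr(func, "__code__", None)  # builtins such as len have no code object
    return code is not None and bool(code.co_flags & inspect.CO_GENERATOR)


def countdown(n):
    # Illustrative generator function, not taken from the thread.
    while n > 0:
        yield n
        n -= 1
```

Using `getattr` with a default sidesteps the AttributeError for callables that have no code object, at the cost of reporting them as "not a generator function".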
Re: isgenerator(...) - anywhere to be found?
On Tue, 22 Jan 2008 14:20:35 +0100, "Diez B. Roggisch" <[EMAIL PROTECTED]> wrote: >For a simple greenlet/tasklet/microthreading experiment I found myself in >the need to ask the question > >isgenerator(v) > >but didn't find any implementation in the usual suspects - builtins or >inspect. > >I was able to help myself out with a simple (out of my head, hope its > >def isgenerator(v): >def _g(): yield >return type(v) == type(_g()) > >But I wonder why there is no such method already available? > Why do you need a special case for generators? If you just pass the object in question to iter(), instead, then you'll either get back something that you can iterate over, or you'll get an exception for things that aren't iterable. Jean-Paul -- http://mail.python.org/mailman/listinfo/python-list
Re: stdin, stdout, redmon
Hi, This is Windows bug that is described here: http://support.microsoft.com/default.aspx?kbid=321788 This article also contains solution: you need to add registry value: HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\Policies \Explorer InheritConsoleHandles = 1 (REG_DWORD type) Do not forget to launch new console (cmd.exe) after editing registry. Alternatively you can use following command cat file | python script.py instead of cat file | python script.py Regards, Konstantin On Jan 22, 1:02 pm, Rolf van de Krol <[EMAIL PROTECTED]> wrote: > Well, that's at least weird. I did test my code with Python 2.5 on Win > XP, using the command prompt. But testing it with IDLE gives exactly the > same error Bernard has. So apparently STDIN can't be accessed with IDLE. > > Rolf > > John Machin wrote: > > > Excuse me, gentlemen, may I be your referee *before* you resort to > > pistols at dawn? > > > = IDLE = > > IDLE 1.2.1 > > import sys > sys.stdin.readlines > > > Traceback (most recent call last): > > File "", line 1, in > > sys.stdin.readlines > > AttributeError: readlines > > > = Command Prompt = > > C:\junk>python > > Python 2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.1310 32 bit > > (Intel)] on win32 > > Type "help", "copyright", "credits" or "license" for more information. > > import sys > sys.stdin.readlines > > > > > > HTH, > > John -- http://mail.python.org/mailman/listinfo/python-list
Problem with processing XML
Hi.

I'm new to Python and trying to use it to solve a specific problem. I have an XML file in which I need to locate a specific text node and replace the contents with some other text. The text in question is actually about 70k of base64 encoded data.

I wrote some code that works on my Linux box using xml.dom.minidom, but it will not run on the Windows box that I really need it on. Python 2.5.1 on both.

On the Windows machine, it's a clean install of the Python .msi from python.org. The Linux box is Ubuntu 7.10, which has some Python XML packages installed which can't easily be removed (namely python-libxml2 and python-xml).

I have boiled the code down to its simplest form which shows the problem:-

import xml.dom.minidom
import sys

input_file = sys.argv[1];
output_file = sys.argv[2];

doc = xml.dom.minidom.parse(input_file)
file = open(output_file, "w")
doc.writexml(file)

The error is:-

$ python test2.py input2.xml output.xml
Traceback (most recent call last):
  File "test2.py", line 9, in <module>
    doc.writexml(file)
  File "c:\Python25\lib\xml\dom\minidom.py", line 1744, in writexml
    node.writexml(writer, indent, addindent, newl)
  File "c:\Python25\lib\xml\dom\minidom.py", line 814, in writexml
    node.writexml(writer,indent+addindent,addindent,newl)
  File "c:\Python25\lib\xml\dom\minidom.py", line 809, in writexml
    _write_data(writer, attrs[a_name].value)
  File "c:\Python25\lib\xml\dom\minidom.py", line 299, in _write_data
    data = data.replace("&", "&amp;").replace("<", "&lt;")
AttributeError: 'NoneType' object has no attribute 'replace'

As I said, this code runs fine on the Ubuntu box. If I could work out why the code runs on this box, that would help because then I can set up the Windows box the same way.

The input file contains an <xsd:schema> block which is what actually causes the problem. If you remove that node and subnodes, it works fine.
For a while at least, you can view the input file at http://rafb.net/p/5R1JlW12.html Someone suggested that I should try xml.etree.ElementTree, however writing the same type of simple code to import and then write the file mangles the xsd:schema stuff because ElementTree does not understand namespaces. By the way, is pyxml a live project or not? Should it still be used? It's odd that if you go to http://www.python.org/ and click the link "Using python for..." XML, it leads you to http://pyxml.sourceforge.net/topics/ If you then follow the download links to http://sourceforge.net/project/showfiles.php?group_id=6473 you see that the latest file is 2004, and there are no versions for newer pythons. It also says "PyXML is no longer maintained". Shouldn't the link be removed from python.org? Thanks in advance! -- http://mail.python.org/mailman/listinfo/python-list
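For reference, the task described at the top of this post (find one text node and replace its contents) might look roughly like the sketch below with xml.dom.minidom. It is written for Python 3 and works on an in-memory string; the element name `item`, the attribute `name` and the helper `replace_text` are assumptions made up for the example, not taken from the poster's file, and the sketch does not address the AttributeError reported above.

```python
import xml.dom.minidom


def replace_text(xml_text, tag, attr_name, attr_value, new_text):
    """Replace the children of every matching element with one text node."""
    doc = xml.dom.minidom.parseString(xml_text)
    for elem in doc.getElementsByTagName(tag):
        if elem.getAttribute(attr_name) == attr_value:
            # Drop whatever the element currently contains.
            while elem.firstChild is not None:
                elem.removeChild(elem.firstChild)
            elem.appendChild(doc.createTextNode(new_text))
    # Serialise only the root element, without the XML declaration.
    return doc.documentElement.toxml()


SAMPLE = '<doc><item name="image">OLD</item><item name="other">KEEP</item></doc>'
result = replace_text(SAMPLE, "item", "name", "image", "NEW")
```

For a 70k base64 payload the same approach should apply, since the replacement text is just a string handed to `createTextNode`.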
Re: building psycopg2 on windows using mingw, "cannot find -lpq"
> > The compile works, BUT linking fails: > > > 2.5\Release\psycopg\_psycopg.def -Lc:\python25\libs -Lc: > > \python25\PCBuild -Lc:/p > > ostgres/83RC2/lib -lpython25 -lpq -lws2_32 -ladvapi32 -o build > > > -Lc:/postgres/83RC2/lib > > Are you sure using forward slashes in the path works here? Not at all. But that commandline is generated by setup.py, not by me : ( and setup.py extracts the paths from pg_config so: I have no idea how to make it use backslash :( Thanks for the idea, Harald -- http://mail.python.org/mailman/listinfo/python-list
Re: building psycopg2 on windows using mingw, "cannot find -lpq"
> I use psycopg2 all the time on windows. I use the binary installer > instead of source. Works great for me. > > -Tom Me2. Just in 7 out of 200 it does not work with the currently available binary installer, on some startups, so I decided to follow a recommendation out of the psycopg2 list to compile it from trunk :( Harald -- http://mail.python.org/mailman/listinfo/python-list
Re: isgenerator(...) - anywhere to be found?
Jean-Paul Calderone wrote: > On Tue, 22 Jan 2008 14:20:35 +0100, "Diez B. Roggisch" > <[EMAIL PROTECTED]> wrote: >>For a simple greenlet/tasklet/microthreading experiment I found myself in >>the need to ask the question >> >>isgenerator(v) >> >>but didn't find any implementation in the usual suspects - builtins or >>inspect. >> >>I was able to help myself out with a simple (out of my head, hope its >> >>def isgenerator(v): >>def _g(): yield >>return type(v) == type(_g()) >> >>But I wonder why there is no such method already available? >> > > Why do you need a special case for generators? If you just pass the > object in question to iter(), instead, then you'll either get back > something that you can iterate over, or you'll get an exception for > things that aren't iterable. Because - as I said - I'm working on a micro-thread thingy, where the scheduler needs to push returned generators to a stack and execute them. Using send(), which rules out iter() anyway. Diez -- http://mail.python.org/mailman/listinfo/python-list
Re: isgenerator(...) - anywhere to be found?
Stefan Rank wrote: > on 22.01.2008 14:20 Diez B. Roggisch said the following: >> >> def isgenerator(v): >> def _g(): yield >> return type(v) == type(_g()) >> >> But I wonder why there is no such method already available? > > > This tests for generator objects, and you could also use:: > >return type(v) is types.GeneratorType > > I think that this is pretty direct already. Not as nice as it could be, but certainly way less hackish than my approach. Thanks! > I also need to test for generator functions from time to time for which > I use:: > >def _isaGeneratorFunction(func): >'''Check the bitmask of `func` for the magic generator flag.''' >return bool(func.func_code.co_flags & CO_GENERATOR) Not sure if that's not a bit too much on the dark magic side.. but good to know that it exists. Diez -- http://mail.python.org/mailman/listinfo/python-list
Re: pairs from a list
On Jan 22, 1:19 pm, Alan Isaac <[EMAIL PROTECTED]> wrote:
[...]
> PS My understanding is that the behavior
> of the last is implementation dependent
> and not guaranteed.
[...]
> def pairs4(x):
>     xiter = iter(x)
>     for x12 in izip(xiter,xiter):
>         yield x12

According to the docs [1], izip is defined to be equivalent to:

def izip(*iterables):
    iterables = map(iter, iterables)
    while iterables:
        result = [it.next() for it in iterables]
        yield tuple(result)

This guarantees that it.next() will be performed from left to right, so there is no risk that e.g. pairs4([1, 2, 3, 4]) returns [(2, 1), (4, 3)].

Is there anything else that I am overlooking?

[1] http://docs.python.org/lib/itertools-functions.html

--
Arnaud
make exe from application with py2exe
Hello, Is there any idea how can i create (.exe) from application (.exe ) with py2exe? Regards, Vedran -- http://mail.python.org/mailman/listinfo/python-list
Re: stdin, stdout, redmon
Sorry, I meant: Alternatively you can use following command cat file | python script.py instead of cat file | script.py On Jan 22, 1:54 pm, Konstantin Shaposhnikov <[EMAIL PROTECTED]> wrote: > Hi, > > This is Windows bug that is described > here:http://support.microsoft.com/default.aspx?kbid=321788 > > This article also contains solution: you need to add registry value: > > HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\Policies > \Explorer > InheritConsoleHandles = 1 (REG_DWORD type) > > Do not forget to launch new console (cmd.exe) after editing registry. > > Alternatively you can use following command > > cat file | python script.py > > instead of > > cat file | python script.py > > Regards, > Konstantin > > On Jan 22, 1:02 pm, Rolf van de Krol <[EMAIL PROTECTED]> wrote: > > > Well, that's at least weird. I did test my code with Python 2.5 on Win > > XP, using the command prompt. But testing it with IDLE gives exactly the > > same error Bernard has. So apparently STDIN can't be accessed with IDLE. > > > Rolf > > > John Machin wrote: > > > > Excuse me, gentlemen, may I be your referee *before* you resort to > > > pistols at dawn? > > > > = IDLE = > > > IDLE 1.2.1 > > > import sys > > sys.stdin.readlines > > > > Traceback (most recent call last): > > > File "", line 1, in > > > sys.stdin.readlines > > > AttributeError: readlines > > > > = Command Prompt = > > > C:\junk>python > > > Python 2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.1310 32 bit > > > (Intel)] on win32 > > > Type "help", "copyright", "credits" or "license" for more information. > > > import sys > > sys.stdin.readlines > > > > > > > > HTH, > > > John -- http://mail.python.org/mailman/listinfo/python-list
Re: Max Long
On Jan 21, 7:42 pm, "Gabriel Genellina" <[EMAIL PROTECTED]> wrote:
> En Mon, 21 Jan 2008 22:02:34 -0200, [EMAIL PROTECTED]
> <[EMAIL PROTECTED]> escribió:
>
> > On Jan 21, 5:36 pm, Gary Herron <[EMAIL PROTECTED]> wrote:
> >> [EMAIL PROTECTED] wrote:
> >> > How can I figure out the largest long available? I was hoping for
>
> >> There is no explicit (defined) limit. The amount of available address
> >> space forms a practical limit.
>
> > But not the only limitation:
> > [...]
> > Traceback (most recent call last):
> >   File "", line 2, in
> >     a = cf.Type12MH(k,1)
> >   File "C:\Program Files\PyGTK\Python\lib\collatz_functions.py", line
> > 745, in Type12MH
> >     return TWO**(SIX*a - ONE) - ONE
> > ValueError: mpz.pow outrageous exponent
>
> > The power function can't do exponents that have 32 or more bits
> > even if the memory can hold the resulting number.
>
> Isn't it a limitation of the gmpy library, not of the builtin long type?

Well, gmpy for sure. But as for Python's builtin longs, I wouldn't know as I've only got one lifetime.

Python longs

c:\python25\user>long_ago.py
 1          2   0.0310001373291
 2          9   0.0310001373291
 3         74   0.0310001373291
 4        659   0.062362396
 5       5926   0.062362396
 6      53328   0.219000101089
 7     479940   63.562362
 8    4319453   8983.0787
 9

GMPY longs

c:\python25\user>long_ago.py
 1          2   0.0
 2          9   0.016324249
 3         74   0.016324249
 4        659   0.016324249
 5       5926   0.016324249
 6      53328   0.016324249
 7     479940   0.016324249
 8    4319453   0.032648499
 9   38875064   0.1576485
10  349875565   1.3613351

> --
> Gabriel Genellina
Re: make exe from application with py2exe
[EMAIL PROTECTED] wrote: > Is there any idea how can i create (.exe) from application (.exe ) > with py2exe? yes. here [1], here [2] and maybe here [3]. bye. http://catb.org/~esr/faqs/smart-questions.html [1] http://www.google.com [2] http://www.py2exe.org [3] -- http://mail.python.org/mailman/listinfo/python-list
Re: isgenerator(...) - anywhere to be found?
Jean-Paul Calderone wrote:
> On Tue, 22 Jan 2008 15:15:43 +0100, "Diez B. Roggisch"
> <[EMAIL PROTECTED]> wrote:
>>Jean-Paul Calderone wrote:
>>
>>> On Tue, 22 Jan 2008 14:20:35 +0100, "Diez B. Roggisch"
>>> <[EMAIL PROTECTED]> wrote:
>>>>For a simple greenlet/tasklet/microthreading experiment I found myself in
>>>>the need to ask the question
>>>>
>>>> [snip]
>>>
>>> Why do you need a special case for generators? If you just pass the
>>> object in question to iter(), instead, then you'll either get back
>>> something that you can iterate over, or you'll get an exception for
>>> things that aren't iterable.
>>
>>Because - as I said - I'm working on a micro-thread thingy, where the
>>scheduler needs to push returned generators to a stack and execute them.
>>Using send(), which rules out iter() anyway.
>
> Sorry, I still don't understand. Why is a generator different from any
> other iterator?

Because you can use send(value) on it, for example, which you can't do with every other iterator. And you can utilize that to create a little framework of co-routines (or however you like to call it) that will yield values when they want, or generators if they have nested co-routines that the scheduler needs to keep track of and invoke one after another.

I'm currently at work and can't show you the code - I don't claim that my current approach is the shizzle, but so far it serves my purposes - and I need an isgenerator()

Diez
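Since the actual code was not posted, here is only a guess at the general shape of such a scheduler: a minimal sketch, assuming Python 3, in which a yielded generator is pushed on a stack and driven to completion before its parent resumes. The names `run`, `parent` and `child` are hypothetical, and this version always sends None (equivalent to next()); a real scheduler would presumably send results back into the co-routines.

```python
import types


def run(task):
    """Drive a generator-based task, descending into nested generators."""
    stack = [task]
    results = []
    while stack:
        try:
            yielded = stack[-1].send(None)  # resume the innermost co-routine
        except StopIteration:
            stack.pop()                     # finished: go back to its parent
            continue
        if isinstance(yielded, types.GeneratorType):
            stack.append(yielded)           # nested co-routine: run it first
        else:
            results.append(yielded)         # plain value: hand it to the caller
    return results


def child():
    yield "child-1"
    yield "child-2"


def parent():
    yield "parent-1"
    yield child()
    yield "parent-2"
```

This is where a check like isgenerator() earns its keep: the scheduler has to tell a yielded value from a yielded co-routine, and only the latter supports send().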
ANN: pyglet 1.0
The first stable/production version of pyglet has been released.

http://www.pyglet.org

---

pyglet provides an object-oriented programming interface for developing games and other visually-rich applications for Windows, Mac OS X and Linux. Some of the features of pyglet are:

* No external dependencies or installation requirements. For most application and game requirements, pyglet needs nothing else besides Python, simplifying distribution and installation.

* Take advantage of multiple windows and multi-monitor desktops. pyglet allows you to use as many windows as you need, and is fully aware of multi-monitor setups for use with fullscreen games.

* Load images, sound, music and video in almost any format. pyglet can optionally use AVbin to play back audio formats such as MP3, OGG/Vorbis and WMA, and video formats such as DivX, MPEG-2, H.264, WMV and Xvid.

pyglet is provided under the BSD open-source license, allowing you to use it for both commercial and other open-source projects with very little restriction.

Cheers
Alex.
Re: isgenerator(...) - anywhere to be found?
On Tue, 22 Jan 2008 15:15:43 +0100, "Diez B. Roggisch" <[EMAIL PROTECTED]> wrote: >Jean-Paul Calderone wrote: > >> On Tue, 22 Jan 2008 14:20:35 +0100, "Diez B. Roggisch" >> <[EMAIL PROTECTED]> wrote: >>>For a simple greenlet/tasklet/microthreading experiment I found myself in >>>the need to ask the question >>> >>> [snip] >> >> Why do you need a special case for generators? If you just pass the >> object in question to iter(), instead, then you'll either get back >> something that you can iterate over, or you'll get an exception for >> things that aren't iterable. > >Because - as I said - I'm working on a micro-thread thingy, where the >scheduler needs to push returned generators to a stack and execute them. >Using send(), which rules out iter() anyway. Sorry, I still don't understand. Why is a generator different from any other iterator? Jean-Paul -- http://mail.python.org/mailman/listinfo/python-list
Re: HTML parsing confusion
On Jan 22, 7:44 am, Alnilam <[EMAIL PROTECTED]> wrote:
> ...I move from computer to
> computer regularly, and while all have a recent copy of Python, each
> has different (or no) extra modules, and I don't always have the
> luxury of downloading extras. That being said, if there's a simple way
> of doing it with BeautifulSoup, please show me an example. Maybe I can
> figure out a way to carry the extra modules I need around with me.

Pyparsing's footprint is intentionally small - just one pyparsing.py file that you can drop into a directory next to your own script. And the code to extract paragraph 5 of the "Dive Into Python" home page? See annotated code below.

-- Paul

from pyparsing import makeHTMLTags, SkipTo, anyOpenTag, anyCloseTag
import urllib
import textwrap

page = urllib.urlopen("http://diveintopython.org/")
source = page.read()
page.close()

# define a simple paragraph matcher
pStart,pEnd = makeHTMLTags("P")
paragraph = pStart.suppress() + SkipTo(pEnd) + pEnd.suppress()

# get all paragraphs from the input string (or use the
# scanString generator function to stop at the correct
# paragraph instead of reading them all)
paragraphs = paragraph.searchString(source)

# create a transformer that will strip HTML tags
tagStripper = anyOpenTag.suppress() | anyCloseTag.suppress()

# get paragraph[5] and strip the HTML tags
p5TextOnly = tagStripper.transformString(paragraphs[5][0])

# remove extra whitespace
p5TextOnly = " ".join(p5TextOnly.split())

# print out a nicely wrapped string - so few people know
# that textwrap is part of the standard Python distribution,
# but it is very handy
print textwrap.fill(p5TextOnly, 60)
Re: isgenerator(...) - anywhere to be found?
On Jan 22, 7:46 am, Stefan Rank <[EMAIL PROTECTED]> wrote:
> I also need to test for generator functions from time to time for which
> I use::
>
>     def _isaGeneratorFunction(func):
>         '''Check the bitmask of `func` for the magic generator flag.'''
>         return bool(func.func_code.co_flags & CO_GENERATOR)
>
> cheers,
> stefan

Might want to catch AttributeError in this routine - not all func arguments will have a func_code attribute. See below:

class Z(object):
    def __call__(*args):
        for i in range(3):
            yield 1

for i in Z()():
    print i
# prints 1 three times

import types
print type(Z()()) == types.GeneratorType
# prints 'True'

print Z()().func_code
# raises AttributeError, doesn't have a func_code attribute

-- Paul
Don't want child process inheriting open sockets
I'm using subprocess.Popen() to create a child process. The child process is inheriting the parent process' open sockets, but I don't want that. I believe that on Unix systems I could use the FD_CLOEXEC flag, but I'm running Windows. Any suggestions? -- http://mail.python.org/mailman/listinfo/python-list
Re: Problem with processing XML
On Jan 22, 8:11 am, John Carlyle-Clarke <[EMAIL PROTECTED]> wrote:
> Hi.
>
> I'm new to Python and trying to use it to solve a specific problem. I
> have an XML file in which I need to locate a specific text node and
> replace the contents with some other text. The text in question is
> actually about 70k of base64 encoded data.
>

Here is a pyparsing hack for your problem. I normally advise against using literal strings like "" to match XML or HTML tags in a parser, since this doesn't cover variations in case, embedded whitespace, or unforeseen attributes, but your example was too simple to haul in the extra machinery of an expression created by pyparsing's makeXMLTags.

Also, I don't generally recommend pyparsing for working on XML, since there are so many better and faster XML-specific modules available. But if this does the trick for you for your specific base64-removal task, great.

-- Paul

# requires pyparsing 1.4.8 or later
from pyparsing import makeXMLTags, withAttribute, keepOriginalText, SkipTo

xml = """ ... long XML string goes here ... """

# define a filter that will key off of the tag with the
# attribute 'name="PctShow.Image"', and then use suppress to filter the
# body of the following tag
dataTag = makeXMLTags("data")[0]
dataTag.setParseAction(withAttribute(name="PctShow.Image"), keepOriginalText)

filter = dataTag + "" + SkipTo("").suppress() + ""

xmlWithoutBase64Block = filter.transformString(xml)
print xmlWithoutBase64Block
Re: isgenerator(...) - anywhere to be found?
Stefan Rank wrote: > on 22.01.2008 14:20 Diez B. Roggisch said the following: >> def isgenerator(v): >> def _g(): yield >> return type(v) == type(_g()) >> >> But I wonder why there is no such method already available? > > > This tests for generator objects, and you could also use:: > >return type(v) is types.GeneratorType > > I think that this is pretty direct already. > > I also need to test for generator functions from time to time for which > I use:: > >def _isaGeneratorFunction(func): >'''Check the bitmask of `func` for the magic generator flag.''' >return bool(func.func_code.co_flags & CO_GENERATOR) Can you please write a function for the inspect module + docs + a small unit tests and submit a patch? The inspect module is missing the isgenerator function. Christian -- http://mail.python.org/mailman/listinfo/python-list
Re: isgenerator(...) - anywhere to be found?
On Tue, 22 Jan 2008 15:52:02 +0100, "Diez B. Roggisch" <[EMAIL PROTECTED]> wrote: >Jean-Paul Calderone wrote: > > [snip] >> >> Sorry, I still don't understand. Why is a generator different from any >> other iterator? > >Because you can use send(value) on it for example. Which you can't with >every other iterator. And that you can utizilize to create a little >framework of co-routines or however you like to call it that will yield >values when they want, or generators if they have nested co-routines the >scheduler needs to keep track of and invoke after another. Ah. Thanks for clarifying. Jean-Paul -- http://mail.python.org/mailman/listinfo/python-list
Re: pairs from a list
Alan Isaac>What is the fastest way? (Ignore the import time.)< Maybe someday someone will realize such stuff belongs to the python STD lib... If you need a lazy generator without padding, that splits starting from the start, then this is the faster to me if n is close to 2: def xpartition(seq, n=2): return izip( *(iter(seq),)*n ) If you need the faster greedy version without padding then there are two answers, one for Psyco and one for Python without... :-) If you need padding or to start from the end then there are more answers... Bye, bearophile -- http://mail.python.org/mailman/listinfo/python-list
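The one-liner above relies on passing the *same* iterator n times, so each output tuple consumes n consecutive items. A rough Python 3 rendering, offered as a sketch (the built-in zip is lazy there, so izip is not needed; the name xpartition is kept from the post):

```python
def xpartition(seq, n=2):
    """Lazily yield non-overlapping n-tuples; a short trailing group is dropped."""
    it = iter(seq)
    # [it] * n is a list of n references to one iterator, not n copies.
    return zip(*[it] * n)


pairs = list(xpartition([1, 2, 3, 4, 5]))
triples = list(xpartition("abcdef", 3))
```

With five items and n=2 the last item has no partner and is silently discarded, which is the "without padding" behaviour mentioned in the post.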
Re: isgenerator(...) - anywhere to be found?
on 22.01.2008 16:09 Paul McGuire said the following:
> On Jan 22, 7:46 am, Stefan Rank <[EMAIL PROTECTED]> wrote:
>> I also need to test for generator functions from time to time for which
>> I use::
>>
>>     def _isaGeneratorFunction(func):
>>         '''Check the bitmask of `func` for the magic generator flag.'''
>>         return bool(func.func_code.co_flags & CO_GENERATOR)
>>
>> cheers,
>> stefan
>
> Might want to catch AttributeError in this routine - not all func
> arguments will have a func_code attribute. See below:
>
> class Z(object):
>     def __call__(*args):
>         for i in range(3):
>             yield 1
>
> for i in Z()():
>     print i
> # prints 1 three times
>
> import types
> print type(Z()()) == types.GeneratorType
> # prints 'True'
>
> print Z()().func_code
> # raises AttributeError, doesn't have a func_code attribute

You are right about that for generator *objects*. But _isaGeneratorFunction tests for generator *functions* (the ones you call in order to get a generator object) and those must have a func_code.

So in your example::

    >>> from compiler.consts import CO_GENERATOR
    >>> Z().__call__.func_code.co_flags & CO_GENERATOR
    32
    >>> Z.__call__.func_code.co_flags & CO_GENERATOR
    32

You have to use __call__ directly, you can't use the code-object-flag test on the callable class instance Z(), but I think that's just as well since this kind of test should not be necessary at all, except in rare code parts (such as Diez' microthreading experiments).

cheers,
stefan
Re: Curses and Threading
> In fact you have *two* threads: the main thread, and the one you create > explicitly. > After you start the clock thread, the main thread continues executing, > immediately entering the finally clause. > If you want to wait for the other thread to finish, use the join() method. > But I'm unsure if this is the right way to mix threads and curses. This is what the python documentation says: join([timeout]) Wait until the thread terminates. This blocks the calling thread until the thread whose join() method is called terminates. So according to this since I need to block the main thread until the clock thread ends I would need the main thread to call "cadtime().join()", correct? I'm not sure how to do this because I don't have a class or anything for the main thread that I know of. I tried putting that after cadtime().start() but that doesn't work. I guess what I'm trying to say is how can I tell the main thread what to do when it doesn't exist in my code? Thanks for the help -Brett -- http://mail.python.org/mailman/listinfo/python-list
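One likely reason `cadtime().join()` does not help: each `cadtime()` call builds a brand-new thread object, so the join is applied to a thread that was never started rather than to the one that is running. A minimal sketch of the usual pattern, assuming Python 3 and using a made-up `Clock` class as a stand-in for `cadtime` (the curses part is left out entirely):

```python
import threading
import time


class Clock(threading.Thread):
    """Hypothetical stand-in for the poster's cadtime thread class."""

    def __init__(self, ticks):
        threading.Thread.__init__(self)
        self.ticks = ticks
        self.seen = []

    def run(self):
        for i in range(self.ticks):
            self.seen.append(i)   # a real clock would redraw the screen here
            time.sleep(0.01)


def main():
    clock = Clock(3)   # keep a reference to the thread object
    clock.start()
    clock.join()       # the main thread blocks here until run() returns
    return clock.seen
```

The main thread needs no class of its own: it is simply whatever code runs at module level (or in `main()` here), and it is that code which calls `join()` on the saved reference.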
Re: pairs from a list
Arnaud Delobelle wrote: > According to the docs [1], izip is defined to be equivalent to: > > def izip(*iterables): > iterables = map(iter, iterables) > while iterables: > result = [it.next() for it in iterables] > yield tuple(result) > > This guarantees that it.next() will be performed from left to right, > so there is no risk that e.g. pairs4([1, 2, 3, 4]) returns [(2, 1), > (4, 3)]. > > Is there anything else that I am overlooking? > > [1] http://docs.python.org/lib/itertools-functions.html http://bugs.python.org/issue1121416> fwiw, Alan Isaac -- http://mail.python.org/mailman/listinfo/python-list
Re: pairs from a list
On Jan 22, 1:19 pm, Alan Isaac <[EMAIL PROTECTED]> wrote:
> I suppose my question should have been,
> is there an obviously faster way?
> Anyway, of the four ways below, the
> first is substantially fastest. Is
> there an obvious reason why?

Can you post your results? I get different ones (pairs1 and pairs2 rewritten slightly to avoid unnecessary indirection).

== pairs.py ===
from itertools import *

def pairs1(x):
    return izip(islice(x,0,None,2),islice(x,1,None,2))

def pairs2(x):
    xiter = iter(x)
    while True:
        yield xiter.next(), xiter.next()

def pairs3(x):
    for i in range( len(x)//2 ):
        yield x[2*i], x[2*i+1],

def pairs4(x):
    xiter = iter(x)
    return izip(xiter,xiter)

def compare():
    import timeit
    for i in '1234':
        t = timeit.Timer('list(pairs.pairs%s(l))' % i,
                         'import pairs; l=range(1000)')
        print 'pairs%s: %s' % (i, t.timeit(1))

if __name__ == '__main__':
    compare()
=

marigold:python arno$ python pairs.py
pairs1: 0.789824962616
pairs2: 4.08462786674
pairs3: 2.90438890457
pairs4: 0.536775827408

pairs4 wins.

--
Arnaud
Re: Problem with processing XML
On Jan 22, 9:11 am, John Carlyle-Clarke <[EMAIL PROTECTED]> wrote:
> By the way, is pyxml a live project or not? Should it still be used?
> It's odd that if you go to http://www.python.org/ and click the link
> "Using python for..." XML, it leads you to http://pyxml.sourceforge.net/topics/
>
> If you then follow the download links to
> http://sourceforge.net/project/showfiles.php?group_id=6473 you see that
> the latest file is 2004, and there are no versions for newer pythons.
> It also says "PyXML is no longer maintained". Shouldn't the link be
> removed from python.org?

I was wondering that myself. Any answer yet?
Re: HTML parsing confusion
On Jan 22, 8:44 am, Alnilam <[EMAIL PROTECTED]> wrote:
> > Pardon me, but the standard issue Python 2.n (for n in range(5, 2,
> > -1)) doesn't have an xml.dom.ext ... you must have the mega-monstrous
> > 200-modules PyXML package installed. And you don't want the 75Kb
> > BeautifulSoup?
>
> I wasn't aware that I had PyXML installed, and can't find a reference
> to having it installed in pydocs. ...

Ugh. Found it. Sorry about that, but I still don't understand why there isn't a simple way to do this without using PyXML, BeautifulSoup or libxml2dom. What's the point in having sgmllib, htmllib, HTMLParser, and formatter all built in if I have to use someone else's modules to write a couple of lines of code that achieve the simple thing I want? I get the feeling that this would be easier if I just broke down and wrote a couple of regular expressions, but it hardly seems a 'pythonic' way of going about things.

# get the source (assuming you don't have it locally and have an internet connection)
>>> import urllib
>>> page = urllib.urlopen("http://diveintopython.org/")
>>> source = page.read()
>>> page.close()

# set up some regex to find tags, strip them out, and correct some formatting oddities
>>> import re
>>> p = re.compile(r'<p>(.*?)</p>',re.DOTALL)
>>> tag_strip = re.compile(r'>(.*?)<',re.DOTALL)
>>> fix_format = re.compile(r'\n +',re.MULTILINE)

# achieve clean results.
>>> paragraphs = re.findall(p,source)
>>> text_list = re.findall(tag_strip,paragraphs[5])
>>> text = "".join(text_list)
>>> clean_text = re.sub(fix_format," ",text)

This works, and is small and easily reproduced, but seems like it would break easily and seems a waste of other *ML specific parsers.
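For comparison, the same paragraph extraction can be done with only the standard library's HTML parser. The following is a sketch rather than a drop-in answer: it assumes Python 3 (where the HTMLParser module lives at `html.parser`), the class and function names are invented for the example, and it runs on an inline sample string instead of fetching the page.

```python
from html.parser import HTMLParser


class ParagraphCollector(HTMLParser):
    """Collect the text of each <p> element, with inline tags dropped."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._chunks = None  # list of text pieces while inside a <p>, else None

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._chunks = []

    def handle_endtag(self, tag):
        if tag == "p" and self._chunks is not None:
            text = "".join(self._chunks)
            # Collapse runs of whitespace, as the regex version does.
            self.paragraphs.append(" ".join(text.split()))
            self._chunks = None

    def handle_data(self, data):
        if self._chunks is not None:
            self._chunks.append(data)


def paragraphs_of(html):
    collector = ParagraphCollector()
    collector.feed(html)
    collector.close()
    return collector.paragraphs


SAMPLE = ("<html><body><p>First <b>bold</b> one.</p>\n"
          "<p>Second\n   paragraph.</p></body></html>")
```

One limitation worth stating: a paragraph whose closing tag is omitted (legal in HTML) is dropped by this sketch, because text is only saved when the end tag is seen.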
Re: isgenerator(...) - anywhere to be found?
On Jan 22, 6:20 am, "Diez B. Roggisch" <[EMAIL PROTECTED]> wrote: > For a simple greenlet/tasklet/microthreading experiment I found myself in > the need to ask the question > > isgenerator(v) > > but didn't find any implementation in the usual suspects - builtins or > inspect. types.GeneratorType exists in newer Pythons, but I'd suggest just checking for a send method. ;) That way, you can use something that emulates the interface without being forced to use a generator. hasattr(ob, 'send').. -- http://mail.python.org/mailman/listinfo/python-list
Re: pairs from a list
On Jan 22, 4:10 pm, Alan Isaac <[EMAIL PROTECTED]> wrote: > http://bugs.python.org/issue1121416> > > fwiw, > Alan Isaac Thanks. So I guess I shouldn't take the code snippet I quoted as a specification of izip but rather as an illustration. -- Arnaud -- http://mail.python.org/mailman/listinfo/python-list
Re: stdin, stdout, redmon
On 1/21/2008 9:02 AM, Bernard Desnoues wrote: > Hi, > > I've got a problem with the use of Redmon (redirection port monitor). I > intend to develop a virtual printer so that I can modify data sent to > the printer.
FWIW: there is a nice update to RedMon (v1.7) called RedMon EE (v1.81), available at http://www.is-foehr.com/, that I have used and like a lot. From the developer's website:
Fixed issues and features [with respect to the original RedMon]
* On Windows Terminal Server or Windows XP with fast user switching, the "Prompt for filename" dialog will appear on the current session.
* "SaveAs" now shows XP style dialogs if running under XP
* Support for PDF Security added - experimental -.
* Support for setting the task priority - experimental -
* Use of file-shares as output
* Environment variables are passed to the AfterWorks Process now.
* Environment variables are replaced in the program arguments. No workaround is needed.
* RedMon EE comes with an RPC communication feature which could transfer output-files back to the client starting the print job on a print server. Error messages will be send to the client.
* Redmon EE may start a process after the print job has finished (After works process). e.g. starting a presentation program to show the pdf generated by GhostScript.
* additional debug messages may be written for error analysis. No special debug version is needed.
* user interface has been rewritten. May be it's more friendly. Added some basic system information which may help if running in failures.
* new feature: running on a print server.
* cleanup of documentnames "Microsoft -"
* define templates for output-file names with full environment variable substitution e.g. %homedrive%\%homedir%\%redmon-user%-%date%-%time%-%n.pdf
* RedMon EE does not support for NT 3.5 and Windows 95/98 !
-Thynnus -- http://mail.python.org/mailman/listinfo/python-list
Re: Problem with processing XML
On 22 Jan, 15:11, John Carlyle-Clarke <[EMAIL PROTECTED]> wrote: > > I wrote some code that works on my Linux box using xml.dom.minidom, but > it will not run on the windows box that I really need it on. Python > 2.5.1 on both. > > On the windows machine, it's a clean install of the Python .msi from > python.org. The linux box is Ubuntu 7.10, which has some Python XML > packages installed which can't easily be removed (namely python-libxml2 > and python-xml). I don't think you're straying into libxml2 or PyXML territory here... > I have boiled the code down to its simplest form which shows the problem:- > > import xml.dom.minidom > import sys > > input_file = sys.argv[1]; > output_file = sys.argv[2]; > > doc = xml.dom.minidom.parse(input_file) > file = open(output_file, "w") On Windows, shouldn't this be the following...? file = open(output_file, "wb") > doc.writexml(file) > > The error is:- > > $ python test2.py input2.xml output.xml > Traceback (most recent call last): >File "test2.py", line 9, in <module> > doc.writexml(file) >File "c:\Python25\lib\xml\dom\minidom.py", line 1744, in writexml > node.writexml(writer, indent, addindent, newl) >File "c:\Python25\lib\xml\dom\minidom.py", line 814, in writexml > node.writexml(writer,indent+addindent,addindent,newl) >File "c:\Python25\lib\xml\dom\minidom.py", line 809, in writexml > _write_data(writer, attrs[a_name].value) >File "c:\Python25\lib\xml\dom\minidom.py", line 299, in _write_data > data = data.replace("&", "&amp;").replace("<", "&lt;") > AttributeError: 'NoneType' object has no attribute 'replace' > > As I said, this code runs fine on the Ubuntu box. If I could work out > why the code runs on this box, that would help because then I can set > up the windows box the same way. If I encountered the same issue, I'd have to inspect the goings-on inside minidom, possibly using judicious trace statements in the minidom.py file.
Either way, the above looks like an attribute node produces a value of None rather than any kind of character string. > The input file contains an block which is what actually > causes the problem. If you remove that node and subnodes, it works > fine. For a while at least, you can view the input file at > http://rafb.net/p/5R1JlW12.html The horror! ;-) > Someone suggested that I should try xml.etree.ElementTree, however > writing the same type of simple code to import and then write the file > mangles the xsd:schema stuff because ElementTree does not understand > namespaces. I'll leave this to others: I don't use ElementTree. > By the way, is pyxml a live project or not? Should it still be used? > It's odd that if you go to http://www.python.org/and click the link > "Using python for..." XML, it leads you to > http://pyxml.sourceforge.net/topics/ > > If you then follow the download links to > http://sourceforge.net/project/showfiles.php?group_id=6473 you see that > the latest file is 2004, and there are no versions for newer pythons. > It also says "PyXML is no longer maintained". Shouldn't the link be > removed from python.org? The XML situation in Python's standard library is controversial and can be probably inaccurately summarised by the following chronology: 1. XML is born, various efforts start up (see the qp_xml and xmllib modules). 2. Various people organise themselves, contributing software to the PyXML project (4Suite, xmlproc). 3. The XML backlash begins: we should all apparently be using stuff like YAML (but don't worry if you haven't heard of it). 4. ElementTree is released, people tell you that you shouldn't be using SAX or DOM any more, "pull" parsers are all the rage (although proponents overlook the presence of xml.dom.pulldom in the Python standard library). 5. ElementTree enters the standard library as xml.etree; PyXML falls into apparent disuse (see remarks about SAX and DOM above). 
I think I looked seriously at wrapping libxml2 (with libxml2dom [1]) when I experienced issues with both PyXML and 4Suite when used together with mod_python, since each project used its own Expat libraries and the resulting mis-linked software produced very bizarre results. Moreover, only cDomlette from 4Suite seemed remotely fast, and yet did not seem to be an adequate replacement for the usual PyXML functionality. People will, of course, tell you that you shouldn't use a DOM for anything and that the "consensus" is to use ElementTree or lxml (see above), but I can't help feeling that this has a damaging effect on the XML situation for Python: some newcomers would actually benefit from the traditional APIs, may already be familiar with them from other contexts, and may consider Python lacking if the support for them is in apparent decay. It requires a degree of motivation to actually attempt to maintain software providing such APIs (which was my solution to the problem), but if someone isn't totally bound to Python then they might easily start
Re: HTML parsing confusion
Alnilam wrote: > On Jan 22, 8:44 am, Alnilam <[EMAIL PROTECTED]> wrote: >> > Pardon me, but the standard issue Python 2.n (for n in range(5, 2, >> > -1)) doesn't have an xml.dom.ext ... you must have the mega-monstrous >> > 200-modules PyXML package installed. And you don't want the 75Kb >> > BeautifulSoup? >> >> I wasn't aware that I had PyXML installed, and can't find a reference >> to having it installed in pydocs. ... > > Ugh. Found it. Sorry about that, but I still don't understand why > there isn't a simple way to do this without using PyXML, BeautifulSoup > or libxml2dom. What's the point in having sgmllib, htmllib, > HTMLParser, and formatter all built in if I have to use someone > else's modules to write a couple of lines of code that achieve the > simple thing I want. I get the feeling that this would be easier if I > just broke down and wrote a couple of regular expressions, but it > hardly seems a 'pythonic' way of going about things. This is simply a gross misunderstanding of what BeautifulSoup or lxml accomplish. Dealing with malformed HTML whilst trying to make _some_ sense of it is by no means trivial. And just because you can come up with a few lines of code using regexes that work for your current use-case doesn't mean that they serve as a general HTML-fixing routine. Or do you think the rather long history and 75Kb of code for BS are because its creator wasn't aware of regexes? And it also makes no sense stuffing everything remotely useful into the standard lib. This would force development and release cycles to be aligned, resulting in far fewer features and less stability than one could wish for. And to be honest: I fail to see where your problem is. BeautifulSoup is a single Python file. So whatever you carry with you from machine to machine, if it's capable of holding a file of your own code, you can simply put BeautifulSoup beside it - even if it was a floppy disk. Diez -- http://mail.python.org/mailman/listinfo/python-list
Re: stdin, stdout, redmon
On 1/22/2008 8:54 AM, Konstantin Shaposhnikov wrote: > Hi, > > This is Windows bug that is described here: > http://support.microsoft.com/default.aspx?kbid=321788 > > This article also contains solution: you need to add registry value: > > HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\Policies > \Explorer > InheritConsoleHandles = 1 (REG_DWORD type) > > Do not forget to launch new console (cmd.exe) after editing registry. > > Alternatively you can use following command > > cat file | python script.py > > instead of > > cat file | script.py > > Regards, > Konstantin
Nice one, Konstantin! I can confirm that adding the registry key solves the problem on XPsp2:
-After adding InheritConsoleHandles DWORD 1 key-
Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.
D:\temp>type test3.py | test3.py
['import sys\n', '\n', 'print sys.stdin.readlines ()\n']
D:\temp>
The KB article is quite poorly written. It seems to state that the issue was 'solved for win2k with sp4, for XP with sp1', and gives no indication that the key is still needed after the service packs are applied, *even though* it is in fact necessary to the solution.
Questions:
-Any side effects to look out for?
-If the change is relatively benign, should it be part of the install?
-Is this worth a documentation patch? If yes, tell me where and I'll give it a shot.
-Thynnus -- http://mail.python.org/mailman/listinfo/python-list
Processing XML that's embedded in HTML
Hi, I need to parse a fairly complex HTML page that has XML embedded in it. I've done parsing before with the xml.dom.minidom module on just plain XML, but I cannot get it to work with this HTML page. The XML looks like this: Owner 1 07/16/2007 No Doe, John 1905 S 3rd Ave , Hicksville IA 9 Owner 2 07/16/2007 No Doe, Jane 1905 S 3rd Ave , Hicksville IA 9 It appears to be enclosed with <XML id="grdRegistrationInquiryCustomers">. The rest of the document is html, javascript, div tags, etc. I need the information only from the row where the Relationship tag = Owner and the Priority tag = 1. The rest I can ignore. When I tried parsing it with minidom, I got an ExpatError: mismatched tag: line 1, column 357, so I think the HTML is probably malformed. I looked at BeautifulSoup, but it seems to separate its HTML processing from its XML processing. Can someone give me some pointers? I am currently using Python 2.5 on Windows XP. I will be using Internet Explorer 6 since the document will not display correctly in Firefox. Thank you very much! Mike -- http://mail.python.org/mailman/listinfo/python-list
Re: Boa constructor debugging - exec some code at breakpoint?
On Jan 22, 1:23 am, Joel <[EMAIL PROTECTED]> wrote: > Can you please tell me how this can be done.. > are there any other IDEs for the same purpose if Boa can't do it? > > Joel > > On Jan 6, 11:01 am, Joel <[EMAIL PROTECTED]> wrote: > > > Hey there.. > > I'm using boa constructor to debug a python application. For my > > application, I need to insert break points and execute some piece of > > code interactively through shell or some other window when the > > breakpoint has been reached. Unfortunately the shell I think is a > > separate process so whatever variables are set while executing in > > debugger don't appear in the shell when I try to print using print > > statement. > > > Can anyone tell me how can I do this? > > > Really appreciate any support, Thanks > > > Joel > > P.S. Please CC a copy of reply to my email ID if possible. IDLE does breakpoints... you might find the ActiveState distro more to your liking too. It's a little bit more fleshed out as an IDE than IDLE is. Or you could go full blown and use Eclipse with the Python plug-in. Mike -- http://mail.python.org/mailman/listinfo/python-list
Re: Curses and Threading
On 2008-01-22, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote: >> In fact you have *two* threads: the main thread, and the one you create >> explicitly. > >> After you start the clock thread, the main thread continues executing, >> immediately entering the finally clause. >> If you want to wait for the other thread to finish, use the join() method. >> But I'm unsure if this is the right way to mix threads and curses. > > This is what the python documentation says: > > join([timeout]) > Wait until the thread terminates. This blocks the calling thread > until the thread whose join() method is called terminates. > > So according to this since I need to block the main thread until the > clock thread ends I would need the main thread to call > "cadtime().join()", correct? I'm not sure how to do this because I > don't have a class or anything for the main thread that I know of. I > tried putting that after cadtime().start() but that doesn't work. I > guess what I'm trying to say is how can I tell the main thread what to > do when it doesn't exist in my code? > > Thanks for the help > -Brett join() is a method on Thread objects. So you'll need a reference to the Thread you create, then call join() on that. thread = cadtime() thread.start() thread.join() Ian -- http://mail.python.org/mailman/listinfo/python-list
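The keep-a-reference-then-join pattern described above can be sketched as follows. This is a self-contained illustration in Python 3 with no curses involved; the Clock class is a hypothetical stand-in for the poster's cadtime thread, whose real code is not shown in this message.

```python
import threading
import time


class Clock(threading.Thread):
    """Hypothetical stand-in for the cadtime thread: it ticks a few times."""

    def __init__(self, ticks):
        super().__init__()
        self.ticks = ticks
        self.seen = []

    def run(self):
        for n in range(self.ticks):
            self.seen.append(n)
            time.sleep(0.01)


worker = Clock(3)   # keep a reference to the thread object
worker.start()      # run() now executes in the new thread
worker.join()       # the main thread blocks here until run() returns
finished = not worker.is_alive()
print(worker.seen, finished)
```

Calling cadtime().join() directly would create a second, never-started thread object, which is why the reference has to be kept.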
Submitting with PAMIE
Hi I really need help. I've been looking around for an answer forever. I need to submit a form with no name and also the submit button has no name or value. How might I go about doing either of these. Thanks -- http://mail.python.org/mailman/listinfo/python-list
Using utidylib, empty string returned in some cases
Hello, I'm using debian linux, Python 2.4.4, and utidylib (http://utidylib.berlios.de/). I wrote simple functions to get a web page, convert it from windows-1251 to utf8 and then I'd like to clean the html with it. Here are the two pages I use to check my program: http://www.ya.ru/ (in this case everything works ok) http://www.yellow-pages.ru/rus/nd2/qu5/ru15632 (in this case tidy did not return anything, just an empty string)
code:
--
# coding: utf-8
import urllib, urllib2, tidy

def get_page(url):
    user_agent = 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727)'
    headers = { 'User-Agent' : user_agent }
    data= {}
    req = urllib2.Request(url, data, headers)
    responce = urllib2.urlopen(req)
    page = responce.read()
    return page

def convert_1251(page):
    p = page.decode('windows-1251')
    u = p.encode('utf-8')
    return u

def clean_html(page):
    tidy_options = {
        'output_xhtml' : 1,
        'add_xml_decl' : 1,
        'indent' : 1,
        'input-encoding' : 'utf8',
        'output-encoding' : 'utf8',
        'tidy_mark' : 1,
    }
    cleaned_page = tidy.parseString(page, **tidy_options)
    return cleaned_page

test_url = 'http://www.yellow-pages.ru/rus/nd2/qu5/ru15632'
#test_url = 'http://www.ya.ru/'
#f = open('yp.html', 'r')
#p = f.read()
print clean_html(convert_1251(get_page(test_url)))
--
What am I doing wrong? Can anyone help, please? -- http://mail.python.org/mailman/listinfo/python-list
Re: Processing XML that's embedded in HTML
On 22 Jan, 17:57, Mike Driscoll <[EMAIL PROTECTED]> wrote: > > I need to parse a fairly complex HTML page that has XML embedded in > it. I've done parsing before with the xml.dom.minidom module on just > plain XML, but I cannot get it to work with this HTML page. It's HTML day on comp.lang.python today! ;-) > The XML looks like this: > > > > Owner > > 1 > > 07/16/2007 > > No > > Doe, John > > 1905 S 3rd Ave , Hicksville IA 9 > > > > > > Owner > > 2 > > 07/16/2007 > > No > > Doe, Jane > > 1905 S 3rd Ave , Hicksville IA 9 > > > > It appears to be enclosed with <XML id="grdRegistrationInquiryCustomers">
You could probably find the Row elements with the following XPath expression:
//XML/BoundData/Row
More specific would be this:
//XML[@id="grdRegistrationInquiryCustomers"]/BoundData/Row
See below for the relevance of this. You could also try using getElementById on the document, specifying the id attribute's value given above, then descending to find the Row elements. > The rest of the document is html, javascript div tags, etc. I need the > information only from the row where the Relationship tag = Owner and > the Priority tag = 1. The rest I can ignore. When I tried parsing it > with minidom, I get an ExpatError: mismatched tag: line 1, column 357 > so I think the HTML is probably malformed. Or that it isn't well-formed XML, at least. > I looked at BeautifulSoup, but it seems to separate its HTML > processing from its XML processing. Can someone give me some pointers? With libxml2dom [1] I'd do something like this:
import libxml2dom
d = libxml2dom.parse(filename, html=1)
# or: d = parseURI(uri, html=1)
rows = d.xpath("//XML/BoundData/Row")
# or: rows = d.xpath('//XML[@id="grdRegistrationInquiryCustomers"]/BoundData/Row')
Even though the document is interpreted as HTML, you should get a DOM containing the elements as libxml2 interprets them. > I am currently using Python 2.5 on Windows XP.
I will be using > Internet Explorer 6 since the document will not display correctly in > Firefox. That shouldn't be much of a surprise, it must be said: it isn't XHTML, where you might be able to extend the document via XML, so the whole document has to be "proper" HTML. Paul [1] http://www.python.org/pypi/libxml2dom -- http://mail.python.org/mailman/listinfo/python-list
Re: isgenerator(...) - anywhere to be found?
Diez B. Roggisch wrote: > Jean-Paul Calderone wrote: > >> On Tue, 22 Jan 2008 15:15:43 +0100, "Diez B. Roggisch" >> <[EMAIL PROTECTED]> wrote: >>> Jean-Paul Calderone wrote: >>> On Tue, 22 Jan 2008 14:20:35 +0100, "Diez B. Roggisch" <[EMAIL PROTECTED]> wrote: > For a simple greenlet/tasklet/microthreading experiment I found myself > in the need to ask the question > > [snip] Why do you need a special case for generators? If you just pass the object in question to iter(), instead, then you'll either get back something that you can iterate over, or you'll get an exception for things that aren't iterable. >>> Because - as I said - I'm working on a micro-thread thingy, where the >>> scheduler needs to push returned generators to a stack and execute them. >>> Using send(), which rules out iter() anyway. >> Sorry, I still don't understand. Why is a generator different from any >> other iterator? > > Because you can use send(value) on it for example. Which you can't with > every other iterator. And that you can utizilize to create a little > framework of co-routines or however you like to call it that will yield > values when they want, or generators if they have nested co-routines the > scheduler needs to keep track of and invoke after another. So if you need the send() method, why not just check for that:: try: obj.send except AttributeError: # not a generator-like object else: # is a generator-like object Then anyone who wants to make an extended iterator and return it can expect it to work just like a real generator would. STeVe -- http://mail.python.org/mailman/listinfo/python-list
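The duck-typed check suggested above can be sketched like this, assuming Python 3. The helper name looks_like_generator and the countdown generator are made up for illustration. For a strict type check, inspect.isgenerator is shown alongside; it was added to the standard library in Python 2.6, so it was not yet available to the posters here.

```python
import inspect


def looks_like_generator(obj):
    """Duck-typed check: accept anything that offers a callable send()."""
    return callable(getattr(obj, "send", None))


def countdown(n):
    """A small generator used only to exercise the checks."""
    while n:
        yield n
        n -= 1


gen = countdown(3)
plain_iter = iter([1, 2, 3])

# A real generator passes both checks; a plain list iterator passes neither.
print(looks_like_generator(gen), looks_like_generator(plain_iter))
print(inspect.isgenerator(gen), inspect.isgenerator(plain_iter))
```

The duck-typed version also accepts hand-written classes that implement send(), which is the point made in the reply; the inspect version only accepts objects created by a generator function.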
Beginners question about debugging (import)
I'm starting with Python. First with some interactive things, working through the tutorial, then with definitions in a file called sudoku.py. Of course I make lots of mistakes, so I have to include that file time and again. I discovered (the hard way) that the second time you invoke from sudoku.py import * nothing happens. There is reload. But it only seems to work with import sudoku Now I find myself typing ``sudoku.'' all the time: x=sudoku.sudoku() y=sudoku.create_set_of_sets() sudoku.symbols Is there a more convenient way? (This is a howto question, rather difficult to get answered from the documentation.) Groetjes Albert ~ -- -- Albert van der Horst, UTRECHT,THE NETHERLANDS Economic growth -- like all pyramid schemes -- ultimately falters. [EMAIL PROTECTED]&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst -- http://mail.python.org/mailman/listinfo/python-list
Re: pairs from a list
Arnaud Delobelle wrote: > pairs4 wins. Oops. I see a smaller difference, but yes, pairs4 wins. Alan Isaac

import time
from itertools import islice, izip

x = range(51)

def pairs1(x):
    return izip(islice(x,0,None,2),islice(x,1,None,2))

def pairs2(x):
    xiter = iter(x)
    while True:
        yield xiter.next(), xiter.next()

def pairs3(x):
    for i in range( len(x)//2 ):
        yield x[2*i], x[2*i+1],

def pairs4(x):
    xiter = iter(x)
    return izip(xiter,xiter)

t = time.clock()
for x1, x2 in pairs1(x): pass
t1 = time.clock() - t

t = time.clock()
for x1, x2 in pairs2(x): pass
t2 = time.clock() - t

t = time.clock()
for x1, x2 in pairs3(x): pass
t3 = time.clock() - t

t = time.clock()
for x1, x2 in pairs4(x): pass
t4 = time.clock() - t

print t1, t2, t3, t4

Output: 0.317524154606 1.13436847421 1.07100930426 0.262926712753 -- http://mail.python.org/mailman/listinfo/python-list
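For readers on Python 3, the two fastest variants can be sketched as below. This is a translation under stated assumptions, not the code that produced the timings above: izip is gone (the built-in zip is lazy), time.clock has been removed, and timeit is used instead. The function names are made up. The pairs2 variant is left out because, on Python 3.7 and later, a StopIteration escaping from next() inside a generator is turned into a RuntimeError.

```python
import timeit
from itertools import islice


def pairs_islice(seq):
    """Equivalent of pairs1: two strided views zipped together."""
    return zip(islice(seq, 0, None, 2), islice(seq, 1, None, 2))


def pairs_shared_iter(seq):
    """Equivalent of pairs4: zip pulls twice from the same iterator."""
    it = iter(seq)
    return zip(it, it)


data = list(range(10))
print(list(pairs_shared_iter(data)))

# Timings depend on the machine, so no expected numbers are given here.
for func in (pairs_islice, pairs_shared_iter):
    elapsed = timeit.timeit(lambda: list(func(data)), number=1000)
    print(func.__name__, elapsed)
```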
Re: Beginners question about debugging (import)
Albert van der Horst schrieb: > I'm starting with Python. First with some interactive things, > working through the tutorial, > then with definitions in a file called sudoku.py. > Of course I make lots of mistakes, so I have to include that file > time and again. > > I discovered (the hard way) that the second time you invoke > from sudoku.py import * > nothing happens. > > There is reload. But it only seems to work with > import sudoku > > Now I find myself typing ``sudoku.'' all the time: > > x=sudoku.sudoku() > y=sudoku.create_set_of_sets() > sudoku.symbols > > Is there a more convenient way? > > (This is a howto question, rather difficult to get answered > from the documentation.) import sudoku as s However, I find it easier to just create a test.py and run that from the shell, for the exact reason that reload has its caveats and, in the end, more complex testing code isn't really feasible that way anyway. If you need to, drop into the interactive prompt using python -i test.py Diez -- http://mail.python.org/mailman/listinfo/python-list
rpy registry
Howdy, I've been using rpy (1.0.1) and python (2.5.1) on my office computer with great success. When I went to put rpy on my laptop, however, I get an error trying to load rpy. "Unable to determine R version from the registry. Trying another method." followed by a few lines of the usual error message style (ending with "NameError: global name 'RuntimeExecError' is not defined."). I have reinstalled R (now 2.6.1), rpy, and python without any luck (being sure to check the "include in registry" option on the installation of R). Everything else I have used thus far works perfectly. Any thoughts on what might be causing problems? Thanks, -Hans -- http://mail.python.org/mailman/listinfo/python-list
difflib confusion
hello all, I have a bit of a confusing question. Firstly, I wanted a library which can do an svn-like diff of two files. Let's say I have file1 and file2, where file2 contains something which file1 does not have. Now if I do readlines() on both files, I have a list of all the lines. I then want to do a diff and find out which word was added, deleted or changed, and ideally at which character; failing that, I at least want to know the word that changed. Any ideas please? kk -- http://mail.python.org/mailman/listinfo/python-list
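One possible approach, not taken from any reply in this thread, is the standard library's difflib module. The sketch below assumes Python 3 and uses made-up sample lines in place of the readlines() results: unified_diff gives the svn-style line diff, and a SequenceMatcher over the split words of a changed line reports which words differ.

```python
import difflib

# Made-up stand-ins for file1.readlines() and file2.readlines().
old_lines = ["the quick brown fox\n", "jumps over the dog\n"]
new_lines = ["the quick brown fox\n", "jumps over the lazy dog\n"]

# Line-level diff in the unified format that svn diff prints.
unified = list(difflib.unified_diff(old_lines, new_lines, "file1", "file2"))
print("".join(unified), end="")

# Word-level detail for the changed line.
old_words = old_lines[1].split()
new_words = new_lines[1].split()
matcher = difflib.SequenceMatcher(None, old_words, new_words)
changes = [(tag, old_words[i1:i2], new_words[j1:j2])
           for tag, i1, i2, j1, j2 in matcher.get_opcodes()
           if tag != "equal"]
print(changes)
```

Passing the two line strings themselves to SequenceMatcher, instead of the word lists, would give character positions rather than words.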
Re: printing escape character
On Jan 22, 2008 1:38 PM, hrochonwo <[EMAIL PROTECTED]> wrote: > Hi, > > I want to print string without "decoding" escaped characters to > newline etc. > like print "a\nb" -> a\nb > is there a simple way to do it in python or should i somehow use > string.replace(..) function ? >>> print 'a\nb'.encode('string_escape') a\nb -- Jerry -- http://mail.python.org/mailman/listinfo/python-list
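Note that the 'string_escape' codec used above exists only in Python 2. A rough Python 3 equivalent, given here as a sketch rather than a drop-in replacement, goes through the 'unicode_escape' codec or simply uses repr():

```python
text = "a\nb"

# unicode_escape produces bytes with the backslash sequences spelled out;
# decoding as ASCII turns them back into a printable str.
escaped = text.encode("unicode_escape").decode("ascii")
print(escaped)

# repr() gives the same escapes, wrapped in quotes.
quoted = repr(text)
print(quoted)
```

Here escaped is the four-character string a, backslash, n, b, so printing it shows the escape sequence instead of a line break.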
Re: pairs from a list
On Jan 22, 6:34 pm, Paddy <[EMAIL PROTECTED]> wrote: [...] > Hi George, > You need to 'get it right' first. Micro optimizations for speed > without thought of the wider context is a bad habit to form and a time > waster. > If the routine is all that needs to be delivered and it does not > perform at an acceptable speed then find out what is acceptable and > optimise towards that goal. My questions were set to get posters to > think more about the need for speed optimizations and where they > should be applied, (if at all). > > A bit of forethought might justify leaving the routine alone, or > optimising for readability instead. But it's fun! Some-of-us-can't-help-it'ly yours -- Arnaud -- http://mail.python.org/mailman/listinfo/python-list
Re: printing escape character
On Jan 22, 7:58 pm, "Jerry Hill" <[EMAIL PROTECTED]> wrote: > On Jan 22, 2008 1:38 PM, hrochonwo <[EMAIL PROTECTED]> wrote: > > > Hi, > > > I want to print string without "decoding" escaped characters to > > newline etc. > > like print "a\nb" -> a\nb > > is there a simple way to do it in python or should i somehow use > > string.replace(..) function ? > >>> print 'a\nb'.encode('string_escape') > > a\nb > > -- > Jerry thank you, jerry -- http://mail.python.org/mailman/listinfo/python-list
Re: pairs from a list
On Jan 22, 5:34 am, George Sakkis <[EMAIL PROTECTED]> wrote: > On Jan 22, 12:15 am, Paddy <[EMAIL PROTECTED]> wrote: > > > On Jan 22, 3:20 am, Alan Isaac <[EMAIL PROTECTED]> wrote:> I want to > > generate sequential pairs from a list. > > <> > > > What is the fastest way? (Ignore the import time.) > > > 1) How fast is the method you have? > > 2) How much faster does it need to be for your application? > > 3) Are there any other bottlenecks in your application? > > 4) Is this the routine whose smallest % speed-up would give the > > largest overall speed up of your application? > > I believe the "what is the fastest way" question for such small well- > defined tasks is worth asking on its own, regardless of whether it > makes a difference in the application (or even if there is no > application to begin with). Hi George, You need to 'get it right' first. Micro-optimizing for speed without thought of the wider context is a bad habit to form and a time waster. If the routine is all that needs to be delivered and it does not perform at an acceptable speed, then find out what is acceptable and optimise towards that goal. My questions were set to get posters to think more about the need for speed optimizations and where they should be applied (if at all). A bit of forethought might justify leaving the routine alone, or optimising for readability instead. - Paddy. -- http://mail.python.org/mailman/listinfo/python-list
printing escape character
Hi, I want to print string without "decoding" escaped characters to newline etc. like print "a\nb" -> a\nb is there a simple way to do it in python or should i somehow use string.replace(..) function ? thanks for any reply hrocho -- http://mail.python.org/mailman/listinfo/python-list
A global or module-level variable?
This has to be easier than I'm making it. I've got a module, remote.py, which contains a number of classes, all of which open a port for communication. I'd like to have a way to coordinate these port numbers akin to this: So I have this in the __init__.py file for a package called cstore:

nextport=42000

def getNextPort():
    nextport += 1
    return nextport
:

Then, in the class where I wish to use this (in cstore.remote.py):

:
class Spam():
    def __init__(self, **kwargs):
        self._port = cstore.getNextPort()

I can't seem to make this work, though. As given here, I get an "UnboundLocalError: local variable 'nextport' referenced before assignment". When I try prefixing the names inside __init__.py with "cstore.", I get an error that the global name "cstore" is not defined. I've been looking at this long enough that my eyes are blurring. Any ideas? BTW, the driving force here is that I'm going to need to wrap this in some thread synchronization. For now, though, I'm just trying to get the basics working. Thanks! Bret -- http://mail.python.org/mailman/listinfo/python-list
question
I'm still learning Python and was wanting to get some thoughts on this. I apologize if this sounds ridiculous... I'm mainly asking it to gain some knowledge of what works better. The main question I have is if I had a lot of lists to choose from, what's the best way to write the code so I'm not wasting a lot of memory? I've attempted to list a few examples below to hopefully be a little clearer about my question. Lets say I was going to be pulling different data, depending on what the user entered. I was thinking I could create a function which contained various functions inside: def albumInfo(theBand): def Rush(): return ['Rush', 'Fly By Night', 'Caress of Steel', '2112', 'A Farewell to Kings', 'Hemispheres'] def Enchant(): return ['A Blueprint of the World', 'Wounded', 'Time Lost'] ... The only problem with the code above though is that I don't know how to call it, especially since if the user is entering a string, how would I convert that string into a function name? For example, if the user entered 'Rush', how would I call the appropriate function --> albumInfo(Rush()) But if I could somehow make that code work, is it a good way to do it? I'm assuming if the user entered 'Rush' that only the list in the Rush() function would be stored, ignoring the other functions inside the albumInfo() function? I then thought maybe just using a simple if/else statement might work like so: def albumInfo(theBand): if theBand == 'Rush': return ['Rush', 'Fly By Night', 'Caress of Steel', '2112', 'A Farewell to Kings', 'Hemispheres'] elif theBand == 'Enchant': return ['A Blueprint of the World', 'Wounded', 'Time Lost'] ... Does anyone think this would be more efficient? I'm not familiar with how 'classes' work yet (still reading through my 'Core Python' book) but was curious if using a 'class' would be better suited for something like this? 
Since the user could possibly choose from 100 or more choices, I'd like to come up with something that's efficient as well as easy to read in the code. If anyone has time I'd love to hear your thoughts. Thanks. Jay -- http://mail.python.org/mailman/listinfo/python-list
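One common way to map a user-entered string to a list, offered here as a sketch rather than as anything proposed in the message itself, is a dictionary keyed by band name. The names ALBUMS and album_info are made up for illustration, and the code assumes Python 3.

```python
# Every list is built once, when the module is loaded; a lookup returns a
# reference to the existing list and copies nothing.
ALBUMS = {
    "Rush": ["Rush", "Fly By Night", "Caress of Steel", "2112",
             "A Farewell to Kings", "Hemispheres"],
    "Enchant": ["A Blueprint of the World", "Wounded", "Time Lost"],
}


def album_info(band):
    """Return the album list for band, or an empty list if it is unknown."""
    return ALBUMS.get(band, [])


print(album_info("Enchant"))
print(album_info("Unknown Band"))
```

Adding a band then means adding one dictionary entry instead of another elif branch, which tends to stay readable even with a hundred or more choices.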
Re: A global or module-level variable?
Bret <[EMAIL PROTECTED]> writes: > nextport=42000 > > def getNextPort(): > nextport += 1 > return nextport If you have to do it that way, use:

def getNextPort():
    global nextport
    nextport += 1
    return nextport

The global declaration stops the compiler from treating nextport as a local variable, which is what makes the increment fail as a reference to an uninitialized variable. -- http://mail.python.org/mailman/listinfo/python-list
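Since the original question mentions that thread synchronization will be needed later, here is a minimal sketch of the same counter guarded by a lock, assuming Python 3. The names _next_port, _port_lock and get_next_port are made up for illustration; the starting value 42000 comes from the question.

```python
import threading

_next_port = 42000
_port_lock = threading.Lock()


def get_next_port():
    """Return a fresh port number; safe to call from several threads."""
    global _next_port
    with _port_lock:
        # The read-increment-write sequence happens under the lock, so two
        # threads can never be handed the same number.
        _next_port += 1
        return _next_port


ports = []


def grab_ports():
    for _ in range(100):
        ports.append(get_next_port())


threads = [threading.Thread(target=grab_ports) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(ports), len(set(ports)))
```

Without the lock, the increment is not guaranteed to be atomic, so duplicates could in principle be handed out under contention.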
Re: pairs from a list
Arnaud Delobelle wrote: > On Jan 22, 4:10 pm, Alan Isaac <[EMAIL PROTECTED]> wrote: > >> http://bugs.python.org/issue1121416> >> >> fwiw, >> Alan Isaac > > Thanks. So I guess I shouldn't take the code snippet I quoted as a > specification of izip but rather as an illustration. You can be bolder here as the izip() docs explicitly state """ Note, the left-to-right evaluation order of the iterables is guaranteed. This makes possible an idiom for clustering a data series into n-length groups using "izip(*[iter(s)]*n)". """ and the bug report with Raymond Hettinger saying """ Left the evaluation order as an unspecified, implementation specific detail. """ is about zip(), not izip(). Peter -- http://mail.python.org/mailman/listinfo/python-list
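The clustering idiom quoted from the izip() docs can be sketched as follows in Python 3, where the built-in zip plays the role of izip. The helper name groups_of is made up for illustration.

```python
from itertools import zip_longest


def groups_of(seq, n):
    """Cluster seq into n-length tuples; leftover items are dropped."""
    # [iter(seq)] * n is a list holding the SAME iterator n times, so each
    # output tuple is built by pulling n consecutive items from it.
    return zip(*[iter(seq)] * n)


print(list(groups_of("abcdefg", 3)))

# zip_longest keeps the incomplete final group and pads it instead.
padded = list(zip_longest(*[iter("abcdefg")] * 3, fillvalue="-"))
print(padded)
```

The idiom relies on the left-to-right evaluation order mentioned in the quoted documentation: the shared iterator must be advanced for the first tuple slot before the second.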
Re: Problem with processing XML
Paul McGuire wrote: > > Here is a pyparsing hack for your problem. Thanks Paul! This looks like an interesting approach, and once I get my head around the syntax, I'll give it a proper whirl. -- http://mail.python.org/mailman/listinfo/python-list
Re: question
Since you aren't familiar with classes, I will keep this within the scope of functions...

If you have code like this:

    def a():
        def b():
            pass  # body of b

then you can only call function b from within function a.

James

On Jan 22, 2008 8:58 PM, <[EMAIL PROTECTED]> wrote:
> I'm still learning Python and was wanting to get some thoughts on this. I
> apologize if this sounds ridiculous... I'm mainly asking it to gain some
> knowledge of what works better. The main question I have is if I had a lot
> of lists to choose from, what's the best way to write the code so I'm not
> wasting a lot of memory? I've attempted to list a few examples below to
> hopefully be a little clearer about my question.
>
> Lets say I was going to be pulling different data, depending on what the user
> entered. I was thinking I could create a function which contained various
> functions inside:
>
> def albumInfo(theBand):
>     def Rush():
>         return ['Rush', 'Fly By Night', 'Caress of Steel', '2112', 'A Farewell to Kings', 'Hemispheres']
>
>     def Enchant():
>         return ['A Blueprint of the World', 'Wounded', 'Time Lost']
>
>     ...
>
> The only problem with the code above though is that I don't know how to call
> it, especially since if the user is entering a string, how would I convert
> that string into a function name? For example, if the user entered 'Rush',
> how would I call the appropriate function --> albumInfo(Rush())
>
> But if I could somehow make that code work, is it a good way to do it? I'm
> assuming if the user entered 'Rush' that only the list in the Rush() function
> would be stored, ignoring the other functions inside the albumInfo() function?
>
> I then thought maybe just using a simple if/else statement might work like so:
>
> def albumInfo(theBand):
>     if theBand == 'Rush':
>         return ['Rush', 'Fly By Night', 'Caress of Steel', '2112', 'A Farewell to Kings', 'Hemispheres']
>     elif theBand == 'Enchant':
>         return ['A Blueprint of the World', 'Wounded', 'Time Lost']
>     ...
>
> Does anyone think this would be more efficient?
>
> I'm not familiar with how 'classes' work yet (still reading through my 'Core
> Python' book) but was curious if using a 'class' would be better suited for
> something like this? Since the user could possibly choose from 100 or more
> choices, I'd like to come up with something that's efficient as well as easy
> to read in the code. If anyone has time I'd love to hear your thoughts.
>
> Thanks.
>
> Jay
-- http://mail.python.org/mailman/listinfo/python-list
Re: PyGTK, Glade, and ComboBoxEntry.append_text()
Greg Johnston wrote: > Hey all, > > I'm a relative newbie to Python (switched over from Scheme fairly > recently) but I've been using PyGTK and Glade to create an interface, > which is a combo I'm very impressed with. > > There is, however, one thing I've been wondering about. It doesn't > seem possible to modify ComboBoxEntry choice options on the fly--at > least with append_text(), etc--because they were not created with > gtk.combo_box_entry_new_text(). Basically, I'm wondering if there's > any way around this. > > Thank you, > Greg Johnston PyGTK mailing list: http://pygtk.org/feedback.html -- http://mail.python.org/mailman/listinfo/python-list
Re: question
On Jan 22, 7:58 pm, <[EMAIL PROTECTED]> wrote:
> I'm still learning Python and was wanting to get some thoughts on this. I
> apologize if this sounds ridiculous... I'm mainly asking it to gain some
> knowledge of what works better. The main question I have is if I had a lot
> of lists to choose from, what's the best way to write the code so I'm not
> wasting a lot of memory? I've attempted to list a few examples below to
> hopefully be a little clearer about my question.
>
> Lets say I was going to be pulling different data, depending on what the user
> entered. I was thinking I could create a function which contained various
> functions inside:
>
> def albumInfo(theBand):
>     def Rush():
>         return ['Rush', 'Fly By Night', 'Caress of Steel', '2112', 'A Farewell to Kings', 'Hemispheres']
>
>     def Enchant():
>         return ['A Blueprint of the World', 'Wounded', 'Time Lost']
>
>     ...
>
> The only problem with the code above though is that I don't know how to call
> it, especially since if the user is entering a string, how would I convert
> that string into a function name? For example, if the user entered 'Rush',
> how would I call the appropriate function --> albumInfo(Rush())
>
> But if I could somehow make that code work, is it a good way to do it? I'm
> assuming if the user entered 'Rush' that only the list in the Rush() function
> would be stored, ignoring the other functions inside the albumInfo() function?
>
> I then thought maybe just using a simple if/else statement might work like so:
>
> def albumInfo(theBand):
>     if theBand == 'Rush':
>         return ['Rush', 'Fly By Night', 'Caress of Steel', '2112', 'A Farewell to Kings', 'Hemispheres']
>     elif theBand == 'Enchant':
>         return ['A Blueprint of the World', 'Wounded', 'Time Lost']
>     ...
>
> Does anyone think this would be more efficient?
>
> I'm not familiar with how 'classes' work yet (still reading through my 'Core
> Python' book) but was curious if using a 'class' would be better suited for
> something like this? Since the user could possibly choose from 100 or more
> choices, I'd like to come up with something that's efficient as well as easy
> to read in the code. If anyone has time I'd love to hear your thoughts.
>
> Thanks.
>
> Jay

What you want is a dictionary:

    albumInfo = {
        'Rush': ['Rush', 'Fly By Night', 'Caress of Steel', '2112', 'A Farewell to Kings', 'Hemispheres'],
        'Enchant': ['A Blueprint of the World', 'Wounded', 'Time Lost'],
        ...
    }

then to find the info just do:

    >>> albumInfo['Enchant']
    ['A Blueprint of the World', 'Wounded', 'Time Lost']

It also makes it easy to add a new album on the fly:

    >>> albumInfo["Lark's tongue in Aspic"] = [ ... ]

Hope that helps.

-- Arnaud
-- http://mail.python.org/mailman/listinfo/python-list
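As a small follow-up sketch of the dictionary approach (Python 3; the helper name albums_for is an illustrative assumption), dict.get() lets the lookup cope with a band the user typed that is not in the table, instead of raising KeyError:

```python
album_info = {
    'Rush': ['Rush', 'Fly By Night', 'Caress of Steel', '2112',
             'A Farewell to Kings', 'Hemispheres'],
    'Enchant': ['A Blueprint of the World', 'Wounded', 'Time Lost'],
}


def albums_for(band):
    """Return the album list for `band`, or an empty list if it is unknown."""
    return album_info.get(band, [])
```

The dictionary is built once when the module loads, and each lookup is a single hash lookup no matter how many bands are stored.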
Re: pairs from a list
[Peter Otten] > You can be bolder here as the izip() docs explicitly state > > """ > Note, the left-to-right evaluation order of the iterables is > guaranteed. This makes possible an idiom for clustering a data series into > n-length groups using "izip(*[iter(s)]*n)". > """ . . . > is about zip(), not izip(). FWIW, I just added a similar guarantee for zip(). Raymond -- http://mail.python.org/mailman/listinfo/python-list
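For readers who have not met the clustering idiom quoted above, here is a minimal sketch. It is written for Python 3, where the built-in zip() is lazy like izip(); the function name grouped is an assumption made for illustration.

```python
def grouped(series, n):
    """Cluster `series` into n-length tuples; leftover items are dropped."""
    iterator = iter(series)
    # The same iterator object appears n times in the argument list, so each
    # output tuple is filled by n successive next() calls, left to right.
    return list(zip(*[iterator] * n))


pairs = grouped([1, 2, 3, 4, 5, 6], 2)
```

The idiom only works because the arguments are consumed in left-to-right order, which is exactly the guarantee being discussed in this thread.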
Re: question
> def albumInfo(theBand):
>     def Rush():
>         return ['Rush', 'Fly By Night', 'Caress of Steel', '2112', 'A Farewell to Kings', 'Hemispheres']
>
>     def Enchant():
>         return ['A Blueprint of the World', 'Wounded', 'Time Lost']
>
> The only problem with the code above though is that I
> don't know how to call it, especially since if the user is
> entering a string, how would I convert that string into a
> function name? For example, if the user entered 'Rush',
> how would I call the appropriate function -->
> albumInfo(Rush())

It looks like you're reaching for a dictionary idiom:

    album_info = {
        'Rush': [
            'Rush',
            'Fly By Night',
            'Caress of Steel',
            '2112',
            'A Farewell to Kings',
            'Hemispheres',
            ],
        'Enchant': [
            'A Blueprint of the World',
            'Wounded',
            'Time Lost',
            ],
        }

You can then reference the bits:

    who = "Rush"  # get this from the user?
    print "Albums by %s" % who
    for album_name in album_info[who]:
        print ' *', album_name

This is much more flexible when it comes to adding groups and albums because you can load the contents of album_info dynamically from your favorite source (a file, DB, or teh intarweb) rather than editing & restarting your app every time.

-tkc

PS: to answer your original question, you can look a function up by name with the getattr() function, provided the functions are attributes of some object such as a module or a class (functions nested inside albumInfo() are not attributes of albumInfo). With a hypothetical module some_module holding them:

    results = getattr(some_module, who)()

but that's an ugly solution for the example you gave.
-- http://mail.python.org/mailman/listinfo/python-list
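A sketch of turning a user-supplied string into a function call, for Python 3. getattr() only finds names that really are attributes of the object it is given, so the functions here live at module level and, for the getattr() variant, on a small class. The names HANDLERS, Bands and both lookup helpers are hypothetical, introduced only for illustration.

```python
def rush():
    return ['Rush', 'Fly By Night', 'Caress of Steel']


def enchant():
    return ['A Blueprint of the World', 'Wounded', 'Time Lost']


# Explicit mapping from user-facing names to functions.
HANDLERS = {'Rush': rush, 'Enchant': enchant}


def album_info_by_name(who):
    """Call the function registered under `who`; raises KeyError if unknown."""
    return HANDLERS[who]()


class Bands:
    """Hypothetical namespace so that getattr() has something to search."""

    @staticmethod
    def Rush():
        return rush()


def album_info_by_getattr(who):
    """Look the function up as an attribute of the Bands class."""
    return getattr(Bands, who)()
```

The explicit dictionary is usually the safer choice, since it restricts callers to the names you registered rather than to any attribute that happens to exist.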
Re: bags? 2.5.x?
On Jan 21, 11:13 pm, Dan Stromberg <[EMAIL PROTECTED]> wrote:
> On Thu, 17 Jan 2008 18:18:53 -0800, Raymond Hettinger wrote:
> >> >> I keep wanting something like them - especially bags with something
> >> >> akin to set union, intersection and difference.
>
> >> > How about this recipe
> >> > http://www.ubookcase.com/book/Oreilly/
>
> >> The author of the bag class said that he was planning to submit bags
> >> for inclusion in 2.5 - is there a particular reason why they didn't go
> >> in?
>
> > Three reasons:
>
> > 1. b=collections.defaultdict(int) went a long way towards meeting the
> > need to for a fast counter.
>
> > 2. It's still not clear what the best API would be. What should list(b)
> > return for b.dict = {'a':3, 'b':0, 'c':-3}? Perhaps, [('a', 3), ('b',
> > 0), ('c', -3)] or ['a', 'a', 'a']
> > or ['a']
> > or ['a', 'b', 'c']
> > or raise an Error for the negative entry.
>
> I'd suggest that .keys() return the unique list, and that list() return
> the list of tuples. Then people can use list comprehensions or map() to
> get what they really need.

I think that a bag is a cross between a dict (but the values are always positive integers) and a set (but duplicates are permitted).

I agree that .keys() should return the unique list, but that .items() should return the tuples and list() should return the list of keys including duplicates. bag() should accept an iterable and count the duplicates.

For example:

    >>> sentence = "the cat sat on the mat"
    >>> my_words = sentence.split()
    >>> print my_words
    ['the', 'cat', 'sat', 'on', 'the', 'mat']
    >>> my_bag = bag(my_words)
    >>> print my_bag
    bag({'on': 1, 'the': 2, 'sat': 1, 'mat': 1, 'cat': 1})
    >>> my_list = list(my_bag)
    >>> print my_list
    ['on', 'the', 'the', 'sat', 'mat', 'cat']

It should be easy to convert a bag to a dict and also a dict to a bag, raising ValueError if it sees a value that's not a non-negative integer (a value of zero just means "there isn't one of these in the bag"!).
> > It might not be a bad thing to have an optional parameter on __init__
> that would allow the user to specify if they need negative counts or not;
> so far, I've not needed them though.
>
> > 3. I'm still working on it and am not done yet.
>
> Decent reasons. :)
>
> Thanks!
>
> Here's a diff to bag.py that helped me. I'd like to think these meanings
> are common, but not necessarily!
>
> $ diff -b /tmp/bag.py.original /usr/local/lib/bag.py
> 18a19,58
> >     def _difference(lst):
> >         left = lst[0]
> >         right = lst[1]
> >         return max(left - right, 0)
> >     _difference = staticmethod(_difference)
> >
> >     def _relationship(self, other, operator):
> >         if isinstance(other, bag):
> >             self_keys = set(self._data.keys())
> >             other_keys = set(other._data.keys())
> >             union_keys = self_keys | other_keys
> >             #print 'union_keys is',union_keys
> >             result = bag()
> >             for element in list(union_keys):
> >                 temp = operator([ self[element], other[element] ])
> >                 #print 'operator is', operator
> >                 #print 'temp is', temp
> >                 result[element] += temp
> >             return result
> >         else:
> >             raise NotImplementedError
> >
> >     def union(self, other):
> >         return self._relationship(other, sum)
> >
> >     __or__ = union
> >
> >     def intersection(self, other):
> >         return self._relationship(other, min)
> >
> >     __and__ = intersection
> >
> >     def maxunion(self, other):
> >         return self._relationship(other, max)
> >
> >     def difference(self, other):
> >         return self._relationship(other, self._difference)
> >
> >     __sub__ = difference
-- http://mail.python.org/mailman/listinfo/python-list
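For comparison, later Python versions (2.7 and 3.1 onward) ship collections.Counter, which covers much of the behaviour discussed in this thread. The sketch below is not the bag.py from the thread; it only shows how the operations named in the diff map onto Counter, using Python 3 syntax and made-up sample words.

```python
from collections import Counter

words = "the cat sat on the mat".split()
bag = Counter(words)                 # counts duplicates in any iterable

unique_keys = sorted(bag)            # each distinct word once
expanded = sorted(bag.elements())    # words repeated according to their counts

other = Counter(["the", "dog"])
summed = bag + other                 # counts added (the diff's `union`)
smallest = bag & other               # minimum of counts (`intersection`)
largest = bag | other                # maximum of counts (`maxunion`)
remaining = bag - other              # subtraction; non-positive counts dropped
```

Note that Counter answers the list(b) question by iterating over unique keys, with elements() available when the duplicates are wanted.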
Re: Processing XML that's embedded in HTML
On Jan 22, 11:32 am, Paul Boddie <[EMAIL PROTECTED]> wrote: > > The rest of the document is html, javascript div tags, etc. I need the > > information only from the row where the Relationship tag = Owner and > > the Priority tag = 1. The rest I can ignore. When I tried parsing it > > with minidom, I get an ExpatError: mismatched tag: line 1, column 357 > > so I think the HTML is probably malformed. > > Or that it isn't well-formed XML, at least. I probably should have posted that I got the error on the first line of the file, which is why I think it's the HTML. But I wouldn't be surprised if it was the XML that's behaving badly. > > > I looked at BeautifulSoup, but it seems to separate its HTML > > processing from its XML processing. Can someone give me some pointers? > > With libxml2dom [1] I'd do something like this: > > import libxml2dom > d = libxml2dom.parse(filename, html=1) > # or: d = parseURI(uri, html=1) > rows = d.xpath("//XML/BoundData/Row") > # or: rows = d.xpath("//[EMAIL PROTECTED]"grdRegistrationInquiryCustomers"]/ > BoundData/Row") > > Even though the document is interpreted as HTML, you should get a DOM > containing the elements as libxml2 interprets them. > > > I am currently using Python 2.5 on Windows XP. I will be using > > Internet Explorer 6 since the document will not display correctly in > > Firefox. > > That shouldn't be much of a surprise, it must be said: it isn't XHTML, > where you might be able to extend the document via XML, so the whole > document has to be "proper" HTML. > > Paul > > [1]http://www.python.org/pypi/libxml2dom I must have tried this module quite a while ago since I already have it installed. I see you're the author of the module, so you can probably tell me what's what. When I do the above, I get an empty list either way. 
See my code below:

    import libxml2dom
    d = libxml2dom.parse(filename, html=1)
    rows = d.xpath('//[EMAIL PROTECTED]"grdRegistrationInquiryCustomers"]/BoundData/Row')
    # rows = d.xpath("//XML/BoundData/Row")
    print rows

I'm not sure what is wrong here... but I got lxml to create a tree by doing the following:

    from lxml import etree
    from StringIO import StringIO

    parser = etree.HTMLParser()
    tree = etree.parse(filename, parser)
    xml_string = etree.tostring(tree)
    context = etree.iterparse(StringIO(xml_string))

However, when I iterate over the contents of "context", I can't figure out how to nab the row's contents:

    for action, elem in context:
        if action == 'end' and elem.tag == 'relationship':
            # do something...but what!?
            # this if statement probably isn't even right
            pass

Thanks for the quick response, though! Any other ideas?

Mike
-- http://mail.python.org/mailman/listinfo/python-list
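One way to pick out the wanted rows once the XML island has been isolated, sketched with the standard library's xml.etree.ElementTree in Python 3. The sample document is hypothetical: the thread never shows the real markup, so the nesting and the Name element are assumptions, and tag names may come out lowercased after HTML parsing, as the 'relationship' comparison above suggests.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real document's layout is not shown in the thread.
SAMPLE = """
<XML>
  <BoundData>
    <Row><Relationship>Owner</Relationship><Priority>1</Priority><Name>A</Name></Row>
    <Row><Relationship>Driver</Relationship><Priority>1</Priority><Name>B</Name></Row>
    <Row><Relationship>Owner</Relationship><Priority>2</Priority><Name>C</Name></Row>
  </BoundData>
</XML>
"""


def owner_rows(xml_text):
    """Return a dict of child tag -> text for each Owner / Priority 1 row."""
    root = ET.fromstring(xml_text.strip())
    matches = []
    for row in root.iter('Row'):
        if (row.findtext('Relationship') == 'Owner'
                and row.findtext('Priority') == '1'):
            matches.append({child.tag: child.text for child in row})
    return matches
```

Testing the Row element rather than the Relationship element keeps the sibling values within reach, which is the part the loop above was missing.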
Re: question
Hi there :) A little tip upfront: In the future you might want to come up with a more descriptive subject line. This will help readers decide early if they can possibly help or not. [EMAIL PROTECTED] wrote: > def albumInfo(theBand): > def Rush(): > return ['Rush', 'Fly By Night', 'Caress of Steel', '2112', 'A > Farewell to Kings', 'Hemispheres'] > > def Enchant(): > return ['A Blueprint of the World', 'Wounded', 'Time Lost'] > > ... > Yuck! ;) > The only problem with the code above though is that I don't know how to call > it, especially since if the user is entering a string, how would I convert > that string into a function name? While this is relatively easy, it is *waaayyy* too complicated an approach here, because . . . > def albumInfo(theBand): > if theBand == 'Rush': > return ['Rush', 'Fly By Night', 'Caress of Steel', '2112', 'A > Farewell to Kings', 'Hemispheres'] > elif theBand == 'Enchant': > return ['A Blueprint of the World', 'Wounded', 'Time Lost'] > ... > . . . this is a lot more fitting for this problem. You could also have used a dictionary here, but the above is better if you have a lot of lists, because only the one you use is created (I think . . .). You might also want to consider preparing a textfile and reading it into a list (via lines = open("somefile.txt").readlines()) and then work with that so you don't have to hardcode it into the program. This however is somewhat advanced (if you're just starting out), so don't sweat it. > I'm not familiar with how 'classes' work yet (still reading through my 'Core > Python' book) but was curious if using a 'class' would be better suited for > something like this? Since the user could possibly choose from 100 or more > choices, I'd like to come up with something that's efficient as well as easy > to read in the code. If anyone has time I'd love to hear your thoughts. > Think of classes as "models of things and their behavior" (like an animal, a car or a robot). 
What you want is a simple "request->answer" style functionality, hence a function. Hope that helps. Happy coding :) /W -- http://mail.python.org/mailman/listinfo/python-list
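A minimal sketch of the text-file idea mentioned above, for Python 3. The line format "Band: album; album" and the function name parse_album_lines are assumptions made for illustration; any format you can split reliably would do.

```python
def parse_album_lines(lines):
    """Build {band: [albums]} from lines of the form 'Band: album; album'."""
    album_info = {}
    for line in lines:
        line = line.strip()
        if not line or ':' not in line:
            continue  # skip blank or malformed lines
        band, _, albums = line.partition(':')
        album_info[band.strip()] = [a.strip() for a in albums.split(';') if a.strip()]
    return album_info


# With a real file: parse_album_lines(open("somefile.txt"))
sample = ["Rush: Fly By Night; 2112\n", "\n", "Enchant: Wounded; Time Lost\n"]
album_info = parse_album_lines(sample)
```

Because the function accepts any iterable of lines, it can be tried on a plain list first and pointed at an open file later.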
Re: Bug in __init__?
On 2008-01-22, citizen Bruno Desthuilliers testified:
>> from copy import copy
>> ### see also deepcopy
>> self.lst = copy(val)
>
> What makes you think the OP wants a copy ?

I'm guessing he doesn't want to mutate the original list while changing the contents of self.lst.

bart
-- "chłopcy dali z siebie wszystko, z czego tv pokazała głównie bebechy" http://candajon.azorragarse.info/ http://azorragarse.candajon.info/
-- http://mail.python.org/mailman/listinfo/python-list
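To illustrate the point about copying, here is a small Python 3 sketch. The class Holder is hypothetical; only the line self.lst = copy(val) comes from the thread. It also shows why deepcopy gets a mention: a shallow copy still shares any nested lists.

```python
from copy import copy


class Holder:
    """Hypothetical class; only `self.lst = copy(val)` comes from the thread."""

    def __init__(self, val):
        self.lst = copy(val)  # shallow copy: a new list with the same items


original = [1, 2, 3]
holder = Holder(original)
holder.lst.append(4)       # changes only the copy

nested = [[1], [2]]
shallow = Holder(nested)
shallow.lst[0].append(99)  # inner lists are still shared with `nested`
```

Whether the copy is wanted at all depends on the caller's intent, which is the question being raised in the quoted text.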
Re: HTML parsing confusion
On Jan 22, 11:39 am, "Diez B. Roggisch" <[EMAIL PROTECTED]> wrote: > Alnilam wrote: > > On Jan 22, 8:44 am, Alnilam <[EMAIL PROTECTED]> wrote: > >> > Pardon me, but the standard issue Python 2.n (for n in range(5, 2, > >> > -1)) doesn't have an xml.dom.ext ... you must have the mega-monstrous > >> > 200-modules PyXML package installed. And you don't want the 75Kb > >> > BeautifulSoup? > > >> I wasn't aware that I had PyXML installed, and can't find a reference > >> to having it installed in pydocs. ... > > > Ugh. Found it. Sorry about that, but I still don't understand why > > there isn't a simple way to do this without using PyXML, BeautifulSoup > > or libxml2dom. What's the point in having sgmllib, htmllib, > > HTMLParser, and formatter all built in if I have to use use someone > > else's modules to write a couple of lines of code that achieve the > > simple thing I want. I get the feeling that this would be easier if I > > just broke down and wrote a couple of regular expressions, but it > > hardly seems a 'pythonic' way of going about things. > > This is simply a gross misunderstanding of what BeautifulSoup or lxml > accomplish. Dealing with mal-formatted HTML whilst trying to make _some_ > sense is by no means trivial. And just because you can come up with a few > lines of code using rexes that work for your current use-case doesn't mean > that they serve as general html-fixing-routine. Or do you think the rather > long history and 75Kb of code for BS are because it's creator wasn't aware > of rexes? > > And it also makes no sense stuffing everything remotely useful into the > standard lib. This would force to align development and release cycles, > resulting in much less features and stability as it can be wished. > > And to be honest: I fail to see where your problem is. BeatifulSoup is a > single Python file. 
So whatever you carry with you from machine to machine, > if it's capable of holding a file of your own code, you can simply put > BeautifulSoup beside it - even if it was a floppy disk. > > Diez I am, by no means, trying to trivialize the work that goes into creating the numerous modules out there. However as a relatively novice programmer trying to figure out something, the fact that these modules are pushed on people with such zealous devotion that you take offense at my desire to not use them gives me a bit of pause. I use non-included modules for tasks that require them, when the capability to do something clearly can't be done easily another way (eg. MySQLdb). I am sure that there will be plenty of times where I will use BeautifulSoup. In this instance, however, I was trying to solve a specific problem which I attempted to lay out clearly from the outset. I was asking this community if there was a simple way to use only the tools included with Python to parse a bit of html. If the answer is no, that's fine. Confusing, but fine. If the answer is yes, great. I look forward to learning from someone's example. If you don't have an answer, or a positive contribution, then please don't interject your angst into this thread. -- http://mail.python.org/mailman/listinfo/python-list
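For what it's worth, a standard-library-only approach does exist for simple extraction jobs. The sketch below uses Python 3's html.parser (the module called HTMLParser in the Python 2 releases discussed here). Collecting link targets is an assumed task, since the thread does not say what needs to be extracted, and the class name LinkCollector is illustrative.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values of <a> tags; tolerant of sloppy markup."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased.
        if tag == 'a':
            for name, value in attrs:
                if name == 'href' and value is not None:
                    self.links.append(value)


collector = LinkCollector()
collector.feed('<p>See <a href="http://example.com/one">one<p>and <A HREF=two.html>two</a>')
collector.close()
```

This handles unclosed tags and unquoted attributes, but it does not build a tree or repair badly broken documents, which is where the third-party libraries discussed above earn their keep.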
Re: get the size of a dynamically changing file fast ?
Mike Driscoll wrote:
> On Jan 17, 3:56 pm, Stef Mientki <[EMAIL PROTECTED]> wrote:
>
>> hello,
>>
>> I've a program (not written in Python) that generates a few thousand
>> bytes per second.
>> These bytes are dumped in 2 buffers (files), at an interval of 50 msec;
>> the files can be read by another program, to do further processing.
>>
>> A program written in VB or Delphi can handle the data in the 2 buffers
>> perfectly.
>> Sometimes Python is also able to process the data correctly,
>> but often it can't :-(
>>
>> I keep one of the files open and test the size of the open datafile every
>> 50 msec.
>> I have tried
>>     os.stat ( ) [ ST_SIZE]
>>     os.path.getsize ( ... )
>> but they both have the same behaviour: sometimes it works, and the data
>> is collected every 50 .. 100 msec,
>> sometimes 1 .. 1.5 seconds is needed to detect a change in filesize.
>>
>> I'm using python 2.4 on winXP.
>>
>> Is there a solution for this problem ?
>>
>> thanks,
>> Stef Mientki
>>
>
> Tim Golden has a method to watch for changes in a directory on his
> website:
>
> http://tgolden.sc.sabren.com/python/win32_how_do_i/watch_directory_for_changes.html
>
> This old post also mentions something similar:
>
> http://mail.python.org/pipermail/python-list/2007-October/463065.html
>
> And here's a cookbook recipe that claims to do it as well using
> decorators:
>
> http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/426620
>
> Hopefully that will get you going.
>
> Mike
>

Thanks Mike, and sorry for the late reaction. I have it working perfectly now.

After all, os.stat works perfectly well; the problem was in the program that generated the file with increasing size. Truncating the file after each block write apparently guarantees that it is flushed to disk, and all problems are solved.

cheers,
Stef Mientki
-- http://mail.python.org/mailman/listinfo/python-list
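A rough Python 3 sketch of both sides of this problem. The writer in the thread was not a Python program, so append_block only shows the Python analogue of forcing each block out to disk; current_size is the os.stat() check the reader performs. Both function names and the temporary file are illustrative assumptions.

```python
import os
import tempfile


def append_block(path, data):
    """Append `data` and push it to disk so readers see the new size."""
    with open(path, 'ab') as f:
        f.write(data)
        f.flush()              # empty Python's own buffer
        os.fsync(f.fileno())   # ask the OS to write the file to disk


def current_size(path):
    """Return the file size in bytes as reported by os.stat()."""
    return os.stat(path).st_size


fd, path = tempfile.mkstemp()
os.close(fd)
append_block(path, b'x' * 100)
size_after_first = current_size(path)
append_block(path, b'y' * 50)
size_after_second = current_size(path)
os.remove(path)
```

If the writer buffers its output, the reader can poll as often as it likes and still see a stale size, which matches the intermittent delays described above.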