Re: FTP example going through a FTP Proxy
On Jan 7, 12:32 pm, jakecjacobson wrote:
> Hi,
>
> I need to write a simple Python script that I can connect to a FTP
> server and download files from the server to my local box. I am
> required to go through a FTP Proxy and I don't see any examples on how
> to do this. The FTP proxy doesn't require username or password to
> connect but the FTP server that I am connecting to does.
>
> Any examples on how to do this would be greatly appreciated. I am
> limited to using Python version 2.4.3 on a Linux box.

This is what I have tried so far,

import urllib

proxies = {'ftp':'ftp://proxy_server:21'}
ftp_server = 'ftp.somecompany.com'
ftp_port='21'
username = ''
password = 'secretPW'

ftp_string='ftp://' + username + '@' + password + ftp_server + ':' + ftp_port

data = urllib.urlopen(ftp_string, proxies=proxies)

data=urllib.urlopen(req).read()

print data

I get the following error:

Traceback (most recent call last):
  File "./ftptest.py", line 22, in ?
    data = urllib.urlopen(ftp_server, proxies=proxies)
  File "/usr/lib/python2.4/urllib.py", line 82, in urlopen
    return opener.open(url)
  File "/usr/lib/python2.4/urllib.py", line 190, in open
    return getattr(self, name)(url)
  File "/usr/lib/python2.4/urllib.py", line 470, in open_ftp
    host, path = splithost(url)
  File "/usr/lib/python2.4/urllib.py", line 949, in splithost
    match = _hostprog.match(url)
TypeError: expected string or buffer

-- http://mail.python.org/mailman/listinfo/python-list
Re: FTP example going through a FTP Proxy
On Jan 7, 2:11 pm, jakecjacobson wrote:
> On Jan 7, 12:32 pm, jakecjacobson wrote:
>
> [snip]
>
> This is what I have tried so far,
>
> [snip]
>
> I get the following error:
>
> [snip]
> TypeError: expected string or buffer

I might be getting closer. Now I am getting "I/O error(ftp error): (111, 'Connection refused')" error with the following code:

import urllib2

proxies = {'ftp':'ftp://proxy_server:21'}
ftp_server = 'ftp.somecompany.com'
ftp_port='21'
username = ''
password = 'secretPW'

password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
top_level_url = ftp_server
password_mgr.add_password(None, top_level_url, username, password)

proxy_support = urllib2.ProxyHandler(proxies)
handler = urllib2.HTTPBasicAuthHandler(password_mgr)
opener = urllib2.build_opener(proxy_support)
opener = urllib2.build_opener(handler)
a_url = 'ftp://' + ftp_server + ':' + ftp_port + '/'
print a_url

try:
    data = opener.open(a_url)
    print data
except IOError, (errno, strerror):
    print "I/O error(%s): %s" % (errno, strerror)
FTP example going through a FTP Proxy
Hi,

I need to write a simple Python script that I can connect to a FTP server and download files from the server to my local box. I am required to go through a FTP Proxy and I don't see any examples on how to do this. The FTP proxy doesn't require username or password to connect but the FTP server that I am connecting to does.

Any examples on how to do this would be greatly appreciated. I am limited to using Python version 2.4.3 on a Linux box.
Re: FTP example going through a FTP Proxy
On Jan 7, 3:56 pm, jakecjacobson wrote:
> On Jan 7, 2:11 pm, jakecjacobson wrote:
>
> [snip]
>
> I might be getting closer. Now I am getting "I/O error(ftp error):
> (111, 'Connection refused')" error with the following code:
>
> [snip]

I tried the same code from a different box and got a different error message:

I/O error(ftp error): 501 USER format: proxy-user:auth- met...@destination. Closing connection.

My guess is that my original box couldn't connect with the firewall proxy so I was getting a connection refused error. Now it appears that the password mgr has an issue if I understand the error correctly.

I really hope that someone out in the Python Community can give me a pointer.
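The 501 reply above suggests that the proxy wants the destination encoded in the FTP USER command, which is something a urllib2 password manager does not do. One possible approach, sketched here with ftplib, is to connect to the proxy and log in with a combined user string. This is only a sketch under assumptions: it is written for Python 3 (ftplib also ships with 2.4, but the `with` statement does not), it assumes the common `remoteuser@remotehost` login form, and the proxy in this thread reports its own USER format, so the string would have to be adapted to it. `proxy_user` and `download_via_proxy` are hypothetical helper names, not library functions.

```python
from ftplib import FTP


def proxy_user(username, ftp_server):
    """Build the 'user@host' login string that many FTP proxies accept.

    Assumption: the proxy uses the USER remoteuser@remotehost scheme.
    """
    return "%s@%s" % (username, ftp_server)


def download_via_proxy(proxy_host, proxy_port, username, password,
                       ftp_server, remote_name, local_name):
    """Connect to the proxy, log in to the real server through it, fetch one file."""
    ftp = FTP()
    ftp.connect(proxy_host, int(proxy_port))  # talk to the proxy, not the server
    try:
        ftp.login(proxy_user(username, ftp_server), password)
        with open(local_name, "wb") as out:
            ftp.retrbinary("RETR " + remote_name, out.write)
    finally:
        ftp.close()
```

A call would look like `download_via_proxy('proxy_server', 21, 'someuser', 'secretPW', 'ftp.somecompany.com', 'remote.txt', 'local.txt')`, where the user and file names are placeholders.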
Getting/Setting HTTP Headers
I need to write a feed parser that takes a URL for any Atom or RSS feed and transforms it into an Atom feed. I have done the transformation part but I want to support conditional HTTP requests. I have not been able to find any examples that show:

1. How to read the Last-Modified or ETag header value from the requester
2. How to set the corresponding HTTP header value, either a 304 Not Modified or the new Last-Modified date and/or ETag values
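A rough sketch of the decision logic for the two points above, assuming the request headers are already available as a dictionary and that the feed's current ETag and modification time are known. It is written for Python 3; `conditional_response` is a hypothetical helper name, and how the headers are obtained and how the response is sent depends on the web framework in use.

```python
from email.utils import formatdate, mktime_tz, parsedate_tz


def conditional_response(request_headers, etag, last_modified_ts):
    """Return (status, headers) for a feed with a known ETag and mtime.

    request_headers: dict of the client's request headers.
    last_modified_ts: modification time as seconds since the epoch (UTC).
    """
    headers = {
        "ETag": etag,
        "Last-Modified": formatdate(last_modified_ts, usegmt=True),
    }
    client_etag = request_headers.get("If-None-Match")
    if client_etag is not None:
        # When the client sent an ETag, it decides the outcome on its own.
        status = 304 if client_etag.strip() == etag else 200
        return status, headers
    since = parsedate_tz(request_headers.get("If-Modified-Since", ""))
    if since is not None and int(last_modified_ts) <= mktime_tz(since):
        return 304, headers  # nothing newer than what the client has
    return 200, headers
```

With a 304 the body is left empty; with a 200 the transformed Atom feed is sent along with the two headers so the client can repeat them on its next request.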
Processing XML File
I need to take a XML web resource and split it up into smaller XML files. I am able to retrieve the web resource but I can't find any good XML examples. I am just learning Python so forgive me if this question has been answered many times in the past.

My resource is like:

...
...

...
...

So in this example, I would need to output 2 files with the contents of each file what is between the open and close document tag.
Re: Processing XML File
On Jan 29, 1:04 pm, Adam Tauno Williams wrote:
> On Fri, 2010-01-29 at 09:25 -0800, jakecjacobson wrote:
> > I need to take a XML web resource and split it up into smaller XML
> > files. I am able to retrieve the web resource but I can't find any
> > good XML examples. I am just learning Python so forgive me if this
> > question has been answered many times in the past.
> > My resource is like:
> >
> > ...
> > ...
> >
> > ...
> > ...
> >
> > So in this example, I would need to output 2 files with the contents
> > of each file what is between the open and close document tag.
>
> Do you want to parse the document or SaX?
>
> I have a SaX example at
> <http://coils.hg.sourceforge.net/hgweb/coils/coils/file/99b227b08f7f/s...>

Thanks but I am way over my head with XML, Python. I am working with DDMS and need to output the individual resource nodes to their own file. I hope that this helps and I need a good example and how to use it.

Here is what a resource node looks like:

https://metadata.dod.mil/mdr/ns/DDMS/1.4/ https://metadata.dod.mil/mdr/ns/DDMS/1.4/"; xmlns:ddms="https://metadata.dod.mil/mdr/ns/DDMS/1.4/"; xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"; xmlns:ICISM="urn:us:gov:ic:ism:v2"> https://metadata.dod.mil/mdr/ ns/MDR/1.0/MDR.owl#GovernanceNamespace" ddms:value="TBD"/> Sample Taxonomy This is a sample taxonomy created for the Help page. Sample Developer FGM, Inc. 703-885-1000 sampledevelo...@fgm.com

You can see the DDMS site at https://metadata.dod.mil/.
Re: Processing XML File
On Jan 29, 2:41 pm, Stefan Behnel wrote:
> Sells, Fred, 29.01.2010 20:31:
>
> > Google is your friend. Elementtree is one of the better documented
> > IMHO, but there are many modules to do this.
>
> Unless the OP provides some more information, "do this" is rather
> underdefined. And sending someone off to Google who is just learning the
> basics of Python and XML and trying to solve a very specific problem with
> them is not exactly the spirit I'm used to in this newsgroup.
>
> Stefan

Just want to thank everyone for their posts. I got it working after I discovered a name space issue with this code.

xmlDoc = libxml2.parseDoc(guts)
# Ignore namespace and just get the Resource
resourceNodes = xmlDoc.xpathEval('//*[local-name()="Resource"]')
for rNode in resourceNodes:
    print rNode
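The same namespace-agnostic selection can also be sketched with the standard library's ElementTree instead of libxml2. This is a Python 3 sketch, not the code used in the thread; `local_name` and `split_resources` are hypothetical helper names, and the element name "Resource" is taken from the XPath expression above.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a leading '{namespace}' from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def split_resources(xml_text, wanted="Resource"):
    """Return every <wanted> element, in any namespace, as its own XML string."""
    root = ET.fromstring(xml_text)
    return [ET.tostring(element, encoding="unicode")
            for element in root.iter()
            if local_name(element.tag) == wanted]
```

Each returned string could then be written to its own file, for example by looping over the list with `enumerate` and opening one output file per item.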
Authenticating to web service using https and client certificate
Hi,

I need to post some XML files to a web service that requires a client certificate to authenticate. I have some code that works on posting a multipart form over http but I need to modify it to pass the proper certificate and post the XML file. Is there any example code that will point me in the correct direction?

Thanks for your help.
exceptions.TypeError an integer is required
I am trying to do a post to a REST API over HTTPS that requires the script to pass a cert to the server. I am getting an "exceptions.TypeError an integer is required" error and can't find the reason. By commenting out lines of code, I found that it is happening on the connection.request() line. Here is the problem code. Would love some help if possible.

head = {"Content-Type" : "application/x-www-form-urlencoded", "Accept" : "text/plain"}
parameters = urlencode({"collection" : collection, "entryxml" : open(file,'r').read()})
try:
    connection = httplib.HTTPSConnection(host, port, key_file, cert_file)
    connection.request('POST', path, parameters, head)
    response = connection.getresponse()
    print response.status, response.reason
except:
    print sys.exc_type, sys.exc_value

connection.close()
Re: exceptions.TypeError an integer is required
On Jul 24, 3:11 pm, Steven D'Aprano wrote:
> On Fri, 24 Jul 2009 11:24:58 -0700, jakecjacobson wrote:
> > I am trying to do a post to a REST API over HTTPS and requires the
> > script to pass a cert to the server. I am getting "exceptions.TypeError
> > an integer is required" error and can't find the reason. I commenting
> > out the lines of code, it is happening on the connection.request() line.
> > Here is the problem code. Would love some help if possible.
>
> Please post the traceback that you get.
>
> My guess is that you are passing a string instead of an integer, probably
> for the port.
>
> [...]
>
> > except:
> >     print sys.exc_type, sys.exc_value
>
> As a general rule, a bare except of that fashion is bad practice. Unless
> you can explain why it is normally bad practice, *and* why your case is
> an exception (no pun intended) to the rule "never use bare except
> clauses", I suggest you either:
>
> * replace "except:" with "except Exception:" instead.
>
> * better still, re-write the entire try block as:
>
> try:
>     [code goes here]
> finally:
>     connection.close()
>
> and use the Python error-reporting mechanism instead of defeating it.
>
> --
> Steven

Steven,

You are quite correct in your statements. My goal was not to make great code but something that I could quickly test. My assumption was that the httplib.HTTPSConnection() would do the cast to int for me. As soon as I cast it to an int, I was able to get past that issue. Still not able to post because I am getting a bad cert error.

Jake Jacobson
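The two points from this exchange, casting the port explicitly and closing the connection in a finally block, can be put together roughly as follows. This is a Python 3 sketch using http.client (the successor to httplib); `make_connection` and `post` are hypothetical helper names, and the client certificate handling from the original code is left out.

```python
import http.client


def make_connection(host, port):
    """Create (but do not yet open) an HTTPS connection with an integer port."""
    # Values read from a config file or the command line arrive as strings.
    return http.client.HTTPSConnection(host, int(port))


def post(host, port, path, body, headers):
    """POST body and return (status, reason); the connection is always closed."""
    connection = make_connection(host, port)
    try:
        connection.request("POST", path, body, headers)
        response = connection.getresponse()
        return response.status, response.reason
    finally:
        connection.close()
```

Because nothing is caught here, any failure surfaces with its full traceback, which is the behaviour Steven recommends over a bare except.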
bad certificate error
Hi,

I am getting the following error when doing a post to REST API,

Enter PEM pass phrase:
Traceback (most recent call last):
  File "./ices_catalog_feeder.py", line 193, in ?
    main(sys.argv[1])
  File "./ices_catalog_feeder.py", line 60, in main
    post2Catalog(catalog_host, catalog_port, catalog_path, os.path.join(input_dir, file), collection_name, key_file, cert_file)
  File "./ices_catalog_feeder.py", line 125, in post2Catalog
    connection.request('POST', path, parameters, head)
  File "/usr/lib/python2.4/httplib.py", line 810, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python2.4/httplib.py", line 833, in _send_request
    self.endheaders()
  File "/usr/lib/python2.4/httplib.py", line 804, in endheaders
    self._send_output()
  File "/usr/lib/python2.4/httplib.py", line 685, in _send_output
    self.send(msg)
  File "/usr/lib/python2.4/httplib.py", line 652, in send
    self.connect()
  File "/usr/lib/python2.4/httplib.py", line 1079, in connect
    ssl = socket.ssl(sock, self.key_file, self.cert_file)
  File "/usr/lib/python2.4/socket.py", line 74, in ssl
    return _realssl(sock, keyfile, certfile)
socket.sslerror: (1, 'error:14094412:SSL routines:SSL3_READ_BYTES:sslv3 alert bad certificate')

My code where this error occurs is:

head = {"Content-Type" : "application/x-www-form-urlencoded", "Accept" : "text/plain"}
parameters = urlencode({"collection" : collection, "entryxml" : open(file,'r').read()})
print "Sending the file to: " + host

try:
    try:
        # Default port is 443.
        # key_file is the name of a PEM formatted file that contains your private key.
        # cert_file is a PEM formatted certificate chain file.
        connection = httplib.HTTPSConnection(host, int(port), key_file, cert_file)
        connection.request('POST', path, parameters, head)
        response = connection.getresponse()
        print response.status, response.reason
    except httplib.error, (value,message):
        print value + ':' + message
finally:
    connection.close()

I was wondering if this is due to the server having a invalid server cert? If I go to this server in my browser, I get a "This server tried to identify itself with invalid information". Is there a way to ignore this issue with Python? Can I setup a trust store and add this server to the trust store?
Re: bad certificate error
On Jul 27, 2:23 pm, "Gabriel Genellina" wrote:
> En Mon, 27 Jul 2009 12:57:40 -0300, jakecjacobson
> escribió:
>
> > I was wondering if this is due to the server having a invalid server
> > cert? If I go to this server in my browser, I get a "This server
> > tried to identify itself with invalid information". Is there a way to
> > ignore this issue with Python? Can I setup a trust store and add this
> > server to the trust store?
>
> I don't see the point in trusting someone that you know is telling lies
> about itself.
>
> --
> Gabriel Genellina

It is a test box that the team I am on runs. That is why I would trust it.
Re: bad certificate error
On Jul 28, 3:29 am, Nick Craig-Wood wrote:
> jakecjacobson wrote:
> > I am getting the following error when doing a post to REST API,
> >
> > [snip]
> > socket.sslerror: (1, 'error:14094412:SSL
> > routines:SSL3_READ_BYTES:sslv3 alert bad certificate')
> >
> > My code where this error occurs is:
> >
> > [snip]
> >
> > I was wondering if this is due to the server having a invalid server
> > cert?
>
> I'd say judging from the traceback you messed up key_file or cert_file
> somehow.
>
> Try using the openssl binary on them (read the man page to see how!)
> to check them out.
>
> > If I go to this server in my browser, I get a "This server tried to
> > identify itself with invalid information". Is there a way to
> > ignore this issue with Python? Can I setup a trust store and add
> > this server to the trust store?
>
> Invalid how? Self signed certificate? Domain mismatch? Expired certificate?
>
> --
> Nick Craig-Wood -- http://www.craig-wood.com/nick

Nick,

Thanks for the help on this. I will check my steps on openssl again and see if I messed up. What I tried to do was:

1. Save my PKI cert to disk. It was saved as a P12 file
2. Use openssl to convert it to the needed .pem file type
3. Saved the CA that my cert was signed by as a .crt file

These are the 2 files that I was using for key_file and cert_file:

* cert_file -> CA
* key_file -> my PKI cert converted to a .pem file

"Invalid how? Self signed certificate? Domain mismatch? Expired certificate?" It is a server name mismatch.

For everyone that wants to discuss why we shouldn't do this, great, but I can't change the fact that I need to do this. I can't use http or even get a correct cert at this time. This is a quick and dirty project to demonstrate capability. I need something more than slide show briefs.
Re: bad certificate error
On Jul 28, 9:48 am, Jean-Paul Calderone wrote:
> On Tue, 28 Jul 2009 03:35:55 -0700 (PDT), jakecjacobson wrote:
>
> [snip]
>
> >"Invalid how? Self signed certificate? Domain mismatch? Expired
> >certificate?" It is a server name mismatch.
>
> Python 2.4 is not capable of allowing you to customize this verification
> behavior. It is hard coded to let OpenSSL make the decision about whether
> to accept the certificate or not.
>
> Either M2Crypto or pyOpenSSL will let you ignore verification errors. The
> new ssl module in Python 2.6 may also as well.
>
> Jean-Paul

Thanks, I will look into these suggestions.
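For readers on a current Python rather than 2.4, the ssl module does allow server verification to be switched off for a throwaway test box. A minimal sketch follows, assuming Python 3; `insecure_test_context` is a hypothetical helper name, and none of this is available on the 2.4 interpreter discussed in the thread.

```python
import ssl


def insecure_test_context():
    """Build an SSL context that skips certificate and hostname checks.

    Only reasonable for a test server you control; never for real traffic.
    """
    context = ssl.create_default_context()
    # Hostname checking has to be disabled before verification is relaxed.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    return context
```

The resulting object can be passed as the `context` argument of `http.client.HTTPSConnection`. A client certificate and key, if the server requires them, would be added with the context's `load_cert_chain` method.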
Re: bad certificate error
On Jul 29, 2:08 am, "Gabriel Genellina" wrote:
> En Tue, 28 Jul 2009 09:02:40 -0300, Steven D'Aprano
> escribió:
>
> > On Mon, 27 Jul 2009 23:16:39 -0300, Gabriel Genellina wrote:
>
> >> I don't see the point on "fixing" either the Python script or httplib to
> >> accomodate for an invalid server certificate... If it's just for
> >> internal testing, I'd use HTTP instead (at least until the certificate
> >> is fixed).
>
> > In real life, sometimes you need to drive with bad brakes on your car,
> > walk down dark alleys in the bad part of town, climb a tree without a
> > safety line, and use a hammer without wearing goggles. We can do all
> > these things.
>
> > The OP has said that, for whatever reason, he needs to ignore a bad
> > server certificate when connecting to HTTPS. Python is a language where
> > developers are allowed to shoot themselves in the foot, so long as they
> > do so in full knowledge of what they're doing.
>
> > So, putting aside all the millions of reasons why the OP shouldn't accept
> > an invalid certificate, how can he accept an invalid certificate?
>
> Yes, I understand the situation, but I'm afraid there is no way (that I
> know of). At least not without patching _ssl.c; all the SSL negotiation is
> handled by the OpenSSL library itself.
>
> I vaguely remember a pure Python SSL implementation somewhere that perhaps
> could be hacked to bypass all controls. But making it work properly will
> probably require a lot more effort than installing a self signed
> certificate in the server...
>
> --
> Gabriel Genellina

I have it working and I want to thank everyone for their efforts and very helpful hints. The error was with me and not understanding the documentation about the cert_file & key_file. After using openssl to divide up my p12 file into a cert file and a key file, following the instructions at http://security.ncsa.uiuc.edu/research/grid-howtos/usefulopenssl.php, I got everything working.

Again, much thanks.

Jake
Help making this script better
Hi,

After much Google searching and trial & error, I was able to write a Python script that posts XML files to a REST API using HTTPS and passing PEM cert & key file. It seems to be working but would like some pointers on how to handle errors. I am using Python 2.4, I don't have the capability to upgrade even though I would like to. I am very new to Python so help will be greatly appreciated and I hope others can use this script.

#!/usr/bin/python
#
# catalog_feeder.py
#
# This script will process a directory of XML files and push them to the Enterprise Catalog.
# You configure this script by using a configuration file that describes the required variables.
# The path to this file is either passed into the script as a command line argument or hard coded
# in the script. The script will terminate with an error if it can't process the XML file.
#

# IMPORT STATEMENTS
import httplib
import mimetypes
import os
import sys
import shutil
import time
from urllib import *
from time import strftime
from xml.dom import minidom

def main(c):
    start_time = time.time()
    # Set configuration parameters
    try:
        # Process the XML conf file
        xmldoc = minidom.parse(c)
        catalog_host = readConfFile(xmldoc, 'catalog_host')
        catalog_port = int(readConfFile(xmldoc, 'catalog_port'))
        catalog_path = readConfFile(xmldoc, 'catalog_path')
        collection_name = readConfFile(xmldoc, 'collection_name')
        cert_file = readConfFile(xmldoc, 'cert_file')
        key_file = readConfFile(xmldoc, 'key_file')
        log_file = readConfFile(xmldoc, 'log_file')
        input_dir = readConfFile(xmldoc, 'input_dir')
        archive_dir = readConfFile(xmldoc, 'archive_dir')
        hold_dir = readConfFile(xmldoc, 'hold_dir')
    except Exception, inst:
        # I had an error so report it and exit script
        print "Unexpected error opening %s: %s" % (c, inst)
        sys.exit(1)
    # Log Starting
    logOut = verifyLogging(log_file)
    if logOut:
        log(logOut, "Processing Started ...")
    # Get list of XML files to process
    if os.path.exists(input_dir):
        files = getFiles2Post(input_dir)
    else:
        if logOut:
            log(logOut, "WARNING!!! Couldn't find input directory: " + input_dir)
            cleanup(logOut)
        else:
            print "Dir doen't exist: " + input_dir
            sys.exit(1)
    try:
        # Process each file to the catalog
        connection = httplib.HTTPSConnection(catalog_host, catalog_port, key_file, cert_file)
        for file in files:
            log(logOut, "Processing " + file + " ...")
            try:
                response = post2Catalog(connection, catalog_path, os.path.join(input_dir, file), collection_name)
                if response.status == 200:
                    msg = "Succesfully posted " + file + " to cataloge ..."
                    print msg
                    log(logOut, msg)
                    # Move file to done directory
                    shutil.move(os.path.join(input_dir, file), os.path.join(archive_dir, file))
                else:
                    msg = "Error posting " + file + " to cataloge [" + response.read() + "] ..."
                    print msg
                    log(logOut, response.read())
                    # Move file to error dir
                    shutil.move(os.path.join(input_dir, file), os.path.join(hold_dir, file))
            except IOError, (errno):
                print "%s" % (errno)
    except httplib.HTTPException, (e):
        print "Unexpected error %s " % (e)
    run_time = time.time() - start_time
    print 'Run time: %f seconds' % run_time
    # Clean up
    connection.close()
    cleanup(logOut)

# Get an array of files from the input_dir
def getFiles2Post(d):
    return (os.listdir(d))

# Read out the conf file and set the needed global variable
def readConfFile(xmldoc, tag):
    return (xmldoc.getElementsByTagName(tag)[0].firstChild.data)

# Write out the message to log file
def log(f, m):
    f.write(strftime("%Y-%m-%d %H:%M:%S") + " : " + m + '\n')

# Clean up and exit
def cleanup(logOut):
    if logOut:
        log(logOut, "Proce
How to unencode a string
This seems like a real simple newbie question but how can a person unencode a string? In Perl I use something like:

"$part=~ s/\%([A-Fa-f0-9]{2})/pack('C', hex($1))/seg;"

If I have a string like Word1%20Word2%20Word3 I want to get Word1 Word2 Word3. Would also like to handle special characters like '",(){}[] etc.
Re: How to unencode a string
On Aug 27, 6:51 pm, Piet van Oostrum wrote:
> >>>>> jakecjacobson (j) wrote:
> >j> This seems like a real simple newbie question but how can a person
> >j> unencode a string? In Perl I use something like: "$part=~ s/\%([A-Fa-
> >j> f0-9]{2})/pack('C', hex($1))/seg;"
> >j> If I have a string like Word1%20Word2%20Word3 I want to get Word1
> >j> Word2 Word3.
>
> urllib.unquote(string)
>
> >j> Would also like to handle special characters like '",(){}
> >j> [] etc/
>
> What would you like to do with them? Or do you mean to replace %27 by ' etc?
> --
> Piet van Oostrum
> URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
> Private email: p...@vanoostrum.org

Yes, take '%27' and replace with ', etc.
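For illustration, here is what the suggested call does with the strings from the question. The sketch uses the Python 3 import location; on Python 2 the same function is `urllib.unquote`, as Piet wrote.

```python
from urllib.parse import unquote  # Python 2: from urllib import unquote

# %20 is a space; other %XX escapes decode to their characters as well.
decoded = unquote("Word1%20Word2%20Word3")
specials = unquote("%27%22%2C%28%29%7B%7D%5B%5D")

print(decoded)
print(specials)
```

No separate handling is needed for the special characters: any `%XX` escape is decoded by the same call.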
How to Convert IO Stream to XML Document
I am trying to build a Python script that reads a Sitemap file and pushes the URLs to a Google Search Appliance. I am able to fetch the XML document and parse it with regular expressions but I want to move to using native XML tools to do this. The problem I am getting is if I use urllib.urlopen(url) I can convert the IO Stream to a XML document but if I use urllib2.urlopen and then read the response, I get the content but when I use minidom.parse() I get a "IOError: [Errno 2] No such file or directory:" error

# THIS WORKS but will have issues if the IO Stream is a compressed file
def GetPageGuts(net, url):
    pageguts = urllib.urlopen(url)
    xmldoc = minidom.parse(pageguts)
    return xmldoc

# THIS DOESN'T WORK, but I don't understand why
def GetPageGuts(net, url):
    request=getRequest_obj(net, url)
    response = urllib2.urlopen(request)
    response.headers.items()
    pageguts = response.read()
    # Test to see if the response is a gzip/compressed data stream
    if isCompressedFile(response, url):
        compressedstream = StringIO.StringIO(pageguts)
        gzipper = gzip.GzipFile(fileobj = compressedstream)
        pageguts = gzipper.read()
    xmldoc = minidom.parse(pageguts)
    response.close()
    return xmldoc

# I am getting the following error
Starting SiteMap Manager ...
Traceback (most recent call last):
  File "./tester.py", line 267, in ?
    main()
  File "./tester.py", line 49, in main
    fetchSiteMap(ResourceDict, line)
  File "./tester.py", line 65, in fetchSiteMap
    pageguts = GetPageGuts(ResourceDict['NET'], url)
  File "./tester.py", line 89, in GetPageGuts
    xmldoc = minidom.parse(pageguts)
  File "/usr/lib/python2.4/xml/dom/minidom.py", line 1915, in parse
    return expatbuilder.parse(file)
  File "/usr/lib/python2.4/xml/dom/expatbuilder.py", line 922, in parse
    fp = open(file, 'rb')
IOError: [Errno 2] No such file or directory: '\nhttp://www.sitemaps.org/ schemas/sitemap/0.9">\n\nhttp://www.myorg.org/janes/ sitemaps/binder_sitemap.xml\n2010-09-09\n\n\nhttp://www.myorg.org/janes/sitemaps/ dir_sitemap.xml\n2010-05-05\n \n\nhttp://www.myorg.org/janes/sitemaps/ mags_sitemap.xml\n2010-09-09\n \n\nhttp://www.myorg.org/janes/sitemaps/ news_sitemap.xml\n2010-09-09\n \n\nhttp://www.myorg.org/janes/sitemaps/ sent_sitemap.xml\n2010-09-09\n \n\nhttp://www.myorg.org/janes/sitemaps/ srep_sitemap.xml\n2001-05-04\n \n\nhttp://www.myorg.org/janes/sitemaps/yb_sitemap.xml\n2010-09-09\n\n\n'

# A couple of supporting things
def getRequest_obj(net, url):
    request = urllib2.Request(url)
    request.add_header('User-Agent', 'ICES Sitemap Bot dni-ices- searchad...@ugov.gov')
    request.add_header('Accept-encoding', 'gzip')
    return request

def isCompressedFile(r, u):
    answer=False
    if r.headers.has_key('Content-encoding'):
        answer=True
    else:
        # Check to see if the URL ends in .gz
        if u.endswith(".gz"):
            answer=True
    return answer
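The traceback shows minidom.parse() trying to open the document text as if it were a file name: parse() expects a file name or a file-like object, while text that has already been read belongs in minidom.parseString(). A small sketch of that fix follows, written for Python 3 (parseString also exists in 2.4); `parse_xml_bytes` is a hypothetical helper name, and detecting gzip by its magic number is an assumption that replaces the header check in the original code.

```python
import gzip
import io
from xml.dom import minidom


def parse_xml_bytes(data):
    """Parse XML held in memory, gunzipping it first when necessary."""
    if data[:2] == b"\x1f\x8b":  # gzip streams start with these two bytes
        data = gzip.GzipFile(fileobj=io.BytesIO(data)).read()
    # parseString() takes the document itself; parse() wants a file or file name.
    return minidom.parseString(data)
```

In the failing function, this would mean replacing `minidom.parse(pageguts)` with `minidom.parseString(pageguts)` once `pageguts` holds the decompressed response body.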