Using Beautiful Soup
Heya. I have never used a module/script before, and the first problem I have run into is that I do not know how to install one. I have downloaded Beautiful Soup, but how do I use it in one of my own programs? I know that I use an "include" statement, but do I first need to copy BeautifulSoup.pyc or BeautifulSoup.py into the Python directory? Thanks in advance for any and all help that you may provide. Many thanks. -- http://mail.python.org/mailman/listinfo/python-list
Extracting text from a string
Hello. I am having a little trouble extracting text from a string. The string that I am dealing with is pasted below, and I want to extract the prices that it contains. Thanks in advance for any and all help. Thank you.

    $14.99 , $27.99 , $66.99 , $129.99 , $254.99
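A minimal sketch of one way to pull the prices out, assuming the text is already an ordinary Python string. The names text and extract_prices are illustrative, and the pattern only recognises a dollar sign, some digits, a dot, and exactly two more digits:

```python
import re

# Sample text copied from the post; `text` is an illustrative name.
text = "$14.99 , $27.99 , $66.99 , $129.99 , $254.99"

# \$ matches a literal dollar sign, \d+ the dollars, \.\d\d the cents.
PRICE_PATTERN = re.compile(r"\$\d+\.\d\d")


def extract_prices(source):
    """Return every price-like substring found in source, in order."""
    return PRICE_PATTERN.findall(source)


prices = extract_prices(text)
print(prices)
```

If the prices are needed as numbers rather than text, each match could be converted with float(price[1:]) after the dollar sign is dropped.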
Re: Extracting text from a string
Okay, so it sounds like I am headed in the right direction. However, I am not sure whether the text is in a string or in some other format, because it is enclosed in "[" and "]", not in ' '.
Re: Extracting text from a string
This is the output I get:

    >>> prices
    [ $14.99 , $27.99 , $66.99 , $129.99 , $254.99 ]
    >>>
split string problems
Hey. I am trying to grab the prices from the string below, but I get an error when I try to do it. Take a look at the code and error message below for me, and thank you in advance to all that help. Thank you. Here's the code and the error message:

    >>> p
    [ $14.99 , $27.99 , $66.99 , $129.99 , $254.99 ]
    >>> p.split()[2]
    Traceback (most recent call last):
      File "", line 1, in -toplevel-
        p.split()[2]
    AttributeError: 'ResultSet' object has no attribute 'split'
    >>>
urlopen() error
Hello. I am getting an error and it has gotten me stuck. I think the best thing I can do is post my code and the error message, and thank everybody in advance for any help with this issue. Thank you.

Here's the code:

    import urllib2
    import re
    import xlrd
    from BeautifulSoup import BeautifulSoup

    book = xlrd.open_workbook("ige_virtualMoney.xls")
    sh = book.sheet_by_index(0)
    rx = 1
    for rx in range(sh.nrows):
        u = sh.cell_value(rx, 0)
        page = urllib2.urlopen(u)
        soup = BeautifulSoup(page)
        p = soup.findAll('span', "sale")
        p = str(p)
        p2 = re.findall('\$\d+\.\d\d', p)
        for price in p2:
            print price

Here are the error messages:

    Traceback (most recent call last):
      File "E:\Python24\scraper.py", line 16, in -toplevel-
        page = urllib2.urlopen(u)
      File "E:\Python24\lib\urllib2.py", line 130, in urlopen
        return _opener.open(url, data)
      File "E:\Python24\lib\urllib2.py", line 350, in open
        protocol = req.get_type()
      File "E:\Python24\lib\urllib2.py", line 233, in get_type
        raise ValueError, "unknown url type: %s" % self.__original
    ValueError: unknown url type: List
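The traceback suggests that one of the cells in column 0 holds the text List (possibly a header in the first row) rather than a URL, and urlopen() rejects a value without a scheme it recognises. A minimal sketch of filtering such cells out before opening them, written for Python 3. The sample values and the helper name looks_like_url are made up for illustration, and a plain list stands in for the spreadsheet column:

```python
from urllib.parse import urlparse


def looks_like_url(value):
    """Return True if value is a string with an http/https scheme and a host."""
    if not isinstance(value, str):
        return False
    parts = urlparse(value.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# Hypothetical column contents: a header cell, two URLs, and an empty cell.
cells = ["List", "http://example.com/a", "https://example.com/b", ""]

# Only these values would be handed on to urlopen().
urls_to_fetch = [cell for cell in cells if looks_like_url(cell)]
print(urls_to_fetch)
```

If the non-URL cell is only ever a header in the first row, starting the loop at row 1 would be a simpler alternative to checking every cell.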
Pre-defining an action to take when an expected error occurs
Hello. I am getting the error that is displayed below, and I know exactly why it occurs. I posted some of my program's code below, and if you look at it you will see that the error terminates the program prematurely. Because of this premature termination, the program is not able to execute its final line of code, which is a very important line: it saves the Excel spreadsheet. So is there a way to make sure the last line executes? Thanks in advance for all of the help. Thank you.

Error:

    IndexError: list index out of range

Code sample:

    for rx in range(sh.nrows):
        rx = rx +1
        u = sh.cell_value(rx, 0)
        u = str(u)
        if u != end:
            page = urllib2.urlopen(u)
            soup = BeautifulSoup(page)
            p = soup.findAll('span', "sale")
            p = str(p)
            p2 = re.findall('\$\d+\.\d\d', p)
            for row in p2:
                ws.write(r,0,row)
    w.save('price_list.xls')
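One common way to make sure a final step runs even when an earlier line raises is a try/finally block. A minimal sketch under stated assumptions: FakeWorkbook and write_prices are hypothetical stand-ins (not the real spreadsheet-writing API), and the loop repeats the index shift from the post so that it overruns the list on purpose:

```python
class FakeWorkbook:
    """Stand-in for the real workbook object; only records that save() ran."""

    def __init__(self):
        self.saved_as = None

    def save(self, filename):
        self.saved_as = filename


def write_prices(rows, workbook):
    """Walk the rows with an off-by-one index, but always save the workbook."""
    try:
        for rx in range(len(rows)):
            rx = rx + 1      # same shift as in the post; overruns on the last pass
            _ = rows[rx]     # raises IndexError when rx == len(rows)
    finally:
        # Runs whether or not the loop raised, so the file is still written.
        workbook.save("price_list.xls")


workbook = FakeWorkbook()
try:
    write_prices(["row 0", "row 1", "row 2"], workbook)
except IndexError as error:
    print("loop stopped early:", error)

print("saved as:", workbook.saved_as)
```

The finally clause only guarantees the save; the exception still propagates afterwards. Looping over range(1, sh.nrows) instead of adding 1 inside the loop would likely remove the IndexError itself.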
Re: Pre-defining an action to take when an expected error occurs
Thanks for all of the help. It has all been very useful to a new Python programmer. I agree that I should fix the error/bug instead of handling it with a try/etc. However, I do not know why "range(sh.nrows)" never gets the number of rows right. For example, if the Excel sheet has 10 rows with data in them, the statement "range(sh.nrows)" should build the list of numbers [0, 1,...9]. It should, but it doesn't do that. What it does is build a list from [0, 1...20], or a little more or less, but the point is that it always grabs empty rows after the last row containing data. Why is that? I have no idea, but I do know that this is what is producing the error I am getting. Thanks again for the responses that I have received already, and again thanks for any further help. Thank you.
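A sheet's reported row count can include trailing rows that look empty. A minimal sketch of counting rows only up to the last one that actually holds data, assuming the rows have already been read into plain Python lists. The function name count_rows_with_data and the sample rows are hypothetical, and no spreadsheet library is involved:

```python
def count_rows_with_data(rows):
    """Return the number of rows up to and including the last non-blank row."""
    last_used = 0
    for position, row in enumerate(rows, start=1):
        # A cell counts as data if it is not None and not just whitespace.
        if any(cell is not None and str(cell).strip() for cell in row):
            last_used = position
    return last_used


# Hypothetical sheet: three rows of data followed by two blank rows.
sheet_rows = [
    ["http://example.com/a", 14.99],
    ["http://example.com/b", 27.99],
    ["http://example.com/c", ""],
    ["", ""],
    [" ", None],
]

usable_rows = count_rows_with_data(sheet_rows)
print(usable_rows)
```

Looping over range(usable_rows) rather than the full reported row count would then skip the trailing blank rows. A blank row in the middle of the data is still counted, because only the position of the last non-blank row matters here.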
Re: Pre-defining an action to take when an expected error occurs
John Machin, thanks for all of your help, and I take responsibility for the way I worded my sentences in my last reply to this topic. So in an effort to say sorry, I want to make it clear to everybody that it seems as though errors in my code and in my use of external programs (Excel in particular) are making "range(sh.nrows)" give faulty results. I am trying to pinpoint the spot in my code or in my use of Excel, before "range(sh.nrows)" is executed, that is bugged. John Machin, I am thrilled that the package xlrd exists at all because it simplifies a daunting task for a beginner programmer like me. Its uses are not limited to beginners either. So thanks for the package and your help to this point.
Re: xlrd number of rows in worksheet (was: Re: Pre-defining an action to take when an expected error occurs)
It worked. Those two functions (usefulness_of_cells & number_of_good_rows) seem to work flawlessly...knock on wood. I have run a number of different Excel spreadsheets through the functions, and so far the functions have a 100% accuracy rating. The accuracy rating is based on the functions' returned number of cells containing text, excluding a space as text, against the actual, hand-counted number of cells with text. Thank you John Machin for all of your help. I am using these two functions, with your name tagged to them both. Let me know if that's a problem. Thank you again.
Concerning Regular Expressions
I've been reading a bunch of articles and tutorials on the net, but I cannot quite get ahold of the whole regular expression process. I have a list that contains about thirty strings, each in its own spot in the list. What I want to do is search the list, say it's called 'lines', for 'R0 -'. Thanks in advance for any and all info that I receive. -Tempo-
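For a fixed piece of text such as 'R0 -', plain string methods may be enough and no regular expression is needed. A minimal sketch, assuming lines is a list of strings as described; the sample contents are invented for illustration:

```python
# Hypothetical log lines; only the marker 'R0 -' comes from the post.
lines = [
    "R0 - boot sequence started\n",
    "R1 - idle\n",
    "   R0 - retry\n",
    "status: R0 - ok\n",
]

# Lines that begin with the marker.
starts_with_marker = [line for line in lines if line.startswith("R0 -")]

# Lines that contain the marker anywhere.
contains_marker = [line for line in lines if "R0 -" in line]

print(len(starts_with_marker), len(contains_marker))
```

startswith() is the closer match for a pattern anchored at the start of a line, while the in operator finds the text at any position.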
Re: Concerning Regular Expressions
Whoops. I've got another tidbit to tack onto this post. What kind of value does this expression return: re.sub(r'^R0 -', line) Does it return a '1' if successful and a '0' if not successful?
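For reference, re.sub() takes three required arguments (pattern, replacement, string), so the two-argument call quoted above would raise a TypeError. It returns a new string, not a 1 or 0. A minimal sketch of what re.sub() and re.search() hand back; the sample strings are made up:

```python
import re

line = "R0 - start"

# re.sub takes (pattern, replacement, string) and returns a new string.
stripped = re.sub(r"^R0 -", "", line)

# When nothing matches, the original text comes back unchanged.
untouched = re.sub(r"^R0 -", "", "R1 - start")

# re.search returns a match object, or None when there is no match.
found = re.search(r"^R0 -", line)
missing = re.search(r"^R0 -", "R1 - start")

print(repr(stripped), repr(untouched), found is not None, missing is None)
```

To test whether a line matches, re.search() compared against None is the usual tool. When a count of replacements is wanted, re.subn() returns the new string together with the number of substitutions made.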
Re: Concerning Regular Expressions
You are right that my move towards regular expressions was way premature, but this post may also turn out to be a little premature. I guessed and checked my way to something that accomplishes what I needed, and I will include it in this post. But first, Alex (it doesn't have to be Alex), could you tell me if your snippet and mine would be near-perfect substitutes for one another? I believe they accomplish the same task of looking for 'R0 -' in the list 'lines'; however, as you have guessed, I do not know my way around Python very well yet. Here is my snippet:

    log = open('C:\log_0.txt')
    lines = log.readlines()
    import re
    for line in lines:
        R = re.search(r'^R0', line)
        if R != None:
            n = 1
            print n
    import time
    time.sleep(3)
    log.close()
Is Python good for web crawlers?
I was wondering if Python is a good language to build a web crawler with? For example, to construct a program that will routinely search x number of sites to check the availability of a product. Or to search for news articles containing the word 'XYZ'. These are just random ideas to try to explain my question a bit further. If you have an opinion about this, please let me know because I am very interested to hear what you have to say. Thanks.
Re: Is Python good for web crawlers?
Why do you say that the bottleneck of the crawler will always be downloading the page? Is it because there isn't already a module to do this and I will have to start from scratch? Or is it a bandwidth issue?
Re: Is Python good for web crawlers?
Does a web crawler have to download an entire page if it only needs to check if the product is in stock on a page? Or if it just needs to search for one match of a certain word on a page?
Re: Is Python good for web crawlers?
I took your advice and got a copy of BeautifulSoup, but I am having trouble installing the module. Any advice? I noticed that I can't just put it into the 'lib' directory of Python to install it.
Re: Is Python good for web crawlers?
I agree. I think the way that I will learn to use most of it is by going through the source code.
HTML page into a string
In my last post I received some advice to use urllib.read() to get a whole HTML page as a string, which will then allow me to use BeautifulSoup to do what I want with the string. But when I was researching the 'urllib' module, I couldn't find anything about its sub-section '.read()'. Is that the right module to get an HTML page into a string? Or am I completely missing something here? The latter seems the more likely of the two cases. Thanks for any and all help.
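One likely source of the confusion is that read() is not a function of the urllib module itself but a method of the file-like object that urlopen() returns. A minimal sketch in Python 3, where urlopen lives in urllib.request; the helper name fetch_text and the sample HTML are illustrative, and a data: URL is used only so that the example runs without a network connection:

```python
from urllib.parse import quote
from urllib.request import urlopen


def fetch_text(url, encoding="utf-8"):
    """Open url and return the whole response body as a string."""
    with urlopen(url) as response:
        raw_bytes = response.read()   # read() belongs to the response object
    return raw_bytes.decode(encoding)


# A data: URL keeps the sketch offline; an http:// address is opened the same way.
sample_html = "<html><body><span class='sale'>$14.99</span></body></html>"
page_text = fetch_text("data:text/html," + quote(sample_html))
print(page_text)
```

The resulting string can then be handed to an HTML parser. The encoding is assumed to be UTF-8 here; a real page may declare a different one.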
Re: HTML page into a string
Perfect. Thanks a bunch for clearing that all up for me. You have saved me some long hours.
Python and ASP
I recently uploaded a sample ASP-Python page to my web server and it didn't show up correctly. Before I explain what it did, I should mention that I got the same result when I tried to view the page from my desktop (I'm a Windows XP user). In both cases, all that showed up was the source code. I'm not sure exactly what this means, since I know that Python 2.4 is installed on my computer and the ASP page still didn't show up correctly. Here's the sample ASP-Python code, courtesy of http://www.4guysfromrolla.com/webtech/082201-1.shtml :

    ASP-Python Test Page
    <%@ Language=Python %>
    <% Response.Write("Python lives in the ASP delimeters!") %>
    document.write("Python's throwing a party on the client-side!")
    Response.Write("Python gets ready to rumble inside a server-side scripting block!")

Any ideas about what I am doing wrong? Thanks.
Re: Python and ASP
It still doesn't work. I fixed the one error that you pointed out, Roger Upole, but it still isn't working. All I did was copy and paste the code above, plus Roger's fix, into Notepad2 and save it as a '.asp' file. When I opened it in Firefox, all that showed up was the source code of the file. It seems like the web browser is reading the '.asp' file as a text file. Any further ideas? Thanks.
Re: Python and ASP
What do you mean? I can't just upload the file to the server that is going to host my site?
Python, Forms, Databases
I have been looking around for a few days for ways to use Python with HTML forms. What I am interested in doing is placing the data that is submitted through an HTML form and collected by Python into a MySQL database. I initially thought that I was going to be able to do this with ASP, but I found out that my web hosting provider doesn't have ASP support installed on their servers. Also, I couldn't find any dirt cheap hosting providers that support Python and ASP together. The next solution I stumbled upon was to use Zope. However, I wasn't sure if I should spend more of my time looking into this or not, and I found a possible way around it, which is by using the CGI module. As you can see, I am somewhat new to Python web programming and I have confused myself. Can anybody name a few modules that would allow me to accomplish what I am interested in doing? Thanks for your help.
Re: Python, Forms, Databases
Larry, I do see your point. There does seem to be a lot more support for PHP and MySQL together than there is for Python and ASP. But I want to try to accomplish my goal with Python first before I give up and go back to PHP. So if I was going to parse HTML forms and place the data into a MySQL database, what should I use? The CGI module? Zope? Webware? Thanks for any and all help.
What are COM-enabled applications?
As the subject of this post suggests, I have one question: what are COM-enabled applications? I believe Microsoft Word is one of these apps, but what else? Is a web browser, Paint, Solitaire, games, etc.? I'm not sure if it varies from operating system to operating system, but I am talking about COM applications in Windows. Thanks for any and all of your help and time.
Copy a file from PDA
Are there libraries out there that will assist me in copying a file from my Dell Axim PDA (Windows Mobile) and putting the copy onto my desktop (Windows XP)? Thanks so much.
Build EXE on Mac OS X 10.4
Has anyone successfully built a *.exe file from a *.py file on a Mac operating system before? I have been trying to do this with pyinstaller, but I keep getting errors and I don't know how to install UPX properly. I tried putting the Linux UPX folder in my Python 2.4 directory, but that didn't work. I am just generally confused right now. Ha. If anybody can lend me some insight I would really appreciate it. Thank you for taking the time to read this post. -b