On 7/12/19 12:53 PM, Sam Paython wrote:
This is the code I am writing:
import requests
from bs4 import BeautifulSoup
request = requests.get("https://www.amazon.ca/dp/B07RZFQ6HC")
content = request.content
soup = BeautifulSoup(content, "html.parser")
element = soup.find("span",{"id":"priceblock_dealprice"})
print(element.text.strip())

and this is the error I am getting:
C:\Users\Sam\PycharmProjects\untitled2\venv\Scripts\python.exe 
C:/Users/Sam/PycharmProjects/untitled2/src/app.py
Traceback (most recent call last):
   File "C:/Users/Sam/PycharmProjects/untitled2/src/app.py", line 9, in <module>
     print(element.text.strip())
AttributeError: 'NoneType' object has no attribute 'text'

Could someone please help?


The err.msg/stack-trace is your friend! The mention of "NoneType" means (roughly!) that 'there's nothing there': "element" is None, so it has no ".text" to print().

The question then becomes: "why?" or "why not?"...

With a short piece of code like this, and (I am assuming) trying out a library for the first time, may I recommend the Python REPL: it lets you 'see' what is going on under the hood and, ultimately, reveals the problem.

From a Python terminal (cmd is appropriate to your PC's OpSys):

[dn@JrBrown ~]$ python3
Python 3.7.4 (default, Jul  9 2019, 16:48:28)
[GCC 8.3.1 20190223 (Red Hat 8.3.1-2)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import requests
>>> from bs4 import BeautifulSoup
>>> request = requests.get("https://www.amazon.ca/dp/B07RZFQ6HC")
>>> request            # notice how I'm asking to 'see' what happened
<Response [503]>
>>> content = request.content
>>> content            # there is no need to enclose in print()!
b'<!DOCTYPE html>\n<!--[if lt IE 7]> <html lang="en-us" class="a-no-js ...many lines of HTML, excised in the interests of brevity...
\')[0].appendChild(elem);\n    }\n    </script>\n</body></html>\n'
>>> soup = BeautifulSoup(content, "html.parser")
>>> soup
<!DOCTYPE html>
...many more lines of HTML...
</body></html>

>>> element = soup.find("span",{"id":"priceblock_dealprice"})
>>> element
>>>

The last entry asks for the contents of "element" to be displayed - and they are, except that element contains nothing/None (and the REPL displays nothing for None). Oops!
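
One way to have the original script report this situation, rather than crash, is to test for None before touching ".text". A minimal sketch follows. The helper name element_text() is hypothetical, and a stand-in object takes the place of a real BS4 tag so that the example runs without any network access:

```python
from types import SimpleNamespace


def element_text(element, default="(element not found)"):
    """Return the stripped text of a find() result, or a default if it is None."""
    if element is None:          # soup.find() returns None when nothing matches
        return default
    return element.text.strip()


# Stand-ins for illustration only: None mimics a failed soup.find(),
# and the SimpleNamespace mimics a tag with a made-up .text value.
missing = None
found = SimpleNamespace(text="  some price text \n")

print(element_text(missing))
print(element_text(found))
```

In the original script, the same test could wrap the print() call on its last line.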


Working 'backwards' (and using 'simple' Python functions to prove that it is not our use of requests/BS4 that is at-fault):

>>> soup.find( "price" )             # looks for a <price> tag: not found

>>> content.find( b"price" )         # the b"" is necessary because
                                     # we are dealing with bytes,
                                     # not a Unicode string
-1
>>>

Sadly, the -1 indicates that "price" was not found anywhere in the response, which is bound to be disappointing to you.
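
To illustrate the remark about b"" above, here is a small sketch using a made-up bytes value (not Amazon's actual response). It shows what find() returns on bytes, and what happens if the b prefix is forgotten:

```python
# Made-up sample data for illustration; not the real page content.
content = b"<html><body>Sorry, something went wrong</body></html>"

hit = content.find(b"Sorry")     # index of the first match
miss = content.find(b"price")    # -1 means "not found"
print(hit, miss)

# Searching bytes with a str raises TypeError in Python 3.
try:
    content.find("price")
    mixed_ok = True
except TypeError:
    mixed_ok = False
print(mixed_ok)

# Decoding first gives a str, which can then be searched with a str.
text = content.decode("utf-8")
print(text.find("price"))
```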


Yet all is not lost!

If you read the HTML data that the REPL has happily splattered all over your terminal's screen (scroll back; NB "soup" is easier to read than "content"!), you will observe that what you saw in your web-browser is not what Amazon served in response to the Python "requests.get()"! The <Response [503]> displayed earlier in the session was the first hint: HTTP status 503 means "Service Unavailable", not a normal page.
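
A script can check for this itself before handing anything to BeautifulSoup. The sketch below assumes only that the response object has a ".status_code" attribute, as the objects returned by requests.get() do. The function name check_response() and the stand-in responses are hypothetical, used so that the example runs offline:

```python
from types import SimpleNamespace


def check_response(response):
    """Describe whether a response looks safe to parse."""
    code = response.status_code
    if code == 200:
        return "OK: parse the content"
    return f"HTTP {code}: the server did not return the normal page"


# Stand-ins for real responses (illustration only, no network access).
served = SimpleNamespace(status_code=200, content=b"<html>...</html>")
refused = SimpleNamespace(status_code=503, content=b"<html>...</html>")

print(check_response(served))
print(check_response(refused))
```

With a real requests response, calling response.raise_for_status() is another option: it raises an exception for 4xx and 5xx status codes.
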
--
Regards =dn
--
https://mail.python.org/mailman/listinfo/python-list
