On 12/22/2022 8:35 AM, hongy...@gmail.com wrote:
I want to extract / scrape the “Matrix form” dataset from the BCS website [1],
i.e., the data that appears in the 3rd column.
I tried the following Python code snippet, but could not figure out
the trick:
Tell us what you observed and what you expected. For example, does the
data get downloaded? Do you get error messages, and if so, what are
they? Does the id variable contain anything at all? Etc.
import requests
from bs4 import BeautifulSoup
import re
proxies = {
'http': 'socks5h://127.0.0.1:18888',
'https': 'socks5h://127.0.0.1:18888'
}
requests.packages.urllib3.disable_warnings()
r = requests.get(
    'https://www.cryst.ehu.es/cgi-bin/plane/programs/nph-plane_getgen?gnum=17&type=plane',
    proxies=proxies, verify=False)
soup = BeautifulSoup(r.content, features="lxml")
table = soup.find('table')
id = table.find_all('id')
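One thing that can be checked without touching the network: find_all('id') matches tags that are named <id>, not an attribute or a column, so on ordinary HTML it returns an empty list. Here is a minimal sketch of that check. SAMPLE_HTML is a made-up stand-in for the page (the real markup may well differ), and I use the built-in "html.parser" backend so that lxml is not required:

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page; the real markup may differ.
SAMPLE_HTML = """
<table>
  <tr><th>No.</th><th>(x,y) form</th><th>Matrix form</th></tr>
  <tr><td>1</td><td>x,y</td><td>1 0 0 / 0 1 0</td></tr>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
table = soup.find("table")

# Searches for <id> elements; HTML has no such tag, so nothing matches.
by_tag_name = table.find_all("id")

# Searches for <td> elements, which is what table cells actually are.
cells = table.find_all("td")

print(len(by_tag_name))  # 0
print(len(cells))        # 3
```

If the same two counts are printed for the real response, that answers the question about whether id contains anything.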
My Python environment is as follows:
werner@X10DAi:~$ pyenv shell datasci
(datasci) werner@X10DAi:~$ python --version
Python 3.11.1
Any tips will be appreciated.
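As a starting point, here is a rough sketch of pulling the 3rd cell out of every table row with BeautifulSoup. It assumes the data sits in plain <tr>/<td> rows of the first table on the page, which I have not verified against the real site. The helper name third_column and the sample markup (including every column heading other than "Matrix form") are hypothetical:

```python
from bs4 import BeautifulSoup

# Hypothetical sample; the real page's markup may be laid out differently.
SAMPLE_HTML = """
<table>
  <tr><th>No.</th><th>(x,y) form</th><th>Matrix form</th></tr>
  <tr><td>1</td><td>x,y</td><td>1 0 0 / 0 1 0</td></tr>
  <tr><td>2</td><td>-x,-y</td><td>-1 0 0 / 0 -1 0</td></tr>
</table>
"""


def third_column(html: str) -> list[str]:
    """Return the text of the 3rd <td> of each row in the first table."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    if table is None:
        return []  # no table at all: nothing to extract
    values = []
    for row in table.find_all("tr"):
        # Only direct <td> children, so cells of nested tables are not counted.
        cells = row.find_all("td", recursive=False)
        if len(cells) >= 3:
            values.append(cells[2].get_text(" ", strip=True))
    return values


print(third_column(SAMPLE_HTML))
```

To try it on the real page, pass r.text (or r.content decoded) instead of SAMPLE_HTML. If the result is empty, print the first few rows of the table to see how the cells are actually nested.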
[1]
https://www.cryst.ehu.es/cgi-bin/plane/programs/nph-plane_getgen?gnum=17&type=plane
Regards,
Zhao