On 12/22/2022 8:35 AM, hongy...@gmail.com wrote:
I want to extract (scrape) the “Matrix form” dataset from the BCS website [1], 
i.e., the data that appears in the 3rd column.

I tried the following Python code snippet, but could not figure out how to 
do it:

Tell us what you observed and what you expected. For example, does the data get downloaded? Do you get error messages, and if so, what are they? Does the id variable contain anything at all? Etc.

import requests
from bs4 import BeautifulSoup
import re

proxies = {
     'http': 'socks5h://127.0.0.1:18888',
     'https': 'socks5h://127.0.0.1:18888'
}

requests.packages.urllib3.disable_warnings()
r = requests.get(
    'https://www.cryst.ehu.es/cgi-bin/plane/programs/nph-plane_getgen?gnum=17&type=plane',
    proxies=proxies, verify=False)
soup = BeautifulSoup(r.content, features="lxml")

table = soup.find('table')
id = table.find_all('id')
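
One thing to check here: find_all('id') searches for tags named <id>, which an HTML table does not normally contain, so it most likely returns an empty list (it also shadows the built-in id). Table data usually lives in <tr> rows and <td> cells. Below is a minimal sketch of pulling the third cell out of each row. The SAMPLE_HTML string and the third_column helper are hypothetical stand-ins, not the real page; the actual markup on the BCS site may differ (for example, nested tables for the matrices), so the row and cell selection may need adjusting.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page (r.content in the snippet
# above). The real page's markup may be different.
SAMPLE_HTML = """
<table>
  <tr><th>No.</th><th>(x,y) form</th><th>Matrix form</th></tr>
  <tr><td>1</td><td>x,y</td><td>1 0 0 / 0 1 0</td></tr>
  <tr><td>2</td><td>-x,-y</td><td>-1 0 0 / 0 -1 0</td></tr>
</table>
"""


def third_column(html):
    """Return the text of the third <td> of every table row that has one."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    if table is None:
        # No table found: worth printing the raw response to see what came back.
        return []
    values = []
    for row in table.find_all("tr"):
        cells = row.find_all("td")
        # Header rows use <th>, so they have no <td> cells and are skipped.
        if len(cells) >= 3:
            values.append(cells[2].get_text(" ", strip=True))
    return values


print(third_column(SAMPLE_HTML))
```

With the real response, the same helper would be called as third_column(r.content). If it returns an empty list, printing r.status_code and the first few hundred characters of r.text should show whether the page was downloaded at all.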

My Python environment is as follows:

werner@X10DAi:~$ pyenv shell datasci
(datasci) werner@X10DAi:~$ python --version
Python 3.11.1

Any tips will be appreciated.

[1] https://www.cryst.ehu.es/cgi-bin/plane/programs/nph-plane_getgen?gnum=17&type=plane

Regards,
Zhao

--
https://mail.python.org/mailman/listinfo/python-list
