Cloudflare, for whatever reason, appears to be rejecting the
`User-Agent` header that urllib is providing: `Python-urllib/3.9`.
Using a different `User-Agent` seems to get around the issue:
import urllib.request
req = urllib.request.Request(
url="https://juno.sh/direct-connection-to-jupyter-s
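For reference, here is a minimal self-contained sketch of the same idea: overriding the default `User-Agent` on a `urllib.request.Request`. The URL and the user-agent string below are placeholders I made up for illustration, not the values from the snippet above, and whether a given site accepts a particular `User-Agent` is something you would have to check yourself.

```python
import urllib.request

# Hypothetical placeholder values; neither comes from the original message.
EXAMPLE_URL = "https://example.com/"
CUSTOM_USER_AGENT = "Mozilla/5.0 (compatible; example-scraper/0.1)"


def default_user_agent() -> str:
    """Return the User-Agent urllib sends when the caller sets none."""
    opener = urllib.request.build_opener()
    # OpenerDirector.addheaders is a list of (name, value) tuples.
    return dict(opener.addheaders)["User-agent"]


def build_request(url: str, user_agent: str = CUSTOM_USER_AGENT) -> urllib.request.Request:
    """Build a Request whose User-Agent replaces urllib's default."""
    return urllib.request.Request(url=url, headers={"User-Agent": user_agent})


def fetch_text(url: str, user_agent: str = CUSTOM_USER_AGENT, timeout: float = 10.0) -> str:
    """Fetch a page and decode it, falling back to UTF-8 if no charset is declared."""
    req = build_request(url, user_agent)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")
```

Calling `default_user_agent()` shows the `Python-urllib/<version>` string that urllib would otherwise send, and something like `fetch_text(EXAMPLE_URL)` performs the actual request with the replacement header.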
On Wed, Dec 8, 2021 at 4:51 AM Julius Hamilton
wrote:
>
> Hey,
>
> I am currently working on a simple program which scrapes text from webpages
> via a URL, then segments it (with Spacy).
>
> I’m trying to refine my program to use just the right tools for the job,
> for each of the steps.
>
> Reque
Hey,
I am currently working on a simple program which scrapes text from webpages
via a URL, then segments it (with Spacy).
I’m trying to refine my program to use just the right tools for the job,
for each of the steps.
Requests.get works great, but I’ve seen people use urllib.request.urlopen()
i