You seem to misunderstand how curl works. curl builds an HTTP request from whatever parameters you give it. Any parameter you do not specify falls back to a default.
For example, the following headers:

  User-Agent --> will be set to "curl/VERSION"
  Host --> the DNS name of the URL you are accessing, e.g. www.google.com

TLS will be handled automatically by curl, amongst other things.

Try adding the "-v" flag to get more detailed information, for example the request headers you are sending to the server. If that is too much information, you can use "-i" instead, which prints the response headers before the body. Alternatively, try "-I" (capital i) or "--head", which sends a HEAD request instead of a GET request and shows you only the response headers.

  curl -L https://www.softpedia.com/get/System/Back-Up-and-Recovery/Icedrive.shtml#download -o eraseme.html

A few points here:

1. You are just sending a basic GET request. No authorization, no cookie, no meta information.
   --> Yes, that will most likely get you blocked by Cloudflare. It is not even the server itself that blocked you, but Cloudflare, which protects the actual server from crawlers and DDoS attacks. (This is information which you didn't include in your post, but it would have been relevant.)
2. "#download" is irrelevant to the request, because the fragment is not transmitted to the server.
3. You don't need to write the response to a file every time. Without "-o", curl writes the response to stdout.
   --> You could use ' | grep "text" ' to filter the response for a specific text.
4. The "Content-Type" response header tells you what the response body contains. It may not always be HTML.

Sending requests blindly will most likely get you blocked. While curl is powerful, it is not a magic tool that automatically gets you what you need. That will sometimes require multiple, carefully crafted requests, with an understanding of what you want to get and what the server requires you to send.
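To make the points above concrete, here is a minimal sketch in plain POSIX shell. The function names (show_request_headers and so on) are made up for illustration and are not part of curl; the flags are the ones mentioned above plus "-s", "-S" and "-w", and I am assuming a reasonably recent curl. Nothing is sent until you call one of the functions yourself.

```shell
#!/bin/sh
# URL from the original post, fragment included.
url='https://www.softpedia.com/get/System/Back-Up-and-Recovery/Icedrive.shtml#download'

# The fragment ("#download") is never transmitted to the server.
# Stripping it shows the URL that is actually requested.
request_url="${url%%#*}"
printf 'URL sent to the server: %s\n' "$request_url"

# Hypothetical helper: print only the request headers curl sends.
# "-v" writes them to stderr, prefixed with "> ".
show_request_headers() {
    curl -sS -v -o /dev/null "$1" 2>&1 | grep '^> '
}

# Hypothetical helper: send a HEAD request and print the response headers only.
show_response_headers() {
    curl -sS -I "$1"
}

# Hypothetical helper: print just the Content-Type of the response.
show_content_type() {
    curl -sS -o /dev/null -w '%{content_type}\n' "$1"
}

# Example calls (uncomment to send a real request):
# show_request_headers "$request_url"
# show_response_headers "$request_url"
# show_content_type "$request_url"
```

Running show_request_headers against any site you are allowed to query should show the default User-Agent and Host headers described earlier.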
That is why I suggested looking for a developer API: it will contain the information you need to include in the request so that you won't get blocked.

Also, please do respect the server's TOS. If they have an API, use that. If they don't, ask the server owner before you send them requests at random to pull information out of their website. (Because that is exactly what a crawler does, more or less.)

curl is just a tool to send and receive HTTP requests (or other protocols, but I won't go into detail here). If you want to improve your knowledge of how HTTP works, I can recommend your browser's dev tools, where you can inspect requests: what the requests and responses look like, which headers they include, and some metadata. That will give you a pretty good idea of what HTTP does.

If you are interested in even more detail, you can look into the official HTTP spec here:
HTTP Spec: https://www.rfc-editor.org/rfc/rfc9110.html

Additionally, please keep in mind that you are sending every email to several hundred (or thousands?) of users on this mailing list.

---
Bastian

On Sat, Oct 18, 2025 at 3:51 PM ToddAndMargo via curl-users <[email protected]> wrote:

> On 10/18/25 6:44 AM, ToddAndMargo via curl-users wrote:
> >>> On Sat, Oct 18, 2025 at 1:22 PM ToddAndMargo via curl-users <curl-
> >>> [email protected] <mailto:[email protected]>> wrote:
> >>>
> >>> On 10/18/25 3:06 AM, Daniel Stenberg wrote:
> >>> > On Sat, 18 Oct 2025, ToddAndMargo via curl-users wrote:
> >>> >
> >>> >> How do I get around "You've been blocked" on this web sire:
> >>> >
> >>> > Presumably you get blocked by a site if you somehow violate their terms
> >>> > or use or their perception of good behavior.
> >>> >
> >>> > A primary way to not get blocked would be to not do that. To understand
> >>> > the exact specifics and reasons, you would have to ask the admins of the
> >>> > website in question.
> >>> >
> >>>
> >>> I am not doing anything different than I ever do.
> >>> Do you see anything wrong with my code?
> >>> --
> >>> Unsubscribe: https://lists.haxx.se/mailman/listinfo/curl-users
> >>> Etiquette: https://curl.se/mail/etiquette.html
> >>>
> >
> > On 10/18/25 6:29 AM, Bastian Jesuiter via curl-users wrote:
> >> You didnt even post code.
> >
> > From my original post:
> >
> > curl -L https://www.softpedia.com/get/System/Back-Up-and-Recovery/
> > Icedrive.shtml#download -o eraseme.html
> >
> >> But regardless of that - if you get blocked, you'll need to resolve
> >> that with the admin of the page.
> >> Maybe you can even ask (them) if you may get an API Documentation for
> >> Developers and or explain what your use case is and how you should
> >> proceed.
> >> Some websites are explicitly custom fetching, others - like the
> >> icedrive website you shared recently - do not.
> >>
> >> You are most likely being catched by an AI Crawler blocker. Presumably
> >> because your requests are similar to the behavior of an AI or
> >> otherwise automated scraper.
> >> Dont do that.
> >
> > Curl does that?
> >
> >> Try to find dev documentation or ask the service admin as Daniel said.
> >>
> >> ---
> >> Bastian
> >
> > The web page that I get states:
> >
> > What can I do to resolve this?
> >
> > You can email the site owner to let them know you
> > were blocked. Please include what you were doing
> > when this page came up and the Cloudflare Ray ID
> > found at the bottom of this page.
> >
> > Cloudflare Ray ID: 990732037c51c798 • Your IP:
> > • Performance & security by Cloudflare
> >
> > And there is noting at the bottom of the page.
> >
>
> I just wrote Softpedia
