Scraping unknown or untrusted websites is a common task in certain... fields? I don't want to comment on it too deeply, but I know it's something folks do.
Imagine a site where someone enters a URL, clicks submit, and, with the power of funding, gets back a summary of the page.

On Mon, Jul 29, 2024, 3:52 PM robert engels <reng...@ix.netcom.com> wrote:

> Isn’t the HttpClient almost always used to access other services?
>
> Why would a developer access a malicious service?
>
> I also think there are lots of ways for a service to crash the client -
> e.g., it could attempt to return a very large response - if the client uses
> a memory buffered reader, it will cause an OOM as well.
>
> On Jul 29, 2024, at 2:42 PM, Andy Boothe <andy.boo...@gmail.com> wrote:
>
> Following up here.
>
> I believe I have discovered that it is possible to craft a malicious HTTP
> response that can cause the built-in HttpURLConnection and HttpClient
> implementations to throw exceptions. Specifically, HttpURLConnection can be
> made to throw a NegativeArraySizeException, and HttpClient can be made to
> throw an OutOfMemoryError. Proof of this behavior is in the attached (very
> simple) Java programs.
>
> This seems like A Bad Thing to me.
>
> I've moved from the dev list to this list based on a recommendation from
> that list. Is this the right list? If not, can you point me in the right
> direction? Perhaps a security list?
>
> Thank you,
>
> Andy Boothe
> *Email*: andy.boo...@gmail.com
> *Mobile*: (979) 574-1089
>
> On Wed, Jul 24, 2024 at 4:47 PM Andy Boothe <andy.boo...@gmail.com> wrote:
>
>> Hello,
>>
>> I'm moving this thread from jdk-dev to this list on the sage advice of
>> Pavel Rappo.
>>
>> As a brief recap, it looks like HttpClient and HttpURLConnection do not
>> currently support a way to set the maximum acceptable response header
>> length. As a result, sending HTTP requests with these classes that result
>> in a response with very long headers causes an OutOfMemoryError and a
>> NegativeArraySizeException, respectively. (Simple programs for reproducing
>> the issue are attached.) This seems like A Bad Thing.
>> There is a (very brief) discussion in the thread about how to handle, but
>> of course you guys are the experts.
>>
>> If my head is on straight and this turns out to be a real issue as
>> opposed to a mistake on my part, I'm keen to help however I can.
>>
>> Andy Boothe
>> *Email*: andy.boo...@gmail.com
>> *Mobile*: (979) 574-1089
>>
>>
>> ---------- Forwarded message ---------
>> From: Pavel Rappo <pavel.ra...@oracle.com>
>> Date: Wed, Jul 24, 2024 at 4:30 PM
>> Subject: Re: Very long response headers and java.net.http.HttpClient?
>> To: Andy Boothe <andy.boo...@gmail.com>
>> Cc: jdk-...@openjdk.org <jdk-...@openjdk.org>
>>
>>
>> A proper list would be net-dev at openjdk.java.net.
>>
>> > On 24 Jul 2024, at 21:13, Andy Boothe <andy.boo...@gmail.com> wrote:
>> >
>> > Hello,
>> >
>> > I'm documenting some guidelines for using java.net.http.HttpClient
>> > defensively for my team. For example: "Always set a request timeout",
>> > "Don't assume HTTP response entities are small and/or will fit in memory",
>> > etc.
>> >
>> > One guideline I'd like to document is "Set a maximum for HTTP response
>> > header size." However, I can't seem to find a way to set that limit, either
>> > in documentation or in OpenJDK code.
>> >
>> > I tried my best to search the archives for this mailing list for any
>> > mentions, but came up empty.
>> >
>> > To make sure my head is on straight and there isn't an undocumented
>> > limit set by default, I wrote the attached (very quick and dirty) client
>> > and server programs. LongResponseHeaderDemoServer opens a raw server socket
>> > and reads (what it assumes is) a well-formed HTTP request, and then prints
>> > an HTTP response which includes a response header of infinite length.
>> > LongResponseHeaderDemoHttpClient uses java.net.http.HttpClient to make a
>> > request and print the response body.
>> >
>> > When I run LongResponseHeaderDemoServer in one terminal and make a curl
>> > request to the server in another terminal, this is what curl spits out:
>> >
>> > $ curl -vvv -D - http://localhost:3000
>> > * Host localhost:3000 was resolved.
>> > * IPv6: ::1
>> > * IPv4: 127.0.0.1
>> > * Trying [::1]:3000...
>> > * Connected to localhost (::1) port 3000
>> > > GET / HTTP/1.1
>> > > Host: localhost:3000
>> > > User-Agent: curl/8.6.0
>> > > Accept: */*
>> > >
>> > < HTTP/1.1 200 OK
>> > HTTP/1.1 200 OK
>> > < Content-Type: text/plain
>> > Content-Type: text/plain
>> > < Connection: close
>> > Connection: close
>> > < Content-Length: 3
>> > Content-Length: 3
>> > * Closing connection
>> > curl: (100) A value or data field grew larger than allowed
>> >
>> > So curl detects the long response header and bails out. Safe and sane.
>> >
>> > However, when I run LongResponseHeaderDemoServer in one terminal and
>> > run LongResponseHeaderDemoHttpClient in another terminal, this is what
>> > happens:
>> >
>> > $ java LongResponseHeaderDemoHttpClient
>> > Exception in thread "main" java.io.IOException: Requested array size exceeds VM limit
>> >     at java.net.http/jdk.internal.net.http.HttpClientImpl.send(HttpClientImpl.java:966)
>> >     at java.net.http/jdk.internal.net.http.HttpClientFacade.send(HttpClientFacade.java:133)
>> >     at LongResponseHeaderDemoHttpClient.main(LongResponseHeaderDemoHttpClient.java:13)
>> > Caused by: java.lang.OutOfMemoryError: Requested array size exceeds VM limit
>> >     at java.base/java.util.Arrays.copyOf(Arrays.java:3541)
>> >     at java.base/java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:242)
>> >     at java.base/java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:806)
>> >     at java.base/java.lang.StringBuilder.append(StringBuilder.java:246)
>> >     at java.net.http/jdk.internal.net.http.Http1HeaderParser.readResumeHeader(Http1HeaderParser.java:250)
>> >     at java.net.http/jdk.internal.net.http.Http1HeaderParser.parse(Http1HeaderParser.java:124)
>> >     at java.net.http/jdk.internal.net.http.Http1Response$HeadersReader.handle(Http1Response.java:605)
>> >     at java.net.http/jdk.internal.net.http.Http1Response$HeadersReader.handle(Http1Response.java:536)
>> >     at java.net.http/jdk.internal.net.http.Http1Response$Receiver.accept(Http1Response.java:527)
>> >     at java.net.http/jdk.internal.net.http.Http1Response$HeadersReader.tryAsyncReceive(Http1Response.java:583)
>> >     at java.net.http/jdk.internal.net.http.Http1AsyncReceiver.flush(Http1AsyncReceiver.java:233)
>> >     at java.net.http/jdk.internal.net.http.Http1AsyncReceiver$$Lambda/0x00000008010dbd50.run(Unknown Source)
>> >     at java.net.http/jdk.internal.net.http.common.SequentialScheduler$LockingRestartableTask.run(SequentialScheduler.java:182)
>> >     at java.net.http/jdk.internal.net.http.common.SequentialScheduler$CompleteRestartableTask.run(SequentialScheduler.java:149)
>> >     at java.net.http/jdk.internal.net.http.common.SequentialScheduler$SchedulableTask.run(SequentialScheduler.java:207)
>> >     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
>> >     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
>> >     at java.base/java.lang.Thread.runWith(Thread.java:1596)
>> >     at java.base/java.lang.Thread.run(Thread.java:1583)
>> >
>> > Ostensibly, HttpClient just keeps on reading the never-ending header
>> > until it OOMs. This seems to confirm that there is no default limit to
>> > header size. It also seems like A Very Bad Thing to me. This suggests that
>> > any time a program makes an HTTP request to an untrusted source using
>> > HttpClient, for example when crawling the web, they are at risk of an OOM.
>> >
>> > For grins, I also wrote an application
>> > LongResponseHeaderDemoHttpURLConnection that does the same thing as
>> > LongResponseHeaderDemoHttpClient, just using HttpURLConnection instead of
>> > HttpClient. When I run LongResponseHeaderDemoServer in one terminal and
>> > LongResponseHeaderDemoHttpURLConnection in another terminal, this is what
>> > happens:
>> >
>> > $ java LongResponseHeaderDemoHttpURLConnection
>> > Exception in thread "main" java.lang.NegativeArraySizeException: -1610612736
>> >     at java.base/sun.net.www.MessageHeader.mergeHeader(MessageHeader.java:526)
>> >     at java.base/sun.net.www.MessageHeader.parseHeader(MessageHeader.java:481)
>> >     at java.base/sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:804)
>> >     at java.base/sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:726)
>> >     at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1688)
>> >     at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1589)
>> >     at java.base/java.net.URL.openStream(URL.java:1161)
>> >     at LongResponseHeaderDemoHttpURLConnection.main(LongResponseHeaderDemoHttpURLConnection.java:12)
>> >
>> > So HttpURLConnection doesn't handle things gracefully either, but at
>> > least it doesn't OOM. That seems like a bug, too, but perhaps less severe.
>> >
>> > For reference, here's my java version:
>> >
>> > $ java -version
>> > openjdk version "21.0.2" 2024-01-16 LTS
>> > OpenJDK Runtime Environment Corretto-21.0.2.13.1 (build 21.0.2+13-LTS)
>> > OpenJDK 64-Bit Server VM Corretto-21.0.2.13.1 (build 21.0.2+13-LTS, mixed mode, sharing)
>> >
>> > Can anyone check my work, and maybe reproduce? And ideally, can someone
>> > with more knowledge than me about java.net.http.HttpClient and/or
>> > java.net.HttpURLConnection please comment? Is this real, or have I made a
>> > mistake somewhere along the way? If it's real, what's next? A bug report?
>> >
>> > Andy Boothe
>> > Email: andy.boo...@gmail.com
>> > Mobile: (979) 574-1089
>>
>> <LongResponseHeaderDemoHttpClient.java>
> <LongResponseHeaderDemoHttpURLConnection.java>
> <LongResponseHeaderDemoServer.java>
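
For the two guidelines quoted in the thread that the public API does seem to cover ("Always set a request timeout" and "Don't assume HTTP response entities are small and/or will fit in memory"), a minimal sketch might look like the one below. The class name DefensiveFetch, the readAtMost helper, the 1 MiB cap, and the timeout values are all made up for illustration and are not from the attached programs. Note that this only bounds the response body; it does nothing about the header-size problem discussed above, since the headers are parsed before any body handler runs.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

class DefensiveFetch {
    // Hypothetical cap, chosen only for illustration.
    static final int MAX_BODY_BYTES = 1 << 20;

    /** Reads the whole stream, but fails once more than maxBytes have arrived. */
    static byte[] readAtMost(InputStream in, int maxBytes) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            total += n;
            if (total > maxBytes) {
                throw new IOException("Response body exceeds " + maxBytes + " bytes");
            }
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    /** Fetches a URI with timeouts set and a bounded body read. */
    static byte[] fetch(URI uri) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(5))   // illustrative value
                .build();
        HttpRequest request = HttpRequest.newBuilder(uri)
                .timeout(Duration.ofSeconds(10))         // illustrative value
                .GET()
                .build();
        // Stream the body instead of buffering it into a String or byte[] up front.
        HttpResponse<InputStream> response =
                client.send(request, HttpResponse.BodyHandlers.ofInputStream());
        try (InputStream body = response.body()) {
            return readAtMost(body, MAX_BODY_BYTES);
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] body = fetch(URI.create(args[0]));
        System.out.println(body.length + " bytes read");
    }
}
```

Streaming through BodyHandlers.ofInputStream() and counting bytes means an oversized body fails with an ordinary IOException that the caller can catch, rather than growing a buffer until the heap runs out.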