[ 
https://issues.apache.org/jira/browse/HTTPCLIENT-2300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17768380#comment-17768380
 ] 

Michael Lee commented on HTTPCLIENT-2300:
-----------------------------------------

[~olegk] The fix (changing AbstractCharResponseConsumer's default charset from 
US_ASCII to UTF_8) works.

But it also causes the response body retrieved by AbstractCharResponseConsumer 
to be different from that retrieved by SimpleResponseConsumer because 
SimpleBody's default charset is still US_ASCII.

Somehow, [String(byte[] bytes, Charset charset)|https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#String-byte:A-java.nio.charset.Charset-]
is able to create a string from a UTF_8 byte array even if US_ASCII is the 
specified charset, although the non-ASCII characters become corrupted in the 
resulting string.
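
The difference can be reproduced with the JDK alone. The sketch below is 
illustrative only and does not use HttpCore; the class and method names 
(CharsetDecodeDemo, decodeLenient, decodeStrict) are made up for this example. 
It relies on the documented behavior that String(byte[], Charset) always 
replaces malformed input with the charset's replacement string, whereas a 
CharsetDecoder obtained from Charset.newDecoder() reports malformed input by 
default, so its decode(ByteBuffer) convenience method throws.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;

public class CharsetDecodeDemo {

    // Lenient path: String(byte[], Charset) silently replaces every
    // malformed byte with U+FFFD instead of failing.
    static String decodeLenient(byte[] bytes) {
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    // Strict path: a freshly created CharsetDecoder uses
    // CodingErrorAction.REPORT, so malformed input raises an exception.
    static String decodeStrict(byte[] bytes) throws CharacterCodingException {
        return StandardCharsets.US_ASCII.newDecoder()
                .decode(ByteBuffer.wrap(bytes))
                .toString();
    }

    public static void main(String[] args) {
        // "caf\u00e9" encodes to 5 bytes in UTF-8; the last two are non-ASCII.
        byte[] utf8 = "caf\u00e9".getBytes(StandardCharsets.UTF_8);

        // Succeeds, but the accented character is corrupted.
        System.out.println(decodeLenient(utf8));

        try {
            decodeStrict(utf8);
        } catch (CharacterCodingException e) {
            // Same exception type as in the stack trace of this issue.
            System.out.println(e);
        }
    }
}
```

Whether HttpCore's char consumers should report or replace malformed input is a 
separate design question; the sketch only shows why the two JDK code paths 
disagree on the same bytes.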

If the byte stream to char stream conversion code in HttpCore is not able to 
handle a UTF_8 byte stream as a US_ASCII byte stream, should this fix also 
update the default charset of other classes in the 
org.apache.hc.client5.http.async.methods package (i.e. SimpleBody and 
AbstractCharPushConsumer) to ensure that their behavior is consistent?

Thanks.

> AbstractCharDataConsumer throws java.nio.charset.MalformedInputException for 
> an URI that SimpleResponseConsumer can handle
> --------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HTTPCLIENT-2300
>                 URL: https://issues.apache.org/jira/browse/HTTPCLIENT-2300
>             Project: HttpComponents HttpClient
>          Issue Type: Bug
>          Components: HttpClient (async)
>    Affects Versions: 5.2.1
>         Environment: openjdk version "17.0.8.1" 2023-08-24
> OpenJDK Runtime Environment Temurin-17.0.8.1+1 (build 17.0.8.1+1)
> OpenJDK 64-Bit Server VM Temurin-17.0.8.1+1 (build 17.0.8.1+1, mixed mode, 
> sharing)
>            Reporter: Michael Lee
>            Priority: Minor
>             Fix For: 5.2.2, 5.3-alpha2
>
>         Attachments: sample.zip
>
>
> HttpAsyncClient is able to retrieve the response body of 
> [https://www.videolan.org/vlc/] using a SimpleResponseConsumer but not a 
> trivial subclass of AbstractCharResponseConsumer. Using the latter, 
> AbstractCharDataConsumer.consume throws an exception. Excerpt:
>  
> {{java.nio.charset.MalformedInputException: Input length = 1}}
> {{    at java.base/java.nio.charset.CoderResult.throwException(CoderResult.java:274)}}
> {{    at org.apache.hc.core5.http.nio.entity.AbstractCharDataConsumer.checkResult(AbstractCharDataConsumer.java:103)}}
> {{    at org.apache.hc.core5.http.nio.entity.AbstractCharDataConsumer.consume(AbstractCharDataConsumer.java:156)}}
> {{    at org.apache.hc.client5.http.impl.async.HttpAsyncMainClientExec$1.consume(HttpAsyncMainClientExec.java:243)}}
> {{    at org.apache.hc.core5.http2.impl.nio.ClientH2StreamHandler.consumeData(ClientH2StreamHandler.java:235)}}
>  
> On the other hand, the same code works fine for other URLs such as 
> [https://httpbin.org/]
>  
> Attached HttpAsyncClientTests.java illustrates the issue. The method 
> testSimple using SimpleResponseConsumer works fine, but the method 
> testStreaming using a trivial subclass of AbstractCharResponseConsumer does 
> not.
>  



