[ 
https://issues.apache.org/jira/browse/HTTPCLIENT-2360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17922871#comment-17922871
 ] 

Hiran Chaudhuri edited comment on HTTPCLIENT-2360 at 2/1/25 5:22 PM:
---------------------------------------------------------------------

Now it is getting interesting. I agree the code looks promising.

Here is a small NetBeans/Maven project I created; it is zipped and attached to 
this issue. log4j2 is configured so that HttpClient logs all wire communication 
to the console.

The project creates three small files with non-ASCII characters in both the 
filename and the content, then posts these files to the Apache website. (I do 
not care about the server's response.) The console shows this:
{code:java}
...
23:49:44.244 [main] DEBUG org.apache.hc.client5.http.wire - http-outgoing-0 >> 
"Content-Disposition: form-data; name="document"; 
filename="LEGACY_[0xffffffe4][0xfffffff6][0xfffffffc][0xffffffdf]?.test"[\r][\n]"
...

...
23:49:44.388 [main] DEBUG org.apache.hc.client5.http.wire - http-outgoing-1 >> 
"Content-Disposition: form-data; name="document"; 
filename="STRICT_[0xffffffe4][0xfffffff6][0xfffffffc][0xffffffdf]?.test"[\r][\n]"
...

...
23:49:44.466 [main] DEBUG org.apache.hc.client5.http.wire - http-outgoing-2 >> 
"Content-Disposition: form-data; name="document"; 
filename="EXTENDED_%C3%A4%C3%B6%C3%BC%C3%9F%E1%9C%A3.test"[\r][\n]"
...
  {code}
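
To help read the log above, here is a minimal sketch that reproduces the LEGACY and STRICT byte patterns from the filename I infer from the EXTENDED line (ä, ö, ü, ß followed by U+1723). The class and method names are hypothetical, and the rendering only imitates the wire log for printable and high bytes. The `[0xffffffe4]` notation is consistent with a signed Java byte printed through `Integer.toHexString`, and the trailing `?` is consistent with ISO-8859-1 replacing a character it cannot represent.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of HttpClient: imitates how the wire log
// above appears to render bytes outside printable ASCII.
class WireLogDecoding {

    static String renderLikeWireLog(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (b >= 32 && b < 127) {
                // printable ASCII is shown as-is
                sb.append((char) b);
            } else {
                // a negative byte is sign-extended to int, hence "ffffff.."
                sb.append("[0x").append(Integer.toHexString(b)).append("]");
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // ä ö ü ß plus U+1723, which ISO-8859-1 cannot represent
        String name = "\u00e4\u00f6\u00fc\u00df\u1723";
        byte[] latin1 = name.getBytes(StandardCharsets.ISO_8859_1);
        // prints [0xffffffe4][0xfffffff6][0xfffffffc][0xffffffdf]?
        System.out.println(renderLikeWireLog(latin1));
    }
}
```

So the first two requests appear to carry the filename as raw ISO-8859-1 bytes, losing the last character, while the third carries percent-encoded UTF-8 inside the plain `filename` parameter.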
 

As mentioned above:
 * I do not see any `filename*` parameter.
 * I do not see the UTF-8'' charset prefix on the UTF-8 encoded value.
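
For comparison, this is a minimal sketch of the kind of value I would expect to see, assuming the RFC 5987 ext-value syntax that RFC 6266 refers to. The class and method names are hypothetical and this is not HttpClient code; it only shows how the `UTF-8''` prefix and the percent-encoding fit together in a `filename*` parameter.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of HttpClient: builds an RFC 5987 ext-value.
class Rfc5987FileName {

    // Non-alphanumeric characters that RFC 5987 allows unescaped (attr-char).
    private static final String ATTR_CHARS = "!#$&+-.^_`|~";

    static String encodeExtValue(String value) {
        // charset, then an empty language tag between the two single quotes
        StringBuilder sb = new StringBuilder("UTF-8''");
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xff; // treat the byte as unsigned
            boolean alnum = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9');
            if (alnum || ATTR_CHARS.indexOf(c) >= 0) {
                sb.append((char) c);
            } else {
                // everything else is percent-encoded byte by byte
                sb.append('%').append(String.format("%02X", c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String name = "EXTENDED_\u00e4\u00f6\u00fc\u00df\u1723.test";
        // prints filename*=UTF-8''EXTENDED_%C3%A4%C3%B6%C3%BC%C3%9F%E1%9C%A3.test
        System.out.println("filename*=" + encodeExtValue(name));
    }
}
```

The percent-encoded part matches what the EXTENDED request already sends; the difference is the parameter name `filename*` and the `UTF-8''` prefix.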


> rfc6266 support in MIME multipart
> ---------------------------------
>
>                 Key: HTTPCLIENT-2360
>                 URL: https://issues.apache.org/jira/browse/HTTPCLIENT-2360
>             Project: HttpComponents HttpClient
>          Issue Type: Improvement
>          Components: HttpClient (classic)
>    Affects Versions: 5.4.1
>            Reporter: Hiran Chaudhuri
>            Priority: Minor
>              Labels: volunteers-wanted
>             Fix For: Stuck
>
>         Attachments: TestHttpClient.zip
>
>
> The following code creates a bad HTTP request:
>  
> {code:java}
> File sourceFile = ...
> FileBody fileBody = new FileBody(sourceFile);
> MultipartEntityBuilder builder = MultipartEntityBuilder.create();
> builder.setMode(HttpMultipartMode.LEGACY);
> builder.addPart("document", fileBody);
> HttpEntity requestEntity = builder.build();
> URI endpoint = ...
> HttpPost httpPost = new HttpPost(endpoint.toString());
> httpPost.setEntity(requestEntity);
> httpClient.execute(httpPost, getHttpClientContext(), ...); {code}
> The request contains the file and a {{Content-Disposition}} header that 
> carries the filename. So far so good.
>  
> But filesystems are international and allow all kinds of characters, so the 
> {{filename}} parameter must either be encoded in ISO-8859-1 or follow 
> rfc5987. In practice HttpClient 5.4.1 writes it in UTF-8, which can break 
> servers trying to parse the request.
> I like to use this client a lot. Please enhance it to follow 
> https://www.rfc-editor.org/rfc/rfc6266#page-5 , which allows UTF-8 encoding 
> in the {{filename*}} parameter. Even better, HttpClient could fill both 
> parameters so the server can pick.


