[ 
https://issues.apache.org/jira/browse/HTTPCLIENT-2360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17922794#comment-17922794
 ] 

Hiran Chaudhuri commented on HTTPCLIENT-2360:
---------------------------------------------

The enum HttpMultipartMode has three settings:
 * LEGACY
The filename parameter contained raw UTF-8 characters, which the server does 
not accept.
 * STRICT
Again the filename parameter contained raw UTF-8 characters, and the server 
still does not accept them.
 * EXTENDED
The special characters in the filename parameter were percent-encoded. The 
server accepted the requests but did not decode the characters.

As far as I understand, the idea of 
[https://www.rfc-editor.org/rfc/rfc6266#page-5] is not implemented. It would 
add a second parameter, `filename*`, alongside `filename`. Its value would be 
percent-encoded and prefixed with the charset, so that a server knows which 
charset is in use.
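
As a rough illustration of that encoding (plain Java, not HttpClient code), the 
sketch below builds a `filename` / `filename*` parameter pair the way I read 
RFC 5987 and RFC 6266. The class and method names are hypothetical, the ASCII 
fallback rule (replace anything problematic with `_`) is my own assumption, and 
I have not verified how any particular server treats `filename*` inside a 
multipart part.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of HttpClient.
final class Rfc5987Sketch {

    // attr-char punctuation from RFC 5987 section 3.2.1 (ALPHA and DIGIT are checked separately)
    private static final String ATTR_CHAR_PUNCTUATION = "!#$&+-.^_`|~";

    /** Encodes a value as an RFC 5987 ext-value: UTF-8 charset, empty language tag. */
    static String encodeExtValue(String value) {
        StringBuilder sb = new StringBuilder("UTF-8''");
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF;
            boolean attrChar = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9')
                    || (c < 0x80 && ATTR_CHAR_PUNCTUATION.indexOf(c) >= 0);
            if (attrChar) {
                sb.append((char) c);
            } else {
                // Every other octet is percent-encoded as two hex digits.
                sb.append('%').append(String.format("%02X", c));
            }
        }
        return sb.toString();
    }

    /** Assumed fallback: keep printable ASCII, replace everything else (and quotes, backslashes) with '_'. */
    static String asciiFallback(String value) {
        StringBuilder sb = new StringBuilder();
        for (char ch : value.toCharArray()) {
            boolean safe = ch >= 0x20 && ch <= 0x7E && ch != '"' && ch != '\\';
            sb.append(safe ? ch : '_');
        }
        return sb.toString();
    }

    /** Builds both parameters so that a server can pick the one it understands. */
    static String dispositionParams(String filename) {
        return "filename=\"" + asciiFallback(filename) + "\"; filename*=" + encodeExtValue(filename);
    }

    public static void main(String[] args) {
        System.out.println(dispositionParams("na\u00EFve file.txt"));
    }
}
```

For the file name `naïve file.txt` this yields 
`filename="na_ve file.txt"; filename*=UTF-8''na%C3%AFve%20file.txt`.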

So if I wanted to be compliant, what should the code look like?

> rfc6266 support in MIME multipart
> ---------------------------------
>
>                 Key: HTTPCLIENT-2360
>                 URL: https://issues.apache.org/jira/browse/HTTPCLIENT-2360
>             Project: HttpComponents HttpClient
>          Issue Type: Improvement
>          Components: HttpClient (classic)
>    Affects Versions: 5.4.1
>            Reporter: Hiran Chaudhuri
>            Priority: Minor
>              Labels: volunteers-wanted
>             Fix For: Stuck
>
>
> The following code creates a bad HTTP request:
>  
> {code:java}
> File sourceFile = ...
> FileBody fileBody = new FileBody(sourceFile);
> MultipartEntityBuilder builder = MultipartEntityBuilder.create();
> builder.setMode(HttpMultipartMode.LEGACY);
> builder.addPart("document", fileBody);
> HttpEntity requestEntity = builder.build();
> URI endpoint = ...
> HttpPost httpPost = new HttpPost(endpoint.toString());
> httpPost.setEntity(requestEntity);
> httpClient.execute(httpPost, getHttpClientContext(), ...); {code}
> The request contains the file and a {{Content-Disposition}} header, which 
> carries the filename. So far so good.
>  
> But as filesystems go international and support all kinds of characters, the 
> {{filename}} parameter needs to be encoded in ISO-8859-1, or else rfc5987 
> applies. In reality, HttpClient 5.4.1 uses UTF-8 encoding, which can break 
> servers trying to parse the request.
> I like this client and use it a lot. Please enhance it to follow 
> https://www.rfc-editor.org/rfc/rfc6266#page-5, which allows UTF-8 
> encoding in the {{filename*}} parameter. Or, even better, have HttpClient fill 
> both parameters so the server can pick.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
