outlandishlizard commented on PR #619:
URL: https://github.com/apache/httpcomponents-client/pull/619#issuecomment-2727681863

   > > who may be several wrappers downstream, will have any idea that they
   > > need to escape these values-- it's not a normal application security
   > > consideration at all.
   >
   > This is a responsibility of people who wrap the library and enforce a
   > security model or provide a UI, not that of the library. Again, do not pin
   > it on us.
   
   I have read what you said several times. The above quote seems to me to say 
that you believe it is the responsibility of your users to escape the boundary 
values, piercing the abstraction layer of "send a multipart message please" and 
requiring a full scan of the request body in order to ensure it complies with 
the specific quirks of multipart encoding this PR implements. 
   
   It seems to me that this is both onerous for the user and substantially less 
performant than using a random boundary value from a secure random source 
(there is certainly a break-even point between the constant-time and 
linear-time costs-- for small enough multipart requests the linear scan 
probably wins!). 
   
   With regard to your comments about a "false sense of security":
   
   The entire point of using a random value is to make the chance of a 
collision of message content and boundary content negligible; the sense of 
security provided is in no way false. The spec allows for up to 70 characters 
of boundary; these characters are selected from a set of 75 potential options, 
meaning that we have a total of:
   
    75^70 =
    179592599813797960985749775445106096943740867154224363601035145662298829808338299928818803698205019969691420556046068668365478515625
   
   potential valid boundary delimiters, making the chance of a robust random 
scheme producing a collision negligible, in the cryptographic sense of the word.
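
As a rough illustration of the random scheme, here is a minimal Java sketch. It is not the library's implementation and not the code in this PR; the class and method names (`BoundarySketch`, `randomBoundary`, `boundarySpace`) are hypothetical. It assumes the RFC 2046 boundary alphabet and omits the space character, which is legal inside a boundary but not as its last character, so the sketch draws from 74 characters rather than the 75 counted above.

```java
import java.math.BigInteger;
import java.security.SecureRandom;

// Hypothetical illustration only; not part of httpcomponents-client.
final class BoundarySketch {
    // The 74 non-space boundary characters allowed by RFC 2046.
    // Space is also legal, but not as the final character, so it is left out.
    static final String BCHARS_NO_SPACE =
            "0123456789"
            + "abcdefghijklmnopqrstuvwxyz"
            + "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
            + "'()+_,-./:=?";

    private static final SecureRandom RANDOM = new SecureRandom();

    // Builds a boundary of the requested length (1 to 70 characters),
    // drawing each character uniformly from the allowed alphabet.
    static String randomBoundary(int length) {
        if (length < 1 || length > 70) {
            throw new IllegalArgumentException("boundary length must be 1..70");
        }
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            int index = RANDOM.nextInt(BCHARS_NO_SPACE.length());
            sb.append(BCHARS_NO_SPACE.charAt(index));
        }
        return sb.toString();
    }

    // Number of distinct boundaries for a given alphabet size and length,
    // e.g. boundarySpace(75, 70) for the figure quoted above.
    static BigInteger boundarySpace(int alphabetSize, int length) {
        return BigInteger.valueOf(alphabetSize).pow(length);
    }

    public static void main(String[] args) {
        System.out.println(randomBoundary(70));
        System.out.println(boundarySpace(75, 70));
    }
}
```

Generating the boundary costs a fixed amount of work per request regardless of body size, which is the constant-time side of the trade-off mentioned earlier.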
   
   So, we are left with three options for providing users a robust path forward:
   1. Add a requirement that users of this library perform a linear scan on 
their message bodies for a library-specific boundary identifier, in exchange 
for a potentially small performance boost for the library itself. Users who 
fail to implement a linear scan will encounter some messages that are reliably 
mangled by the library.
   2. Use a random implementation, with a potentially very small performance 
hit. Assuming a trillion requests per second are processed by the library 
around the world, the sun will have exploded before a collision occurs.
   3. Entirely abdicate the responsibility for boundary generation as a 
library, and require the boundary be supplied by the user. This is logically 
consistent with the stance I'm seeing from you that the user should be 
responsible for boundary encoding, although I personally think it's not a very 
nice user experience.
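
For comparison, the linear scan that option 1 pushes onto users could look roughly like the following. This is a hedged sketch with hypothetical names (`BoundaryScanSketch`, `containsDelimiter`), not anything the library provides. It conservatively searches for `--` followed by the boundary anywhere in the body, even though a real delimiter is also preceded by a line break, and it assumes the whole body is available as a byte array, which is often not true for streamed content.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not part of httpcomponents-client.
final class BoundaryScanSketch {
    // Returns true if "--" + boundary occurs anywhere in the body.
    // Naive scan: cost grows with the body length (times the boundary length
    // in the worst case), and it must run before every request is sent.
    static boolean containsDelimiter(byte[] body, String boundary) {
        byte[] needle = ("--" + boundary).getBytes(StandardCharsets.US_ASCII);
        outer:
        for (int i = 0; i + needle.length <= body.length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (body[i + j] != needle[j]) {
                    continue outer;
                }
            }
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        byte[] body = "part one\r\n--fixed-boundary\r\npart two"
                .getBytes(StandardCharsets.US_ASCII);
        System.out.println(containsDelimiter(body, "fixed-boundary"));
    }
}
```

Every caller, including those several wrappers downstream, would need to run a check like this and then decide what to do on a hit, which is the burden described in option 1.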


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@hc.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

