outlandishlizard commented on PR #619: URL: https://github.com/apache/httpcomponents-client/pull/619#issuecomment-2727681863
> > who may be several wrappers downstream, will have any idea that they need to escape these values-- it's not a normal application security consideration at all.
>
> This is a responsibility of people who wrap the library and enforce a security model or provide a UI, not that of the library. Again, do not pin it on us.

I have read what you said several times. The quote above seems to me to say that you believe it is the responsibility of your users to escape the boundary values. That pierces the abstraction layer of "send a multipart message please" and requires a full scan of the request body to ensure it complies with the specific quirks of the multipart encoding this PR implements. It seems to me that this is both onerous on the user and substantially less performant than using a random boundary value from a secure random source. (Generating the boundary is constant-time and the scan is linear in the body size, so there is certainly a breakpoint: for small enough multipart requests the linear scan probably wins!)

With regard to your comments about a "false sense of security": the entire point of using a random value is to make the chance of a collision between message content and boundary content negligible; the sense of security provided is in no way false. The spec allows for up to 70 characters of boundary, and these characters are selected from a set of 75 potential options, meaning that we have a total of:

75^70 = 179592599813797960985749775445106096943740867154224363601035145662298829808338299928818803698205019969691420556046068668365478515625

(roughly 1.8 × 10^131) potential valid boundary delimiters, making the chance of a robust random scheme producing a collision negligible, in the cryptographic sense of the word.

So, we are left with three options for providing users a robust path forward:

1. Add a requirement that users of this library perform a linear scan on their message bodies for a library-specific boundary identifier, in exchange for a potential small performance boost for the library itself. Users who fail to implement a linear scan will encounter some messages that are reliably mangled by the library.
2. Use a random implementation, with a potential very small performance hit. Assuming a trillion requests per second are processed by the library around the world, the sun will have exploded before a collision occurs.
3. Entirely abdicate the responsibility for boundary generation as a library, and require the boundary be supplied by the user. This is logically consistent with the stance I'm seeing from you that the user should be responsible for boundary encoding, although I personally think it's not a very nice user experience.
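
To make option 2 concrete, here is a minimal sketch of what I mean by a random boundary from a secure random source. It is not the code in this PR: the class name `BoundarySketch`, the method `randomBoundary`, and the alphabet are all hypothetical. I am also assuming a conservative alphanumeric subset of the 75 permitted characters, so that the boundary never ends in a space and never needs quoting in the `Content-Type` header. That shrinks the space from 75^70 to 62^70, which is still astronomically large.

```java
import java.security.SecureRandom;

// Hypothetical illustration only; not part of httpcomponents-client.
class BoundarySketch {
    // Assumption: a conservative subset (letters and digits) of the
    // boundary characters the multipart spec permits.
    static final String ALPHABET =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

    // The spec allows a boundary of 1 to 70 characters.
    static final int MAX_LENGTH = 70;

    private static final SecureRandom RANDOM = new SecureRandom();

    // Builds a boundary by drawing each character uniformly from ALPHABET.
    static String randomBoundary(int length) {
        if (length < 1 || length > MAX_LENGTH) {
            throw new IllegalArgumentException(
                    "boundary length must be between 1 and " + MAX_LENGTH);
        }
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append(ALPHABET.charAt(RANDOM.nextInt(ALPHABET.length())));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints a fresh 70-character boundary on each run.
        System.out.println(randomBoundary(MAX_LENGTH));
    }
}
```

The cost is one `SecureRandom` draw per boundary character, independent of the size of the message body, which is the constant-time side of the trade-off described above.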