ID:               48219
 User updated by:  carsten_sttgt at gmx dot de
 Reported By:      carsten_sttgt at gmx dot de
 Status:           Open
 Bug Type:         Feature/Change Request
 Operating System: *
 PHP Version:      5.*, 6CVS (2009-05-09)
 New Comment:

After a quick look at rfc1867.c, I found many occurrences of:
| #if HAVE_MBSTRING && !defined(COMPILE_DL_MBSTRING)

So I guess correct behavior according to rfc2616/rfc1867 is only
possible if the mbstring extension is available and is not built as a
shared extension. (Why doesn't this work with a shared extension?)

(I can't test this, because mbstring is always a shared extension in
my installations.)

It's like bug #37860:
An HTTP UA sends such a valid POST request and PHP answers with a
status 200, so both browser and script must assume all is OK.
Instead, the data is garbled.

In contrast to bug #37860, no status code is defined for this case
(but maybe returning 415 is the best solution for now?).

For bug #37860, status 415 is defined for that situation, but PHP
doesn't return it either :-/ That is also a problem, because all
parties then think the POST request was OK.


Regards,
Carsten


Previous Comments:
------------------------------------------------------------------------

[2009-05-10 19:15:16] carsten_sttgt at gmx dot de

> And this is (I think :) related also to bug #37860

Yes, it's similar. BTW, I think bug #37860 is both a feature request
and a bug.
- Feature: It would be nice if PHP decoded the data when the coding
  is known (see rfc2616-sec3.5/-sec14.11),
  e.g. if the gzip extension is loaded and
  "Content-Encoding: gzip" is set in the request.
- Bug: If PHP can't/won't do this, it should return an HTTP
  status code of 415. (See rfc2616-sec14.11)
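
A minimal userland sketch of the "Bug" part above, assuming the script
itself does the check. status_for_content_encoding() and the list of
supported codings are hypothetical names for illustration; nothing like
this is built into PHP:

```php
<?php
// Hypothetical helper: choose the HTTP status for a request's Content-Encoding.
// $supported lists the content codings this script is prepared to decode.
function status_for_content_encoding($encoding, array $supported)
{
    $coding = strtolower(trim((string) $encoding));
    // No header, or "identity", means the body was not transformed.
    if ($coding === '' || $coding === 'identity') {
        return 200;
    }
    // rfc2616-sec14.11: an unacceptable content coding SHOULD get a 415.
    return in_array($coding, $supported, true) ? 200 : 415;
}

// Usage in a request handler; supporting only gzip is an assumption.
$encoding = isset($_SERVER['HTTP_CONTENT_ENCODING'])
    ? $_SERVER['HTTP_CONTENT_ENCODING']
    : '';
if (status_for_content_encoding($encoding, array('gzip')) === 415) {
    header('HTTP/1.1 415 Unsupported Media Type');
    exit;
}
```

This only rejects the request. Actually decoding a gzip body would
still have to happen before $_POST is populated, which a script cannot
do.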


> Unfortunately this is a feature request so reclassifying as such. 

That's really something I was unsure about. See rfc2616-sec3.7.2. In
general:
| an HTTP user agent SHOULD follow the same or similar behavior
| as a MIME user agent would upon receipt of a multipart type.
| The MIME header fields within each body-part of a multipart
| message-body do not have any significance to HTTP beyond that
| defined by their MIME semantics.
Well, a MIME user agent must decode such data. Because of the "SHOULD"
in this statement, it /can/ be a feature request (but "SHOULD" is more
restrictive than "MAY" / "OPTIONAL").

But the same section (rfc2616-sec3.7.2) also says:
| Note: The "multipart/form-data" type has been specifically
|       defined for carrying form data suitable for processing
|       via the POST request method, as described in RFC 1867 [15].

And in rfc1867 (or the newer rfc2388), Content-Transfer-Encoding is an
explicit part of the RFC. So I think HTTP software should know and
handle Content-Transfer-Encoding. Well, Perl's CGI.pm doesn't do this
either ;-)

BTW:
Unlike the Content-Encoding, I can't see the
Content-Transfer-Encoding in the script. So that can really be a
problem. But using a Content-Transfer-Encoding is not common (or is it
uncommon because Perl/PHP can't handle it?)


> btw. Fastest way to get this implemented is by providing a patch. :)

Yeah, if my C were better... ;-)

------------------------------------------------------------------------

[2009-05-10 17:02:30] j...@php.net

btw. Fastest way to get this implemented is by providing a patch. :)

------------------------------------------------------------------------

[2009-05-10 17:01:57] j...@php.net

Unfortunately this is a feature request so reclassifying as such.
And this is (I think :) related also to bug #37860 and maybe some
others I couldn't find. :)



------------------------------------------------------------------------

[2009-05-10 10:49:28] carsten_sttgt at gmx dot de

Description:
------------
Hello,

In an HTTP POST request with Content-Type "multipart/form-data", each
part can have a Content-Transfer-Encoding, as defined in RFC2045.
(See also HTML 4.01-sec17.13.4.2)

PHP only works with 7bit, 8bit and binary, because with these values
the data is not transformed.

With base64 or quoted-printable, the data is transformed (encoded),
and PHP should decode it (see also rfc2616-sec3.7.2 / rfc1867-sec3.3).
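
What such decoding amounts to can be sketched in PHP itself.
decode_part_body() is a hypothetical helper, not an existing PHP
function, and it assumes the caller already knows the part's
Content-Transfer-Encoding (which a script currently cannot find out):

```php
<?php
// Hypothetical helper: undo the Content-Transfer-Encoding of one part body.
// Returns the decoded bytes, or false for an unknown encoding.
function decode_part_body($body, $transferEncoding)
{
    switch (strtolower(trim($transferEncoding))) {
        case 'quoted-printable':
            return quoted_printable_decode($body);
        case 'base64':
            return base64_decode($body);
        case '':        // no header given: RFC2045 default is 7bit
        case '7bit':
        case '8bit':
        case 'binary':
            return $body;   // identity encodings, data is not transformed
        default:
            return false;
    }
}

$decoded = decode_part_body('Joe owes =80100.', 'quoted-printable');
var_dump(strlen($decoded));       // int(14)
var_dump(bin2hex($decoded[9]));   // string(2) "80", the euro sign in windows-1250
```

The result is still in the part's charset (windows-1250 here); charset
conversion is a separate step that this sketch does not attempt.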

Just test the example from RFC2388-sec4.5 (see the reproduce code
below). It's also a problem if you upload a file with such a transfer
encoding: after move_uploaded_file(), the content of the file is not
what you expect.

And in a script that receives such data, I can't see (can't know)
whether a Content-Transfer-Encoding was used for something in
$_POST / $_FILES. It may not be common, but a client can use such a
Content-Transfer-Encoding at any time in a POST request.

Regards,
Carsten


Reproduce code:
---------------
Create a simple "test.php" in your DocumentRoot:
==================
<?php
var_dump($_POST);
?>
==================

Telnet to localhost:80 and send this request:
======================================================
POST http://localhost/test.php HTTP/1.0
Content-Length: 181
Content-Type: multipart/form-data; boundary=AaB03x

--AaB03x
Content-Disposition: form-data; name="field1"
Content-Type: text/plain;charset=windows-1250
Content-Transfer-Encoding: quoted-printable

Joe owes =80100.
--AaB03x--

======================================================
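
If typing the request into telnet is inconvenient, the same bytes can
be built and sent from a script. This is only a sketch:
build_test_request() and send_raw_request() are hypothetical helper
names, and host, port and path are assumptions matching the setup
above. The Content-Length is computed from the body instead of being
hard-coded:

```php
<?php
// Build the raw request from this report, using CRLF line endings.
function build_test_request($host, $path)
{
    $body = implode("\r\n", array(
        '--AaB03x',
        'Content-Disposition: form-data; name="field1"',
        'Content-Type: text/plain;charset=windows-1250',
        'Content-Transfer-Encoding: quoted-printable',
        '',
        'Joe owes =80100.',
        '--AaB03x--',
        '',                 // trailing CRLF after the closing boundary
    ));
    $head = implode("\r\n", array(
        "POST http://$host$path HTTP/1.0",
        'Content-Length: ' . strlen($body),
        'Content-Type: multipart/form-data; boundary=AaB03x',
        '',
        '',                 // empty line that ends the header block
    ));
    return $head . $body;
}

// Send a raw request and return everything the server answers,
// or false if the connection could not be opened.
function send_raw_request($host, $port, $request)
{
    $socket = fsockopen($host, $port, $errno, $errstr, 5);
    if ($socket === false) {
        return false;
    }
    fwrite($socket, $request);
    $response = stream_get_contents($socket);
    fclose($socket);
    return $response;
}
```

Usage, assuming the web server from the step above is listening on
localhost:80:
echo send_raw_request('localhost', 80, build_test_request('localhost', '/test.php'));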



Expected result:
----------------
array(1) {
  ["field1"]=>
    string(14) "Joe owes €100."
}


Actual result:
--------------
array(1) {
  ["field1"]=>
    string(16) "Joe owes =80100."
}



------------------------------------------------------------------------

