On 06/09/2016 02:56, Yasuo Ohgaki wrote:
Hi Rowan,

On Fri, Sep 2, 2016 at 7:37 PM, Rowan Collins <rowan.coll...@gmail.com> wrote:
This certainly makes sense. I guess the challenge is that in order to know
if data has been tampered with, you need to have some knowledge of the
expected format. That expectation depends on what data you're expecting,
which depends - ultimately - on the domain objects being modelled.

More specifically, though, it depends on the interaction design - in an HTML
context, the forms being presented. So the validation needs knowledge of the
form controls - e.g. if a select box was shown, and the value is not from
the known list of options, the input has been tampered with.

BTW, I don't think everyone has to validate input in a very strict
manner. It is OK to validate like this:

<?php
// Define loose input validation.
// The $..._spec variables are assumed to be filter specs defined elsewhere.

$get_def = array(
    // GET
    'id'       => $id_spec,
    'other_id' => $id_spec,
    'type1'    => $alnum32b_spec, // Alphanumeric, up to 32 bytes
    'type2'    => $alnum32b_spec,
);

$post_def = array(
    // POST
    'text1'        => $input_tag_spec,     // <input type=text>: text, default up to 512 bytes
    'text2'        => $input_tag_spec,
    'select1'      => $select_tag_spec,    // <select> values: text, default up to 64 bytes
    'select2'      => $select_tag_spec,
    'radio1'       => $radio_tag_spec,     // <input type=radio> values: text, default up to 64 bytes
    'radio2'       => $radio_tag_spec,
    'submit'       => $submit_tag_spec,
    'textarea1k'   => $textarea1k_spec,    // <textarea>: text up to 1K, newlines allowed
    'textarea100k' => $textarea100k_spec,  // <textarea>: text up to 100K, newlines allowed
    'CSRF_TOKEN'   => $alnum32b_spec,      // Alphanumeric, up to 32 bytes
    // and so on
);

// filter_require_var_array() is the function proposed in the RFC; passing
// the definition as the second argument mirrors filter_var_array().
$_GET  = filter_require_var_array($_GET, $get_def);
$_POST = filter_require_var_array($_POST, $post_def);
?>

These "simple" examples are still very closely bound to the HTML form (or API definition, or whatever).

If a change is made to the form, even these simple rules need to be changed. Every time a field is added or removed, these validation rules need to be updated.

Or consider for example a select box which only ever contains integer IDs; the simple validation for this would be to reject non-numeric input as tampering. But if the UI changes to a fancy combo box autocomplete widget, non-numeric input might instead merit a user-friendly validation message.
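
As a rough sketch of that difference (classify_id_field() and its three result strings are invented here for illustration; they are not part of ext/filter or the RFC), the same non-numeric input can be "tampering" for one UI and an ordinary user error for another:

```php
<?php
// Hypothetical helper: classify a submitted value for a field that is
// expected to hold one of a known set of integer IDs.
// $allowFreeText models the "combo box" UI, where the user can type anything.
function classify_id_field(string $value, array $knownIds, bool $allowFreeText): string
{
    // A plain decimal integer that matches a known option is fine.
    if (ctype_digit($value) && in_array((int) $value, $knownIds, true)) {
        return 'ok';
    }
    // With a strict <select>, anything else cannot have come from the form.
    if (!$allowFreeText) {
        return 'tampered';
    }
    // With a free-text widget, the same input is just a user mistake.
    return 'user_error';
}
```

The point is that the rule cannot be written without knowing which widget the form is currently using.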

You could just about guarantee that most fields will never need to accept control characters. But even newlines come and go - a "revision comment" field might be one line (the traditional wiki style) or many (common version control style).



The 3rd issue is location. Input data validation is better done as
soon as possible. When an application accepts input, programmers
know what the possible inputs are, and can cover all of them, i.e.
the controller is the best place for input format validation.

You're mixing two things here, I think: one is *when* the validation is run (how soon in the execution pipeline); the other is *where* it is defined (which PHP class it is part of). I think what I'm getting at is that the rules should be *defined* in one place (avoid code duplication, ensure definitions are kept up to date as requirements change) even if they are *accessed* in more than one place.

The method $formDefinition->isSubmittedDataSane($_POST) could be implemented by generating, based on the set of fields expected, a spec for ext/filter. But by the time you've handled all the cases, implemented a bunch of custom callbacks for unsupported validation types, and customised the error message slightly, you might as well just implement the validation yourself.
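
To illustrate "defined in one place, accessed in more than one", here is a minimal hand-rolled sketch. The FormDefinition class, addField(), and the two rule names are assumptions made up for this example; only the method name isSubmittedDataSane() comes from the paragraph above. It also simplifies by requiring every field to be present, which real forms with checkboxes would not satisfy.

```php
<?php
// Hypothetical sketch: field rules are defined once, and the same object
// could be consulted both for this early sanity check and for rendering.
class FormDefinition
{
    private $fields = [];

    // $maxBytes and $allowNewlines are illustrative rule names only.
    public function addField(string $name, int $maxBytes, bool $allowNewlines = false)
    {
        $this->fields[$name] = ['max' => $maxBytes, 'newlines' => $allowNewlines];
        return $this;
    }

    public function isSubmittedDataSane(array $data): bool
    {
        // Simplification: unexpected or missing fields both count as not sane.
        if (array_diff_key($data, $this->fields) || array_diff_key($this->fields, $data)) {
            return false;
        }
        foreach ($this->fields as $name => $rule) {
            $value = $data[$name];
            if (!is_string($value) || strlen($value) > $rule['max']) {
                return false;
            }
            // Reject control characters; tab, CR and LF pass only where
            // the field allows newlines.
            $pattern = $rule['newlines']
                ? '/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]/'
                : '/[\x00-\x1F\x7F]/';
            if (preg_match($pattern, $value)) {
                return false;
            }
        }
        return true;
    }
}
```

A controller could call isSubmittedDataSane($_POST) as early as it likes, while the rules themselves stay next to the form they describe.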

So the challenge of any built-in filter module is this: if it's not doing the whole job of form handling and validation, what specific part of that task is it doing? And how does it fit with common ways of implementing the rest? Perhaps if we provided a *narrower* focus, the API could become simpler and more widely applicable.

For instance, if we set the very narrow aim of "provide an easy-to-use set of primitive tests for use in a validation filter", we could:
- remove all array handling (assume users are capable of using foreach())
- remove all support for custom filters (a single-variable custom filter does little more than call_user_func)
- simplify the return possibilities (boolean: does this value pass this test?)
- remove some tests that are trivially implemented using other functions
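
A sketch of what that narrower shape might look like in user code. All three function names below are hypothetical, not an existing or proposed API: each primitive takes one value and returns a boolean, and the array handling is an ordinary foreach.

```php
<?php
// Hypothetical primitives: each takes one value and answers one question.
function is_alnum_up_to(string $value, int $maxBytes): bool
{
    return $value !== '' && strlen($value) <= $maxBytes && ctype_alnum($value);
}

function is_single_line_text(string $value, int $maxBytes): bool
{
    // No control characters at all, so no tabs or newlines either.
    return strlen($value) <= $maxBytes && !preg_match('/[\x00-\x1F\x7F]/', $value);
}

// Array handling stays in user code: a plain foreach over field => test.
// Returns the names of fields that are missing, not strings, or fail.
function failed_fields(array $input, array $rules): array
{
    $failed = [];
    foreach ($rules as $name => $test) {
        if (!isset($input[$name]) || !is_string($input[$name]) || !$test($input[$name])) {
            $failed[] = $name;
        }
    }
    return $failed;
}

$rules = [
    'type1' => function ($v) { return is_alnum_up_to($v, 32); },
    'text1' => function ($v) { return is_single_line_text($v, 512); },
];
$failed = failed_fields(['type1' => 'abc123', 'text1' => "a\nb"], $rules);
```

What to do with the failed fields (reject the request, show a message) is then entirely the application's decision.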


How many of you are against the idea of this RFC?
(I don't think Rowan is against the basic idea, BTW)

I guess I'm against the idea of the RFC in the sense that its aim is too broad: we cannot implement safe validation in the language, we can only give users the tools to do it. The RFC as it was proposed (and, I think, ext/filter in general) tries too hard to "do everything for you", without looking at where it would fit inside a larger application.


Regards,
--
Rowan Collins
[IMSoP]

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
