On 06/09/2016 02:56, Yasuo Ohgaki wrote:
Hi Rowan,
On Fri, Sep 2, 2016 at 7:37 PM, Rowan Collins <rowan.coll...@gmail.com> wrote:
This certainly makes sense. I guess the challenge is that in order to know
if data has been tampered with, you need to have some knowledge of the expected
format. That expectation depends on what data you're expecting, which
depends - ultimately - on the domain objects being modelled.
More specifically, though, it depends on the interaction design - in an HTML
context, the forms being presented. So the validation needs knowledge of the
form controls - e.g. if a select box was shown, and the value is not from
the known list of options, the input has been tampered with.
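
As a rough illustration of the select box case, a minimal sketch might look like the following. The option list and the function name isKnownOption() are made up for this example and are not part of ext/filter or the RFC.

```php
<?php
// Hypothetical option list for a <select name="colour"> control.
$colourOptions = ['red', 'green', 'blue'];

/**
 * Returns true when the submitted value is one of the options that
 * were actually rendered in the select box.
 */
function isKnownOption($submitted, array $options): bool
{
    // Arrays and other non-string input cannot come from a plain select.
    // The strict flag avoids loose type juggling in the comparison.
    return is_string($submitted) && in_array($submitted, $options, true);
}

// A value outside the rendered list suggests the request was tampered with.
$looksTampered = !isKnownOption('purple', $colourOptions);
```

Note that the check only works because the code knows which options the form displayed, which is the coupling described above.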
BTW, I don't think everyone has to validate input in a very strict
manner. It is OK to validate like this:
<?php
// Define loose input validation.
$get_def = array(
    // GET
    'id'       => $id_spec,
    'other_id' => $id_spec,
    'type1'    => $alnum32b_spec, // Alphanumeric, up to 32 bytes
    'type2'    => $alnum32b_spec,
);
$post_def = array(
    // POST
    'text1'        => $input_tag_spec,     // <input type=text>: up to 512 bytes of text by default
    'text2'        => $input_tag_spec,
    'select1'      => $select_tag_spec,    // <select> values: up to 64 bytes of text by default
    'select2'      => $select_tag_spec,
    'radio1'       => $radio_tag_spec,     // <input type=radio> values: up to 64 bytes of text by default
    'radio2'       => $radio_tag_spec,
    'submit'       => $submit_tag_spec,
    'textarea1k'   => $textarea1k_spec,    // <textarea>: up to 1K of text, newlines allowed
    'textarea100k' => $textarea100k_spec,  // <textarea>: up to 100K of text, newlines allowed
    'CSRF_TOKEN'   => $alnum32b_spec,      // Alphanumeric, up to 32 bytes
    // and so on
);
// Validate each superglobal against its definition.
$_GET  = filter_require_var_array($_GET, $get_def);
$_POST = filter_require_var_array($_POST, $post_def);
?>
These "simple" examples are still very closely bound to the HTML form
(or API definition, or whatever).
If a change is made to the form, even these simple rules need to be
changed. Every time a field is added or removed, these validation rules
need to be updated.
Or consider for example a select box which only ever contains integer
IDs; the simple validation for this would be to reject non-numeric input
as tampering. But if the UI changes to a fancy combo box autocomplete
widget, non-numeric input might instead merit a user-friendly validation
message.
You could just about guarantee that most fields will never need to
accept control characters. But even newlines come and go - a "revision
comment" field might be one line (the traditional wiki style) or many
(common version control style).
The third issue is location. Input data validation is best done as
soon as possible. When the application accepts input, programmers
know what the possible inputs are, and can cover all of them; i.e. the
Controller is the best place for input format validation.
You're mixing two things here, I think: one is *when* the validation is
run (how soon in the execution pipeline); the other is *where* it is
defined (which PHP class it is part of). I think what I'm getting at is
that the rules should be *defined* in one place (avoid code duplication,
ensure definitions are kept up to date as requirements change) even if
they are *accessed* in more than one place.
The method $formDefinition->isSubmittedDataSane($_POST) could be
implemented by generating, based on the set of fields expected, a spec
for ext/filter. But by the time you've handled all the cases,
implemented a bunch of custom callbacks for unsupported validation
types, and customised the error message slightly, you might as well just
implement the validation yourself.
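
To make that concrete, here is a minimal hand-rolled sketch of such a method. Only the name isSubmittedDataSane() comes from the paragraph above; the FormDefinition class, its constructor, and the rule keys 'maxBytes' and 'multiline' are assumptions invented for illustration.

```php
<?php
// Hypothetical class: the field rules are defined once and can be reused
// both for rendering the form and for the sanity check.
class FormDefinition
{
    /** @var array field name => ['maxBytes' => int, 'multiline' => bool] */
    private $fields;

    public function __construct(array $fields)
    {
        $this->fields = $fields;
    }

    public function isSubmittedDataSane(array $data): bool
    {
        // Fields the form never rendered indicate a hand-crafted request.
        if (array_diff_key($data, $this->fields)) {
            return false;
        }
        foreach ($this->fields as $name => $rule) {
            if (!isset($data[$name])) {
                continue; // absent fields are left to ordinary validation
            }
            $value = $data[$name];
            if (!is_string($value) || strlen($value) > $rule['maxBytes']) {
                return false;
            }
            // Reject control characters; \r and \n pass only in multi-line fields.
            $pattern = $rule['multiline']
                ? '/[\x00-\x09\x0B\x0C\x0E-\x1F\x7F]/'
                : '/[\x00-\x1F\x7F]/';
            if (preg_match($pattern, $value)) {
                return false;
            }
        }
        return true;
    }
}

$form = new FormDefinition([
    'title'   => ['maxBytes' => 64,   'multiline' => false],
    'comment' => ['maxBytes' => 1024, 'multiline' => true],
]);
$sane = $form->isSubmittedDataSane(['title' => 'Hello', 'comment' => "one\ntwo"]);
```

Switching the "comment" field between one line and many is then a one-word change in the definition rather than a separate edit to a validation spec.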
So the challenge of any built-in filter module is this: if it's not
doing the whole job of form handling and validation, what specific part
of that task is it doing? And how does it fit with common ways of
implementing the rest? Perhaps if we provided a *narrower* focus, the
API could become simpler and more widely applicable.
For instance, if we set the very narrow aim of "provide an easy-to-use
set of primitive tests for use in a validation filter", we could:
- remove all array handling (assume users are capable of using foreach())
- remove all support for custom filters (a single-variable custom filter
does little more than call_user_func)
- simplify the return possibilities (boolean: does this value pass this
test?)
- remove some tests that are trivially implemented using other functions
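
A sketch of what such primitives could look like in userland follows. The function names are hypothetical; each one wraps an existing PHP function (filter_var(), ctype_alnum(), preg_match()) and returns a plain boolean.

```php
<?php
// Hypothetical helper names; none of these exist in ext/filter.

function isIntString($value): bool
{
    // filter_var() returns false on failure and the integer otherwise,
    // so compare against false explicitly ('0' is a valid integer).
    return is_string($value)
        && filter_var($value, FILTER_VALIDATE_INT) !== false;
}

function isAlnumUpTo($value, int $maxBytes): bool
{
    // ctype_alnum() is false for the empty string, so empty input fails.
    return is_string($value)
        && strlen($value) <= $maxBytes
        && ctype_alnum($value);
}

function hasNoControlChars($value): bool
{
    return is_string($value) && !preg_match('/[\x00-\x1F\x7F]/', $value);
}

// Array handling is left to the caller, for example with foreach:
$ids = ['1', '42', 'abc'];
$allValid = true;
foreach ($ids as $id) {
    $allValid = $allValid && isIntString($id);
}
```

Because every test answers only "does this value pass?", the caller decides whether a failure means tampering or a user-facing validation message.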
How many of you are against the idea of this RFC?
(I don't think Rowan is against the basic idea, BTW)
I guess I'm against the idea of the RFC in the sense that its aim is
too broad: we cannot implement safe validation in the language, we can
only give users the tools to do it. The RFC as it was proposed (and, I
think, ext/filter in general) tries too hard to "do everything for you",
without looking at where it would fit inside a larger application.
Regards,
--
Rowan Collins
[IMSoP]
--
PHP Internals - PHP Runtime Development Mailing List