Hi Rowan,

On Thu, Sep 8, 2016 at 2:58 AM, Rowan Collins <rowan.coll...@gmail.com> wrote:
>
>> 3rd issue is location. Input data validation is better to be done as
>> soon as possible. When application accepts input, programmers
>> know what the possible inputs, and could cover all inputs. i.e.
>> Controller is the best place for input format validation.
>
>
> You're mixing two things here, I think: one is *when* the validation is run
> (how soon in the execution pipeline); the other is *where* it is defined
> (which PHP class it is part of). I think what I'm getting at is that the
> rules should be *defined* in one place (avoid code duplication, ensure
> definitions are kept up to date as requirements change) even if they are
> *accessed* in more than one place.

This might be the largest difference.

To make something more secure than it is now, adding an additional
security layer is effective; relying on a single location or piece of
code is not.

A good example is a web application firewall (WAF). It is an
independent security layer that performs a whole bunch of checks for
additional security. WAFs have proven useful against web app code
vulnerabilities such as JavaScript/SQL injection because their checks
are independent of the application code, and most apps do very poor
validation.

Maintaining WAF rules is not an easy task, especially when the rules
are whitelist based. (Security guidelines recommend a whitelist based
approach.) IMHO, most WAF protections should be implemented in apps,
because strict validation with a WAF is too hard and too inefficient.

>
> The method $formDefinition->isSubmittedDataSane($_POST) could be implemented
> by generating, based on the set of fields expected, a spec for ext/filter.
> But by the time you've handled all the cases, implemented a bunch of custom
> callbacks for unsupported validation types, and customised the error message
> slightly, you might as well just implement the validation yourself.
>
> So the challenge of any built-in filter module is this: if it's not doing
> the whole job of form handling and validation, what specific part of that
> task is it doing? And how does it fit with common ways of implementing the
> rest? Perhaps if we provided a *narrower* focus, the API could become
> simpler and more widely applicable.

My intention is to cover the runtime validations required by DbC. DbC
validations are disabled on production systems, but some validations
must still be executed at runtime, at least application-level
validations.

Even if some parts are missing, the proposal is good enough to start
with, IMO. I appreciate suggestions for improvements. It does not have
to be based on the current filter module.

> For instance, if we set the very narrow aim of "provide an easy-to-use set
> of primitive tests for use in a validation filter", we could:
> - remove all array handling (assume users are capable of using foreach())
> - remove all support for custom filters (a single-variable custom filter
> does little more than call_user_func)
> - simplify the return possibilities (boolean: does this value pass this
> test?)
> - remove some tests that are trivially implemented using other functions

Suppose we have a validation module. You are suggesting something like

$int = validate_int($var, $min, $max);
$bool = validate_bool($var, $allowed_bool_types);
// i.e. which type of bool 1/0, yes/no, on/off, true/false is allowed
// This isn't implemented. All of them are valid bools currently.
$str = validate_string($var, $min_len, $max_len);
$str = validate_string_encoding($var, $encoding);
$str = validate_string_chars($var, $allowed_chars);
$str = validate_string_regex($var, $regex);
$str = validate_string_digit($var, $min_len, $max_len);
$str = validate_string_callback($var, $callback);
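
As a rough sketch of that function style in user land (assuming PHP
7.1 or later for nullable return types; validate_int() and
validate_string() are hypothetical helpers written here for
illustration, not existing PHP functions), two of them could be built
on the current filter module like this:

```php
<?php
// Hypothetical helper, not part of PHP: returns the integer when $var
// is an integer within [$min, $max], or null when validation fails.
function validate_int($var, int $min, int $max): ?int
{
    $result = filter_var($var, FILTER_VALIDATE_INT, [
        'options' => ['min_range' => $min, 'max_range' => $max],
    ]);
    return $result === false ? null : $result;
}

// Hypothetical helper, not part of PHP: returns the string when its
// byte length is within [$min_len, $max_len], or null otherwise.
function validate_string($var, int $min_len, int $max_len): ?string
{
    if (!is_string($var)) {
        return null;
    }
    $len = strlen($var);
    return ($len >= $min_len && $len <= $max_len) ? $var : null;
}

$int = validate_int('42', 1, 100);      // int 42
$str = validate_string('abc', 1, 5);    // 'abc'
```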

Although it works, I prefer an array definition because it is a lot
easier to write rules and more efficient to execute.

$def = [
  'int_var'  => ['filter'  => FILTER_VALIDATE_INT,
                 'options' => [$min, $max]],
  'bool_var' => ['filter'  => FILTER_VALIDATE_BOOL,
                 'options' => $allowed_bool_types],
  'str_var'  => [
            ['filter'  => FILTER_VALIDATE_STRING,
             'options' => ['min_bytes' => $min_len, 'max_bytes' => $max_len]],
            ['filter'  => FILTER_VALIDATE_REGEX,
             'options' => ['regex' => $regex]],
            ['filter'  => FILTER_VALIDATE_CALLBACK,
             'options' => ['callback' => $callback]],
  ],
];
$safe_input = filter_require_var_array($input, $def);
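
For comparison, here is a minimal sketch of the same array style using
only what the current filter module already offers: filter_var_array()
with one filter per key. The input values and the regular expression
are made up for illustration. Note that the existing option names are
min_range/max_range and regexp, which differ from the ones my patch
uses.

```php
<?php
// Example input; in a real app this would come from $_POST or $_GET.
$input = ['int_var' => '42', 'bool_var' => 'yes', 'str_var' => 'abc123'];

// One filter per key, using constants and option names that exist in
// ext/filter today.
$def = [
    'int_var'  => ['filter'  => FILTER_VALIDATE_INT,
                   'options' => ['min_range' => 1, 'max_range' => 100]],
    'bool_var' => ['filter'  => FILTER_VALIDATE_BOOLEAN,
                   'flags'   => FILTER_NULL_ON_FAILURE],
    'str_var'  => ['filter'  => FILTER_VALIDATE_REGEXP,
                   'options' => ['regexp' => '/\A[a-z0-9]{1,16}\z/']],
];

// Valid values come back converted; a failed value becomes false
// (or null for the boolean filter with FILTER_NULL_ON_FAILURE).
$result = filter_var_array($input, $def);

// An out-of-range integer fails validation.
$failed = filter_var_array(['int_var' => '500'], $def);
```

Unlike the proposed filter_require_var_array(), this does not stop on
the first invalid value; the caller has to inspect each element.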

You can group definitions easily with arrays. (Multiple filter support
is implemented by my patch.) E.g.

$my_str_var_spec = [
            ['filter' => FILTER_VALIDATE_STRING,
             'options' =>['min_bytes'=>$min_len, 'max_bytes'=>$max_len]],
            ['filter' => FILTER_VALIDATE_REGEX,
             'options' => ['regex'=> $regex]],
            ['filter' => FILTER_VALIDATE_CALLBACK,
             'options' =>['callback' => $callback]],
];

Then the previous definition becomes

$def = [
  'int_var'  => ['filter'  => FILTER_VALIDATE_INT,
                 'options' => [$min, $max]],
  'bool_var' => ['filter'  => FILTER_VALIDATE_BOOL,
                 'options' => $allowed_bool_types],
  'str_var'  => $my_str_var_spec,
];

Reusing rules and centralizing validation rules are both easy.
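
To illustrate the reuse, here is a user-land approximation of filter
chains that runs on the current filter module. It is only a sketch:
apply_chain() and validate_all() are hypothetical helpers, not the
implementation in my patch, and the username rule is invented for the
example.

```php
<?php
// Hypothetical helper: run a value through each rule in turn and stop
// at the first failure. Only suitable for filters that signal failure
// with false (so not for boolean validation).
function apply_chain($value, array $chain)
{
    foreach ($chain as $rule) {
        $value = filter_var($value, $rule['filter'],
                            ['options' => $rule['options']]);
        if ($value === false) {
            return false;
        }
    }
    return $value;
}

// Hypothetical helper: apply one chain per key; missing keys fail.
function validate_all(array $input, array $def): array
{
    $out = [];
    foreach ($def as $key => $chain) {
        $out[$key] = array_key_exists($key, $input)
            ? apply_chain($input[$key], $chain)
            : false;
    }
    return $out;
}

// One spec, defined once: a format check followed by a callback check.
$username_spec = [
    ['filter'  => FILTER_VALIDATE_REGEXP,
     'options' => ['regexp' => '/\A[a-z0-9_]{3,16}\z/']],
    ['filter'  => FILTER_CALLBACK,
     'options' => function ($v) { return $v !== 'root' ? $v : false; }],
];

// The same spec is reused for two fields.
$def = ['login' => $username_spec, 'nickname' => $username_spec];

$checked = validate_all(['login' => 'yasuo', 'nickname' => 'root'], $def);
```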

If you would like to build client-side JavaScript validations from the
definition, it is easy to build one because it is a simple array
definition, not a bunch of function calls that define the validation
rules.

>> How many of you are against the idea of this RFC?
>> (I don't think Rowan against basic idea, BTW)
>
>
> I guess I'm against the idea of the RFC in the sense that it's aim is too
> broad: we cannot implement safe validation in the language, we can only give
> users the tools to do it. The RFC as it was proposed (and, I think,
> ext/filter in general) tries too hard to "do everything for you", without
> looking at where it would fit inside a larger application.

There are many people who use the filter module happily, so why can't
validation be implemented? Even an external WAF does it. Divide and
conquer (input handling and logic handling) and multiple layers of
protection both work. We know interfaces are more stable than logic.
Vulnerabilities are often introduced when logic is changed. Input
validation can mitigate those risks.

I didn't spend much time on this because I reused the filter module
framework/code and didn't do any refactoring. If it seemed I tried too
hard, it is the filter module authors who worked too hard. I spent
more time writing English than code :)

The proposal provides a primitive tool, but not too primitive. It does
not handle complex forms or client-side JavaScript validations, but it
could be used for these tasks. (I changed the PR so that the exception
is optional.) Those fancy, exciting things are left to user
implementations.

Anyway, we have $_POST/$_GET/$_COOKIE/$_FILES/$_SERVER/$_ENV
as basic inputs. Input validation is the #1 requirement for code
security. PHP _must_ have some tool that validates these easily and
simply, yet is extensible.

The question would be: what kind will we have?

Simple functions? A different kind of array definition and validator
function? Something more comprehensive and object based? Suggestions
are appreciated. I don't mind implementing it from scratch. Idea-only
suggestions are welcome!

Regards,

--
Yasuo Ohgaki
yohg...@ohgaki.net

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
