Hi Rowan,

On Thu, Sep 8, 2016 at 2:58 AM, Rowan Collins <rowan.coll...@gmail.com> wrote:
>
>> 3rd issue is location. Input data validation is better to be done as
>> soon as possible. When application accepts input, programmers
>> know what the possible inputs, and could cover all inputs. i.e.
>> Controller is the best place for input format validation.
>
>
> You're mixing two things here, I think: one is *when* the validation is run
> (how soon in the execution pipeline); the other is *where* it is defined
> (which PHP class it is part of). I think what I'm getting at is that the
> rules should be *defined* in one place (avoid code duplication, ensure
> definitions are kept up to date as requirements change) even if they are
> *accessed* in more than one place.

This might be the largest difference. To make something more secure
than it is now, adding an additional security layer is effective;
relying on a single location/code is not.

A good example is a web application firewall (WAF). It's an independent
security layer that does a whole bunch of checks for additional
security. WAFs have proven useful against web app code vulnerabilities
such as JavaScript/SQL injections, because they perform checks
independently of the application code and most apps do very poor
validation.

Maintaining WAF rules is not an easy task, especially when the rules
are whitelist based. (Security guidelines recommend a whitelist-based
approach.) IMHO, most WAF protections should be implemented in apps,
because strict validation with a WAF is too hard and too inefficient.

>
> The method $formDefinition->isSubmittedDataSane($_POST) could be implemented
> by generating, based on the set of fields expected, a spec for ext/filter.
> But by the time you've handled all the cases, implemented a bunch of custom
> callbacks for unsupported validation types, and customised the error message
> slightly, you might as well just implement the validation yourself.
>
> So the challenge of any built-in filter module is this: if it's not doing
> the whole job of form handling and validation, what specific part of that
> task is it doing? And how does it fit with common ways of implementing the
> rest? Perhaps if we provided a *narrower* focus, the API could become
> simpler and more widely applicable.

My intention is to cover the runtime validations required by DbC
(Design by Contract). DbC validations are disabled on production
systems, but some validations must still be executed at runtime,
application-level input validations at least.

Even if there are some missing parts, the proposal is good enough to
start with, IMO. I appreciate suggestions for improvements. It does not
have to be based on the current filter module.
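
To illustrate the split between always-on input validation and
DbC-style checks, here is a minimal sketch using the stock filter
extension. The function names validate_age_input() and age_group() and
the 0..150 range are made up for this example; they are not part of
the proposal.

```php
<?php
// Boundary validation: always executed, even on production systems.
// (Hypothetical helper; the range 0..150 is only an example.)
function validate_age_input($raw): int
{
    $age = filter_var($raw, FILTER_VALIDATE_INT, [
        'options' => ['min_range' => 0, 'max_range' => 150],
    ]);
    if ($age === false) {
        throw new InvalidArgumentException('age: invalid input');
    }
    return $age;
}

// Inner logic: the contract is checked with assert(), which can be
// switched off in production via the zend.assertions ini setting.
function age_group(int $age): string
{
    assert($age >= 0 && $age <= 150);
    return $age < 18 ? 'minor' : 'adult';
}
```

The point is only that the first check stays active when the second
one is disabled; how the rules are defined is a separate question.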

> For instance, if we set the very narrow aim of "provide an easy-to-use set
> of primitive tests for use in a validation filter", we could:
> - remove all array handling (assume users are capable of using foreach())
> - remove all support for custom filters (a single-variable custom filter
> does little more than call_user_func)
> - simplify the return possibilities (boolean: does this value pass this
> test?)
> - remove some tests that are trivially implemented using other functions

Suppose we have a validation module. You are suggesting something like

  $int = validate_int($var, $min, $max);
  $bool = validate_bool($var, $allowed_bool_types);
  // i.e. which type of bool (1/0, yes/no, on/off, true/false) is allowed.
  // This isn't implemented. All of them are valid bools currently.
  $str = validate_string($var, $min_len, $max_len);
  $str = validate_string_encoding($var, $encoding);
  $str = validate_string_chars($var, $allowed_chars);
  $str = validate_string_regex($var, $regex);
  $str = validate_string_digit($var, $min_len, $max_len);
  $str = validate_string_callback($var, $callback);

Although it works, I prefer an array definition because it's a lot
easier to write rules and more efficient to execute.

  $def = [
    'int_var' => ['filter' => FILTER_VALIDATE_INT, 'options' => [$min, $max]],
    'bool_var' => ['filter' => FILTER_VALIDATE_BOOL, 'options' => $allowed_bool_types],
    'str_var' => [
      ['filter' => FILTER_VALIDATE_STRING, 'options' => ['min_bytes' => $min_len, 'max_bytes' => $max_len]],
      ['filter' => FILTER_VALIDATE_REGEX, 'options' => ['regex' => $regex]],
      ['filter' => FILTER_VALIDATE_CALLBACK, 'options' => ['callback' => $callback]],
    ]
  ];

  $safe_input = filter_require_var_array($input, $def);

You can group definitions easily with arrays. (Multiple filter support
is implemented by my patch.)

e.g.

  $my_str_var_spec = [
    ['filter' => FILTER_VALIDATE_STRING, 'options' => ['min_bytes' => $min_len, 'max_bytes' => $max_len]],
    ['filter' => FILTER_VALIDATE_REGEX, 'options' => ['regex' => $regex]],
    ['filter' => FILTER_VALIDATE_CALLBACK, 'options' => ['callback' => $callback]],
  ];

then the previous definition becomes

  $def = [
    'int_var' => ['filter' => FILTER_VALIDATE_INT, 'options' => [$min, $max]],
    'bool_var' => ['filter' => FILTER_VALIDATE_BOOL, 'options' => $allowed_bool_types],
    'str_var' => $my_str_var_spec,
  ];

Reusing rules and centralizing validation rules is easy. If you would
like to build client-side JavaScript validations from the definition,
that is easy too, because it's a simple array definition, not a bunch
of functions that define validation rules.

>> How many of you are against the idea of this RFC?
>> (I don't think Rowan against basic idea, BTW)
>
>
> I guess I'm against the idea of the RFC in the sense that it's aim is too
> broad: we cannot implement safe validation in the language, we can only give
> users the tools to do it. The RFC as it was proposed (and, I think,
> ext/filter in general) tries too hard to "do everything for you", without
> looking at where it would fit inside a larger application.

There are many people who use the filter module happily, so why can't
validation be implemented? Even an external WAF does it.

Divide and conquer (input handling and logic handling) and multiple
layers of protection work. We know that interfaces are more stable
than logic. Vulnerabilities are often introduced when logic is
changed. Input validation can mitigate those risks.

I didn't spend much time on this because I reused the filter module
framework/code and didn't do any refactoring. If it seems I tried too
hard, then it is the filter module authors who worked too hard. I
spent more time writing English than code :)

The proposal provides a primitive tool, but not too primitive. It does
not handle complex forms or client-side JavaScript validations, but it
could be used for these tasks.
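
As a rough illustration of both points (rule reuse and deriving
client-side rules from the definition), here is a sketch that runs on
the stock filter extension. It uses filter_var_array() and the
existing constants FILTER_VALIDATE_INT / FILTER_VALIDATE_REGEXP rather
than the proposed filter_require_var_array(), and only one filter per
key, since multiple filters need my patch. The field names and
export_client_rules() are hypothetical, made up for this example.

```php
<?php
// A rule defined once and reused wherever a username is accepted.
$username_spec = [
    'filter'  => FILTER_VALIDATE_REGEXP,
    'options' => ['regexp' => '/\A[a-z0-9_]{3,16}\z/'],
];

$def = [
    'age'      => ['filter'  => FILTER_VALIDATE_INT,
                   'options' => ['min_range' => 0, 'max_range' => 150]],
    'username' => $username_spec,
];

// Keys that are not in the definition are dropped from the result;
// values that fail validation come back as false.
$input  = ['age' => '42', 'username' => 'rowan_c', 'extra' => 'ignored'];
$result = filter_var_array($input, $def);

// Hypothetical helper: turn the same definition into JSON that a
// client-side validator could consume. Only two filter types are
// handled, and PCRE patterns may need translation for JavaScript.
function export_client_rules(array $def): string
{
    $rules = [];
    foreach ($def as $name => $spec) {
        $opts = $spec['options'] ?? [];
        if ($spec['filter'] === FILTER_VALIDATE_INT) {
            $rules[$name] = [
                'type' => 'int',
                'min'  => $opts['min_range'] ?? null,
                'max'  => $opts['max_range'] ?? null,
            ];
        } elseif ($spec['filter'] === FILTER_VALIDATE_REGEXP) {
            $rules[$name] = ['type' => 'regex', 'pattern' => $opts['regexp']];
        }
    }
    return json_encode($rules);
}

$client_json = export_client_rules($def);
```

Because the definition is plain data, the same $def drives both the
server-side check and the exported client-side rules.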
(I changed the PR so that the exception is optional.)
Those fancy, exciting things are left to user implementation.

Anyway, we have $_POST/$_GET/$_COOKIE/$_FILES/$_SERVER/$_ENV as basic
inputs. Input validation is the #1 requirement for code security. PHP
_must_ have some tool that validates these easily and simply, yet is
extensible.

The question would be what kind we'll have: Simple functions? A
different kind of array definition and validator function? A more
comprehensive object-based API? Suggestions are appreciated. I don't
mind implementing it from scratch. Idea-only suggestions are welcome!

Regards,

--
Yasuo Ohgaki
yohg...@ohgaki.net

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php