Stanislav Malyshev wrote:
>> output to browser, output to system (console/whatever else), sql, xml,
>> streams, etc... all of them require special attentions.
>
> Hello, safe mode 2.0! :)
> Seriously, I do not think tainting is made for that - and we will have a
> ton of trouble trying to describe what is "safe for SQL" (is it for
> MySQL? Oracle? DB2? sqlite? a ton of other SQLs each with own quirks and
> quoting rules?) and what is "safe for output" (is it OK to output HTML
> tags?). Tainting mode, as I see it, is meant to achieve exactly one
> simple task - force you (as much as it can) to take explicit action on
> sanitizing the parameters before they can do any harm. I do not think it
> should make you use any specific way of sanitizing or check data for
> anything specific - this is impossible without domain-specific
> knowledge. This is task for filters and yes - for you as a developer.
> Tainting mode only makes sure for you that you do you job. It's like
> code coverage report - it doesn't make your code bug-free, it only
> ensures you actually did some checking. And that's why it can be dumb
> and not try to figure out what is safe for output and what is safe for xml.
>
>> I do not want the mode 3, for the reasons I explained earlier. I also
>
> Actually, I do. Especially if I had some legacy non-filtering
> application which I wanted to secure. I would prefer to break it hard
> and then assemble the pieces in the correct way, rather than play
> find-the-next-hole.

This is actually exactly what the filter extension was designed to do, and the difference between taint mode and ext/filter is just in the approach. The idea behind ext/filter is that an organization, like Yahoo! in my case, can provide a default filter that is deemed safe for both output and the backends in common use within the organization. For us that means the default filter is extremely strict. It basically strips or encodes anything that could in any way be interesting to a web browser or the SQL parsers we use. If your name is O'Henry then sorry, it is now O&#39;Henry.

For the most part, existing third party applications run unchanged even with this strict default filter. At most there are one or two textarea fields in any one of the common open source apps out there that require special characters to get through unscathed. We have a number of large open source things, including Drupal, Symfony and such, running nicely under this extremely strict filter. It becomes quite obvious where to fix these, and the big benefit for me is that the resulting code is easily auditable: I can just scan through the source code repository looking for the places that fetch raw data, or stick a watcher on the commit mechanism that tells me anytime someone wants to access the raw data.

Such a default filter is obviously not for everyone, but the big advantage over the taint mode approach is that you don't have to go through and look at every single use of Get/Post/Cookie/Env/Server, since they are all strictly filtered by default. The silly address form will still work unchanged and the age selector will work. Yes, you will end up with HTML entities in the database, but if the app runs in that mode right from the start and is only accessed through the web, then as long as a single quote is consistently written to the database as &#39;, technically the app is still working fine.

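As a rough illustration of the strict-default-filter idea described above, here is a minimal sketch. It simulates the filtering on a plain array so it can run from the command line. `filter_var()` and `FILTER_SANITIZE_SPECIAL_CHARS` are real ext/filter names, but choosing that particular filter is an assumption (the post does not say which filter is used as the default), and the helpers `apply_default_filter()` and `raw_input()` are hypothetical names invented for this example.

```php
<?php
// Sketch only: simulates a strict site-wide default filter on a plain array,
// so it runs without real GET/POST data.

// Hypothetical helper: apply one strict filter to every incoming scalar value.
function apply_default_filter(array $input): array
{
    $filtered = [];
    foreach ($input as $key => $value) {
        // FILTER_SANITIZE_SPECIAL_CHARS HTML-encodes ' " < > & and
        // characters with an ASCII value below 32.
        $filtered[$key] = filter_var((string) $value, FILTER_SANITIZE_SPECIAL_CHARS);
    }
    return $filtered;
}

// Hypothetical helper: the single, easy-to-grep door to unfiltered data.
function raw_input(array $raw, string $key): ?string
{
    return isset($raw[$key]) ? (string) $raw[$key] : null;
}

$raw     = ['name' => "O'Henry", 'age' => '42']; // stand-in for raw request data
$request = apply_default_filter($raw);           // what application code sees

echo $request['name'], "\n";       // single quote is entity-encoded
echo $request['age'], "\n";        // plain values pass through unchanged
echo raw_input($raw, 'name'), "\n"; // explicit, auditable raw access
```

Because raw values are only reachable through one named function in this sketch, an audit reduces to searching the repository for `raw_input(` calls, which mirrors the auditing benefit described in the text.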
You may of course want to put a database-specific filter into the database layer in order to better control how user data goes into the database, and that is what the various filters are for. The trouble is rarely the places where developers remember to filter; it is usually where they forget. Both taint and filter (with a default filter) address that in slightly different ways, with filter being aimed at production servers so you will sleep better at night regardless of whether your own developers mess up or a third party application someone lazily installed messes up.

I see taint mode being more geared at developers, trying to help them write code that will be more secure on unfiltered PHP installs. For filtered PHP installs taint won't help much, although the two need to know about each other. For example, if taint mode is on and you fetch the raw data from filter for an input variable, then that returned data should be tainted. Likewise, if a default filter is in place, the filtered input array data should be marked as untainted. That would give you an extra layer of protection against fetched raw data leaking into places it shouldn't.

-Rasmus

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php