Wietse Venema wrote:
Rasmus Lerdorf:
Soenke Ruempler wrote:
Hi Rasmus,
On 03/23/2008 03:32 PM, Rasmus Lerdorf wrote:
This is what the filter extension is for. You should be working with
escaped data by default and only poke a hole in your data firewall in
the few places where you need to work with the raw data. Doing it the
other way around is going to lead to all sorts of security issues.
Mhm. Isn't the right paradigm to prepare variables at the time they
are passed into subsystems (sql, shell, html etc.)? So what do you mean
with "escaped data" here? html/xml escaped, sql escaped (which sql
system and which encoding?). Sounds a bit like magic_quotes reloaded
*hides*
It is, but it is magic_quotes done right. You apply a really strict
filter that makes your data safe for display and your backend by
default. The only place you can reliably do this is at the point
the data enters your system.
Input filtering has valid uses, but protecting html/sql/shell/etc.
is not among them. Legitimate input like O'Reilly requires different
treatments depending on html/sql/shell/etc. context. It would be
incorrect to always insert a \, it would be incorrect to always
remove the ', and it would be incorrect to always reject the input.
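To make the O'Reilly point concrete, here is a minimal sketch in Python (not PHP, and not code from this thread) of the same value being prepared three different ways, depending on the subsystem it is handed to. The table name `authors` and the variable names are made up for illustration.

```python
import html
import shlex
import sqlite3

name = "O'Reilly"  # legitimate input containing a single quote

# HTML context: escape markup-significant characters, including quotes.
html_fragment = "<td>" + html.escape(name, quote=True) + "</td>"

# Shell context: quote the value so the shell sees it as one word.
shell_command = "echo " + shlex.quote(name)

# SQL context: bind the value as a parameter and leave it unmodified.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE authors (name TEXT)")  # hypothetical table
conn.execute("INSERT INTO authors (name) VALUES (?)", (name,))
stored = conn.execute("SELECT name FROM authors").fetchone()[0]

print(html_fragment)
print(shell_command)
print(stored)
```

Each context needs its own conversion, and the stored value keeps the raw quote, which is why no single transformation applied at input time is correct for all three.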
You can also choose to never store the raw single quote and always work
with encoded data. Or, as I suggest, always filter it by default and in
the places where you want the raw quote back or you want it filtered for
a specific use, specify explicitly which filter you want to apply. It
is the data firewall approach. Filter everything by default with an
extremely strict filter and poke holes in your data firewall as
necessary. It also makes it easy to audit your code because you only
have to look at the places where you have poked a hole.
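The data firewall idea can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not the PHP filter extension's API: the class `FirewalledInput`, its methods, and the choice of HTML escaping as the strict default filter are all hypothetical.

```python
import html


class FirewalledInput:
    """Hypothetical request wrapper: strict filter by default, explicit holes."""

    def __init__(self, raw):
        self._raw = dict(raw)
        self.holes = []  # record of every place raw data was requested

    def get(self, key):
        # Default path: drop non-printable characters, then HTML-escape.
        value = self._raw.get(key, "")
        cleaned = "".join(ch for ch in value if ch.isprintable())
        return html.escape(cleaned, quote=True)

    def get_raw(self, key, reason):
        # Explicit hole in the firewall: the caller must say why.
        self.holes.append((key, reason))
        return self._raw.get(key, "")


request = FirewalledInput({"author": "O'Reilly <script>"})
safe = request.get("author")
raw = request.get_raw("author", "bound SQL parameter")
print(safe)
print(request.holes)
```

Auditing then comes down to reviewing the `get_raw` call sites (or the `holes` list in this sketch) rather than every use of input data.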
Data flow control (a.k.a. taint support) can detect when output
isn't converted with the proper conversion function. This can be
done in reporting mode (my approach) or it can be done in "automatic
fixing" mode (other people). These different approaches make
different trade-offs between programmer effort and system overhead,
and avoid the data corruption that input filtering would introduce.
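As a rough illustration of taint checking in reporting mode, the Python sketch below marks external data with a hypothetical `Tainted` string subclass and warns when such a value reaches HTML output without conversion. It is an assumption-laden toy, not the taint implementation discussed in this thread; `Tainted`, `to_html`, and `render_html` are made-up names.

```python
import html
import warnings


class Tainted(str):
    """Hypothetical marker type for strings that came from outside."""


def to_html(value):
    # html.escape returns a plain str, so the result is no longer Tainted.
    return html.escape(value, quote=True)


def render_html(value):
    # Reporting mode: warn about unconverted data, but do not alter it.
    if isinstance(value, Tainted):
        warnings.warn("tainted value reached HTML output without conversion")
    return "<p>" + value + "</p>"


user_input = Tainted("O'Reilly")
print(render_html(to_html(user_input)))  # converted first, no warning
print(render_html(user_input))           # emits a warning
```

The check runs on every output call, which is the runtime cost the reply below objects to; the data itself is never modified by the check.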
Having to do active checks on each use is extremely expensive. You said
yourself you suggest only enabling this during development. The data
firewall approach isn't actually all that different from the taint
approach. The big win is that there is no runtime checking necessary
and thus no performance hit.
-Rasmus
--
PHP Internals - PHP Runtime Development Mailing List