--- [EMAIL PROTECTED] wrote:
> Recently I've been in the middle of trying to build defenses against 
> SQL injection on a site I'm working on (proactively, we haven't had a 
> problem). While this principle seems exactly right, I find it's not as 
> easy to implement as it sounds, and I'd argue that the results aren't 
> as absolute as you suggest, though you certainly have more experience 
> with it than I do so perhaps I'm missing something.

I would never argue that something is an absolute defense, but I would
characterize my recommendation as a best practice.

> The problem is that there are some well-defined attacks with 
> protections against them that can be logically defended. But there is 
> no list of all possible attacks, so I'm not sure it's really possible 
> to say "you're protected against SQL injection" at some point. Do you 
> feel differently? If so I'd be interested to hear why.

The reason is the difference in approach. Any approach that depends on
exhaustive knowledge of all possible attacks is fundamentally flawed and
can never be considered secure. There is only one you, and there are an
unlimited number of potential attackers. You cannot hope to second-guess
every single one of them.

> I agree with you that checking for valid characters is safer than 
> checking for malicious characters, but even the former is not absolute.

Not absolute in what sense? Making sure something is valid is pretty
absolute; the only possible flaws are flaws in "making sure something is
valid." For example, I feel confident that no one can show me a string
that I would consider a valid first name that is also an SQL injection
attack.
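
To make "making sure something is valid" concrete, here is a minimal
sketch of a whitelist check for a first name. The function name and the
rules themselves (letters, spaces, hyphens, and apostrophes, at most 40
characters) are assumptions chosen for illustration; your own definition
of a valid first name may well differ.

```php
<?php
// Hypothetical helper: returns true only if $name passes a strict whitelist.
// The allowed characters and the 40-character limit are assumptions.
function is_valid_first_name($name)
{
    if (!is_string($name)) {
        return false;
    }

    // Must start with a letter, followed by letters, spaces, hyphens, or
    // apostrophes. \A and \z anchor to the absolute start and end of the
    // string, so a trailing newline is rejected as well.
    return preg_match('/\A[A-Za-z][A-Za-z\' -]{0,39}\z/', $name) === 1;
}

var_dump(is_valid_first_name("Chris"));
var_dump(is_valid_first_name("O'Reilly"));
var_dump(is_valid_first_name("x'; DROP TABLE users; --"));
```

Note that a name containing an apostrophe passes this check, which is
why valid data still needs to be escaped before it goes into a query.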

> Also it is not possible to make the set of characters with syntactic
> significance have no overlap with the set of valid input characters --
> a single quote used as an apostrophe is the obvious example, so
> checking for valid characters may still leave characters in the data
> that could also be part of an attack.

I would never suggest that you should not escape data properly according
to your database of choice. In fact, I included a very helpful link that
addresses this, and I will include it again:

http://phundamentals.nyphp.org/PH_storingretrieving.php

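If you are using MySQL, there is a nice function that escapes your data
for you (where it is available, mysql_real_escape_string() is the better
choice, because it takes the connection's character set into account):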

http://www.php.net/mysql_escape_string

If you make sure data is valid and then properly escape it for use in an
SQL statement, you're adhering to what I am suggesting is a best practice
against SQL injection. This is under the assumption that you surround all
literal values with single quotes.

> As for specifics, at the moment I am simply forcing every element of 
> _POST to be truncated to a known maximum length, then run through 
> strip_tags, stripslashes, and htmlspecialchars (in that order) before I 
> use it.

This doesn't work for everyone. I can think of several examples where
users would be submitting HTML and/or PHP code. I wouldn't want to delete
some of their data.

I applaud your efforts in data filtering, because almost all PHP
vulnerabilities that I read about are a result of the author completely
failing to perform any data filtering at all (which is inexcusable).
However, might I suggest a slightly different approach? Verify that the
data is exactly what you expect it to be, and then escape and/or encode it
when necessary.

For example, for storing valid data, use mysql_escape_string() or an
equivalent function for your database of choice. For displaying valid
data, use htmlentities(). If you want some user-submitted tags
interpreted, you can use str_replace() to convert the entities for exactly
those tags back into markup (this makes sure that only specific uses of
specific tags are interpreted).
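
The htmlentities() plus str_replace() idea can be sketched roughly as
follows. The function name and the choice of allowed tags (plain <b> and
<i>, with no attributes) are assumptions made for this example only.

```php
<?php
// Hypothetical helper: encode everything, then restore a fixed list of tags.
// Which tags to allow is an assumption for this sketch.
function display_with_allowed_tags($text)
{
    // Encode all HTML special characters, including both quote styles.
    $encoded = htmlentities($text, ENT_QUOTES, 'UTF-8');

    // Convert back only the exact, attribute-free tags we want interpreted.
    $entities = array('&lt;b&gt;', '&lt;/b&gt;', '&lt;i&gt;', '&lt;/i&gt;');
    $tags     = array('<b>', '</b>', '<i>', '</i>');

    return str_replace($entities, $tags, $encoded);
}

echo display_with_allowed_tags('<b>bold</b> <script>alert(1)</script>'), "\n";
```

Because only exact strings are converted back, a tag with attributes
(such as a <b> carrying an onclick handler) stays encoded and is shown to
the reader as text rather than interpreted by the browser.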

Do nothing with unvalidated data until your data filtering logic has
validated it. A good software architecture should make it easy for the
developer to keep track of which data has been filtered (naming
conventions are also very helpful for this).
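
One possible naming convention, sketched under assumptions: filtered
values are only ever stored in an array called $clean, so anything that
is not in $clean is by definition untrusted. The function name, field
names, and rules below are hypothetical.

```php
<?php
// Sketch of a naming convention: only filtered values ever enter $clean.
// The field names and validation rules here are hypothetical.
function build_clean_array(array $raw)
{
    $clean = array();

    // Age: one to three digits, nothing else.
    if (isset($raw['age']) && is_string($raw['age'])
        && preg_match('/\A[0-9]{1,3}\z/', $raw['age']) === 1) {
        $clean['age'] = (int) $raw['age'];
    }

    // Color: must be one of a fixed set of known values.
    $colors = array('red', 'green', 'blue');
    if (isset($raw['color']) && in_array($raw['color'], $colors, true)) {
        $clean['color'] = $raw['color'];
    }

    return $clean;
}

// In a real script $raw would be $_POST; a literal array keeps this runnable.
$clean = build_clean_array(array('age' => '42', 'color' => 'purple'));
var_dump($clean);
```

The rest of the application then reads only from $clean, which makes a
use of raw $_POST data easy to spot in a code review.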

> Then every input form element is validated against an appropriate
> regexp depending on the type of input expected. I also use
> mysql_real_escape_string on all strings prior to writing them to
> the database, and I use single quotes around all integer values.
> If you're game, I'm curious if you see any flaws in this approach.

I'm always game. :-)

This actually sounds like a strong approach to me. I assume that you
surround all data in an SQL statement with single quotes (not just integer
values). In fact, this is almost exactly what I am suggesting. I do not
think you have an SQL injection vulnerability, unless what your code does
strays from this description somehow.

Also, if your applications never allow the user to submit HTML or PHP,
stripping tags is fine. But you might prefer to let your regular
expression reject such input instead, so that you can log attacks. Attackers
certainly profile your applications - why not profile their attacks? It
can potentially help us all.
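
A rough sketch of rejecting and logging instead of silently stripping
follows. The function name, the log message format, and the example
pattern are assumptions; error_log() writes to whatever destination your
PHP configuration specifies.

```php
<?php
// Hypothetical helper: validate a field against a pattern and log rejections.
function check_and_log($field, $value, $pattern)
{
    if (is_string($value) && preg_match($pattern, $value) === 1) {
        return true;
    }

    // json_encode() keeps the logged value on one line and escapes control
    // characters, so the rejected input cannot forge extra log entries.
    error_log(sprintf('Rejected input for %s: %s', $field, json_encode($value)));

    return false;
}

// Example: a username limited to 3-16 letters, digits, and underscores.
$ok = check_and_log('username', '<script>alert(1)</script>', '/\A[A-Za-z0-9_]{3,16}\z/');
var_dump($ok);
```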

> I am still contemplating whether there is any value to running input
> through htmlspecialchars, or whether I should instead simply be using 
> htmlentities on output.

I prefer htmlentities(), but I think this is a small point.

> I also haven't looked at what this does to nested attacks of various
> kinds and whether there is a way to use multiple iterations or escapes
> in the input data to bypass the filtering (pointers to articles which
> discuss this would be welcome).

The point of escaping or encoding would be lost if it didn't work for all
possible data. I know of no articles for this, nor can I think of anyone
who would bother writing one. :-)

Anyway, I hope that helps.

Chris

=====
Chris Shiflett - http://shiflett.org/

PHP Security - O'Reilly
     Coming mid-2004
HTTP Developer's Handbook - Sams
     http://httphandbook.org/
PHP Community Site
     http://phpcommunity.org/

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
