--- [EMAIL PROTECTED] wrote:
> Recently I've been in the middle of trying to build defenses against
> SQL injection on a site I'm working on (proactively, we haven't had a
> problem). While this principle seems exactly right, I find it's not as
> easy to implement as it sounds, and I'd argue that the results aren't
> as absolute as you suggest, though you certainly have more experience
> with it than I do so perhaps I'm missing something.

I would never argue that something is an absolute defense, but I would
characterize my recommendation as a best practice.

> The problem is that there are some well-defined attacks with
> protections against them that can be logically defended. But there is
> no list of all possible attacks, so I'm not sure it's really possible
> to say "you're protected against SQL injection" at some point. Do you
> feel differently? If so I'd be interested to hear why.

The reason is the difference in approach. If an approach depends on
exhaustive knowledge of all possible attacks, it is fundamentally flawed
and could never be considered secure. There is only one you, and there
are an unlimited number of potential attackers. You cannot hope to
second-guess every single one of them.

> I agree with you that checking for valid characters is safer than
> checking for malicious characters, but even the former is not absolute.

Not absolute in what sense? Making sure something is valid is pretty
absolute; the only possible flaws are flaws in "making sure something is
valid." For example, I feel confident that no one can show me a string
that I would consider a valid first name that is also an SQL injection
attack.

> Also it is not possible to make the set of characters with syntactic
> significance have no overlap with the set of valid input characters --
> a single quote used as an apostrophe is the obvious example, so
> checking for valid characters may still leave characters in the data
> that could also be part of an attack.

I would never suggest that you should not escape data properly according
to your database of choice.
In fact, I included a very helpful link that addresses this, and I will
include it again:

http://phundamentals.nyphp.org/PH_storingretrieving.php

If you are using MySQL, there is a nice function that escapes your data
for you:

http://www.php.net/mysql_escape_string

If you make sure data is valid and then properly escape it for use in an
SQL statement, you're adhering to what I am suggesting is a best
practice against SQL injection. This assumes that you surround all
literal values with single quotes.

> As for specifics, at the moment I am simply forcing every element of
> _POST to be truncated to a known maximum length, then run through
> strip_tags, stripslashes, and htmlspecialchars (in that order) before I
> use it.

This doesn't work for everyone. I can think of several examples where
users would be submitting HTML and/or PHP code. I wouldn't want to
delete some of their data.

I applaud your efforts in data filtering, because almost all PHP
vulnerabilities that I read about are a result of the author completely
failing to perform any data filtering at all (which is inexcusable).

However, might I suggest that you take a slightly different approach?
Verify that the data is exactly what you expect it to be, and then
escape and/or encode it when necessary. For example, for storing valid
data, use mysql_escape_string() or an equivalent function for your
database of choice. For displaying valid data, use htmlentities(). If
you want some user-submitted tags interpreted, you can use str_replace()
to convert those HTML entities back (this makes sure that only specific
uses of specific tags are interpreted). For unvalidated data, do nothing
with it until you have validated it with your data filtering logic. A
good software architecture should make it easy for the developer to keep
track of this (naming conventions are also very helpful here).

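The filter-first, encode-on-output steps described above could be sketched roughly as follows. This is only an illustration under stated assumptions: is_valid_first_name(), render_comment(), the name pattern, the 40-character limit, the choice of <b> as the one permitted tag, and the $clean array convention are all hypothetical and not taken from this thread. The database escaping step is left out because it needs a live connection.

```php
<?php
// Hypothetical whitelist filter: the pattern and length limit are
// assumptions for illustration only.
function is_valid_first_name(string $name): bool
{
    // A letter, then up to 39 letters, apostrophes, spaces, or hyphens.
    return preg_match('/\A[A-Za-z][A-Za-z\' -]{0,39}\z/', $name) === 1;
}

// Encode everything for display, then convert back only the exact
// entity sequences for the tags we have chosen to allow.
function render_comment(string $text): string
{
    $html = htmlentities($text, ENT_QUOTES, 'UTF-8');
    $allowed = [
        '&lt;b&gt;'  => '<b>',
        '&lt;/b&gt;' => '</b>',
    ];
    return str_replace(array_keys($allowed), array_values($allowed), $html);
}

// Stand-in for $_POST so the sketch runs on its own.
$submitted = [
    'first_name' => "O'Reilly",
    'comment'    => '<b>hi</b> <script>alert(1)</script>',
];

// Naming convention (an assumption): only filtered data enters $clean.
$clean = [];

if (is_valid_first_name($submitted['first_name'])) {
    $clean['first_name'] = $submitted['first_name'];
}

if (isset($clean['first_name'])) {
    // Encode at output time, not at input time.
    echo htmlentities($clean['first_name'], ENT_QUOTES, 'UTF-8'), "\n";
}

// The <b> pair is interpreted; the <script> tag stays encoded.
echo render_comment($submitted['comment']), "\n";
```

Note that O'Reilly passes the filter with its apostrophe intact, which is why a validated value still has to be escaped for the database before it goes into an SQL statement.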
> Then every input form element is validated against an appropriate
> regexp depending on the type of input expected. I also use
> mysql_real_escape_string on all strings prior to writing them to
> the database, and I use single quotes around all integer values.
> If you're game, I'm curious if you see any flaws in this approach.

I'm always game. :-)

This actually sounds like a strong approach to me. I assume that you
surround all data in an SQL statement with single quotes (not just
integer values). In fact, this is almost exactly what I am suggesting. I
do not think you have an SQL injection vulnerability, unless what your
code does strays from this description somehow.

Also, if your applications never allow the user to submit HTML or PHP,
stripping tags is fine. But you might be interested in letting your
regular expression catch this instead, so that you can log attacks.
Attackers certainly profile your applications - why not profile their
attacks? It can potentially help us all.

> I am still contemplating whether there is any value to running input
> through htmlspecialchars, or whether I should instead simply be using
> htmlentities on output.

I prefer htmlentities(), but I think this is a small point.

> I also haven't looked at what this does to nested attacks of various
> kinds and whether there is a way to use multiple iterations or escapes
> in the input data to bypass the filtering (pointers to articles which
> discuss this would be welcome).

The point of escaping or encoding would be lost if it didn't work for
all possible data. I know of no articles on this, nor can I think of
anyone who would bother writing one. :-)

Anyway, I hope that helps.

Chris

=====
Chris Shiflett - http://shiflett.org/

PHP Security - O'Reilly
     Coming mid-2004
HTTP Developer's Handbook - Sams
     http://httphandbook.org/
PHP Community Site
     http://phpcommunity.org/

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php