Rasmus Lerdorf wrote:
> Well, I actually have years of experience taking apps and making them
> run under my strict default filter. And it tends to not be very many
> changes, if any at all. In the O'Reilly case it gets changed to
> O&#39;Reilly which for a pure web app is fine. If all input
> consistently gets changed the same way then you can store O&#39;Reilly
> in the backend and a search will still find it since the search query
> itself will be encoded the same way. If you have non web tools working
> with the same backend data, then you may have a requirement to store it
> raw, in which case you'd need to poke a hole in your data firewall.
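
To make the quoted scenario concrete, here is a minimal sketch in Python, chosen only for brevity. It assumes the filter writes the apostrophe as the numeric entity &#39;; `strict_filter`, the `publishers` table and the in-memory SQLite database are hypothetical stand-ins, not the actual filter or any real schema.

```python
import sqlite3

# Hypothetical stand-in for a strict default input filter: HTML-significant
# characters are replaced by numeric entities (assumption: ' becomes &#39;).
_ENTITIES = {"&": "&#38;", "<": "&#60;", ">": "&#62;", '"': "&#34;", "'": "&#39;"}


def strict_filter(value: str) -> str:
    """Encode every HTML-significant character in value."""
    return "".join(_ENTITIES.get(ch, ch) for ch in value)


def demo():
    """Store a filtered name, then search for it with and without the filter."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE publishers (name TEXT)")

    # All web input passes through the filter before it reaches the backend.
    db.execute("INSERT INTO publishers (name) VALUES (?)",
               (strict_filter("O'Reilly"),))
    stored = db.execute("SELECT name FROM publishers").fetchone()[0]

    query = "SELECT COUNT(*) FROM publishers WHERE name = ?"
    # A web search term is filtered the same way, so it matches the stored value.
    filtered_hits = db.execute(query, (strict_filter("O'Reilly"),)).fetchone()[0]
    # A non-web tool that bypasses the filter sends the raw string and misses.
    raw_hits = db.execute(query, ("O'Reilly",)).fetchone()[0]

    db.close()
    return stored, filtered_hits, raw_hits
```

In this sketch the filtered search finds the row and the raw search does not, which is the case where a non-web tool sharing the same backend would need either raw storage or a decoding step.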

Rasmus,

I'm sure these techniques work very well in practice. However, it's important to note that this is still an optimization, a step down from the "ideal" standard of keeping raw data in the database. In theory, the data would be stored in its purest form, with no extraneous escaping. In practice, most data will be used in a web context, so, as you note, storing the apostrophe as &#39; is perfectly acceptable.

I've always advocated storing both the pure data and the escaped version (as a kind of cache) in the database, because if you store only the escaped version, the only way to get the raw version back is to decode it. Of course, this doubles the storage requirement.

-- 
Edward Z. Yang                        GnuPG: 0x869C48DA
HTML Purifier <http://htmlpurifier.org> Anti-XSS Filter
[[ 3FA8 E9A9 7385 B691 A6FC B3CB A933 BE7D 869C 48DA ]]

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php