Excerpts from Mike Bayer's message of 2015-03-09 10:26:37 -0700: > > Clint Byrum <cl...@fewbar.com> wrote: > > > Excerpts from David Stanek's message of 2015-03-08 11:18:05 -0700: > >> On Sun, Mar 8, 2015 at 1:37 PM, Mike Bayer <mba...@redhat.com> wrote: > >> > >>> can you elaborate on your reasoning that FK constraints should be used > >>> less > >>> overall? or do you just mean that the client side should be mirroring the > >>> same > >>> rules that would be enforced by the FKs? > >> > >> I don't think he means that we will use them less. Our SQL backends are > >> full of them. What Keystone can't do is rely on them because not all > >> implementations of our backends support FKs. > > > > Note that they're also a huge waste of SQL performance. It's _far_ cheaper > > to scale out application servers and garbage-collect using background jobs > > like pt-archiver than it will ever be to scale out a consistent data-store > > and do every single little bit of house keeping in real time. So even > > on SQL backends, I'd recommend just disabling and dropping FK constraints > > if you expect any more than the bare minimum usage of Keystone. > > Im about -1000 on disabling foreign key constraints. Any decision based on > “performance” IMHO has to be proven with benchmarks. Foreign keys on modern > databases like MySQL and Postgresql do not add overhead to any significant > degree compared to just the workings of the Python code itself (which means, > a benchmark here should be illustrating a tangible impact on the python > application itself). OTOH, the prospect of a database with failed > referential integrity is a recipe for disaster. >
So I think I didn't speak clearly enough here. The benchmarks are of course needed, but there's a tipping point when write activity gets to a certain level where it's cheaper to let it get a little skewed and correct asynchronously. This is not unique to SQL, this is all large scale distributed systems. There's probably a super cool formula for it too, but roughly it is (num_trans_per_s * cost_of_fk_check_per_trans) versus (error_cost * error_rate)+(cost_find_all_errors/seconds_to_find_all_errors) So it's not really something I think one can blindly accept as "better", but rather something that one needs to calculate for themselves. You say cost_of_fk_check_per_trans is negligible, but that has been measured as not true in the past: http://www.percona.com/blog/2010/09/20/instrumentation-and-the-cost-of-foreign-keys/ That article demonstrates that the FK adds lock contention in InnoDB. There's more. With NDB (MySQL cluster) it's an 18% performance hit on raw throughput: http://johanandersson.blogspot.com/2013/06/benchmarking-performance-impact-of.html Though that could be artificially inflated due to being a raw benchmark. Now, where that point is with Keystone I don't know. The point is, if you write the code relying on the existence, Keystone becomes a vertically scaling app that cannot ever scale out beyond whatever that limit is. __________________________________________________________________________ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev