On 08/18/2013 10:33 PM, Joe Gordon wrote:
An alternative I think would be better would be to scrap the use of
the SQLAlchemy ORM; keep using the DB engine abstraction support.
+1, I am hoping this will provide noticeable performance benefits while
being agnostic of what DB back-end is being used. With the way we use
SQLAlchemy being 25x slower than MySQL we have lots of room for
improvement (see http://paste.openstack.org/show/44143/ from
def compute_node_get_all(context):
return model_query(context, models.ComputeNode).\
Well, yeah... I suppose if you are attempting to create 115K objects in
memory in Python (Need to collate each ComputeNode model object and each
of its relation objects for Service and Stats) you are going to run into
some performance problems. :)
Would be interesting to see what the performance difference would be if
you used dicts instead of model objects and did something like
this (code not tested, just off top of head...):
    # Assume a method to_dict() that takes a Model
    # and returns a dict with appropriate empty dicts for
    # relationship fields ('services' and 'stats').
    # Query all three entities so that each row exposes
    # record.ComputeNode, record.Service and record.ComputeNodeStat.
    qr = session.query(ComputeNode, Service, ComputeNodeStat).\
        join(Service).join(ComputeNodeStat)
    results = {}
    for record in qr:
        node_id = record.ComputeNode.id
        service_id = record.Service.id
        stat_id = record.ComputeNodeStat.id
        # Build each dict only once, however many rows repeat it.
        if node_id not in results:
            results[node_id] = to_dict(record.ComputeNode)
        node = results[node_id]
        if service_id not in node['services']:
            node['services'][service_id] = to_dict(record.Service)
        if stat_id not in node['stats']:
            node['stats'][stat_id] = to_dict(record.ComputeNodeStat)
    return results
Whether it would be any faster than SQLAlchemy's joinedload is an open
question...
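To make the collation idea above concrete, here is a minimal, self-contained sketch of the same pattern using only the standard-library sqlite3 module in place of SQLAlchemy. The three-table schema, the column names and the collate_compute_nodes() helper are all made up for illustration and are not Nova's actual models. It only shows how flat join rows fold into nested dicts and says nothing about how the timings would compare.

```python
import sqlite3


def collate_compute_nodes(rows):
    """Fold flat (node, service, stat) join rows into nested dicts.

    Each row is assumed to be a 7-tuple in the column order used by the
    SELECT below; that layout is hypothetical.
    """
    results = {}
    for node_id, host, service_id, name, stat_id, stat_key, stat_value in rows:
        # Create the node dict only the first time its id is seen.
        if node_id not in results:
            results[node_id] = {"id": node_id, "host": host,
                                "services": {}, "stats": {}}
        node = results[node_id]
        if service_id not in node["services"]:
            node["services"][service_id] = {"id": service_id, "name": name}
        if stat_id not in node["stats"]:
            node["stats"][stat_id] = {"id": stat_id, "key": stat_key,
                                      "value": stat_value}
    return results


# Hypothetical schema with one node, one service and two stats.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE services (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE compute_nodes (
    id INTEGER PRIMARY KEY, host TEXT, service_id INTEGER);
CREATE TABLE compute_node_stats (
    id INTEGER PRIMARY KEY, compute_node_id INTEGER,
    stat_key TEXT, stat_value TEXT);
INSERT INTO services VALUES (1, 'nova-compute');
INSERT INTO compute_nodes VALUES (10, 'host-a', 1);
INSERT INTO compute_node_stats VALUES (100, 10, 'num_instances', '3');
INSERT INTO compute_node_stats VALUES (101, 10, 'io_workload', '0');
""")

rows = conn.execute("""
SELECT n.id, n.host, s.id, s.name, st.id, st.stat_key, st.stat_value
FROM compute_nodes n
JOIN services s ON s.id = n.service_id
JOIN compute_node_stats st ON st.compute_node_id = n.id
ORDER BY st.id
""")
nodes = collate_compute_nodes(rows)
```

The join returns one row per (node, stat) pair, so the node and service columns repeat on every row; the membership checks are what keep each dict from being rebuilt for every repeat.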
Besides that, though, it's probably a good idea to treat the very
existence of DB calls that can return that kind of massive result set
as A Bad Thing...
OpenStack-dev mailing list