[openstack-dev] [Congress] data-source renovation

Tim Hinrichs Tue, 29 Jul 2014 15:04:30 -0700

Hi all,

As I mentioned in a previous IRC meeting, when writing our first few policies 
I had trouble using the tables we currently use to represent external data 
sources like Nova/Neutron.


The main problem is that wide tables (those with many columns) are hard to use: 
(a) it is hard to remember what all the columns are, (b) it is easy to 
mistakenly use the same variable in two different tables in the body of a 
rule, i.e. to create an accidental join, and (c) changes to the datasource 
drivers can require tedious/error-prone modifications to policy.

I see several options.  Once we choose something, I’ll write up a spec and 
include the other options as alternatives.


1) Add a preprocessor to the policy engine that makes it easier to deal with 
large tables via named-argument references.

Instead of writing a rule like

p(port_id, name) :-
    neutron:ports(port_id, addr_pairs, security_groups, extra_dhcp_opts, 
binding_cap, status, name, admin_state_up, network_id, tenant_id, binding_vif, 
device_owner, mac_address, fixed_ips, router_id, binding_host)

we would write

p(id, nme) :-
    neutron:ports(port_id=id, name=nme)

The preprocessor would fill in all the missing variables and hand the resulting 
full-width rule (the first form above) off to the Datalog engine.
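
A minimal sketch of what such a preprocessor might do, in Python.  The SCHEMAS 
dictionary and the expand_named_args function are hypothetical names made up 
for illustration, not existing Congress code; the column order is the one from 
the neutron:ports rule above.  Columns the policy writer did not mention get 
fresh variables, so they cannot cause accidental joins.

```python
import itertools

# Hypothetical schema registry: table name -> ordered column names.
# Column order copied from the neutron:ports rule above.
SCHEMAS = {
    "neutron:ports": [
        "port_id", "addr_pairs", "security_groups", "extra_dhcp_opts",
        "binding_cap", "status", "name", "admin_state_up", "network_id",
        "tenant_id", "binding_vif", "device_owner", "mac_address",
        "fixed_ips", "router_id", "binding_host",
    ],
}


def expand_named_args(table, named_args, fresh=None):
    """Turn table(col=var, ...) into a full positional atom.

    Columns that the policy writer did not mention receive fresh,
    unique variables (_x0, _x1, ...).
    """
    columns = SCHEMAS[table]
    unknown = set(named_args) - set(columns)
    if unknown:
        raise KeyError("unknown columns for %s: %s" % (table, sorted(unknown)))
    if fresh is None:
        fresh = itertools.count()
    args = [named_args[c] if c in named_args else "_x%d" % next(fresh)
            for c in columns]
    return "%s(%s)" % (table, ", ".join(args))


atom = expand_named_args("neutron:ports", {"port_id": "id", "name": "nme"})
print("p(id, nme) :- " + atom)
```

A real preprocessor would also have to make sure the generated names never 
collide with variables the policy writer chose, and would share one counter 
across all atoms of a rule.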

Pros: (i) leveraging vanilla database technology under the hood
      (ii) policy is robust to changes in the fields of the original data 
because the Congress data model is different from the Nova/Neutron data models
Cons: (i) we will need to invert the preprocessor when showing 
rules/traces/etc. to the user
      (ii) a layer of translation makes debugging difficult

2) Be disciplined about writing narrow tables and write 
tutorials/recommendations demonstrating how.

Instead of a table like...
neutron:ports(port_id, addr_pairs, security_groups, extra_dhcp_opts, 
binding_cap, status, name, admin_state_up, network_id, tenant_id, binding_vif, 
device_owner, mac_address, fixed_ips, router_id, binding_host)

we would have many tables...
neutron:ports(port_id)
neutron:ports.addr_pairs(port_id, addr_pairs)
neutron:ports.security_groups(port_id, security_groups)
neutron:ports.extra_dhcp_opts(port_id, extra_dhcp_opts)
neutron:ports.name(port_id, name)
...

People writing policy would write rules such as ...

p(x) :- neutron:ports.name(port, name), ...

[Here the period, e.g. in ports.name, is not an operator, just a convenient way 
to spell the table name.]

To do this, Congress would need to know which columns in a table are sufficient 
to uniquely identify a row, which in most cases is just the ID.
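
As a rough sketch of how a datasource driver could derive the narrow tables 
once it knows the key column, in Python.  The narrow_tables function and the 
sample rows are made up for illustration, and the sketch assumes the key is a 
single column (the ID).

```python
from collections import defaultdict


def narrow_tables(table, columns, key, rows):
    """Split wide rows into one two-column table per non-key column.

    `key` is the column that uniquely identifies a row (assumed here
    to be a single column, e.g. the ID).
    """
    key_index = columns.index(key)
    result = defaultdict(list)
    for row in rows:
        row_key = row[key_index]
        # The base table keeps only the key, e.g. neutron:ports(port_id).
        result[table].append((row_key,))
        for column, value in zip(columns, row):
            if column != key:
                # e.g. neutron:ports.name(port_id, name)
                result["%s.%s" % (table, column)].append((row_key, value))
    return dict(result)


# Made-up sample data, using only three of the port columns.
columns = ["port_id", "name", "status"]
rows = [("p1", "web", "ACTIVE"), ("p2", "db", "DOWN")]
tables = narrow_tables("neutron:ports", columns, "port_id", rows)
for name in sorted(tables):
    print(name, tables[name])
```

Without the key column there is no way to tie the narrow rows back together, 
which is why the primary-key information is needed.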

Pros: (i) this requires only changes in the datasource drivers; everything else 
remains the same
      (ii) still leveraging database technology under the hood
      (iii) policy is robust to changes in fields of original data
Cons: (i) datasource driver can force policy writer to use wide tables
      (ii) this data model is much different than the original data models
      (iii) we need primary-key information about tables

3) Enhance the Congress policy language to handle objects natively.

Instead of writing a rule like the following ...

p(port_id, name, group) :-
    neutron:ports(port_id, addr_pairs, security_groups, extra_dhcp_opts, 
binding_cap, status, name, admin_state_up, network_id, tenant_id, binding_vif, 
device_owner, mac_address, fixed_ips, router_id, binding_host),
    neutron:ports.security_groups(security_groups, group)

we would write a rule such as
p(port_id, name, group) :-
    neutron:ports(port),
    port.name(name),
    port.id(port_id),
    port.security_groups(group)

The big difference here is that the period (.) is an operator in the language, 
just as in C++/Java.

Pros:
(i) The data model we use in Congress is almost exactly the same as the data 
model we use in Neutron/Nova.

(ii) Policy is robust to changes in the Neutron/Nova data model as long as 
those changes only ADD fields.

(iii) Programmers may be slightly more comfortable with this language.

Cons:

(i) The obvious implementation (changing the engine to implement the (.) 
operator directly) is quite a change from traditional database technology.  At 
this point, that seems risky.

(ii) It is unclear how to implement this via a preprocessor (thereby leveraging 
database technology).  The key problem I see is that we would need to translate 
port.name(...) into something like option (2) above, i.e. TABLE.name(port, ...). 
The difficulty is that the object behind TABLE could sometimes be a port, 
sometimes be a network, sometimes be a subnet, etc.

(iii) Requires some extra syntactic restrictions to ensure we don't lose 
decidability.

(iv) Because the Congress and Nova/Neutron models are the same, changes to the 
Nova/Neutron model can require rewriting policy.
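
To make con (ii) concrete, here is a rough Python sketch of the easy case of 
that translation.  When the rule body contains an atom such as 
neutron:ports(port) that says which table a variable ranges over, 
port.name(name) can be rewritten in the style of option (2).  When no such 
atom exists, the rewrite has no table to pick.  The translate_dot_atoms 
function and the (predicate, args) representation are hypothetical, chosen 
only for this illustration.

```python
def translate_dot_atoms(body):
    """Rewrite var.field(args) atoms into option-(2)-style atoms.

    `body` is a list of (predicate, args) pairs.  A predicate that
    contains a colon (e.g. "neutron:ports") with a single argument is
    treated as declaring the table that its variable ranges over.
    """
    table_of = {}
    for predicate, args in body:
        if ":" in predicate and len(args) == 1:
            table_of[args[0]] = predicate

    translated = []
    for predicate, args in body:
        if ":" in predicate:
            # Already an ordinary table reference; keep it unchanged.
            translated.append((predicate, args))
            continue
        var, _, field = predicate.partition(".")
        if var not in table_of:
            # This is the hard case: nothing tells us the object's table.
            raise ValueError("cannot tell which table %r belongs to" % var)
        translated.append(("%s.%s" % (table_of[var], field),
                           [var] + list(args)))
    return translated


body = [
    ("neutron:ports", ["port"]),
    ("port.name", ["name"]),
    ("port.security_groups", ["group"]),
]
for predicate, args in translate_dot_atoms(body):
    print("%s(%s)" % (predicate, ", ".join(args)))
```

The sketch only handles variables whose table is fixed by a single atom in the 
same rule; a variable that could range over ports in one rule instance and 
networks in another is exactly the case that remains unclear.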



Thoughts?
Tim
