On 05/14/2012 04:15 PM, Doug Hellmann wrote:
>
> On Fri, May 11, 2012 at 3:55 PM, Loic Dachary <l...@enovance.com> wrote:
>
> > > - The interesting metadata for a resource may depend on the type
> > >   of resource. Do we need separate "tables" for that or can we
> > >   normalize somehow?
> > > - How do we map a resource to the correct version of its metadata
> > >   at any given time? Timestamps seem brittle.
> > > - Do we need to reflect the metadata in the aggregation API?
> >
> > Hi,
> >
> > I started a new thread for the "metadata" topic. I suspect it
> > deserves it. Although I was reluctant to acknowledge that the
> > metadata should be stored by the metering, yesterday's meeting made
> > me realize that it was mandatory. The compelling reason (for me ;-)
> > is that it would be much more difficult to implement a billing
> > system if the metering does not provide a simple way to extract
> > metadata and display it in a human-readable way (or one meaningful
> > to accountants?).
> >
> > I see two separate questions:
> >
> > a) how to store and query metadata?
> > b) what are the semantics of the metadata for a given resource?
> >
> > My hunch is that there will never be a definitive answer to b) and
> > that the best we can do is to provide a format and leave the
> > semantics to the documentation of the metering system, explaining
> > the metadata of a resource.
> >
> > Regarding the storage of the metadata, the metering could listen
> > for / poll events creating / updating / deleting a given resource
> > and store a history log indexed by the resource id. Something like:
> >
> > { meter_type: TTT,
> >   resource_id: RRR,
> >   metadata: [{ version: VVVV,
> >                timestamp: TIME1,
> >                payload: PAYLOAD1 },
> >              { version: VVVV,
> >                timestamp: TIME3,
> >                payload: PAYLOAD2 }]
> > }
> >
> > With each payload (PAYLOAD1, PAYLOAD2) being the metadata, whose
> > content depends on the type of the resource. And the metadata array
> > being an ordered list of the successive states of the resource over
> > time. The VVVV version accounts for changes in the format of the
> > payload.
> >
> > The query would be:
> >
> > GET /resource/<meter_type>/<resource_id>/<TIME2>
> >
> > and it would return PAYLOAD1 if TIME2 is in the range [TIME1,TIME3[
> >
> > I'm not sure why you think "timestamp is brittle". Maybe I'm
> > missing something.
>
> Each set of metering data will need to be associated with the
> appropriate metadata from the resource at the time the metering
> information was collected. The rates of change of metadata and
> metering events are different, though, so the timestamps of the
> metadata records are unlikely to match exactly with the values in the
> metering records. Depending on the clock resolution, it would be
> possible to have metadata changes and meter data with the same
> timestamp, resulting in an incorrect association.

Indeed, good point.

> We can work around that by maintaining proper foreign key references
> using the metadata version field as you describe in the schema above
> (so the resource id and metadata version value point to the correct
> metadata record). It will make recording the metering data less
> efficient because we will need to determine the current version for
> the resource metadata, but we can optimize that eventually through
> indexes and caching.
>
> Aggregation will also need to take the metadata version into account,
> so everywhere in the list of queries we say "by resource_id" we need
> to change that to "by resource_id and version".

I added the idea of a format version for when the payload format
changes and tried to write down a description of the metadata storage
matching this thread in the wiki.
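
To make the two lookups discussed above concrete, here is a minimal
Python sketch, not an actual implementation of the metering system. It
keeps the history log in memory and shows both the time-range query
(PAYLOAD1 for any TIME2 in [TIME1,TIME3[) and the foreign-key style
association Doug suggests. The names ResourceHistory, MetadataRecord,
revision, format_version and metadata_revision are my own assumptions,
as are the "instance" meter type and the flavor payloads; none of them
come from the wiki.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class MetadataRecord:
    revision: int        # hypothetical per-resource counter, usable as a foreign key
    format_version: int  # version of the payload format (VVVV in the schema)
    timestamp: float     # time at which this state of the resource began
    payload: dict        # resource-type dependent metadata


@dataclass
class ResourceHistory:
    meter_type: str
    resource_id: str
    records: list = field(default_factory=list)  # ordered by timestamp

    def update(self, timestamp, payload, format_version=1):
        """Append a new state; assumes updates arrive in time order."""
        record = MetadataRecord(len(self.records) + 1, format_version,
                                timestamp, payload)
        self.records.append(record)
        return record

    def at(self, timestamp):
        """Return the record whose half-open range [start, next start[
        contains timestamp, or None if it predates the first record."""
        starts = [r.timestamp for r in self.records]
        index = bisect_right(starts, timestamp) - 1
        return self.records[index] if index >= 0 else None

    def current_revision(self):
        """Revision of the latest known state, or None if there is none."""
        return self.records[-1].revision if self.records else None


# Illustrative data only: TIME1 = 100, TIME3 = 300.
history = ResourceHistory("instance", "RRR")
history.update(100.0, {"flavor": "m1.small"})  # PAYLOAD1
history.update(300.0, {"flavor": "m1.large"})  # PAYLOAD2

# Time-based query: TIME2 = 200 falls in [TIME1, TIME3[.
payload_at_time2 = history.at(200.0).payload

# Version-based association: the sample stores the revision that was
# current when it was recorded instead of relying on timestamps.
sample = {
    "resource_id": history.resource_id,
    "counter_volume": 1,
    "metadata_revision": history.current_revision(),
}
```

With the revision stored in each sample, aggregation can group "by
resource_id and metadata_revision", and a metadata change that shares a
timestamp with a sample no longer leads to an ambiguous match. The cost
is the extra lookup of the current revision at recording time.

The wiki diff mentioned above: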
http://wiki.openstack.org/EfficientMetering?action=diff&rev2=80&rev1=78

What do you think?

-- 
Loïc Dachary         Chief Research Officer
// eNovance labs   http://labs.enovance.com
// ✉ l...@enovance.com  ☎ +33 1 49 70 99 82
_______________________________________________
Mailing list: https://launchpad.net/~openstack
Post to     : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp