On Mar 16, 2011 5:34 PM, "C. Michael Pilato" <cmpil...@collab.net> wrote:
>
> On 03/16/2011 01:17 PM, Greg Stein wrote:
> > On Wed, Mar 16, 2011 at 12:59, C. Michael Pilato <cmpil...@collab.net> wrote:
> >> ...
> >> to manage at least the "read" subset of these operations. But I find myself
> >> wondering if we wouldn't be better served by having a properties table with
> >> rows for, I dunno: wc_id, local_relpath, property_name, property_value.
> >> ...
> >> Was this considered when we moved the properties into the database? If so,
> >> why didn't we take this approach? Should we consider it now? Should we
> >> punt it to 1.8?
> >
> > It was considered. Hyrum and I figured it would be best to use a skel
> > and avoid a join. We assumed it is the rare case that we need a single
> > property, rather than some/all of the properties.
> >
> > If you want to experiment with another table and a JOIN, then I would
> > recommend waiting until 1.8 to do that. If we find that properties in
> > their current form are killing us, then we can discuss further.
> >
> > My understanding is that # queries is our concern at the moment,
> > rather than skel-unpacking.
> >
> > Cheers,
> > -g
>
> Thanks for the background, Greg.
>
> It's definitely number-of-queries that I'm thinking about here, too.
>
> I'm *not* concerned about the pure cost of mere skel-unpacking. It's more
> that because properties aren't first-class citizens in the schema, we have
> to trade what could be a single statement:
>
>    "Go add/change the prop/val pair FOO=BAR on every path at or under
>     TARGET"

I think we want to optimize for reads over writes. And so I think avoiding a
join will be better.

>
> into a one-at-a-time, many-statements approach:
>
>    "for PATH1, read its properties skel, parse the skel, set FOO=BAR in
>     its propset, re-skel-ify, and update the skel; now do that for PATH2,
>     whose resulting skel won't necessarily be the same as PATH1's; now do it
>     for PATH3..."
>
> Even when just reading properties, our best option is to read the whole
> property set for chunks of files/dirs, and then immediately throw out all the
> properties we don't care about. With a purer bit of relation in the schema
> for properties, these queries get simpler and waste less intermediate memory.

The average number of properties per node is small. I wouldn't worry about
memory.

If you find a problem, then I think we can fix it in 1.8. If you find
something horrendous, then maybe 1.7. But do we have any indication of a
problem here?

>
> -- C-Mike
>
> (PS: Happy birthday!)

Thanks! :-)

Cheers,
-g

>
> --
> C. Michael Pilato <cmpil...@collab.net>
> CollabNet <> www.collab.net <> Distributed Development On Demand
>
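
For anyone following the thread, here is a minimal sketch of the "single statement" idea, using Python's sqlite3 module. Only the four column names (wc_id, local_relpath, property_name, property_value) come from C-Mike's proposal. The table name node_props, the stripped-down nodes table, and both helper functions are hypothetical and are not the real wc.db schema or any Subversion API. It only illustrates the query shape under those assumptions and says nothing about which design performs better.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical, simplified schema."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        -- Assumption: a minimal stand-in for the working copy's node table.
        CREATE TABLE nodes (
            wc_id         INTEGER NOT NULL,
            local_relpath TEXT    NOT NULL,
            PRIMARY KEY (wc_id, local_relpath)
        );
        -- Assumption: one row per (node, property), as floated in the thread.
        CREATE TABLE node_props (
            wc_id          INTEGER NOT NULL,
            local_relpath  TEXT    NOT NULL,
            property_name  TEXT    NOT NULL,
            property_value BLOB,
            PRIMARY KEY (wc_id, local_relpath, property_name)
        );
    """)
    return db


def set_prop_recursive(db, wc_id, target, name, value):
    """Set name=value on every path at or under target in one statement.

    Caveat: LIKE wildcards in target are not escaped, and SQLite's LIKE is
    case-insensitive for ASCII by default, so this is a sketch only.
    """
    db.execute(
        """INSERT OR REPLACE INTO node_props
               (wc_id, local_relpath, property_name, property_value)
           SELECT wc_id, local_relpath, ?, ?
             FROM nodes
            WHERE wc_id = ?
              AND (local_relpath = ? OR local_relpath LIKE ? || '/%')""",
        (name, value, wc_id, target, target),
    )


def get_prop_recursive(db, wc_id, target, name):
    """Read a single property for a subtree without fetching the others."""
    rows = db.execute(
        """SELECT local_relpath, property_value
             FROM node_props
            WHERE wc_id = ? AND property_name = ?
              AND (local_relpath = ? OR local_relpath LIKE ? || '/%')
            ORDER BY local_relpath""",
        (wc_id, name, target, target),
    )
    return rows.fetchall()
```

With the skel representation discussed above, the same recursive set would instead be a per-path loop of read, parse, modify, re-serialize, and update. On the other hand, fetching all properties for a node from a table like this costs a multi-row query or a join, which is the read-side concern Greg raises.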