On 12.03.2011 01:11, Mark Phippard wrote:
> I am glad you sent this because I was getting ready to send an email
> to see if anyone is looking into the suggestions you have made here.
> I think we have to get this work done soon. We cannot release with
> performance like it is. How do we define the scope of the work that
> needs to be done so that we can divide and conquer and get these
> changes in place?

From the various performance-related threads, I can extract two major
areas of improvement:

  * transactional writing of changes to the wc-db (especially during
    checkout/update), where database updates are currently done via
    many tiny transactions.
  * reducing the number of queries performed during wc read/scan
    operations.

The first task doesn't look at all trivial from where I'm standing. I
admit to being blissfully ignorant of how checkout/update twiddle the
wc-db.

For the second task, I think the first order of business is to change
the wc-db tree crawler to do one query instead of zillions, or at
least, where several queries are required, to do them all in one
transaction. The first step requires "someone" to catalog all the
places where it's used, and how the results are being filtered by the
callers; based on that, we can come up with a reasonably small set of
parameters that would make that crawling query useful for all the
myriad uses the crawler currently has. This part is more or less
linear; afterwards, callers can probably be updated as needed.

Then there's the NODES/ACTUAL_NODE merge. That one is a bit tricky, but
I think I can see a way to create (and use) such a merged table whilst
still keeping the current separate ones in use. The problem is that
doing this would essentially double the size of the wc-db, *but* it
would allow a gradual transition to the merged table, and the
transition would not have to be complete before the 1.7 release.

(There's also a "proof-of-concept" path that requires creating a view
that simulates such a merged table, but it has two serious drawbacks:
(1) a view is read-only, so data modifications could only happen
through the original tables; and (2) the query for such a view would be
prohibitively slow so there'd be no way to keep track of possible
performance improvements.)

-- Brane
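
To illustrate the two areas of improvement listed above, here is a minimal sketch using Python's sqlite3 module. It is not Subversion code: the demo_nodes table, its columns, the sample rows, and all function names are hypothetical, and the real wc-db schema and crawler are considerably more involved. It only contrasts one transaction per row with one transaction per batch, and one query per directory with a single query per subtree.

```python
import sqlite3

# Hypothetical sample rows: (local_relpath, parent_relpath, kind).
ROWS = [
    ("trunk", "", "dir"),
    ("trunk/a.c", "trunk", "file"),
    ("trunk/lib", "trunk", "dir"),
    ("trunk/lib/b.c", "trunk/lib", "file"),
    ("trunk2", "", "dir"),
]


def open_db():
    # isolation_level=None: the module never opens transactions implicitly,
    # so every BEGIN/COMMIT below is explicit.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE demo_nodes ("
        " local_relpath TEXT PRIMARY KEY,"
        " parent_relpath TEXT,"
        " kind TEXT)"
    )
    return conn


def write_tiny_transactions(conn, rows):
    # The pattern the mail complains about: one transaction per row.
    for row in rows:
        conn.execute("BEGIN")
        conn.execute("INSERT INTO demo_nodes VALUES (?, ?, ?)", row)
        conn.execute("COMMIT")


def write_one_transaction(conn, rows):
    # The same writes grouped into a single transaction; a failure
    # rolls back the whole batch.
    conn.execute("BEGIN")
    try:
        conn.executemany("INSERT INTO demo_nodes VALUES (?, ?, ?)", rows)
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise


def crawl_per_node(conn, root):
    # One query per visited node; returns the paths and the query count.
    found, pending, queries = [], [root], 0
    while pending:
        parent = pending.pop()
        found.append(parent)
        queries += 1
        children = conn.execute(
            "SELECT local_relpath FROM demo_nodes WHERE parent_relpath = ?",
            (parent,),
        ).fetchall()
        pending.extend(r[0] for r in children)
    return sorted(found), queries


def crawl_one_query(conn, root):
    # The whole subtree in one query: the root itself plus every path
    # that starts with "<root>/".
    rows = conn.execute(
        "SELECT local_relpath FROM demo_nodes"
        " WHERE local_relpath = :root"
        "    OR substr(local_relpath, 1, length(:root) + 1) = (:root || '/')"
        " ORDER BY local_relpath",
        {"root": root},
    ).fetchall()
    return [r[0] for r in rows]
```

Both write functions leave the table in the same state; the difference is the number of commits. Likewise both crawlers return the same paths, but the per-node version issues one query for every node it visits.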