On Thu, Jul 21, 2005 at 08:13:22PM +0100, Neil Williams wrote: > On Thursday 21 July 2005 12:57 am, Chris Shoemaker wrote: > > Ok, this was my point. I completely understand that you can get a > > very quick boolean answer to the question "has anything in the book > > changed?" by checking each collection's dirty flag. But think about > > *how* you'd have to create a list of all dirty entities for the case > > where the task is to commit just the 1 split that changed in my > > example. > > I had been - and it could be solved but I'd have to formalise the idea first. > I'm not sure what is the real-world use for such an API. I can see it as a > fallback for a failed write but that isn't particularly common. I can see it > for incremental storage systems but we don't use those yet. (SQL aside as > that can do this via a separate mechanism).
If by "incremental storage system" you mean something that commits only what has changed, then we're on the same page. (Incidentally, even "immediate-commit" systems sometimes fallback to "delayed-commit" systems when they're in "offline" mode.) > > > ISTM (It seems to me) there are 3 options: 1) You can't do > > that; you must commit all 100000 Splits. 2) You can do that just > > fine, but you must do a linear search through 100000 Splits to find > > the 1 that changed. or 3) You start at the dirty book, and perform > > the tree search I described before. > > Derek's point stands: The book knows nothing about the tree. There is no tree > within the book, it only exists in *our* conceptualisation of the > relationships between objects. All the book knows about are collections and > collections are not linked to each other - only objects link to other > objects. > > Now it *could* be possible for the collection to keep a GList of changed > entities in it's own collection. The question is, is it worth doing? I think not. But this isn't what is required to be able to perform incremental updates with out the linear searches. As you say, it's predictive, not retrospective. > > Keep in mind that all existing mechanisms are retrospective - not much is > done > until the question is asked. Storing a GList of modified entities would have > to be predictive: whether you need it or not, it would be being done. This > isn't just storing a single boolean value that covers tens of thousands of > entities, the GList would store each modified entity and could get incredibly > long in some cases. It may only be storing a pointer to the entity or maybe > it's GUID (as the type is already determined by the collection), but that > will mount up. It is conceivable with the SQL query dialog that I've got > planned for after G2, that the user could update every single instance of one > object type in one operation. > > > The time cost difference between > > 2) and 3) can be arbitrarily large. > > I think it would be too large to inflict on all users at all times for the > odd > occasion that it might be useful. I think you may misunderstand. Both the linear search and the tree search are retrospective, and the cost of the linear search for dirty instances of all types will *always* be equal to or greater than the tree search, and usually (in the cases where not everything is dirty) it will be MUCH greater. Proof: To find all the dirty instances of one type with a linear search where at least one instance is dirty in a collection by type, you must check every instance in the collection. With a tree search you need not check any instance whose referent hasn't been marked as "containing something dirty". > > > I can see that QSF only needs to handle lists of uniformly typed > > entities. However, if there's no way to ask "are there dirty > > Transactions in this Account" > > The Account is marked dirty but the entities responsible are not currently > identifiable. > > > , then *every* selection of a subset of > > Splits for commiting will require a linear search through *all* > > Splits. Does that seem like a problem to you? > > It would if I could see a need to identify only these entities. Right, well, obviously, if you don't mind committing 100000 splits when only one has changed, then the cost of finding the one doesn't really matter. > Currently, I can only see this as a solution in search of a problem. Maybe you're right, but let me play devil's advocate: I don't know the current state of the backends, but imagine this scenario: Backend is remote server, and connection to server goes down. What happens? One option is that GC prevents the user from continuing to edit the data on the screen. Option two is that GC alerts the user that the connection went down and that changes will be committed to the server when the connection comes back, if ever. Let's say we want option two. The user adds/changes some splits and the connection comes back so we want to commit what has changed. But how? Several options: 1) We cached the changes as they were made (as you describe in your "predictive" method.) We just clear the cache. 2) We just send the entire Split collection to the backend and let it figure out what changed. 3) We do a linear search through the Split collection to find the few changes and commit those. 4) We do a tree search that finds that only one Account is marked as "contains dirty Splits" so our linear search through Splits is only through that Account's Splits instead of all Splits. We find the changes and commit them. Any of those options would work. But if this is something that happens often, 2) and 3) will probably be unacceptably expensive. Maybe GC will never have to address this issue because it will never support an "offline" mode with a remote backend. If it does, 4) will be easy to implement as long as instances store a reference to their "parent", like Split does. The implementation is simply to do the same thing to the parent's "contains something dirty" flag as you currently want to do to the Collections "dirty" flag. -chris _______________________________________________ gnucash-devel mailing list gnucash-devel@gnucash.org https://lists.gnucash.org/mailman/listinfo/gnucash-devel