Re: [Lazarus] Sorting BufferDataset

Joost van der Sluis via lazarus Fri, 17 Apr 2020 04:26:23 -0700

On 13-04-2020 at 13:59, Santiago A. via lazarus wrote:

No need to create indexes or drop indexes or change the fields of an index. Just leave IndexName blank and play with IndexFieldNames (that in fact changes the fields of index ''). And if you set IndexFieldNames to blank it restores the original table order.
That makes TBufDataset a useful component, not a nightmare. It desperately needs documentation.


Roflol. I'm happy to hear that.

And, yes, it desperately needs documentation.

How can I help with doc?

There is the official-style documentation, you can retrieve it from svn here: https://svn.freepascal.org/svn/fpcdocs/trunk

You can send patches to the bug-tracker. Normally a class is only included in the official documentation once it is fully documented. Questions can go to the fpc-devel mailing list. And Lazarus has a plugin to edit these kinds of documentation files. (Look for fpdoc, I don't know the exact name of the Lazarus plugin.)

But if I were you, I would start in the Wiki, because people need background information regarding the TBufDataset.

I've written this before, but I think it is hard to find, so here again. Some background information, which explains the problems you had.

You can freely modify, edit, improve it. And maybe start some documentation with it. Or maybe decide that it is too much of a background story. That's also fine.

The indices of TBufDataset are not just lists with references to the data. They contain the data itself! There are multiple descendants of TBufIndex, and each of them has its own way of storing data.

This is because I never could decide upon how to store the data: in a linked list of records, or an array (one contiguous block) of records. Or any variants thereof.

In the end, only the TDoubleLinkedBufIndex made it into production. (Nowadays there is a unidirectional variant, but this is irrelevant for now.)

When a TBufDataset stores its data through TDoubleLinkedBufIndex, each record in memory starts with a header. This header has a link to the prior and next records, *for all different indexes*. So when there are two indices, the default one and one spare, for each and every record you know what the prior (and next) record is in both indexes. (When you look at the code: the header consists of a list of TBufRecLinkItem records.)
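To make the header idea concrete, here is a toy model in Python. It is not the FPC code: the names LinkItem, Record, thread and walk are made up for illustration, and only the idea of one prior/next pair per index in front of each record is taken from the description above.

```python
class LinkItem:
    """Prior/next pointers for ONE index (loosely models TBufRecLinkItem)."""
    def __init__(self):
        self.prior = None
        self.next = None


class Record:
    """A record whose header holds one LinkItem per index slot."""
    def __init__(self, data, index_slots):
        self.links = [LinkItem() for _ in range(index_slots)]  # the header
        self.data = data                                       # the payload


def thread(records, slot, key=None):
    """(Re)link the records for index `slot`; return the first record.

    With key=None the records keep their given (insertion) order.
    """
    ordered = sorted(records, key=key) if key else list(records)
    for rec in ordered:
        rec.links[slot].prior = None
        rec.links[slot].next = None
    for a, b in zip(ordered, ordered[1:]):
        a.links[slot].next = b
        b.links[slot].prior = a
    return ordered[0] if ordered else None


def walk(first, slot):
    """Follow the `next` links of one index and collect the names."""
    names = []
    rec = first
    while rec is not None:
        names.append(rec.data["name"])
        rec = rec.links[slot].next
    return names


# Two slots: 0 = default index, 1 = the spare (hidden) index.
rows = [Record({"name": n}, index_slots=2) for n in ("Carol", "Alice", "Bob")]
default_first = thread(rows, slot=0)
sorted_first = thread(rows, slot=1, key=lambda r: r.data["name"])

print(walk(default_first, 0))  # ['Carol', 'Alice', 'Bob']
print(walk(sorted_first, 1))   # ['Alice', 'Bob', 'Carol']
```

The records themselves never move. Only the links in slot 1 are rewritten, which is roughly what happens to the hidden index when the sort fields change.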

The spare (hidden) index is there to make it easy to order the dataset based on a list of fieldnames, as you already discovered.

It is impossible to add a new index once the data is stored in memory, because this would mean that the memory of all records has to be reallocated with space for one more index. While this might sound doable, it is not in practice, because all kinds of mechanisms demand that the dataset data in memory stays at the same position (pointer).

So for each index, memory the size of two pointers (prior and next) has to be allocated for every record.

This is why you have to provide the number of indexes you want to use before the data gets read. (+2, one for the main index, and one hidden.)
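As a rough back-of-the-envelope calculation of that overhead, here is a small Python sketch. The function name and the 8-byte pointer size are assumptions for illustration, and it ignores alignment and any other bookkeeping the real header may carry.

```python
def link_header_bytes(user_indexes, pointer_size=8):
    """Estimated link-header size per record, in bytes.

    user_indexes: how many indexes of your own you plan to use
                  (hypothetical parameter, not a TBufDataset property).
    Two extra slots are added: the main index and the hidden one.
    Each slot costs two pointers (prior and next).
    """
    slots = user_indexes + 2
    return slots * 2 * pointer_size


print(link_header_bytes(0))  # 32
print(link_header_bytes(3))  # 80
```

So under these assumptions a dataset with three extra indexes and a million records would spend about 80 MB on links alone, which is why the slots are reserved up front rather than grown later.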


The hidden index is rebuilt each time IndexFieldNames is changed.

Note that the double-linked buffers have advantages, but also some disadvantages. Adding and even inserting a record is easy. Finding where to insert a record in a sorted index is not. You always have to loop through the records to find one particular record. This also holds for the lookup functions of the TBufDataset.
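The trade-off can be shown with a toy doubly linked chain in Python. This is only a sketch of the general technique, not of TBufDataset's code, and Node, build_chain, insert_sorted and keys_of are hypothetical names.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.prior = None
        self.next = None


def build_chain(keys):
    """Link the keys in the given order; return the head node."""
    head = tail = None
    for k in keys:
        node = Node(k)
        if tail is None:
            head = node
        else:
            tail.next = node
            node.prior = tail
        tail = node
    return head


def insert_sorted(head, key):
    """Insert `key` into an ascending chain.

    Returns (new_head, steps), where steps counts the nodes that had to
    be visited to find the position. The relinking itself is cheap; the
    search is what is linear.
    """
    node = Node(key)
    steps = 0
    prev, cur = None, head
    while cur is not None and cur.key <= key:
        prev, cur = cur, cur.next
        steps += 1
    node.prior, node.next = prev, cur
    if prev is not None:
        prev.next = node
    if cur is not None:
        cur.prior = node
    return (node if prev is None else head), steps


def keys_of(head):
    out = []
    while head is not None:
        out.append(head.key)
        head = head.next
    return out


head = build_chain([10, 20, 30, 40, 50])
head, steps = insert_sorted(head, 45)
print(keys_of(head), steps)  # [10, 20, 30, 40, 45, 50] 4
```

With an array of records the position could be found by binary search, but then the insert itself would have to shift data around. The linked list makes the opposite choice.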

And there is the issue of the RecNo property. If you want to know what the current record number is, TBufDataset has to start with the first record in the index and loop through all records until it reaches the current record. Maybe the current record number is cached nowadays. I do know that the record count is cached. But what is important to remember: do *not* use the RecordCount property of a TBufDataset. It's Evil. Do not make your logic depend on it.
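Why the record number is expensive in a linked structure can be sketched like this in Python. Again a toy model with made-up names (Node, rec_no), not the actual implementation, and it assumes no caching at all.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.next = None


def rec_no(head, current):
    """1-based position of `current`, found by walking from the head.

    The cost grows with the position: asking for the number of the
    last record visits every record in the chain.
    """
    node, n = head, 1
    while node is not None:
        if node is current:
            return n
        node = node.next
        n += 1
    raise ValueError("current record is not part of this chain")


# Build a chain of five nodes and ask for the position of the third.
nodes = [Node(k) for k in "abcde"]
for a, b in zip(nodes, nodes[1:]):
    a.next = b

print(rec_no(nodes[0], nodes[2]))  # 3
```

Calling something like this inside a loop over all records turns a linear pass into a quadratic one, which is the kind of logic the warning above is about.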


So, that's it for now. I hope some of you found it interesting.

(Oh, and note that TDataset itself also buffers the data, so a TBufDataset has *two* buffers, holding the same data. The TDataset buffer is fairly small, though. The default is 10 records, iirc. But when you attach a TDataSource to a TDataset, the TDataSource tells the TDataset how many records it needs at least. So when one of the TDataSources needs 200 records, TDataset will cache 200 records. This could be used by a grid view, for example, which shows 200 records at a time. Also note that in a unidirectional dataset, this buffer built into TDataset itself is not unidirectional. That's why you can see data in a data grid bound to a unidirectional TDataset.)


Regards,

Joost.
--
_______________________________________________
lazarus mailing list
lazarus@lists.lazarus-ide.org
https://lists.lazarus-ide.org/listinfo/lazarus
