On Thu, 10 Aug 2000 05:03:38 +0200, Bart Lateur wrote:

[description of a mechanism for storing sparse arrays:]

>Imagine that it will be traversed based upon the groups of bits in
>the array index. Say, with 32 bit indices, subdivided into 4 bytes.
>You can start with the lower byte, which can give you one of 256
>pointers to the next table -- or null (zero), if that segment is
>completely empty and the next table nonexistent. Repeat with the
>second byte, get another table pointer, or null. Repeat and follow
>the pointer, at most 4 times in total. In the last pointer table, a
>nonzero value points to the thing itself.

Strike that. You can start with the *higher* byte instead of the
lower, and then this mechanism could even be used to store ordinary
one-dimensional arrays, if you must.

(I am *not* familiar with the current system behind arrays; reading
the source to extract the idea doesn't sound smart. If somebody could
point me to an explanation of how it works, I would be much obliged.)

I thought that starting with the lower byte would nicely split all
the indices up into groups. It does; but there's absolutely no
advantage in it. You always end up with a lot of tables, and the
number of steps is the same either way. The only advantage would be
ease of implementation: a bitwise AND gets you the lower byte, and a
shift to the right gets you to the next chunk.

I think the common case is that used array slots come in chunks: for
example, indices 10000 through 10100 are in use, and the rest is
empty. If you start with the higher byte, there will be far more NULL
pointers, i.e. no 1K pointer block attached. For more randomly
distributed indices, the pointer blocks would be largely empty either
way; the situation would be much the same, as far as I can gather.

But: the number of steps need not be fixed! Say the array descriptor
contains a field holding the number of steps: for example, 3 if the
largest index is above 65535 but below 16 million (2^24). (It is
simply the number of bytes that largest index fits in.) You can then
traverse the tree with a simple loop.

Say you start with an ordinary array of 250 items. That fits into one
byte, so you need just one step, and one 1K block to hold the
pointers to the array items. It's almost as fast as it can be: one
indirection and you've got your pointer to the scalar value.

Now it turns out that you need an array index of 300, beyond the 255
limit. What do you do? You allocate another 1K block, clear it, and
put a pointer to the original block in slot #0. Increment the steps
field to 2, and set the tree root pointer to the address of this new
block. Tada! You can now simply allocate and clear a new 1K block for
the second page, array indices 256 to 511, and put its pointer in
slot #1 of the root block. All in all, you need just three 1K blocks
to hold the whole tree, and extending it is a simple algorithm.
That's why I said it can be used for ordinary arrays as well.

Problems? Yes. First of all, doing a splice isn't as simple as it
might have been without this tree structure: you'd need lots of
copying of chunks of pointers between the 1K blocks, a few per block.
Secondly, subdividing the index into bit groups isn't as easy (or as
fast) as before. A simple "mask and shift" won't do; a "roll and
mask" would, but that's assembler: C has no rotate operator (shift
left, with the top byte of the register moving back into the low
byte), AFAIK.
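For concreteness, here's a minimal sketch in C of that lookup loop.
The descriptor layout and all names are made up for illustration;
this is not the existing array code.

#include <stddef.h>

/* Each pointer block holds 256 entries -- 1K on a 32-bit machine. */
typedef struct {
    void **root;   /* top pointer block, or NULL for an empty array */
    int    steps;  /* tree depth: number of index bytes in use      */
} sparse_array;

/* Walk the tree high byte first: one indirection per step.
 * Returns the stored pointer, or NULL if the slot is empty. */
void *sparse_get(const sparse_array *a, unsigned long idx)
{
    void **block = a->root;
    int shift = (a->steps - 1) * 8;
    int s;

    for (s = 0; s < a->steps; s++) {
        if (block == NULL)
            return NULL;   /* completely empty segment */
        block = (void **)block[(idx >> shift) & 0xFF];
        shift -= 8;
    }
    return (void *)block;  /* last table points to the thing itself */
}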
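Extending the tree, as in the index-300 example above, is then just a
matter of pushing a new root on top of the old one. A sketch under
the same made-up layout:

#include <stdlib.h>

/* Grow the tree by one level when a new index needs an extra byte:
 * the old root becomes slot #0 of a fresh, cleared block, so every
 * existing entry keeps its position (its new high byte is zero).
 * Returns 0 on success, -1 on allocation failure. */
int sparse_grow(sparse_array *a)
{
    void **new_root = calloc(256, sizeof(void *));
    if (new_root == NULL)
        return -1;
    new_root[0] = a->root;  /* whole old tree hangs off slot #0 */
    a->root = new_root;
    a->steps++;
    return 0;
}

For the index-300 case: call sparse_grow() once (steps goes from 1 to
2), then calloc one more cleared block for indices 256 to 511 and put
its pointer in slot #1 of the root -- the three 1K blocks from the
example.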
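As for the "roll and mask": C indeed has no rotate operator, but the
usual shift-and-or idiom expresses it, and compilers generally
recognize it and emit a single rotate instruction. Assuming 32-bit
indices:

/* Rotate left by 8 bits.  The final mask keeps this correct on
 * machines where unsigned long is wider than 32 bits. */
unsigned long roll8(unsigned long x)
{
    return ((x << 8) | (x >> 24)) & 0xFFFFFFFFUL;
}

/* Per step: idx = roll8(idx); byte = idx & 0xFF;
 * -- visits the bytes high to low, like the lookup loop above. */

--
Bart.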