I found another off-by-one error in my Picksplit() algorithm and the GiST index 
contains one leaf tuple for each row in the table now.  The error was to start 
from 1 instead of 0 when assigning the entries.  Thanks to everyone for your 
help.

For the record, this is the only GiST index I know of where the keys are over 
2000 bytes in size.  So GiST definitely handles large keys.  Perhaps the 
maximum size for intarray could be increased.

On Nov 23, 2010, at 4:01 PM, Yeb Havinga wrote:

> On 2010-11-23 20:54, Peter Tanski wrote:
>> On Nov 23, 2010, at 1:37 PM, Yeb Havinga wrote:
>>>>>> j = 0;
>>>>>> for (i = FirstOffsetNumber; i<  maxoff; i = OffsetNumberNext(i)) {
>>>>>>   FPrint* v = deserialize_fprint(entv[i].key);
>>>>> Isn't this off by one?  Offset numbers are 1-based, so the maxoff
>>>>> computation is wrong.
>>> The first for loop of all others compare with i<= maxoff instead of i<  
>>> maxoff.
>> You are right: I am missing the last one, there.  (During a memory-debugging 
>> phase entv[entryvec-n - 1] was always invalid, probably as a memory 
>> overwrite error but I fixed that later and never changed it back.)
>> 
>> On the other hand, there are two problems:
>> 
>> 1. the maximum size on a GiST page is 4240 bytes, so I cannot add a 
>> full-size Datum using this kind of hash-key setup (the base Datum size is 
>> 4230 bytes on a 64-bit machine).  The example test cases I used were smaller 
>> in order to get around that issue: they are 2326 bytes base size.
>> 
>> 2. Even after fixing the Picksplit() loop, the dropped-leaf problem still 
>> manifests itself:
> I noticed an n_entries intialization in one of your earlier mails that might 
> also be a source of trouble. I was under the impression that gistentryvectors 
> have n-1 entries (not n-2 as you say), because the first element (0 / 
> InvalidOffsetNumber) must be skipped. E.g. entryvec->n = 5. This means that 
> there are 4 entries, which are in array positions 1,2,3,4.
> 
> btw: interesting topic, audio fingerprinting!
> 
> regards,
> Yeb Havinga
> 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to