On 10/9/07, Gokulakannan Somasundaram <[EMAIL PROTECTED]> wrote: > > > > On 10/8/07, Heikki Linnakangas <[EMAIL PROTECTED]> wrote: > > > > Gokulakannan Somasundaram wrote: > > > I am always slightly late in understanding things. Let me > > try > > > to understand the use of DSM. It is a bitmap index on whether all the > > tuples > > > in a particular block is visible to all the backends, whether a > > particular > > > block contains tuples which are invisible to everyone. But i think > > this will > > > get subjected to the same limitations of Bitmap index. Even Oracle > > suggests > > > the use of Bitmap index for only data warehousing tables, where the > > Bitmap > > > indexes will be dropped and recreated after every bulk load. This is > > not a > > > viable alternative for OLTP transactions. > > > > Well, it's not quite the same as a bitmap index, though both use a > > bitmap. You didn't quite get into details on what the limitations are > > and why it wouldn't be suitable for OLTP, but I don't see any > > significant problems. > > > > > But i think i am late in the game > > > as i haven't participated in those discussions > > > > Better late than never :). > > > > > One Bitmap index block usually maps to lot of blocks in the > > heap. > > > So locking of one page to update the DSM for update/delete/insert > > would hit > > > the concurrency. But again all these are my observation w.r.t oracle > > bitmap > > > indexes. May be i am missing something in DSM. > > > > Yeah, the DSM page could become a contention bottleneck. My current > > thinking is that we'd have a flag in the heap page header, that would be > > > > set together with the bit in the DSM. When the flag in the page header > > is set, you don't need to lock and update the DSM because you know the > > bit is already set. Vacuum would have to clear both the DSM bit and the > > flag. > > > It matters to us, where the index scan will goto. If the Index Scan is > going to touch DSM for understanding visibility(This might degrade the > performance of some of the index scans, if they have to wait to acquire the > share lock, and learn that they have to goto the heap to understand their > visibility requirements.) In the mean while, if the vacuum, > inserts/updates/deletes are holding the BUFFER_EXCLUSIVE lock on that, this > would hurt the Select transactions. Since there is only one bit per block in > the DSM(best case), there might be one DSM block per 8000 table blocks. All > the transactions which are accessing the 8000 blocks will be waiting on this > one DSM block. If we are going to update the Heap page header and asking > the Indexscan to refer to that, then there is no reduction in random I/Os. > Can't we say that if the snapshot info is embedded with index, we can avoid > all these difficulties? Most importantly it won't affect the performance of > current postgres in any way. > > > Let's take up Retail Vacuuming again. The User defined > > function > > > which would return different values at different time can be > > classified as > > > non-deterministic functions. We can say that this index cannot be > > created > > > on a non-deterministic function. This is the way it is implemented in > > > Oracle. What they have done is they have classified certain built-in > > > operators and functions as deterministic. Similarly they have > > classified a > > > few as non-deterministic operators and functions. Can we follow a > > similar > > > approach? > > > > We already do. A function must be marked as IMMUTABLE in order to use it > > in an index expression. But we can't enforce that the user defined > > function really behaves like an immutable function should. If someone > > creates a user-defined function in C that calls the C random() function, > > we can't stop it. > > > A function is said to be deterministic, if it returns the same value, > irrespective of how many times, it is invoked. I think this definition > clearly puts the random function under the non-deterministic category. If we > have such a classification, do you think we can resolve this issue? >
If we frame a set of guidelines/test procedure, do you think it might solve the issue? Even, if we don't allow this type of indexing to anything other than built-in deterministic functions, i feel it would serve most of the indexing requirements. As I said earlier, using an index like that will of course lead to bogus > > results. But it won't currently cause any server crashes or more serious > > > > corruption. > > > > One more final word on unique indexes. Whenever we are doing an update, > there will be insertions into the unique indexes which will trigger table > lookups. Ofcourse there is more probability, that the table block would be > in memory(un-pinned). Still contention for a shared resource is avoided, if > the snapshot info is stored with the indexes. > > Let me get one more clarification, what would be type of performance > results with this implementation, that would encourage the hackers community > to accept the extra maintenance overhead. > > Thanks, > Gokul. > >