On Wed, 19 Dec 2018 at 04:24, Tom Lane <t...@sss.pgh.pa.us> wrote: > > David Rowley <david.row...@2ndquadrant.com> writes: > > I recently stumbled upon the following code in ginfast.c: > > while (collector->ntuples + nentries > collector->lentuples) > > { > > collector->lentuples *= 2; > > collector->tuples = (IndexTuple *) repalloc(collector->tuples, > > sizeof(IndexTuple) * collector->lentuples); > > } > > I agree that's pretty stupid, but I wonder whether we should also try > harder in the initial-allocation path. Right now, if the initial > pass through this code sees say 3 tuples to insert, it will then work > with 3 * 2^n entries in subsequent enlargements, which is likely to > be pretty inefficient. Should we try to force the initial allocation > to be a power of 2?
Yeah, I think that's a good idea. > Also, do we need to worry about integer overflow? Good point. I didn't think of it before, but the previous code would have ensured that we got an ERROR before any overflow could have occurred as the repalloc() would fail once the allocation request went beyond MaxAllocSize. However, I think an overflow is sufficiently unlikely that we don't need to do any batching, but we should keep it an error and not a crash. I propose the attached. If we hit the overflow case, we'll still attempt the repalloc(), but it'll fail, just as it would have before, just without the needless calls. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
fix_gin_fast_insert_v2.patch
Description: Binary data