On Tue, Jul 30, 2019 at 8:20 PM Binguo Bao <djydew...@gmail.com> wrote: > > John Naylor <john.nay...@2ndquadrant.com> 于2019年7月29日周一 上午11:49写道: >> >> 1). For every needle comparison, text_position_next_internal() >> calculates how much of the value is needed and passes that to >> detoast_iterate(), which then calculates if it has to do something or >> not. This is a bit hard to follow. There might also be a performance >> penalty -- the following is just a theory, but it sounds plausible: >> The CPU can probably correctly predict that detoast_iterate() will >> usually return the same value it did last time, but it still has to >> call the function and make sure, which I imagine is more expensive >> than advancing the needle. Ideally, we want to call the iterator only >> if we have to. >> >> In the attached patch (applies on top of your v5), >> text_position_next_internal() simply compares hptr to the detoast >> buffer limit, and calls detoast_iterate() until it can proceed. I >> think this is clearer. > > > Yes, I think this is a general scenario where the caller continually > calls detoast_iterate until gets enough data, so I think such operations can > be extracted as a macro, as I did in patch v6. In the macro, the > detoast_iterate > function is called only when the data requested by the caller is greater than > the > buffer limit.
I like the use of a macro here. However, I think we can find a better location for the definition. See the header comment of fmgr.h: "Definitions for the Postgres function manager and function-call interface." Maybe tuptoaster.h is as good a place as any? >> I also noticed that no one updates or looks at >> "toast_iter.done" so I removed that as well. > > > toast_iter.done is updated when the buffer limit reached the buffer capacity > now. > So, I added it back. Okay. >> 2). detoast_iterate() and fetch_datum_iterate() return a value but we >> don't check it or do anything with it. Should we do something with it? >> It's also not yet clear if we should check the iterator state instead >> of return values. I've added some XXX comments as a reminder. We >> should also check the return value of pglz_decompress_iterate(). > > > IMO, we need to provide users with a simple iterative interface. > Using the required data pointer to compare with the buffer limit is an easy > way. > And the application scenarios of the iterator are mostly read operations. > So I think there is no need to return a value, and the iterator needs to > throw an > exception for some wrong calls, such as all the data have been iterated, > but the user still calls the iterator. Okay, and see these functions now return void. The orignal pglz_decompress() returned a value that was check against corruption. Is there a similar check we can do for the iterative version? >> 3). Speaking of pglz_decompress_iterate(), I diff'd it with >> pglz_decompress(), and I have some questions on it: >> >> a). >> + srcend = (const unsigned char *) (source->limit == source->capacity >> ? source->limit : (source->limit - 4)); >> >> What does the 4 here mean in this expression? > > > Since we fetch chunks one by one, if we make srcend equals to the source > buffer limit, > In the while loop "while (sp < srcend && dp < destend)", sp may exceed the > source buffer limit and read unallocated bytes. Why is this? That tells me the limit is incorrect. Can the setter not determine the right value? > Giving a four-byte buffer can prevent sp from exceeding the source buffer > limit. Why 4? That's a magic number. Why not 2, or 27? > If we have read all the chunks, we don't need to be careful to cross the > border, > just make srcend equal to source buffer limit. I've added comments to explain > it in patch v6. That's a good thing to comment on, but it doesn't explain why. This logic seems like a band-aid and I think a committer would want this to be handled in a more principled way. >> Is it possible it's >> compensating for this bit in init_toast_buffer()? >> >> + buf->limit = VARDATA(buf->buf); >> >> It seems the initial limit should also depend on whether the datum is >> compressed, right? Can we just do this: >> >> + buf->limit = buf->position; > > > I'm afraid not. buf->position points to the data portion of the buffer, but > the beginning of > the chunks we read may contain header information. For example, for > compressed data chunks, > the first four bytes record the size of raw data, this means that limit is > four bytes ahead of position. > This initialization doesn't cause errors, although the position is less than > the limit in other cases. > Because we always fetch chunks first, then decompress it. I see what you mean now. This could use a comment or two to explain the stated constraints may not actually be satisfied at initialization. >> b). >> - while (sp < srcend && dp < destend) >> ... >> + while (sp + 1 < srcend && dp < destend && >> ... >> >> Why is it here "sp + 1"? > > > Ignore it, I set the inactive state of detoast_iter->ctrl to 8 in patch v6 to > achieve the purpose of parsing ctrl correctly every time. Please explain further. Was the "sp + 1" correct behavior (and why), or only for debugging setting ctrl/c correctly? Also, I don't think the new logic for the ctrl/c variables is an improvement: 1. iter->ctrlc is intialized with '8' (even in the uncompressed case, which is confusing). Any time you initialize with something not 0 or 1, it's a magic number, and here it's far from where the loop variable is used. This is harder to read. 2. First time though the loop, iter->ctrlc = 8, which immediately gets set back to 0. 3. At the end of the loop, iter->ctrl/c are unconditionally set. In v5, there was a condition which would usually avoid this copying of values through pointers. >> 4. Note that varlena.c has a static state variable, and a cleanup >> function that currently does: >> >> static void >> text_position_cleanup(TextPositionState *state) >> { >> /* no cleanup needed */ >> } >> >> It seems to be the detoast iterator could be embedded in this state >> variable, and then free-ing can happen here. That has a possible >> advantage that the iterator struct would be on the same cache line as >> the state data. That would also remove the need to pass "iter" as a >> parameter, since these functions already pass "state". I'm not sure if >> this would be good for other users of the iterator, so maybe we can >> hold off on that for now. > > > Good idea. I've implemented it in patch v6. That's better, and I think we can take it a little bit farther. 1. Notice that TextPositionState is allocated on the stack in text_position(), which passes both the "state" pointer and the "iter" pointer to text_position_setup(), and only then sets state->iter = iter. We can easily set this inside text_position(). That would get rid of the need for other callers to pass NULL iter to text_position_setup(). 2. DetoastIteratorData is fixed size, so I see no reason to allocate it on the heap. We could allocate it on the stack in text_pos(), and pass the pointer to create_detoast_iterator() (in this case maybe a better name is init_detoast_iterator), which would return a bool to tell text_pos() whether to pass down the pointer or a NULL. The allocation of other structs (toast buffer and fetch iterator) probably can't be changed without more work. >> 5. Would it be a good idea to add tests (not always practical), or >> more Assert()'s? You probably already know this, but as a reminder >> it's good to develop with asserts enabled, but never build with them >> for performance testing. > > > I've added more Assert()'s to check iterator state. Okay. > BTW, I found that iterators come in handy for json/jsonb's find field value > or get array elements operations. > I will continue to optimize the json/jsonb query based on the detoast > iterator patch. That will be an interesting use case. There are other aspects of the patch I should investigate, but I'll put that off for another time. Commitfest is over, but note that review can happen at any time. I'll continue to do so as time permits. -- John Naylor https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services