Re: Improve eviction algorithm in ReorderBuffer

Masahiko Sawada Mon, 11 Mar 2024 22:24:01 -0700

On Fri, Mar 8, 2024 at 12:58 PM Peter Smith <smithpb2...@gmail.com> wrote:
>
> On Thu, Mar 7, 2024 at 2:16 PM Masahiko Sawada <sawada.m...@gmail.com> wrote:
> >
> > On Tue, Mar 5, 2024 at 3:28 PM Peter Smith <smithpb2...@gmail.com> wrote:
> > >
>
> > > 4a.
> > > The comment in simplehash.h says
> > >  *   The following parameters are only relevant when SH_DEFINE is defined:
> > >  *   - SH_KEY - ...
> > >  *   - SH_EQUAL(table, a, b) - ...
> > >  *   - SH_HASH_KEY(table, key) - ...
> > >  *   - SH_STORE_HASH - ...
> > >  *   - SH_GET_HASH(tb, a) - ...
> > >
> > > So maybe it is nicer to reorder the #defines in that same order?
> > >
> > > SUGGESTION:
> > > +#define SH_PREFIX bh_nodeidx
> > > +#define SH_ELEMENT_TYPE bh_nodeidx_entry
> > > +#define SH_KEY_TYPE bh_node_type
> > > +#define SH_SCOPE extern
> > > +#ifdef FRONTEND
> > > +#define SH_RAW_ALLOCATOR pg_malloc0
> > > +#endif
> > >
> > > +#define SH_DEFINE
> > > +#define SH_KEY key
> > > +#define SH_EQUAL(tb, a, b) (memcmp(&a, &b, sizeof(bh_node_type)) == 0)
> > > +#define SH_HASH_KEY(tb, key) \
> > > + hash_bytes((const unsigned char *) &key, sizeof(bh_node_type))
> > > +#include "lib/simplehash.h"
> >
> > I'm really not sure it helps increase readability. For instance, for
> > me it's readable if SH_DEFINE and SH_DECLARE come to the last before
> > #include since it's more obvious whether we want to declare, define or
> > both. Other simplehash.h users also do so.
> >
>
> OK.
>
> > > 5.
> > > + *
> > > + * If 'indexed' is true, we create a hash table to track of each node's
> > > + * index in the heap, enabling to perform some operations such as 
> > > removing
> > > + * the node from the heap.
> > >   */
> > >  binaryheap *
> > > -binaryheap_allocate(int capacity, binaryheap_comparator compare, void 
> > > *arg)
> > > +binaryheap_allocate(int capacity, binaryheap_comparator compare,
> > > + bool indexed, void *arg)
> > >
> > > BEFORE
> > > ... enabling to perform some operations such as removing the node from 
> > > the heap.
> > >
> > > SUGGESTION
> > > ... to help make operations such as removing nodes more efficient.
> > >
> >
> > But these operations literally require the indexed binary heap as we
> > have an assertion:
> >
> > void
> > binaryheap_remove_node_ptr(binaryheap *heap, bh_node_type d)
> > {
> >     bh_nodeidx_entry *ent;
> >
> >     Assert(!binaryheap_empty(heap) && heap->bh_has_heap_property);
> >     Assert(heap->bh_indexed);
> >
>
> I didn’t quite understand -- the operations mentioned are "operations
> such as removing the node", but binaryheap_remove_node() also removes
> a node from the heap. So I still felt the comment wording of the patch
> is not quite correct.


Now I understand your point. That's a valid point.

>
> Now, if the removal of a node from an indexed heap can *only* be done
> using binaryheap_remove_node_ptr() then:
> - the other removal functions (binaryheap_remove_*) probably need some
> comments to make sure nobody is tempted to call them directly for an
> indexed heap.
> - maybe some refactoring and assertions are needed to ensure those
> *cannot* be called directly for an indexed heap.
>

If the 'index' is true, the caller can not only use the existing
functions but also newly added functions such as
binaryheap_remove_node_ptr() and binaryheap_update_up() etc. How about
something like below?

 * If 'indexed' is true, we create a hash table to track each node's
 * index in the heap, enabling to perform some operations such as
 * binaryheap_remove_node_ptr() etc.

>
> And, here are some review comments for v8-0002.
>
> ======
> 1. delete_nodeidx
>
> +/*
> + * Remove the node's index from the hash table if the heap is indexed.
> + */
> +static bool
> +delete_nodeidx(binaryheap *heap, bh_node_type node)
> +{
> + if (!binaryheap_indexed(heap))
> + return false;
> +
> + return bh_nodeidx_delete(heap->bh_nodeidx, node);
> +}
>
> 1a.
> In v8 this function was changed to now return bool, so, I think the
> function comment should explain the meaning of that return value.
>
> ~
>
> 1b.
> I felt the function body is better expressed positively: "If this then
> do that", instead of "If not this then do nothing otherwise do that"
>
> SUGGESTION
> if (binaryheap_indexed(heap))
>   return bh_nodeidx_delete(heap->bh_nodeidx, node);
>
> return false;
>
> ~~~
>
> 2.
> +static void
> +replace_node(binaryheap *heap, int index, bh_node_type new_node)
> +{
> + bool found PG_USED_FOR_ASSERTS_ONLY;
> +
> + /* Quick return if not necessary to move */
> + if (heap->bh_nodes[index] == new_node)
> + return;
> +
> + /*
> + * Remove overwritten node's index. The overwritten node's position must
> + * have been tracked, if enabled.
> + */
> + found = delete_nodeidx(heap, heap->bh_nodes[index]);
> + Assert(!binaryheap_indexed(heap) || found);
> +
> + /*
> + * Replace it with the given new node. This node's position must also be
> + * tracked as we assume to replace the node by the existing node.
> + */
> + found = set_node(heap, new_node, index);
> + Assert(!binaryheap_indexed(heap) || found);
> +}
>
> 2a.
> /Remove overwritten/Remove the overwritten/
> /replace the node by the existing node/replace the node with the existing 
> node/
>
> ~
>
> 2b.
> It might be helpful to declare another local var...
> bh_node_type cur_node = heap->bh_nodes[index];
>
> ... because I think it will be more readable to say:
> + if (cur_node == new_node)
> + return;
>
> and
>
> + found = delete_nodeidx(heap, cur_node);

As for changes around delete_nodeidx(), I've changed the
delete_nodeidx() to return nothing as it would not be helpful much and
seems confusing. I've simplified replace_node() logic accordingly.

I'll update 0003 patch to address your comment and submit the updated
version patches.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Re: Improve eviction algorithm in ReorderBuffer

Reply via email to