>
>
> > I believe this choice was primarily about simplifying the code (similar
> > to why we have n+1 offsets instead of just n in the list/varchar
> > representations, even though the offset at n=0 is always 0). In both
> > situations, you don't have to worry about writing special code (and a
> > condition) for the boundary condition inside tight loops (e.g. the last
> > few bytes need to be handled differently since they aren't word width).
>
> Sounds reasonable.  It might be worth illustrating this with a
> concrete example.  One scenario that this scheme seems useful for is
> creating a new bitmap based on evaluating a predicate (i.e. all
> elements >X).  In this case would it make sense to make it a multiple
> of 16, so we can consistently use SIMD instructions for the logical
> "and" operation?
>

Hmm... interesting thought. I'd have to check, but I also recall some of the
newer SIMD instruction sets supporting even wider registers. What do others
think?
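
To make the padding argument concrete, here is a rough Python sketch. It is
not Arrow code: the 8-byte word size, the least-significant-bit-first bit
order, and the helper names (padded_len, make_bitmap, and_bitmaps) are all
assumptions for illustration. The point is only that once every bitmap is
padded to a whole number of words, the "and" loop has no special case for the
tail. A multiple of 16 (or wider) would work the same way with a larger
word_bytes.

```python
WORD_BYTES = 8  # assumed padding granularity for this sketch


def padded_len(num_bits, word_bytes=WORD_BYTES):
    """Bytes needed for num_bits, rounded up to a whole number of words."""
    num_bytes = (num_bits + 7) // 8
    return (num_bytes + word_bytes - 1) // word_bytes * word_bytes


def make_bitmap(flags, word_bytes=WORD_BYTES):
    """Pack booleans into a zero-padded bitmap (LSB-first within each byte)."""
    buf = bytearray(padded_len(len(flags), word_bytes))
    for i, flag in enumerate(flags):
        if flag:
            buf[i // 8] |= 1 << (i % 8)
    return buf


def and_bitmaps(a, b, word_bytes=WORD_BYTES):
    """AND two padded bitmaps one word at a time, with no tail handling."""
    assert len(a) == len(b) and len(a) % word_bytes == 0
    out = bytearray(len(a))
    for start in range(0, len(a), word_bytes):
        end = start + word_bytes
        wa = int.from_bytes(a[start:end], "little")
        wb = int.from_bytes(b[start:end], "little")
        out[start:end] = (wa & wb).to_bytes(word_bytes, "little")
    return out


# Two predicates over the same 10 values: "greater than 5" and "even".
values = [3, 9, 12, 1, 7, 15, 4, 10, 8, 2]
gt5 = make_bitmap([v > 5 for v in values])
even = make_bitmap([v % 2 == 0 for v in values])
both = and_bitmaps(gt5, even)
```

Ten values only need 2 bytes of bits, but each bitmap here occupies 8, so the
loop body runs exactly once and never has to check how many bytes remain.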


> I think the spec is slightly inconsistent.  It says there are 6 bytes
> of overhead per entry but then follows: "with the smallest byte width
> capable of representing the number of types in the union."  I'm
> perfectly happy to say it is always 1, always 2, or always capped at
> 2.  I agree 32K/64K+ types is a very unlikely scenario.  We just need
> to clear up the ambiguity.
>

Agreed. Do you want to propose an approach & patch to clarify?
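
For reference while drafting, a small Python sketch of what "smallest byte
width capable of representing the number of types" would mean if taken
literally. The function name type_id_width and the assumption that type ids
are unsigned and run from 0 to num_types - 1 are mine, not the spec's. It
shows why a fixed per-entry overhead and a variable width can't both hold:
the width changes once the union has more than 256 types.

```python
def type_id_width(num_types):
    """Smallest number of bytes able to hold ids 0 .. num_types - 1.

    Assumes unsigned ids; this is an illustration, not the spec's rule.
    """
    if num_types < 1:
        raise ValueError("a union needs at least one type")
    bits = (num_types - 1).bit_length()
    return max(1, (bits + 7) // 8)


# The width, and therefore the per-entry overhead, depends on the type count.
widths = {n: type_id_width(n) for n in (1, 2, 256, 257, 65536, 65537)}
```

Fixing the width at 1 or 2 bytes, or capping it at 2, would make the
per-entry overhead a constant again.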
