On 9/4/24 3:08 PM, Tom Lane wrote:
Nathan Bossart <nathandboss...@gmail.com> writes:
Thanks to commit 96cdeae, only a few catalogs remain that are missing TOAST
tables: pg_attribute, pg_class, pg_index, pg_largeobject, and
pg_largeobject_metadata.  I've attached a short patch to add one for
pg_index, which resolves the issue cited here.  This passes "check-world"
and didn't fail for a few ad hoc tests (e.g., VACUUM FULL on pg_index).  I
haven't spent too much time investigating possible circularity issues, but
I'll note that none of the system indexes presently use the indexprs and
indpred columns.

Yeah, the possibility of circularity seems like the main hazard, but
I agree it's unlikely that the entries for system indexes could ever
need out-of-line storage.  There are many other things that would have
to be improved before a system index could use indexprs or indpred.

Agreed on the unlikeliness of that, certainly in the short-to-mid term. The impetus driving this is dealing with a data type that can be quite large, and it's unlikely system catalogs will be dealing with anything of that nature, or requiring very long expressions that couldn't be encapsulated in a different way.

Just to be fair, in the case I presented there's an argument that what I'm trying to do is fairly inefficient for an expression, given I'm passing around an additional several KB payload into the query. However, we'd likely have to do that anyway for this problem space, and the overall performance hit is negligible compared to the search relevancy boost.

I'm working on a much more robust test, but using a known 10MM 768-dim dataset and two C-based quantization functions (one using the expression), I got a 3% relevancy boost with a 2% reduction in latency and throughput. On some other known datasets, I was able to improve relevancy 40% or more, though given they were initially returning with 0% relevancy in some cases, it's not fair to compare performance numbers.

There are other ways to solve the problem as well, but allowing for the larger expression gives users more choices in how they can approach it.

Jonathan

Attachment: OpenPGP_signature.asc
Description: OpenPGP digital signature

Reply via email to