RE: [PoC] Partition path cache

2024-10-28 Thread Bykov Ivan
Hello > This sounds like an interesting idea, I like it because it omits the need for > "global statistics" effort for partitioned tables since it just uses the first > partition it knows. Of course it has its drawback that "first" > partition can't represent other partitions. This method uses glo

Query ID Calculation Fix for DISTINCT / ORDER BY and LIMIT / OFFSET

2024-11-26 Thread Bykov Ivan
Hi, all! It seems I have found a bug in the Query ID calculation. Problem === In some cases, we could have the same IDs for non-identical query trees. For two structurally similar query subnodes, we may have query trees that look like this: QueryA->subNodeOne = Value X; QueryA->subNodeTwo =
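The collision described in this report can be illustrated with a small sketch. It is a toy model in Python, not PostgreSQL's jumbling code: the function `jumble_skip_nulls`, the string stand-ins for subnodes, and the use of SHA-256 are all assumptions made for illustration. It also assumes the two trees differ only in which of two similar subnodes is empty, which the truncated snippet above does not spell out.

```python
import hashlib
from typing import Optional, Sequence


def jumble_skip_nulls(subnodes: Sequence[Optional[str]]) -> str:
    """Toy jumble (hypothetical): append each non-NULL subnode, skip NULLs."""
    buf = bytearray()
    for node in subnodes:
        if node is None:
            continue  # a NULL subnode contributes nothing to the buffer
        buf += node.encode()
    return hashlib.sha256(bytes(buf)).hexdigest()


# Assumed shape of the two trees: the same value sits in a different slot.
query_a = ("X", None)   # subNodeOne = X, subNodeTwo empty
query_b = (None, "X")   # subNodeOne empty, subNodeTwo = X

# Both trees produce the same byte stream, so the toy IDs are equal.
same_id = jumble_skip_nulls(query_a) == jumble_skip_nulls(query_b)
print(same_id)
```

Because empty slots leave no trace in the buffer, the position of the value is lost, which is the kind of ambiguity the thread sets out to fix.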

RE: Query ID Calculation Fix for DISTINCT / ORDER BY and LIMIT / OFFSET

2025-03-07 Thread Bykov Ivan
Hello! Last time, I forgot to attach the patches. The problem still persists in the 17.3 release. Solution One The simplest way to fix the problem is to place the scalar field used in the query ID calculation between similar subnodes. A patch for this solution is attached below (00

RE: Query ID Calculation Fix for DISTINCT / ORDER BY and LIMIT / OFFSET

2025-03-06 Thread Bykov Ivan
Here are 0001-Query-ID-Calculation-Fix-Variant-A.patch and 0001-Query-ID-Calculation-Fix-Variant-B.patch.

RE: Query ID Calculation Fix for DISTINCT / ORDER BY and LIMIT / OFFSET

2025-03-11 Thread Bykov Ivan
Hello! >> Variant B is not acceptable IMO as it adds a whole bunch of >> null-terminators unnecessarily. For example, in a simple "select 1", >> (expr == NULL) is true 19 times, so that is an extra 19 bytes. > Variant B is not acceptable here. Could we improve Variant B? I was thinking about

RE: Query ID Calculation Fix for DISTINCT / ORDER BY and LIMIT / OFFSET

2025-03-11 Thread Bykov Ivan
Here is the bug description from https://www.postgresql.org/message-id/flat/ca447b72d15745b9a877fad7e258407a%40localhost.localdomain Problem === In some cases, we could have the same IDs for non-identical query trees. For two structurally similar query subnodes, we may have query trees that look like

RE: Query ID Calculation Fix for DISTINCT / ORDER BY and LIMIT / OFFSET

2025-03-15 Thread Bykov Ivan
Hello! > Since then, I see that Ivan > has already submitted a patch that accounts for NULL nodes and adds a > byte to the jumble buffer to account for NULLs. This seems quite clean > and simple. However, Sami seems to have concerns about the overhead of > doing this. Is that warranted at all? Pot

RE: Query ID Calculation Fix for DISTINCT / ORDER BY and LIMIT / OFFSET

2025-03-25 Thread Bykov Ivan
Hello, David! As I can see, your patch has the same idea as my v2-0001-Query-ID-Calculation-Fix-Variant-B.patch from [1]. I think it would be better to extract the jumble buffer update with hash calculation into a function (CompressJumble in my patch). This will result in fewer changes to the s

RE: Query ID Calculation Fix for DISTINCT / ORDER BY and LIMIT / OFFSET

2025-03-17 Thread Bykov Ivan
Hello, Michael! > So, here is attached a counter-proposal, where we can simply add a > counter tracking a node count in _jumbleNode() to add more entropy to > the mix, incrementing it as well for NULL nodes. It definitely looks like a more reliable solution than my variant, which only counts NU
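The node-counter idea quoted above might look roughly like the following toy model. It is a sketch under assumptions, not the proposed patch: `jumble_with_node_count`, the 4-byte little-endian counter encoding, and the string stand-ins for nodes are all hypothetical. The only property taken from the quoted text is that the counter advances for every node, including NULL ones, and is mixed into the jumble.

```python
import hashlib
import struct
from typing import Optional, Sequence


def jumble_with_node_count(subnodes: Sequence[Optional[str]]) -> str:
    """Toy jumble (hypothetical): mix a running node count into the buffer."""
    buf = bytearray()
    node_count = 0
    for node in subnodes:
        node_count += 1  # incremented for NULL nodes as well
        if node is None:
            continue
        # Record the position of the node before its contents.
        buf += struct.pack("<I", node_count)
        buf += node.encode()
    return hashlib.sha256(bytes(buf)).hexdigest()


id_a = jumble_with_node_count(("X", None))   # X is node number 1
id_b = jumble_with_node_count((None, "X"))   # X is node number 2
print(id_a != id_b)
```

In this model a NULL node still changes the count seen by every later node, so moving a value to a different slot changes the resulting ID.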

RE: Query ID Calculation Fix for DISTINCT / ORDER BY and LIMIT / OFFSET

2025-03-26 Thread Bykov Ivan
Hello, David! > I see you opted to only jumble the low-order byte of the pending null > counter. I used the full 4 bytes as I don't think the extra 3 bytes are > a problem. With v7 the memcpy to copy the pending_nulls into the > buffer is inlined into a single instruction. It's semi-conceivable >
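The trade-off between jumbling one byte or four bytes of a pending-NULL counter can be sketched with a toy model. Everything here is an assumption for illustration: `jumble_pending_nulls`, its `counter_bytes` parameter, and the flush-before-next-node behavior are hypothetical and are not taken from any of the patches discussed in the thread.

```python
import hashlib
import struct
from typing import Optional, Sequence


def jumble_pending_nulls(subnodes: Sequence[Optional[str]],
                         counter_bytes: int = 4) -> str:
    """Toy jumble (hypothetical): count consecutive NULLs and flush the count."""
    buf = bytearray()
    pending_nulls = 0
    for node in subnodes:
        if node is None:
            pending_nulls += 1  # defer writing until a real node appears
            continue
        if pending_nulls:
            # Little-endian, so the first byte is the low-order byte.
            buf += struct.pack("<I", pending_nulls & 0xFFFFFFFF)[:counter_bytes]
            pending_nulls = 0
        buf += node.encode()
    if pending_nulls:  # flush any trailing NULLs
        buf += struct.pack("<I", pending_nulls & 0xFFFFFFFF)[:counter_bytes]
    return hashlib.sha256(bytes(buf)).hexdigest()


one_null = (None, "X")
many_nulls = (None,) * 257 + ("X",)

# With one byte, 1 and 257 pending NULLs share the same low-order byte.
print(jumble_pending_nulls(one_null, 1) == jumble_pending_nulls(many_nulls, 1))
# With four bytes, the two counts stay distinct.
print(jumble_pending_nulls(one_null, 4) != jumble_pending_nulls(many_nulls, 4))
```

In this model the single-byte variant saves three bytes per flush but wraps around every 256 consecutive NULLs, while the four-byte variant does not wrap for any realistic count.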