Hi
While reading the code generated by llvmjit, I realized that the number of LLVM
basic blocks used in tuple deforming is directly visible in the generated
assembly code, as in the following disassembly:
0x723382b781c1: jmp 0x723382b781c3
0x723382b781c3: jmp 0x723382b781eb
0x723382b781c5: mov -0x20(%rsp),%rax
0x723382b781..: ... .....
0x723382b781e7: mov %cx,(%rax)
0x723382b781ea: ret
0x723382b781eb: jmp 0x723382b781ed
0x723382b781ed: jmp 0x723382b781ef
0x723382b781ef: jmp 0x723382b781f1
0x723382b781f1: jmp 0x723382b781f3
0x723382b781f3: mov -0x30(%rsp),%rax
0x723382b781..: ... ......
0x723382b78208: mov %rcx,(%rax)
0x723382b7820b: jmp 0x723382b781c5
That's a lot of useless jumps, and LLVM has a dedicated pass, simplifycfg, to
get rid of them. The attached patch modifies the llvmjit code to always run
this pass, even below jit_optimize_above_cost.
On a basic benchmark (a simple select * from table where f = 42), this
optimization saved 7 ms of runtime while using only 0.1 ms of extra
optimization time.
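A setup along these lines should reproduce the measurement (some_table and f
are placeholders here, not my actual test schema); the JIT generation and
optimization timings appear in the "JIT:" footer of EXPLAIN (ANALYZE):

SET jit = on;
SET jit_above_cost = 0;               -- force JIT even for a cheap query
SET jit_optimize_above_cost = -1;     -- stay on the O0 path (no expensive optimization)
SET jit_inline_above_cost = -1;       -- no inlining, to isolate the deforming code
EXPLAIN (ANALYZE) SELECT * FROM some_table WHERE f = 42;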
Regards
Pierre Ducroquet
From 786ac5ff403603d23152a5584b1d3b61e2ae6b2a Mon Sep 17 00:00:00 2001
From: Pierre Ducroquet <[email protected]>
Date: Wed, 7 Jan 2026 15:43:19 +0100
Subject: [PATCH] llvmjit: always use the simplifycfg pass
The simplifycfg pass will remove empty or unreachable LLVM basic blocks,
and merge blocks together when possible.
This is important because the tuple deforming code generates a lot of basic
blocks, and previously at O0 we did not run this pass, resulting in this kind
of (amd64) machine code:
0x723382b781c1: jmp 0x723382b781c3
0x723382b781c3: jmp 0x723382b781eb
0x723382b781c5: mov -0x20(%rsp),%rax
0x723382b781..: ... .....
0x723382b781e7: mov %cx,(%rax)
0x723382b781ea: ret
0x723382b781eb: jmp 0x723382b781ed
0x723382b781ed: jmp 0x723382b781ef
0x723382b781ef: jmp 0x723382b781f1
0x723382b781f1: jmp 0x723382b781f3
0x723382b781f3: mov -0x30(%rsp),%rax
0x723382b781..: ... ......
0x723382b78208: mov %rcx,(%rax)
0x723382b7820b: jmp 0x723382b781c5
This is not efficient at all, and triggering the simplifycfg pass only takes
a few hundred microseconds while possibly saving much more time during
execution. On a basic benchmark, I saved 7 ms of query runtime while spending
0.2 ms of extra JIT compilation overhead.
---
src/backend/jit/llvm/llvmjit.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/src/backend/jit/llvm/llvmjit.c b/src/backend/jit/llvm/llvmjit.c
index 49e76153f9a..208d0d1bc48 100644
--- a/src/backend/jit/llvm/llvmjit.c
+++ b/src/backend/jit/llvm/llvmjit.c
@@ -633,6 +633,11 @@ llvm_optimize_module(LLVMJitContext *context, LLVMModuleRef module)
{
/* we rely on mem2reg heavily, so emit even in the O0 case */
LLVMAddPromoteMemoryToRegisterPass(llvm_fpm);
+ /*
+ * Tuple deforming generates a lot of basic blocks; simplify them
+ * even in the O0 case.
+ */
+ LLVMAddCFGSimplificationPass(llvm_fpm);
}
LLVMPassManagerBuilderPopulateFunctionPassManager(llvm_pmb, llvm_fpm);
@@ -675,7 +680,7 @@ llvm_optimize_module(LLVMJitContext *context, LLVMModuleRef module)
if (context->base.flags & PGJIT_OPT3)
passes = "default<O3>";
else
- passes = "default<O0>,mem2reg";
+ passes = "default<O0>,mem2reg,simplifycfg";
options = LLVMCreatePassBuilderOptions();
--
2.43.0