Hi everyone,
I’m Ismail, a compiler engineer intern at Apple. As a part of my internship,
I'm adding Fast Conditional Breakpoints to LLDB, using code patching.
Currently, the expressions that power conditional breakpoints are lowered
to LLVM IR and LLDB knows how to interpret a subset of it. If that fails,
the debugger JIT-compiles the expression (compiled once, and re-run on each
breakpoint hit). In both cases LLDB must collect all program state used in
the condition and pass it to the expression.
The goal of my internship project is to make conditional breakpoints faster by:
1. Compiling the expression ahead of time, when the breakpoint is set, and
injecting it into the inferior's memory only once.
2. Re-routing the inferior's execution flow to run the expression and check,
in-process, whether it needs to stop.
This saves the cost of the context switches between the debugger and the
inferior program (about 10 of them) needed to compile and evaluate the
condition.
This feature is described on the [LLDB Project
page](https://lldb.llvm.org/status/projects.html#use-the-jit-to-speed-up-conditional-breakpoint-evaluation).
The goal is to have it working for most languages and architectures
supported by LLDB; however, my initial implementation will be for C-based
languages targeting x86_64. It will be extended to AArch64 afterwards.
Note that the way my prototype is implemented makes it extensible to other
languages and architectures.
## High Level Design
Every time a breakpoint that holds a condition is hit, multiple context
switches are needed in order to compile and evaluate the condition.
First, the breakpoint is hit and control is given to the debugger.
That's where LLDB wraps the condition expression into a UserExpression that
gets compiled and injected into the program's memory. Another round-trip
between the inferior and LLDB is needed to run the compiled expression
and extract the result that tells LLDB whether to stop.
To get rid of those context switches, we will evaluate the condition inside
the program, and only stop when the condition is true. LLDB will achieve this
by inserting a jump from the breakpoint address to a code section that will
be allocated into the program memory. It will save the thread state, run the
condition expression, restore the thread state and then execute the copied
instruction(s) before jumping back to the regular program flow.
Then we only trap and return control to LLDB when the condition is true.
## Implementation Details
To be able to evaluate a breakpoint condition without interacting with the
debugger, LLDB changes the inferior program execution flow by overwriting
the instruction at which the breakpoint was set with a branching instruction.
The original instruction(s) are copied to a memory stub allocated in the
inferior program memory called the __Fast Conditional Breakpoint Trampoline__
or __FCBT__. The FCBT allows us to re-route the program's execution flow to
check the condition in-process while preserving the original program behavior.
This part is critical to setting up Fast Conditional Breakpoints.
```
Inferior Binary Trampoline
| . | +-------------------------+
| . | | |
| . | +--------->+ Save RegisterContext |
| . | | | |
+-------------------------+ | +-------------------------+
| | | | |
| Instruction | | | Build Arguments Struct |
| | | | |
+-------------------------+ | +-------------------------+
| +-----------+ | |
| Branch to Trampoline | | Call Condition Checker |
| +<----------+ | |
+-------------------------+ | +-------------------------+
| | | | |
| Instruction | | | Restore RegisterContext |
| | | | |
+-------------------------+ | +-------------------------+
| . | | | |
| . | +----------+ Run Copied Instructions |
| . | | |
| . | +-------------------------+
```
Once the execution reaches the Trampoline, several steps need to be taken.
LLDB relies on its UserExpressions to JIT the conditional expressions.
However, since the execution will be handled by the debugged program, LLDB
will generate some code ahead of time in the Trampoline that will allow the
inferior to initialize the expression's argument structure.
Generating the condition checker, as well as the code to initialize the
argument structure on each breakpoint hit, is handled by the
__BreakpointInjectedSite__ class, which builds the conditional expression for
all the BreakpointLocations, emits the `$__lldb_expr` function, and relocates
variables in the `$__lldb_arg` structure.
BreakpointInjectedSites are created in the __Process__ if the user enables
the `-I | --inject-condition` flag when setting or modifying a breakpoint.
Because the __FCBT__ is architecture specific, BreakpointInjectedSites will
only be available when a target has added support for it in the matching
Architecture Plugin.
Several parts of LLDB have to be modified to implement this feature:
- **Breakpoint**: Added BreakpointInjectedSite, and helper functions to the
  related classes (Breakpoint, BreakpointLocation, BreakpointSite,
  BreakpointOptions).
- **Plugins**:
  - Added ObjectFileTrampoline for the unwinding.
  - Added x86_64 ABI support (FCBT setup & safety checks).
- **Symbol**: Changed `FuncUnwinders` and `UnwindPlan` to support FCBT.
- **Target**:
  - Added BreakpointInjectedSite creation to `Process` to insert the jump to
    the FCBT.
  - Added the Trampoline module creation to `ABI` for the unwinding.
### Breakpoint Option
Since Fast Conditional Breakpoints are still under development, they will not
be on by default; instead, we will provide a flag to "breakpoint set" and
"breakpoint modify" to enable the feature. Note that the end goal is to have
them on by default and only fall back to regular conditional breakpoints on
unsupported architectures.
They can be enabled with the `-I | --inject-condition` option. They can also
be enabled through the Python Scripting Bridge public API, using the
`InjectCondition(bool enable)` method on an __SBBreakpoint__ or
__SBBreakpointLocation__ object.
This feature is intended to be used with condition expressions
(`-c <expr> | --condition <expr>`), but also with other condition types such as:
- Thread ID (`-t <thread-id> | --thread-id <thread-id>`)
- Thread Index (`-x <thread-index> | --thread-index <thread-index>`)
- Thread Queue Name
### Trampoline
To be able to inject the condition, we need to re-route the debugged program's
execution flow. This part is handled in the __Trampoline__, a memory stub
allocated in the inferior that will contain the condition check while
preserving the program's original behavior.
The trampoline is architecture specific and built by LLDB. To have the
condition evaluation work out-of-place, several steps need to be completed:
1. Save all the registers by pushing them to the stack.
2. Build the `$__lldb_arg` structure by calling an injected UtilityFunction.
3. Check the condition by calling the injected UserExpression and execute a
trap if the condition is true.
4. Restore the register context.
5. Rewrite the operands of the copied original instructions and run them.
All the values needed for these steps can be computed ahead of time, when the
breakpoint is set (e.g. size of the allocation, jump address, relocations).
Since the x86_64 ISA has variable-length instructions, LLDB moves enough
instructions into the trampoline to be able to overwrite them with a jump to
the trampoline. Also, the region allocated for the trampoline might be too far
away for a single jump, so we might need several branch islands before
reaching the trampoline (WIP).
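
The two x86_64 constraints above come down to a little arithmetic. The following is only a sketch: it assumes the branch is the 5-byte `jmp rel32` encoding (opcode `0xE9`), which this proposal does not specify, and `BytesToRelocate` and `EncodeJmpRel32` are hypothetical helper names, not LLDB APIs.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <optional>
#include <vector>

// Size of an x86_64 near jump: opcode 0xE9 followed by a 32-bit displacement.
constexpr std::size_t kJmpRel32Size = 5;

// Hypothetical helper (not an LLDB API). Given the byte lengths of the
// instructions starting at the breakpoint address, return how many bytes of
// whole instructions must be moved to the trampoline so that the jump fits.
// Returns std::nullopt if the listed instructions are shorter than the jump.
std::optional<std::size_t>
BytesToRelocate(const std::vector<std::size_t> &insn_lengths) {
  std::size_t total = 0;
  for (std::size_t len : insn_lengths) {
    assert(len > 0 && "an instruction is at least one byte long");
    total += len;
    if (total >= kJmpRel32Size)
      return total;
  }
  return std::nullopt;
}

// Hypothetical helper (not an LLDB API). Encode a `jmp rel32` placed at
// `from` and targeting `to`. The displacement is relative to the end of the
// jump instruction. Returns std::nullopt when the target is out of reach of
// a single jump, which is the case where branch islands would be needed.
std::optional<std::array<std::uint8_t, kJmpRel32Size>>
EncodeJmpRel32(std::uint64_t from, std::uint64_t to) {
  // Unsigned subtraction wraps around, so the cast yields the signed distance.
  const auto disp = static_cast<std::int64_t>(to - (from + kJmpRel32Size));
  if (disp < std::numeric_limits<std::int32_t>::min() ||
      disp > std::numeric_limits<std::int32_t>::max())
    return std::nullopt;
  const auto d = static_cast<std::uint32_t>(disp);
  return std::array<std::uint8_t, kJmpRel32Size>{
      0xE9, // jmp rel32 opcode, followed by the little-endian displacement
      static_cast<std::uint8_t>(d & 0xFF),
      static_cast<std::uint8_t>((d >> 8) & 0xFF),
      static_cast<std::uint8_t>((d >> 16) & 0xFF),
      static_cast<std::uint8_t>((d >> 24) & 0xFF)};
}
```

For instance, if the breakpoint sits on a 1-byte instruction followed by 3-byte and 4-byte ones, all 8 bytes have to be relocated, because the jump would otherwise cut the third instruction in half.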
### BreakpointInjectedSite
To handle the Fast Conditional Breakpoint setup, LLDB uses
__BreakpointInjectedSites__, a subclass of the BreakpointSite class.
BreakpointInjectedSites use different `UserExpression`s to resolve variables
and inject the condition checker.
#### Condition Checker
Because a BreakpointSite can have multiple BreakpointLocations with different
conditions, LLDB first needs to iterate over each owner of the BreakpointSite and
gather all the conditions. If one of the BreakpointLocations doesn't have a
condition or the condition is not set to be injected, the
BreakpointInjectedSite will behave as a regular BreakpointSite.
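
The fallback rule above can be sketched as follows. `LocationSketch` and `GatherInjectableConditions` are hypothetical stand-ins, not LLDB's real classes, and the sketch only models expression conditions. It also assumes a site always has at least one owner.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a BreakpointLocation; not LLDB's real class.
struct LocationSketch {
  std::string condition; // empty when the location has no condition
  bool inject_condition; // true when -I / --inject-condition was requested
};

// Returns the conditions to inject, or std::nullopt when the site has to
// behave as a regular BreakpointSite.
std::optional<std::vector<std::string>>
GatherInjectableConditions(const std::vector<LocationSketch> &owners) {
  assert(!owners.empty() && "sketch assumes a site has at least one owner");
  std::vector<std::string> conditions;
  for (const LocationSketch &loc : owners) {
    // A single unconditional or non-injected owner disables injection for
    // the whole site.
    if (loc.condition.empty() || !loc.inject_condition)
      return std::nullopt;
    conditions.push_back(loc.condition);
  }
  return conditions;
}
```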
Once all the conditions are fetched, LLDB will create a __UserExpression__
with the injected trap instruction.
When a trap is hit, LLDB uses the __BreakpointSiteList__, a map from a trap
address to a BreakpointSite, to identify where to stop. To be able to catch
the injected trap at runtime, LLDB will disassemble the compiled expression
and scan for the trap address. The injected trap address is then added to
LLDB's __BreakpointSiteList__.
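
A minimal sketch of that bookkeeping, assuming the disassembler output is reduced to a list of (address, is-trap) pairs. `DecodedInsn`, `SiteId`, `FindTrapAddress`, and `RegisterInjectedTrap` are hypothetical names used for illustration, and a plain `std::map` stands in for the __BreakpointSiteList__.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <vector>

// Hypothetical decoded instruction, standing in for the disassembler output.
struct DecodedInsn {
  std::uint64_t address;
  bool is_trap; // true for the instruction emitted by __builtin_debugtrap()
};

// Hypothetical identifier for a BreakpointSite.
using SiteId = std::uint32_t;

// Scan the compiled condition checker for the first trap instruction.
std::optional<std::uint64_t>
FindTrapAddress(const std::vector<DecodedInsn> &insns) {
  for (const DecodedInsn &insn : insns)
    if (insn.is_trap)
      return insn.address;
  return std::nullopt;
}

// Record the injected trap so that a later stop at that address can be
// mapped back to its site. Returns false if no trap was found or if the
// address is already registered.
bool RegisterInjectedTrap(std::map<std::uint64_t, SiteId> &site_list,
                          const std::vector<DecodedInsn> &checker_insns,
                          SiteId site) {
  assert(!checker_insns.empty() && "the compiled checker cannot be empty");
  const std::optional<std::uint64_t> trap = FindTrapAddress(checker_insns);
  if (!trap)
    return false;
  return site_list.emplace(*trap, site).second;
}
```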
When generated, this is what the condition checker looks like:
```cpp
void $__lldb_expr(void *$__lldb_arg)
{
    /*lldb_BODY_START*/
    // `condition` stands for the user's breakpoint condition expression.
    if (condition) {
        __builtin_debugtrap();
    }
    /*lldb_BODY_END*/
}
```
#### Argument Builder
The conditional expression will often refer to local variables, and the
references to these variables need to be tied to the instances of them in the
current frame.
Usually the expression evaluator invokes the __Materializer__ which fetches
the variables' values and fills the `$__lldb_arg` structure. But since we don't
want to switch contexts, LLDB has to resolve used variables by generating code
that will initialize the `$__lldb_arg` pointer, before running the condition
checker.
That's where the __Argument Builder__ comes in.
The argument builder uses a `UtilityFunction` to generate the
`$__lldb_create_args_struct` function. It is called by the Trampoline
before the condition checker, in order to resolve variables used in the
condition expression.
`$__lldb_create_args_struct` will fill the `$__lldb_arg` in several steps:
1. It takes advantage of the fact that LLDB saved all the registers to the
stack and maps them into a `register_context` structure.
```cpp
typedef struct {
// General Purpose Registers
} register_context;
```
2. Using information from the variable resolver, it allocates a memory stub
that will contain the used variable addresses.
3. Then, it will use the register values and the collected metadata to
compute the used variable address and write that into the
newly allocated structure.
4. Finally the allocated structure is returned to the trampoline, which will
pass it as an argument to the injected condition checker.
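
The four steps above can be sketched in ordinary C++. This is an illustration under stated assumptions, not the generated `$__lldb_create_args_struct`: it assumes each variable lives at a fixed signed offset from a saved base register, it uses a `std::vector` in place of the allocated memory stub, and `RegisterContextSketch`, `BaseRegister`, `VariableLocation`, and `CreateArgsStruct` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical subset of the saved general purpose registers.
struct RegisterContextSketch {
  std::uint64_t rbp;
  std::uint64_t rsp;
};

enum class BaseRegister { RBP, RSP };

// Hypothetical per-variable metadata, collected when the breakpoint is set.
struct VariableLocation {
  BaseRegister base;   // register the variable is addressed from
  std::int64_t offset; // signed byte offset from that register
  std::size_t slot;    // index of the variable's slot in the args structure
};

// Compute the address of every variable used by the condition and store it
// in its slot. The returned buffer plays the role of `$__lldb_arg`.
std::vector<std::uint64_t>
CreateArgsStruct(const RegisterContextSketch &regs,
                 const std::vector<VariableLocation> &vars) {
  std::vector<std::uint64_t> args(vars.size(), 0);
  for (const VariableLocation &var : vars) {
    assert(var.slot < args.size() && "slot index out of range");
    const std::uint64_t base =
        var.base == BaseRegister::RBP ? regs.rbp : regs.rsp;
    // Unsigned addition wraps, so negative offsets subtract as expected.
    args[var.slot] = base + static_cast<std::uint64_t>(var.offset);
  }
  return args;
}
```

With `rbp = 0x7000`, a variable described as `{RBP, -8}` resolves to address `0x6FF8`, which the condition checker can then dereference without any help from the debugger.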