scottconstable wrote:

Hi @maurer, I can honestly say that I appreciate your attention to detail. 
While your observation that "attackers often have direct control of 
intentionally passed arguments" is true in general, I do not see evidence of 
this among indirect calls in the Linux kernel. 77% of all indirect call 
parameters are pointers, and AFAIK user-mode code cannot exert arbitrary 
control over pointers within the kernel (especially given that x86 Linux 
kernels enable SMAP and SMEP when it is available on the CPU). That leaves the 
other 23% of indirect call parameters to consider, though I admit I don't have 
a concrete breakdown of what fraction of these *might* be attacker controllable.

But let's assume for the sake of discussion that all of the non-pointer 
indirect call parameters are attacker controllable. And because we don't know 
which dead registers at a call site *might* be attacker-controllable, let's 
similarly assume for the sake of discussion that all of these dead registers 
are attacker-controllable. I built the table below to estimate for the 
arity-enhanced scheme a kind of exploitability score, i.e., the expected number 
of attacker-controlled parameters times the frequency of function types of that 
arity. For example, suppose that a hash collision between two function types 
occurs within the arity-1 partition. The expected number of attacker-controlled 
parameters at the first function is 0.23, and likewise for the second. So the 
total number of attacker-controlled parameters is 0.46, times the relative 
frequency of arity-1 function types (roughly 26%) gives a weighted 
exploitability of 0.105138035.

| Arity | Expected # of attacker-controlled args | Weighted exploitability |
| --------------- | --------------- | ----------------------------------------|
| 0 | 0 | 0 |
| 1 | 0.46 | 0.105138035 |
| 2 | 0.92 | 0.318536183 |
| 3 | 1.38 | 0.322375493 |
| 4 | 1.84 | 0.197281482 |
| 5 | 2.3 | 0.109483628 |
| 6+ | 2.76 | 0.093409153 |

Total exploitability score for the arity enhancement: **1.146223975**

A similar analysis can be applied to the existing 32-bit hash scheme. For 
example, suppose a 1-arity function collides with a 2-arity function. Then the 
expected number of attacker-controlled parameters that would be live at the 
mis-matched target would be 0.23 if the 2-arity function calls the 1-arity 
function, or 1.23 if the 1-arity function calls the 2-arity function (because 
the second parameter at the target would be a dead register at the caller). 
This data is captured in the table below, where the columns and rows refer to 
the respective arities of the colliding function types.

|       | 0     | 1     | 2     | 3     | 4     | 5     | 6+ |
| ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- |
| **0** | 0     |       |       |       |       |       | 
| **1** | 1     | 0.46  |       |       |       |       | 
| **2** | 2     | 1.46  | 0.92  |       |       |       |
| **3** | 3     | 2.46  | 1.92  | 1.38  |       |       | 
| **4** | 4     | 3.46  | 2.92  | 2.38  | 1.84  |       | 
| **5** | 5     | 4.46  | 3.92  | 3.38  | 2.84  |  2.3  | 
| **6+**        | 6     |  5.46 | 4.92  | 4.38  | 3.84  | 3.3   | 2.76

Assuming that a collision occurs, this table shows the probability density of 
the arities (note that the probabilities do sum to 1):

|         | 0           | 1           | 2           | 3           | 4           
| 5           | 6+          |
|---------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|
| **0**   | 8.61406E-06 |             |             |             |             
|             |             |
| **1**   | 0.00134164  | 0.052240106 |             |             |             
|             |             |
| **2**   | 0.00203238  | 0.15827159  | 0.119878662 |             |             
|             |             |
| **3**   | 0.001371251 | 0.106786156 | 0.161764743 | 0.054571497 |             
|             |             |
| **4**   | 0.000629365 | 0.049011785 | 0.074245381 | 0.050093506 | 0.011495742 
|             |             |
| **5**   | 0.000279419 | 0.021759723 | 0.032962663 | 0.022239974 | 0.010207511 
| 0.00226591  |             |
| **6+**  | 0.000198662 | 0.015470786 | 0.023435882 | 0.015812236 | 0.007257363 
| 0.003222046 | 0.001145409 |

Then the weighted exploitability for each collision is:

|         | 0           | 1           | 2           | 3           | 4           
| 5           | 6+         |
|---------|-------------|-------------|-------------|-------------|-------------|-------------|------------|
| **0**   | 0           | 0           | 0           | 0           | 0           
| 0           | 0          |
| **1**   | 0.00134164  | 0.024030449 | 0           | 0           | 0           
| 0           | 0          |
| **2**   | 0.00406476  | 0.231076521 | 0.110288369 | 0           | 0           
| 0           | 0          |
| **3**   | 0.004113752 | 0.262693944 | 0.310588307 | 0.075308666 | 0           
| 0           | 0          |
| **4**   | 0.002517459 | 0.169580776 | 0.216796512 | 0.119222544 | 0.021152165 
| 0           | 0          |
| **5**   | 0.001397093 | 0.097048366 | 0.129213637 | 0.075171112 | 0.02898933  
| 0.005211593 | 0          |
| **6+**  | 0.001191971 | 0.084470491 | 0.115304537 | 0.069257593 | 0.027868274 
| 0.010632751 | 0.00316133 |

Total exploitability score for the existing 32-bit hash: **2.201693943**

So, if you accept my reasoning, then a collision in the 32-bit scheme is 
expected to be roughly twice as exploitable as a collision in the 29-bit scheme 
with 3 arity bits. On the other hand, the 29-bit scheme might produce twice as 
many collisions.

https://github.com/llvm/llvm-project/pull/117121
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to