clayborg wrote:

> Could you please explain to me why we use different rules in symbol lookups? 
> Namely:
> 
> * 
> [ClangExpressionDeclMap::GetSymbolAddress](https://github.com/llvm/llvm-project/pull/102835/files#diff-5d2da8306a4f4991885836925979f188658789adc8041c37811c243f2cdca24c)
>  doesn't search in the ModuleList if a module is given even if the search 
> provides no results.
> * 
> [SymbolContext::FindBestGlobalDataSymbol](https://github.com/llvm/llvm-project/pull/102835/files#diff-da1a374f5098c39acfebae4b87a261f143a842e613082c2296de7ee28df12a33)
>  searches in the module first, then checks the results; if exactly one 
> external or internal symbol is found returns it. Otherwise, if more than 1 is 
> found it produces an error. But if nothing is found in the module it searches 
> in the module list and repeats the checks.
>   So theoretically there could be multiple results in the module list that 
> would trigger an error, but we don't check for them.
>   Also, it seems an external symbol takes precedence over an internal symbol. 
> If we found an internal symbol in the module, we still could find an external 
> symbol in the module list, but we don't search for it. Is this correct? Does 
> an internal symbol of the module actually take precedence over an external 
> symbol from the module list?
> * 
> [IRExecutionUnit::FindInSymbols](https://github.com/llvm/llvm-project/pull/102835/files#diff-717a850c4315286c025e2739ebe9dacbf27e626b7679c72479b05c996d721112)
>  looks similar to the previous one, but actually very different. It returns 
> early if and only if the found symbol is external(see 
> LoadAddressResolver::Resolve), otherwise it does a full search in the module 
> list. And only then if an external symbol isn't found - returns the first 
> internal symbol(if any).
> 
> It's hard to generalize the optimization since all the callers post-process 
> the results differently.

Yes, each use case is specialized. Since this is the case, maybe the callback 
mechanism should be used and we give back the results starting with the `hint` 
module first, and then proceed to the entire module list. If return value of 
the callback will say "keep going" or "stop". This allows clients to pick and 
choose what they do.  Unless we can break down the changes into something that 
fits all of these. 


Most of these lookups are trying to figure out the right thing to do given a 
symbol context (module, compile unit, function, inlined function etc), and it 
falls back onto the module list from the target (one for each of the shared 
libraries loaded into your process) to find results. 

If you think about it, starting with the current module makes sense because 
that is where the expression is being evaluated (in some stack frame whose PC 
is in a module). So if we find a symbol in the current module, we can go with 
it. If we can't find a symbol in the current module, then we want to go with 
the right one, but there shouldn't be duplicate symbols if those symbols are 
exported. If a user types `printf(...)`, and lets say we don't find it in our 
current module, then we should look for an external symbol that matches, and if 
there is more than one external symbol for `printf`, that would be an error 
because we wouldn't know which one to call. But if we have a local `printf` 
function in the current module, we might want that one to take precedence. Of 
course if `printf` is a method in a class, and our expression is being run from 
another method in the class, then we should call the `<class>::printf`. But if 
things fall back to finding symbols, then we need to have some way to resolve 
ambiguous symbols. So this is the system we came up with.

If you have an expression like `my_global`, you really want to look at the 
compile unit your expression is being run from, and if that compile unit has a 
global or static variable named `my_global`, we should pick that one. If there 
isn't a match in the current compile unit, then we should search in the current 
module. If it has one, that is most likely the global the user wants. If there 
isn't one in the current module, then we must search around in all modules and 
if we find more than one, lets say we find a global variable from `liba.so` 
where the variable is exported, and another from `libb.so` that is a static 
variable, then we should pick the global variable to try and match the 
visibility that a program would have to a global variable from another shared 
library. The public one wins. But it we have two libraries that both have a 
exported `my_global` variables, we don't know which to pick, so we return an 
error.

`IRExecutionUnit::FindInSymbols` might be the function that is called to search 
for symbols in the first place and `ClangExpressionDeclMap::GetSymbolAddress` 
might be used to get the address of a symbol by name. Why? Because after we 
evaluated an expression, we need to hook things up and tell the code 
generator/linker where this value is. If I have things stated correctly above, 
I am not sure I do, then we want to make sure we find the original symbol 
returned by `IRExecutionUnit::FindInSymbols`. But I will defer to the 
expression parser experts.



https://github.com/llvm/llvm-project/pull/102835
_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

Reply via email to