[Lldb-commits] [lldb] [LLDB] Add Lexer (with tests) for DIL (Data Inspection Language). (PR #123521)

Pavel Labath via lldb-commits Mon, 27 Jan 2025 04:37:31 -0800

https://github.com/labath commented:


There are some simplifications (and one rewrite :P) that I'd like to talk 
about, but I think we're not far.

The main thing that's bothering me is this identifier vs. keyword issue (and 
for this part, I'd like to loop in @jimingham (at least)). 

Your implementation takes a very strict view of what's a possible identifier 
(it must consist of a very specific set of characters, appearing in a specific 
order, and it must not be a keyword). In contrast, the current "frame variable" 
implementation basically treats anything it doesn't recognise (including 
whitespace) as a variable name (and it has no keywords):
```
(lldb) v a*b
error: no variable named 'a*b' found in this frame
(lldb) v "a b"
error: no variable named 'a b' found in this frame
(lldb) v 123
error: no variable named '123' found in this frame
(lldb) v namespace
error: no variable named 'namespace' found in this frame
```

Now, obviously, in order to expand the language, we need to restrict the set of 
variable names, but I don't think we need to do it so aggressively. I don't 
think anyone will complain if we make it harder for them to access a variable 
called `a*b`, but, for example, `namespace` and `💩` are perfectly valid 
variable names in many languages ([one of 
them](https://godbolt.org/z/vjfhj6dfM) is C).

For this reason, I think it'd be better to have a very liberal definition of 
what constitutes a possible variable name (identifier). Instead of 
prescribing a character sequence, I think it'd be better to say that anything 
that doesn't contain one of the characters we recognize (basically: operators) 
is an identifier. IOW, to do something like what `frame variable` does right now.
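
To make the "anything that isn't an operator is an identifier" idea concrete, 
here is a rough sketch. It is not the code from the PR: `LexLiberal`, 
`kReserved` and the exact reserved set are made up for illustration, and 
treating whitespace as a separator and leaving numbers as identifier 
candidates are simplifying assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical set of characters reserved for operators and punctuation.
// The real set would be whatever the DIL grammar ends up needing.
constexpr std::string_view kReserved = "()[]*&.-+>:";

inline bool IsReserved(char c) {
  return kReserved.find(c) != std::string_view::npos;
}

// Assumption: spaces and tabs separate tokens.
inline bool IsSpace(char c) { return c == ' ' || c == '\t'; }

// Splits the input into tokens. Each reserved character is a token by itself;
// any maximal run of other non-space bytes is an identifier candidate. No
// character classes are checked, so UTF-8 names and words such as
// "namespace" pass through as plain identifiers.
std::vector<std::string> LexLiberal(std::string_view input) {
  std::vector<std::string> tokens;
  std::size_t pos = 0;
  while (pos < input.size()) {
    char c = input[pos];
    if (IsSpace(c)) {
      ++pos;
      continue;
    }
    if (IsReserved(c)) {
      tokens.emplace_back(1, c);
      ++pos;
      continue;
    }
    std::size_t start = pos;
    while (pos < input.size() && !IsSpace(input[pos]) &&
           !IsReserved(input[pos]))
      ++pos;
    assert(pos > start && "an identifier run is never empty");
    tokens.emplace_back(input.substr(start, pos - start));
  }
  return tokens;
}
```

With this sketch, `namespace` and `💩` each come out as a single identifier 
candidate, while `a*b` is split into three tokens. Whether a candidate names 
an actual variable is left to the lookup, as `frame variable` does today.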

As for keywords, I think it'd be best to have as few of them as possible, and 
for the rest, to treat their keyword-ness as context-dependent wherever 
possible. What I mean by that is to recognise them as keywords only within 
contexts where such usage would be legal. The `namespace` keyword is a prime 
example of that. I *think* the only place where this can appear as a keyword in 
the context of the DIL is within the `(anonymous namespace)` group. If that's 
the case, then I think we should be able to detect that and disambiguate the 
usage, so that e.g. `v namespace` prints the variable called `namespace` and `v 
(anonymous namespace)::namespace` prints the variable called `namespace` in an 
anonymous namespace (in an evil language which has anonymous namespaces but 
allows you to use variables called `namespace`).

The way to do that is probably to *not* treat the string "namespace" as a 
keyword at the lexer level, but to recognize it later, when we have more 
context available.
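
A rough sketch of that later recognition, under stated assumptions: the 
`Token` type, `IsAnonymousNamespaceAt` and `ParseQualifiedName` are 
hypothetical names, not the PR's API, and the lexer is assumed to emit 
`namespace` and `anonymous` as ordinary identifiers with `::` as one token.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token type: the lexer only distinguishes identifiers from
// punctuation and has no keyword kind at all.
struct Token {
  enum Kind { Identifier, LParen, RParen, ColonColon } kind;
  std::string text;
};

// True if the four tokens starting at `pos` spell "(anonymous namespace)".
// This is the only place where the two words get special treatment.
bool IsAnonymousNamespaceAt(const std::vector<Token> &toks, std::size_t pos) {
  return pos + 3 < toks.size() && toks[pos].kind == Token::LParen &&
         toks[pos + 1].kind == Token::Identifier &&
         toks[pos + 1].text == "anonymous" &&
         toks[pos + 2].kind == Token::Identifier &&
         toks[pos + 2].text == "namespace" &&
         toks[pos + 3].kind == Token::RParen;
}

// Parses `component (:: component)*` and returns the components, or an empty
// vector on a syntax error. A component is either an identifier or the
// "(anonymous namespace)" group.
std::vector<std::string> ParseQualifiedName(const std::vector<Token> &toks) {
  std::vector<std::string> parts;
  std::size_t pos = 0;
  while (pos < toks.size()) {
    if (IsAnonymousNamespaceAt(toks, pos)) {
      parts.push_back("(anonymous namespace)");
      pos += 4;
    } else if (toks[pos].kind == Token::Identifier) {
      // Plain identifier, even if it happens to be spelled "namespace".
      parts.push_back(toks[pos].text);
      ++pos;
    } else {
      return {}; // unexpected token
    }
    assert(pos <= toks.size());
    if (pos < toks.size()) {
      if (toks[pos].kind != Token::ColonColon)
        return {}; // components must be joined by "::"
      ++pos;
      if (pos == toks.size())
        return {}; // trailing "::"
    }
  }
  return parts;
}
```

In this sketch, the tokens for `namespace` parse to a single component, and 
the tokens for `(anonymous namespace)::namespace` parse to the anonymous 
namespace group followed by the variable name `namespace`.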

What do you all think?

https://github.com/llvm/llvm-project/pull/123521
_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
