On Thu, Mar 13, 2025 at 2:30 PM Markus Armbruster <arm...@redhat.com> wrote:

> John Snow <js...@redhat.com> writes:
>
> > On Thu, Mar 13, 2025, 11:57 AM Markus Armbruster <arm...@redhat.com> wrote:
> >
> >> John Snow <js...@redhat.com> writes:
> >>
> >> > On Thu, Mar 13, 2025 at 10:41 AM Markus Armbruster <arm...@redhat.com> wrote:
> >> >
> >> >> John Snow <js...@redhat.com> writes:
> >> >>
> >> >> > On Thu, Mar 13, 2025 at 2:47 AM Markus Armbruster <arm...@redhat.com> wrote:
> >> >> >
> >> >> >> John Snow <js...@redhat.com> writes:
> >> >> >>
> >> >> >> > This patch does three things:
> >> >> >> >
> >> >> >> > 1. Record the current namespace context in pending_xrefs so it can be
> >> >> >> >    used for link resolution later,
> >> >> >> > 2. Pass that recorded namespace context to find_obj() when resolving a
> >> >> >> >    reference, and
> >> >> >> > 3. Wildly and completely rewrite find_obj().
> >> >> >> >
> >> >> >> > cross-reference support is expanded to tolerate the presence or absence
> >> >> >> > of either namespace or module, and to cope with the presence or absence
> >> >> >> > of contextual information for either.
> >> >> >> >
> >> >> >> > References now work like this:
> >> >> >> >
> >> >> >> > 1. If the explicit reference target is recorded in the domain's object
> >> >> >> >    registry, we link to that target and stop looking. We do this lookup
> >> >> >> >    regardless of how fully qualified the target is, which allows direct
> >> >> >> >    references to modules (which don't have a module component to their
> >> >> >> >    names) or direct references to definitions that may or may not belong
> >> >> >> >    to a namespace or module.
> >> >> >> >
> >> >> >> > 2. If contextual information is available from qapi:namespace or
> >> >> >> >    qapi:module directives, try using those components to find a direct
> >> >> >> >    match to the implied target name.
> >> >> >> >
> >> >> >> > 3. If both prior lookups fail, generate a series of regular expressions
> >> >> >> >    looking for wildcard matches in order from most to least
> >> >> >> >    specific. Any explicitly provided components (namespace, module)
> >> >> >> >    *must* match exactly, but both contextual and entirely omitted
> >> >> >> >    components are allowed to differ from the search result. Note that if
> >> >> >> >    more than one result is found, Sphinx will emit a warning (a build
> >> >> >> >    error for QEMU) and list all of the candidate references.
> >> >> >> >
> >> >> >> > The practical upshot is that in the large majority of cases, namespace
> >> >> >> > and module information is not required when creating simple `references`
> >> >> >> > to definitions from within the same context -- even when identical
> >> >> >> > definitions exist in other contexts.
> >> >> >>
> >> >> >> Can you illustrate this with examples?
> >> >> >>
> >> >> >
> >> >> > do wha?
> >> >>
> >> >> Sorry, I went into the curve too fast.
> >> >>
> >> >> The stuff under "References now work like this" confuses me.  I guess it
> >> >> describes a series of lookups to try one after the other.
> >> >>
> >> >> I understand a cross-reference consists of namespace (optional), module
> >> >> (optional), name, and role.
> >> >>
> >> >> Let's assume role is "any" for simplicity's sake.
> >> >>
> >> >> Regarding "1. If the explicit ...":
> >> >>
> >> >>     What is a reference's "explicit reference target"?  Examples might
> >> >>     help me understand.
> >> >>
> >> >
> >> > explicit lookup: `QMP:block-core:block-dirty-bitmap-add`
> >> >
> >> > If that explicit target matches an object in the object database
> >> > *directly*, we match immediately and don't consider other potential
> >> > targets. This also applies to things like modules, e.g. `QMP:block-core`
> >> > even though the "module" is absent (it IS the module)
> >> >
> >> > We always search for the explicit target no matter how un/fully qualified
> >> > it is.
> >> >
> >> >
> >> >>
> >> >>     What is "recorded in the domain's object registry"?
> >> >>
> >> >
> >> > domain.objects{} - essentially a record of every ObjectDefinition's
> >> > "fullname" - the return value from QAPIDefinition._get_fqn().
> >> >
> >> >
> >> >>
> >> >>     Can you show me a reference where this lookup succeeds?
> >> >>
> >> >
> >> > `QMP:block-core`
> >> > `QMP:block-core.block-dirty-bitmap-add`
> >>
> >> So, for this lookup to work, the reference must either be of the form
> >> NAMESPACE:MODULE and resolve to that module in that namespace, or of the
> >> form NAMESPACE:MODULE:DEFN and resolve to that definition in that module
> >> in that namespace.  Correct?
> >>
> >
> > Yes.
> >
> >
> >> These are "fully qualified names (FQN)" in your parlance, right?
> >>
> >
> > More or less, though as you found below...
> >
> >
> >> Note that the first form is syntactically indistinguishable from
> >> NAMESPACE:DEFN, i.e. a reference to a definition that specifies the
> >> namespace, but not the module.
> >>
> >> If the NAMESPACE:MODULE interpretation resolves, we never try the
> >> NAMESPACE:DEFN interpretation, because that happens in later steps.
> >> Correct?
> >>
> >> The first form is fully qualified only if it resolves as FQN.  So,
> >> whether such a reference is fully qualified is not syntactically
> >> decidable.  Hmm.
> >>
> >
> > You're right. There is a weirdness here. I might need to do some more
> > thinking to make sure it isn't theoretically a problem, but in practice,
> > right now, it isn't.
>
> Not a blocker, but please do your thinking :)
>
> > Stay tuned, I guess.
> >
> >
> >> >> Regarding "2. If contextual information ...":
> >> >>
> >> >>     I guess "contextual information" is the context established by
> >> >>     qapi:namespace and qapi:module directives, i.e. the current
> >> >>     namespace and module, if any.
> >> >>
> >> >
> >> > Yep!
> >> >
> >> >
> >> >>
> >> >>     If the cross reference lacks a namespace, we substitute the current
> >> >>     namespace.  Same for module.
> >> >>
> >> >>     We then use that "to find a direct match to the implied target
> >> >>     name".  Sounds greek to me.  Example(s) might help.
> >> >>
> >> >
> >> > If namespace or module is missing from the link target, we try to fill in
> >> > the blanks with the contextual information if present.
> >> >
> >> > Example, we are in the block-core section of the QEMU QMP reference manual
> >> > document and we reference `block-dirty-bitmap-add`. With context, we are
> >> > able to assemble a fully qualified name:
> >> > `QMP:block-core.block-dirty-bitmap-add`. This matches an item in the
> >> > registry directly, so it matches and no further search is performed.
> >>
> >> We try this lookup only when the reference lacks a namespace and we are
> >> "in" a namespace, or when it lacks a module and we are "in" a module.
> >> Correct?
> >>
> >
> > or both: if we provided only a name but the context has both a namespace
> > and module.
>
> So my or is inclusive :)
>
> > essentially the algorithm splits the explicit target into (ns, mod, name)
> > and for any that are blank, we try to fill in those blanks with context
> > info where available. Sometimes you have neither explicit nor contextual
> > info for a component.
> >
> > Then we do a lookup for an exact match, in order;
> >
> > 1. explicit target name, whatever it was
>
> Fully qualified name.
>

Yes, for lookup to succeed it should be fully qualified, though if the
target text is "ns:module", that's actually going to succeed here, too.


>
> If lookup succeeds, we're done.
>
> If lookup fails, we're also done.
>

If lookup fails, we actually continue on to #2, but whether this does
anything useful depends on whether the original target text was fully
qualified. If it was, #2 searches with the exact same text, fails again,
and proceeds to #3, where, because we had a fully qualified name, none of
the search conditions apply and we just exit.

(It just lacks an early return, but abstractly, if lookup on #1 fails with
a fully qualified name, we are indeed done.)

If lookup fails because it wasn't actually fully qualified, then #2 has
some gaps to try to fill.
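
As a rough illustration of the two exact lookups (#1 and #2), here is a
minimal Python sketch. It is not the actual find_obj() code: split_target(),
join_fqn() and exact_lookup() are hypothetical helpers, a plain set of names
stands in for domain.objects, and it assumes names have the form NS:MOD.NAME
with no ':' or '.' inside a component.

```python
from typing import Optional, Set, Tuple


def split_target(target: str) -> Tuple[str, str, str]:
    """Split 'NS:MOD.NAME' into (ns, mod, name); absent parts become ''."""
    ns, _, rest = target.rpartition(":")
    mod, _, name = rest.rpartition(".")
    return ns, mod, name


def join_fqn(ns: str, mod: str, name: str) -> str:
    """Reassemble whichever components are present."""
    fqn = f"{mod}.{name}" if mod else name
    return f"{ns}:{fqn}" if ns else fqn


def exact_lookup(registry: Set[str], target: str,
                 ctx_ns: str = "", ctx_mod: str = "") -> Optional[str]:
    # 1: the explicit target text, however (un)qualified it is.
    if target in registry:
        return target
    # 2: context fills in blanks only; explicit parts are never overwritten.
    ns, mod, name = split_target(target)
    implied = join_fqn(ns or ctx_ns, mod or ctx_mod, name)
    if implied in registry:
        return implied
    # Neither exact lookup hit; the fuzzy searches (#3) would come next.
    return None


registry = {"QMP:block-core", "QMP:block-core.block-dirty-bitmap-add"}
```

With that registry, `QMP:block-core` resolves in #1, a bare
`block-dirty-bitmap-add` resolves in #2 when the context is QMP/block-core,
and the same bare name falls through both when the context is
QMP/transaction.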


>
> *Except* for the ambiguous form NAMESPACE:MYSTERY.  If lookup fails for
> that, the name is not fully qualified after all.  Probably.  Maybe.  We
> assume it's missing a module, and proceed to 2.
>
> I'm mostly ignoring this exception from now on to keep things simple.
>
> > 2. FQN using contextual info
>
> Partially qualified name, but context can fill the gaps.
>
> If lookup succeeds, we're done.
>
> Else, we proceed to 3.
>

That's right.


>
> > and we stop after the first hit - no chance for multiple results here, just
> > zero-or-one each step.
> >
> > i.e. any explicitly provided information is never "overwritten" with
> > context, context only fills in the blanks where that info was not provided.
> >
> > If neither of these work, we move on to fuzzy searching.
> >
> >
> >> We then substitute current namespace / module for the lacking one(s), and
> >> try the same lookup as in 1.  Correct?
> >>
> >
> > Yes!
> >
> >
> >> If we have a reference of the form MYSTERY, it could either be a
> >> reference to module MYSTERY in the current namespace, or to definition
> >> MYSTERY in the current namespace and module.  How do we decide?
> >>
> >
> > fqn a: NS:MYSTERY
> > fqn b: NS:MOD:MYSTERY
> >
> > Given we have a current ns/mod context, it's going to pick the second one.
> >
> > Hm. Maybe it ought to be ambiguous in this case... I'll have to revise
> > this. (question is: how soon do we need it?)
>
> While we should try to put this on a more solid foundation, it is not a
> blocker.
>
> >> >> Regarding "3. If both prior lookups fail ...":
> >> >>
> >> >>     I guess we get here when namespace or module are absent, and
> >> >>     substituting the current namespace or module doesn't resolve.  We
> >> >>     then substitute a wildcard, so to speak, i.e. look in all namespaces
> >> >>     / modules, and succeed if we find exactly one resolution.  Fair?
> >> >>
> >> >
> >> > More or less, though the mechanics are quite a bit more complex than your
> >> > overview (and what I wrote in qapi-domain.rst.) We can get here for a few
> >> > reasons:
> >> >
> >> > (1) We didn't provide a fully qualified target, and we don't have full
> >> > context to construct one.
>
> We skipped 1. because not fully qualified, and we skipped 2. because
> context can't fill the gaps.
>

We tried #1 and failed, then we tried #2 and failed again.


>
> >> >                           For example, we are not "in" a namespace and/or
> >> > not "in" a module. This is quite likely to happen when writing simple
> >> > references to a definition name from outside of the transmogrified QAPI
> >> > documentation, e.g. from qapi-domain.rst itself, or dirty-bitmaps.rst, etc.
>
> Yes.
>
> >> > (2) We didn't provide a fully qualified target, and we are referencing
> >> > something from outside of the local context. For example, we are "in" a
> >> > module but we are trying to link to a different module's definition. e.g.
> >> > we are in QMP:transaction and we reference `block-dirty-bitmap-add`. The
> >> > implied FQN will be QMP:transaction.block-dirty-bitmap-add, which will not
> >> > resolve.
>
> We skipped 1. because not fully qualified, and we failed 2. because
> context filled the gaps, but lookup failed.
>

Failed #1 and Failed #2.


>
> >> >
> >> > The fuzzy search portion has an order of precedence for how it searches -
> >> > and not all searches are tried universally, they are conditional to what
> >> > was provided in the reference target and what context is available.
> >> >
> >> > 1. match against the explicitly provided namespace (module was not
> >> > explicitly provided)
> >>
> >> Look for the name in all of the namespace's modules?
> >>
> >
> > Yeah. search for "ns:*.name" and "ns:name" basically.
>
> Got it.
>
> >> > 2. match against the explicitly provided module (namespace was not
> >> > explicitly provided)
> >>
> >> Look for the name in all modules so named in all namespaces?
> >>
> >
> > Yes.
>
> Got it.
>
> >> > 3. match against the implied namespace (neither namespace/module was
> >> > explicitly provided)
> >>
> >> ?
> >>
> >
> > User gave `foo`, but we have a namespace from context, so we look for
> > ns:*.foo or ns:foo.
>
> Got it.
>
> Detail I had not considered until now: a namespace contains modules that
> contain definitions, but it also contains definitions directly.
>
> I can't recall offhand how schema.py represents this.  I'll figure it
> out and report back.
>

I think it gets charged to a module named "qapi-schema". Silly, but it
doesn't really break anything.


> >> > 4. match against the implied module (neither namespace/module was
> >> > explicitly provided)
> >>
> >> ?
> >>
> >
> > User gave `foo`, but we have a module from context, so we search for
> > *:mod.foo and mod.foo
>
> Got it.
>
> >> > 5. match against the definition name only, from anywhere (neither
> >> > namespace/module was explicitly provided)
> >>
> >> Look for the name anywhere?
> >>
> >> I need examples :)
> >>
> >
> > user gave `foo`. search for any qapi definition in all modules and
> > namespaces for anything with the name "foo". The broadest possible search.
> >
> > Doesn't search for stuff outside of the QAPI domain directly, but be aware
> > when using `references` that all domains are consulted, so it may in fact
> > match something else from somewhere else, though not by any doing of the
> > qapi domain.
> >
> > i.e. :qapi:any:`foo` will definitely only search using the rules laid out
> > in this patch/thread, but `foo` will consult all domains (and whine if more
> > than one result between all domains is identified.)
>
> Got it, I think.
>
> >> > The searches are performed in order: if a search returns zero results, the
> >> > next search is tried. If any search returns one or more results, those
> >> > results are returned and we stop searching down the list. The priority
> >> > order ensures that any explicitly provided information is *always* used to
> >> > find a match, but contextually provided information is merely a "hint" and
> >> > can be ignored for the sake of a match.
> >> >
> >> > If find_obj() as a whole returns zero results, Sphinx emits a warning for a
> >> > dangling reference. If find_obj() as a whole returns multiple results,
> >> > Sphinx emits a warning for the ambiguous cross-reference.
> >> >
> >> > QEMU errors out on any such warnings under our normal build settings.
> >> >
> >> > Clear as mud?
> >>
> >> Clearer, but not quite mud, yet.
> >>
> >
> > Ultimately, search in this order and stop as soon as any of these
> > searches returns at least one result:
> >
> > 1. Explicitly provided name, whatever it is
> > 2. FQN using contextual info
> > 3. Explicitly provided NS; any module
> > 4. Explicitly provided module; any NS
> > 5. Contextual NS; any module
> > 6. Contextual module; any NS
> > 7. any NS/module.
> >
> > with searches 3-7 attempted only when their criteria are met:
> >
> > 3. Must have explicit NS (and no explicit module)
> > 4. Must have explicit module (and no explicit NS)
> > 5. Must have contextual NS (must not have explicit NS nor module)
> > 6. Must have contextual module (must not have explicit NS nor module)
> > 7. Must have neither explicit NS nor module.
>

I should point out that:

- #3 and #4 are mutually exclusive
- #5 and #6 are mutually exclusive
- #3/#4 and #5/#6 are mutually exclusive
- Only one of #3/#4/#5/#6 will happen for a given reference.
- If we choose #5 or #6, #7 may also happen. If we choose #3 or #4, #7 will
never happen.

I bet I can make the code here cleaner by combining some of these; it's
just a lot of context on the input to keep in mind while designing the
conditionals, so I wrote it out very "explicitly" to help myself make sense
of it ...
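
To make the exclusivity rules above concrete, here is a small Python sketch
of which fuzzy searches (#3 through #7) would be attempted for a given
reference. It is an illustration, not the code in the patch:
applicable_searches() is a hypothetical helper, and preferring #5 over #6
when both a contextual namespace and a contextual module are available is an
assumption drawn only from the order of the list.

```python
from typing import List


def applicable_searches(ns: str, mod: str,
                        ctx_ns: str = "", ctx_mod: str = "") -> List[int]:
    """Return the fuzzy search numbers to attempt, in priority order.

    ns/mod are the explicitly provided components of the reference
    target; ctx_ns/ctx_mod come from the surrounding directives.
    Empty strings mean "not provided".
    """
    if ns and mod:
        # Fully qualified: no fuzzy search applies, we just exit.
        return []
    if ns:
        return [3]       # explicit NS; any module
    if mod:
        return [4]       # explicit module; any NS
    # Nothing explicit beyond the name: context is only a hint, so the
    # broadest search (#7) remains available as a fallback.
    if ctx_ns:
        return [5, 7]    # contextual NS; any module (assumed to beat #6)
    if ctx_mod:
        return [6, 7]    # contextual module; any NS
    return [7]           # any NS/module
```

For example, applicable_searches("", "", "QMP", "block-core") yields
[5, 7], while applicable_searches("QMP", "") yields only [3]: an explicit
namespace is never relaxed.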


> >
> > In summary:
> >
> > * Anything explicitly provided *must* match. This information is *never*
> > ignored.
> > * Anything implicitly provided through context will never overwrite
> > anything explicitly provided. This is used as a *hint* in the event
> > multiple candidates are found, but results are allowed to deviate when
> > stronger matches are not located.
> >
> > i.e. contextual information is used to *prefer* a result from that context,
> > but is not used to *limit* you to that context.
> >
> > by contrast, explicit info is used to directly filter and restrict search.
>
> Makes sense.
>
> > (With maybe a bug or two for trying to find module names in some
> > circumstances. Will have to check that over...)
>
> Thank you!
>
>
You're welcome!
