On 07/23/16 03:45, Michal Meloun wrote:
Dne 21.07.2016 v 17:53 Nathan Whitehorn napsal(a):
On PowerPC, GENERIC64 supports FDT systems (some IBM hardware), OFW
systems (Macs, some IBM hardware), systems with no device trees at
all (old-style PS3), and systems with a mixture of device tree and no
device tree (new-style PS3). On these, there is a mixture of "real"
interrupts and GPIO-type interrupts. There is no limitation that this
be used only for device-tree-type systems.
The system requires two things: an interrupt domain key and an
arbitrary unique byte string describing the interrupt. When running
with a device tree, these are set to the phandle of the
interrupt-parent and the contents of the device tree interrupt
specifier, respectively, and the system was of course developed with
that in mind. But they don't need to be, and often aren't. You could
make the domain an element of an enum (where "ACPI" is a choice, for
instance -- this is what PS3 does), or set it to a pointer to a
device_t, or really anything you like. Similarly, the interrupt
specifier is totally free-form.
Yes, I agree. and i think that we followed the same direction. But i
see two problems with this approach.
- in some cases (OFW, device_t) you don't have control over domain
key value, so you cannot guarantee its uniqueness.
Of course, probability of collision is low, but it is.
We could solve this in a number of ways, for example widening to 64
bits, or adding another value (domain type, for example). You could also
make an acpi_bus_map_intr() to go with the OFW one that connect in some
machine-dependent code if you have fundamentally incompatible bus
enumeration mechanisms that you expect to exist simultaneously -- but,
of course, no systems like this seem to actually exist, so the problem
is both easily solved and totally theoretical.
- within ofw_bus_map_intr() (or later - at the time when byte string
must be decoded) you are not able (easily) to differentiate
between different formats, thus you are not able to select
appropriate decoder. The GPIO controller, for example,
must be able to handle interrupts defined by standard OFW property,
or by <device_t, pin number> pair concurrently.
In principle, you could solve that as above, or by registering a second
interrupt domain for the same controller.
In practice, it doesn't matter since, in the GPIO case, for example, the
GPIO controller is never itself also a normal interrupt controller (i.e.
the GIC and GPIO controller are always different devices). As such, the
theoretical does not occur in practice.
For this reason we makes domain key composite, in our model, the
domain key consist of "domain key type"
and "domain key value". This composite key guarantees uniqueness and
it also allows to select proper parser for byte string.
Yes, but this solves what is a nonexistant problem by making the system
substantially less flexible and more invasive. Which is not a good tradeoff.
This is, imho, only one difference between us.
One of many, yes.
You could, for instance, set it to one of the structures introduced
in r301453 if you wanted to.
I would have zero problems with changing the prototype of the
existing dev/ofw function to something more generic in name, like:
bus_map_intr(device_t dev, uintptr_t iparent, size_t intrlen, void *intr)
instead of the existing equivalent:
ofw_bus_map_intr(device_t dev, phandle_t iparent, int icells, pcell_t
*intr)
Our bus_map_intr() method is not indeed as replacement of
ofw_bus_map_intr(). Its evolution of "how we can store more complex
data to resource list (from bus enumerator) and transfer it to
bus_allocate_resource() and/or to bus_setup_intr()" in driver
independent way. We found no reasonable way to do it, so we postponed
reading of properties to bus_allocate_resource() time.
Right, but that is (a) a solved problem with ofw_bus_map_intr() and (b)
this code doesn't solve it as completely. What does it let you do that
the existing code does not? There has not been a single concrete example
of something anyone wants to do on actual hardware so far in this
discussion.
But now jhb@ gives us alternative and I must say that this looks like
a clean and elegant way how to make this (assuming that we can expand
resource_list_entry by pointer to alloc_resource_args)
Except that jhb@'s suggestion doesn't actually work for interrupts for
the reasons I and others have pointed out. We could make such a system,
but it would be a different one.
By this, one byte string in OFW encoding can describe one IRQ and
exactly same string in UEFI encoding can describe different IRQ.
Or, in reverse, OFW and UEFI can describe same (and compatible) IRQ
by two different strings.
This is exact reason, why we discards virtual IRQ idea and I think
that this fact is root issue of this clash.
Probably it doesn't make sense to talk about others, unless we can
find consensus on this.
You have the larger problem if you end up in this situation that you
are enumerating the hardware by two different and incompatible
techniques. There simply is no way to solve this unless you either
(a) segregate the system into an ACPI-enumerated domain and an
OF-enumerated domain, in which case the problem vanishes, (b) discard
one enumeration, which is what arm64 does and will always do,
according to Andrew in another post, or (c) make some incredibly
complex merging code that would naturally handle interrupts with
everything else. So I don't think this is an actual, real problem.
I think that above proposed solution resolves this gracefully.
Assuming you are talking about jhb's decoration plan in the bus
hierarchy, it doesn't actually work for interrupts at all because of
lateral connections and circular dependencies.
2. It partially duplicates the functionality of
OFW_BUS_MAP_INTR(), but is both problematically more general and
less flexible (it has requirements on timing of PIC attachment
vs. driver resource allocation)
OFW_BUS_MAP_INTR() can parse only OFW based data and expect that
parsed data are magicaly stored within the call.
The new method, bus_map_intr(), can parse data from multiple
sources (OFW, UEFI / ACPI, synthetic[gpio device + pin number]).
It also returns parsed data back to caller.
That is not true. It works as long as you can specify the interrupt
state as a 32-bit key of some kind for the PIC and a string of
arbitrary data, which works with all of those. You could even make
the interrupt data be a pointer to exactly the structs you have
chosen to define here.
Nope, in heterogeneous world, same string can describe two different
IRQs and/or two different strings can describe single IRQ in
compatible manner.
Can you give *any* concrete example of this that doesn't involve
mixed ACPI/FDT enumeration of a single system where devices appear in
both trees, which doesn't actually ever happen?
GPIO - its interrupt function can be accessed by regular "interrupts"
property, or it can be derived from GPIO pin.
The GPIO controller has one or more interrupts assigned to it that are
part of its parent's interrupt domain. It cascades them to one or more
virtual interrupts that belong to its own domain. Where is the ambiguity
or complication here?
3. It is not fully transparent to end code. Since it happens at
bus_alloc_resource() time, it is complicated to get the
appropriate values for IRQs constructed by composite techniques
(interrupt-map vs. interrupts vs. hand allocation vs. PCI
routing, for example).
I don't see any limitation - can you be more exact? Why is not
transparent? Why is more complicated ?
Suppose that a PCI device adds more IRQs to its resource list or
modifies the ordering. How is whatever bus layer supposed to do
something sensible at allocation time? It requires that RID numbers
mean something to the parent bus after assignment, which is not
guaranteed by anything and is, in more than handful of cases I
think of, not true in practice.
Sure. And since the new code allows delivering resources in RL, so I
don't see any limitation here.
It indexed mapping by RID and then searches interrupt lists by that
to get the interrupt-parent. This is fundamentally a broken design if
the child needs to, say, add a second interrupt to its RL on a
different interrupt-parent.
?? I don't understand. The new code doesn't need this.
int bus_generic_map_intr(device_t dev, device_t child, int *rid,
rman_res_t *start, rman_res_t *end,
rman_res_t *count, struct intr_map_data
**imd);
This works either by rid or by rman_res_t. In the event of a
self-assigned interrupt from the child, which is a fairly common case,
neither of these values are meaningfully parseable by the bus parent,
especially if the interrupt is on a different controller. The main issue
is that this code goes back to the device tree according to the
*parent*'s idea of how it should be parsed and there are many cases
(GPIO interrupts, for example) when the *child* needs to amend the list
in a different way.
It is much easier to do this correctly at bus attach time when
the resource lists are made (how PPC does it).
I don't agree. I don't agree. Making this at bus attach time leads
into complicated 'virtual' IRQ infrastructure, with many
unresolved corner cases.
Which unresolved corner cases? This has been working correctly on a
number of platforms in both FreeBSD and Linux for many years.
Nope, it doesn't work for ARM yet (for GPIO interrupts for example)
and Linux uses EPROBE_DEFER mechanism for a long time.
See: http://lxr.free-electrons.com/source/drivers/base/platform.c#L87
There is some missing code on ARM (probably about 30 minutes of work
to make it match PowerPC) to make it work in an ideal case, sure, but
there is no reason you could not go out right now, with the existing
code, and implement GPIO interrupts by declaring the GPIO driver as
an interrupt controller.
Can you give any concrete case of something that doesn't work?
GPIO again. How you can allocate interrupt associated with given gpio
pin installed by "cd-gpios = <&gpio TEGRA_GPIO(V, 2) GPIO_ACTIVE_LOW>;"
You parse that as an interrupt on a interrupt domain associated with the
GPIO controller referenced by &gpio. In pseudo-code:
int irq = ofw_bus_map_intr(dev, <&gpio>, ncells, <TEGRA_GPIO(V, 2)
GPIO_ACTIVE_LOW>);
The GPIO controller, meanwhile, has registered an interrupt domain for
<&gpio> and is asked to decode and configure the interrupt defined by
<TEGRA_GPIO(V, 2) GPIO_ACTIVE_LOW>, which it knows how to parse. This is
simple and straightforward.
(1) is easy to fix without API changes, but (2) and (3) are
fundamental architectural problems that will bite us immediately
down the road and cause a permanent schism between OF support on
different platforms.
Let me describe how this is handled on PowerPC (Linux on PPC
solves the problem the same way). When constructing a resource
list, bus drivers that construct them from OF properties call
ofw_bus_map_intr() with the interrupt parent phandle and the
array of cells corresponding to the interrupt. This thunks
immediately to nexus, which connects to code in intr_machdep.c.
Code there assigns a unique made-up virtual IRQ and returns it,
caching the interrupt parent ID and opaque interrupt data (if the
same string of data reappears later, you get back the same
virtual IRQ of course).
When PIC drivers attach and register themselves with the
interrupt handling layer, all the interrupts for that PIC are
passed to it along with the virtual IRQ. The PIC driver is
supposed to know what its interrupt data mean, which can be
safely guaranteed, and it presents the assigned virtual IRQ
number to the kernel when dispatching interrupts. (IRQs
configured after PIC attachment are passed through immediately).
This accomplishes the following things:
1. Parsing interrupt data is moved to the PIC driver, which is
the only place it can be done safely.
I don't see anything different comparing with INTRNG.
What I am advocating *is* INTRNG, at least as originally conceived
and implemented.
2. There is no ordering requirement on PIC attachment vs. the
attachment of anything else.
I think thats is not a true - PIC must exist before
bus_alloc_resource() / bus_setup_intr() is called.
It does not with the IRQ mapping infrastructure. Interrupts are set
up at PIC attachment, whenever that occurs.
Assuming that bus_alloc_resource and bus_setup_intr() are close
thorougher and in linear piece of code, can i assume that you can
call bus_setup_intr()
without PIC attached ?
Yes.
So driver can request and/or setup any random IRQ and gets success
from bus_alloc_resource() or bus_setup_intr()?
Do you think that is this right behavior?
Yes. And it is a behavior required by newbus.
3. Changes are extremely minimal relative to the "standard"
interrupt flow: you only have to patch code that is already
directly dealing with OF interrupts.
I don't see anything different comparing with INTRNG.
Again, this was the original INTRNG architecture and is already
implemented. As such, there are *no* changes required on ARM to get
it. bus_map_intr() adds a bunch of new code, in parallel with the
old code that also solves the problem, to no purpose.
So, on PPC, how i can get interrupt for GPIO pin described by this
property:
https://svnweb.freebsd.org/base/head/sys/gnu/dts/arm/tegra124-jetson-tk1.dts?revision=295436&view=markup#l1691
The GPIO controller registers itself as an interrupt domain and
decodes those strings as IRQ specifiers. When interrupts are
configured, it does whatever it needs to do to configure them
appropriately and dispatches them to the kernel when they occur. It's
a pretty trivial cascaded interrupt configuration. And, since the
VIRQ code does not need the interrupt controller attached in advance,
you don't need to worry about attach order of wherever &gpio points
and the SDHCI driver.
4. It happens at bus enumeration time, when results can be
guaranteed self-consistent.
Where do you see any potential source of inconsistency in INTRNG?
See the example above about modified interrupt lists. There is also
no obvious way for a child device to construct an interrupt not
assigned to it by the parent device from device tree properties
without knowing in some detail what kind of interrupt needs to be
built.
5. It combines naturally with ofw_bus_lookup_imap() and friends
in the interrupt-map case (e.g. for PCI).
Again, I don't see anything different. Proper parsing of interrupt
property is not a problem of INTRNG (but must be fixed, of course).
But it is *already* fixed by the standard code that already exists.
You are introducing a less-functional parallel code path here.
NO, its not fixed, at least not for ARM.
Why not, concretely? I'm happy to write whatever code is missing if
there's a bug. It can't be more than a few tens of lines in arm/intr.c.
Interrupt maps are not covered by current ARM code.
This suggests otherwise:
files.arm:dev/ofw/ofwpci.c optional fdt pci
But it's quite possible that code is not being used as widely as it
should, since it's a new introduction to the ARM tree. That the existing
code is not being used broadly enough is hardly a reasonable to invent
something new.
I'm not sure what the right path forward is, but this code needs
to be fixed. The PowerPC code is fully MI, and was the template
for the original INTRNG, so it shouldn't be too bad to replace.
-Nathan
So, new INTRNG:
- Introduces new more general bus method that can parse interrupt
configuration
data from any source. Is this step backward?
Yes, since it is more general in some sense, while simultaneously
handling fewer cases than code that already exists and is implemented.
- Old INTRNG and PPC code stores unparsed and/or parsed interrupt
data in
INTRNG and each consumer must query for them. This data sharing
also causes
significant locking issues. New INTRNG stores interrupt
configuration data into
given resource, so each relevant bus method can access it
immediately.
Is this step backward?
Which locking issues? And yes, it is.
- New INTRNG is not OFW centric, it can works with virtually
unlimited number
of configuration data sources. Is this step backward?
Also yes, because it makes the interrupt handles less opaque, which
makes the infrastructure less flexible.
- New INTRNG correctly uses standard system infrastructure. Real
IRQ number
is reserved in rman within bus_alloc_resource() call, interrupt
HW is
configured (only!) within bus_setup_intr() call. Is this step
backward?
The "real" IRQ number is not well defined always, so requiring that
is a step backwards, yes.
- New INTRNG completely eliminates huge and not always working virtual
IRQ concept.
When does it "not always work"? It seems to, in fact, always work
on multiple platforms and have for a long time in the face of all
kinds of totally bizarre topologies and system architectures.
Don’t take me bad, I’m open to any change. But no, at this time,
I’m not ready to completely revert someone else's work – although
I am a co-author.
I would urge, in the strongest possible terms, that this be backed
out from stable/11 at least. We can add the new API back for 11.1
if we want it, but we totally lose the ability to change it later
in the stable/11 cycle if it stays in now.
-Nathan
The API is part of still unstable, experimental INTRNG, so its not
fixed we we can change it at any time, I think.
But yes, we forget to wrap new bus_map_intr() method (and
associated code) by #ifdef INTRNG. Is this sufficient for you?
For HEAD, yes. I would like it out of stable/11 entirely until this
discussion converges.
-Nathan
The current code (in stable/11) works and is tested. I simple cannot
commit any untested change for stable tree mainly if is in BETA2
stage. And I cannot fully test the requested change, at this time i
have access to single ARM platform.
But we're in the same situation - both have the same commit bit,
neither one of us is the author of the disputed commit and neither of
us are not able to fully test it.
So feel free to commit what you want, if you have courage to commit
untested code. I haven't it...\
Michal
Well, the code isn't actually used anywhere in stable/11, or HEAD, so it
can't possibly be either tested or important for the functionality of
any current code. As such, I will revert from HEAD on Monday and request
an MFC from re@ following that if the author has not appeared by that time.
-Nathan
_______________________________________________
svn-src-head@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"