> From: Michael S. Tsirkin <[email protected]>
> Sent: Monday, April 3, 2023 2:02 PM

> > Because vqs involve DMA operations.
> > It is left to the device implementation to do it, but a generic wisdom
> > is not implement such slow work in the data path engines.
> > So such register access vqs can/may be through firmware.
> > Hence it can involve a lot higher latency.
> 
> Then that wisdom is wrong? tens of microseconds is not workable even for
> ethtool operations, you are killing boot time.
> 
Huh.
What ethtool latencies have you experienced? Do you have numbers?

> I frankly don't know, if device vendors are going to interpret "DMA" as "can
> take insane time" then maybe we need to scrap the whole admin vq idea and
> make it all memory mapped like Jason wanted, so as not to lead them into
> temptation?

DMA happens for all types of devices for control and data path.
Can you point to any existing industry specification and real implementation
that states such timing requirements?
This will be useful to understand where these requirements come from.

Multiple device implementors do not see memory mapped registers as the way forward.
This has been discussed many times.
There is no point in going down that dead end.

> Let me try again.
> 
> Modern host binds to modern interface. It can use the PF normally.
> Legacy guest IOBAR accesses to VF are translated to transport vq accesses.
> 
I understand this part.
Transport VQ is on the PF, right? (Nothing but AQ, right?)

It can work in the VF case, with a trade-off compared to memory mapped registers.
However, a lightweight hypervisor that wants to use this for a transitional PF
too cannot benefit from it.
So providing both options is useful.

Again, I want to emphasize that register read/write over tvq has merits and
trade-offs.
And the mmr approach has merits and trade-offs too.

Better to list them and proceed forward.

Method-1: VF's register read/write via PF based transport VQ
Pros:
a. Lightweight register implementation in the device for the new memory region window

Cons:
a. Higher DMA read/write latency
b. Device requires synchronization between non-legacy memory mapped registers
and legacy register accesses via tvq
c. Can only work with the VF. Cannot work for a thin hypervisor, which can map
a transitional PF to a bare metal OS
(also listed in cover letter)
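
To make Method-1 concrete, here is a rough sketch of how a hypervisor could
turn a trapped legacy IOBAR access on a VF into a command for the PF's
transport VQ. The command layout, the field names and the function name are
all hypothetical, made up for illustration; they are not from the virtio spec
or from this proposal. Queuing the command and waiting for the DMA completion
(the latency cost in Con a.) is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical TVQ command describing one legacy register access. */
struct tvq_legacy_reg_cmd {
    uint16_t vf_id;    /* VF whose legacy register is being accessed */
    uint8_t  is_write; /* 1 = write, 0 = read */
    uint8_t  size;     /* access width in bytes: 1, 2 or 4 */
    uint32_t offset;   /* offset within the legacy register layout */
    uint32_t value;    /* data for a write; left as 0 for a read */
};

/*
 * Fill in a command for a trapped guest access.
 * Returns 0 on success, -1 if the access width is not 1, 2 or 4 bytes.
 */
static int tvq_build_legacy_cmd(struct tvq_legacy_reg_cmd *cmd,
                                uint16_t vf_id, uint32_t offset,
                                uint8_t size, int is_write, uint32_t value)
{
    uint32_t mask;

    assert(cmd != NULL);
    if (size != 1 && size != 2 && size != 4)
        return -1;

    /* Keep only the bytes that the guest actually accessed. */
    mask = (size == 4) ? 0xffffffffu : ((1u << (8u * size)) - 1u);

    memset(cmd, 0, sizeof(*cmd));
    cmd->vf_id = vf_id;
    cmd->is_write = is_write ? 1 : 0;
    cmd->size = size;
    cmd->offset = offset;
    cmd->value = is_write ? (value & mask) : 0;
    return 0;
}
```

Each guest register access then becomes one such command on the PF's queue,
which is where both the extra latency and the synchronization with the
non-legacy registers come from.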

Method-2: VF's register read/write via MMR (current proposal)
Pros:
a. Device utilizes the same legacy and non-legacy registers.
b. An order of magnitude lower latency due to avoidance of DMA on register
accesses
(Important but not critical)
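
For comparison, a rough sketch of what the Method-2 path could look like on
the hypervisor side: the trapped legacy access is forwarded as a plain
load/store into a memory mapped window. The window structure and function
names are hypothetical, and the natural-alignment check is my assumption, not
something the proposal requires. Endianness handling is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: mapped memory region exposing the legacy registers. */
struct legacy_mmr_window {
    volatile uint8_t *base; /* start of the mapped region */
    size_t len;             /* length of the region in bytes */
};

/* Validate width, alignment (assumed) and bounds of one access. */
static int legacy_mmr_check(const struct legacy_mmr_window *w,
                            uint32_t offset, uint8_t size)
{
    assert(w != NULL && w->base != NULL);
    if (size != 1 && size != 2 && size != 4)
        return -1;
    if (offset % size != 0)
        return -1;
    if (offset >= w->len || size > w->len - offset)
        return -1;
    return 0;
}

/* Forward a guest write as a direct store. Returns 0 or -1. */
static int legacy_mmr_write(const struct legacy_mmr_window *w,
                            uint32_t offset, uint8_t size, uint32_t value)
{
    if (legacy_mmr_check(w, offset, size) != 0)
        return -1;
    switch (size) {
    case 1: *(volatile uint8_t *)(w->base + offset) = (uint8_t)value; break;
    case 2: *(volatile uint16_t *)(w->base + offset) = (uint16_t)value; break;
    default: *(volatile uint32_t *)(w->base + offset) = value; break;
    }
    return 0;
}

/* Forward a guest read as a direct load. Returns 0 or -1. */
static int legacy_mmr_read(const struct legacy_mmr_window *w,
                           uint32_t offset, uint8_t size, uint32_t *value)
{
    assert(value != NULL);
    if (legacy_mmr_check(w, offset, size) != 0)
        return -1;
    switch (size) {
    case 1: *value = *(volatile uint8_t *)(w->base + offset); break;
    case 2: *value = *(volatile uint16_t *)(w->base + offset); break;
    default: *value = *(volatile uint32_t *)(w->base + offset); break;
    }
    return 0;
}
```

There is no command to build and no completion to wait for here, which is the
source of the latency difference listed in Pro b.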

> > No. Interrupt latency is in usec range.
> > The major latency contributors in msec range can arise from the device side.
> 
> So you are saying there are devices out there already with this MMR hack
> baked in, and in hardware not firmware, so it works reasonably?
It is better not to call a solution a "hack" when you are still trying to
understand the trade-offs of multiple solutions and have yet to fully
review all the requirements.
(and when the solution is also based on offline feedback from you!)

No. I didn't say that such a device is out there.
However, a large part of the proposed changes is based on real devices (and
not limited to virtio).

Regarding tvq, I have some ideas on how to improve the register reads/writes so
that they are optimal for devices to implement.
