Hi,
I am working on implementing a virtio-snd vdev. During my work on different
aspects of the implementation I stumbled across the following issues with the
virtio sound device specification (that is part of the virtio 1.3 draft). In
some cases additional clarifications might be enough to address the issue, in
others additional features might be required. Below is a list of the issues,
with some suggestions on how the issues could be addressed.
PCM playback and capture
Unclear semantics of PCM_STOP/PCM_START operations
The virtio-snd specification is not clear on the expectations on how the
PCM_STOP control request is to be handled by the vdev, in terms of whether to:
* drop the remaining buffered audio in the host pcm stream and return
pending io messages to the guest driver
* drop the remaining buffered audio in the host pcm stream and keep pending
io messages in order to be able to prime the host pcm stream in case PCM_STOP
is followed by PCM_START; return pending io messages only when receiving
PCM_RELEASE
* drain the remaining buffered audio in the host pcm stream and send a
PCM_STOP response upon drain completion, return pending io messages to the
guest driver as they are completed (before PCM_STOP response); drain capability
might or might not be available for the host pcm stream
* start draining the remaining buffered audio in the host pcm stream and
send a PCM_STOP response immediately, return pending io messages to the guest
driver as they are completed (after PCM_STOP response); if receiving
PCM_RELEASE before drain completion hold off the response until all pending io
messages are completed and returned to the guest driver; drain capability might
or might not be available for the host pcm stream
* pause the host pcm stream and maintain buffered audio / pending io
messages as is; pause capability might or might not be available for the host
pcm stream
If the thinking is that the vdev on the host can decide on how to implement
PCM_STOP based on the host stream capabilities, the guest driver has no
information about the choice made by the vdev. In our opinion the choice made
by the vdev in implementing PCM_STOP dictates what the possible transitions
after PCM_STOP can be (if PCM_STOP is implemented as pause, PCM_START and
PCM_RELEASE are possible transitions, but if PCM_STOP is implemented as drain,
the only valid transition is PCM_RELEASE and if PCM_STOP is implemented as drop
and all pending io messages have been returned to the guest driver, again the
only valid transition seems to be PCM_RELEASE), but the guest driver is the one
issuing the next request and it might at the best be able to make an educated
guess how the vdev implemented PCM_STOP based on whether all the pending io
messages are returned or not to the guest driver. To make such a guess
possible, more clarifications would be required in the spec to identify which
of the scenarios listed above are considered by the spec.
Note 1: The reference virtio-snd linux/android guest driver explicitly doesn't
support the ALSA SNDRV_PCM_INFO_DRAIN_TRIGGER feature flag, which might
indicate there was no intent to support drain semantics.
Note 2: The reference virtio-snd linux/android guest driver explicitly does
support the ALSA SNDRV_PCM_INFO_PAUSE feature flag, which seems to indicate
intent to support pause semantics, along with drop semantics.
Note 3: The Android Audio HAL and tinyalsa/tinyplay don't have any notion of
drain or pause. At the end of a playback session there is a wait for previously
buffered data to complete playing using the tinyalsa pcm_wait() function, and
then a PCM_STOP request is issued after the buffered audio has completed
playing. Drop semantics are assumed for PCM_STOP in this context.
Information on period alignment and max audio buffer size
No information about period alignment and min/max audio buffer/period size is
currently available in struct virtio_snd_pcm_info. Ideally the period size used
for io messages should match the period size used on the host. That will
guarantee consistency in timing, precise accounting of available room to write
more data or available data to read, across host and guest. A workaround is
possible, where the device will fail a PCM_SET_PARAMETERS request that
specifies an unsupported period alignment or audio buffer/period size and the
guest is forced to use a period and buffer size that is known to be supported
by the host device (this can be done in an Android guest, by hard coding the
android audio HAL period size and the number of periods and thus effectively
the audio buffer size that is used for the virtio sound devices in the guest;
in android and linux guests it can be done also by setting the virtio_snd
module parameters, to the effect of locking down the period and buffer size).
General Issues
* There is no statement regarding the endianness of PCM data exchanged via
txq and rxq queues and the virtio sound PCM format definitions do not include
endianness.
* There is currently no requirement for the guest driver to enqueue an io
message only after it has completed writing data to it (playback) or reading
data from it (capture). The linux/android guest driver currently enqueues an io
message for playback immediately after calling snd_pcm_period_elapsed().
snd_pcm_period_elapsed() makes it possible for the guest client app to write
new data, but upon the return of this function there is no guarantee that the
guest client app has already written new data. If the vdev dequeues the io
message immediately and copies the PCM data for that io request before the
guest client app has actually written new data to it, it results in
out-of-sequence/stale data being played back.
* There is no statement how the guest driver should react to a control
request failing.
* For PCM control requests the spec details what the "possible
transitions" are after each control request. It doesn't specify whether this
applies only to successful control requests, or irrespective of the success of
the control requests. For instance for PCM_START only PCM_STOP is listed as a
valid transition. What happens if PCM_START fails though? Is PCM_STOP still a
valid transition? Is a repeat of PCM_START valid, since the previous PCM_START
failed? Is a PCM_RELEASE required in order for the guest driver to recover the
io request descriptors that were sent to the vdev before PCM_START? Is a
PCM_STOP required in this case, in order to be able to send PCM_RELEASE
(because of valid sequences considerations)? It was seen that the reference
linux/android guest driver ignores a PCM_START failure and still sends io
requests after the failed PCM_START.
* Should a failed control request be retried with a maximum retry count?
Here the danger is that retrying forever without any wait in between retries
might result in a high CPU load. Should there be a dedicated status value that
tells the guest not to retry a failed request (or to use a error handling
sequence involving RELEASE and PREPARE)?
* There is no statement how the guest driver should react to an io message
failing
* Should it stop sending io messages and just send PCM_STOP? Or retry
sending io requests hoping that they will succeed? Should there be a maximum
retry count? Here the danger is that retrying forever without any wait in
between might result in a high CPU load.
* The spec seems to indicate that PCM_PREPARE after PCM_STOP or
PCM_SET_PARAMS after PCM_STOP would be invalid, still the linux/android guest
driver are sending control requests in these sequences (seen when playing back
from android UI and PCM_START handling is modified to fail for the sake of
testing the behavior). Is the spec out of date? Is the reference linux/guest
android driver not compliant to the spec? Are these sequences made possible by
the fact that the PCM_START previous to PCM_STOP actually failed? If these are
valid sequences, the virtio spec should be updated to show them as valid.
* The virtio spec mentions that for playback, between PCM_PREPARE and
PCM_START the guest driver should send io messages to prime the host playback
stream. There is no statement what shall be done wrt io messages, in the
following situations:
* between PCM_STOP and PCM_START: this goes back to the semantics of
PCM_STOP: if PCM_STOP had drop semantics, all outstanding io messages could be
returned to the guest; if PCM_STOP had pause semantics, outstanding io messages
could be kept on the device side and returned to the guest only upon the
receipt of a PCM_RELEASE request from the guest; if PCM_STOP had drain
semantics, the outstanding io messages could be returned to the guest as they
are completed
* after a failed PCM_START: since PCM_START failed, all io messages sent
by the guest after PCM_PREPARE could be returned to the guest; or this could be
deferred to a PCM_STOP or PCM_RELEASE request
* There is no way for the vdev to notify the guest driver that there is an
error condition that requires a new PCM_PREPARE in order to clear. The vdev
could send the VIRTIO_SND_EVT_PCM_XRUN, which requires PCM_PREPARE to clear,
but this would be abusing the spec, as VIRTIO_SND_EVT_PCM_XRUN is meant
exclusively for XRUN errors.
* There is no way for the vdev to notify the guest driver that a stream is
disconnected (e.g. host audio device has been removed - think USB audio device
- or reinserted).
Recommendations:
* Add explicit statement about endianness of PCM data exchanged via txq and
rxq queues.
* Add explicit requirement for the guest driver to enqueue an io message
only after it has completed writing data to it (playback) or reading data from
it (capture)
* Add explicit requirement for control and io message failures to be
handled on the guest driver side. It would be expected that if PCM_START fails,
for instance, the guest driver doesn't send any further io messages. For io
messages, at the minimum the guest driver should use a count of successive
failures and stop retrying. If the handling for a failed control request is
retrying the same request, there should be a limited number of retries.
Specific to PCM_START failure, it should be specified whether PCM_STOP and
maybe PCM_RELEASE and PCM_PREPARE is required before attempting any PCM_START
retry. The behavior could be specified as dependent on whether the vdev has
returned the pending io messages when handling PCM_STOP (if all pending io
messages returned at PCM_STOP, perform PCM_RELEASE and PCM_PREPARE before
retrying).
* Add clarification for handling of oustanding io messages on the device
side upon a failed PCM_START control request, or upon a PCM_STOP control request
* Update valid transitions for each PCM control request, if specification
out of date (see above for control request sequences seen with the reference
linux/android guest driver, that are currently not specified as valid in the
virtio sound specification)
* Add explicit statement that the documented valid transitions for PCM
control requests are for successful control requests and in the case of failed
requests, the valid transitions are same as before the failed control request
(if this statement is correct, if not clarify what the expectation is)
* It would be beneficial if the virtio-snd specification did not leave the
interpretation of PCM_STOP semantics to the vdev (drop vs pause vs drain),
without a clear contract between vdev and guest driver
If the spec intends to allow the host to use both pause/resume and drop/go (and
maybe even drain/go) semantics for the START/STOP operations, then for clarity
it should use a separate request (PCM_PAUSE) for pause and if drain semantics
are to be included as well, then a separate request (PCM_DRAIN) for drain as
well. PCM_START could be used to restart/resume in all cases. Ideally there
should be the following optional stream features:
* VIRTIO_SND_PCM_F_PAUSE - if offered by the vdev it indicates support
for pause-resume both in the vdev and the host pcm stream; new request
VIRTIO_SND_R_PCM_PAUSE with valid transitions START or RELEASE; START would
have a valid transition to PAUSE;
* VIRTIO_SND_PCM_F_DRAIN - if offered by the vdev it indicates support
for drain both in the vdev and the host pcm stream, new request
VIRTIO_SND_R_PCM_DRAIN with valid transitions RELEASE; START would have a valid
transition to DRAIN;
* VIRTIO_SND_R_PCM_STOP would retain drop semantics and have possible
transitions RELEASE;
* A failed VIRTIO_SND_R_PCM_START would have the only possible
transition RELEASE;
* Pending io messages would only need to be returned to the guest driver
when handling VIRTIO_SND_R_PCM_RELEASE
* For priming the host pcm stream, io messages would only need to be
sent by the guest driver between VIRTIO_SND_R_PCM_PREPARE and
VIRTIO_SND_R_PCM_START
* The above approach would highly simplify the spec in terms of
documenting possible transitions and when to send io messages to prime the host
pcm stream and when to return pending io messages to the guest driver, as
opposed to overloading the use of PCM_STOP for drop/pause(/drain), when the
behavior wrt io messages would highly depend on the host interpretation of
PCM_STOP.
* It would be beneficial to add two new PCM notifications, via the
following stream features:
* VIRTIO_SND_PCM_F_EVT_ERR - indicates support for new notification
VIRTIO_SND_EVT_PCM_ERROR, that signals to the guest driver that en error
occurred (other than structly underrun/overrun) while handling io messages,
requiring a sequence of PCM_RELEASE, PCM_PREPARE, PCM_START to recover
* VIRTIO_SND_PCM_F_EVT_CONN_STATE - indicates support for new
notifications VIRTIO_SND_EVT_PCM_DISCONNECTED and VIRTIO_SND_EVT_PCM_CONNECTED,
to indicate to the guest driver pcm stream disconnected (e.g. USB device
removed) and pcm stream connected (e.g. USB device re-inserted)
* It would be beneficial to include Information on period alignment and max
buffer size in struct virtio_snd_pcm_info, available via optional feature flag,
e.g. VIRTIO_SND_PCM_F_PERIOD_INFO and the additional info would be added at the
end of struct struct virtio_snd_pcm_info, replacing the padding bytes and
adding to the length of the struct - new le32 fields period_align_bytes,
min_period_bytes, max_period_bytes, max_buffer_bytes, where min_period_bytes,
max_period_bytes and max_buffer_bytes are all multiples of period_align_bytes.
With this addition, the guest driver would have the information required to
match the period/buffer size capabilities of the host pcm stream and advertise
them to clients in the guest space, without the need to hard code these using
the guest virtio_snd module parameters.
* Alternatively a new control request to query the refined PCM parameters
could be added (e.g. PCM_GET_PARAMS, via PCM feature flag
VIRTIO_SND_PCM_F_GET_PARAMS), to be used by the guest driver after it has
successfully sent PCM_SET_PARAMS, in order to query the PCM parameters
effectively used by the device.
Audio Controls
The original virtio-snd device specification as added to virtio 1.1 did not
include audio control support, however the virtio 1.3 draft includes this.
Audio Control Roles
The roles describe standard uses for controls, like volume, mute and gain.
There is also an undefined role definition that according to the spec must not
be used.
* VIRTIO_SND_CTL_ROLE_UNDEFINED
* VIRTIO_SND_CTL_ROLE_VOLUME
* VIRTIO_SND_CTL_ROLE_MUTE
* VIRTIO_SND_CTL_ROLE_GAIN
Problems identified with the role definitions:
* Not all audio controls can be described as volume, mute or gain. The
undefined role must be allowed for any control that is not for volume, mute or
gain.
* What is the difference between VIRTIO_SND_CTL_ROLE_VOLUME and
VIRTIO_SND_CTL_ROLE_GAIN, is the first supposed to be used for playback and the
second for capture? Or is the first supposed to be used for digital volume and
the second for analog gain? Is a distinction between volume and gain necessary?
* If certain data types are assumed for controls with role
VIRTIO_SND_CTL_ROLE_VOLUME, VIRTIO_SND_CTL_ROLE_MUTE or
VIRTIO_SND_CTL_ROLE_GAIN, this should be explicitly specified (e.g. BOOLEAN for
VIRTIO_SND_CTL_ROLE_MUTE, INTEGER for VIRTIO_SND_CTL_ROLE_VOLUME and
VIRTIO_SND_CTL_ROLE_GAIN?).
Mute control value meaning
The virtio spec is not clear whether a value of 1 for an audio control with
role VIRTIO_SND_CTL_ROLE_MUTE means muted, or rather switched on (unmuted).
Testing end-to-end with the reference virtio-snd linux/android guest driver it
was inferred that 1 means switched on, i.e. unmuted. In our opinion this should
be clarified within the spec.
TLV support
The TLV struct and possible type values are considered well known from ALSA.
There are only a few common/standard TLV type definitions (for specific meta
data types, like db scale, db min/max plus a handful more or the container TLV
type that allows packing multiple TLVs into one containing TLV); they should be
defined in virtio-snd.h for the completeness of the specification. Otherwise a
non ALSA based host that supports reporting metadata would need to redefine
these to match the ALSA definitions
The audio controls extension proposal draft includes the definition of struct
virtio_snd_ctl_tlv. It is removed however in the virtio_snd.h header file of
the linux/android virtio-snd guest driver implementation for this extension -
the guest driver just performs memcpy to/from the equivalent ALSA struct. For
the completeness of the specification all data structures used for data
exchanges within the virtio spec should be part of the public header file. This
is particularly important when the host or guest don't use ALSA and don't have
access to the ALSA TLV struct definition.
Other than for metadata of controls for volume/gain, in ALSA audio controls can
use TLVs for representing the actual control data (such controls use type
VIRTIO_SND_CTL_TYPE_BYTES and access VIRTIO_SND_CTL_ACCESS_TLV_READ,
VIRTIO_SND_CTL_ACCESS_TLV_WRITE or a combination of
VIRTIO_SND_CTL_ACCESS_TLV_READ and VIRTIO_SND_CTL_ACCESS_TLV_WRITE). Since this
use seems to be supported by the reference linux/android virtio-snd guest
driver, it should be documented in the spec.
virtio_snd_ctl_info struct alignment
Field "name" of the virtio_snd_ctl_info struct has a length of 44 bytes. With
this field length, the alignment of the following value union field with __le64
data members depends on packing and/or alignment compiler directives that can
differ between host and guest and thus result in a different alignment of
"value". Klocwork detects this issue as a PORTING.STORAGE.STRUCT guideline
violation for this struct, which can be fixed by modifying the length of field
"name" to 48, or by adding a padding field of length 4 bytes between the "name"
and "value" fields (e.g. __u8 padding[4]).
Looking forward to clarifications and/or responses.
Radu Ocica
----------------------------------------------------------------------
This transmission (including any attachments) may contain confidential
information, privileged material (including material protected by the
solicitor-client or other applicable privileges), or constitute non-public
information. Any use of this information by anyone other than the intended
recipient is prohibited. If you have received this transmission in error,
please immediately reply to the sender and delete this information from your
system. Use, dissemination, distribution, or reproduction of this transmission
by unintended recipients is not authorized and may be unlawful.