Also, I'm a bit confused. I've looked at several VirtIO drivers now and have seen things like this:

  #define VIRTIO_PCI_DEVICE_SIGNATURE SIGNATURE_32 ('V', 'P', 'C', 'I')
  // ...
  UINT32 Signature;

I can't seem to find this anywhere in the VirtIO specification. The spec says nothing about signature values; it only defines the magic value 0x74726976. So where does this come from?
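For reference, here is my reading of the pattern as a minimal sketch (the structure name is hypothetical; the comments mark what looks like edk2 convention versus what the spec actually defines):

  #include <Uefi.h>

  //
  // edk2 coding convention (NOT from the VirtIO spec): pack four ASCII bytes
  // into a UINT32 and stamp it into the driver's own private context
  // structure. SIGNATURE_32 ('V','P','C','I') evaluates to 0x49435056.
  //
  #define VIRTIO_PCI_DEVICE_SIGNATURE  SIGNATURE_32 ('V', 'P', 'C', 'I')

  typedef struct {
    UINT32  Signature;   // Set at allocation; checked later (e.g. via the
                         // CR() macro) to validate casts from a produced
                         // protocol pointer back to this container struct.
    //
    // ... device state, produced protocol members, etc. ...
    //
  } VIRTIO_PCI_DEVICE_EXAMPLE;   // hypothetical name, for illustration

  //
  // By contrast, the spec's 0x74726976 ("virt" in little-endian) is a value
  // the device itself exposes (the virtio-mmio MagicValue register); it
  // never appears in this in-memory structure.
  //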
On 4/15/21, Ethin Probst via groups.io <harlydavidsen=gmail....@groups.io> wrote:
> Hi Andrew,
>
> What would that protocol interface look like if we utilized your idea?
> With mine (though I need to add channel mapping as well), your
> workflow for playing a stereo sound from left to right would probably
> be something like this:
> 1) Encode the sound using a standard tool into a Wave PCM 16.
> 2) Place the Wave file in the Firmware Volume using a given UUID as
>    the name. As simple as editing the platform FDF file.
> 3) Write some BDS code:
>    a) Look up the Wave file by UUID and read it into memory.
>    b) Decode the audio file (audio devices will not do this decoding
>       for you; you have to do it yourself).
>    c) Call EFI_AUDIO_PROTOCOL.LoadBuffer(), passing in the sample rate
>       of your audio, EFI_AUDIO_PROTOCOL_SAMPLE_FORMAT_S16 for signed
>       16-bit PCM audio, the channel mapping, the number of samples,
>       and the samples themselves.
>    d) Call EFI_BOOT_SERVICES.CreateEvent()/CreateEventEx() for a
>       playback event to signal.
>    e) Call EFI_AUDIO_PROTOCOL.StartPlayback(), passing in the event
>       you just created.
> The reason LoadBuffer() takes so many parameters is that the device
> knows nothing about the audio you're passing in. Given only an array
> of 16-bit audio samples, it's impossible to determine the parameters
> (sample rate, sample format, channel mapping, etc.) from that alone.
> Using your idea, though, my protocol could be greatly simplified.
> Forcing a particular channel mapping, sample rate, and sample format
> on everyone would complicate application code. From an application
> point of view, one would, with that type of protocol, need to do the
> following:
> 1) Load an audio file in any audio file format from any storage
>    mechanism.
> 2) Decode the audio file format to extract the samples and audio
>    metadata.
> 3) Resample the (now decoded) audio samples and convert (quantize)
>    them into signed 16-bit PCM audio.
> 4) Forward the samples to the EFI audio protocol.
> There is another option. (I'm happy we're discussing this now -- we
> can hammer out all the details now, which will make a lot of things
> easier.) Since I'll most likely end up splitting the device-specific
> interfaces into different audio protocols, we could make a simple
> audio protocol that makes various assumptions about the audio samples
> you're giving it (e.g. sample rate, format, ...). This would just
> allow audio output and input in signed 16-bit PCM audio, as you've
> suggested, and would be a simple and easy-to-use interface. Something
> like:
>
>   typedef struct EFI_SIMPLE_AUDIO_PROTOCOL {
>     EFI_SIMPLE_AUDIO_PROTOCOL_RESET  Reset;
>     EFI_SIMPLE_AUDIO_PROTOCOL_START  Start;
>     EFI_SIMPLE_AUDIO_PROTOCOL_STOP   Stop;
>   } EFI_SIMPLE_AUDIO_PROTOCOL;
>
> This way, users and driver developers have a simple audio protocol
> they can implement if they like. It would assume signed 16-bit PCM
> audio and a mono channel mapping at 44100 Hz. Then we can have an
> advanced protocol for each device type (HDA, USB, VirtIO, ...) that
> exposes all the knobs for sample formats, sample rates, that kind of
> thing. Obviously, like the majority of (if not all) UEFI protocols,
> these advanced protocols would be optional. I think, however, that
> the simple audio protocol should be a required protocol in all UEFI
> implementations. But that might not be possible. So would this
> simpler interface work as a starting point?
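To make the quoted proposal concrete, a minimal sketch of what the member typedefs might look like. None of this exists yet; the parameter lists (sample buffer, count, completion event) are my assumptions following from the fixed S16/mono/44100 Hz format described above:

  typedef struct _EFI_SIMPLE_AUDIO_PROTOCOL  EFI_SIMPLE_AUDIO_PROTOCOL;

  typedef
  EFI_STATUS
  (EFIAPI *EFI_SIMPLE_AUDIO_PROTOCOL_RESET)(
    IN EFI_SIMPLE_AUDIO_PROTOCOL  *This
    );

  typedef
  EFI_STATUS
  (EFIAPI *EFI_SIMPLE_AUDIO_PROTOCOL_START)(
    IN EFI_SIMPLE_AUDIO_PROTOCOL  *This,
    IN CONST INT16                *Samples,      // signed 16-bit PCM, mono,
                                                 // 44100 Hz (assumed format)
    IN UINTN                      SampleCount,
    IN EFI_EVENT                  Event OPTIONAL // signaled when playback
                                                 // completes
    );

  typedef
  EFI_STATUS
  (EFIAPI *EFI_SIMPLE_AUDIO_PROTOCOL_STOP)(
    IN EFI_SIMPLE_AUDIO_PROTOCOL  *This
    );

  struct _EFI_SIMPLE_AUDIO_PROTOCOL {
    EFI_SIMPLE_AUDIO_PROTOCOL_RESET  Reset;
    EFI_SIMPLE_AUDIO_PROTOCOL_START  Start;
    EFI_SIMPLE_AUDIO_PROTOCOL_STOP   Stop;
  };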
>
> On 4/15/21, Andrew Fish <af...@apple.com> wrote:
>>
>>> On Apr 15, 2021, at 1:11 PM, Ethin Probst <harlydavid...@gmail.com>
>>> wrote:
>>>
>>>> Is there any necessity for audio input and output to be implemented
>>>> within the same protocol? Unlike a network device (which is
>>>> intrinsically bidirectional), it seems natural to conceptually
>>>> separate audio input from audio output.
>>>
>>> Nope, there isn't a necessity to make them one; they can be
>>> separated into two.
>>>
>>>> The code controlling volume/mute may not have any access to the
>>>> sample buffer. The most natural implementation would seem to allow
>>>> for a platform to notice volume up/down keypresses and use those to
>>>> control the overall system volume, without any knowledge of which
>>>> samples (if any) are currently being played by other code in the
>>>> system.
>>>
>>> You're assuming that the audio device you're implementing the
>>> volume/muting for has volume control and muting functionality within
>>> itself, then.
>>
>> Not really. We are assuming that audio hardware has a better
>> understanding of how that system implements volume than some generic
>> EFI code that is by definition platform agnostic.
>>
>>> This may not be the case, and so we'd need to effectively simulate
>>> it within the driver, which isn't too hard to do. As an example, the
>>> VirtIO driver does not have a request type for muting or for volume
>>> control (this would, most likely, be within the
>>> VIRTIO_SND_R_PCM_SET_PARAMS request; see sec. 5.14.6.4.3).
>>> Therefore, the driver would have to either simulate the request or
>>> return EFI_UNSUPPORTED in this instance.
>>
>> So this is an example of the above, since the audio hardware knows it
>> is routing the sound output into another subsystem, and that
>> subsystem controls the volume. So the VirtIo Sound Driver knows best
>> how to abstract volume/mute for this platform.
>>
>>>> Consider also the point of view of the developer implementing a
>>>> driver for some other piece of audio hardware that happens not to
>>>> support precisely the same sample rates etc. as VirtIO. It would be
>>>> extremely ugly to force all future hardware to pretend to have the
>>>> same capabilities as VirtIO just because the API was initially
>>>> designed with VirtIO in mind.
>>>
>>> Precisely, but the brilliance of VirtIO
>>
>> The brilliance of VirtIO is that it just needs to implement a generic
>> device driver for a given operating system. In most cases these
>> operating systems have sound subsystems that manage sound and want
>> fine granularity of control over what is going on. So the drivers are
>> implemented to maximize flexibility, since the OS has lots of generic
>> code that deals with sound, and even user-configurable knobs to
>> control audio. In our case that extra layer does not exist in EFI,
>> and the end-user code just wants to tell the driver to do some simple
>> things.
>>
>> Maybe it is easier to think about with an example. Let's say I want
>> to play a cool sound on every boot. What would be the workflow to
>> make that happen?
>> 1) Encode the sound using a standard tool into a Wave PCM 16.
>> 2) Place the Wave file in the Firmware Volume using a given UUID as
>>    the name. As simple as editing the platform FDF file.
>> 3) Write some BDS code:
>>    a) Look up the Wave file by UUID and read it into memory.
>>    b) Point the EFI Sound Protocol at the buffer with the Wave file.
>>    c) Tell the EFI Sound Protocol to play the sound.
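A rough sketch of what that BDS code might look like. EFI_SOUND_PROTOCOL, its Play() member, and both GUIDs are hypothetical (the interface is exactly what this thread is trying to design); GetSectionFromAnyFv() is the existing DxeServicesLib helper for step a:

  #include <PiDxe.h>
  #include <Library/DxeServicesLib.h>
  #include <Library/MemoryAllocationLib.h>
  #include <Library/UefiBootServicesTableLib.h>

  //
  // Hypothetical protocol shape, for illustration only.
  //
  typedef struct _EFI_SOUND_PROTOCOL EFI_SOUND_PROTOCOL;
  struct _EFI_SOUND_PROTOCOL {
    EFI_STATUS (EFIAPI *Play)(EFI_SOUND_PROTOCOL *This,
                              VOID *Buffer, UINTN Size);
  };

  extern EFI_GUID  gBootSoundWaveFileGuid;   // hypothetical FV file name GUID
  extern EFI_GUID  gEfiSoundProtocolGuid;    // hypothetical protocol GUID

  EFI_STATUS
  PlayBootSound (
    VOID
    )
  {
    EFI_STATUS          Status;
    VOID                *WaveFile;
    UINTN               WaveSize;
    EFI_SOUND_PROTOCOL  *Sound;

    //
    // 3a) Look up the Wave file by its GUID name in any firmware volume.
    //
    Status = GetSectionFromAnyFv (
               &gBootSoundWaveFileGuid,
               EFI_SECTION_RAW,
               0,
               &WaveFile,
               &WaveSize
               );
    if (EFI_ERROR (Status)) {
      return Status;
    }

    //
    // 3b) Point the sound protocol at the buffer; 3c) tell it to play.
    //
    Status = gBS->LocateProtocol (&gEfiSoundProtocolGuid, NULL,
               (VOID **)&Sound);
    if (!EFI_ERROR (Status)) {
      Status = Sound->Play (Sound, WaveFile, WaveSize);
    }

    FreePool (WaveFile);
    return Status;
  }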
>>
>> If you start adding in a lot of parameters, that workflow starts
>> getting really complicated really quickly.
>>
>>> is that the sample rate, sample format, &c., do not all have to be
>>> supported by a VirtIO device. Notice, also, how in my protocol
>>> proposal I noted that the sample rates, at least, were
>>> "recommended," not "required." Should a device not happen to support
>>> a sample rate or sample format, all it needs to do is return
>>> EFI_INVALID_PARAMETER. Section 5.14.6.2.1
>>> (VIRTIO_SND_R_JACK_GET_CONFIG) describes how a jack tells you what
>>> sample rates it supports, channel mappings, &c.
>>>
>>> I do understand how just using a predefined sample rate and sample
>>> format might be a good idea, and if that's the best way, then that's
>>> what we'll do. The protocol can always be revised at a later time if
>>> necessary. I apologize if my stance seems obstinate.
>>
>> I think if we add the version into the protocol and make sure we have
>> a separate load and play operation, we could add a member to set the
>> extra parameters if needed. There might also be some
>> platform-specific generic tunables that might be interesting for a
>> future member function.
>>
>> Thanks,
>>
>> Andrew Fish
>>
>>> Also, thank you, Laszlo, for your advice -- I hadn't considered that
>>> a network driver would be another good way of figuring out how async
>>> works in UEFI.
>>>
>>> On 4/15/21, Andrew Fish <af...@apple.com> wrote:
>>>>
>>>>> On Apr 15, 2021, at 5:07 AM, Michael Brown <mc...@ipxe.org> wrote:
>>>>>
>>>>> On 15/04/2021 06:28, Ethin Probst wrote:
>>>>>> - I hoped to add recording in case we in future want to add
>>>>>> accessibility aids like speech recognition (that was one of the
>>>>>> todo tasks on the EDK2 tasks list)
>>>>>
>>>>> Is there any necessity for audio input and output to be
>>>>> implemented within the same protocol? Unlike a network device
>>>>> (which is intrinsically bidirectional), it seems natural to
>>>>> conceptually separate audio input from audio output.
>>>>>
>>>>>> - Muting and volume control could easily be added by just
>>>>>> replacing the sample buffer with silence and by multiplying all
>>>>>> the samples by a percentage.
>>>>>
>>>>> The code controlling volume/mute may not have any access to the
>>>>> sample buffer. The most natural implementation would seem to allow
>>>>> for a platform to notice volume up/down keypresses and use those
>>>>> to control the overall system volume, without any knowledge of
>>>>> which samples (if any) are currently being played by other code in
>>>>> the system.
>>>>
>>>> I've also thought of adding an NVRAM variable that would let setup,
>>>> the UEFI Shell, or even the OS set the current volume and mute.
>>>> This "how it would be consumed" concept is why I proposed mute and
>>>> volume being separate APIs. A volume up/down API, in addition to a
>>>> fixed percentage, might be overkill, but it does allow a non-linear
>>>> mapping to the volume up/down keys. You would be surprised how
>>>> picky audiophiles can be, and it seems they like to file Bugzillas.
>>>>
>>>>>> - Finally, the reason I used enumerations for specifying
>>>>>> parameters like sample rate and stuff was that I was looking at
>>>>>> this protocol from a general UEFI application's point of view.
>>>>>> VirtIO supports all of the sample configurations listed in my
>>>>>> gist, and it seems reasonable to allow the application to control
>>>>>> those parameters instead of forcing a particular parameter
>>>>>> configuration onto the developer.
>>>>>
>>>>> Consider also the point of view of the developer implementing a
>>>>> driver for some other piece of audio hardware that happens not to
>>>>> support precisely the same sample rates etc. as VirtIO. It would
>>>>> be extremely ugly to force all future hardware to pretend to have
>>>>> the same capabilities as VirtIO just because the API was initially
>>>>> designed with VirtIO in mind.
>>>>>
>>>>> As a developer on the other side of the API, writing code to play
>>>>> sound files on an arbitrary unknown platform, I would prefer to
>>>>> simply consume as simple as possible an abstraction of an audio
>>>>> output protocol and not have to care about what hardware is
>>>>> actually implementing it.
>>>>
>>>> It may make sense to have an API to load the buffer/stream and
>>>> other APIs to play or pause. This could allow an optional API to
>>>> configure how the stream is played back. If we add a version to the
>>>> protocol, that would at least future-proof us.
>>>>
>>>> We did get feedback that it is very common to speed up the audio
>>>> playback rate for accessibility. I'm not sure if that is practical
>>>> with a simple PCM 16 wave file with the firmware audio
>>>> implementation. I guess that is something we could investigate.
>>>>
>>>> In terms of maybe adding text-to-speech, there is an open source
>>>> project that conceptually we could port to EFI. It would likely be
>>>> a binary that would have to live on the EFI System Partition due to
>>>> size. I was thinking that words per minute could be part of that
>>>> API, and it would produce a PCM 16 wave file that the audio
>>>> protocol we are discussing could play.
>>>>
>>>>> Both of these argue in favour of defining a very simple API that
>>>>> expresses only a common baseline capability that is plausibly
>>>>> implementable for every piece of audio hardware ever made.
>>>>>
>>>>> Coupled with the relatively minimalistic requirements for
>>>>> boot-time audio, I'd probably suggest supporting only a single
>>>>> format for audio data, with a fixed sample rate (and possibly only
>>>>> mono output).
>>>>
>>>> In my world, the folks that work for Jony asked for a stereo boot
>>>> bong that transitions from left to right :). "This is not the CODEC
>>>> you are looking for" was our response... I also did not mention
>>>> that some languages are right to left, as the only thing worse than
>>>> one complex thing is two complex things to implement.
>>>>
>>>>> As always: perfection is achieved, not when there is nothing more
>>>>> to add, but when there is nothing left to take away. :)
>>>>
>>>> "Simplicity is the ultimate sophistication"
>>>>
>>>> Thanks,
>>>>
>>>> Andrew Fish
>>>>
>>>>> Thanks,
>>>>>
>>>>> Michael
>>>
>>> --
>>> Signed,
>>> Ethin D. Probst
>
> --
> Signed,
> Ethin D. Probst

--
Signed,
Ethin D. Probst
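P.S. Going back to the software volume/mute idea quoted above (replacing the buffer with silence for mute, multiplying samples by a percentage): a minimal sketch of that approach for signed 16-bit PCM, with hypothetical names. A non-linear (perceptual) curve, as Andrew hints, could replace the linear scale:

  #include <Uefi.h>
  #include <Library/BaseMemoryLib.h>

  VOID
  ApplySoftwareVolume (
    IN OUT INT16    *Samples,       // signed 16-bit PCM, scaled in place
    IN     UINTN    SampleCount,
    IN     UINT8    VolumePercent,  // 0..100; 0 behaves as mute
    IN     BOOLEAN  Mute
    )
  {
    UINTN  Index;

    if (Mute || (VolumePercent == 0)) {
      //
      // Mute: replace the buffer contents with silence.
      //
      ZeroMem (Samples, SampleCount * sizeof (INT16));
      return;
    }

    for (Index = 0; Index < SampleCount; Index++) {
      //
      // Linear scale; the INT32 intermediate avoids 16-bit overflow.
      //
      Samples[Index] = (INT16)(((INT32)Samples[Index] * VolumePercent) / 100);
    }
  }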