On Fri, Oct 03, 2014 at 10:06:02PM +0200, Jonas Smedegaard wrote:
> Hi Ron,
>
> Thanks for clarifying...
>
> Quoting Ron (2014-10-03 18:28:34)
> > On Fri, Oct 03, 2014 at 12:48:18PM +0200, Jonas Smedegaard wrote:
> >> libvorbis has a tuning option called "application", with values
> >> "voip", "audio" and "low-delay".
> >
> > Do you actually have some use case where it's really important to
> > fiddle with that manually?
> >
> > These were manual overrides that were useful for testing in the early
> > life of the codec, but libopus 1.1 actually has a neural network that
> > analyses the signal in realtime to dynamically make the best selection
> > of tuning parameters. So most people should never need to specify
> > those, and are likely to get better results if they don't, since most
> > real world audio doesn't fall cleanly into categories like that as far
> > as the codec is concerned, even when a user might think the right
> > choice is "obvious" for what they are encoding.
>
> Seems it is a documented feature of libopus:
> http://opus-codec.org/docs/html_api-1.0.1/group__opus__encoder.html
>
> If discouraged and/or obsolete and/or even broken, I guess that should
> be documented (or at the very least silently removed, but I don't see
> why not mention such change).

Yes, it's still part of the library API, and it's not so much that it's
"broken" per-se as that it very rarely makes sense to override what
opusenc will already do by default unless you're really doing something
very special, and know all the consequences and constraints that come
with that.

The "voip" option still introduces a high pass filter and changes some
thresholds, but those are mostly only useful if you're really doing VoIP
and don't want full band audio, and opusenc isn't really a tool for VoIP,
since it encodes in Ogg not RTP. For "spoken word recordings" it's less
clear if that's really an advantage over the music/speech detection in
1.1 (since things like breath noise are usually already absent from the
recording).

The "low-delay" option can be useful, but again only for real-time
streaming and if you are using very small frames and a bitrate where CELT
will be better than SILK (since it will disable the use of SILK, and
small frames will also reduce the quality that is obtained at a given
bitrate). But opusenc will already automatically select this for you if
you specify a frame size < 10ms (which already precludes using SILK), and
if you aren't using frames that small, it's almost certainly not what you
want anyway.

A special purpose application using the library directly might have
considerations of its own about when to set these, but for users of
opusenc, the best choice is essentially already a function of the other
options which aren't "hidden".

> > But that said ...
> >
> >> opusenc lack ability to apply this tuning.
> >
> > I don't believe this is strictly true. You should still be able to
> > override that using the --set-ctl-int option (along with lots of other
> > arcane options, that really require you to be quite familiar with the
> > codec internals to use them in a way that does more good than harm to
> > the quality of encoding).
> >
> > Not exposing this control more directly was a deliberate choice for
> > the reasons above aiui.
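
As a rough illustration of the --set-ctl-int route mentioned above, the sketch below builds an opusenc command line that overrides the application mode. The numeric values are assumed from libopus' opus_defines.h (OPUS_SET_APPLICATION_REQUEST = 4000, OPUS_APPLICATION_VOIP = 2048, OPUS_APPLICATION_AUDIO = 2049, OPUS_APPLICATION_RESTRICTED_LOWDELAY = 2051) and should be checked against the installed headers. The helper name and the file names are hypothetical.

```python
import shlex

# Assumed from libopus' opus_defines.h; verify against your own headers.
OPUS_SET_APPLICATION_REQUEST = 4000
APPLICATION_VALUES = {
    "voip": 2048,      # OPUS_APPLICATION_VOIP
    "audio": 2049,     # OPUS_APPLICATION_AUDIO
    "lowdelay": 2051,  # OPUS_APPLICATION_RESTRICTED_LOWDELAY
}


def opusenc_command(application, infile, outfile):
    """Return an opusenc argv list that forces the given application mode.

    Hypothetical helper: it only assembles arguments, it does not run opusenc.
    """
    if application not in APPLICATION_VALUES:
        raise ValueError(f"unknown application mode: {application!r}")
    # --set-ctl-int takes "request=value", both as plain integers.
    ctl = f"{OPUS_SET_APPLICATION_REQUEST}={APPLICATION_VALUES[application]}"
    return ["opusenc", "--set-ctl-int", ctl, infile, outfile]


if __name__ == "__main__":
    # Print a shell-quoted command for a side-by-side comparison run.
    print(shlex.join(opusenc_command("voip", "in.wav", "out.opus")))
```

This is only meant for the tool-comparison case; for ordinary encoding the default (no override at all) is what the message above recommends.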
>
> Ahh - I did see that mysterious option in the man page.
>
> Are you saying it is deliberately kept mysterious?

Yes. Well not exactly "mysterious", since if you're familiar with the
libopus CTLs it is just a direct interface to let you set any of those to
whatever value you please - and there are more of them than just this one
which people doing special things might want to be able to control - but
I believe that Greg is a little gun-shy of exposing options that innocent
people will mostly only hurt themselves with.

Partly through prior experience with things exposing choices that were
easier for casual users to get wrong than right, and partly because there
was a fair bit of early confusion about when to use which of these
choices, with a lot of people guessing wrong (like thinking "low delay
must always be good" and "voip is any speech" when in reality what they
mostly really do is trade away audio quality for other more specialised
considerations).

The emphasis was quite deliberately on having opusenc default to creating
files of the highest quality for the sort of uses that opusenc is most
appropriate for, while still letting expert users do "expert things" if
and when they need to.

> >> The tuning option is accessible e.g. from libav.
> >
> > It's quite possible that libav should also no longer encourage people
> > to tweak at this too, though I'm not personally familiar with where
> > and how they allow this.
>
> I guess either clearly documenting the feature or clearly stating that
> it is discouraged and obsolete and irrelevant would help not only
> end-users like me but also library users like libav.

It's not really deprecated as such if you're using the library, and know
full well why you're using these options (and it kind of can't be without
breaking the API, since you need to pass this when creating an encoder).
I guess it's more like the gcc optimiser, where for almost everyone -O2
is what you want to use, and only some tiny portion of people will really
have the need, and do the detailed testing, to specify specific
optimisation options more directly.

> Here's avconv documentation:
>
> $ avconv -h full | grep libopus -A 10
> avconv version 11-6:11-1, Copyright (c) 2000-2014 the Libav developers
>   built on Sep 13 2014 19:43:14 with gcc 4.9.1 (Debian 4.9.1-13)
> libopus AVOptions:
> -application       <int>   E..A... Intended application type
>    voip                    E..A... Favor improved speech intelligibility
>    audio                   E..A... Favor faithfulness to the input
>    lowdelay                E..A... Restrict to only the lowest delay modes
> -frame_duration    <float> E..A... Duration of a frame in milliseconds
> -packet_loss       <int>   E..A... Expected packet loss percentage
> -vbr               <int>   E..A... Variable bit rate mode
>    off                     E..A... Use constant bit rate
>    on                      E..A... Use variable bit rate
>    constrained             E..A... Use constrained VBR

Yeah, the descriptions of those options are the sort of
oversimplification we were trying to avoid (and even the description in
the library docs could probably be better now).

For some cases "audio" mode could actually give "improved speech
intelligibility", and none of the modes are "faithful to the input". It's
a lossy codec, so by definition it's unfaithful. It just tries to be
unfaithful in ways you can't hear (in all of the modes), but it always
tries to be as unfaithful as it can get away with, because that's how you
save bits and get compression.

And low-delay doesn't actually restrict you to the lowest delay modes at
all. It just removes the extra codec lookahead delay that is required
when using SILK or hybrid modes; you can still select frame sizes giving
the largest possible latency.

Or put another way, just about any one line description of these options
is going to be completely misleading to anyone who doesn't dig a whole
lot deeper than that.
So having to do that to figure out how to set them (or know that they
exist) isn't an entirely terrible state of affairs. It's not bad that
they do exist, but exposing them gives people the impression they ought
to use them or need to pick one, which in general they shouldn't and
don't, at least in the normal case for opusenc.

Apparently one mitigating factor for libav here is that it can also be
used to stream RTP, so the alternative options might actually be more
relevant to it than they are to opusenc.

> > If you have some really compelling need for this we can run that past
> > Greg, but I suspect he'll probably say "use the ctl, or most
> > preferably don't!" unless it's something we've really never thought
> > about before.
>
> My usecase is to compare tools. I have learned that avconv use of
> libvpx is inferior to using vpxenc directly, and I became curious if
> that was the case for their use of other libraries too. That's how I
> discovered this feature offered via libav but not opusenc.
>
> Might be that Handbrake and some of the gazillion other transcoding
> wrappers make use of the feature too.

Ok, so the short answer for your case then is that most of the time you'd
just want the "more obvious" options from opusenc, but when you do really
need to tweak this to do a direct comparison of some special mode from
another tool, you'd indeed use the --set-ctl-int option.

I believe that libav/ffmpeg does now have their own reimplementation of a
decoder (I'm not sure if we have that in Debian yet though), but most of
the quality related things are generally encoder side, so if you do find
any notable difference that would definitely be something worth
reporting, since it's likely to be a "simple bug" rather than some
fundamental shortcoming somewhere.

One other thing to be aware of, depending on how you're doing your
comparisons, is that opusdec will dither by default, which improves the
audible quality of low level signals, but does raise the measured noise
floor.
You can turn that off if you need to.

  Cheers,
  Ron
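
To put rough numbers on the earlier point that low-delay does not restrict you to the lowest delay modes, here is a small worked sketch. The lookahead figures are assumptions taken from the commonly quoted Opus numbers (2.5 ms of CELT overlap, plus 4 ms of extra delay kept so that SILK and hybrid modes stay available); the function name is hypothetical, and real encoders may report slightly different values.

```python
# Assumed lookahead figures in milliseconds; check against the Opus docs.
CELT_LOOKAHEAD_MS = 2.5    # MDCT window overlap, always present
EXTRA_SILK_DELAY_MS = 4.0  # extra delay that low-delay mode removes
VALID_FRAME_MS = (2.5, 5, 10, 20, 40, 60)


def algorithmic_delay_ms(frame_ms, lowdelay=False):
    """Approximate encoder algorithmic delay: frame duration plus lookahead."""
    if frame_ms not in VALID_FRAME_MS:
        raise ValueError(f"not an Opus frame duration: {frame_ms}")
    lookahead = CELT_LOOKAHEAD_MS
    if not lowdelay:
        lookahead += EXTRA_SILK_DELAY_MS
    return frame_ms + lookahead


if __name__ == "__main__":
    for frame in VALID_FRAME_MS:
        print(frame,
              algorithmic_delay_ms(frame),
              algorithmic_delay_ms(frame, lowdelay=True))
```

Under these assumptions, low-delay mode with 60 ms frames still has more than twice the delay of the default mode with 20 ms frames, so the frame size matters far more than the application setting.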