On 7/19/05, Monty <[EMAIL PROTECTED]> wrote:
> On Tue, Jul 19, 2005 at 04:05:59PM -0700, Michael K. Edwards wrote:
> > That's mighty cool. Can you say anything about the Mercora encoder's
> > psycho-acoustic bits
>
> In fact, I can't say much about it (I know all about it but am under
> NDA).

That's what I expected. Such is life.

> > or about how you approach the risk that loading
> > a particular codebook into the Vorbis decoder would result in
> > something patent-infringing?
>
> The codebooks are Huffman trees + a value per leaf: just data. The
> code that applies them may infringe, but I doubt very much that raw
> data itself can, genomics stupidity notwithstanding.

That's a little like saying that no software can possibly infringe a patent because the object code is just data consumed by a Von Neumann machine. Only a little, of course; the codebook abstraction is hardly Turing complete. But suppose that the Vorbis decoder fit most of the claims of a patent, and that a certain pattern of codebook usage completed the fit. Then combining the two would be a patent-infringing use, and the suppliers of one or both (not to mention of the encoder) could be held liable depending on criteria such as whether there are substantial non-infringing uses.

Let me make that a little more concrete. Lucent's patent #5,341,457 (at issue in the Dolby suit) has four independent claims:

#1 ("method of processing ... audio signals", i.e., encoding)
#10 ("storage medium" to which is applied a "recording signal", i.e., the data format put in a "physical" form according to the patent-agent shibboleth of the day)
#13 ("method of transmitting audio signals", i.e., streaming encoder)
#17 ("method for generating signals", i.e., the encoding process again, but this time stated all in one claim and hewing a little more closely to the preferred embodiment than #1 does)

The disclosure also describes the decoder for these "signals". It is wholly plausible to me (IANAL, TINLA) that the history of the patent application would support a claim either that the act of decoding such a "storage medium" is an infringing use or that the examiner erroneously insisted on the "storage medium" lingo when the proper subject matter of the invention is the encoding and decoding processes.

Now, my reading of this patent is that the "novel" bit of each independent claim is the use of "at least one tonality value reflecting the degree to which said time sequence of audio signals comprises tone-like quality" to control the noise masking threshold used when quantizing. The rest is vanilla blockwise transform coding (in the disclosure, a 2048-point FFT). In the preferred embodiment, the "tonality value" is a "Spectral Flatness Measure", a relatively inexpensive-to-calculate (given a cheap floating point multiply, anyway) proxy for a true statistical measure of tone strength. The disclosure is quite articulate on the scientific basis for varying the noise threshold, and hence the quantization, based on the degree of tonality in a given critical band.

A range of noise thresholds would presumably translate, in the Vorbis codec as it does in the "entropy-coded case" of the '457 preferred embodiment, to a range of Huffman codebooks. Without going into needless detail, I submit that one could easily construct a Vorbis encoder that selected codebooks for residue encoding using substantially the method taught in the '457 patent. Would its output be meaningfully distinguishable from that of the reference Vorbis encoder or of the Mercora encoder? I have not studied either enough to be able to answer that question.

Note that I turned first to the '457 patent, partly because its claim structure is simpler, but also because its claimed invention appears to me to be a little closer to the heart of the Vorbis system. A quick glance at #5,579,430 (the principal MP3 patent) persuades me that I could go through a similar exercise, not with claim 1 (since Vorbis doesn't appear to provide an escape mechanism from codebook into "PCM", i.e., raw data for rare entries), but with each of the other independent claims, 19 and 22. Personally I think both of these claims are very weak on both the originality and non-obviousness fronts.
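(An aside on the Spectral Flatness Measure mentioned above, for anyone who wants to see how cheap it is. In its textbook form it is the ratio of the geometric mean to the arithmetic mean of the power spectrum over a band. The sketch below is my own illustration, not code from the '457 disclosure or from any encoder; the function names, the epsilon floor, and the -60 dB reference used to map flatness onto a 0..1 tonality figure are all assumptions.)

```python
import math

def spectral_flatness(power):
    """Textbook SFM: geometric mean / arithmetic mean of a power spectrum.

    Close to 1.0 for a flat, noise-like band; close to 0.0 for a tonal one.
    """
    if not power:
        raise ValueError("need at least one spectral bin")
    eps = 1e-12  # assumed floor so log() stays finite for empty bins
    n = len(power)
    log_mean = sum(math.log(p + eps) for p in power) / n
    arith_mean = sum(power) / n + eps
    return math.exp(log_mean) / arith_mean

def tonality(power, sfm_db_ref=-60.0):
    """Map SFM (in dB) onto 0..1, where 1 means strongly tone-like.

    sfm_db_ref is an assumed reference level, chosen for illustration only.
    """
    sfm_db = 10.0 * math.log10(spectral_flatness(power))
    return max(0.0, min(sfm_db / sfm_db_ref, 1.0))

noise_like = [1.0] * 8           # energy spread evenly across the band
tone_like = [1.0] + [0.0] * 7    # all energy in a single bin
print(tonality(noise_like), tonality(tone_like))
```

(An encoder could then lower its noise threshold, and so quantize more finely, in bands where the tonality figure is high; how any real encoder actually does that is outside what this sketch shows.)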
In my unqualified opinion, if claims 19 and 22 were ever litigated they would have to have dependent claims containing non-trivial psycho-acoustic results or other engineering benefits folded into them, or else they could well be invalidated altogether. The claims dependent on 22 make it clear that it is about re-establishing sync in mid-stream, and hence outside the domain of Vorbis proper. But 19, 20, and 21 together represent a psycho-acoustic tactic that I wouldn't immediately dismiss as unfit for patenting, and that could easily be embodied in an alternate Vorbis encoder.

> > Have you tried, just for kicks, mapping
> > the AC-3 and/or MP3 techniques onto the Vorbis framework?
>
> Vorbis isn't a framework, it's a codec. A more flexible codec than the
> others, but still just a codec.
>
> The techniques used by both mp3 and AC3 are, to put it bluntly,
> ancient. Although there was once some 'cargo cult' tendency to try
> out what the other encoders did, for the most part the external
> techniques turned out to be obsolete or inappropriate. Floor 0 is the
> most visible example of taking a cue from outside research without
> thinking it through (LSP is a *terrible* idea for wideband encodings).

AC-3 and DTS are really very different from the music-oriented codecs. They use an impressive amount of ad-hockery to handle the vagaries of film sound (pop and classical music, speech, quiet ambient sounds, Foley work, explosions, subsonics, comfort noise, and most combinations of the above, spread across six channels or so with different purposes and frequency responses, _plus_ markup to support variant post-processing such as alternate voice dubs and dynamic range compression for non-theatre listening environments). In whatever respects the Vorbis design reflects newer and better fundamental research, it was silly for me to suggest that typical AC-3 media could be losslessly transcoded into a Vorbis bitstream without a considerable increase in bitrate.
(Almost as silly, actually, as claiming that an AC-3 encoder is just a formalized chunk of pure math.)

Although I do not by any means know all of the ins and outs of the MP3 format, I think there is more reason to believe that a lossless transcoder from MP3 to Vorbis might be possible, at least for some flavors of MP3. Your codebook might bloat out because you have to shoehorn into it all the values that the MP3 coded as PCM escapes, and you might not be able to represent all of the joint encoding variants used to improve the Huffman efficiency. For these and various other reasons you would doubtless get a less efficiently coded Vorbis stream than the output of a native Vorbis encoder at the same perceptual quality, but that's not the point. Your decoder is more flexible than those with baked-in equivalents to the codebooks that you prepend to the encoded data. This means that you might be able to embody in your encoder the moral equivalent of the psycho-acoustic techniques used in The Other Guy's. In your shoes, I wouldn't want to wait until the discovery phase of a lawsuit to find out that The Other Guy's expert witness has figured out a way to coax your encoder into producing a stream whose codebooks will look like a smoking gun in the eyes of judge and jury.

> In general, the 'lock you up tight' patents that the other firms go
> for are not ones that strictly affect encoding or the raw bitstream
> itself; they attempt to patent sufficient algorithms around the data
> that it's impossible to encode/decode the bitstream itself without
> infringing. This is another reason I feel relatively secure about
> Vorbis; the bitstream looks/works nothing like the competition.
> Should, God forbid, Vorbis be accused of using some specific technique
> that is not central to handling the bitstream, we could sidestep it
> easily. The only worrisome patents are the abusive, overly-broad
> ones.

The other firms have a variety of patent agents and attorneys at their disposal, thinking about the problem of securing legal barriers to competition from a variety of angles. Some of them focus on blocking unauthorized interoperable implementations, and others think more about how broad a swathe of techniques they can presumptively encumber for horse-trading purposes. AFAICT the differences between the various music-oriented bitstream types, including Ogg Vorbis, are more quantitative than qualitative -- except where sync strategies are concerned, which AIUI is more of an Ogg thing.

I agree with you that some patents as granted, and occasionally even as litigated, are overly broad, and that the incidence of these failures is higher in the digital arena than in some others. But the fact that your bitstreams are not trivially interoperable does not mean that you are automatically safe from being found to infringe a patent of the limited scope typical of those in other industries.

> However, the biggest reason I feel secure is that most of the world is
> currently using and shipping Vorbis daily. Even Microsoft ships it in
> games (where it's not obvious that it's there, but it is nonetheless).

I'm glad that it's commercially successful! But note that "most of the [media-producing] world", by revenues, engages in some kind of patent horse-trading. You can't be sure that, say, Microsoft is comfortable using your format because their lawyers judge it to be patent-free rather than because they already have blanket licenses or no-sue agreements with the holders of patents that would otherwise concern them. There's really no substitute for the opinion of your own competent counsel (which I am not).

> > It would be kind of fun to write a lossless transcoder to Vorbis from
> > one or more patent-encumbered formats and to see if there are any
> > discernible patterns in the codebooks.
>
> Can't happen. The transform domains are not compatible.

Looks to me like MPEG Audio Decoder (libmad) uses an IMDCT, just like the decoder in the Vorbis I spec -- which is no surprise given that you cite (unless I am gravely mistaken) Fraunhofer's Dr. Brandenburg for its definition. Or is there some other fundamental incompatibility that I'm missing?

> > It might also be a prudent
> > defensive measure so that you can demonstrate what a potentially
> > infringing Vorbis stream would look like and evaluate to what extent
> > you can distinguish them from Mercora streams.
>
> Mercora is 100% real Vorbis. Aside from a different vendor string I
> don't believe they are distinguishable from streams produced by our
> reference encoder.

I understand that they are interoperable. But I presume, based on what you have written, that the Mercora encoder uses psycho-acoustic techniques that are both more bit-efficient and substantially less processor-intensive than the reference encoder. This comment and those that followed were predicated on the assumption that you and/or Xiph.org were involved in the design and implementation of the Mercora encoder, or at least have some interest in the question of whether it or the bitstreams it produces are potentially patent-encumbered; my apologies if that is not the case.

> > Could be doubly
> > prudent if there's anything about the Mercora internals that you
> > wouldn't want to have to divulge into the public record during a court
> > proceeding, since presumably in the absence of a patent you have no
> > way of retaining proprietary rights to that encoder's methods of
> > operation other than trade secret law.
>
> The Mercora encoder isn't ours and we have no rights to it, but I will
> say it doesn't do anything the reference encoder doesn't. Aside from
> that, I'm not sure what your point actually is; the worry that third
> parties using Vorbis would be exposing themselves to being forced to
> violate NDA?

No, not at all. I was conflating your interests and Mercora's here.
My thought was that it could be difficult to defend a charge of patent infringement, hypothetically supported by evidence of similarity between a pattern of bit allocation in Mercora output and the analogous pattern in the output of a patented encoder, without divulging some details about the Mercora encoder's internals that are currently not public.

The fact that you are under NDA about the Mercora psycho-acoustics suggests that they are held as a trade secret. That's a perfectly valid strategy; but it means that Mercora only has legal reinforcement for its efforts to retain a technological edge over its competitors so long as the techniques remain secret. A patent infringement proceeding is one of the easier ways for such a secret to be forced into the open, at which point it's available for use by all comers (apart from copyright, but that's usually no barrier to reimplementation). Contrariwise, in the event that Mercora is not commercially successful, its techniques could wind up in a sort of legal limbo in which no one who knows them is ever legally permitted to disclose them or use them elsewhere. I mean this in the nicest possible way, but those are exactly the risks that the patent system is designed to avert.

> > I'm just trying to understand
> > how deliberately eschewing patents works out in a field littered with
> > them.
>
> If I was going to be worried about patents to the level of paranoia
> some suggest, I'd have to give up computers and become a blacksmith or
> machinist, or something (perhaps a hooligan, that's always appealed,
> but I hate soccer and cheap booze). You can't demonstrate
> conclusively that a single piece of software, anywhere, does not
> infringe any patent. How many patents does GCC 'infringe'? 100?
> 1000? 10,000? The only answer is: "The courts have not awarded any
> infringement claim against the FSF regarding GCC" and that is the
> closest practical definition we have of "does not infringe".
> Vorbis meets the same definition and, honestly, is really not any more
> likely than GCC to see an infringement claim (e.g., Microsoft is not
> 'at war' with us the way they are with the FSF. Microsoft is about as
> aggressive as software companies get, yet for some reason they're not
> using the patent card).

Has there ever been a cease-and-desist letter, let alone an infringement proceeding, claiming patent rights against a compiler for a language that GCC supports? I'm not saying there hasn't, nor have I even researched the question. But the solutions to some problems require little dollops of ingenuity and large amounts of grunt work rather than the sort of quantum of novelty that patents are designed to encourage. Such problems are no less worthy of skill and design rigor, but they're closer to architecture than to applied science. Compilers may or may not be in this category, but there's no question that a lot of other software is.

I've done other work of which I'm much prouder than the one patent I (successfully) applied for, but I would have to say that I haven't reduced any other invention to practice within the statutory definition as I understand it. That one patent, seen through the lens of time, is right smack in this signal-perception nexus (video motion estimation rather than psycho-acoustics); the closest I ever came to reducing a second invention to practice was encryption-related. If there were a third candidate to date, it would be outside software altogether -- and I've spent practically my whole working life so far doing software, relatively little of it in the above areas. That doesn't feel like a coincidence to me.

> The only suggestion, at any time, that there may be an infringement
> claim against Vorbis was an off-the-cuff remark from Henri Linde of
> Thomson years ago when he was under the impression that 'Vorbis' was
> just a tweaked mp3 encoder. He was corrected and retracted his
> remarks (but that followup was not widely reported).

I'm glad to hear that at least one of the sleeping dogs has considered attempting to bite but decided against it. I very much doubt that anything I write here adds to your risks, since I have no special knowledge or skills in this area and your competitors have more qualified analysts and attorneys than either of us can shake a stick at.

> > I am going on the press release at
> > http://investor.dolby.com/ReleaseDetail.cfm?ReleaseID=161066 ; I
> [...]
>
> At this point a lawyer who knows what actually happened has to weigh
> in and let us know; anything else is guessing, hearsay and uninformed
> speculation I fear :-( Not that it's ever stopped Debian legal before,
> but I'm not personally going to get involved in such a discussion
> myself.

Nor I, unless I get around to dropping by the law library and shelling out for the whole PACER docket. Last time I did that I was a bit disappointed, but the Dolby case is a lot fresher.

Cheers,
- Michael
(IANAL, TINLA)