Le nonidi 9 thermidor, an CCXXIII, Vesselin Bontchev a écrit : > Hi! > > This patch adds support for Audible AA files. > > Audible samples can be obtained from, > > https://gitlab.com/vesselin.bontchev/audible-samples/tree/master > https://samples.ffmpeg.org/audible/ >
> Currently, this code generates corrupt audio output (in some places, and > deterministically). Does this happen with the Go and Python implementations too? > > By posting this patch, I am hoping to get some early feedback, and help in > making this work. > > Thanks, > Vesselin > From ba73543efc3fdc4b5c61e9cb56f998d748716e00 Mon Sep 17 00:00:00 2001 > From: Vesselin Bontchev <vesselin.bontc...@yandex.com> > Date: Sun, 19 Jul 2015 23:16:36 +0200 > Subject: [PATCH] Add support for Audible AA files > > https://en.wikipedia.org/wiki/Audible.com#Quality > --- > doc/demuxers.texi | 10 ++ > doc/general.texi | 2 + > libavformat/Makefile | 1 + > libavformat/aadec.c | 381 > +++++++++++++++++++++++++++++++++++++++++++++++ > libavformat/allformats.c | 1 + > 5 files changed, 395 insertions(+) > create mode 100644 libavformat/aadec.c > > diff --git a/doc/demuxers.texi b/doc/demuxers.texi > index e45e1af..df95233 100644 > --- a/doc/demuxers.texi > +++ b/doc/demuxers.texi > @@ -18,6 +18,16 @@ enabled demuxers. > > The description of some of the currently available demuxers follows. > > +@section aa > + > +Audible Format 2, 3, and 4 demuxer. > + > +This demuxer is used to demux Audible Format 2, 3, and 4 (.aa) files. > + > +@example > +ffmpeg -v debug -i input.aa -c:a copy output.wav > +@end example > + > @section applehttp > > Apple HTTP Live Streaming demuxer. > diff --git a/doc/general.texi b/doc/general.texi > index a260e79..2b782e0 100644 > --- a/doc/general.texi > +++ b/doc/general.texi > @@ -228,6 +228,8 @@ library: > @item 8088flex TMV @tab @tab X > @item AAX @tab @tab X > @tab Audible Enhanced Audio format, used in audiobooks. > +@item AA @tab @tab X > + @tab Audible Format 2, 3, and 4, used in audiobooks. > @item ACT Voice @tab @tab X > @tab contains G.729 audio > @item Adobe Filmstrip @tab X @tab X > diff --git a/libavformat/Makefile b/libavformat/Makefile > index cc73fd8..8405f8d 100644 > --- a/libavformat/Makefile > +++ b/libavformat/Makefile > @@ -395,6 +395,7 @@ OBJS-$(CONFIG_RTSP_DEMUXER) += rtsp.o > rtspdec.o httpauth.o \ > OBJS-$(CONFIG_RTSP_MUXER) += rtsp.o rtspenc.o httpauth.o \ > urldecode.o > OBJS-$(CONFIG_SAMI_DEMUXER) += samidec.o subtitles.o > +OBJS-$(CONFIG_AA_DEMUXER) += aadec.o > OBJS-$(CONFIG_SAP_DEMUXER) += sapdec.o > OBJS-$(CONFIG_SAP_MUXER) += sapenc.o > OBJS-$(CONFIG_SBG_DEMUXER) += sbgdec.o > diff --git a/libavformat/aadec.c b/libavformat/aadec.c > new file mode 100644 > index 0000000..bd994e8 > --- /dev/null > +++ b/libavformat/aadec.c > @@ -0,0 +1,381 @@ > +/* > + * Audible AA demuxer > + * Copyright (c) 2015 Vesselin Bontchev > + * > + * Inspired by https://github.com/jteeuwen/audible, and > + * https://code.google.com/p/pyaudibletags/source projects. Inspired, or translated from Go to C? The readString() wrapper seems to indicate a step-by-step translation, and that warrants copyright. The licence is GPL-compatible, though, but attribution must be preserved. > + * > + * See https://en.wikipedia.org/wiki/Audible.com#Quality for > + * format details. > + * > + * This file is part of FFmpeg. > + * > + * FFmpeg is free software; you can redistribute it and/or > + * modify it under the terms of the GNU Lesser General Public > + * License as published by the Free Software Foundation; either > + * version 2.1 of the License, or (at your option) any later version. > + * > + * FFmpeg is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + * Lesser General Public License for more details. > + * > + * You should have received a copy of the GNU Lesser General Public > + * License along with FFmpeg; if not, write to the Free Software > + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 > USA > + */ > + > +#include "avformat.h" > +#include "internal.h" > +#include "libavutil/intreadwrite.h" > +#include "libavutil/tea.h" > +#include "libavutil/fifo.h" > +#include "libavutil/opt.h" > + > +#define AA_MAGIC 1469084982 /* this identifies an audible .aa file */ > +#define MAX_CODEC_SECOND_SIZE 3982 > +#define MAX_TOC_ENTRIES 64 > +#define TEA_BLOCK_SIZE 8 > + > +typedef struct AADemuxContext { > + int64_t data_end; Looks unused. > + uint32_t HeaderSeed; > + uint32_t Filesize; > + uint32_t Magic; > + uint32_t tocSize; > + uint32_t npairs; > + union { > + unsigned char key[16]; > + uint32_t part[4]; > + } HeaderKey; > + uint32_t TOC[MAX_TOC_ENTRIES][2]; These fields are used only in the read_header() function, they could (should?) be local variables. And some of them are stored but never accessed. And TOC could be a structure. > + > + void *aa_fixed_key; > + int aa_fixed_key_size; > + uint32_t start; > + uint32_t end; Same as above: local. > + int32_t codec_second_size; > + char codec_name[64]; Local. > + struct AVTEA *tea_ctx; > + uint8_t final_key[16]; > + int64_t current_chapter_size; > + int64_t current_codec_second_size; > + uint32_t trailing_bytes; Looks unused. > + int64_t total_parsed; Looks unused. > + struct AVFifoBuffer *cbuf; > + int chapter_idx; > +} AADemuxContext; > + > +static int32_t GetSecondSizeByCodecID(char *codec_name) Any reason to use UglyCamelCase for this? > +{ > + int32_t result = 0; > + > + if (!strcmp(codec_name, "mp332")) { > + // 0xC00D (marker in file) I do not understand the comment. > + result = 3982; > + } else if (!strcmp(codec_name, "acelp85")) { > + // 0xC00C > + result = 2000; > + } else if (!strcmp(codec_name, "acelp85")) { > + // 0xC010 > + result = 1045; > + } > + > + return result; > +} > + > +static void readString(AVIOContext *pb, int size, char *output) > +{ > + avio_read(pb, output, size); > +} Error code ignored. And the usefulness of this wrapper is hard to see. > + > +static int aa_read_header(AVFormatContext *s) > +{ > + AVStream *st; > + AADemuxContext *c; > + AVIOContext *pb; > + > + int i, j, idx; > + uint32_t nkey; > + uint32_t nval; > + char key[512] = {0}; > + char val[512] = {0}; > + unsigned char buffer[512] = {0}; > + uint32_t v0, v1; > + unsigned char dst[8]; > + unsigned char src[8]; > + unsigned char *output = buffer; Looks useless. > + int largest_idx = -1; > + int64_t largest_size = -1; > + int64_t current_size = -1; > + > + c = s->priv_data; > + pb = s->pb; You could init that directly while declaring the variables, that is how it is done. > + c->tea_ctx = av_tea_alloc(); > + if (!c->tea_ctx) > + return AVERROR(ENOMEM); > + c->cbuf = av_fifo_alloc(MAX_CODEC_SECOND_SIZE * 2); > + if (!c->cbuf) > + return AVERROR(ENOMEM); Leaks. > + > + c->Filesize = avio_rb32(pb); // file size > + c->Magic = avio_rb32(pb); // magic string > + c->tocSize = avio_rb32(pb); // TOC size > + avio_rb32(pb); // unidentified integer > + for (i = 0; i < c->tocSize; i++) { // read TOC > + avio_rb32(pb); // TOC entry index > + c->TOC[i][0] = avio_rb32(pb); // block Offset > + c->TOC[i][1] = avio_rb32(pb); // block size Unsafe, tocSize needs to be validated. > + } > + avio_read(pb, buffer, 24); // header termination block (ignored) > + c->npairs = avio_rb32(pb); // read dictionary entries > + for (i = 0; i < c->npairs; i++) { > + memset(val, 0, sizeof(val)); > + memset(key, 0, sizeof(key)); > + avio_r8(pb); // unidentified byte > + nkey = avio_rb32(pb); // key string length > + nval = avio_rb32(pb); // value string length > + if (nkey > 512) { sizeof(key) > + avio_seek(pb, nkey, SEEK_CUR); avio_skip() > + } else { > + readString(pb, nkey, key); // key string > + } > + if (nval > 512) { > + avio_seek(pb, nval, SEEK_CUR); Ditto. > + } else { > + readString(pb, nval, val); // value string. > + } > + if (!strncmp(key, "codec", 5)) { This will match "codec_params" too, you need to take the length into account. I suggest to add a 0-terminator to key and use a simple strcmp(). > + strncpy(c->codec_name, val, sizeof(c->codec_name) - 1); > + } > + if (!strncmp(key, "HeaderSeed", 10)) { > + c->HeaderSeed = atoi(val); > + } > + if (!strncmp(key, "HeaderKey", 9)) { > + j = 0; > + for (idx = 0; idx < 4; idx++) { > + c->HeaderKey.part[idx] = (uint32_t)atoi(val + j); > + c->HeaderKey.part[idx] = AV_RB32(c->HeaderKey.part + idx); > // convert to BE! Urgh! uint32_t key_part = atoi(...); AV_WB32(c->header_key + idx + 4, key_part); > + if (idx == 3) > + break; > + while (val[j] != ' ') // find the next space > + j++; > + j = j + 1; // skip over the space You could do much simpler using sscanf(): sscanf(val, "%d%d%d%d", &key_part[0], &key_part[1], &key_part[2], &key_part[3]); > + } > + } > + } > + > + /* decryption key derivation */ > + av_tea_init(c->tea_ctx, c->aa_fixed_key, 16); > + c->codec_second_size = GetSecondSizeByCodecID(c->codec_name); > + if (c->codec_second_size == -1) { The function never returns -1. > + return AVERROR_DECODER_NOT_FOUND; > + } > + memcpy(output + 0, "\x00\x00", 2); // purely for padding purposes > + memcpy(output + 2, &c->HeaderKey, 16); > + idx = 0; > + for (i = 0; i < 3 * TEA_BLOCK_SIZE; i += TEA_BLOCK_SIZE) { > + v0 = c->HeaderSeed; > + v1 = c->HeaderSeed + 1; > + AV_WB32(src, v0); > + AV_WB32(src + 4, v1); > + c->HeaderSeed = v1 + 1; > + av_tea_crypt(c->tea_ctx, dst, src, TEA_BLOCK_SIZE, NULL, 0); // TEA > ECB encrypt > + for (j = 0; j < 8 && idx < 18; j+=1, idx+=1) { > + output[idx] = output[idx] ^ dst[j]; > + } Not entirely sure: is it not a re-implementation of CBC mode? > + } > + memcpy(c->final_key, output + 2, 16); // skip first 2 bytes of output > + > + /* decoder setup */ > + st = avformat_new_stream(s, NULL); > + if (!st) > + return AVERROR(ENOMEM); > + st->codec->codec_type = AVMEDIA_TYPE_AUDIO; > + if (!strcmp(c->codec_name, "mp332")) { > + st->codec->codec_id = AV_CODEC_ID_MP3; > + } else if (!strcmp(c->codec_name, "acelp85")) { > + st->codec->codec_id = AV_CODEC_ID_SIPR; > + st->codec->block_align = 19; > + st->codec->channels = 1; > + st->codec->sample_rate = 8500; > + } else if (!strcmp(c->codec_name, "acelp16")) { > + st->codec->codec_id = AV_CODEC_ID_SIPR; > + st->codec->block_align = 20; > + st->codec->channels = 1; > + st->codec->sample_rate = 16000; > + } else { > + return AVERROR_DECODER_NOT_FOUND; > + } > + > + /* determine, and jump to audio start offset */ > + for (i = 1; i < c->tocSize; i++) { // skip the first entry! > + current_size = c->TOC[i][1]; > + if (current_size > largest_size) { > + largest_idx = i; > + largest_size = current_size; > + } > + } > + c->start = c->TOC[largest_idx][0]; > + c->end = c->TOC[largest_idx][1]; > + avio_seek(pb, c->start, SEEK_SET); > + c->current_chapter_size = 0; > + > + return 0; > +} > + > +static int aa_read_packet(AVFormatContext *s, AVPacket *pkt) > +{ > + AADemuxContext *c; > + unsigned char dst[8]; > + unsigned char src[8]; > + int i; > + int trailing_bytes; > + int blocks; > + uint8_t buf[MAX_CODEC_SECOND_SIZE * 2]; > + int written = 0; > + int queued_size; > + int ret = 0; > + int shortfall; > + int extra; > + > + c = s->priv_data; > + > + // do we already have some audio ouput from the last invocation? > + queued_size = av_fifo_size(c->cbuf); > + if (queued_size) { > + // av_log(s, AV_LOG_DEBUG, "[aa] draining queued data (%d bytes)\n", > queued_size); > + av_fifo_generic_read(c->cbuf, buf, queued_size, NULL); > + written = written + queued_size; > + } > + > + // are we at the start of a chapter? > + if (c->current_chapter_size == 0) { > + c->current_chapter_size = avio_rb32(s->pb); > + if (c->current_chapter_size == 0) { > + ret = AVERROR_EOF; > + goto end; > + } > + av_log(s, AV_LOG_DEBUG, "[aa] chapter %d (%ld bytes)\n", > c->chapter_idx, c->current_chapter_size); > + c->chapter_idx = c->chapter_idx + 1; > + avio_rb32(s->pb); // data start offset > + c-> current_codec_second_size = c->codec_second_size; Nit: strange space after ->. > + } > + > + // is this the last block in this chapter? > + if (c->current_chapter_size / c->current_codec_second_size == 0) { > + c->current_codec_second_size = c->current_chapter_size % > c->current_codec_second_size; > + } > + > + // decrypt c->current_codec_second_size bytes > + blocks = c->current_codec_second_size / TEA_BLOCK_SIZE; > + for (i = 0; i < blocks; i++) { > + avio_read(s->pb, src, TEA_BLOCK_SIZE); > + av_tea_init(c->tea_ctx, c->final_key, 16); > + av_tea_crypt(c->tea_ctx, dst, src, 1, NULL, 1); > + memcpy(buf + written, dst, TEA_BLOCK_SIZE); > + written = written + TEA_BLOCK_SIZE; > + } > + trailing_bytes = c->current_codec_second_size % TEA_BLOCK_SIZE; > + if (trailing_bytes != 0) { // trailing bytes are left unencrypted! > + avio_read(s->pb, src, trailing_bytes); > + memcpy(buf + written, dst, trailing_bytes); Looks strange: you read to src but copy dst. May this be the cause of the glitches? > + written = written + trailing_bytes; > + } > + > + // update state > + c->current_chapter_size = c->current_chapter_size - > c->current_codec_second_size; > + if (c->current_chapter_size <= 0) > + c->current_chapter_size = 0; > + > + if (written < c->codec_second_size) { // don't have enough output for > the decoder, pull in data from the next chapter > + shortfall = c->codec_second_size - written; > + c->current_chapter_size = avio_rb32(s->pb); > + if (c->current_chapter_size == 0) { > + ret = AVERROR_EOF; > + goto end; > + } > + av_log(s, AV_LOG_DEBUG, "[aa] chapter %d (%ld bytes)\n", > c->chapter_idx, c->current_chapter_size); > + c->chapter_idx = c->chapter_idx + 1; > + avio_rb32(s->pb); // data start offset (ignored) > + c->current_codec_second_size = c->codec_second_size; > + blocks = c->current_codec_second_size / TEA_BLOCK_SIZE; > + for (i = 0; i < blocks; i++) { > + avio_read(s->pb, src, 8); > + av_tea_init(c->tea_ctx, c->final_key, 16); > + av_tea_crypt(c->tea_ctx, dst, src, 1, NULL, 1); > + av_fifo_generic_write(c->cbuf, dst, TEA_BLOCK_SIZE, NULL); > + } > + trailing_bytes = c->current_codec_second_size % TEA_BLOCK_SIZE; > + if (trailing_bytes != 0) { > + avio_read(s->pb, src, trailing_bytes); > + av_fifo_generic_write(c->cbuf, src, trailing_bytes, NULL); > + } This code looks duplicated. > + c->current_chapter_size = c->current_chapter_size - > c->current_codec_second_size; > + if (c->current_chapter_size <= 0) > + c->current_chapter_size = 0; > + > + // borrow "shortfall" bytes > + av_fifo_generic_read(c->cbuf, buf + written, shortfall, NULL); > + written = written + shortfall; > + } > + > + // give only codec_second_size bytes to the decoder > + if (written > c->codec_second_size) { // c->cbuf drain triggers this > + extra = written - c->codec_second_size; > + av_fifo_generic_write(c->cbuf, buf + c->codec_second_size, extra, > NULL); > + written = c->codec_second_size; > + } > + > + av_init_packet(pkt); > + pkt->data = NULL; > + pkt->size = 0; > + pkt->pos = avio_tell(s->pb); > + > + // av_log(s, AV_LOG_DEBUG, "[aa] giving %d bytes to decoder\n", written); > + av_grow_packet(pkt, written); IIRC, av_new_packet() can do that in a single step. And do not forget to check memory allocations. > + memcpy(pkt->data, buf, written); > + > +end: > + return ret; > +} > + > +static int aa_probe(AVProbeData *p) > +{ > + unsigned char *buf = p->buf; > + > + // first 4 bytes are file size, next 4 bytes are the magic > + if (AV_RB32(buf+4) != AA_MAGIC) > + return AVPROBE_SCORE_EXTENSION; > + > + return AVPROBE_SCORE_MAX; Looks very wrong: 32-bits magic number -> MAX, anything else -> EXTENSION. Should be: 32->bits magic number -> MAX/4, anything -> 0. > +} > + > +#define OFFSET(x) offsetof(AADemuxContext, x) > +#define FLAGS AV_OPT_FLAG_DECODING_PARAM > +static const AVOption aa_options[] = { > + { "aa_fixed_key", // extracted from libAAX_SDK.so and AAXSDKWin.dll > files! > + "Fixed key used for handling Audible AA files", OFFSET(aa_fixed_key), > + AV_OPT_TYPE_BINARY, {.str="77214d4b196a87cd520045fd2a51d673"}, > + .flags = AV_OPT_FLAG_DECODING_PARAM }, > + { NULL }, > +}; > + > +static const AVClass aa_class = { > + .class_name = "aa", > + .item_name = av_default_item_name, > + .option = aa_options, > + .version = LIBAVUTIL_VERSION_INT, > +}; > + > +AVInputFormat ff_aa_demuxer = { > + .name = "aa", > + .long_name = NULL_IF_CONFIG_SMALL("Audible AA format files"), > + .priv_class = &aa_class, > + .priv_data_size = sizeof(AADemuxContext), > + .extensions = "aa", > + .read_probe = aa_probe, > + .read_header = aa_read_header, > + .read_packet = aa_read_packet, > + .flags = AVFMT_GENERIC_INDEX, > +}; > diff --git a/libavformat/allformats.c b/libavformat/allformats.c > index 181cb9e..60ec0ca 100644 > --- a/libavformat/allformats.c > +++ b/libavformat/allformats.c > @@ -262,6 +262,7 @@ void av_register_all(void) > REGISTER_MUXER (RTP_MPEGTS, rtp_mpegts); > REGISTER_MUXDEMUX(RTSP, rtsp); > REGISTER_DEMUXER (SAMI, sami); > + REGISTER_DEMUXER (AA, aa); Alphabetical order. > REGISTER_MUXDEMUX(SAP, sap); > REGISTER_DEMUXER (SBG, sbg); > REGISTER_DEMUXER (SDP, sdp); Regards, -- Nicolas George
signature.asc
Description: Digital signature
_______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel