> -----Original Message----- > From: ffmpeg-devel <ffmpeg-devel-boun...@ffmpeg.org> On Behalf Of Anton > Khirnov > Sent: 2021年4月1日 16:13 > To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org> > Subject: Re: [FFmpeg-devel] [PATCH V6 4/6] lavu: add side data > AV_FRAME_DATA_BOUNDING_BOXES > > Quoting Guo, Yejun (2021-03-26 09:09:29) > > Signed-off-by: Guo, Yejun <yejun....@intel.com> > > --- > > doc/APIchanges | 2 ++ > > libavutil/Makefile | 1 + > > libavutil/boundingbox.h | 79 > +++++++++++++++++++++++++++++++++++++++++ > > libavutil/frame.c | 1 + > > libavutil/frame.h | 7 ++++ > > 5 files changed, 90 insertions(+) > > + > > +#ifndef AVUTIL_BOUNDINGBOX_H > > +#define AVUTIL_BOUNDINGBOX_H > > + > > +#include "libavutil/rational.h" > > + > > +typedef struct AVBoundingBox { > > + /** > > + * Distance in pixels from the top edge of the frame to top > > + * and bottom, and from the left edge of the frame to left and > > + * right, defining the bounding box. > > + */ > > + int top; > > + int left; > > + int bottom; > > + int right; > > Is this allowed to be negative? Can/should this be size_t?
There was a long discussion about size_t/int/uint32_t when I added struct AVRegionOfInterest in frame.h, and finally size_t is not chosen. > > > + > > +#define AV_BBOX_LABEL_NAME_MAX_SIZE 32 > > + > > + /** > > + * Detect result with confidence > > + */ > > + char detect_label[AV_BBOX_LABEL_NAME_MAX_SIZE]; > > + AVRational detect_confidence; > > + > > + /** > > + * At most 4 classifications based on the detected bounding box. > > + * For example, we can get max 4 different attributes with 4 different > > + * DNN models on one bounding box. > > + * classify_count is zero if no classification. > > + */ > > +#define AV_NUM_BBOX_CLASSIFY 4 > > + uint32_t classify_count; > > size_t? > > > + char > classify_labels[AV_NUM_BBOX_CLASSIFY][AV_BBOX_LABEL_NAME_MAX_SIZE]; > > + AVRational classify_confidences[AV_NUM_BBOX_CLASSIFY]; > > +} AVBoundingBox; > > + > > +typedef struct AVBoundingBoxHeader { > > This structure is not extensible in an ABI-compatible way. yes, we can add a version number (enum AVBBoxHeaderVersion, see below) at the beginning of the struct, and the user needs to set it with AV_BBOX_HEADER_CURRENT. It is not elegant, is this what we really want? enum AVBBoxHeaderVersion { AV_BBOX_HEADER_NONE = 0, AV_BBOX_HEADER_V1, AV_BBOX_HEADER_CURRENT = AV_BBOX_HEADER_V1, }; typedef struct AVBoundingBoxHeader { enum AVBBoxHeaderVersion version; union { AVBoundingBoxHeaderV1 v1; } header; } AVBoundingBoxHeader; We have carefully thought/discussed about the struct for months, and there's little chance for later update, imho. > > > + /** > > + * Information about how the bounding box is generated. > > + * for example, the DNN model name. > > + */ > > + char source[128]; > > + > > + /** > > + * The size of frame when it is detected. > > + */ > > + int frame_width; > > + int frame_height; > > Why? so we know if the value of top/left/bottom/right in AVBoundingBox can be used directly in other later filters. for example, the filter chain is: dnn_detect -> scale -> dnn_classify (in plan) In filter dnn_classify, it needs to check the frame size to know if the values in bounding boxes (the detection result) are still valid. > > > + > > + /** > > + * Number of bounding boxes. > > + */ > > + uint32_t nb_bbox; > > size_t? > > -- > Anton Khirnov > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org > with subject "unsubscribe". _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".