> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-boun...@ffmpeg.org> On Behalf Of Guo, Yejun
> Sent: February 16, 2021 18:37
> To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH V2 08/10] libavutil: add side data
> AVDnnBoundingBox for dnn based detect/classify filters
>
> > -----Original Message-----
> > From: ffmpeg-devel <ffmpeg-devel-boun...@ffmpeg.org> On Behalf Of Mark Thompson
> > Sent: February 16, 2021 7:48
> > To: ffmpeg-devel@ffmpeg.org
> > Subject: Re: [FFmpeg-devel] [PATCH V2 08/10] libavutil: add side data
> > AVDnnBoundingBox for dnn based detect/classify filters
> >
> > On 11/02/2021 08:15, Guo, Yejun wrote:
> > >> -----Original Message-----
> > >> From: ffmpeg-devel <ffmpeg-devel-boun...@ffmpeg.org> On Behalf Of Mark Thompson
> > >> Sent: February 11, 2021 6:19
> > >> To: ffmpeg-devel@ffmpeg.org
> > >> Subject: Re: [FFmpeg-devel] [PATCH V2 08/10] libavutil: add side
> > >> data AVDnnBoundingBox for dnn based detect/classify filters
> > >>
> > >> On 10/02/2021 09:34, Guo, Yejun wrote:
> > >>> Signed-off-by: Guo, Yejun <yejun....@intel.com>
> > >>> ---
> > >>>   doc/APIchanges       |  2 ++
> > >>>   libavutil/Makefile   |  1 +
> > >>>   libavutil/dnn_bbox.h | 68 ++++++++++++++++++++++++++++++++++++++++++++
> > >>>   libavutil/frame.c    |  1 +
> > >>>   libavutil/frame.h    |  7 +++++
> > >>>   libavutil/version.h  |  2 +-
> > >>>   6 files changed, 80 insertions(+), 1 deletion(-)
> > >>>   create mode 100644 libavutil/dnn_bbox.h
> > >>
> > >> What is the intended consumer of this box information? (Is there
> > >> some other filter which will read these and do something with them,
> > >> or some sort of user program?)
> > >>
> > >> If there is no use in ffmpeg outside libavfilter then the header
> > >> should probably be in libavfilter.
> > >
> > > Thanks for the feedback.
> > >
> > > In most cases, other filters will use this box information. For
> > > example, a classify filter can recognize the car number after the
> > > car plate is detected, another filter can apply 'beauty' if a face
> > > is detected, an updated drawbox filter (planned) can draw the box
> > > for visualization, and a new filter such as bbox_to_roi can be
> > > added to apply ROI encoding to the detected result.
> > >
> > > It is possible that others will use it too. For example, the new
> > > codec is adding AI labels, so libavcodec might need it in the
> > > future, and a user program might do something special such as:
> > > 1. use libavcodec to decode
> > > 2. use the detect filter
> > > 3. write its own code to handle the detection result
> > >
> > > As a first step, how about putting it in libavfilter (so it is not
> > > exposed at the API level and we are free to change it when needed)?
> > > We can move it to libavutil once that is required.
> >
> > Sure.
> >
> > >> How tied is this to the DNN implementation, and hence the DNN name?
> > >> If someone made a standalone filter doing object detection by some
> > >> other method, would it make sense for them to reuse this structure?
> > >
> > > Yes, this structure is general. I added the dnn prefix for two reasons:
> > > 1. There's already a bounding box in libavfilter/bbox.h, see below.
> > >    It's simple and we could not reuse it, so we need a new name.
> > >    typedef struct FFBoundingBox {
> > >        int x1, x2, y1, y2;
> > >    } FFBoundingBox;
> >
> > Right, really this is just the return type for the internal
> > ff_calculate_bounding_box() function - if you want to reuse the name
> > externally then it would be fine to rename the existing stuff to get
> > it out of the way.
>
> Yeah, I'll consider renaming it after these patches are done, since
> they do not currently conflict from the compiler's perspective.
>
> > > 2. DNN is currently the dominant method for object detection.
> >
> > Unless your ID values or something else about the output are
> > DNN-specific then I'm not really seeing the attraction of associating
> > them with the DNN name for external use. If a user wants to detect
> > some objects in an image and then do something with the result then
> > maybe they know they are using DNN for first step, but they won't care
> > about where the result came from after that.
>
> That reminds me that we might need to save some information such as the
> model name, the name of the training data set, the name/parameters of
> other non-DNN implementations, etc., so that the user knows more about
> the bboxes. For example, we can add 'char source[128]' as a box header
> for all the bboxes.
>
> I'll think about it and send new patches for the side data and detect filter.
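
For illustration only, the 'char source[128]' header idea quoted above could look roughly like the C sketch below. All names here (SketchBBox, SketchBBoxHeader, sketch_bbox_alloc) and the per-box fields are hypothetical; they are not taken from the patch or from any FFmpeg API. Only the 128-byte source field comes from the discussion.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical per-box payload; the field set is an assumption. */
typedef struct SketchBBox {
    int x, y, w, h;      /* box position and size in pixels */
    int label_id;        /* detector-specific label index */
    float confidence;    /* detection confidence */
} SketchBBox;

/* Hypothetical header shared by all boxes attached to one frame. */
typedef struct SketchBBoxHeader {
    char source[128];    /* e.g. model name, or a non-DNN method description */
    uint32_t nb_bboxes;  /* number of entries in boxes[] */
    SketchBBox boxes[];  /* C99 flexible array member */
} SketchBBoxHeader;

/*
 * Allocate a zero-initialized header plus nb_bboxes boxes in one buffer.
 * The source string is copied and truncated to fit the 128-byte field.
 * Returns NULL on allocation failure; the caller releases it with free().
 */
static SketchBBoxHeader *sketch_bbox_alloc(const char *source, uint32_t nb_bboxes)
{
    size_t size = sizeof(SketchBBoxHeader) + (size_t)nb_bboxes * sizeof(SketchBBox);
    SketchBBoxHeader *hdr = calloc(1, size);

    if (!hdr)
        return NULL;
    snprintf(hdr->source, sizeof(hdr->source), "%s", source ? source : "");
    hdr->nb_bboxes = nb_bboxes;
    return hdr;
}
```

With a layout like this, a consumer such as a drawbox-style filter would read source and the boxes without needing to know whether a DNN produced them, which is the point raised in the review.
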
Hi, I'll push the first 7 patches of this patch set tomorrow if there are no other comments on them, and then send a new patch set for the side data and detect filter. Thanks.