On 11-05-2019 04:27, Reimar Döffinger wrote: > On Thu, May 09, 2019 at 11:45:59PM +0530, Shivam Goyal wrote: > >> @@ -117,4 +120,128 @@ static int h264_probe(const AVProbeData *p) >> return 0; >> } >> >> +static const uint8_t arecont_sign[] = {0x2D, 0x2D, 0x66, 0x62, 0x64, 0x72, >> 0x0D, 0x0A}; > > I admit I was more thinking of either > static const uint8_t arecont_sign[] = "--fbdr\r\n"; > which ends up 1 byte too long, or > static const uint8_t arecont_sign[8] = "--fbdr\r\n"; > > Though there is also the option to go for > static const uint64_t arecont_sign = AV_RL64("--fbdr\r\n"); > or similar.
Okay, I will change it. >> +static int arecont_find_sign(unsigned char *data, int size) > > You should be consistent with the types even if they are essentially > the same. > i.e. uint8_t instead of unsigned char * > Also const, since this function does/should not modify "data". Will change it too, >> + int sign_size = sizeof(arecont_sign) / sizeof(arecont_sign[0]); > > First, this is the expression of number of elements, so the > division part is semantically wrong. > Also it's pointless because it will be 1 always. Will use strlen() as i am changing the above. >> + j = memchr(data, arecont_sign[0], size); >> + while (j != NULL && size - sign_size >= (j - data)) { >> + if (!memcmp(arecont_sign, j, sign_size)) >> + return (j - data); >> + if (size - sign_size == (j - data)) >> + return -1; >> + j = memchr(data + (j - data) + 1, arecont_sign[0], size - (j - >> data)); >> + } > > I know I brought this memchr up, but did you do any benchmarks? > Unless you have actual evidence of a speed problem AND can > show that your solution actually makes it faster, I'd > suggest to go with the simplest possible solution. Yeah, I checked the both approaches and perform test to compare time. and found the previous one is good for shorter strings (length < 300). and the newer version is good for strings larger than this. The newer version is 4 microseconds faster in our case(length ~ 1100). As the diff is small, so i think the simpler version is better. >> + data = av_malloc(size); >> + ret = avio_read_partial(s->pb, data, size); >> + if (ret < 0) { >> + av_free(data); >> + av_packet_unref(pkt); >> + return ret; >> + } >> + if (pos <= ARECONT_H264_MIME_SIZE) { >> + avio_seek(s->pb, 0, SEEK_SET); >> + start = pos; >> + } else { >> + avio_seek(s->pb, pos - ARECONT_H264_MIME_SIZE, SEEK_SET); >> + start = ARECONT_H264_MIME_SIZE; >> + } >> + ret = avio_read_partial(s->pb, data, size); >> + if (ret < 0 || start >= ret) { >> + av_free(data); >> + av_packet_unref(pkt); >> + return ret; >> + } > > Unfortunately I still have no idea what that code is meant to do. > First there is no point in allocating "data" when you have > pkt->data already (yes, it would mean using memmove instead > of memcpy later on, but that seems to be about it). > Then the code reads TWICE into the same buffer for some reason, > the first read seems completely pointless? > Also the only point of the _partial variant of the read function > is reducing latency, however the avio_seek is likely to be > quite a bad hit for latency that this really seems like > premature optimization. > Also the avio_seek means this demuxer might not work at all > if the stream is not seekable (might since I don't know > if we maybe do enough buffering nowadays - but if the code > relies on that there should be a check). What i did in the above code is that as the packet can contain partial http header at the start or end of the pkt, which needs to be removed, i seeked backward 100 bytes( or at the start of the file if necessory) and the i did the same to remove the partial http header at the end, i seeked forward 100 bytes extra. start end stream..............^.........................^.................... |_____________| |_________| | | http header http header I first filled the "data" and then searched for the http header, copied the bytes before position of http header from "data" to pkt->data. And then skipped the http header and again searched for another http header in "data" and did the same thing. Also, the code is reading twice because the first is just if any errors come( like file ended). because after that the avio_seek() seek stream backwards and if it would do, then error can't be detected. Yeah, i realized that the avio_seek() is not a good idea. One another approach i thought is that to store 100 bytes (or 70) of the current iteration for the next iteration. In this way we don't have to use avio_seek.also there would not be any need of reading twice. Please suggest if it is right. (Sorry, if i am asking too much doubts). >> + if ((j = arecont_find_sign(data + i, ret - i)) >= 0) { > > Very personal dislike but: > please just don't put assignments into the if. Okay, would change this. >> + k = 0; >> + for (w = j + sign_size; w + 1 < ret; w++) { > > So what exactly happens when you actually end up hitting > that loop end condition? > It seems to me you then just leave the whole thing in, > even though you should have removed it? When the http header is not found on the "data" then the loop ends, or if it is found then the bytes before the position of http-header are copied to pkt->data and then the http-header is skipped and loop again runs. >> + } else >> + break; > > Inverting the condition and putting the 1-line case in the > "if" instead of the "else" is much better for readability. > In this case it also saves you 1 indentation level. Okay, would change this too. > On the general approach: > you are scanning a whole, potentially many, many GB large video > file for a 8-character string. > A false positive has a REALLY high likelihood. > I think this needs to be changed into a more clever approach, > that actually knows where these strings can appear and removes > them in a more targeted way... Yeah this would be slow for large files, I also tried finding out if there is any relation between positions of the http-header but didn't found. Thanks for the review Shivam Goyal _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".