On Sun, May 8, 2016 at 9:15 AM, Simon Pieters <sim...@opera.com> wrote:
> httparchive (494,168 pages):
>
> SELECT COUNT(*) AS num, REGEXP_EXTRACT(LOWER(body),
> r'<track\s(?:[^>]+\s)?kind\s*=\s*([a-z]+|["\'][^"\']+["\'])') as match
> FROM [httparchive:har.2016_04_15_chrome_requests_bodies]
> GROUP BY match
> ORDER BY num DESC
>
> Row  num       match
> 1    17616286  null
> 2    523       "subtitles"
> 3    108       "captions"
> 4    58        "metadata"
> 5    6         "subtitle"
> 6    6         'subtitles'
> 7    5         "thumbnails"
> 8    3         'captions'
> 9    1         "dotsub"
> 10   1         "${assettracktype}"
> 11   1         'subtitle'
>
>
> We could add "subtitle" as a new keyword if that turns out to be a problem.
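
As a rough sanity check on the table above, here is a minimal Python sketch that folds the single- and double-quoted variants together and works out what share of pages use kind="subtitle". The counts are copied from the result rows. Treating each match as one distinct page is an assumption (the query counts response bodies, not pages), so the percentage is best read as an upper-bound estimate. The names `counts`, `normalize` and `tally` are only illustrative.

```python
# Non-null rows copied from the quoted query result (match -> num).
counts = {
    '"subtitles"': 523,
    '"captions"': 108,
    '"metadata"': 58,
    '"subtitle"': 6,
    "'subtitles'": 6,
    '"thumbnails"': 5,
    "'captions'": 3,
    '"dotsub"': 1,
    '"${assettracktype}"': 1,
    "'subtitle'": 1,
}

# Number of pages in the httparchive crawl, as stated in the quoted mail.
PAGES = 494_168


def normalize(match):
    """Strip the surrounding single or double quotes from a matched value."""
    return match.strip("\"'")


def tally(raw_counts):
    """Merge rows that differ only in quoting style."""
    totals = {}
    for raw, num in raw_counts.items():
        key = normalize(raw)
        totals[key] = totals.get(key, 0) + num
    return totals


totals = tally(counts)

# Assumption: one match corresponds to at most one page.
subtitle_share = 100 * totals["subtitle"] / PAGES
print(f"subtitle: {totals['subtitle']} matches, {subtitle_share:.4f}% of pages")
```

Under that assumption, the seven "subtitle" matches come to roughly 0.0014% of the 494,168 pages.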

Thanks for the data! Looks like we're talking on the order of 0.001% of
pages, so I think this can be safely landed.
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform