On Mon, May 10, 2021 at 11:31 AM Daniel P. Berrangé <berra...@redhat.com> wrote:
>
> On Mon, May 10, 2021 at 10:49:19AM +0100, Stefan Hajnoczi wrote:
> > qemu.org bandwidth usage has been as follows:
> > - Jan: 12.56 TB
> > - Feb: 10.55 TB
> > - Mar: 10.28 TB
> > - Apr: 7.62 TB
> >
> > In May qemu.org has averaged 232.25 GB/day so far putting it on track
> > for 7 TB total this month.
>
> That decrease seems to show we've had a big effect from moving to
> gitlab. Not big enough yet though.
>
> > Roughly 75% of traffic is git (https), 25% is tarball downloads, and
> > the rest is wiki/web/miscellaneous traffic. Fun fact:
> > qemu-4.2.0.tar.xz is the most popular download!
>
> First git traffic...
>
> When you say "git (https)" are you exclusively meaning access of
> git via https:// protocol URIs, or does that include git:// URIs
> too ?

This includes git-http-backend(1) only. I think gitweb traffic is
separate.

> Or are git:/// URI traffic not accounted for at all in your 75%/25%
> split there ?

git-daemon is not included in the stats because they are web server
stats only. Based on the network bandwidth fees that QEMU has been
paying I do know git-daemon traffic is much smaller than
git-http-backend traffic.

> For the https:// URIs should we setup a HTTP redirect ?
>
> When git clones via https it fetches some specific paths which
> I believe we have rules for in httpd conf:
>
>   ScriptAliasMatch "^/git/(.*\.git/(HEAD|info/refs))$" \
>       /usr/libexec/git-core/git-http-backend/$1
>   ScriptAliasMatch "^/git/(.*\.git/git-(upload|receive)-pack)$" \
>       /usr/libexec/git-core/git-http-backend/$1
>
> If we set those URI path matches to send a HTTP 307 redirect
> to gitlab, that would essentially kill off our git traffic on
> qemu.org, while still allowing the qemu.org gitweb UI to
> work normally. The downside is that people won't notice to
> update their clone URIs. Still feels like an easy win and
> we can easily remove the redirect if we use code 307.

I remember there were concerns about warning messages that
git-clone(1) prints when an HTTP redirect is encountered?

If everyone is okay I can turn the git-http-backend(1) aliases into
HTTP 307 redirects to GitLab.

> Third, qemu 4.2.0....
>
> I wonder why this is the most popular. Something must be linking
> to this, as you would otherwise have to go out of your way to
> search it out.
>
> Do we have any stats on the referrer URLs ?
>
> I wonder if there's some key page(s) that need updating ?
>
> If we're unlucky there might be some CI system that hardcoded
> use of qemu 4.2.0 that's frequently pulling it.

The majority of qemu-4.2.0.tar.xz downloads have the wget user agent
and no referrer. The IP addresses don't have a clear pattern (there
are many).

Stefan
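
For illustration only, here is a rough Python sketch of which request
paths the proposed redirect would cover. It reuses the two path
patterns from the quoted ScriptAliasMatch rules as they stand. The
GITLAB_BASE value and the redirect_for() helper are hypothetical
(nothing in this thread gives the target URL layout), and this is not
the actual httpd configuration, just a way to check which paths would
be redirected and which (such as gitweb pages) would be left alone.

```python
import re

# Path patterns copied from the quoted ScriptAliasMatch rules.
GIT_HTTP_PATTERNS = [
    re.compile(r"^/git/(.*\.git/(HEAD|info/refs))$"),
    re.compile(r"^/git/(.*\.git/git-(upload|receive)-pack)$"),
]

# Assumption: hypothetical redirect target; the thread does not specify it.
GITLAB_BASE = "https://gitlab.com/qemu-project/"


def redirect_for(path, query=""):
    """Return (307, location) for git smart-HTTP paths, otherwise None."""
    for pattern in GIT_HTTP_PATTERNS:
        match = pattern.match(path)
        if match:
            # group(1) is the repository-relative part, e.g. "qemu.git/info/refs".
            location = GITLAB_BASE + match.group(1)
            if query:
                # Keep the query string, since git sends ?service=... on info/refs.
                location += "?" + query
            return 307, location
    # Anything else (gitweb UI, tarballs, wiki) is not redirected.
    return None
```

Under these assumptions, redirect_for("/git/qemu.git/info/refs",
"service=git-upload-pack") yields a 307 to the GitLab copy, while a
gitweb path such as "/git/" returns None and keeps being served
locally.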