yep, and i will tell you guys ONLY if you promise to NOT try this yourselves... checking the rate limit also counts as a hit and increments our numbers:

  # curl -i https://api.github.com/users/whatever 2> /dev/null | egrep ^X-Rate
  X-RateLimit-Limit: 60
  X-RateLimit-Remaining: 51
  X-RateLimit-Reset: 1413590269

(yes, that is the exact url that they recommended on the github site lol)

so, earlier today, we had a spark build fail w/a git timeout at 10:57am,
but there were only ~7 builds run that hour, so that points to us NOT
hitting the rate limit... at least for this fail.

whee!

is it beer-thirty yet?

shane

On Fri, Oct 17, 2014 at 4:52 PM, Nicholas Chammas <nicholas.cham...@gmail.com> wrote:

> Wow, thanks for this deep dive Shane. Is there a way to check if we are
> getting hit by rate limiting directly, or do we need to contact GitHub
> for that?
>
> On Friday, October 17, 2014, shane knapp <skn...@berkeley.edu> wrote:
>
>> quick update:
>>
>> here are some stats i scraped over the past week of ALL pull request
>> builder projects and timeout failures. due to the large number of spark
>> ghprb jobs, i don't have great records earlier than oct 7th.
>> the data is current up until ~230pm today:
>>
>> spark and new spark ghprb total builds vs git fetch timeouts:
>> $ for x in 10-{09..17}; do
>>     passed=$(grep $x SORTED.passed | grep -i spark | wc -l);
>>     failed=$(grep $x SORTED | grep -i spark | wc -l);
>>     let total=passed+failed;
>>     fail_percent=$(echo "scale=2; $failed/$total" | bc | sed "s/^\.//g");
>>     line="$x -- total builds: $total\tp/f: $passed/$failed\tfail%: $fail_percent%";
>>     echo -e $line;
>>   done
>> 10-09 -- total builds: 140   p/f: 92/48   fail%: 34%
>> 10-10 -- total builds: 65    p/f: 59/6    fail%: 09%
>> 10-11 -- total builds: 29    p/f: 29/0    fail%: 0%
>> 10-12 -- total builds: 24    p/f: 21/3    fail%: 12%
>> 10-13 -- total builds: 39    p/f: 35/4    fail%: 10%
>> 10-14 -- total builds: 7     p/f: 5/2     fail%: 28%
>> 10-15 -- total builds: 37    p/f: 34/3    fail%: 08%
>> 10-16 -- total builds: 71    p/f: 59/12   fail%: 16%
>> 10-17 -- total builds: 26    p/f: 20/6    fail%: 23%
>>
>> all other ghprb builds vs git fetch timeouts:
>> $ for x in 10-{09..17}; do
>>     passed=$(grep $x SORTED.passed | grep -vi spark | wc -l);
>>     failed=$(grep $x SORTED | grep -vi spark | wc -l);
>>     let total=passed+failed;
>>     fail_percent=$(echo "scale=2; $failed/$total" | bc | sed "s/^\.//g");
>>     line="$x -- total builds: $total\tp/f: $passed/$failed\tfail%: $fail_percent%";
>>     echo -e $line;
>>   done
>> 10-09 -- total builds: 16    p/f: 16/0    fail%: 0%
>> 10-10 -- total builds: 46    p/f: 40/6    fail%: 13%
>> 10-11 -- total builds: 4     p/f: 4/0     fail%: 0%
>> 10-12 -- total builds: 2     p/f: 2/0     fail%: 0%
>> 10-13 -- total builds: 2     p/f: 2/0     fail%: 0%
>> 10-14 -- total builds: 10    p/f: 10/0    fail%: 0%
>> 10-15 -- total builds: 5     p/f: 5/0     fail%: 0%
>> 10-16 -- total builds: 5     p/f: 5/0     fail%: 0%
>> 10-17 -- total builds: 0     p/f: 0/0     fail%: 0%
>>
>> note: the 15th was the day i rolled back to the earlier version of the
>> git plugin. it doesn't seem to have helped much, so i'll probably bring us
>> back up to the latest version soon.
>>
>> also note: rocking some floating point math on the CLI!
>> ;)
>>
>> i also compared the distribution of git timeout failures vs time of day,
>> and there appears to be no correlation. the failures are pretty evenly
>> distributed over each hour of the day.
>>
>> we could be hitting the rate limit due to the ghprb hitting github a
>> couple of times for each build, but we're averaging ~10-20 builds per hour
>> (a build hits github 2-4 times, from what i can tell). i'll have to look
>> more into this on monday, but suffice to say we may need to move from
>> unauthenticated https fetches to authenticated requests. this means
>> retrofitting all of our jobs. yay! fun! :)
>>
>> another option is to have local mirrors of all of the repos. the problem
>> w/this is that there might be a window where changes haven't made it to the
>> local mirror and tests run against it. more fun stuff to think about...
>>
>> now that i have some stats, and a list of all of the times/dates of the
>> failures, i will be drafting my email to github and firing that off later
>> today or first thing monday.
>>
>> have a great weekend everyone!
>>
>> shane, who spent way too much time on the CLI and is ready for some beer.
>>
>> On Thu, Oct 16, 2014 at 1:04 PM, Nicholas Chammas <nicholas.cham...@gmail.com> wrote:
>>
>>> On Thu, Oct 16, 2014 at 3:55 PM, shane knapp <skn...@berkeley.edu> wrote:
>>>
>>>> i really, truly hate non-deterministic failures.
>>>
>>> Amen bruddah.
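
For anyone re-running the stats from the quoted update: the fail% column there comes from bc with scale=2 plus a sed that strips the leading dot, which depends on there being at least one build that day and on the ratio staying below 1. Below is a minimal sketch of the same calculation in plain shell arithmetic. The function name fail_percent is hypothetical (it is not from the thread), and it prints a bare integer, so 09 in the thread shows up as 9 here.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not from the thread: integer failure percentage
# from passed/failed counts, with a guard for days that had no builds.
fail_percent() {
  local passed=$1 failed=$2
  local total=$((passed + failed))
  if [ "$total" -eq 0 ]; then
    # No builds that day: report 0 instead of dividing by zero.
    echo 0
    return
  fi
  # Integer division truncates, like bc with scale=2 does in the thread.
  echo $((failed * 100 / total))
}

fail_percent 92 48   # 10-09 spark counts from the thread; prints 34
fail_percent 0 0     # a day with no builds; prints 0
```

This could replace the bc/sed pipeline inside the loop, e.g. fail_percent=$(fail_percent $passed $failed), assuming the SORTED and SORTED.passed files are laid out as in the quoted commands.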