Hey Danilo.  I'm copying launchpad-dev so everybody knows what's up.

Everybody on launchpad-dev, hi.  Staging apparently hasn't been updated since 
2011-05-20.  The LOSAs are on a sprint.  Danilo is planning to check with 
the LOSAs tomorrow morning, his time.  If you want to know more, read on.

When I talked to Francis about staging being down, we verified that we were 
apparently running r10574, according to the bottom of 
https://staging.launchpad.net/, which is from 2011-05-20.  Because of that, he 
said that you should call the IS hotline number tomorrow if there is no LOSA 
response while they are on the sprint: it's been a pretty long time since 
staging had a successful update.

I then looked at a graph and at the logs on devpad (e.g. 
/srv/launchpad.net-logs/staging/sourcherry/2011-05-23-staging_restore.log).  
Here's some other data you might already have.

-----

https://lpstats.canonical.com/graphs/StagingRestoreDurations/ shows several 
restores since the 20th.  It doesn't show success or failure AFAIK, but it does 
show that the run times seem consistent with a healthy restore.

-----

The full restore on the 21st shows this "FATAL" problem, but other restores 
(such as on the 23rd) do not.  The comment seems to imply that it might be OK 
for this to fail?  In any case, since other recent restores don't have this 
message, it is probably unrelated.

# Uninstall Slony-I if it is installed - a pg_dump of a DB with
# Slony-I installed isn't usable without this step.
LPCONFIG=staging-setup  ./repair-restored-db.py
/tmp/slonik_qCFRY.sk:3: FATAL:  database "dbname=lpmain_staging_new" does not exist
2011-05-21 18:32:14 ERROR   slonik script failed

-----

I suspect that the following indicates a problem we need to look at for our own 
purposes, but that is not pertinent to the staging restore failure.  Near the 
end of the staging restore logs of the 23rd (but not the 21st or 20th!), I saw:

Tue May 24 15:35:51 UTC 2011 Send bug notifications
2011-05-24 15:38:16 ERROR   Error while building email notifications.
 -> http://staging.launchpadlibrarian.net/72065616/8rdUH0Yw2QTwDT8HY6ZxKEVo2tB.txt (61963)

That referenced .txt file shows this traceback.

Traceback (most recent call last):
  File "/srv/staging.launchpad.net/staging/launchpad/lib/lp/bugs/scripts/bugnotification.py", line 290, in get_email_notifications
    yield construct_email_notifications(batch)
  File "/srv/staging.launchpad.net/staging/launchpad/lib/lp/bugs/scripts/bugnotification.py", line 177, in construct_email_notifications
    bug, recipients, filtered_notifications)
  File "/srv/staging.launchpad.net/staging/launchpad/lib/lp/bugs/model/bugnotification.py", line 278, in getRecipientFilterData
    del recipient_id_map[person_id]['filters'][filter_id]
KeyError: 61963
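
To illustrate the kind of failure the last frame shows, here is a minimal 
sketch with made-up data.  The dictionary contents and both function names are 
hypothetical; only the shape of the `del` statement comes from the traceback.  
Note that from the traceback alone, 61963 could be either the person_id or the 
filter_id, since either lookup can raise the KeyError.

```python
# Hypothetical data, shaped only loosely like the structure in the traceback.
recipient_id_map = {
    1: {"filters": {10: "filter-a"}},
}


def drop_filter_strict(person_id, filter_id):
    # Same shape as the failing line: raises KeyError if either
    # person_id or filter_id is missing.
    del recipient_id_map[person_id]["filters"][filter_id]


def drop_filter_tolerant(person_id, filter_id):
    # dict.pop() with a default ignores a missing filter_id instead of
    # raising.  A missing person_id would still raise KeyError here.
    return recipient_id_map[person_id]["filters"].pop(filter_id, None)


try:
    drop_filter_strict(1, 61963)
except KeyError as error:
    print("KeyError:", error)

print(drop_filter_tolerant(1, 61963))
```

Whether tolerating the missing key would be the right fix is a separate 
question; this only shows how the error can arise.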

As I said, this is not in the logs for the 21st or 20th, so it probably is just 
a problem for us, but not the staging problem.  Or maybe it's entirely 
spurious.  It's worth investigating though.  If you get the LOSA's attention 
and you think it makes sense, could you ask them to run 
"cronscripts/send-bug-notifications.py -vv" on staging and give you the output, 
to see if this is actually an issue?

OK, that's all I know. :-)  Maybe others on -dev will have some input.

Thank you!

Gary