After getting new failures due to an unrelated fix, I think this will be more trouble than it's worth.
First, we can't get rid of the XPASSes, so those will always be noisy. Second, some of the tests that XPASS will need to be unmarked because we just fixed the underlying problem. Third, we are at such an early stage that a fix to one test case will generally expose failures in other, already failing tests, just in a different place. So more noise.

I really think that, for now, the easiest way to keep track of this is to have a clean build to compare against.

Diego.