Windows XP tests no longer run by default on try

2015-11-03 Thread Jonathan Griffin
TL;DR - If you want to run Windows XP tests on try, you need to specify
[Windows XP] after the suite name; they are no longer run by default.

As most people are aware, our Windows test capacity is considerably
overloaded. However, we need to start turning on e10s tests on Windows, so
the e10s team can ship this feature and ensure it's adequately tested in
automation. In order to prevent this from causing even worse overloading,
we're making a few changes, and one of those is to turn Windows XP tests
off by default on try. This will allow us to shift some Windows XP test
machines to the Windows 7 pool, so we can start turning on e10s tests there.

Windows XP tests are no longer run when specifying only -p all or -p win32.
Instead, you need to put [Windows XP] after suite names you want run on
that platform, for example:

 try: -b o -p win32 -u mochitests[Windows XP],reftest -t none

This will run mochitests on all win32 platforms including XP; reftest will
be run on all win32 platforms except for XP.
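
Similarly (this second example is an illustrative extrapolation of the same
syntax, not taken from the bug), annotating more than one suite opts each of
them into XP:

 try: -b o -p win32 -u mochitests[Windows XP],crashtest[Windows XP] -t none

Here both mochitests and crashtest run on all win32 platforms, including XP;
any suite without the annotation skips XP.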

Trychooser has been updated accordingly. See
https://bugzilla.mozilla.org/show_bug.cgi?id=1219434 for more details.


Re: Too many oranges!

2015-12-22 Thread Jonathan Griffin
I think this is a great idea. Although it won't fix the problem long-term,
what it will do is get engineers and especially engineering managers
thinking about the problem, and hopefully understanding it better so they
can incorporate it into future priorities.

There are two fundamental problems behind the current state: weak or no
ownership of many tests and suites, so fixing these oranges is always
somebody else's problem, and a focus on "hard" deliverables, which often
leaves little or no time to deal with unplanned problems like an increase
in intermittents.

If we dedicate a cycle to quality and tests, we should use that opportunity
to figure out what a more viable strategy is longer-term for making sure
these don't get out of hand again, which might include having teams adopt
tests, suites, and the intermittent orange counts associated with them, and
being accountable for them in goals and deliverables.

Jonathan

On Tue, Dec 22, 2015 at 8:16 AM, Douglas Turner  wrote:

> Mike -- totally supportive of this. I would *love* to see a release cycle
> completely dedicated to quality.  We branch again on January 26.  We could
> use that cycle to focus on nothing but quality (fixing tests, bug triaging,
> no feature development at all).
>
> Thoughts?
>
> On Tue, Dec 22, 2015 at 7:41 AM Mike Conley  wrote:
>
> > I would support scheduled time[1] to do maintenance[2] and help improve
> our
> > developer tooling and documentation. I'm less sure how to integrate such
> a
> > thing in practice.
> >
> > [1]: A day, a week, heck maybe even a release cycle
> > [2]: Where maintenance is fixing oranges, closing out papercuts,
> > refactoring, etc.
> >
> > On 21 December 2015 at 17:35,  wrote:
> >
> > > On Monday, December 21, 2015 at 1:16:13 PM UTC-6, Kartikaya Gupta
> wrote:
> > > > So, I propose that we create an orangefactor threshold above which
> the
> > > > tree should just be closed until people start fixing intermittent
> > > > oranges. Thoughts?
> > > >
> > > > kats
> > >
> > > How about regularly scheduled test fix days where everyone drops what
> > they
> > > are doing and spends a day fixing tests? mc could be closed to
> everything
> > > except critical work and test fixes. Managers would be able to opt
> > > individuals out of this as needed but generally everyone would be
> > expected
> > > to take part.
> > >
> > > Jim


Re: Test automation addons and signing

2016-02-29 Thread Jonathan Griffin
The decision not to enforce addon signing on trunk/aurora is not changing.
But, to support running our automation with unsigned addons on
trunk/aurora, but signed addons on beta/release, we would have to implement
some pretty complex logic at the aurora -> beta uplift, and this is
substantial enough that it would block addon signing for some time.

So, we've taken the expedient path and are signing addons everywhere (or
converting them to restartless addons, which can remain unsigned), in order
to simplify the automation and release engineering tasks. If this proves to
be problematic, we may look at alternatives in the future.

Jonathan

On Sat, Feb 27, 2016 at 4:33 AM, Philip Chee  wrote:

> On 26/02/2016 22:45, Andrew Halberstadt wrote:
> > To date, our continuous integration has been setting
> > 'xpinstall.signatures.required=false' to bypass addon signing. But
> > soon, this pref will become obsolete and Firefox will enforce signing
> > no matter what.
> >
> > In preparation, we will begin signing extensions that are used in
> > our test automation. If you maintain one of these addons, it means
> > you will be required to re-sign it every time a change is made. For
> > more information on what this means, how to sign addons or how to
> > bypass signing completely, see the following document:
> > https://wiki.mozilla.org/EngineeringProductivity/HowTo/SignExtensions
> >
> >  Let me know if you have any questions or concerns, Andrew
>
> We were promised that trunk and aurora builds would not enforce addon
> signing when xpinstall.signatures.required=false. I did not see any
> discussion about rescinding this commitment.
>
> Phil
>
> --
> Philip Chee , 
> http://flashblock.mozdev.org/ http://xsidebar.mozdev.org
> Guard us from the she-wolf and the wolf, and guard us from the thief,
> oh Night, and so be good for us to pass.


Engineering Productivity Q1 Update

2016-04-14 Thread Jonathan Griffin
Engineering Productivity is off to a great start in 2016; here’s what we’ve
been up to in Q1.
Build System

Build system improvements are a major priority for Engineering Productivity
in 2016. The build team made great progress in Q1:


* Windows builds are now made using VS2015. This shaves 100 minutes off of
PGO builds!
* Install manifest processing is up to 10x faster on Windows (10s now vs
100s before). Test files are now lazily installed, making builds and test
invocation significantly faster.
* Many improvements to artifact builds have resulted in a 50% speed
improvement.
* Artifact builds now support git-cinnabar users.
* A lot of work has been done to migrate legacy Makefiles to moz.build
files and to move away from autoconf; more along these lines will be done
in Q2.
* Build telemetry has been added, which will allow us to track improvements
for developer builds; this is currently opt-in, so please consider setting
BUILD_SYSTEM_TELEMETRY=1 in your build environment (see the example after
this list) to help us validate it.
* The ICU build system has been reimplemented so it no longer excessively
slows down builds.
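
An easy way to opt in to the build telemetry mentioned above, assuming you
build with mach from a source checkout (the variable name comes from the
item above; the rest of this shell session is just illustrative):

 $ export BUILD_SYSTEM_TELEMETRY=1
 $ ./mach build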


The build team is a large meta-team made up of individuals from
Engineering Productivity and several other teams; thanks to everyone who
has contributed.
MozReview and Autoland

The primary goal of the MozReview team in Q1 was to increase user adoption
by addressing various UX issues that have confused or frustrated users. To
that end, a feedback panel consisting of some of Mozilla’s top reviewers
has been created to provide a feedback loop for the MozReview developers.
We’ve identified a number of issues that impact reviewer productivity and
are working on them, starting with the top issue: lack of inline comments
in the diff viewer. We also explored confusion around the general layout
and flow of MozReview/Review Board, and, working with UX designer Tiffanie
Shakespeare, we're coming up with some big changes that should improve
general usability. We have been working on a framework that will allow us
to experiment in the UI without having to completely fork Review Board.

In addition, this quarter we implemented various high-priority fixes and
improvements, including


* Disabling interdiff rebase filtering, since it was unreliable.
* Adding options to disable reviewer deduction and to publish without
prompting when pushing commits up.
* Concatenating MozReview BMO-comment emails to reduce the volume of email
sent out when many commits are published.
* Adding extra context to the diffs in BMO comments.
* Showing a comment button when hovering over the diff viewer, improving
discoverability.
* Clarifying the status of reviewers in the commits table.


We’re also very close to landing two other important features: switching
from “ship it” to the standard BMO review flags (r?/r+/r-), and letting
reviewers delegate reviews to others.


Finally, autoland-to-inbound was rolled out, giving MozReview users an easy
way to land reviewed patches.
TaskCluster Migration

Engineering Productivity is helping the TaskCluster team and Release
Engineering migrate builds and automated tests from buildbot to
TaskCluster. In Q1, this involved a lot of work in crafting a docker image
that could be used to run linux64 debug unit tests successfully, and
related work in greening up the test suites in that environment. Linux64
builds and tests in TaskCluster are now running as Tier 1 in Treeherder, so
the teams are moving on to other linux64 flavors: opt, pgo, and asan.

Performance Automation

Sheriffing of performance regressions of Talos tests has moved entirely to
Perfherder; Talos no longer reports data to graphserver, and graphserver
will be retired in the future. Perfherder also now displays performance
metrics generated by AreWeFastYet and AreWeSlimYet.

To support the e10s project, Perfherder now has an e10s dashboard
 that can be used to view
the differences between e10s and non-e10s Talos tests.

Finally, performance benchmarks previously running in Mozbench have been
migrated to AreWeFastYet, and Mozbench has been retired.
Continuous Integration

A lot of work has been completed to support the addon signing project; this
includes taking all of the addons used by test automation and either
converting them to restartless addons and making them get installed via a
new API, or signing them in-tree. All test harnesses now work with addon
signing enforced.

For e10s, all appropriate test suites have been enabled in e10s mode on
Windows 7 on trunk, with the exception of a couple of suites on Windows 7
debug, due to ongoing assertions and leaks. All suites are running in e10s
mode on all platforms on the project branch ash. All relevant test suites
have been 

Notice: decommissioning git.mozilla.org

2016-07-01 Thread Jonathan Griffin
We are planning to decommission git.mozilla.org. It was originally created
to serve the B2G project, but is no longer needed for that purpose, and
other uses have remained slight and don't justify the operational or
maintenance costs.

There is some work underway to migrate existing users to alternatives; this
is tracked in bugs that block https://bugzil.la/1277297. As soon as the
last of these is cleaned up, we'll be turning off git.mozilla.org. This
will likely occur the week of July 5th.

For further details, see this dev-version-control thread:
https://groups.google.com/forum/#!topic/mozilla.dev.version-control/H6_TWlWPQGk


Re: Notice: decommissioning git.mozilla.org

2016-07-01 Thread Jonathan Griffin
There is no change to https://github.com/mozilla/gecko-dev or related
workflows. Only the git.mozilla.org mirror of this repo is going away.

On Fri, Jul 1, 2016 at 12:40 PM, Nicolas B. Pierron <
nicolas.b.pier...@mozilla.com> wrote:

> On 07/01/2016 06:11 PM, Jonathan Griffin wrote:
>
>> We are planning to decommission git.mozilla.org. It was originally
>> created
>> to serve the B2G project, but is no longer needed for that purpose, and
>> other uses have remained slight and don't justify the operational or
>> maintenance costs.
>>
>> There is some work underway to migrate existing users to alternatives;
>> this
>> is tracked in bugs that block https://bugzil.la/1277297. As soon as the
>> last of these is cleaned up, we'll be turning off git.mozilla.org. This
>> will likely occur the week of July 5th.
>>
>> For further details, see this dev-version-control thread:
>>
>> https://groups.google.com/forum/#!topic/mozilla.dev.version-control/H6_TWlWPQGk
>>
>>
> Would we keep maintaining the github mirror[1] of Gecko?
>
> Are we planning to enforce git-cinnabar, even if this implies that we
> cannot collaborate (yet) with other git repositories?
>
> [1] https://github.com/mozilla/gecko-dev
>
> Apparently git.mozilla.org/integration/gecko-dev.git is already lagging
> behind.
>
> If like me you were using gecko-dev.git, you can switch to github using
> the following command (assuming …/gecko-dev.git was named origin):
>
> $ git remote set-url origin https://github.com/mozilla/gecko-dev.git
>
> --
> Nicolas B. Pierron


Engineering Productivity Q2 Rollup

2016-07-16 Thread Jonathan Griffin
Engineering Productivity made great progress on our top-level goals in Q2;
keep reading for a description of the highlights. For a sneak peek of what
we're doing in Q3, see this Google Doc: https://goo.gl/Z2YgQ2

Platform Operations Top-Level Projects

'''Build Faster'''
Overall goal: Reduce build times for local developers and automation;
improve maintainability.
Q2 progress:
* Artifact Builds were rolled out to all desktop and Android Firefox
engineers (see the mozconfig example at the end of this list).
* Reduced PGO build times by over 2 hours.
* The build system now lazily installs test files, saving build time and
making mach commands for test execution faster.
* A significant amount of logic has been removed from configure and
Makefiles, resulting in better performance. This also paves the way for
replacing GNU make with a newer build backend, like tup or bazel.
* A distributed global cache is being made usable by local builds, which
can reduce the amount of time needed to make C++ builds, especially if you
have a fast internet connection and/or are located in an office. This work
will finish up in Q3.
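
For anyone who wants to try artifact builds locally, this is the sort of
mozconfig entry that enables them (a hedged sketch; check the in-tree build
documentation for the current option name and any caveats):

 ac_add_options --enable-artifact-builds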

'''MozReview'''
Overall goal: Increase adoption of MozReview for gecko code reviews
Q2 progress:
* A decision was made to fork Review Board, which will make iteration
faster, at the cost of potentially losing the ability to incorporate
upstream improvements easily.
* Autoland has been changed to land to a special 'autoland' integration
branch, instead of mozilla-inbound. The autoland branch is merged
periodically to mozilla-central, similar to mozilla-inbound.
* MozReview now uses Bugzilla-style r?/r+/r- flags instead of "ship it".
* Anyone can change the reviewers for any review request, similar to
Bugzilla’s attachments. This will allow, for instance, delegating a review
to someone else without having to involve the author.
* An automated CI system was brought online, which should increase
MozReview's development velocity.
* Actions are being published to Pulse.

'''TaskCluster Migration'''
Overall goal: Help the TaskCluster team migrate continuous integration from
buildbot to TaskCluster
Q2 progress:
* Most linux-based builds and tests are now running in TaskCluster, as Tier
1 or Tier 2, with the exception of linux64 Talos tests. For Talos, an
experiment was conducted to determine whether the tests could be moved to
AWS or to a docker-based hardware solution, but test results in either
case are too variable, so Talos will be ported to TaskCluster on native
hardware at a later date.
* Considerable progress has been made in porting Windows unit tests to AWS;
this frees up Windows hardware capacity (reducing wait times, especially on
Try), and paves the way for migration of Windows tests to TaskCluster.

Other Projects

'''Treeherder'''
* A new bug filer tool was created for filing intermittent bug failures.
* Treeherder now ingests test results from TaskCluster using Pulse; this
allows those test results to be visible in staging and dev environments, as
well as production.
* Adding new TaskCluster-based jobs to a push in Treeherder is now possible.
* The first version of automatic classification has been deployed; this
automatically classifies some known intermittents as such, without the
involvement of a sheriff. Work is underway to expand the number of failure
types that auto classification recognizes.

'''Performance Testing'''
* Some improvements were made to make it easier to display non-Talos
performance metrics in Perfherder. For example, Servo is now using
Perfherder, see: https://mzl.la/29JyMMD
* Support has been added for micro benchmarks, see e.g.,
https://mzl.la/29AkESh
* System utilization during unit tests is now being reported to Perfherder.

'''Mobile Testing'''
* Autophone (Android phone-based automation) is now reporting to Treeherder
as Tier 2. It now has a new dashboard: http://phonedash.mozilla.org

'''Bugzilla'''
* A major memory leak was fixed, leading to much longer lifetimes for the
webhead processes, and a small performance gain.
* You can now view and comment on GitHub pull-request diffs in Splinter.
* Many modal-UI bugs fixed.
* New textual bug-summary field.
* The production database is now in UTC.
* Cisco Spark support.
* Many bugs in the in-progress upstream-merge branch (merging in recent
upstream Bugzilla work) have been fixed.

'''Marionette/WebDriver'''
* Support for running WebDriver tests in gecko and Servo has been added to
web-platform-tests.
* Shipped geckodriver (https://github.com/mozilla/geckodriver), the HTTPD
frontend to Marionette, and gathered user feedback
* Aligned a number of commands to the WebDriver specification.
* Marionette support has been added to Fennec.
* Unit tests of the Marionette test harness have been expanded.

'''Firefox UI Tests'''
* Firefox UI Tests was the project of the month in June:
http://bit.ly/29JBamC
* With esr38 being deprecated, mozmill tests are now retired, and all
Firefox UI Tests are Marionette-based and live in-tree.
* Functional tests for Linux have been moved to TaskCluster.

Usability improvements for Firefox automation initiative - Status update #3

2016-08-18 Thread Jonathan Griffin
In this update, we will look at the progress made since our second update.

A reminder that this quarter’s main focus is on:

* Debugging tests on interactive workers (only Linux on TaskCluster)
* Improve end to end times on Try (Thunder Try project)

For all bugs and priorities you can check out the project management page
for it:

https://wiki.mozilla.org/EngineeringProductivity/Projects/Debugging_UX_improvements


Status update:

Debugging tests on interactive workers

---

Tracking bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1262260


Accomplished recently:

* Blogged about it:
https://ahal.ca/blog/2016/taskcluster-interactive-loaner/
* Refactored mochitest harness to use --flavor (for similar UX to existing
mach command)
* Started looking into test resolving outside of the build system

Upcoming:

* Mach workflow for wpt and Mn
* Mach workflow for Android tests


Thunder Try - Improve end to end times on try

-

Project #1 - Artifact builds on automation

##

Tracking bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1284882

Nothing new in this edition.


Project #2 - S3 Cloud Compiler Cache



Tracking bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1280641

Accomplished recently:

* Doing some manual testing of the rewritten s3 cache code on try currently.


Project #3 - Metrics



Tracking bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1286856

Accomplished recently:

* Finished a very early prototype of an “Infraherder” dashboard, which
(currently) lets you visualize the last jobs to finish on try:
http://wlach.github.io/treeherder/ui/infra.html#/last-finished
* Published an RFC in order to solicit more feedback:
https://docs.google.com/document/d/1SrlJQQ3qWuM0tvruG6Lr59t3hJ4XRUoMIrIRQYvwu9A/edit?usp=sharing

Upcoming:

* Get more feedback on Infraherder, figure out if we want to proceed with
this approach


Other

#

* Bug 1290282 - Changed all Linux build EC2 instances used by TaskCluster
from c3/m3/r3.2xlarge to c4/m4.4xlarge - ~40% time reduction in many build
tasks
* Bug 1272083 - Downloading and unzipping should be performed as data is
received
** Prototype ready to be integrated into Mozharness; a rough illustration
appears after the footnotes below


[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1290282
[2] https://bugzilla.mozilla.org/show_bug.cgi?id=1272083
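
To illustrate the idea behind bug 1272083 [2], here is a rough sketch of
fetching a test archive into memory and extracting it without writing the
zip to disk first. This is not the actual Mozharness prototype; the URL,
destination directory, and Python 2-style imports are placeholders chosen
to match the era's tooling:

import io
import urllib2
import zipfile

def fetch_and_extract(url, dest_dir):
    # Keep the whole archive in memory rather than saving it to disk.
    data = io.BytesIO(urllib2.urlopen(url).read())
    with zipfile.ZipFile(data) as archive:
        archive.extractall(dest_dir)

fetch_and_extract("https://example.com/target.tests.zip", "tests")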


Feedback requested: UI changes for Treeherder

2016-10-10 Thread Jonathan Griffin
TL;DR - we'd like some feedback on some UI changes to Treeherder
recommended by a 3rd party UX team.

The longer story - a 3rd party UX team has provided us with some
suggestions on how to improve Treeherder's UI from the perspective of the
average developer. They've created a set of wireframes here:

https://drive.google.com/file/d/0B3__-vbLGlRHajhEOE1SVXBQUU0/view?usp=sharing

The highlights of these changes:

* failed jobs, build jobs, and test jobs are separated into different
columns
* many of the icons scattered around the existing UI have been collapsed
into a few menus, making them more discoverable
* the bottom panels (displayed when selecting a job) have been simplified

We may implement these wireframes for non-sheriffs, or on a per-user basis,
or only for Try. If anyone has feedback, please provide it by EOD Wednesday.
Since some of the features may be difficult to understand from the
wireframes alone, we will do a walkthrough of them at the next Treeherder
meeting, which is this Wednesday @ 9am PDT, in the A-Team vidyo room.


Engineering Productivity Q3 Rollup

2016-10-14 Thread Jonathan Griffin
Here’s what Engineering Productivity has accomplished in Q3. In Q4, we’re
switching to OKRs for tracking goals and progress; these should be
published soon.

== Build System ==
Overall goal: Reduce build times for local developers and automation;
improve maintainability.

Q3 progress:
* TaskCluster tasks now running on faster AWS instances. Execution time
decreases of several minutes across the board.
* Initial support for building with Tup build system.
* Libffi’s configure script, formerly a long pole when running configure,
is no longer run, and its functionality has been incorporated into our
Python configure infrastructure. Additionally, libffi no longer builds with
its own build system and is now incorporated into moz.build.
* Significant progress has been made to convert our autoconf based
configure script to a more performant and maintainable Python based system.
The quantity of .m4 and .sh code invoked during this part of the build has
been reduced by ⅓.


== MozReview ==
Overall goal: Increase adoption of MozReview for mozilla-central code
reviews

Q3 progress:
* New, custom theme
* “Finish Review” dialog reworked for improved usability
* Automatic LDAP association
* Display diff statistics in MozReview and BMO
* Prototyped review-request UI changes to improve clarity and
discoverability
* New Vagrant-based development environment to encourage contributions


== Debugging UX ==
Overall goal: Make it easier to debug and resolve automated test failures
Project page:
https://wiki.mozilla.org/EngineeringProductivity/Projects/Debugging_UX_improvements

Sub-goal: Created simple workflow to run test suites from an interactive
(one-click) loaner
Q3 progress:
* Deployed automation changes that provide developers with the ability to
run tests, clone mozilla-central, and obtain shell access on actual linux64
test workers with just a few clicks from Treeherder.

Sub-goal: Reduce median end-to-end times on Try to 60 minutes
Q3 progress:
* Download zip files into memory before unzipping, to avoid writing the zip
file to disk first
* Using the ‘--artifact’ flag in try syntax replaces all scheduled build
jobs with an opt artifact build on linux, linux64, mac64, win32, and win64
(see the example after this list). |./mach try| also recognizes the
--artifact flag and prevents the user from scheduling compiled-code tests
with it. Compiled-code test jobs report an ‘exception’ result when run
against an artifact build on try.
* Made progress on rewriting sccache in Rust, after which we can deploy a
two-tier sccache for Try builds, that will allow Try builds to consume
sccache objects from mozilla-central
* Build jobs in automation are now using version control clone+checkout
best practices; this results in faster VCS operations, less overhead during
automation jobs
* WPT tests now running from source checkout
* Work in progress to show high level “end-to-end times” across all
platforms and tests, on Try. This is building on previous work, which is a
prototype limited to build times:
http://people.mozilla.org/~klahnakoski/MoBuildbotTimings/Builds-Overview.html
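
As an illustration of the try-syntax flag mentioned above (the platform and
suite here are arbitrary examples):

 try: -b o -p linux64 -u mochitests -t none --artifact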


== Other Projects ==

Bugzilla
* The memory leak work previously discussed was implemented this quarter
* BMO implemented CSP for some pages, yielding an A- on Mozilla Observatory
* BMO switched to vendored dependencies which will result in a much faster
development pace
* Legacy bzapi saw a performance improvement from 226s to 184ms, or 122826%
faster (
http://dylanwh.tumblr.com/post/151314085067/sorry-i-meant-i-changed-it-from-226s-to-184ms
)
* Work continues on search speed improvements; when landed, keyword search
for intermittent-failure goes from 2s to 0.08s

Treeherder
* Migrated Treeherder from SCL3 to Heroku.
* Use ReactJS to make the exclusion editor (Sheriff panel / Admin) usable
* Support submitting Github resultsets via Pulse

Continuous Integration and Test Harnesses
* Stood up flake8 lint job with mozlint, and converted eslint job to use
mozlint
* Finished migrating the majority of jobs from Windows hardware to Windows
VMs in AWS; only a few remain running on hardware, with no plans to move
them to VMs due to limitations of the VMs
* Implemented Edge support in wptrunner
* Did initial exploratory work on cross-browser comparisons using
web-platform-tests and wptrunner
* Our Outreachy intern, Anjana Vakil, landed log parsing for
marionette-harness tests on Treeherder by integrating a pytest-mozlog
plugin into mozlog. Marionette-harness tests are thus ready to be Tier-1 on
Treeherder.
* Documented some best practices for code backouts; see
https://wiki.mozilla.org/Sheriffing/How:To:Backouts#Best_Practices_and_Communication

Janitor (https://janitor.technology)
* Refactored the Janitor service into a multi-server cluster (multiple
Docker hosts with Node.js routing agent, TLS / OAuth2 token-based
communication)
* Completed preliminary service review, started hardening (formalized +
implemented firewall rules, started implementing Mozilla’s security
guidelines, scheduled rapid risk assessment)

W3C WebDriver and Marionette

Re: Unable to run TPS tests

2013-04-03 Thread Jonathan Griffin
You can't run TPS via tryserver; it isn't run in buildbot at all.  It
can't be, since it uses live Sync servers.


Raymond, the problem you're experiencing is likely due to changes in 
mozprocess/mozrunner APIs that TPS hasn't been updated to handle.  Can
you file a bug about this, and assign it to me?


Thanks,

Jonathan


On 4/3/2013 9:56 AM, Justin Lebar wrote:

I don't know, actually.  You can ask on #developers, but I'd just run
'em all.  :)

On Wed, Apr 3, 2013 at 12:54 PM, Raymond Lee
 wrote:

Thanks Justin!  Can you suggest what try syntax I can use please?  I don't
see a TPS option in the try syntax builder page.
http://trychooser.pub.build.mozilla.org/

On 3 Apr, 2013, at 11:47 PM, Justin Lebar  wrote:

In general you'll have much more success running these benchmarks on
tryserver rather than trying to run them locally.  Even if you got the
test working, there's no guarantee that your local benchmark results
will have any bearing on the benchmark results on our servers.  (In
particular, the servers are configured to reduce the noise in these
results.)

On Wed, Apr 3, 2013 at 3:15 AM,   wrote:

Hi all

I am trying to run TPS to ensure my patch works for bug 852041
(https://bugzilla.mozilla.org/show_bug.cgi?id=852041)

However, I got some errors when I ran the following.  Could someone give me
some suggestions how to fix it please?

1. source /Users/raymond/Documents/virtualenv/bin/activate
2. runtps --binary=/Users/raymond/Documents/mozilla-central/obj-ff-dbg/

== START ==
using result file tps_result.json
['/Users/raymondlee/Documents/appcoast/mozilla-central2/obj-ff-dbg/',
'-profile',
'/var/folders/43/d4b7hbz56jn2gvhmdhtrytxwgn/T/tmpozAsVT.mozrunner',
'-tps', '/var/folders/43/d4b7hbz56jn2gvhmdhtrytxwgn/T/tps_test_Uwxl4Y',
'-tpsphase', u'1', '-tpslogfile',
'/Users/raymondlee/Documents/appcoast/mozilla-central2/obj-ff-dbg/dist/tps.log']
Traceback (most recent call last):
  File
"/Users/raymondlee/Documents/appcoast/virtualenv/lib/python2.7/site-packages/tps-0.4-py2.7.egg/tps/testrunner.py",
line 331, in run_tests
self.run_test_group()
  File
"/Users/raymondlee/Documents/appcoast/virtualenv/lib/python2.7/site-packages/tps-0.4-py2.7.egg/tps/testrunner.py",
line 405, in run_test_group
result = self.run_single_test(testdir, test)
  File
"/Users/raymondlee/Documents/appcoast/virtualenv/lib/python2.7/site-packages/tps-0.4-py2.7.egg/tps/testrunner.py",
line 223, in run_single_test
phase.run()
  File
"/Users/raymondlee/Documents/appcoast/virtualenv/lib/python2.7/site-packages/tps-0.4-py2.7.egg/tps/phase.py",
line 49, in run
profile=self.profile)
  File
"/Users/raymondlee/Documents/appcoast/virtualenv/lib/python2.7/site-packages/tps-0.4-py2.7.egg/tps/firefoxrunner.py",
line 90, in run
self.runner.start()
  File
"/Users/raymondlee/Documents/appcoast/virtualenv/lib/python2.7/site-packages/mozrunner-5.15-py2.7.egg/mozrunner/runner.py",
line 183, in start
self.process_handler.run(timeout, outputTimeout)
  File
"/Users/raymondlee/Documents/appcoast/virtualenv/lib/python2.7/site-packages/mozprocess-0.9-py2.7.egg/mozprocess/processhandler.py",
line 621, in run
self.proc = self.Process(self.cmd, **args)
  File
"/Users/raymondlee/Documents/appcoast/virtualenv/lib/python2.7/site-packages/mozprocess-0.9-py2.7.egg/mozprocess/processhandler.py",
line 76, in __init__
universal_newlines, startupinfo, creationflags)
  File
"/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py",
line 679, in __init__
errread, errwrite)
  File
"/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py",
line 1249, in _execute_child
raise child_exception
OSError: [Errno 13] Permission denied
Test Summary

Exception AttributeError: "'ProcessHandler' object has no attribute 'proc'"
in > ignored

== END ==

Thanks
Raymond


Re: Unable to run TPS tests

2013-04-03 Thread Jonathan Griffin
I just tested this myself and found that it works.  The problem is in 
your command-line:


>>> 2. runtps --binary=/Users/raymond/Documents/mozilla-central/obj-ff-dbg/

--binary needs to be the full path of the binary, not the directory
containing it.
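
For example (the exact path depends on your platform and object directory;
this one is just illustrative):

 runtps --binary=/Users/raymond/Documents/mozilla-central/obj-ff-dbg/dist/bin/firefox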

The error message could certainly be improved.  :)

Let me know if this doesn't resolve your issue.

Jonathan


On 4/3/2013 10:31 AM, Jonathan Griffin wrote:

You can't run TPS via tryserver; it isn't run in buildbot at all.  It
can't, since it uses live Sync servers.

Raymond, the problem you're experiencing is likely due to changes in
mozprocess/mozrunner API's that TPS hasn't been updated to handle.  Can
you file a bug about this, and assign it to me?

Thanks,

Jonathan


On 4/3/2013 9:56 AM, Justin Lebar wrote:

I don't know, actually.  You can ask on #developers, but I'd just run
'em all.  :)

On Wed, Apr 3, 2013 at 12:54 PM, Raymond Lee
 wrote:

Thanks Justin!  Can you suggest what try syntax I can use please?  I
don't
see a TPS option in the try syntax builder page.
http://trychooser.pub.build.mozilla.org/

On 3 Apr, 2013, at 11:47 PM, Justin Lebar 
wrote:

In general you'll have much more success running these benchmarks on
tryserver rather than trying to run them locally.  Even if you got the
test working, there's no guarantee that your local benchmark results
will have any bearing on the benchmark results on our servers.  (In
particular, the servers are configured to reduce the noise in these
results.)

On Wed, Apr 3, 2013 at 3:15 AM,   wrote:

Hi all

I am trying to run TPS to ensure my patch works for bug 852041
(https://bugzilla.mozilla.org/show_bug.cgi?id=852041)

However, I got some errors when I ran the following.  Could someone
give me
some suggestions how to fix it please?

1. source /Users/raymond/Documents/virtualenv/bin/activate
2. runtps --binary=/Users/raymond/Documents/mozilla-central/obj-ff-dbg/

== START ==
using result file tps_result.json
['/Users/raymondlee/Documents/appcoast/mozilla-central2/obj-ff-dbg/',
'-profile',
'/var/folders/43/d4b7hbz56jn2gvhmdhtrytxwgn/T/tmpozAsVT.mozrunner',
'-tps',
'/var/folders/43/d4b7hbz56jn2gvhmdhtrytxwgn/T/tps_test_Uwxl4Y',
'-tpsphase', u'1', '-tpslogfile',
'/Users/raymondlee/Documents/appcoast/mozilla-central2/obj-ff-dbg/dist/tps.log']

Traceback (most recent call last):
  File
"/Users/raymondlee/Documents/appcoast/virtualenv/lib/python2.7/site-packages/tps-0.4-py2.7.egg/tps/testrunner.py",

line 331, in run_tests
self.run_test_group()
  File
"/Users/raymondlee/Documents/appcoast/virtualenv/lib/python2.7/site-packages/tps-0.4-py2.7.egg/tps/testrunner.py",

line 405, in run_test_group
result = self.run_single_test(testdir, test)
  File
"/Users/raymondlee/Documents/appcoast/virtualenv/lib/python2.7/site-packages/tps-0.4-py2.7.egg/tps/testrunner.py",

line 223, in run_single_test
phase.run()
  File
"/Users/raymondlee/Documents/appcoast/virtualenv/lib/python2.7/site-packages/tps-0.4-py2.7.egg/tps/phase.py",

line 49, in run
profile=self.profile)
  File
"/Users/raymondlee/Documents/appcoast/virtualenv/lib/python2.7/site-packages/tps-0.4-py2.7.egg/tps/firefoxrunner.py",

line 90, in run
self.runner.start()
  File
"/Users/raymondlee/Documents/appcoast/virtualenv/lib/python2.7/site-packages/mozrunner-5.15-py2.7.egg/mozrunner/runner.py",

line 183, in start
self.process_handler.run(timeout, outputTimeout)
  File
"/Users/raymondlee/Documents/appcoast/virtualenv/lib/python2.7/site-packages/mozprocess-0.9-py2.7.egg/mozprocess/processhandler.py",

line 621, in run
self.proc = self.Process(self.cmd, **args)
  File
"/Users/raymondlee/Documents/appcoast/virtualenv/lib/python2.7/site-packages/mozprocess-0.9-py2.7.egg/mozprocess/processhandler.py",

line 76, in __init__
universal_newlines, startupinfo, creationflags)
  File
"/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py",

line 679, in __init__
errread, errwrite)
  File
"/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py",

line 1249, in _execute_child
raise child_exception
OSError: [Errno 13] Permission denied
Test Summary

Exception AttributeError: "'ProcessHandler' object has no attribute
'proc'"
in > ignored

== END ==

Thanks
Raymond


Re: js-inbound as a separate tree

2013-12-19 Thread Jonathan Griffin
We already have the approximate equivalent of this.  It's the 
'checkin-needed' keyword.


Add this to your bug, and the sheriffs will land the patch for you, 
using the approximate process you describe.  The only difference is that
this is done out-of-band, so turnaround may take up to 24 hours.


The advantage of using this is that it allows the sheriffs to regulate 
patch landings more evenly, and avoid landings during peak hours, which 
makes bustages easier to spot, and potentially reduces the duration of 
tree closures.


The disadvantage is that you may end up waiting to have your patch 
landed, and some bugs may be hard for the sheriffs to land; e.g., if 
there are a lot of patches in a single bug, and some of them have 
already landed, or there are patches for multiple repos.


Jonathan


On 12/19/2013 2:42 PM, Bobby Holley wrote:

As someone who works mostly on the intersection of the JS engine and
everything else, I'm not really wild about this. SpiderMonkey is pretty
intimately tied to the rest of Gecko, certainly just as much as something
like gfx. I think fx-team makes more sense, since most of the patches there
consist primarily of changes to XUL/CSS/JS.

The main problem with inbound seems to be that it requires all developers,
who are generally working on disjoint things, to devote attention to
serializing their patches into inbound with other patches that are mostly
unrelated (but might not be!). As the number of pushers and inbound
closures increases, this becomes more and more of an attention-suck.

The long-term solution that we're working towards is some kind of
bugzilla-based auto-lander, IIUC. But in the meantime, it seems like it
would be trivial to write a locally-hosted (mach-integrated?) auto-lander
script that automates the process of:
(1) Wait until inbound is open.
(2) pull -u, apply the patches, and make sure they apply cleanly.
(3) Push, and mark the bug.

In the case where the patches don't apply, the developer can be alerted,
since her attention is basically required in that case anyway. In all other
cases, we effectively emulate the experience of pushing to an always-open
inbound.
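
A rough sketch of what such a script might look like (purely illustrative,
not an existing tool; the tree-status endpoint, its response format, the
repo layout, and the polling interval are all assumptions):

import json
import subprocess
import time
import urllib2

# Assumed endpoint and response shape; adjust for the real treestatus API.
TREESTATUS = "https://treestatus.mozilla.org/mozilla-inbound?format=json"

def tree_is_open():
    # (1) Check whether inbound is currently open.
    status = json.load(urllib2.urlopen(TREESTATUS))
    return status.get("status") == "open"

def autoland(patches, repo):
    while not tree_is_open():
        time.sleep(300)
    # (2) Update and make sure the patches apply cleanly; check_call raises
    # (alerting the developer) if any step fails.
    subprocess.check_call(["hg", "pull", "-u"], cwd=repo)
    for patch in patches:
        subprocess.check_call(["hg", "import", patch], cwd=repo)
    # (3) Push; marking the bug is left to the developer or a Bugzilla API.
    subprocess.check_call(["hg", "push"], cwd=repo)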

This would be a relatively trivial tool to write, especially compared with
the infra and staff burden of maintaining a bunch of separate repos.

Thoughts?
bholley


On Thu, Dec 19, 2013 at 10:48 AM, Jason Orendorff wrote:


On dev-tech-js-engine-internals, there's been some discussion about
reviving a separate tree for JS engine development.

The tradeoffs are like any other team-specific tree.

Pro:
- protect the rest of the project from closures and breakage due to JS
patches
- protect the JS team from closures and breakage on mozilla-inbound
- avoid perverse incentives (rushing to land while the tree is open)

Con:
- more work for sheriffs (mostly merges)
- breakage caused by merges is a huge pain to track down
- makes it harder to land stuff that touches both JS and other modules

We did this before once (the badly named "tracemonkey" tree), and it
was, I dunno, OK. The sheriffs have leveled up a *lot* since then.

There is one JS-specific downside: because everything else in Gecko
depends on the JS engine, JS patches might be extra likely to conflict
with stuff landing on mozilla-inbound, causing problems that only
surface after merging (the worst kind). I don't remember this being a
big deal when the JS engine had its own repo before, though.

We could use one of these to start:


Thoughts?

-j


Re: We live in a memory-constrained world

2014-02-26 Thread Jonathan Griffin
Splitting the valgrind tests up and running them separately as test jobs 
in TBPL is definitely something the A*Team can help with.  I've filed 
bug 977240 for this.


Jonathan

On 2/25/14 7:25 PM, Nicholas Nethercote wrote:

On Tue, Feb 25, 2014 at 2:32 PM, Mike Hommey  wrote:

I never understood why we need those jobs to be builds. Why not turn
--enable-valgrind on m-c builds, and run valgrind as a test job?

--disable-jemalloc is needed as well.
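
For reference, the two build options being discussed look something like
this in a mozconfig (an illustrative sketch, not necessarily the exact
configuration automation uses):

ac_add_options --enable-valgrind
ac_add_options --disable-jemalloc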

As for the structure... I just took what already existed and got it
into good enough shape to make visible on TBPL. (That was more than
enough for a non-TBPL/buildbot expert like me to take on.) I'm fine
with the idea of the Valgrind test job being split up into suites and
using the test machines for the testing part, but I'm not jumping up
and down to volunteer to do it.

Nick


Re: Policy for disabling tests which run on TBPL

2014-04-04 Thread Jonathan Griffin
With respect to Autoland, I think we'll need to figure out how to make 
it take intermittents into account.  I don't think we'll ever be in a
state with 0 intermittents.


Jonathan

On 4/4/2014 1:30 PM, Chris Peterson wrote:

On 4/4/14, 1:19 PM, Gavin Sharp wrote:

The majority of the time identifying the regressing patch is
difficult


Identifying the regressing patch is only difficult because we have so 
many intermittently failing tests.


Intermittent oranges are one of the major blockers for Autoland. If 
TBPL never shows an entirely green test pass on mozilla-inbound, 
Autoland can't programmatically determine when a checkin-needed patch
can be safely landed.



chris


Re: B2G emulator issues

2014-04-07 Thread Jonathan Griffin

How easy is it to identify CPU-sensitive tests?

I think the most practical solution (at least in the near term) is to 
find that set of tests, and run only that set on a faster VM, or on real 
hardware (like our ix slaves).


Jonathan


On 4/7/2014 3:16 PM, Randell Jesup wrote:

The B2G emulator design is causing all sorts of problems.  We just fixed
the #2 orange which was caused by the Audio channel StartPlaying()
taking up to 20 seconds to run (and we "fixed" it by effectively
removing some timeouts).  However, we just wasted half a week trying to
land AEC & MediaStreamGraph improvements.  We still haven't landed due
to yet another B2G emulator orange, but the solution we used for the M10
problem doesn't fix the fundamental problems with B2G emulator.

Details:

We ran into huge problems getting AEC/MediaStreamGraph changes (bug
818822 and things dependent on it) into the tree due to problems with
B2g-emulator debug M10 (permaorange timeouts).  This test adds a fairly
small amount of processing to input audio data (resampling to 44100Hz).

A test that runs perfectly in emulator opt builds and runs fine locally
in M10 debug (10-12 seconds reported for the test in the logs, with or
without the change), goes from taking 30-40 seconds on tbpl to
350-450(!) seconds (and then times out).  Fix that one, and others fail
even worse.

I contacted Gregor Wagner asking for help and also jgriffin in #b2g.  We
found one problem (emulator going to 'sleep' during mochitests, bug
992436); I have a patch up to enable wakelock globally for mochitests.
However, that just pushed the error a little deeper.

The fundamental problem is that b2g-emulator can't deal safely with any
sort of realtime or semi-realtime data unless run on a fast machine.
The architecture for the emulator setup means the effective CPU power is
dependent on the machine running the test, and that varies a lot (and
tbpl machines are WAY slower than my 2.5 year old desktop).  Combine
that with Debug being much slower, and it's recipe for disaster for any
sort of time-dependent tests.

I worked around it for now, by turning down the timers that push fake
realtime data into the system - this will cause audio underruns in
MediaStreamGraph, and doesn't solve the problem of MediaStreamGraph
potentially overloading itself for other reasons, or breaking
assumptions about being able to keep up with data streams.  (MSG wants
to run every 10ms or so.)

This problem also likely plays hell with the Web Audio tests, and will
play hell with WebRTC echo cancellation and the media reception code,
which will start trying to insert loss-concealment data and break
timer-based packet loss recovery, bandwidth estimators, etc.


As to what to do?  That's a good question, as turning off the emulator
tests isn't a realistic option.

One option (very, very painful, and even slower) would be a proper
device simulator which simulates both the CPU and the system hardware
(of *some* B2G phone).  This would produce the most realistic result
with an emulator.

Another option (likely not simple) would be to find a way to "slow down
time" for the emulator, such as intercepting system calls and increasing
any time constants (multiplying timer values, timeout values to socket
calls, etc, etc).  This may not be simple.  For devices (audio, etc),
frequencies may need modifying or other adjustments made.

We could require that the emulator needs X Bogomips to run, or to run a
specific test suite.

We could segment out tests that require higher performance and run them
on faster VMs/etc.

We could turn off certain tests on tbpl and run them on separate
dedicated test machines (a bit similar to PGO).  There are downsides to
this of course.

Lastly, we could put in a bank of HW running B2G to run the tests like
the Android test boards/phones.


So, what do we do?  Because if we do nothing, it will only get worse.





Re: B2G emulator issues

2014-04-08 Thread Jonathan Griffin


On 4/8/2014 1:05 AM, Thomas Zimmermann wrote:

There are tests that instruct the emulator to trigger certain HW events.
We can't run them on actual phones.

To me, the idea of switching to a x86-based emulator seems to be the
most promising solution. What would be necessary?

Best regards
Thomas


We'd need these things:

1 - a consensus we want to move to x86-based emulators, which presumes 
that architecture-specific problems aren't likely or important enough to 
warrant continued use of arm-based emulators


2 - RelEng would need to stand up x86-based KitKat emulator builds

3 - The A*Team would need to get all of the tests running against these 
builds


4 - The A*Team and developers would have to work on fixing the 
inevitable test failures that occur when standing up any new platform


I'll bring this topic up at the next B2G Engineering Meeting.

Jonathan



Is it time for mochitest-chrome on Android and B2G

2014-06-17 Thread Jonathan Griffin
Periodically, we field a request to add support for mochitest-chrome to 
Android and B2G.  To date, we've avoided this by pointing out ways that 
mochitest-plain can be used for the same use case, which usually 
involves SpecialPowers.


We have a new request for this, in the context of requestAutocomplete 
(https://bugzilla.mozilla.org/show_bug.cgi?id=1021060#c16).  The tests 
for this, as well as some other features we've seen requests for, need 
to be able to execute some setup code with chrome privileges, and 
SpecialPowers isn't always flexible enough.  As bholley points out, 
SpecialPowers is a "best effort" and doesn't necessarily support 
everything a test may wish to do.


Has the time come to bite the bullet and add mochitest-chrome support to 
Android and B2G?  This would be a non-trivial effort, and would need to 
be done separately for Android and B2G.  Adding support for this would 
come at the expense of something else, possibly work related to Android 
4.4 tests on emulators, work integrating our harnesses with structured 
logging, and/or work on reducing our intermittent test failures on B2G.  
These tasks are important and I wouldn't want to delay them without a 
very clear need.  Does that exist here?


Note that we are talking only about enabling harness support for 
mochitest-chrome in Android and B2G in order to provide a framework for 
tests that would otherwise be difficult to write.  We are not talking 
about taking the existing set of mochitest-chrome tests and getting them 
to work in Android and B2G.  Many of those tests don't apply to Android 
or B2G, and for those that theoretically do, many of them won't work 
because they rely on XUL files which aren't supported in B2G, and may 
not be in Android (not sure on that point).


For more context about the history of mochitest-chrome on B2G, see 
https://bugzilla.mozilla.org/show_bug.cgi?id=797164


Jonathan



Re: Are you interested in doing dynamic analysis of JS code?

2014-07-01 Thread Jonathan Griffin
The A-team would be very interested in being able to track JS code 
coverage; if you implemented the ability, we could add jobs in TBPL to 
track our test coverage over time, which would probably be useful and 
interesting.


We'd be happy to get involved at any stage where it's practical to start 
thinking about how we could integrate this into test runs.


Jonathan

On 6/25/2014 8:15 AM, Jason Orendorff wrote:
We're considering building a JavaScript API for dynamic analysis of JS 
code.

Here's the sort of thing you could do with it:

  - Gather code coverage information (useful for testing/release mgmt?)

  - Trace all object mutation and method calls (useful for devtools?)

  - Record/replay of JS execution (useful for devtools?)

  - Implement taint analysis (useful for the security team or devtools?)

  - Detect when a mathematical operation returns NaN (useful for game
developers?)

Note that the API would not directly offer all these features. Instead, it
would offer some powerful but mind-boggling way of instrumenting all JS
code. It would be up to you, the user, to configure the instrumentation,
get useful data out of it, and display or analyze it. There would be some
overhead when you turn this on; we don't know how much yet.

We would present a detailed example of how to use the proposed API, but we
are so early in the process that we're not even sure what it would look
like. There are several possibilities.

We need to know how to prioritize this work. We need to know what kind of
API we should build. So we're looking for early adopters. If that's you,
please speak up and tell us how you'd like to instrument JS code.





Re: Try-based code coverage results

2014-07-07 Thread Jonathan Griffin

Hey Joshua,

That's awesome!

How long does the try run take that generated this data?  We should 
consider scheduling a periodic job to collect this data and track it 
over time.


Jonathan

On 7/6/2014 10:02 PM, Joshua Cranmer 🐧 wrote:
I don't know how many people follow code-coverage updates in general, 
but I've produced relatively up-to-date code coverage results based on 
, and they may 
be found here: .


In contrast to earlier versions of my work, you can actually explore 
the coverage as delineated by specific tests, as identified by their 
TBPL identifier. Christian's persistent requests for me to limit the 
depth of the treemap view are still unresolved, because, well, at 2 AM 
in the morning, I just wanted to push a version that worked.


The test data was generated by pushing modified configs to try and 
using blobber features to grab the resulting coverage data. Only 
Linux32/64 is used, and only "opt" builds are represented (it's a 
--disable-optimize --disable-debug kind of build), the latter because 
I wanted to push a version out tonight and the debug .gcda tarballs 
are taking way too long to finish downloading.


Effectively, only xpcshell tests, and the M, M-e10s, and R groups are 
represented in the output data. M-e10s is slightly borked: only 
M-e10s(1) [I think] is shown, because, well, treeherder didn't 
distinguish between the five of them. A similar problem with the debug 
M(dt1/dt2/dt3) test suites will arise when I incorporate that data. 
C++ unit tests are not present because blobber doesn't run on C++ unit 
tests for some reason, and Jit-tests, jetpack tests, and Marionette 
tests await me hooking in the upload scripts to those testsuites (and 
Jit-tests would suffer a similar numbering problems). The individual 
testsuites within M-oth may be mislabeled because I can't sort names 
properly.


There's a final, separate issue with treeherder not recording the 
blobber upload artifacts for a few of the runs (e.g., Linux32 opt X), 
even though it finished without errors and tbpl records those 
artifacts. So coverage data is missing for the affected run. It's also 
worth noting that a few test runs are mired with timeouts and 
excessive failures, the worst culprit being Linux32 debug where half 
the testsuites either had some failures or buildbot timeouts (and no 
data at all).


If you want the underlying raw data (the .info files I prepare from 
every individual run's info), I can provide that on request, but the 
data is rather large (~2.5G uncompressed).
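
For anyone unfamiliar with the format: .info files are standard lcov
tracefiles, so combining a couple of runs and rendering an HTML report
looks roughly like this (a generic lcov recipe, not the exact pipeline used
to produce the results above):

 lcov -a run1.info -a run2.info -o combined.info
 genhtml combined.info -o coverage-report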


In short:
* I have up-to-date code-coverage on Linux 32-bit and Linux 64-bit. 
Opt is up right now; debug will be uploaded hopefully within 24 hours.

* Per-test [TBPL run] level of detail is visible.
* Treeherder seems to be having a bit of an ontology issue...





Re: Try-based code coverage results

2014-07-07 Thread Jonathan Griffin
I guess a related question is, if we could run this periodically on 
TBPL, what would be the right frequency?


We could potentially create a job in buildbot that would handle the
downloading/post-processing, which might be a bit faster than doing it 
on an external system.


Jonathan

On 7/7/2014 10:40 AM, Joshua Cranmer 🐧 wrote:

On 7/7/2014 11:39 AM, Jonathan Griffin wrote:

Hey Joshua,

That's awesome!

How long does the try run take that generated this data?  We should 
consider scheduling a periodic job to collect this data and track it 
over time.


Well, it depends on how overloaded try is at the moment. ^_^

The builds take an hour themselves, and the longest-running tests on 
debug builds can run long enough to encroach the hard (?) 2 hour limit 
for tests. Post-processing of the try data can take another several 
hours (a large part of which is limited by the time it takes to 
download ~3.5GB of data).






Re: Try-based code coverage results

2014-07-07 Thread Jonathan Griffin
So it sounds like it would be valuable to add try syntax to trigger 
this, as well as produce periodic reports.   Most of the work needed is 
the same.


I'll file a bug to track this; I don't have an ETA for starting work on 
it, but we want to get to it before things bitrot.


Jonathan

On 7/7/2014 12:49 PM, Joshua Cranmer 🐧 wrote:

On 7/7/2014 1:11 PM, Jonathan Griffin wrote:
I guess a related question is, if we could run this periodically on 
TBPL, what would be the right frequency?


Several years ago, I did a project where I ran code-coverage on 
roughly every nightly build of Thunderbird [1] (and I still have those 
results!). When I talked about this issue back then, people seemed to 
think that weekly was a good metric. I think Christian Holler was 
doing builds roughly monthly a few years ago based on an earlier 
version of my code-coverage-on-try technique until those builds fell 
apart [2].





On 7/7/2014 11:18 AM, Brian Smith wrote:
Ideally, you would be able to trigger it on a try run for specific 
test suites or even specific subsets of tests. For example, for 
certificate verification changes and SSL changes, it would be great 
for the reviewer to be able to insist on seeing code coverage reports 
on the try run that preceded the review request, for xpcshell, 
cppunit, and GTest, without doing coverage for all test suites.


To minimize the performance impact of it further, ideally it would be 
possible to scope the try runs to "cppunit, GTest, and xpcshell tests 
under the security/ directory in the tree."
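
To make that scoping concrete, here is a rough sketch of just the 
selection step (Python, illustrative only; it assumes the mozbase 
manifestparser module, the example manifest path is arbitrary, and the 
try-syntax plumbing to drive this does not exist yet):

    # Rough sketch of directory-scoped test selection.  In practice the
    # harness already knows its set of manifests; here they are passed
    # in explicitly.
    from manifestparser import TestManifest

    def tests_under(manifest_paths, subdir):
        manifest = TestManifest(manifests=manifest_paths, strict=False)
        # active_tests() yields one dict per test, including its resolved 'path'.
        return [t['path']
                for t in manifest.active_tests(exists=False, disabled=False)
                if subdir in t['path']]

    # e.g. tests_under(['security/manager/ssl/tests/unit/xpcshell.ini'],
    #                  'security/')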


___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Try-based code coverage results

2014-07-07 Thread Jonathan Griffin
Filed https://bugzilla.mozilla.org/show_bug.cgi?id=1035464 for those 
that would like to follow along.


Jonathan

On 7/7/2014 3:22 PM, Jonathan Griffin wrote:
So it sounds like it would be valuable to add try syntax to trigger 
this, as well as produce periodic reports.   Most of the work needed 
is the same.


I'll file a bug to track this; I don't have an ETA for starting work 
on it, but we want to get to it before things bitrot.


Jonathan

On 7/7/2014 12:49 PM, Joshua Cranmer 🐧 wrote:

On 7/7/2014 1:11 PM, Jonathan Griffin wrote:
I guess a related question is, if we could run this periodically on 
TBPL, what would be the right frequency?


Several years ago, I did a project where I ran code-coverage on 
roughly every nightly build of Thunderbird [1] (and I still have 
those results!). When I talked about this issue back then, people 
seemed to think that weekly was a good metric. I think Christian 
Holler was doing builds roughly monthly a few years ago based on an 
earlier version of my code-coverage-on-try technique until those 
builds fell apart [2].





On 7/7/2014 11:18 AM, Brian Smith wrote:
Ideally, you would be able to trigger it on a try run for specific 
test suites or even specific subsets of tests. For example, for 
certificate verification changes and SSL changes, it would be great 
for the reviewer to be able to insist on seeing code coverage reports 
on the try run that preceded the review request, for xpcshell, 
cppunit, and GTest, without doing coverage for all test suites.


To minimize the performance impact of it further, ideally it would be 
possible to scope the try runs to "cppunit, GTest, and xpcshell tests 
under the security/ directory in the tree."


___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Switching Jetpack to use the runtests.py automation

2014-08-05 Thread Jonathan Griffin
If this only involves tiny changes to mochitest and it's ready, I'd go 
ahead and do that.


I am interested in seeing what your requirements are, though, and 
figuring out if we could meet them later with a better architected 
solution, whether it's Marionette or something else.  Mochitest is kind 
of a monster, and the more we hack on corner cases the more fragile and 
unmaintainable it becomes.


Jonathan

On 8/5/2014 9:20 AM, Dave Townsend wrote:

On Mon, Aug 4, 2014 at 6:21 PM, Gregory Szorc  wrote:


On 8/4/14, 10:39 AM, Dave Townsend wrote:


I've done a little investigation into marionette and I've found a few
issues with it:

Firstly, it doesn't look like running marionette directly or through mach
allows developers to select individual directories or test files to run;
rather, it is a one-shot affair. This is very inconvenient for development.

Secondly, marionette doesn't seem to be built to scale to many test
types. It uses regular expressions on filenames to determine the test
type; as it happens, the Jetpack tests do use a different form to the
existing marionette tests, so it's not out of the question, but it still
makes me wary of adding a new test type.

Thirdly, I can't run marionette tests locally; they consistently fail
quite badly.

These problems make marionette a less than desirable option for use as a
base for our test harness right now, so I plan to get my work to make
mochitests run Jetpack tests completed this week and submit it for
review. If Marionette becomes a better choice in the future, a lot of the
work I'm doing right now carries over; it will be simpler to switch from
mochitest to marionette later than it is to switch to mochitest now.


The issues listed seem fixable. I would rather we spend energy improving
Marionette than piling yet more things on top of mochitest's haphazard base.

The various automation "failures" in the past few weeks should be reason
enough to avoid mochitest and go with a better-engineered and tested
solution (marionette).



Who is going to do that work? I have patches that vastly improve the
testing situation for jetpack tests by allowing other developers to run
them more easily, making them easier for releng to manage, and most
importantly making them meet tbpl visibility requirements. They involve
tiny changes to the mochitest harness code. We're already hidden by
default on tinderbox and are hitting problems because of it; I'd rather
go ahead and finish up this work than wait for some future time when
marionette can be upgraded to meet our requirements.
___
tools mailing list
to...@lists.mozilla.org
https://lists.mozilla.org/listinfo/tools


___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: External dependent tests in gecko and gaia

2014-08-14 Thread Jonathan Griffin
I think this is a great idea, although like others have said, I'd like 
to have this implemented inside the test manifests, regardless of 
directory structure.
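
For what it's worth, one sketch of what "inside the test manifests" 
could look like: a made-up 'external-dependencies' key on a test entry, 
which ordinary CI filters out and the external runner selects on. The 
key name and the entry below are purely illustrative; manifestparser 
does, however, pass unknown keys through on each test dict, so the 
partitioning itself is simple:

    # Illustrative only: 'external-dependencies' is an invented manifest
    # key, not something our harnesses understand today.  A manifest
    # entry might look like:
    #
    #   [test_loop_server_roundtrip.js]
    #   external-dependencies = loop-server
    #
    # manifestparser keeps unknown keys on each test dict, so a runner
    # could partition on them:
    from manifestparser import TestManifest

    def partition(manifest_paths):
        tests = TestManifest(manifests=manifest_paths,
                             strict=False).active_tests(exists=False)
        internal = [t for t in tests if 'external-dependencies' not in t]
        external = [t for t in tests if 'external-dependencies' in t]
        return internal, external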


A related piece is reporting; for years, we've had tests like this run 
on separate systems, reporting to custom dashboards, because they 
weren't appropriate for buildbot, and TBPL can't display data for tests 
running anywhere else.  With Treeherder, we're now in a position to be 
able to display results for tests running anywhere, and this could lead 
to greater visibility and adoption of non-buildbot automation.


Such tests would still need to be sheriffed differently (by different 
people and/or with different rules), so we'd have to work out what the 
views for these tests should be, but using Treeherder to expose test 
results for the variety of automation that's run in Mozilla will be a 
big win.


Jonathan

On 8/13/2014 9:21 AM, Edwin Wong wrote:

Hi dev-platform,

TL;DR - Cloud Services and Quality Engineering would like to propose the creation of 
a directory named "external" in gecko and gaia repos for externally dependent 
tests.

This enables features married to Cloud Services such as Loop, FindMyDevice, 
FirefoxAccounts, and Sync to have centralized tests that can be run locally or on 
other continuous integration systems. These tests would live in this "external" 
directory alongside existing tests (so they live together). These will be run and 
sheriffed independently from the main tests.  Reviews would be governed by modules 
and feature teams.

More detail:
https://wiki.mozilla.org/QA/External_Tests

Cheers,
Edwin Wong
Cloud Services QE Manager

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Running mochitests from a copy of the objdir?

2014-08-19 Thread Jonathan Griffin

Hi Neil,

Can you show us the command-line you're using?

Jonathan

On 8/19/2014 1:53 AM, Neil wrote:

Gregory Szorc wrote:


On 8/18/2014 4:45 PM, Neil wrote:



Time was that you could just python runtests.py to run mochitests.

Then we needed modules that you don't get in the default python, so 
you had to invoke python from the virtualenv instead.


Now that doesn't work either, because it's trying to run .mozconfig, 
so my questions are a) why and b) how do I stop it? (I found 
--binary but that didn't seem to be enough on its own.)



Can you please describe your workflow

It's quite simple: python runtests.py had all the right defaults for 
me, while the Makefile targets pass a bunch of parameters that I don't 
want, and mach doesn't work in comm-central yet, so I don't know how 
badly that would work out.


(as opposed to giving you hints on how to subtly hack around the 
existing implementation)



I'd prefer hints on how to fix the build system, but I'll take hacks too.



___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-19 Thread Jonathan Griffin
Our pools of test slaves are often at or over capacity, and this has the 
effect of increasing job coalescing and test wait times.  This, in turn, 
can lead to longer tree closures caused by test bustage, and can cause 
try runs to be very slow to complete.


One of the easiest ways to mitigate this is to run tests less often.

To assess the impact of doing this, we will be performing an experiment 
the week of August 25, in which we will run debug tests on 
mozilla-inbound on most desktop platforms every other run, instead of 
every run as we do now.  Debug tests on linux64 will continue to run 
every time.  Non-desktop platforms and trees other than mozilla-inbound 
will not be affected.


This approach is based on the premise that the number of debug-only 
platform-specific failures on desktop is low enough to be manageable, 
and that the extra burden this imposes on the sheriffs will be small 
enough compared to the improvement in test slave metrics to justify the 
cost.


While this experiment is in progress, we will be monitoring job 
coalescing and test wait times, as well as impacts on sheriffs and 
developers.  If the experiment causes sheriffs to be unable to perform 
their job effectively, it can be terminated prematurely.


We intend to use the data we collect during the experiment to inform 
decisions about additional tooling we need to make this or a similar 
plan permanent at some point in the future, as well as validating the 
premise on which this experiment is based.


After the conclusion of this experiment, a follow-up post will be made 
which will discuss our findings.  If you have any concerns, feel free to 
reach out to me.


Jonathan

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-19 Thread Jonathan Griffin
I also agree about coalescing better.  We are looking at ways to do that 
in conjunction with 
https://wiki.mozilla.org/Auto-tools/Projects/Autoland, which we'll have 
a prototype of by the end of the quarter.  In this model, commits that 
are going through autoland could be coalesced when landing on inbound, 
which would reduce slave load on all platforms.


Until that's deployed and in widespread use, we have other options to 
decrease slave load, and this experiment is the simplest.  It won't 
result in reduced test coverage, since sheriffs will backfill in the 
case of a regression.  Essentially, we're not running tests that would 
have passed anyway.


Depending on feedback we receive after this experiment, we may opt to 
change our approach in the future:  i.e., run tests every Nth opt build 
instead of debug build, or try to identify sets of "never failing" tests 
and just run those less frequently, or always include at least one 
flavor of Windows, OSX and Linux on every commit, etc.


Regards,

Jonathan


On 8/19/2014 1:55 PM, Benoit Girard wrote:

I completely agree with Jeff Gilbert on this one.

I think we should try to coalesce -better-. I just checked the current
state of mozilla-inbound and it doesn't feel like any of the current
patches really need their own set of tests, because they are not time
sensitive or sufficiently complex. Right now developers are asked to
create bugs for their own change with their own patch. This leads to a
lot of little patches being landed by individual developers, which
seems to reflect the current state of mozilla-inbound.

Perhaps we should instead promote checkin-needed (or a similarly simple
mechanism) to coalesce simple changes together. Opting into this means
that your patch may take significantly longer to get merged if it's
landed with another bad patch, so it should only be used when that's
acceptable.
Right now developers with commit access are not encouraged to make use
of checkin-needed AFAIK. If we started recommending against individual
landings for simple changes, and improved the process, we could
probably significantly cut the number of tests jobs by cutting the
number of pushes.

On Tue, Aug 19, 2014 at 3:57 PM, Jeff Gilbert  wrote:

I would actually say that debug tests are more important for continuous 
integration than opt tests. At least in code I deal with, we have a ton of 
asserts to guarantee behavior, and we really want test coverage with these via 
CI. If a test passes on debug, it should almost certainly pass on opt, just 
faster. The opposite is not true.

"They take a long time and then break" is part of what I believe caused us to 
not bother with debug testing on much of Android and B2G, which we still haven't 
completely fixed. It should be unacceptable to ship without CI on debug tests, but here 
we are anyways. (This is finally nearly fixed, though there is still some work to do)

I'm not saying running debug tests less often is on the same scale of bad, but 
I would like to express my concerns about heading in that direction.

-Jeff

- Original Message -
From: "Jonathan Griffin" 
To: dev-platform@lists.mozilla.org
Sent: Tuesday, August 19, 2014 12:22:21 PM
Subject: Experiment with running debug tests less often on mozilla-inbound  
the week of August 25

Our pools of test slaves are often at or over capacity, and this has the
effect of increasing job coalescing and test wait times.  This, in turn,
can lead to longer tree closures caused by test bustage, and can cause
try runs to be very slow to complete.

One of the easiest ways to mitigate this is to run tests less often.

To assess the impact of doing this, we will be performing an experiment
the week of August 25, in which we will run debug tests on
mozilla-inbound on most desktop platforms every other run, instead of
every run as we do now.  Debug tests on linux64 will continue to run
every time.  Non-desktop platforms and trees other than mozilla-inbound
will not be affected.

This approach is based on the premise that the number of debug-only
platform-specific failures on desktop is low enough to be manageable,
and that the extra burden this imposes on the sheriffs will be small
enough compared to the improvement in test slave metrics to justify the
cost.

While this experiment is in progress, we will be monitoring job
coalescing and test wait times, as well as impacts on sheriffs and
developers.  If the experiment causes sheriffs to be unable to perform
their job effectively, it can be terminated prematurely.

We intend to use the data we collect during the experiment to inform
decisions about additional tooling we need to make this or a similar
plan permanent at some point in the future, as well as validating the
premise on which this experiment is based.

After the conclusion of this experiment, a follow-up post will be made 
which will discuss our findings.

Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-19 Thread Jonathan Griffin

On 8/19/2014 2:41 PM, Ehsan Akhgari wrote:

On 2014-08-19, 3:57 PM, Jeff Gilbert wrote:
I would actually say that debug tests are more important for 
continuous integration than opt tests. At least in code I deal with, 
we have a ton of asserts to guarantee behavior, and we really want 
test coverage with these via CI. If a test passes on debug, it should 
almost certainly pass on opt, just faster. The opposite is not true.


"They take a long time and then break" is part of what I believe 
caused us to not bother with debug testing on much of Android and 
B2G, which we still haven't completely fixed. It should be 
unacceptable to ship without CI on debug tests, but here we are 
anyways. (This is finally nearly fixed, though there is still some 
work to do)


I'm not saying running debug tests less often is on the same scale of 
bad, but I would like to express my concerns about heading in that 
direction.


I second this.  I'm curious to know why you picked debug tests for 
this experiment.  Would it not make more sense to run opt tests on 
desktop platforms on every other run?


Just based on the fact that they take longer and thus running them less 
frequently would have a larger impact.  If there's a broad consensus 
that debug runs are more valuable, we could switch to running opt tests 
less frequently instead.


Jonathan
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-19 Thread Jonathan Griffin
No, fx-team is not affected by this experiment; we intend to target 
mozilla-inbound only for this 1-week trial.  The reason is that the 
number of commits on m-i seems larger than on fx-team, and therefore 
the impacts should be more visible.


Jonathan

On 8/19/2014 3:19 PM, Matthew N. wrote:

On 8/19/14 12:22 PM, Jonathan Griffin wrote:

To assess the impact of doing this, we will be performing an experiment
the week of August 25, in which we will run debug tests on
mozilla-inbound on most desktop platforms every other run, instead of
every run as we do now.  Debug tests on linux64 will continue to run
every time.  Non-desktop platforms and trees other than mozilla-inbound
will not be affected.


To clarify, is fx-team affected by this change? I ask because you 
mention "desktop" and that is where the desktop front-end team does 
landings. I suspect fx-team landings are less likely to hit debug-only 
issues than mozilla-inbound as fx-team has much fewer C++ changes and 
anecdotally JS-only changes seem to trigger debug-only failures less 
often.



This approach is based on the premise that the number of debug-only
platform-specific failures on desktop is low enough to be manageable,
and that the extra burden this imposes on the sheriffs will be small
enough compared to the improvement in test slave metrics to justify the
cost.


FWIW, I think fx-team is more desktop-specific (although Android 
front-end stuff also lands there and I'm not familiar with that).


MattN

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-21 Thread Jonathan Griffin
Thanks Ed.  To paraphrase, no test coverage is being lost here, we're 
just being a little more deliberate with job coalescing.  All tests will 
be run on all platforms (including debug tests) on a commit before a 
merge to m-c.


Jonathan

On 8/21/2014 9:35 AM, Ed Morley wrote:
I think much of the pushback in this thread is due to a 
misunderstanding of some combination of:

* our current buildbot scheduling
* the proposal
* how trees are sheriffed and merged

To clarify:

1) We already have coalescing [*] of jobs on all trees apart from try.

2) This coalescing means that all jobs are still run at some point, 
but just may not run on every push.


3) When failures are detected, coalescing means that regression ranges 
are larger and so sometimes result in longer tree integration repo 
closures, whilst the sheriffs force trigger jobs on the revisions that 
did not originally run them.


4) When merging into mozilla-central, sheriffs ensure that all jobs 
are green - including those that got coalesced and those that are only 
scheduled periodically (eg non-unified & PGO builds are only run every 
3 hours). (This is a fairly manual process currently, but better 
tooling should be possible with treeherder).


5) This proposal does not mean debug-only issues are somehow not worth 
acting on or that they'll end up shipped/on mozilla-central, thanks to 
#4.


6) This proposal is purely trying to make existing coalescing (#1/#2) 
more intelligent, to ensure that we expend the finite amount of 
machine time we have at present on the most appropriate jobs at each 
point, in order to reduce the impact of #3.


Fwiw I'm on the fence as to whether the algorithm suggested in this 
proposal is the most effective way to aid with #3 - however it's worth 
trying to find out.


Best wishes,

Ed

[*] Collapsing of pending jobs of the same type, when the queue size 
is greater than 1.


___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-21 Thread Jonathan Griffin

Hey Martin,

This is a good idea, and we've been thinking about approaches like 
this.  Basically, the idea is to run tests that "(nearly) always pass" 
less often.  There are currently some tests that fit into this category, 
like dom level0,1,2 tests in mochitest-plain, and those are 
time-consuming to run.  Your idea takes this a step further, by 
identifying tests that sometimes fail, correlating those with code 
changes, and ensuring those get run.
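
To make the correlation idea concrete, here is a toy sketch (Python; 
the data shapes are invented, and real inputs would have to come from 
pushlog plus per-push job results):

    # Toy sketch: tabulate how often a suite fails when a given directory
    # is touched, then pick the suites worth always running for a change.
    from collections import defaultdict

    def failure_rates(history):
        """history: iterable of (touched_dirs, failed_suites) per push."""
        touched = defaultdict(int)                      # dir -> pushes touching it
        failed = defaultdict(lambda: defaultdict(int))  # dir -> suite -> failures
        for dirs, failures in history:
            for d in dirs:
                touched[d] += 1
                for suite in failures:
                    failed[d][suite] += 1
        return {d: {s: failed[d][s] / float(touched[d]) for s in failed[d]}
                for d in touched}

    def suites_to_run(changed_dirs, rates, threshold=0.2):
        run = set()
        for d in changed_dirs:
            for suite, rate in rates.get(d, {}).items():
                if rate >= threshold:
                    run.add(suite)
        return run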


Both of these require some tooling to implement, so we're experimenting 
initially with approaches that we can get nearly for "free", like 
running some tests only every other commit, and letting sheriffs trigger 
the missing tests in case a failure occurs.


The ultimate solution may blend a bit of both approaches, and will have 
to balance implementation cost with the gain we get from the related 
reduction in slave load.


Jonathan


On 8/21/2014 10:07 AM, Martin Thomson wrote:

On 20/08/14 17:37, Jonas Sicking wrote:

It would however be really cool if we were able to pull data on which
tests tend to fail in a way that affects all platforms, and which ones
tend to fail on one platform only.


Here's a potential project that might help.  For all of the trees 
(probably try especially), look at the checkins and for each directory 
affected build up a probability of failure for each of the tests.


You would have to find which commits were on m-c at the time of the 
run to set the baseline for the checkin; and intermittent failures 
would add a certain noise floor.


The basic idea though is that the information would be very simple to 
use: For each directory touched in a commit, find all the tests that 
cross a certain failure threshold across the assembled dataset and 
ensure that those test groups are run.


And this would need to include prerequisites, like builds for the 
given runs.  You would, of course, include builds as tests.


Setting the threshold might take some tuning, because failure rates 
will vary across different test groups.  I keep hearing bad things 
about certain ones, for instance, and build failures are far less 
common than test failures on the whole, naturally.

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-21 Thread Jonathan Griffin
It will be handled just like coalesced jobs today:  sheriffs will 
backfill the missing data, and backout only the offender.


An illustration might help.  Today we might have something like this, 
for a given job:


          linux64-debug  win7-debug  osx8-debug
commit 1  pass           pass        pass
commit 2  pass           pass        pass
commit 3  pass           fail        pass
commit 4  pass           fail        pass

In this case (assuming the two failures are the same), it's easy for 
sheriffs to see that commit 3 is the culprit and the one that needs to 
be backed out.


During the experiment, we might see something like this:

          linux64-debug  win7-debug  osx8-debug
commit 1  pass           pass        pass
commit 2  pass           not run     not run
commit 3  pass           fail        pass
commit 4  pass           not run     not run

Here, it isn't obvious whether the problem is caused by commit 2 or 
commit 3.  (This situation already occurs today because of "random" 
coalescing.)


In this case, the sheriffs will backfill missing test data, so we might see:

          linux64-debug  win7-debug  osx8-debug
commit 1  pass           pass        pass
commit 2  pass           pass        not run
commit 3  pass           fail        pass
commit 4  pass           fail        not run

...and then they have enough data to determine that commit 3 (and not 
commit 2) is to blame, and can take the appropriate action.
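
For the curious, the reasoning the sheriffs apply by hand amounts to 
something like this sketch (illustrative only):

    # Illustrative sketch of the backfill reasoning: given per-commit
    # results for one job ('pass', 'fail', or None for coalesced/not run),
    # the commits to retrigger are the not-run ones between the last known
    # pass and the first known fail.
    def commits_to_backfill(results):
        """results: ordered list of (commit, status) tuples."""
        last_pass = None
        first_fail = None
        for i, (_, status) in enumerate(results):
            if status == 'pass':
                last_pass = i
            elif status == 'fail':
                first_fail = i
                break
        if first_fail is None:
            return []
        start = 0 if last_pass is None else last_pass + 1
        return [commit for commit, status in results[start:first_fail]
                if status is None]

    # e.g. the win7-debug column in the middle illustration above:
    # commits_to_backfill([('commit 1', 'pass'), ('commit 2', None),
    #                      ('commit 3', 'fail'), ('commit 4', None)])
    # -> ['commit 2']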


In summary, the sheriffs won't be backing out extra commits because of 
the coalescing, and it remains the sheriffs' job to backfill tests when 
they determine they need to do so in order to bisect a failure.   We 
aren't placing any extra burden on developers with this experiment, and 
part of the reason for this experiment is to determine how much of an 
extra burden this is for the sheriffs.


Jonathan

On 8/21/2014 3:03 PM, Jonas Sicking wrote:

What will be the policy if a test fails and it's unclear which push
caused the regression? Is it the sheriff's job, or the people who
pushed's job to figure out which push was the culprit and make sure
that that push gets backed out?

I.e. if 4 pushes land between two testruns, and we see a regression,
will the 4 pushes be backed out? Or will sheriffs run the missing
tests and only back out the offending push?

/ Jonas

On Thu, Aug 21, 2014 at 10:50 AM, Jonathan Griffin  wrote:

Thanks Ed.  To paraphrase, no test coverage is being lost here, we're just
being a little more deliberate with job coalescing.  All tests will be run
on all platforms (including debug tests) on a commit before a merge to m-c.

Jonathan


On 8/21/2014 9:35 AM, Ed Morley wrote:

I think much of the pushback in this thread is due to a misunderstanding
of some combination of:
* our current buildbot scheduling
* the proposal
* how trees are sheriffed and merged

To clarify:

1) We already have coalescing [*] of jobs on all trees apart from try.

2) This coalescing means that all jobs are still run at some point, but
just may not run on every push.

3) When failures are detected, coalescing means that regression ranges are
larger and so sometimes result in longer tree integration repo closures,
whilst the sheriffs force trigger jobs on the revisions that did not
originally run them.

4) When merging into mozilla-central, sheriffs ensure that all jobs are
green - including those that got coalesced and those that are only scheduled
periodically (eg non-unified & PGO builds are only run every 3 hours). (This
is a fairly manual process currently, but better tooling should be possible
with treeherder).

5) This proposal does not mean debug-only issues are somehow not worth
acting on or that they'll end up shipped/on mozilla-central, thanks to #4.

6) This proposal is purely trying to make existing coalescing (#1/#2) more
intelligent, to ensure that we expend the finite amount of machine time we
have at present on the most appropriate jobs at each point, in order to
reduce the impact of #3.

Fwiw I'm on the fence as to whether the algorithm suggested in this
proposal is the most effective way to aid with #3 - however it's worth
trying to find out.

Best wishes,

Ed

[*] Collapsing of pending jobs of the same type, when the queue size is
greater than 1.


___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Running mochitests from a copy of the objdir?

2014-08-26 Thread Jonathan Griffin


On 8/20/2014 11:24 AM, Gregory Szorc wrote:


It sounds like the "copy build to remote machine so we can run tests 
there" is somewhat common and needs to be better supported by the tools.


I think it makes sense to leverage test archives and build packages 
for this. Since mozharness already supports running things from there, 
I guess the missing piece is some glue code that invokes mozharness 
with a sane set of default options. That missing piece sounds a lot 
like what mach does. We talked about bundling mach in the tests 
archive to facilitate things like this. Perhaps we should move forward 
with that? Or is there a better way?


The problem is at the unholy intersection of build, test, and 
automation responsibilities. Who's the owner? A-Team?


Yes, I think we should own this.  I think the ultimate solution involves 
several pieces which probably look something like this:


- bundling mach in tests.zip and making it operable from there (I just 
filed https://bugzilla.mozilla.org/show_bug.cgi?id=1058923 to track this)
- making mozharness scripts invoke mach where possible, reducing the 
complexity of these scripts and providing a common invocation path
- adding some more intelligent tooling to mozharness to make it easier 
to operate with local builds and test packages
- add self tests for our harnesses to catch problems in 
developer-friendly code paths that aren't used in automation (see 
https://bugzilla.mozilla.org/show_bug.cgi?id=1048884)
- add mach targets for triggering try jobs using local builds and/or 
test packages


With this set of tasks, I think we can meet a set of requirements that 
will prevent a lot of developer frustration:


- mach targets/developer-friendly harness options shouldn't get broken 
even though tests pass in automation
- it should be trivially easy to run tests using the same options used 
in buildbot

- it should be trivially easy to run tests from tests.zip
- it should be trivially easy to use local test packages and builds in 
try jobs


I'll follow-up with additional posts as we flesh out a plan to tackle 
this in more detail.


Jonathan

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: B2G emulator issues

2014-08-28 Thread Jonathan Griffin
Some more details on how we're approaching this problem from the 
infrastructure side:


Releng recently gave us the ability to run select jobs on faster VM's 
than the default, see 
https://bugzilla.mozilla.org/show_bug.cgi?id=1031083.  We have B2G 
emulator media mochitests scheduled on cedar using these faster VM's.  
After fixing a minor problem with these, we'll be able to see if these 
faster VM's solve the problem.  Local experiments suggest they do, but 
it will take a number of runs in buildbot to be sure.


If that doesn't fix the problem, we have the option of trying still 
faster VM's (at greater cost), or trying to run the tests on real 
hardware.  The disadvantages of running the tests on real hardware are 
that such hardware doesn't scale very readily and is already stretched 
pretty thin, and that the emulator doesn't currently run on our linux 
hardware slaves and will require some amount of work to fix.


This work is being tracked in 
https://bugzilla.mozilla.org/show_bug.cgi?id=994920.


Jonathan


On 8/28/2014 3:06 PM, Randell Jesup wrote:

I wrote in April:

The B2G emulator design is causing all sorts of problems.  We just fixed
the #2 orange which was caused by the Audio channel StartPlaying()
taking up to 20 seconds to run (and we "fixed" it by effectively
removing some timeouts).  However, we just wasted half a week trying to
land AEC & MediaStreamGraph improvements.  We still haven't landed due
to yet another B2G emulator orange, but the solution we used for the M10
problem doesn't fix the fundamental problems with B2G emulator.

You can read the earlier thread (starting 7-apr) about this issue.  We
wallpapered over the issues (including turning down 'fake' audio
generation to 1/10th realtime and letting it underflow).

The problems with the b2g emulator have just gotten worse as we add more
tests and make changes to improve the system that give the emulators
fits.

Right now, we're looking at being blocked from landing important
improvements (that make things *not* fail due to perf timeouts in
real-user-scenarios) because b2g-emulator chokes on anything even
smelling of realtime data.  It can stall for 10's of seconds (see
above), or even minutes.  Even running a single test can cause other,
unrelated tests to perma-orange.

The stuff we've had to do (like turning down audio generation) to block
oranges in the current setup makes the tests very non-real-world, and so
greatly diminishes their utility anyways.

There was work being done to move media and other semi-realtime tests to
faster hardware; that is happening but it's not ready yet. (For reference,
in April tests showed that a b2g emulator mochitest that took <10
seconds on my Xeon took 350-450 seconds on tbpl.)


The fundamental problem is that b2g-emulator can't deal safely with any
sort of realtime or semi-realtime data unless run on a fast machine.
The architecture for the emulator setup means the effective CPU power is
dependent on the machine running the test, and that varies a lot (and
tbpl machines are WAY slower than my 2.5 year old desktop).  Combine
that with Debug being much slower, and it's recipe for disaster for any
sort of time-dependent tests.

...

So, what do we do?  Because if we do nothing, it will only get worse.

So we've done nothing (that's landed at least), and it has gotten worse,
and we're at the breaking point where b2g emulator (especially debug)
for media tests (especially webrtc) is providing negative value, and
blocking critically important improvements.

We've just landed bug 1059867 to disable most webrtc tests on the emulator
until we can get them running on hardware that has the power to run them
(or other fixes make them viable again (bug 1059878)). We may need to
consider similar measures for other media tests (webaudio, etc). In the
meantime, we're going to try to run local emulator pull/build/mochitest
cronjobs on faster desktop machines (perhaps mine) on a daily or perhaps
continuous basis.  (Poor man's tbpl - maybe I'll un-mothball tinderbox
for some nostalgic flames...)

Also note that webrtc tests do run on the b2g desktop tbpl runs, so we
have some coverage.

I hope we can find a better solution than "run it on my dev machine"
sometime soon (very soon!), but right now that's better than playing
whack-a-random-timeout or just increasing run times to infinity.


P.S. there are some interesting threads of stuff that could help a lot,
like the comment Jay Wang made in April about SpecialPowers.exactGC
taking 3-10s per instance on b2g debug, and tons of them being run (one
test took 102s to finish, and had 90 gc's which mostly took ~10s each).
Bug 1012516



___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Who wishes to discuss test suites at MozLandia?

2014-11-26 Thread Jonathan Griffin

I imagine several people on the A-Team would be interested in
attending; can you cc auto-to...@mozilla.com with details when you
create a session?

Jonathan

On 11/26/14, 3:01 AM, David Rajchenbach-Teller wrote:

The test suites have changed a lot during the past few months (e.g.
Assert.jsm, fail-on-uncaught-rejections), and as I have exposed on
dev-platform, I have plans to further change them.

If you are attending MozLandia and want to brainstorm future changes on
the test suites, please ping me so that we can arrange a session.

Cheers,
 David

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Wish list for tools to help fix intermittent bugs

2014-12-09 Thread Jonathan Griffin

Thanks Andrew.

Gijs, if you'd like to see the notes we took in PDX on this topic, 
they're here: https://etherpad.mozilla.org/ateam-pdx-intermittent-oranges


Feel free to add more ideas and comments.  We're currently working on 
our Q1 plan and will see how many of these things we can fit in then.


Jonathan

On 12/9/2014 6:24 AM, Andrew Halberstadt wrote:
We had a session on intermittents in PDX. Additionally we (the ateam) 
have had several brainstorming sessions prior to the work week. I'll 
try to summarize what we talked about and answer your questions at the 
same time in-line.


On 08/12/14 03:52 PM, Gijs Kruitbosch wrote:

1) make it easier to figure out from bugzilla/treeherder when and where
the failure first occurred
- I don't want to know the first thing that got reported to bmo - IME,
that is not always the first time it happened, just the first time it
got filed.

In other words, can I query treeherder in some way (we have structured
logs now right, and all this stuff is in a DB somewhere?) with a test
name and a regex, to have it tell me where the test first failed with a
message matching that regex?


Structured logs have been around for a few months now, but only 
recently has mozharness started using them for determining failure 
status (and even now only for a few suites).
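
Even without a database behind them, structured logs make the "test 
name plus regex" question answerable per log file; here's a rough 
sketch, assuming mozlog-style JSON-lines records with 'action', 
'test', 'expected' and 'message' fields (the 'expected' field only 
appears on unexpected results):

    # Rough sketch, not a real tool: scan structured logs (one JSON
    # object per line) for unexpected results on a given test whose
    # message matches a regex.
    import json
    import re

    def matching_failures(log_paths, test_name, pattern):
        regex = re.compile(pattern)
        hits = []
        for path in log_paths:
            with open(path) as f:
                for line in f:
                    try:
                        record = json.loads(line)
                    except ValueError:
                        continue  # skip unstructured lines mixed into the log
                    if record.get('action') not in ('test_status', 'test_end'):
                        continue
                    if record.get('test') != test_name or 'expected' not in record:
                        continue
                    if regex.search(record.get('message') or ''):
                        hits.append((path, record))
        return hits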


The next step is absolutely storing this stuff into a DB. Starting now 
and into Q1 we'll be creating a prototype to figure out things like 
schemas, costs and logistics. Unlike logs, we want to keep this data 
forever, so we need to make sure we get it right.


As part of the prototype phase, we plan to answer some simple 
questions that don't require lots of historical data. Can we identify 
new flaky tests? Can we normalize chunks based on runtime instead of 
number of tests?
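
The runtime-normalized chunking question is essentially greedy bin 
packing over recorded per-test runtimes; a sketch (the runtime data 
itself would come from the DB just described):

    # Sketch: split tests into chunks of roughly equal total runtime
    # rather than equal test counts.  Greedy: longest test first, into
    # the currently lightest chunk.
    import heapq

    def chunk_by_runtime(runtimes, nchunks):
        """runtimes: dict of test name -> seconds."""
        heap = [(0.0, i, []) for i in range(nchunks)]  # (total, chunk id, tests)
        heapq.heapify(heap)
        for test, secs in sorted(runtimes.items(),
                                 key=lambda kv: kv[1], reverse=True):
            total, i, tests = heapq.heappop(heap)
            tests.append(test)
            heapq.heappush(heap, (total + secs, i, tests))
        return [tests for _, _, tests in sorted(heap, key=lambda c: c[1])]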




2) make it easier to figure out from bugzilla/treeherder when and where
the failure happens

3) numbers on how frequently a test fails


I think these both tie into number 1. We aren't sure exactly what the 
schema will look like, but tying metadata about the test run into the 
results is obviously something we need to do. These questions would 
become easy to answer.


We also want to look into cross correlating data from other systems 
(e.g bugzilla, orangefactor, ...) into test results. This will likely 
be further out though.




4) automate regression hunting (aka mozregression for intermittent
infra-only failures)


Yes, this is explicitly one of the first things we'll be tackling. 
Often sheriffs don't have time to go and retrigger backfills; they 
shouldn't have to. This sort of (but not really) depends on the DB 
project outlined above.




5) rr or similar recording of failing test runs

We've talked about this before on this newsgroup, but it's been a long
time. Is this feasible and/or currently in the pipeline?


We're aware of rr, but it's not something that has been called out as 
something we should do in the short term. My understanding is that 
there are still a lot of unknowns, and getting something stood up in 
production infrastructure will likely be a large multi-quarter 
project. Maybe :roc can clarify here.


I'm not saying we won't do it, it would be awesome, but it seems like 
there are easier wins we can make in the meantime.




~ Gijs


Other things that we talked about that might make dealing with 
intermittents better:


* dynamic (maybe also static) analysis of new tests to determine 
common bad patterns (ehsan has ideas) to be integrated into autoland 
or a post-commit hook or some kind of quarantine.


* in-tree chunking/more dynamic test scheduling (ability to schedule 
only certain tests). One of the end goals here is for the term 
"chunking" to disappear from the point of view of developers.


* c++ code coverage tied into the build system with automatically 
updated reports (I'm working on the build integration pieces on the 
side).


* automatic filing of intermittents (this is currently what the 
sheriffs spend the most time on, fixing this frees them up to better 
monitor the tree).


Thanks for caring about the state of intermittents, they've been 
neglected for too long. I'm hopeful that 2015 will bring many 
improvements in this area. And of course, please let us know if you 
have any other ideas or would like to help out.


-Andrew
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


What are your pain points when running unittests?

2015-03-12 Thread Jonathan Griffin
The A-Team is embarking on a project to improve the developer experience
when running unittests locally.  This project will address the following
frequently-heard complaints:

* Locally developers often use mach to run tests, but tests in CI use
mozharness, which can result in different behaviors.
* It's hard to reproduce a try job because it's hard to set up the test
environment and difficult to figure out which command-line arguments to use.
* It's difficult to run tests from a tests.zip package if you don't have a
build on that machine and thus can't use mach.
* It's difficult to run tests through a debugger using a downloaded build.

The quintessential use case here is making it easy to reproduce a try run
locally, without a local build, using a syntax something like:

* runtests --try 2844bc3a9227

Ideally, this would download the appropriate build and tests.zip package,
bootstrap the test environment, and run the tests using the exact same
arguments as are used on try, optionally running it through an appropriate
debugger.  You would be able to substitute a local build and/or local
tests.zip package if desired.  You would be able to override command-line
arguments used in CI if you wanted to, otherwise the tests would be run
using the same args as in CI.
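
As a sketch of the shape such a tool might take (every helper and 
option name below is a placeholder, not a real interface):

    # Placeholder sketch only; none of these helpers exist today.  The
    # point is the flow: resolve a try revision to its build and
    # tests.zip, fetch them (or accept local ones), then run the suite
    # with the same arguments CI used.
    import argparse

    def download_build_for(revision):
        raise NotImplementedError('placeholder: fetch the try build for %s' % revision)

    def download_tests_zip_for(revision):
        raise NotImplementedError('placeholder: fetch tests.zip for %s' % revision)

    def run_suite(suite, build, tests, extra_args):
        raise NotImplementedError('placeholder: invoke the harness as CI does')

    def main():
        parser = argparse.ArgumentParser()
        parser.add_argument('--try', dest='revision', required=True)
        parser.add_argument('--suite', default='mochitest-plain')
        parser.add_argument('--build', help='use a local build instead of downloading')
        parser.add_argument('--tests', help='use a local tests.zip instead of downloading')
        parser.add_argument('--debugger')
        opts = parser.parse_args()

        build = opts.build or download_build_for(opts.revision)
        tests = opts.tests or download_tests_zip_for(opts.revision)
        extra = ['--debugger', opts.debugger] if opts.debugger else []
        run_suite(opts.suite, build, tests, extra)

    if __name__ == '__main__':
        main()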

What other use cases would you like us to address, which aren't derivatives
of the above issues?

Thanks for your input,

Jonathan
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Using rr with test infrastructure

2015-03-13 Thread Jonathan Griffin
OrangeFactor suggests that linux is about equal to our other platforms in
terms of catching intermittents:
http://brasstacks.mozilla.com/orangefactor/?display=BugCount&tree=trunk&includefiltertype=quicksearch&includefilterdetailsexcludeResolved=false&includefilterdetailsexcludeDisabled=false&includefilterdetailsquicksearch=&includefilterdetailsnumbugs=0&includefilterdetailsresolvedIds=&excludefiltertype=quicksearch&excludefilterdetailsquicksearch=&excludefilterdetailsnumbugs=0&excludefilterdetailsresolvedIds=&startday=2015-03-05&endday=2015-03-13

Jonathan

On Fri, Mar 13, 2015 at 5:26 AM, Ted Mielczarek  wrote:

>
> The other question I have is: what percentage of our intermittent
> failures occur on Linux? If it's not that high then this is a lot of
> investment for minimal gain.
>
> -Ted
>
>
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


mochitest-chrome tests now running on B2G emulators

2015-03-23 Thread Jonathan Griffin
A mochitest-chrome job is now running on B2G emulators, and appears in
Treeherder as M(c).  This job skips most existing chrome tests, since most
of the existing tests are not compatible with B2G.  But it provides a
better alternative when writing mochitests that need chrome privileges than
using SpecialPowers in mochitest-plain.

If you want your new chrome mochitest to get run on B2G, just make sure
it's written as an XHTML file and not a XUL one; see
https://dxr.mozilla.org/mozilla-central/source/testing/mochitest/static/chrome.template.txt
.

If you don't want your new chrome mochitest to run on B2G, just add
"skip-if = buildapp == 'b2g'" to the relevant chrome.ini manifest.

To run the tests locally, just add a --chrome argument to your
runtestsb2g.py command-line, as documented here:
https://developer.mozilla.org/en-US/Firefox_OS/Platform/Automated_testing/Mochitests#Running_the_tests_2

Regards,

Jonathan
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: mochitest-chrome tests now running on B2G emulators

2015-03-23 Thread Jonathan Griffin
Thanks to gbrown for doing a lot of the hard work here!

Jonathan

On Mon, Mar 23, 2015 at 4:59 PM, Bobby Holley  wrote:

> Awesome - thank you for making that happen!
>
> All - if you find yourself using SpecialPowers.wrap for anything other
> than twiddling an occasional value or knob, you're doing it wrong. Write a
> mochitest-chrome test instead.
>
> bholley
>
> On Mon, Mar 23, 2015 at 4:38 PM, Jonathan Griffin 
> wrote:
>
>> A mochitest-chrome job is now running on B2G emulators, and appears in
>> Treeherder as M(c).  This job skips most existing chrome tests, since most
>> of the existing tests are not compatible with B2G.  But it provides a
>> better alternative when writing mochitests that need chrome privileges
>> than
>> using SpecialPowers in mochitest-plain.
>>
>> If you want your new chrome mochitest to get run on B2G, just make sure
>> it's written as an XHTML file and not a XUL one; see
>>
>> https://dxr.mozilla.org/mozilla-central/source/testing/mochitest/static/chrome.template.txt
>> .
>>
>> If you don't want your new chrome mochitest to run on B2G, just add
>> "skip-if = buildapp == 'b2g'" to the relevant chrome.ini manifest.
>>
>> To run the tests locally, just add a --chrome argument to your
>> runtestsb2g.py command-line, as documented here:
>>
>> https://developer.mozilla.org/en-US/Firefox_OS/Platform/Automated_testing/Mochitests#Running_the_tests_2
>>
>> Regards,
>>
>> Jonathan
>> ___
>> dev-platform mailing list
>> dev-platform@lists.mozilla.org
>> https://lists.mozilla.org/listinfo/dev-platform
>>
>
>
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: mochitest-chrome tests now running on B2G emulators

2015-03-24 Thread Jonathan Griffin
Hi Panos,

Most (or all) of the tests in that directory were not working on B2G as
written.   They were timing out and causing the harness to abort.  So I
disabled that entire directory in the interests of getting the job running.

I haven't debugged any of those, but you could remove that line from the
manifest and run them on try with the syntax:

-b o -p emulator -u mochitest-chrome -t none

Jonathan


On Tue, Mar 24, 2015 at 4:05 AM, Panos Astithas  wrote:

> This is very cool, I've been waiting for this for a long time!
>
> I see that existing tests are now skipped by default on b2g, e.g.:
>
>
> https://dxr.mozilla.org/mozilla-central/source/toolkit/devtools/server/tests/mochitest/chrome.ini#2
>
> Is this because nobody has tested whether they work yet, or is it because
> they are not working on b2g as they are written? If the latter, do you have
> any insights on what needs to change?
>
> Thanks,
> Panos
>
>
> On Tue, Mar 24, 2015 at 1:38 AM, Jonathan Griffin 
> wrote:
>
>> A mochitest-chrome job is now running on B2G emulators, and appears in
>> Treeherder as M(c).  This job skips most existing chrome tests, since most
>> of the existing tests are not compatible with B2G.  But it provides a
>> better alternative when writing mochitests that need chrome privileges
>> than
>> using SpecialPowers in mochitest-plain.
>>
>> If you want your new chrome mochitest to get run on B2G, just make sure
>> it's written as an XHTML file and not a XUL one; see
>>
>> https://dxr.mozilla.org/mozilla-central/source/testing/mochitest/static/chrome.template.txt
>> .
>>
>> If you don't want your new chrome mochitest to run on B2G, just add
>> "skip-if = buildapp == 'b2g'" to the relevant chrome.ini manifest.
>>
>> To run the tests locally, just add a --chrome argument to your
>> runtestsb2g.py command-line, as documented here:
>>
>> https://developer.mozilla.org/en-US/Firefox_OS/Platform/Automated_testing/Mochitests#Running_the_tests_2
>>
>> Regards,
>>
>> Jonathan
>> ___
>> dev-platform mailing list
>> dev-platform@lists.mozilla.org
>> https://lists.mozilla.org/listinfo/dev-platform
>>
>
>
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


tryserver: the meaning of 'mochitest-chrome' is changing

2015-04-06 Thread Jonathan Griffin
Hi all,

At the next buildbot reconfig, the meaning of the string 'mochitest-chrome'
in tryserver syntax is changing.  Previously, it was an alias which would
be translated into 'mochitest-o', the job in which mochitest-chrome is
run.  Now, it will be interpreted as a job name.

This means if you want to run mochitest-chrome on B2G emulators, you'll be
able to specify "-u mochitest-chrome".  But, if you want to run
mochitest-chrome on desktop Firefox, you'll need to specify the job name
which contains mochitest-chrome, which is "-u mochitest-o".  Specifying "-u
mochitest-chrome" on a desktop job will result in a no-op.

This should have little to no impact; trychooser produces a syntax which is
unaffected by this, and a quick look at try commits shows that no one has
actually used "-u mochitest-chrome" in at least the last two weeks.

See https://bugzilla.mozilla.org/show_bug.cgi?id=1147586 for details.

Jonathan
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Can we make try builds default to a different profile than Nightly/Aurora/Beta/Release builds?

2015-04-08 Thread Jonathan Griffin
There is also an old, unmaintained GUI for managing profiles:
https://developer.mozilla.org/en-US/docs/Profile_Manager

It still works, although there are a few bugs.  It may be an improvement
over command-line arguments for less technical users.

Jonathan

On Wed, Apr 8, 2015 at 12:49 PM, Gavin Sharp  wrote:

> I think you can get this fairly easily by just changing one of the
> values (Vendor or Name) in build/application.ini such that a different
> profile folder is used.
>
> Gavin
>
> On Wed, Apr 8, 2015 at 12:28 PM, L. David Baron  wrote:
> > On Wednesday 2015-04-08 12:08 -0700, Seth Fowler wrote:
> >> I think one way we could reduce the burden on users would be to just
> make try builds default to a different profile than channel builds.
> >
> > Is there a simple patch one could push to change this default, and
> > just include on any try pushes where you need this behavior?
> >
> > I'm a little nervous about making try builds differ from other
> > trees, since that just increases the risk of bustage (or bugs in
> > testing) that shows up in one place but not the other.
> >
> > -David
> >
> > --
> > 𝄞   L. David Baron http://dbaron.org/   𝄂
> > 𝄢   Mozilla  https://www.mozilla.org/   𝄂
> >  Before I built a wall I'd ask to know
> >  What I was walling in or walling out,
> >  And to whom I was like to give offense.
> >- Robert Frost, Mending Wall (1914)
> >
> > ___
> > dev-platform mailing list
> > dev-platform@lists.mozilla.org
> > https://lists.mozilla.org/listinfo/dev-platform
> >
> ___
> dev-platform mailing list
> dev-platform@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-platform
>
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Intent to remove SpecialPowers from Marionette

2015-05-08 Thread Jonathan Griffin
We are removing SpecialPowers from Marionette, see
https://bugzilla.mozilla.org/show_bug.cgi?id=1149618.  This means
Marionette tests will no longer be able to use SpecialPowers to gain access
to a privileged context.

As part of this effort, I'm adapting all Marionette tests in
mozilla-central and gaia, including B2G WebAPI tests and gaiatest.  Other
users of Marionette and SpecialPowers should plan to manage the change
themselves, but if you need assistance, please comment in bug 1149618.

Jonathan
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


The OrangeFactor is now calculated per push

2015-05-15 Thread Jonathan Griffin
OrangeFactor [ http://brasstacks.mozilla.com/orangefactor/ ] now displays
oranges/push; it used to display oranges/testrun.

The definition of testrun was always a bit fuzzy, but was intended to
compensate for the fact that some pushes will naturally have more oranges
than others due to manual or periodic retriggers.  For example, PGO builds
on m-c are periodic, so they may appear zero, one, or multiple times on a
single commit depending on the rate of incoming pushes, and how many times
they're triggered for a particular commit will influence the total number
of oranges that exist for that commit.

At the time OF was originally designed, the assumptions behind our
definition of "testrun" produced a reasonably smooth graph of oranges over
time, but over the past couple of years, these assumptions have broken
down, with the result that the OF has become pretty arbitrary, although it
still is useful as a relative yardstick.

Because of these issues, we've changed OrangeFactor to display the actual
numbers of oranges/push.  This means that the data is more precise, but
also more variable.  In particular, you will notice significant spikes of
the OF on weekends, which reflects both the reduced amount of coalescing
that occurs then as well as the fact that we're more likely to get multiple
triggers of PGO and other builds against a single commit, because the rate
of commits is lower.  This is particularly noticeable when looking at
multi-week time slices (the default view shows only the most recent week).

What this means is that if you're using OrangeFactor to watch trends in
intermittents over time, you should be careful to compare weekdays against
weekdays, and weekends against weekends, or use the 7-day moving average;
comparisons of weekends to weekdays are likely to be misleading.
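
For anyone doing their own trend-watching, the smoothing referred to 
above is just a trailing moving average over the daily oranges/push 
numbers, e.g.:

    # Simple sketch: oranges per push per day, plus a trailing 7-day
    # moving average to smooth out the weekend spikes described above.
    def oranges_per_push(daily_oranges, daily_pushes):
        return [o / float(p) if p else 0.0
                for o, p in zip(daily_oranges, daily_pushes)]

    def moving_average(values, window=7):
        out = []
        for i in range(len(values)):
            chunk = values[max(0, i - window + 1):i + 1]
            out.append(sum(chunk) / float(len(chunk)))
        return out
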
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: New requirement for tier 1 platforms: working assertion stacks

2015-07-10 Thread Jonathan Griffin
For quite some time we've wanted unit tests for our test harnesses which
verify issues like correct end-to-end handling of crashes and hangs,
including generation of proper stack traces.

It's not just a per-platform issue, but a per-harness per-platform issue,
so fixing this across the board isn't trivial.

This has never quite bubbled up to the top of our priority stack, but if
it's been causing lots of pain, maybe we can increase the priority of this.

Jonathan


On Fri, Jul 10, 2015 at 11:46 AM, Ehsan Akhgari 
wrote:

> On 2015-07-10 1:12 PM, Kyle Huey wrote:
>
>> On Fri, Jul 10, 2015 at 10:06 AM, Andrew McCreight <
>> amccrei...@mozilla.com>
>> wrote:
>>
>>  Are we going to have tests for this? Does working include being properly
>>> symbolicated?
>>>
>>>
>> Working does include properly symbolicated.  Tests would be ideal, if
>> tricky ...
>>
>
> Is that practical?  Our stack walking has gotten broken on various tier 1
> platforms in various cases quite a few times, and usually we know about
> them by people noticing them and filing bugs and Ted fixing them.
>
> Is someone signing up to do the work to keep this working on all tier 1
> platforms now?
>
> ___
> dev-platform mailing list
> dev-platform@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-platform
>
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Gecko 17 as the base for B2G v1

2012-08-02 Thread Jonathan Griffin
For a discussion of current B2G test automation status and future plans, 
see this blog post:


http://jagriffin.wordpress.com/2012/07/31/mozilla-a-team-b2g-test-automation-update/

Jonathan

On 8/1/2012 9:30 PM, Boris Zbarsky wrote:

On 8/1/12 5:47 PM, Alex Keybl wrote:

any desktop/mobile change that negatively impacts B2G builds in a
significant way will be backed out (and vice versa).


Do we have any sort of B2G test coverage? Ideally on try?

-Boris

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform