summarizing build-on-checkin feedback

Lots of feedback on the build-on-checkin idea in my blog, the newsgroup, and especially joduinn's recent post on the subject. The primary concerns seem to be:

  • we need as many performance tests per checkin as possible

I've filed bug 410869 to track this. I think the way we do this now is wrong, and we'd get more performance cycles if we fixed this by separating the start time of the test from the revision that the test is for. Also, we should do a separate perf test for each checkin, not just the latest when the perf machine becomes available, to be able to track down regressions to a specific changeset.

  • sometimes the build breaks for non-checkin reasons, and someone needs to be hunted down to correct it if it's build-on-checkin

I think this is mainly a fault of not having adequate monitoring, auto-recovery, and load-balancing of the server farm, and not giving the right people access to force builds directly. bhearsum is rocking the monitoring side in bug 410019 so we'll know as soon as anything goes wrong at the machine level, and Buildbot can do the load-balancing and give developers an interface to force/clobber/stop builds as needed, without having to give everyone in the project a shell account or wait til the next checkin to pick up a CLOBBER file.

  • some people will still be stuck waiting for build cycles, this just moves the problem around

I think this is absolutely a valid concern, and the more I think about it, build-on-checkin isn't really all that valuable until we have multiple buildslaves able to run in parallel, so no one has to wait for the current cycle to finish in order to have their checkin tested. bug 411629 has been filed to track this.

  • CVS commits are not atomic, what if we pull a partial checkin?

Fortunately this goes away when we switch to hg for Moz2, but even for 1.8 and 1.9 branches we poll Bonsai (and can use the revision, aka branch+timestamp) that it contains, instead of just blindly pulling CVS. I don't *think* that Bonsai is susceptible to this kind of thing due to the way it groups checkins before reporting them, but please correct me if this is wrong. Also, isn't this a problem today, since Tinderbox client just blindly picks a timestamp and pulls it?

If I've missed or misrepresented anything, please let me know, and check out the dependency tree on bug 401936 for more information.