rhelmer's blog - mozilla (https://www.rhelmer.org/blog/)

Vectiv and the Browser Monoculture (2019-06-15), Robert Helmer

<p>So, so tired of the "hot take" that having a single browser engine implementation is good, and there is no value to having multiple implementations of a standard. I have a little story to tell about this.</p>
<p>In the late 90s, I worked for a company called Vectiv. There isn't much info on the web (the name has been used by other companies in the meantime), <a class="reference external" href="https://siteselection.com/ssinsider/webpick/wp010716.htm">this old press release</a> is one of the few I can find.</p>
<p>Vectiv was a web-based service for commercial real estate departments doing site selection. This was pretty revolutionary at the time, as the state-of-the-art for most of these was to buy a bunch of paper maps and put them up on the walls, using push-pins to keep track of current and possible store locations.</p>
<p>The story of Vectiv is interesting on its own, but the relevant bit to this story is that it was written for and tested exclusively in IE 5.5 for Windows, as was the style at the time. The once-dominant Netscape browser had plummeted to negligible market share, and was struggling to rewrite Netscape 6 to be based on the open-source Mozilla Suite.</p>
<p>Around this time, Apple was starting to have a resurgence. Steve Jobs had returned, and the candy-colored iMac was proving to be successful. Apple was planning to launch official stores, and the head of their real estate department was a board member of Vectiv, so we managed to land our first deal - a pilot project with Apple's nascent real estate department.</p>
<p>We picked up a few iMacs around the office for testing, and immediately hit a snag - Steve had ordered that everyone in the company, real estate dept included, had to use the new Mac OS X. The iMacs that the dept used (and that we tested on) were pretty slow, but serviceable. The real problem was that our product didn't really work on IE for Mac. Like, at all. Pages wouldn't load, and the browser would consistently crash on certain pages.</p>
<p>This was before Safari and its Webkit engine. We started debugging and rewriting bits of the product, while simultaneously talking to Microsoft about our problems. They were responsive, and hopeful that the upcoming update would fix some of our problems. Sadly, there were to be no further updates for IE 5 for Mac.</p>
<p>I was something of a Unix fanboy at the time, and had been using early releases of Mozilla Suite on my Solaris workstation, so I knew that our product basically worked, with some rough edges (mostly minor things like CSS, with a few less trivial problems around divergent web standards).</p>
<p>Long story short, our QA manager and I visited Apple's real estate and test folks, and we settled on using Mozilla 0.6 for the pilot, and the corresponding Netscape 6 when it was released (I think we ended up using Netscape 7.1, which I recall being a lot more usable, being based on Mozilla 1.4).</p>
<p>Vectiv had other clients like Dollar Tree and Quiznos, but getting over that initial pilot hurdle was key to proving that our product worked and had backing from a known brand. Vectiv was VC-backed and, like many startups caught up in the dot-com crash, ran out of runway, although the product was sold and did live on. I did a few consulting gigs setting up local installs for the remaining clients.</p>
<p>Most people reading this probably know the rest of the story - IE stagnated, AOL pulled the plug on Netscape, and Mozilla Suite was reborn as the Firefox browser. With Microsoft moving to Google Chrome's Blink browser engine, Mozilla Firefox's Gecko engine and Apple Safari's Webkit are the only remaining independent implementations of the various web standards.</p>
<p>(Blink is technically a fork of Webkit, but IE and Netscape were ultimately forks of NCSA Mosaic too, so I think it's fair to call it independent at this point.)</p>
<p>To be clear: having multiple browser engines didn't ultimately save Vectiv, but Firefox did open the door for Safari and Chrome, as Firefox's Firebug (the predecessor of today's integrated devtools) enticed web developers enough that they made their sites more standards-compliant just so they could have access to nice devtools.</p>
<p>It's easy for me to write a nice narrative of the past, complete with the moral of the story. The future isn't totally certain, but it's clear that the web will continue to play a large role in the world. Let's not (again) back ourselves into a corner and cede all meaningful control over that future.</p>
A new owner for add-ons manager (2017-11-15), Robert Helmer

<p>A little over a year ago, Mossop <a class="reference external" href="https://www.oxymoronical.com/blog/2016/08/A-new-owner-for-the-add-ons-manager">announced a change of ownership</a> of
the add-ons manager.</p>
<p>I have been honored to set direction and work on such an important part
of Firefox, and proud of the work I've done. A big part of this was to
help teams go faster in delivering their work to users, and there was
also quite a bit of performance work and review for 57 as well as
changes to better support WebExtensions.</p>
<p>Over this time, it's become clear that the WebExtensions team is more
than equipped to handle ownership of the add-ons manager itself.</p>
<p>In particular, Andrew Swan has been instrumental in setting and communicating
technical direction as well as contributing code and reviews. The add-ons
manager isn't really its own official module as such, but I believe that
Andrew has shown leadership here and would like to publicly pass the torch.</p>
<p>Kris Maglione has also been doing quite exceptional work here, so I think
either of them should be able to take a vacation (but not together) and not
leave a vacuum of technical leadership.</p>
<p>As Mossop did before me, I am going to be updating the suggested reviewers
in Bugzilla to be aswan and kmag, with me as a last resort.</p>
<p>Please join me in congratulating Andrew and sending him all of your add-on
manager related questions!</p>
about:addons in React (2016-11-30), Robert Helmer

<p>While working on tracking down some tricky UI bugs in <cite>about:addons</cite>, I wondered
what it would look like to rewrite it using web technologies. I've been
meaning to learn React (which the Firefox devtools use), and it seems like a
good choice for this kind of application:</p>
<ol class="arabic simple">
<li>easy to create reusable components</li>
</ol>
<blockquote>
XBL is used for this in the current <cite>about:addons</cite>, but this is a non-standard
Mozilla-specific technology that we want to move away from, along with XUL.</blockquote>
<ol class="arabic simple" start="2">
<li>manage state transitions, undo, etc.</li>
</ol>
<blockquote>
There is quite a bit of code in the current <cite>about:addons</cite> implementation
to deal with undoing various actions. React makes it pretty easy to track
this sort of thing through libraries like Redux.</blockquote>
<p>To explore this a bit, I made a simple <a class="reference external" href="https://github.com/rhelmer/aboutaddons/">React version of about:addons</a>. It's
actually <a class="reference external" href="https://addons.mozilla.org/en-US/firefox/addon/about-addons/">installable as a Firefox extension</a> which overrides <cite>about:addons</cite>.</p>
<p>Note that it's just a proof-of-concept and almost certainly buggy - the way
it's hooking into the existing sidebar in <cite>about:addons</cite> needs some work for
instance. I'm also a React newb, so I'm pretty sure I'm doing it wrong. Also,
I've only implemented #1 above so far, as of this writing.</p>
<p>I am finding React pretty easy to work with, and I suspect it'll take
far less code to write something equivalent to the current implementation.</p>
Toy Add-on Manager in Rust (2016-11-30), Robert Helmer

<p>I've been playing with Rust lately, and since I mostly work on the Add-on
Manager these days, I thought I'd combine these into a <a class="reference external" href="https://github.com/rhelmer/AddonManager">toy rust version</a>.</p>
<p>The Add-on Manager in Firefox is written in JavaScript. It uses a lot of
ES6 features, and has "chrome" (as opposed to "content") privileges, which
means that it can access internal Firefox-only APIs to do things like download
and install extensions, themes, and plugins.</p>
<p>One of the core components is a class named <a class="reference external" href="https://dxr.mozilla.org/mozilla-central/rev/8d8846f63b74eb930e48b410730ae088e9bdbee8/toolkit/mozapps/extensions/internal/XPIProvider.jsm#5356-5360">AddonInstall</a> which implements a
state machine to download, verify, and install add-ons. The main purpose of this
toy Rust project so far has been to model the design and see what it looks like.</p>
<p>So far it's mostly an exercise in how awesome Rust's enums are compared to the
JS equivalent (int constants), and how nice match is (versus switch statements).</p>
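<p>To make that concrete, here's a minimal sketch of an enum-driven install state
machine. The state names and transitions below are invented for illustration -
they are not the actual states from AddonInstall:</p>

```rust
// Sketch of an enum-based install state machine. States and transitions
// are illustrative only, not the real AddonInstall ones.
#[derive(Debug, PartialEq, Clone, Copy)]
enum InstallState {
    Downloading,
    Verifying,
    Installing,
    Done,
    Failed,
}

// Advance the state machine given whether the current step succeeded.
// match must be exhaustive: forget a (state, result) combination and the
// compiler rejects the program, whereas a JS switch over int constants
// just falls through silently.
fn next(state: InstallState, ok: bool) -> InstallState {
    match (state, ok) {
        (InstallState::Downloading, true) => InstallState::Verifying,
        (InstallState::Verifying, true) => InstallState::Installing,
        (InstallState::Installing, true) => InstallState::Done,
        (InstallState::Done, _) => InstallState::Done,       // terminal
        (InstallState::Failed, _) => InstallState::Failed,   // terminal
        (_, false) => InstallState::Failed,                  // any step can fail
    }
}

fn main() {
    // Drive a successful install through all three steps.
    let mut state = InstallState::Downloading;
    for step_ok in [true, true, true] {
        state = next(state, step_ok);
    }
    assert_eq!(state, InstallState::Done);
    println!("final state: {:?}", state);
}
```

<p>The nice part is that adding a new variant to the enum immediately produces
compile errors at every match that doesn't handle it, which is exactly the kind
of mistake int constants let slide.</p>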
<p>It's possible to compile the Rust app to a native binary, or alternatively to
asm.js/wasm, so one thing I'd like to try soon is loading a wasm version of
this Rust app inside a Firefox JSM (which is the type of JS module used for
internal Firefox code).</p>
<p>There's a <a class="reference external" href="https://crates.io/crates/webplatform">webplatform crate</a> on crates.io that allows for
easy DOM access; it'd be interesting to see if this works for Firefox
chrome code too.</p>
Better Source Code Browsing With FreeBSD and Mozilla DXR (2014-11-27), Robert Helmer

<p>Lately I've been reading about the <a class="reference external" href="http://www.amazon.com/Design-Implementation-FreeBSD-Operating-Edition/dp/0321968972">design and implementation of the FreeBSD
Operating System</a> (great book, you should read it).</p>
<p>However I find browsing the source code quite painful. Using vim or emacs is
fine for editing individual files, but when you are trying to understand and
browse around a large codebase, dropping to a shell and grepping/finding around
gets old fast. I know about ctags and similar, but I also find editors
uncomfortable for browsing large codebases for an extended amount of time -
web pages tend to be easier on the eyes.</p>
<p>There's an <a class="reference external" href="https://en.wikipedia.org/wiki/LXR_Cross_Referencer">LXR</a> fork called <a class="reference external" href="http://fxr.watson.org/">FXR</a> available, which is way better, and I am
very grateful for it - however, it has all the same shortcomings as LXR that we've
become very familiar with on the Mozilla LXR fork (<a class="reference external" href="http://mxr.mozilla.org/">MXR</a>):</p>
<ul class="simple">
<li>based on regex, not static analysis of the code - sometimes it gets things
wrong, and it doesn't really understand the difference between a variable
with the same name in different files</li>
<li><a class="reference external" href="http://fxr.watson.org/fxr/source/amd64/acpica/acpi_machdep.c?v=FREEBSD10">not particularly easy on the eyes</a> (shallow and easily fixable, I know)</li>
</ul>
<p>I've been an admirer of Mozilla's next gen code browsing tool, <a class="reference external" href="http://dxr.mozilla.org">DXR</a>, for a
long time now. DXR uses a clang plugin to do static analysis of the code,
so it produces the real call graph - this means it doesn't need to guess at the
definition of types or where a variable is used, it <em>knows</em>.</p>
<p>A good example is to contrast a file on MXR with the same file on DXR.
Let's say you wanted to know where <a class="reference external" href="http://dxr.mozilla.org/mozilla-central/source/dom/canvas/CanvasUtils.cpp#42">this macro</a> was first defined, that's
easy in DXR - just click on the word "NS_WARNING" and select "Jump to definition".</p>
<p>Now <a class="reference external" href="http://mxr.mozilla.org/mozilla-central/source/dom/canvas/CanvasUtils.cpp#42">try that on MXR</a> - clicking on "NS_WARNING" instead yields a search which
is <a class="reference external" href="http://mxr.mozilla.org/mozilla-central/ident?i=NS_WARNING">not particularly helpful</a>, since it shows every place in the codebase that
the word "NS_WARNING" appears (note that DXR has the ability to do this same
type of search, in case that's really what you're after).</p>
<p>So that's what DXR is and why it's useful. I got frustrated enough with the
status quo trying to grok the FreeBSD sources that I took a few days and, with
the help of folks in the #static channel on irc.mozilla.org (particularly
Erik Rose), got DXR running on FreeBSD and indexed a tiny part of the source
tree as a proof-of-concept (the source for "/bin/cat"):</p>
<p><a class="reference external" href="http://freebsdxr.rhelmer.org">http://freebsdxr.rhelmer.org</a></p>
<p>This is running on a FreeBSD instance in AWS.</p>
<p>DXR is currently undergoing major changes, the SQLite to ElasticSearch
transition being the central one. I am tracking how to get the "es" branch of
DXR going <a class="reference external" href="https://gist.github.com/rhelmer/60bc81c6cee9c507008a">in this gist</a>.</p>
<p>Currently I am able to get a LINT kernel build indexed on the DXR master branch, but
am still working through issues on the "es" branch.</p>
<p>Overall, I feel like I've learned way more about static analysis, how DXR works,
and the FreeBSD source code, produced some useful patches for Mozilla and the
DXR project, and hopefully will provide a useful resource for the FreeBSD
project, all along the way. Totally worth it; I highly recommend working
with all of the aforementioned :)</p>
Deploying Socorro quickly (2014-06-21), Robert Helmer

<p>I've been seeing a lot more people looking for help and information
about installing and running <a class="reference external" href="https://github.com/mozilla/socorro">Socorro</a> (the software that powers
<a class="reference external" href="https://crash-stats.mozilla.com">crash-stats.mozilla.com</a>).</p>
<p>We've done a lot of work the past few years on making the system
more flexible and are constantly working on improving the documentation,
especially the installation instructions - and the more people who are
able to get the system going, the more <a class="reference external" href="https://github.com/mozilla/socorro/graphs/contributors">contributions</a> we've seen.</p>
<p>Still, the docs have been mostly focused on getting a developer install
for hacking on the system, and less so on installing and upgrading the software
without having to configure and understand every component.</p>
<p>In response to some specific questions on the <a class="reference external" href="https://lists.mozilla.org/listinfo/tools-socorro">mailing list</a> about how to
install and then upgrade Socorro, we've released the deploy script that Mozilla
uses internally (with some modifications to work in a more vanilla environment).</p>
<p>The easiest way to get a system going is to spin up a <a class="reference external" href="http://socorro.readthedocs.org/en/latest/installation/vagrant.html">Vagrant VM</a> and
then follow the "<a class="reference external" href="http://socorro.readthedocs.org/en/latest/installation/install-binary.html">Installing from binary package</a>" instructions.</p>
<p>We also run a <a class="reference external" href="https://ci.mozilla.org/job/socorro-vagrant/">Jenkins bot</a> to ensure that the Vagrant config and deploy script
don't regress.</p>
<p>This is easy enough that it's making our "<a class="reference external" href="http://socorro.readthedocs.org/en/latest/installation/install-src-dev.html">Installing from source</a>"
instructions look quite baroque, so expect those to see some improvements soon
too!</p>
<p>I'd like to give a particular shout-out to <a class="reference external" href="https://twitter.com/jorgenpt">Jørgen P. Tjernø</a> who has been
doing <a class="reference external" href="https://github.com/mozilla/socorro/commits?author=jorgenpt">quite a bit of work</a> to make sure deploys are smooth - thanks
Jørgen!</p>
Etherpad 2013 Meetup Videos on Air Mozilla (2013-04-17), Robert Helmer

<p>I have been working for a few months now on migrating Mozilla's Etherpad
install from the <a class="reference external" href="http://en.wikipedia.org/wiki/Etherpad">original Etherpad</a> to <a class="reference external" href="http://etherpad.org">Etherpad Lite</a>. I got a chance to
work with a fair number of people from the excellent Etherpad community
while working on things like adding "Team Site" support to Etherpad Lite, and
met even more amazing people recently during the 2013 Etherpad meetup.</p>
<p>Videos have just been posted on Air Mozilla:</p>
<iframe frameborder="0" allowfullscreen webkitallowfullscreen mozallowfullscreen msallowfullscreen width="640" height="360" name="vidly-frame" src="https://vid.ly/embeded.html?link=4q5d3n&new=1&autoplay=false&hd=yes"><a target="_blank" href="https://vid.ly/4q5d3n"><img src="https://vid.ly/4q5d3n/poster" /></a></iframe><p><a class="reference external" href="https://air.mozilla.org/etherpad-meetup-part-1">Etherpad 2013 Meetup Part 1</a></p>
<iframe frameborder="0" allowfullscreen webkitallowfullscreen mozallowfullscreen msallowfullscreen width="640" height="360" name="vidly-frame" src="https://vid.ly/embeded.html?link=8s4p1k&new=1&autoplay=false&hd=yes"><a target="_blank" href="https://vid.ly/8s4p1k"><img src="https://vid.ly/8s4p1k/poster" /></a></iframe><p><a class="reference external" href="https://air.mozilla.org/etherpad-meetup-part-2">Etherpad 2013 Meetup Part 2</a></p>
<p>If you are at all interested in:</p>
<ul class="simple">
<li><a class="reference external" href="http://en.wikipedia.org/wiki/Etherpad">original Etherpad</a> (!)</li>
<li><a class="reference external" href="http://etherpad.org">Etherpad Lite</a></li>
<li>upcoming node.js features</li>
<li><a class="reference external" href="http://en.wikipedia.org/wiki/Operational_transformation">operational transformation</a></li>
<li>real-time multi-user wiki(pedia) editing (WYSIWYWIKI?)</li>
<li>hosted etherpad solutions</li>
<li>other cool stuff</li>
</ul>
<p>Then definitely check these out!</p>
capture and replay http post using tcpdump (2012-12-07), Robert Helmer

<p>Mozilla runs a crash-stats service, which accepts crash reports from clients
(mobile/desktop browsers, B2G, etc) and provides a reporting interface.</p>
<p>Recently, a change landed on the client side to enable multiple minidumps to
be attached to an incoming crash, and we want to add support to the server
to accept these as soon as possible.</p>
<p>Our usual test procedure is to pull an existing crash from production and
submit it as a new crash to our dev and staging instances. Unfortunately, we
had no easy way to test this particular scenario, since the current crash
collector only stores a single minidump, and discards any others. We really
want real data in this case - we of course have unit tests and synthetic
data, but the crash collector is a critical service so we want to get it right
the first time when we push updates.</p>
<p>We decided that the most expedient way to get real data would be to capture
from production using tcpdump, then replay this to the dev/staging servers.</p>
<p>There are tools readily available to do this - the major concern is that
we're capturing a large amount of traffic, so we want to filter out as much
as possible. Also, tcpdump has a built-in mechanism for rolling and gzipping
capture files (either every n seconds, or when the file gets over n bytes).</p>
<p>First, run tcpdump on the target (production) server:</p>
<pre class="literal-block">
tcpdump -i eth0 dst port 81 -C 100 -z "gzip" -w output.pcap
</pre>
<p>eth0 is the interface we're interested in, and "dst port 81" restricts the capture
to incoming traffic on port 81. The -C and -z options will cause tcpdump to roll
(and gzip) the output.pcap file every 100 megabytes.</p>
<p>This ends up producing a (potentially large) number of files:</p>
<pre class="literal-block">
output.pcap
output.pcap1.gz
output.pcap2.gz
</pre>
<p>When you feel you've captured enough data, stop the tcpdump process and
use tcpslice to rebuild a single capture file:</p>
<pre class="literal-block">
tcpslice -w full.pcap output.pcap*
</pre>
<p>Then use tcptrace to reassemble the packets into complete sessions (this
is necessary since TCP packets may be received out-of-order).
This will create one file per HTTP session:</p>
<pre class="literal-block">
tcptrace -e full.pcap
</pre>
<p>Now we have a set of files named e.g. fmekmf.dat - if you take a look inside
these you will see they are full HTTP sessions. They can be replayed against
a dev/stage server using netcat like so:</p>
<pre class="literal-block">
cat aaju2aajv_contents.dat | nc devserver 80
</pre>
<p>You may need to modify the files first, to change the Host header for example.
This is easy to do in-place with sed:</p>
<pre class="literal-block">
cat aaju2aajv_contents.dat | sed 's/Host: prodserver/Host: devserver/' | nc devserver 80
</pre>
<p>NOTE - this technique potentially uses a ton of disk space; I did this in
many stages so I could backtrack in case I made any mistakes. If disk space
(and overall time) are at a premium - for example, if you are setting up a continuous
pipeline - I'd investigate using named pipes instead of creating actual files
for uncompressing and running tcpslice + tcptrace.</p>
<p>Also, if you are doing this in a one-off manner then tcpflow or wireshark
(wireshark has a terminal version, tshark) are easier to work with - I wanted
to do the capture on a locked-down server which had tcpdump available, and
wanted to take advantage of the log rolling+compression feature.</p>
webkit using Perf-o-Matic 2.x (2012-03-01), Robert Helmer

<p>Thanks to the massive efforts of <a class="reference external" href="https://plus.google.com/u/0/105748986001435560355/">Ryosuke Niwa</a> (rniwa), the WebKit
project now uses the code from <a class="reference external" href="http://graphs.mozilla.org">graphs.mozilla.org</a>, check it out:</p>
<p><a class="reference external" href="http://webkit-perf.appspot.com">webkit-perf.appspot.com</a></p>
<p>We share all the same front-end code, the major differences are that
they have their own <a class="reference external" href="http://trac.webkit.org/browser/trunk/Websites/webkit-perf.appspot.com">backend graphserver</a> and the static dashboard
images are generated using the Google Charts API instead of node.js
(makes sense, since their server runs on Google App Engine).</p>
<p>rniwa has been doing fantastic work and contributing tons of great
features (while refactoring the code base appropriately), and he started
working on this at an excellent time - just as we kicked off the <a class="reference external" href="https://wiki.mozilla.org/Auto-tools/Projects/Signal_From_Noise">Signal
From Noise</a> project which is leading the way to another major evolution
of our work on measuring and tracking performance for important Mozilla
projects like Firefox, Fennec and B2G.</p>
hacking on graphs 2.0 is fun and easy (2011-05-24), Robert Helmer

<p>Interested in adding features/fixing bugs/using your own data with
<a class="reference external" href="http://blog.mozilla.com/webdev/2011/02/04/perfomatic2-0/">perf-o-matic 2.0</a>? It's easy!</p>
<pre class="literal-block">
git clone git://github.com/rhelmer/graphs.git
open graphs/graph.html  # in your favorite browser; on Mac "open" will do the right thing
</pre>
<p>You can now hack on graph.html, js/common.js and js/graph-2.js (maybe
js/embed.js and js/dashboard.js, if you're working on the embed or
dashboard components). You'll be pulling live data from
graphs-new.mozilla.org by default.</p>
<p>For most cases, that's it! If you need more, read on:</p>
<p><strong>What about the dashboard? I loaded index.html but there are no
graphs!</strong></p>
<p>No problem; it's all in the INSTALL file, but here's the tl;dr version:</p>
<p>These images are generated by running node.js from cron, doing
server-side HTML5 canvas and saving the result to a static image (PNG).</p>
<p>You need to install <a class="reference external" href="http://nodejs.org/">node.js</a> and <a class="reference external" href="http://npmjs.org/">npm</a>, then:</p>
<pre class="literal-block">
npm install canvas htmlparser jquery jsdom
mkdir images/dashboard
node ./scripts/static_graphs.js
</pre>
<p>You should now have static graph images in ./images/dashboard/ and
index.html should look healthier.</p>
<p><strong>But I want to run the backend server, so I can post my own
results!</strong></p>
<p>Check out the INSTALL file; it has an example apache config and lists
the dependencies you'll need to install (note - only tested on RHEL 6,
will accept patches/pull requests if you get it running elsewhere
though).</p>
<p><strong>Ok, but I have my own backend server; can't I just provide my own
JSON feed?</strong></p>
<p>Yes! The manifest file (for building the menu on the "Custom Chart"
page) looks like
<a class="reference external" href="http://graphs-new.mozilla.org/api/test?attribute=short">http://graphs-new.mozilla.org/api/test?attribute=short</a> and the
individual test runs look like
<a class="reference external" href="http://graphs-new.mozilla.org/api/test/runs?id=16&branchid=1&platformid=12">http://graphs-new.mozilla.org/api/test/runs?id=16&branchid=1&platformid=12</a></p>
<p><strong>Ok! But I fixed/added/rewrote something, how can I send a patch?</strong></p>
<p>Excellent! Send me (<a class="reference external" href="http://github.com/rhelmer">http://github.com/rhelmer</a>) a pull request, or
file a bug at bugzilla in <a class="reference external" href="https://bugzilla.mozilla.org/enter_bug.cgi?product=Webtools&component=Graph%20Server&version=2.0">product Webtools component Graphserver
version 2.0</a>, and thanks for contributing!</p>
production graphs 2.0 server ready for use (2011-05-23), Robert Helmer

<p>Hello,</p>
<p>The 2.0 version of graphs.mozilla.org is ready for use:</p>
<p><a class="reference external" href="http://graphs-new.mozilla.org">http://graphs-new.mozilla.org</a></p>
<p>We're not quite ready to take over graphs.mozilla.org yet - the plan is
to do a phased rollout starting with this post, followed by advertising
the new URL on graphs.m.o, and finally taking over graphs.m.o and moving
the old server to graphs-old.m.o</p>
<p>This is the same version (with some minor tweaks based on feedback) as
described in this webdev blog post:</p>
<p><a class="reference external" href="http://blog.mozilla.com/webdev/2011/02/04/perfomatic2-0/">http://blog.mozilla.com/webdev/2011/02/04/perfomatic2-0/</a></p>
<p>The primary difference between this and the staging server (now at
graphs.allizom.org) is that graphs-new.m.o has realtime access to the
production DB rather than using a nightly snapshot. The dashboard images
are refreshed on 5-minute intervals, and custom charts are pretty much
real-time (though with several layers of caching they could be a few
minutes old in reality).</p>
<p>Thanks to everyone who has tested and provided feedback! More is
welcome, we plan to continue making incremental (and perhaps
not-so-incremental) improvements.</p>
<p>You can find more information at
<a class="reference external" href="https://wiki.mozilla.org/Perfomatic:UI">https://wiki.mozilla.org/Perfomatic:UI</a></p>
<p>Thanks!</p>
<p>rhelmer</p>
<p>P.S. one thing I should call out specifically - old-style graph URLs are
not compatible, primarily because the new graphserver automatically
shows the average of all machines in a platform rather than a separate
line for each, and the old-style URLs refer to individual machines. If
this is a show-stopper for anyone let's discuss, it's certainly in the
realm of possibility to support.</p>
Socorro development VMs available2011-05-18T12:03:00-07:002011-05-18T12:03:00-07:00Robert Helmertag:www.rhelmer.org,2011-05-18:/blog/socorro-development-vms-available.html<p>I have been working on a Vagrant virtual machine config for Socorro:</p>
<p><a class="reference external" href="https://github.com/rhelmer/socorro-vagrant">https://github.com/rhelmer/socorro-vagrant</a></p>
<p>Vagrant (<a class="reference external" href="http://vagrantup.com/">http://vagrantup.com/</a>) is a tool to automate setup of the
VirtualBox VMs. It uses puppet to set up and maintain the VM (the puppet
manifests are based on what we use at Mozilla for staging and
production). Puppet will install and set up all dependencies such as
HBase, Postgres, etc. and make sure the latest Socorro trunk is
installed and configured for your dev environment.</p>
<p>This is still a work-in-progress and I am hoping to get continuous
integration up soon (which will have the side-effect of generating
downloadable VM appliances!), so in the meantime please let me know how
it goes if you try it out.</p>
perf-o-matic 2.0 news2011-03-23T20:18:00-07:002011-03-23T20:18:00-07:00Robert Helmertag:www.rhelmer.org,2011-03-23:/blog/perf-o-matic-20-news.html<p>First of all, thank you for all the feedback for the <a class="reference external" href="http://blog.mozilla.com/webdev/2011/02/04/perfomatic2-0/">new perf-o-matic
2.0 interface</a>! It has been overwhelmingly positive and is utterly
invaluable.</p>
<p>To avoid the time and expense of having to wheedle information out of
me, here are the answers to some frequently asked questions:</p>
<ul class="simple">
<li>the data on staging updates every night from production, starting at
00:01 Pacific. This can take a few hours, so expect data to be spotty
until 3:00 or so.</li>
<li>staging has been moved to <a class="reference external" href="http://graphs.allizom.org">graphs.allizom.org</a> (the old advertised
link will redirect)</li>
<li>Still using <a class="reference external" href="https://wiki.mozilla.org/Perfomatic:UI">wiki.m.o/Perfomatic:UI</a> for tracking high-level issues,
but have started moving over to bugzilla and also triaging open bugs
targeted for the older (current) version</li>
<li>production machine acquired (<a class="reference external" href="https://bugzilla.mozilla.org/show_bug.cgi?id=627446">bug 627446</a>), waiting on database
access (<a class="reference external" href="https://bugzilla.mozilla.org/show_bug.cgi?id=642258">bug 642258</a>)</li>
</ul>
<p>Current plan is to have the new production server hosted at
graphs-new.m.o, which will have read-only access to production at first
(I'd like to get some good automated tests in place before we start
receiving data, since outages can mean tree closure).</p>
<p>Any thoughts? Feel free to comment here, ping me in irc, or <a class="reference external" href="https://bugzilla.mozilla.org/enter_bug.cgi?product=Webtools&component=Graph%20Server&version=2.0">file a bug
in product Webtools component Graphserver version 2.0</a></p>
gzip-encoding on tinderbox-stage needs testing2010-06-29T11:00:00-07:002010-06-29T11:00:00-07:00Robert Helmertag:www.rhelmer.org,2010-06-29:/blog/gzip-encoding-on-tinderbox-stage-needs-testing.html<p><a class="reference external" href="https://bugzilla.mozilla.org/show_bug.cgi?id=574524">Bug 574524</a> should make loading pages from Tinderbox much faster,
especially the brief and full log reports. If you use Tinderbox and are
interested in faster load times, please help test <a class="reference external" href="http://tinderbox-stage.mozilla.org/Firefox/">tinderbox-stage</a> and
comment in the bug if you think anything is broken due to this change.</p>
canvas love2009-02-18T23:39:00-08:002009-02-18T23:39:00-08:00Robert Helmertag:www.rhelmer.org,2009-02-18:/blog/canvas-love.html<p>I am reading about <a class="reference external" href="https://bespin.mozilla.com/">Bespin</a> all over the place, with a lot of focus on
SVG versus canvas, canvas not working in Internet Explorer, etc.</p>
<p>I don't know Bespin's plans in this area, but lots of projects which use
canvas (such as <a class="reference external" href="http://code.google.com/p/flot/">flot</a>) also test with and provide <a class="reference external" href="http://excanvas.sourceforge.net/">excanvas</a>, which
uses IE's <a class="reference external" href="http://en.wikipedia.org/wiki/Vector_Markup_Language">VML</a> support to provide the basic canvas API. I have read
that excanvas does not work in IE8's standards mode, however it does
work in quirks mode.</p>
<p>There seems to be lots of little explosions of creativity around the
combination of faster Javascript interpreters and canvas, like these
<a class="reference external" href="http://gyu.que.jp/jscloth/touch.html">"3D in 2D" demos</a>, <a class="reference external" href="http://box2d-js.sourceforge.net/">Box2D physics</a> (this works in IE8 thanks to
excanvas).</p>
<p>I've been working on a project which does graphs and other data
visualization in the browser. I ended up using jquery and flot although
<a class="reference external" href="http://raphaeljs.com/">raphael</a> (which uses SVG or VML, so supports IE) was in the running as
well. Working with raphael is neat because everything you create is a
DOM object so it's a lot like working with HTML, but in the end flot
just has many more out-of-the-box features like selection support,
timescales, and so on. Not having IE support is not an option, and I'd
rather not depend on Flash or another plugin if at all possible; I am
quite pleased that there are a ton of reasonable ways to achieve this
given those constraints.</p>
<p>I know this stuff is obvious to most of us around here, but I'm
surprised that excanvas doesn't come up more in these discussions. It is
obviously not as ideal as having honest-to-goodness canvas or SVG
support in all major browsers, but it's a very creative way to drag IE
along, putting a Javascript wrapper around their similar-but-different
native feature.</p>
Tinderbox2009-02-15T13:50:00-08:002009-02-15T13:50:00-08:00Robert Helmertag:www.rhelmer.org,2009-02-15:/blog/tinderbox.html<p>I have been meaning to respond to a bit of <a class="reference external" href="http://drkscrtlv.livejournal.com/302915.html#tinderbox">Aki's post</a> which <a class="reference external" href="http://roberthelmer.com/blog/?cat=9">linked
to me</a> a while back.</p>
<p>I totally agree on quite a bit, although I'd argue that unless someone
really steps up, takes a leadership role, and sets a clear future
direction, then sticking with Tinderbox indefinitely is going to
continue to give you diminishing returns. Tinderbox 1 has been in
maintenance mode for a very long time, although cls, bear and reed do a
great job of keeping it secure and limping along. Tinderbox 2 was
maintained by bear for a while when he was at OSAF but he suggested
Buildbot as a better alternative, and Tinderbox 3 looks like a great
proof of concept but has been inactive for a very long time.</p>
<p>I feel that it's better to contribute to an already active community
that has a lot of momentum behind it, instead of trying to build support
behind home-grown products like Tinderbox and Bonsai, given the amount
of work it is to build and maintain an active community and the current
state of these projects. There were no active competing projects when
these tools were released, and they really set the bar at a time when
"continuous integration" had yet to be coined. Overall they've been
hugely successful and delivered a lot of value to Mozilla and others,
but without a driving force behind new development, they are not keeping
up with demand. I could give you a bunch of little examples, but I think
that the fact that the "blame" column (which is a critical feature) has
been empty since the switch to hg says it all.</p>
<blockquote>
<p>rhelmer covered <a class="reference external" href="../?cat=9">the current tinderbox/buildbot split</a>, and is
among the voices I've heard/read calling for a move away from the
waterfall view, which I don't completely understand. I do understand
that the waterfall is far from ideal as a solitary view. But it does
represent the activity of builds and build machines over a brief
amount of time quite well. Even better when you have a guilty column
;-)</p>
<p>So, why not have both? Or multiple? Not to clutter, but to present
different ways of accessing the data. Each with their own strengths.</p>
</blockquote>
<p>I don't think that the waterfall is bad, it is actually quite brilliant
for certain use cases; however the waterfall is at one end of the
spectrum, with something like Dolske's <a class="reference external" href="http://isthetreegreen.com/">isthetreegreen.com</a> on the
other side, and things like <a class="reference external" href="http://tests.themasta.com/tinderboxpushlog/">tinderboxpushlog</a> somewhere in the middle.
So in essence I agree, but I think the waterfall is actually not that
useful in most cases. It's a pretty low-level, diagnostic type of
interface.</p>
<p>Why do people visit <a class="reference external" href="http://tinderbox.mozilla.org">Tinderbox</a>? Here is what I think:</p>
<ol class="arabic simple">
<li>Should I pull the tree ("Will It Build?")</li>
<li>Can I check in ("Is the tree open?")</li>
<li>Who broke the build (and how)?</li>
<li>Has there been a regression in performance or other metrics?</li>
</ol>
<p>Out of these, only the latter two are served by the waterfall, and
that's only a starting point for this kind of investigation (which the
waterfall does an OK job at).</p>
<p>I think that the first two are a much larger subset of users, and a huge
and complex display is actively hurting them. Regression hunters need a
much larger arsenal of tools, and the waterfall may not be the best
place for them to start, and certainly isn't the last place to visit
(they'll need build logs, graphs, etc.).</p>
<p>There's a ton of innovation going on around build and release right now,
for example I really like how <a class="reference external" href="https://hudson.dev.java.net/">Hudson</a> approaches the problems here,
and also has direct support for release processes. Like Buildbot, it
doesn't do everything Tinderbox does, and it has its own tradeoffs.
It's not a drop-in replacement for Tinderbox.</p>
<p>A drop-in replacement for Tinderbox is an interesting notion, but I
think it's worth taking a step back and figuring out if you're really
getting the value you could be. I think <a class="reference external" href="http://sethgodin.typepad.com/seths_blog/2009/02/solving-a-different-problem.html">this</a> says it better than I
can:</p>
<blockquote>
<p>The telephone destroyed the telegraph.</p>
<p>Here's why people liked the telegraph: It was universal,
inexpensive, asynchronous and it left a paper trail.</p>
<p>The telephone offered not one of these four attributes. It was far
from universal, and if someone didn't have a phone, you couldn't
call them. It was expensive, even before someone called you. It was
synchronous--if you weren't home, no call got made. And of course,
there was no paper trail.</p>
<p>If the telephone guys had set out to make something that did what
the telegraph does, but better, they probably would have failed.
Instead, they solved a different problem, in such an overwhelmingly
useful way that they eliminated the feature set of the competition.</p>
<p>The list of examples is long (YouTube vs. television, web vs.
newspapers, Nike vs. sneakers). Your turn.</p>
</blockquote>
making updates easier2008-07-30T13:37:00-07:002008-07-30T13:37:00-07:00Robert Helmertag:www.rhelmer.org,2008-07-30:/blog/making-updates-easier.html<p>For a few months now, I've been working in my spare time on a way to
make configuring and serving updates to Mozilla-based applications
easier.</p>
<p>Mozilla updates are <a class="reference external" href="http://wiki.mozilla.org/Software_Update:MAR">MAR</a> files, which are linked to by the <a class="reference external" href="http://wiki.mozilla.org/AUS">Automatic
Update Service</a> (aka AUS2). Several tools are involved in the making of
updates for production releases, chiefly <a class="reference external" href="http://mxr.mozilla.org/mozilla/source/tools/patcher/">Patcher</a>, driven by the
<a class="reference external" href="http://wiki.mozilla.org/Build:Release_Automation">release automation framework</a> for releases. Nightly updates use a
simpler script which automatically determines where builds should be
updated to; Patcher needs every update path to be explicitly specified
in its config file.</p>
<p>Both Patcher and the nightly script call the <a class="reference external" href="http://mxr.mozilla.org/mozilla/source/tools/update-packaging/">update-packaging tools</a>
to do the work of generating MAR files, which in turn use the "mar"
utility (supports tar-like arguments to manipulate MAR files, e.g. "mar
-t file.mar", "mar -x file.mar", etc.) and the "mbsdiff" utility, which
generates binary patches using a <a class="reference external" href="http://mxr.mozilla.org/mozilla/source/other-licenses/bsdiff/">modified version</a> of <a class="reference external" href="http://www.daemonology.net/bsdiff/">bsdiff</a>.</p>
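<p>As a rough illustration, a thin Python wrapper around the "mar" utility might look like this (the -t and -x flags are from above; the -c flag and the helper names are assumptions for the sketch):</p>

```python
import subprocess

def mar_command(action, mar_file, *files):
    """Build an argument list for the "mar" utility, which takes
    tar-like flags: -t to list, -x to extract, -c to create (the -c
    flag is an assumption; -t and -x are mentioned above)."""
    flags = {"list": "-t", "extract": "-x", "create": "-c"}
    return ["mar", flags[action], mar_file, *files]

def run_mar(action, mar_file, *files):
    # Requires the mar binary on PATH; shown for illustration only.
    return subprocess.run(mar_command(action, mar_file, *files),
                          capture_output=True, text=True)
```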
<p>The update-packaging tools are in <a class="reference external" href="https://bugzilla.mozilla.org/show_bug.cgi?id=444050">need of a makeover</a> too, but that is
a story for another day.</p>
<p>Getting back to how updates are served - Patcher's other job is to
generate thousands of text files, which are used to configure AUS. Every
possible update path, like <a class="reference external" href="https://aus2.mozilla.org/update/3/Firefox/3.0b3/2008020514/WINNT_x86-msvc/ar/beta/update.xml">this one for 3.0b3</a>, is actually generated
dynamically from two text files (partial.txt and complete.txt) which
reside in a directory layout similar to, but in a slightly different
order than, the information in that URL
(.../product/version/buildid/buildTarget/locale/channel/update.xml).
These complete.txt and partial.txt files have gone through two revisions
of their file format: in the first, variables for the generated XML
(updateType, URL to the MAR file, etc.) appear on specific line numbers;
in the second ("version=1"), key/value pairs are used.</p>
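<p>Unpacking the components of an update URL is straightforward; here is a minimal Python sketch assuming the path shape described above:</p>

```python
from urllib.parse import urlsplit

# Path order taken from the URL above:
# .../update/<schema-version>/product/version/buildid/buildTarget/locale/channel/update.xml
FIELDS = ("product", "version", "buildid", "build_target", "locale", "channel")

def parse_aus_url(url):
    """Split an AUS update URL into its named path components."""
    parts = urlsplit(url).path.strip("/").split("/")
    # Drop the leading "update/<schema-version>" prefix and the
    # trailing "update.xml" file name.
    assert parts[0] == "update" and parts[-1] == "update.xml"
    return dict(zip(FIELDS, parts[2:-1]))
```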
<p>AUS2 configuration files only reflect the current state of the system;
for releases the history is in <a class="reference external" href="http://mxr.mozilla.org/mozilla/source/tools/patcher-configs/moz19-branch-patcher2.cfg">Patcher config files</a>
(Config::General). The release automation scripts automatically update
and check this file into CVS, so it's not too painful to deal with in
most situations. There are some <a class="reference external" href="https://bugzilla.mozilla.org/show_bug.cgi?id=384065">outstanding bugs</a> but overall it does
what it is supposed to do.</p>
<p>However, it took me a very long time to get a handle on the above, and I
think the separation between Patcher and the AUS server is not very
useful. In fact, the method of explicit updates for all is downright
unhelpful; with every single release (e.g. 2.0.0.15), the following happens:</p>
<ol class="arabic simple">
<li>partial updates are generated from 2.0.0.14->2.0.0.15</li>
<li>every previous release (2.0.0.[1,2,3,4,...]) is pointed to the same
2.0.0.15 update</li>
</ol>
<p>That means generating and publishing two text files for each (release *
platform * locale) combination, which all contain exactly the same
data. Also I think that taking a hint from the way the nightly system
works would be useful here; 2.x should automatically point to the latest
*unless* explicitly overridden, it should not require explicit
configuration to do the norm. Finally, the nightly and production system
should not be so different; every nightly update is a lost opportunity
to test pre-releases of the production system, and having forked systems
is bad for bugfixing and feature porting (note that there are no nightly
updates for locales other than en-US, for example).</p>
<p>So, I've been thinking for a long time about how to make tools that are
easier to use, understand and extend. One idea is to have the AUS server
configuration be a database, not a giant tree of text files, and have
the data in one place (not stored in a config file which is expanded to
a giant tree of text files by a separate app). Another is to provide a
simple API, and a few command line tools which use this API to modify
update data and export it.</p>
<p>The conceptual model right now is that each release contains one update,
which contains two patches (one partial, one complete). Both the
database schema and the API reinforce this model.</p>
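<p>Sketched in Python, that model might look like this (the class and field names are illustrative, not the actual aus.py classes):</p>

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Patch:
    patch_type: str  # "partial" or "complete"
    url: str         # where the MAR file is served from
    size: int = 0

@dataclass
class Update:
    update_type: str  # e.g. "minor"
    patches: List[Patch] = field(default_factory=list)

@dataclass
class Release:
    product: str
    version: str
    # each release contains one update, which contains two patches
    update: Optional[Update] = None
```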
<p>Here's what I have working so far. In case it's not obvious, this is
most definitely an early "throw the first one away" prototype:</p>
<ul class="simple">
<li>an <a class="reference external" href="http://roberthelmer.com/svn/scripts/trunk/aus/aus.py">API</a> for dealing with updates, in Python (Release, Update, Patch
classes)</li>
<li>a <a class="reference external" href="http://roberthelmer.com/svn/scripts/trunk/aus/database.py">simple database layer</a> for storing and retrieving these objects
from a MySQL database</li>
<li>an <a class="reference external" href="http://roberthelmer.com/svn/scripts/trunk/aus/input/files.py">import plugin for AUS2 configuration</a>, and an <a class="reference external" href="http://roberthelmer.com/svn/scripts/trunk/aus/output/files.py">export plugin to
straight update.xml</a> files</li>
</ul>
<p>The schema is based on <a class="reference external" href="http://svn.mozilla.org/projects/aus/trunk/sql/aus.sql">Lars' fine work</a> on the <a class="reference external" href="http://wiki.mozilla.org/AUS:v3">subject</a>, although I
did <a class="reference external" href="http://roberthelmer.com/svn/scripts/trunk/db/aus.sql">modify it slightly</a>. This schema is not totally done yet either,
for example foreign keys aren't actually hooked up, but there's enough
there to see that it works. There's a <a class="reference external" href="http://roberthelmer.com/svn/scripts/trunk/aus/run.py">run.py</a> command in that
directory that calls the importer and exporter correctly.</p>
<p>This means that you can read existing AUS2 data into a database (if you
have it), and create or manipulate update information using the API from
Python (or directly with SQL, if you like). You can generate update.xml
files and put them straight onto a webserver.</p>
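<p>For instance, an export plugin could render an update.xml along these lines (the attribute names are modeled loosely on what AUS serves and should be treated as an approximation, not the exact schema):</p>

```python
import xml.etree.ElementTree as ET

def render_update_xml(update_type, version, patches):
    """Render a minimal update.xml document. Attribute names here
    approximate the AUS output; verify against the real schema."""
    updates = ET.Element("updates")
    update = ET.SubElement(updates, "update",
                           type=update_type, version=version)
    for p in patches:
        ET.SubElement(update, "patch", type=p["type"], URL=p["url"],
                      hashFunction="sha512", hashValue=p["hash"],
                      size=str(p["size"]))
    return ET.tostring(updates, encoding="unicode")
```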
<p>What I've put together needs quite a lot more work, but I wanted to open
it up for comment. Here's what I think is remaining, at least:</p>
<ul class="simple">
<li>database should hold the history of updates, not just the current
state</li>
<li>need a web service which talks directly to the database, as an
alternative to pre-generating all update.xml files.</li>
<li>should use existing libs for the DB ORM (SQLAlchemy maybe?),
generating XML, etc. not the home-grown things I threw together</li>
<li>I think it would be advantageous to make the model/schema/API more
sophisticated and normalized (e.g. updates could belong in multiple
channels), but I don't want to go beyond the essentials quite yet.</li>
<li>the new update-packaging tools should be able to read data from this
system in order to automatically determine the appropriate "from"
release to base partial MARs on, and also there should be some way to
register that new updates are available, that access would be
internal and append-only (e.g. only needs SELECT, INSERT).</li>
</ul>
<p>I think that to solve the first, update paths should be explicitly
configured once, but there needs to be business logic in the server app
(or update.xml file generator) which overrides this when a newer release
is available. For instance, if a user is on version 1.0 and version 1.1
is available which has a partial for 1.0, then the partial 1.0->1.1
should be served. However, if version 1.2 is available, then the
complete 1.0->1.2 update should be served.</p>
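<p>That business logic is simple enough to sketch directly (a toy version: it assumes dotted-integer version strings, and a hypothetical mapping from each release to the set of versions its partial MAR can upgrade from):</p>

```python
def choose_update(current, releases):
    """Pick the update to serve, per the rule above: always serve the
    newest release, using its partial if one exists for the user's
    current version and its complete otherwise. "releases" maps
    version -> set of versions its partial MAR upgrades from."""
    key = lambda v: tuple(int(x) for x in v.split("."))
    newest = max(releases, key=key)
    if key(newest) <= key(current):
        return None  # already up to date
    if current in releases[newest]:
        return ("partial", newest)
    return ("complete", newest)
```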
<p>The second problem has more to do with the burden inherent in handling
tens of thousands of text files (e.g. backing them up or restoring them
can take a very long time), although I believe that it is useful to have
the option to pregenerate the path/update.xml files, especially for
people not pushing as many updates as mozilla.org does each release.</p>
<p>Anyway, comments welcome! Certainly feel free to nudge me if it looks
like I'm going off the rails here, but I think this approach could make
things a little better in update-land. I'll take patches too, but if
anything serious comes of this I'll probably clean up and move over to
Mozilla's repo, and rewrite a bunch, so don't take the current
implementation too seriously..</p>
tinderbox json examples back online2008-07-15T12:17:00-07:002008-07-15T12:17:00-07:00Robert Helmertag:www.rhelmer.org,2008-07-15:/blog/tinderbox-json-examples-back-online.html<p>Thanks to the intrepid Mozilla IT Team (in particular Trevor and Justin)
for sending me the contents of people.mozilla.com/~rhelmer, I now have
the Tinderbox JSON examples back online.</p>
<p>Since it's on my own server now and I have to pay for the bandwidth, I
am not auto-refreshing the data anymore, because I don't want people
actually using it :) Maybe I can hook up some kind of access to a
Mozilla community server, I'll look into this later.</p>
<p>Here is the <a class="reference external" href="http://roberthelmer.com/mozilla/mockups/tinderbox/ajax.html">AJAX example</a>, which apparently still works :). The <a class="reference external" href="http://roberthelmer.com/mozilla/mockups/perf/">Perf
example</a> which uses the tboxJsonApi is apparently borken :( I did a
little debugging on it last night; not sure where it's breaking yet, but
it's probably the assumptions that my lame-o regex parsers use.</p>
<p>Anyway, I know that at least Cesar is working on stuff that uses this
data, and I'd like to continue to <a class="reference external" href="https://bugzilla.mozilla.org/show_bug.cgi?id=445041">make it better</a> so file bugs.</p>
releases on tap2008-07-10T23:54:00-07:002008-07-10T23:54:00-07:00Robert Helmertag:www.rhelmer.org,2008-07-10:/blog/releases-on-tap.html<p>One of the things that was pounded into me while working at MoCo is the
idea of having a bug tracker and using it. I literally can't work
without one anymore. It's the first thing I really pushed for at my new
job (they were using various ad-hoc systems for project management, but
not a real bug tracker for the software dev side). I've realized that I
just can't keep everything in my head, various notepads and text files,
etc. and expect to get anything done, or let anyone know what my
priorities are.</p>
<p>In return, I really tried to hammer in the idea of fast, automated
release cycles. We spent a lot of time (and the release engineering team
does still spend a lot of time) wrapping the build system and other
tools so that they can be run and the output verified automatically,
chasing that ideal of the <a class="reference external" href="http://www.formula1.com/news/features/2008/7/8015.html">Formula One-style hand-off</a> to QA and to the
users.</p>
<p>The way releases work now is incredible, just night and day from when I
started at MoCo a little over two years ago. However, there's one thing
that's always bugged me, and since I just had the opportunity to set up
an automated build/release environment, I thought I'd expound a little
bit on it.</p>
<p>The one thing is that nightly builds of Firefox just aren't the same as
the release builds. The way updates work is different, branding is
turned on, bits are signed (on Windows), the directory structure for
files is different. Firefox releases are actually rebuilt from source
for each release.</p>
<p>So what? None of these, even added up, are a big deal, right? Obviously
releases work fine, and there are a ton of great people (and the tools
they've made) that make sure that nothing is missed because of this. But
wouldn't it be great if we could just take the nightly updates and
builds that have already been put through the wringer by thousands of
people, and give those straight to QA? Or if we can't have that, how
about at least have the release builds put through the same tests and
available to QA immediately after checkin?</p>
<p>Am I pushing some fanciful, architecture-astronaut utopian vision? I
don't think so, because this is how I've done releases in the past, and
this is how I do releases now. Let me tell you about it.</p>
<p>I use <a class="reference external" href="https://hudson.dev.java.net/">Hudson</a>, which I can't recommend highly enough (well, if you're
not allergic to Java, I guess). It makes this kind of process easy. It's
not necessary to use it to achieve this of course, I'm just throwing
this out as a data point.</p>
<p>On each checkin:</p>
<ul class="simple">
<li>a unique build number is generated</li>
<li>a new build is generated (I also have it run unit tests, and install
the software to run functional tests)</li>
<li>release files and other artifacts like build logs are archived, and
checksums of the files are stored</li>
<li>if anything goes wrong, the team and the developer who checked in the
latest change are notified</li>
</ul>
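<p>The checksum step, for example, is only a few lines of Python (a sketch with illustrative names, not Hudson's actual implementation):</p>

```python
import hashlib
import pathlib

def checksum_artifacts(paths):
    """Record SHA-1 checksums for archived build artifacts, one step
    of the per-checkin pipeline sketched above."""
    sums = {}
    for p in map(pathlib.Path, paths):
        sums[p.name] = hashlib.sha1(p.read_bytes()).hexdigest()
    return sums
```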
<p>The software is available to QA as soon as this automated process is
complete. When it's time to release, I can tag the build via the web UI
(although it's easy enough to do outside of Hudson if you have the build
number, which in turn contains the branch/datestamp/revision info
needed).</p>
<p>Having the next release always "on tap" makes it easy for me to largely
ignore the build/release side of things, and focus on developing
software, writing tests, and tracking down problems.</p>
<p>Now, Mozilla's situation is way more complicated, which I alluded to a
bit earlier. This post isn't a "see what I can do!" rant as much as a
"look what's possible!" idea. I think that this kind of setup is totally
doable for Mozilla's products, but there are some serious issues:</p>
<ul class="simple">
<li>branding is turned on at compile time. having nightly builds not
called "Firefox" is a *good* thing, as otherwise end-users would be
very confused.</li>
<li>"--enable-tests", needed for unit tests, cannot be run in release
builds at the moment (for technical reasons outside the scope of this
post; I'm sure there are bugs on this)</li>
<li>release builds are signed and have a different filename format and
directory structure (e.g. "firefox-3.0.pre.en-US.win32.installer.exe"
for nightly versus "3.0/win32/en-US/Firefox Setup 3.0.exe")</li>
<li>release builds are cryptographically signed, to assure users that
these files really were created by MoCo (regardless of what mirror or
download site they may have come from).</li>
<li>nightly updates are only for en-US, and use a different set of tools
to generate updates, and a different mode of the update server to
serve updates (some ideas for fixing these problems are in <a class="reference external" href="https://bugzilla.mozilla.org/show_bug.cgi?id=410806">bug
410806</a>, but again this is outside of the scope of this post)</li>
</ul>
<p>So all of these are either good things (branding, signing, etc.) or
technical issues that could surely be fixed (nightly updates, unit
tests). Arguably, nightly users and release users tend to be very
different people, with very different needs and expectations, so all of
the "intentional problems" here are really features. As far as I can
see, this eliminates the possibility that Firefox release engineers
could take a nightly build and ship it as a release build.</p>
<p>Even if the branding issue were solved (e.g. by repackaging), signing
still needs to be done, partial diff files would need to be regenerated,
and probably other things that I'm overlooking. The automated tests that
were run on the nightlies may not be applicable. You may scoff at the
paranoia, but there was a <a class="reference external" href="https://bugzilla.mozilla.org/show_bug.cgi?id=340976">bug</a> regarding the size of the Vista icon in
official branding, found late in the Fx3 beta cycle, which caused a
bunch of grief. That situation was improved by making a Minefield
version of the same icon, which is a good fix, but I think my point
still stands.</p>
<p>Here's another option - why not create a real, honest-to-god Firefox
release build on each checkin (or at least alongside each nightly
build)? That would at least make it available to QA as soon as humanly
possible, and it could probably be opened up somehow to interested
community testers (human-triggered builds already work this way; they
are just put into a special area).</p>
<p>Maybe I'm just spoiled from working on tiny little projects, but I
think even the already super-fast and extensively tested Firefox
releases could be made super-faster and the tests more extensive still,
while freeing the release engineers from the need to babysit the One
and Final Release Build.</p>
on moving to buildbot for reals2008-04-08T01:32:00-07:002008-04-08T01:32:00-07:00Robert Helmertag:www.rhelmer.org,2008-04-08:/blog/on-moving-to-buildbot-for-reals.html<p>People are often very confused by the state of where Mozilla is with
regard to Tinderbox versus Buildbot. They are both continuous
integration systems, and you'd think that just jumping wholesale would
be easier than the unholy marriage I've described in the past.</p>
<p>The big distinctions are these:</p>
<ul class="simple">
<li>server vs …</li></ul><p>People are often very confused by the state of where Mozilla is with
regard to Tinderbox versus Buildbot. They are both continuous
integration systems, and you'd think that just jumping wholesale would
be easier than the unholy marriage I've described in the past.</p>
<p>The big distinctions are these:</p>
<ul class="simple">
<li>server vs. client - Buildbot clients and server are tightly coupled,
and communicate over an active TCP connection (managed by
Twisted). Tinderbox clients simply send email to the server: one
message for build start and one for build stop (the build-stop message
specifies the status, which changes the column color on the Tinderbox
server). The logfile for the build may be attached to the "end"
email.</li>
<li>Tinderbox server vs. Buildbot server - tinderbox.mozilla.org puts up
with a lot of load, which the Buildbot server probably could not
handle. Also, the Tinderbox server has a bunch of features that Mozilla
developers depend on, like setting status, etc.</li>
</ul>
<p>Personally I feel that Tinderbox is the wrong way to visualize what
developers actually need, but I'll save that for a later and more
productive post :) For now, suffice to say that Tinderbox server does a
lot more and can handle way more load than Buildbot server.</p>
<p>However, Buildbot server does have some very nice qualities, like being
able to see the log in real-time, and being able to stop and force
builds. So, an interim solution is to have Buildbot server send email to
Tinderbox server on behalf of its clients, so you get Buildbot as an
administrative, developer-only interface, and Tinderbox server as the
general, public interface.</p>
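<p>The two-mail protocol is simple enough to sketch. Here's a minimal, hypothetical illustration in Python of the kind of status-mail body a bridge like this would send on a client's behalf; the field names are loosely patterned on Tinderbox v1's "tinderbox:" body headers, not taken from the real code:</p>

```python
def tinderbox_mail(tree, buildname, status, starttime, logfile=None):
    """Format the body of a Tinderbox-style status mail.

    status is "building" for the start mail, or a final state such as
    "success" / "busted" / "testfailed" for the end mail.
    """
    lines = [
        "tinderbox: tree: %s" % tree,
        "tinderbox: builddate: %d" % starttime,
        "tinderbox: status: %s" % status,
        "tinderbox: build: %s" % buildname,
        "tinderbox: END",
    ]
    body = "\n".join(lines)
    if logfile is not None:
        # the "end" mail may carry the build log after the headers
        body += "\n" + logfile
    return body

# start mail when the Buildbot build begins...
start_mail = tinderbox_mail("Mozilla1.8", "Linux nightly", "building", 1207000000)
# ...and end mail, with status and log, when it finishes
end_mail = tinderbox_mail("Mozilla1.8", "Linux nightly", "success", 1207000000,
                          logfile="make[1]: Leaving directory")
```

<p>The point of the bridge is that Tinderbox server only ever sees these two mails per build, so it never needs to know Buildbot is in the middle.</p>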
<p>The 1.8 and 1.9 nightly builders are already exposed to nightly users;
there are a couple kinks to work out, so I won't link to it right now
(I'll let the people that are actually maintaining it do that :P), but
the glorious future is that developers can stop and kick builds as well
as see real-time logs.</p>
<p>So, that's all well and good, and I think fairly well understood. Now
here's the hairy part - the 1.8 and 1.9 nightly Buildbot clients are
turning around and calling Tinderbox! WTF! (note that the unittest and
moz2 buildbots do not do this, only the 1.8/1.9 nightly boxes). This is
because Tinderbox client contains code to do a bunch of things:</p>
<ul class="simple">
<li>mozilla-specific build process</li>
<li>performance testing</li>
<li>create updates</li>
<li>publish updates (nightly AUS only)</li>
<li>rebooting Windows 9x between builds (not joking)</li>
<li>support for a bajillion products and platforms (mostly through huge
"if" blocks)</li>
<li>support for hybrid depend/clobber builders</li>
<li>support for uploading to various locations on FTP</li>
<li>much, much more</li>
</ul>
<p>Some of these features are very useful and not available elsewhere, and
some are obviously not useful anymore. The error and log handling leaves
a lot to be desired; it's not something trivially fixable, unfortunately
(lots of people have tried, resulting in not one but two attempted
rewrites).</p>
<p>Getting all of the useful bits of this into Buildbot has been a real
challenge, but <a class="reference external" href="http://blog.mozilla.com/bhearsum">Ben Hearsum</a> has all of the important bits worked out
for moz2. I'm hoping to spend some time packaging that up as a
BuildFactory, to make it easy to reuse this code for other branches and
products (mostly because I'd really like to see bug <a class="reference external" href="https://bugzilla.mozilla.org/show_bug.cgi?id=421586">421586</a> get
fixed), strictly as a community member of course :)</p>
<p>You can read more about Buildbot <a class="reference external" href="http://buildbot.net/repos/release/docs/buildbot.html#GNUAutoconf">process-specific factories</a> (that's a
nice example, shipped with Buildbot, of what a GNU Autoconf-style
project could use), but suffice to say it's a way of encapsulating the basic
build process, so you don't need to copy and paste "cvs co client.mk"
and "make -f client.mk MOZ_CO_PROJECT=blah" for each builder in your
Buildbot master.cfg.</p>
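<p>To make that concrete, here's a toy sketch of what such a factory could encapsulate. It deliberately mimics Buildbot's BuildFactory interface in miniature rather than importing Buildbot, so the class name, step shapes, and branch/project values are illustrative assumptions, not the real moz2 code:</p>

```python
class MozillaBuildFactory:
    """Collects the checkout/build steps every Mozilla builder repeats,
    so master.cfg instantiates this instead of copy-pasting steps."""

    def __init__(self, project="browser", branch="HEAD"):
        self.steps = []
        # check out just client.mk, which drives the rest of the checkout
        self.addStep("cvs co -r %s mozilla/client.mk" % branch)
        self.addStep("make -f client.mk checkout MOZ_CO_PROJECT=%s" % project)
        self.addStep("make -f client.mk build")

    def addStep(self, command):
        self.steps.append(command)

# each builder just instantiates the factory with its own parameters
firefox_18 = MozillaBuildFactory(project="browser", branch="MOZILLA_1_8_BRANCH")
```

<p>In real Buildbot the steps would be ShellCommand instances rather than strings, but the reuse idea is the same: one factory, many builders.</p>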
<p>This brings up the other big missing piece, which is that Buildbot's
awesome Source class can't be used because it doesn't understand that it
can't just update the whole "mozilla" CVS module, but needs to use the
client.mk instead. This means that built-in clobber support and the
built-in "tryserver" support can't be used (the current Mozilla
implementations have a lot of custom code).</p>
<p>Bug <a class="reference external" href="https://bugzilla.mozilla.org/show_bug.cgi?id=414031">414031</a> suggests a possible way to implement support for it.
Although it's kind of a pain to implement, using a driver script like
this is fairly common in Java projects, so I think some kind of generic
support might be feasible.</p>
<p>If you're not sure what I'm talking about here and why Source can't be
used out of the box: client.mk only does a partial checkout of the
"mozilla" CVS module, depending on which MOZ_CO_PROJECT is specified.
Also, it can and does check out different versions of subdirectories,
such as NSPR and NSS.</p>
<p>In other words, this is not your typical "checkout module && ./configure
&& make" project, although it is deceptively close in some ways :) It'd
probably be better to have basic support for this flow, just on the
principle of least surprise. I think it also has a material effect on
tool support and on new developers.</p>
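<p>A small sketch of why this flow resists a generic Source step: the command sequence depends on MOZ_CO_PROJECT and on clobber-versus-depend mode, so there's no single "update the module" command to run. The helper below is hypothetical; the make invocations follow the client.mk usage described above:</p>

```python
def checkout_commands(project, mode="clobber"):
    """Hypothetical helper: the command list a client.mk-aware checkout
    step would run, which a generic Source step has no way to express."""
    cmds = []
    if mode == "clobber":
        # start from scratch rather than updating in place
        cmds.append("rm -rf mozilla")
        cmds.append("cvs co mozilla/client.mk")
    # client.mk decides which subset of the "mozilla" module to pull,
    # and which versions of subdirectories like NSPR/NSS to use
    cmds.append("make -f client.mk checkout MOZ_CO_PROJECT=%s" % project)
    return cmds
```

<p>A driver script like the one suggested in bug 414031 would effectively own this decision, leaving Buildbot to run one opaque command per step.</p>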
rel-o-mation slideware!2008-04-04T10:39:00-07:002008-04-04T10:39:00-07:00Robert Helmertag:www.rhelmer.org,2008-04-04:/blog/rel-o-mation-slideware.html<p>I put <a class="reference external" href="http://people.mozilla.org/~rhelmer/presentations/2008Apr01_release_automation/slides.html">this set of slides</a> together to explain what state the release
automation project is in. It probably makes more sense when I am sitting
there to explain what each point means, but I figured I'd put it out
there anyway :)</p>
<p>The current setup mimics ye olde manual release …</p><p>I put <a class="reference external" href="http://people.mozilla.org/~rhelmer/presentations/2008Apr01_release_automation/slides.html">this set of slides</a> together to explain what state the release
automation project is in. It probably makes more sense when I am sitting
there to explain what each point means, but I figured I'd put it out
there anyway :)</p>
<p>The current setup mimics ye olde manual release process, forged by
Chase. Over the past few years we've worked on wrapping that process in
scripts with this <a class="reference external" href="http://mxr.mozilla.org/mozilla/source/tools/release">perl framework</a> (aka "bootstrap"), which
auto-generates configs for underlying systems like <a class="reference external" href="http://mxr.mozilla.org/mozilla/source/tools/tinderbox/">tinderbox</a> and
<a class="reference external" href="http://mxr.mozilla.org/mozilla/source/tools/patcher/">patcher</a>, checks logs for errors, etc.</p>
<p>A lot of the current bugs come from underlying systems, especially the
tinderbox client. Reducing some of the complexity here would both make
the system more understandable and most likely faster as well. It's
pretty tough to make changes when you're doing this level of wrapping,
too.</p>
<p>Now that everything is driven by Buildbot, it probably makes the most
sense to call the build system directly, instead of the
buildbot->bootstrap->tinderbox->build_system chain we have today. There
are bugs on file for all of this already; hopefully the slides and this
post will make it clearer how they tie together.</p>
Breakout!2008-03-19T14:45:00-07:002008-03-19T14:45:00-07:00Robert Helmertag:www.rhelmer.org,2008-03-19:/blog/breakout.html<p>I was working through some <a class="reference external" href="http://pygame.org">Pygame</a> tutorials last week and thought
it'd be fun to see if Canvas/JS was fast enough in Fx3 to do some simple
games.</p>
<p>So, I spent a couple evenings last weekend and made a really dumb Sprite
class, and stole some reasonable "breakout physics …</p><p>I was working through some <a class="reference external" href="http://pygame.org">Pygame</a> tutorials last week and thought
it'd be fun to see if Canvas/JS was fast enough in Fx3 to do some simple
games.</p>
<p>So, I spent a couple evenings last weekend and made a really dumb Sprite
class, and stole some reasonable "breakout physics" from <a class="reference external" href="http://www.scriptedfun.com/arinoid-an-arkanoid-clone/">this
tutorial</a> to make this <a class="reference external" href="http://roberthelmer.com/src/js/jsgames/breakout.html">Breakout clone in JS</a>.</p>
<p>The collision detection for the bricks is a little sloppy (there's a
little damage on bricks from time to time) and I haven't done any perf
work yet, but it seems to work ok in Fx3 nightlies on my MBP. Safari
works ok too, just not quite as fast.</p>
<p>Any activity in other tabs seems to have a huge impact on performance;
there's probably a better way to do the sprite maneuvers etc., but I've
only had a few hours to spend on this so far. Pointers welcome :)</p>
moving 1.8 nightlies to release machines March 5 20082008-03-04T12:34:00-08:002008-03-04T12:34:00-08:00Robert Helmertag:www.rhelmer.org,2008-03-04:/blog/moving-18-nightlies-to-release-machines-march-5-2008.html<p>As previously announced on Tinderbox and planet, we're migrating nightly
production to running on the same machines as release production.</p>
<p>On the moz1.8 branch, we've been running the new nightlies in parallel
with the "traditional" nightlies since Feb 15 2008, and are going to
switch over live tomorrow.</p>
<p>The new …</p><p>As previously announced on Tinderbox and planet, we're migrating nightly
production to running on the same machines as release production.</p>
<p>On the moz1.8 branch, we've been running the new nightlies in parallel
with the "traditional" nightlies since Feb 15 2008, and are going to
switch over live tomorrow.</p>
<p>The new machines:</p>
<ul class="simple">
<li>production-pacifica-vm</li>
<li>production-prometheus-vm</li>
<li>bm-xserve05</li>
</ul>
<p>The old machines:</p>
<ul class="simple">
<li>pacifica-vm</li>
<li>prometheus-vm</li>
<li>bm-xserve02</li>
</ul>
<p>Starting tomorrow, the performance machines will begin following the new
machines. The new machines will publish updates and nightly builds to
the usual location, and the old machines will be disabled (but kept
around for a while, just in case).</p>
<p>If there is a reason that we should not proceed, or if you see any
problems after the migration, please update bug 417147 or email
<a class="reference external" href="mailto:build@mozilla.org">build@mozilla.org</a>.</p>
<p>Thanks!</p>
<p>Rob</p>
moving nightly Mozilla1.8 Firefox to release automation system2008-02-14T19:35:00-08:002008-02-14T19:35:00-08:00Robert Helmertag:www.rhelmer.org,2008-02-14:/blog/moving-nightly-mozilla18-firefox-to-release-automation-system.html<p>I've just enabled nightly builders from the release automation system on
the Mozilla 1.8 tree (see <a class="reference external" href="https://bugzilla.mozilla.org/show_bug.cgi?id=417147">bug 417147</a> for details).</p>
<p>I've blogged on this previously, but just to reiterate some of the
reasons:</p>
<ul class="simple">
<li>unify the (very fragmented) nightly and final release processes
(tools, procedure, etc.)</li>
<li>move away from Tinderbox …</li></ul><p>I've just enabled nightly builders from the release automation system on
the Mozilla 1.8 tree (see <a class="reference external" href="https://bugzilla.mozilla.org/show_bug.cgi?id=417147">bug 417147</a> for details).</p>
<p>I've blogged on this previously, but just to reiterate some of the
reasons:</p>
<ul class="simple">
<li>unify the (very fragmented) nightly and final release processes
(tools, procedure, etc.)</li>
<li>move away from Tinderbox client to Buildbot</li>
<li>use the same set of machines for both nightly and release</li>
</ul>
<p>The first point is a really big one for me. Using totally different
tools for nightly and release means that we don't get much testing of
our release-only procedures and tools, so we often hit unexpected bugs
on release day. It also leaves nightly users without the benefits we
provide for releases: automated update verification, updates for all
locales, thorough error checking and monitoring of build machines, and
automated staging runs before pushing changes live, for a start.</p>
<p>The current setup still uses Tinderbox; it's just being invoked by
Buildbot, so developers should notice no change besides new hostnames.
We're trying this out on the 1.8 branch before we tackle 1.9. So far it
has been quite smooth, but please let us know if you notice anything
out of the ordinary. We have not switched over the perf tests yet, but
we don't expect the results to change (although we may want to merge
some graphs for developer convenience, etc.). This will happen before
the old machines are turned off.</p>
<p>We're planning on turning off the older 1.8 builders sometime after
February 25th, so please do let us know if you see any problems. I've
left a note with the names of the new builders at the top of the
<a class="reference external" href="http://tinderbox.mozilla.org/Mozilla1.8">Mozilla1.8 Tinderbox tree</a>.</p>
<p>This is only one tiny step towards improving life for the
build&release group as well as for developers and nightly testers, but
it's quite significant from an infrastructure point of view, and has
been brewing for a long time. I'm not sure what the next steps are
going to be, but I've written up <a class="reference external" href="http://wiki.mozilla.org/User:Rhelmer:Migrating_Tinderbox_to_Buildbot">some thoughts</a> on where I think we
should go and why.</p>
tinderboxJsonApi 0.12008-01-17T00:56:00-08:002008-01-17T00:56:00-08:00Robert Helmertag:www.rhelmer.org,2008-01-17:/blog/tinderboxjsonapi-01.html<p>Many people have told me that they were excited about the JSON Tinderbox
feed, but were quickly discouraged from doing anything fun due to the
scary data structure that it presents; it's a straight dump of what the
server uses, and is obviously optimized towards making a waterfall
display (plus …</p><p>Many people have told me that they were excited about the JSON Tinderbox
feed, but were quickly discouraged from doing anything fun due to the
scary data structure that it presents; it's a straight dump of what the
server uses, and is obviously optimized towards making a waterfall
display (plus, it's just plain weird).</p>
<p>I set up an <a class="reference external" href="http://people.mozilla.org/~rhelmer/mockups/tinderbox/ajax.html">enhanced waterfall</a> as an example a while back, but it's
really hard to take it further without spending a lot of time digging
around inside the tinderbox_data object.</p>
<p>I've often wished that I could just sort by column in Tinderbox, so
instead of doing yet another one-off script I put together a little web
app that gives you a sortable table of the latest (non-talos) perf data:
<a class="reference external" href="http://people.mozilla.org/~rhelmer/mockups/perf/">Analysis paralysis</a></p>
<p>Click on the headers, and you get data sorted by your criteria. The data
is real-time, but does not auto-reload.</p>
<p>I started to hit a wall almost immediately due to the machinations
required for the tinderbox_data structure, so I stepped back and took
some time to write a <a class="reference external" href="http://people.mozilla.org/~rhelmer/mockups/perf/tboxJsonApi.js">tboxJsonApi.js</a> instead of dealing directly with
the data from Tinderbox. This lets me write code like:</p>
<pre class="literal-block">
&lt;script src="http://tinderbox.mozilla.org/Firefox/json.js"&gt;&lt;/script&gt;
&lt;script&gt;
tree = new Tree(tinderbox_data);
builds = tree.getBuilds();
for (i in builds) {
    build = builds[i];
    build.getName();
    build.getStartTime();
    build.getStatus();
}
&lt;/script&gt;
</pre>
<p>You can get checkins for a particular build, or test results (the scraped
data is processed; right now it only supports anchor tags with "key:
value" format link text, which is why Talos isn't yet supported).</p>
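<p>As a sketch of that scraping rule (in Python for brevity; the real tboxJsonApi.js is JavaScript, and the regex and names here are mine):</p>

```python
import re

# anchor tags whose link text looks like "key: value" become dict entries
ANCHOR = re.compile(r'<a[^>]*>([^<:]+):\s*([^<]+)</a>')

def parse_results(html):
    """Extract "key: value" pairs from anchor link text in a status cell."""
    return dict((k.strip(), v.strip()) for k, v in ANCHOR.findall(html))

# a made-up Tinderbox status cell with two perf numbers
row = '<a href="#">Ts: 1450ms</a> <a href="#">Tp: 812ms</a>'
```

<p>Talos links don't match this shape, which is why they fall through the scrape for now.</p>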
<p>There's a bunch more stuff I want to do before this will be generally
useful to me, e.g. CSV export, merging all build, perf and test data for
a checkin into one row, etc., but I think it's obvious that we could
have more useful tools for tracking and analyzing the absolute mountain
of data that mozilla.org produces every day.</p>
<p>Let me know if you find this useful, and/or have any questions or ideas
for improvements. I was able to throw this all together in a few hours
this evening, because I spent so much less time wrestling with data
structures and more time modeling the kind of app I wanted.</p>
summarizing build-on-checkin feedback2008-01-09T23:08:00-08:002008-01-09T23:08:00-08:00Robert Helmertag:www.rhelmer.org,2008-01-09:/blog/summarizing-build-on-checkin-feedback.html<p>Lots of feedback on the build-on-checkin idea in my blog, the newsgroup,
and especially joduinn's <a class="reference external" href="http://oduinn.com/2008/01/04/build-always-vs-build-on-checkin/">recent post</a> on the subject. The primary
concerns seem to be:</p>
<ul class="simple">
<li>we need as many performance tests per checkin as possible</li>
</ul>
<p>I've filed <a class="reference external" href="https://bugzilla.mozilla.org/show_bug.cgi?id=410869">bug 410869</a> to track this. I think the way we do this …</p><p>Lots of feedback on the build-on-checkin idea in my blog, the newsgroup,
and especially joduinn's <a class="reference external" href="http://oduinn.com/2008/01/04/build-always-vs-build-on-checkin/">recent post</a> on the subject. The primary
concerns seem to be:</p>
<ul class="simple">
<li>we need as many performance tests per checkin as possible</li>
</ul>
<p>I've filed <a class="reference external" href="https://bugzilla.mozilla.org/show_bug.cgi?id=410869">bug 410869</a> to track this. I think the way we do this now
is wrong, and we'd get more performance cycles if we fixed this by
separating the start time of the test from the revision that the test is
for. Also, we should do a separate perf test for each checkin, not just
the latest when the perf machine becomes available, to be able to track
down regressions to a specific changeset.</p>
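<p>The queueing change I'm arguing for can be sketched in a few lines (names hypothetical):</p>

```python
def drain_latest_only(pending):
    """What happens today: when the perf machine frees up, only the
    newest pending revision gets a run."""
    return pending[-1:]

def drain_every_checkin(pending):
    """What's argued for above: every pending checkin gets its own run,
    so a regression can be pinned to a single changeset."""
    return list(pending)

# three checkins landed while the perf machine was busy
pending = ["rev-101", "rev-102", "rev-103"]
```

<p>With the first policy a regression anywhere in rev-101..103 shows up as one blurred data point; with the second, each revision gets its own number.</p>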
<ul class="simple">
<li>sometimes the build breaks for non-checkin reasons, and someone needs
to be hunted down to correct it if it's build-on-checkin</li>
</ul>
<p>I think this is mainly a fault of not having adequate monitoring,
auto-recovery, and load-balancing of the server farm, and of not giving
the right people access to force builds directly. bhearsum is rocking
the monitoring side in <a class="reference external" href="https://bugzilla.mozilla.org/show_bug.cgi?id=410019">bug 410019</a>, so we'll know as soon as anything
goes wrong at the machine level, and Buildbot can do the load-balancing
and give developers an interface to force/clobber/stop builds as
needed, without having to give everyone in the project a shell account
or wait until the next checkin picks up a CLOBBER file.</p>
<ul class="simple">
<li>some people will still be stuck waiting for build cycles, this just
moves the problem around</li>
</ul>
<p>I think this is absolutely a valid concern, and the more I think about
it, build-on-checkin isn't really all that valuable until we have
multiple buildslaves able to run in parallel, so no one has to wait for
the current cycle to finish in order to have their checkin tested. <a class="reference external" href="https://bugzilla.mozilla.org/show_bug.cgi?id=411629">bug
411629</a> has been filed to track this.</p>
<ul class="simple">
<li>CVS commits are not atomic, what if we pull a partial checkin?</li>
</ul>
<p>Fortunately this goes away when we switch to hg for Moz2, but even on
the 1.8 and 1.9 branches we poll Bonsai (and can use the revision, aka
branch+timestamp, that it contains) instead of just blindly pulling CVS.
I don't *think* that Bonsai is susceptible to this kind of thing, due
to the way it groups checkins before reporting them, but please correct
me if this is wrong. Also, isn't this a problem today, since the
Tinderbox client just blindly picks a timestamp and pulls it?</p>
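<p>Here's a rough sketch of the kind of grouping I believe Bonsai does: commits by the same author landing within a short window are reported as one checkin, so a poller never triggers a build halfway through a multi-file CVS commit. The window size and data shapes are illustrative assumptions, not Bonsai's actual behavior:</p>

```python
WINDOW = 300  # seconds between commits to merge; an assumed value

def group_checkins(commits):
    """commits: list of (timestamp, author, path) tuples sorted by time."""
    groups = []
    for ts, author, path in commits:
        last = groups[-1] if groups else None
        if last and last["author"] == author and ts - last["end"] <= WINDOW:
            # same author, close enough in time: same logical checkin
            last["files"].append(path)
            last["end"] = ts
        else:
            groups.append({"author": author, "files": [path], "end": ts})
    return groups

# a two-file commit by one author, then an unrelated commit later
commits = [
    (100, "rhelmer", "mozilla/client.mk"),
    (130, "rhelmer", "mozilla/Makefile.in"),
    (900, "bhearsum", "mozilla/configure.in"),
]
```

<p>A poller built on groups like these would only fire once per logical checkin, instead of once per file as raw CVS timestamps would suggest.</p>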
<p>If I've missed or misrepresented anything, please let me know, and check
out the <a class="reference external" href="https://bugzilla.mozilla.org/showdependencytree.cgi?id=401936&hide_resolved=1">dependency tree</a> on <a class="reference external" href="https://bugzilla.mozilla.org/show_bug.cgi?id=401936">bug 401936</a> for more information.</p>
perf impact on nightly release automation move2007-12-28T20:26:00-08:002007-12-28T20:26:00-08:00Robert Helmertag:www.rhelmer.org,2007-12-28:/blog/perf-impact-on-nightly-release-automation-move.html<p>If you care about the behavior of the Firefox perf test machines, please
check out my post <a class="reference external" href="http://groups.google.com/group/mozilla.dev.performance/browse_thread/thread/678d982f76b2e6de/05d8889947a42840#05d8889947a42840">moving Mozilla1.8 tinderboxes to Buildbot - perf
impact</a> in the mozilla.dev.performance newsgroup.</p>
<p>The big question is whether we can move to a model where we only build
on checkin rather than …</p><p>If you care about the behavior of the Firefox perf test machines, please
check out my post <a class="reference external" href="http://groups.google.com/group/mozilla.dev.performance/browse_thread/thread/678d982f76b2e6de/05d8889947a42840#05d8889947a42840">moving Mozilla1.8 tinderboxes to Buildbot - perf
impact</a> in the mozilla.dev.performance newsgroup.</p>
<p>The big question is whether we can move to a model where we only build
on checkin rather than continuously. This change would mean faster
build turnaround times for developers, and a reduced load on build
machines. It also means that the perf machines cycle less often.
Currently, there's no way to disambiguate the start time of the run
from the latest revision in the build (for CVS, "revision" in this
sense is branch+timestamp); Tinderbox and the graph servers all expect
a build and a perf run to be the same thing.</p>
<p>In case you're wondering why I'm worried about the Mozilla1.8 tree, if
all goes well with this rollout we'll want to do it this way on Firefox
tree as well; the Mozilla1.8 branch is stable and already on release
automation, so we think it makes sense to start there first.</p>
tinderbox to buildbot: moz18 branch2007-12-19T03:53:00-08:002007-12-19T03:53:00-08:00Robert Helmertag:www.rhelmer.org,2007-12-19:/blog/tinderbox-to-buildbot-moz18-branch.html<p>I've set up the <a class="reference external" href="http://wiki.mozilla.org/Build:Release_Automation">release automation</a> staging server for the Mozilla 1.8
branch (Firefox 2.x) to also generate nightly builds and depend builds
on checkin to the branch (using buildbot's BonsaiPoller). I outlined
some of the advantages to this release automation/nightly+depend
integration in my <a class="reference external" href="http://roberthelmer.com/blog/?p=23">previous post …</a></p><p>I've set up the <a class="reference external" href="http://wiki.mozilla.org/Build:Release_Automation">release automation</a> staging server for the Mozilla 1.8
branch (Firefox 2.x) to also generate nightly builds and depend builds
on checkin to the branch (using buildbot's BonsaiPoller). I outlined
some of the advantages to this release automation/nightly+depend
integration in my <a class="reference external" href="http://roberthelmer.com/blog/?p=23">previous post</a>.</p>
<p>You can see the results on the <a class="reference external" href="http://tinderbox.mozilla.org/Mozilla1.8-Staging/">Mozilla1.8-Staging Tinderbox tree</a>.</p>
<p>The main impediment to taking this live is the performance test
machines. These machines currently only cycle when a new build is
available, but ideally we'd want them to keep re-testing the same build
as many times as possible, to get more stable test results. Since the
Tinderbox-driven depend builds currently cycle continuously instead of
waiting for a checkin, we tend to get several test builds for the same
source code as a side effect.</p>
<p>These machines forge their start time to match that of the build they
came from, which allows for easily matching up checkins and build times
to performance results, but this doesn't really make sense if we're
doing multiple test runs per build.</p>
<p>I've started a thread in the mozilla.dev.builds newsgroup with the
subject "moving Mozilla1.8 tinderboxes to Buildbot" for general
discussion about this idea.</p>
tinderbox to buildbot, step 12007-12-06T00:09:00-08:002007-12-06T00:09:00-08:00Robert Helmertag:www.rhelmer.org,2007-12-06:/blog/tinderbox-to-buildbot-step-1.html<p>As I mentioned <a class="reference external" href="http://roberthelmer.com/blog/?p=21">previously</a>, I've been working on incrementally moving
our <a class="reference external" href="http://mxr.mozilla.org/mozilla/source/tools/tinderbox">Tinderbox</a> client installs over to <a class="reference external" href="http://buildbot.net">Buildbot</a>.</p>
<p>The <a class="reference external" href="http://wiki.mozilla.org/User:Rhelmer:Migrating_Tinderbox_to_Buildbot#Shorter_term">first step</a> is to switch to driving Tinderbox from Buildbot and
our <a class="reference external" href="http://wiki.mozilla.org/Build:Release_Automation">release automation</a> system, instead of having it driven on each
machine by the <a class="reference external" href="http://mxr.mozilla.org/mozilla/source/tools/tinderbox/multi-tinderbox.pl">multi-tinderbox</a> script. The release automation still
calls Tinderbox …</p><p>As I mentioned <a class="reference external" href="http://roberthelmer.com/blog/?p=21">previously</a>, I've been working on incrementally moving
our <a class="reference external" href="http://mxr.mozilla.org/mozilla/source/tools/tinderbox">Tinderbox</a> client installs over to <a class="reference external" href="http://buildbot.net">Buildbot</a>.</p>
<p>The <a class="reference external" href="http://wiki.mozilla.org/User:Rhelmer:Migrating_Tinderbox_to_Buildbot#Shorter_term">first step</a> is to switch to driving Tinderbox from Buildbot and
our <a class="reference external" href="http://wiki.mozilla.org/Build:Release_Automation">release automation</a> system, instead of having it driven on each
machine by the <a class="reference external" href="http://mxr.mozilla.org/mozilla/source/tools/tinderbox/multi-tinderbox.pl">multi-tinderbox</a> script. The release automation still
calls Tinderbox client underneath, so features like <a class="reference external" href="http://wiki.mozilla.org/Build:ClobberingATinderbox">CLOBBER</a> support
and all of your other favorites remain.</p>
<p>I have the staging automation builders publishing to the <a class="reference external" href="http://tinderbox.mozilla.org/MozillaStaging">Tinderbox
MozillaStage tree</a>. Note that it's using Mozilla1.8 builders but firing
off builds when it sees trunk checkins; this is because I want to make
sure the Bonsai polling is working and Mozilla1.8 doesn't get very many
checkins :) Also, I'm trying to stay out of the way of the ongoing trunk
automation work that <a class="reference external" href="http://blog.mozilla.com/bhearsum/">bhearsum</a> is driving (AKA, letting him find and
fix the trunk+release_automation bugs before I add nightly support).
Expect to see trunk nightlies up there in the next few weeks.</p>
<p>This has several advantages right off the bat:</p>
<ul class="simple">
<li>same release process we use for production (currently a very small
subset)</li>
<li>only build on checkin, should help cycle times</li>
<li>same builders used for nightly and production releases (admittedly,
this is how it used to be before release automation; but now we can
let Buildbot handle the queuing instead of either interrupting
depend/nightly builds or running multiple builds on the same machine,
which is slow)</li>
</ul>
<p>As we continue to make the nightly and final release process more
alike, we can start giving nightlies things that only final releases
have today:</p>
<ul class="simple">
<li>updates for l10n (only en-US gets updates currently :( )</li>
<li>update verification</li>
<li>publishing the source tarball used to build</li>
<li>using the same timestamp for all platforms</li>
</ul>
<p>On the administration side, it should let us manage builders centrally
and deploy new builders more quickly and easily, and with a little more
work it will let us parallelize builds (multiple build machines per
column, or buildslaves per builder in Buildbot parlance), which should
further help cycle time (no waiting for the current build to finish to
get a build started with your fresh checkin).</p>
<p>Comments/questions/concerns welcome! Feel free to email <a class="reference external" href="mailto:robert@roberthelmer.com">me</a>, the
<a class="reference external" href="mailto:build@mozilla.com">build group</a>, or post in mozilla.dev.builds newsgroup if you'd like to
discuss further.</p>
Migrating Tinderbox to Buildbot2007-11-25T01:53:00-08:002007-11-25T01:53:00-08:00Robert Helmertag:www.rhelmer.org,2007-11-25:/blog/migrating-tinderbox-to-buildbot.html<p>I've started working on migrating the Firefox nightly builds to use the
same <a class="reference external" href="http://wiki.mozilla.org/Build:Release_Automation">release automation</a> system that we've been developing for the
past year or so for maintenance releases (Firefox and Thunderbird 1.5.0x
and 2.0.0.x). The reason this is important is that each nightly release …</p><p>I've started working on migrating the Firefox nightly builds to use the
same <a class="reference external" href="http://wiki.mozilla.org/Build:Release_Automation">release automation</a> system that we've been developing for the
past year or so for maintenance releases (Firefox and Thunderbird 1.5.0x
and 2.0.0.x). The reason this is important is that each nightly release
(installer, update, etc.) is practice for the real thing, and we should
be using the same tools and verification processes wherever possible
(right now both Nightlies and Releases use <a class="reference external" href="http://mxr.mozilla.org/mozilla/source/tools/tinderbox">Tinderbox client [version
1]</a> for build and l10n repack; all other aspects of the release process
are not tested until the first release candidate. Well, we have a
staging server that tests the release automation in isolation, but it's
not the same as having real nightly testers looking at the results :) ).</p>
<p>The scope of the current release automation framework (<a class="reference external" href="http://mxr.mozilla.org/mozilla/source/tools/release/">Bootstrap</a>) has
been to leave as much of our existing process in place as possible, and
not try to simplify or optimize. This kind of low-risk approach is the
right thing to do when you're overhauling the release process on a
maintenance branch, but it has created many layers of frameworks:</p>
<p><a class="reference external" href="http://buildbot.net">Buildbot</a>->Bootstrap->TinderboxClient->MozillaBuildSystem</p>
<p>As you can imagine, this can be quite nightmarish to debug and add
features to. I believe strongly in backwards compatibility and
incremental development, but the Bootstrap and TinderboxClient client
layers are largely invisible to anyone outside of the Mozilla
Build&amp;Release team.</p>
<p>I think where we really want to be is:</p>
<p>Buildbot->MozillaBuildSystem</p>
<p>Wherever possible, Buildbot should do the same things a developer would
do, and the configuration should be as clear as possible to read and
modify.</p>
<p>I have <a class="reference external" href="http://wiki.mozilla.org/User:Rhelmer:Migrating_Tinderbox_to_Buildbot">some thoughts</a> on how to get there on the wiki. The first step
is to slot Bootstrap into place, which is actually pretty easy as it just
calls Tinderbox Client anyway. The larger work here is moving to the
more direct "Buildbot->MozillaBuildSystem" scenario, for which I have a
<a class="reference external" href="http://buildbot.roberthelmer.com">working prototype</a> and its <a class="reference external" href="http://roberthelmer.com/svn/configs/trunk/mozilla-nightly-master.cfg">configuration</a>, if anyone is interested
in seeing more.</p>
<p>Note that I'm not talking about changing <a class="reference external" href="http://tinderbox.mozilla.org">Tinderbox Server</a> or any of
the existing mechanisms that developers use to clobber builds or check
build status. <a class="reference external" href="http://blog.mozilla.com/bhearsum">bhearsum</a> added Tinderbox Server and <a class="reference external" href="http://bonsai.mozilla.org">Bonsai</a> support
to Buildbot a while back, so builds show up on Tinderbox and we can
configure them to be triggered only on checkin (as opposed to the
continuous loop that Tinderbox Client currently does).</p>
<p>I have started a newsgroup thread in mozilla.dev.builds (subject:
"Migrating Tinderbox to Buildbot"), please follow up there if you'd like
to discuss.</p>
<p><em>EDIT - fix typo</em></p>
Tinderbox JSON - now with 100% more AJAX2007-09-01T23:10:00-07:002007-09-01T23:10:00-07:00Robert Helmertag:www.rhelmer.org,2007-09-01:/blog/tinderbox-json-now-with-100-more-ajax.html<p>As promised, I have published a more AJAXy example of the classic
Tinderbox waterfall, built using the new Tinderbox JSON output mode:</p>
<p><a class="reference external" href="http://people.mozilla.org/%7Erhelmer/mockups/tinderbox/ajax.html">http://people.mozilla.org/~rhelmer/mockups/tinderbox/ajax.html</a></p>
<p>This version uses gzip encoding for the JSON data, only reloads the page
when new data is available, and …</p><p>As promised, I have published a more AJAXy example of the classic
Tinderbox waterfall, built using the new Tinderbox JSON output mode:</p>
<p><a class="reference external" href="http://people.mozilla.org/%7Erhelmer/mockups/tinderbox/ajax.html">http://people.mozilla.org/~rhelmer/mockups/tinderbox/ajax.html</a></p>
<p>This version uses gzip encoding for the JSON data, only reloads the page
when new data is available, and I've cleaned up the code quite a bit
(split into separate functions for easier profiling, using innerHTML
instead of document.write(), etc.).</p>
<p>I'm hoping to use this as a base to start making more fundamental
improvements to the waterfall UI. Jesse suggests having the column
headers always at the top as you scroll, which sounds pretty awesome to
me. luser's <a class="reference external" href="http://mavra.perilith.com/~luser/tboxtest.html">test page</a> now shows the percentage change from the last
run for performance numbers, which I will merge into my version soon.</p>
Towards human-free releases2007-09-01T22:52:00-07:002007-09-01T22:52:00-07:00Robert Helmertag:www.rhelmer.org,2007-09-01:/blog/towards-human-free-releases.html<p>We took a big step towards truly hands-off releases by doing a (very
early) Firefox 2.0.0.7 RC1 with the <a class="reference external" href="http://buildbot.net">Buildbot</a>-enabled release
automation system. There are still <a class="reference external" href="http://wiki.mozilla.org/Build:Release_Automation#Caveats">some kinks</a> to work out, but
overall things are looking great.</p>
<p>The elapsed machine time from "code freeze" to "ready …</p><p>We took a big step towards truly hands-off releases by doing a (very
early) Firefox 2.0.0.7 RC1 with the <a class="reference external" href="http://buildbot.net">Buildbot</a>-enabled release
automation system. There are still <a class="reference external" href="http://wiki.mozilla.org/Build:Release_Automation#Caveats">some kinks</a> to work out, but
overall things are looking great.</p>
<p>The elapsed machine time from "code freeze" to "ready to ship" was ~15
hours; the actual elapsed time was another 12 hours or so, waiting for someone to do the signing.
This does not include time for QA, but a lot of that can be interleaved,
and hopefully further automated for maintenance releases (as they
generally include no new features).</p>
<p>I know that we're already very good (<a class="reference external" href="http://shaver.off.net/diary/2007/08/06/about-ten-days-at-black-hat/">10FD</a> ftw), but I know we can do
better. Imagine with me, if you will, that we had a timeline like this:</p>
<p>Day 1 - security exploit announced</p>
<p>Day 2 - RC available</p>
<p>Day 3 - fix available on auto-update</p>
<p>Are there any other software vendors that ship security fixes to a
locally-installed application on such a compressed schedule? I'd really
like to know; please leave me a comment or <a class="reference external" href="mailto:robert@roberthelmer.com">email me privately</a> if it's
sensitive. I'd love to be able to measure how we're doing, and that's
tough without knowing how others measure this.</p>
<p>On a more general note, I think that release automation software should
become a commodity just as web servers, continuous integration systems,
etc. have. If you want to help out or just see what we're doing, check
out the <a class="reference external" href="http://wiki.mozilla.org/Build:Release_Automation">Mozilla release automation docs</a>.</p>
Tinderbox waterfall in Javascript2007-08-31T00:37:00-07:002007-08-31T00:37:00-07:00Robert Helmertag:www.rhelmer.org,2007-08-31:/blog/tinderbox-waterfall-in-javascript.html<p>Just a quick follow-up on my earlier <a class="reference external" href="http://roberthelmer.com/blog/?p=16">post</a> about Tinderbox's new JSON
output mode - I've hacked up a quick working example of the waterfall
page in Javascript. You can view the source to see how you could extract
any of this info from Tinderbox.</p>
<p>Here's a screenshot in case it …</p><p>Just a quick follow-up on my earlier <a class="reference external" href="http://roberthelmer.com/blog/?p=16">post</a> about Tinderbox's new JSON
output mode - I've hacked up a quick working example of the waterfall
page in Javascript. You can view the source to see how you could extract
any of this info from Tinderbox.</p>
<p>Here's a screenshot in case it stops working :)</p>
<img alt="" src="http://people.mozilla.org/~rhelmer/mockups/tinderbox/tbox_json.png" />
<p>You can click that image to get to the live version. I used a bunch of
code, images and ideas from Ted Mielczarek's <a class="reference external" href="http://mavra.perilith.com/~luser/tboxtest.html">better-written example</a>.
In particular, the idea to parse out the OS and purpose of each column,
and display it more directly, and I took some bits of code that looked
better than the way I was doing things.</p>
<p>We're already discussing changes that will break these pages, so don't
get too cozy with them. I am pretty excited to have something to show
for this so quickly though, especially having spent very little time so
far.</p>
<p>I am anxious to start trying to <a class="reference external" href="http://roberthelmer.com/blog/?p=14">improve the experience for tinderbox
users</a> (in particular by taking use cases into account), which for me
is the whole point of this exercise. If you have ideas for better UI for
displaying the kind of information that developers, testers and other
Tinderbox users need, this is a great way to mock up a real-life
example, so have at it!</p>
Tinderbox JSON output2007-08-29T16:55:00-07:002007-08-29T16:55:00-07:00Robert Helmertag:www.rhelmer.org,2007-08-29:/blog/tinderbox-json-output.html<p>Thanks to justdave for getting our <a class="reference external" href="http://tinderbox.mozilla.org/Firefox">Tinderbox</a> server installation up
to date!</p>
<p>One of the new features I'd like to highlight is a quick&amp;dirty <a class="reference external" href="http://json.org">JSON</a>
output format.</p>
<p>This is different from all of the existing Tinderbox modes (e.g.
<a class="reference external" href="http://tinderbox.mozilla.org/Firefox/quickparse.txt">quickparse</a>) in that it's a dump of the internal data …</p><p>Thanks to justdave for getting our <a class="reference external" href="http://tinderbox.mozilla.org/Firefox">Tinderbox</a> server installation up
to date!</p>
<p>One of the new features I'd like to highlight is a quick&amp;dirty <a class="reference external" href="http://json.org">JSON</a>
output format.</p>
<p>This is different from all of the existing Tinderbox modes (e.g.
<a class="reference external" href="http://tinderbox.mozilla.org/Firefox/quickparse.txt">quickparse</a>) in that it's a dump of the internal data structure that
Tinderbox uses to build the waterfall output. This means two things:</p>
<ol class="arabic simple">
<li>it's fairly messy</li>
<li>it's 100% complete</li>
</ol>
<p>#1 we can deal with by cleaning up Tinderbox itself (this came from a
proposal by cls, which I initially disagreed with but now see the point
of).</p>
<p>#2 means that anything you can see on the waterfall page, even going
back in time, is accessible.</p>
<p>Hopefully that will make this a good choice for doing things like
creating alternatives to the waterfall display and any Tinderbox data
mining.</p>
<p>A word of warning - the JSON output will change, as Tinderbox itself
changes (and is hopefully cleaned up). This is a good thing though, as
this data structure is pretty funky as it stands.</p>
<p>Here's the cached version, which is what you'll want to use most of the
time:</p>
<p><a class="reference external" href="http://tinderbox.mozilla.org/Firefox/json.js">http://tinderbox.mozilla.org/Firefox/json.js</a></p>
<p>This should be updated every time Tinderbox receives an update from a
builder. It contains an object named "tinderbox_data" with a ton of
data in it. For example, here is how you can pull the latest build
status:</p>
<blockquote>
<pre class="literal-block">
&lt;script src="http://tinderbox.mozilla.org/Firefox/json.js"&gt;&lt;/script&gt;
&lt;script&gt;
for each (builder in tinderbox_data.build_table[0]) {
  if (builder.buildname != undefined) {
    document.write('build name: ' + builder.buildname);
  }
}
&lt;/script&gt;
</pre>
</blockquote>
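<p>If you'd rather poke at this data outside the browser, the same extraction is easy in a few lines of Python. This is just a rough sketch; the "tinderbox_data = {...};" wrapper shape and the sample data are assumptions based on the snippet above, not the exact on-disk format:</p>

```python
import json
import re

def parse_tinderbox_js(js_text):
    """Strip the JS assignment wrapper and parse the object as JSON."""
    body = re.sub(r'^\s*(var\s+)?tinderbox_data\s*=\s*', '', js_text.strip())
    return json.loads(body.rstrip().rstrip(';'))

def latest_build_names(data):
    """Mirror the in-browser example: build names from the newest row."""
    return [cell['buildname']
            for cell in data.get('build_table', [[]])[0]
            if isinstance(cell, dict) and 'buildname' in cell]

# Hypothetical sample in the shape the example above assumes:
sample = 'var tinderbox_data = {"build_table": [[{"buildname": "Linux Dep"}, {}]]};'
print(latest_build_names(parse_tinderbox_js(sample)))
```

<p>(The real json.js is much bigger and messier, of course; the point is only that stripping the assignment leaves something a JSON parser can eat.)</p>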
<p>We're working on enabling cross-site XMLHttpRequest (<a class="reference external" href="https://bugzilla.mozilla.org/show_bug.cgi?id=394207">bug 394207</a>), so
I'll give a more AJAXy example next time. I think that using
XMLHttpRequest is going to be the preferred way to access this data, not
only because you can build slicker UIs, but you can take advantage of
the "If-Modified-Since" header to only pull data as-needed, as the JSON
file is rather large.</p>
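<p>For the curious, here's roughly what a conditional fetch looks like; a hedged sketch using only the Python standard library (the helper names are mine, and the URL is just the cached json.js above):</p>

```python
import urllib.request
import urllib.error
from email.utils import formatdate, parsedate_to_datetime

JSON_URL = 'http://tinderbox.mozilla.org/Firefox/json.js'  # cached version above

def conditional_headers(last_modified_ts=None):
    """Headers for a conditional GET: only send the body if the resource
    changed since `last_modified_ts` (a Unix timestamp)."""
    if last_modified_ts is None:
        return {}
    # usegmt=True yields the RFC 1123 date format HTTP expects.
    return {'If-Modified-Since': formatdate(last_modified_ts, usegmt=True)}

def fetch_if_changed(url, last_modified_ts=None):
    """Return (body, new_ts) on 200, or (None, old_ts) on 304 Not Modified."""
    req = urllib.request.Request(url, headers=conditional_headers(last_modified_ts))
    try:
        with urllib.request.urlopen(req) as resp:
            lm = resp.headers.get('Last-Modified')
            new_ts = parsedate_to_datetime(lm).timestamp() if lm else None
            return resp.read(), new_ts
    except urllib.error.HTTPError as err:
        if err.code == 304:   # unchanged since our last poll; skip the download
            return None, last_modified_ts
        raise

print(conditional_headers(0))
```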
visual cues2007-05-29T15:22:00-07:002007-05-29T15:22:00-07:00Robert Helmertag:www.rhelmer.org,2007-05-29:/blog/visual-cues.html<p>The <a class="reference external" href="http://tinderbox.mozilla.org/Firefox/">Tinderbox waterfall page</a> is pretty detail-heavy. I see a lot of
complaints specifically that the page gets too wide, making it hard to
see at a glance if anything is failing to compile or failing a test
(which for Mozilla means that you can't check in until it's fixed …</p><p>The <a class="reference external" href="http://tinderbox.mozilla.org/Firefox/">Tinderbox waterfall page</a> is pretty detail-heavy. I see a lot of
complaints specifically that the page gets too wide, making it hard to
see at a glance if anything is failing to compile or failing a test
(which for Mozilla means that you can't check in until it's fixed).</p>
<p>One way to make this better is to collapse the information and provide
visual cues:</p>
<div class="figure align-center">
<img alt="visual cues" src="http://people.mozilla.com/~rhelmer/buildbot/visual_cues.png" />
<p class="caption">visual cues</p>
</div>
<p>Ok, well maybe not that exact picture :) But the point is that all
Linux, Windows and Mac machines are represented by one column per
build-type, not one column per machine.</p>
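<p>The collapsing itself is a simple grouping problem; here's a toy sketch (the tuple shape and status values are hypothetical, not Tinderbox's actual data model):</p>

```python
from collections import defaultdict

def collapse_columns(machines):
    """Group individual machines into one logical column per
    (OS, build type), the way the mockup above collapses them.
    A column is green only if every machine in it is green."""
    columns = defaultdict(list)
    for name, os_name, build_type, status in machines:
        columns[(os_name, build_type)].append(status)
    return {key: ('success' if all(s == 'success' for s in statuses)
                  else 'busted')
            for key, statuses in columns.items()}

# Hypothetical machine reports: (name, OS, build type, status)
reports = [
    ('fx-linux-01', 'Linux', 'depend', 'success'),
    ('fx-linux-02', 'Linux', 'depend', 'busted'),
    ('fx-win32-01', 'Windows', 'depend', 'success'),
]
print(collapse_columns(reports))
```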
<p>Another useful thing to do would be to make use cases, and provide
different front-ends instead of just the waterfall page. For example,
there's the "can I check in right now" use case, which is different than
the "are all machines reporting ok" and "what do the performance numbers
look like today" use cases.</p>
<p>Right now, we have the <a class="reference external" href="http://build-graphs.mozilla.org/">graph server</a>, and tracking down e.g. a perf
regression window can be pretty painful (mostly because you can't see
checkins along the graph; this is probably most appropriate for the new
graph server).</p>
<p>For the "can I check in now" use case, making the front page of
Tinderbox <a class="reference external" href="https://bugzilla.mozilla.org/show_bug.cgi?id=366784">more friendly</a> would be a good first step.</p>
pretty pictures2007-05-25T14:18:00-07:002007-05-25T14:18:00-07:00Robert Helmertag:www.rhelmer.org,2007-05-25:/blog/pretty-pictures.html<p><a class="reference external" href="http://weblogs.mozillazine.org/preed/">preed</a> always makes fun of my release process diagram (as seen on
whiteboards everywhere):</p>
<div class="figure align-center">
<img alt="release process" src="http://people.mozilla.com/~rhelmer/release_process/process.png" />
<p class="caption">release process</p>
</div>
<p>So I made a fancier one, showing the inside of each of these steps:</p>
<div class="figure align-center">
<img alt="step" src="http://people.mozilla.com/~rhelmer/release_process/step.png" />
<p class="caption">step</p>
</div>
<p>However, I still feel that preed doesn't appreciate it, so I dedicate
this diagram to <a class="reference external" href="http://morgamic.com/">morgamic</a>.</p>
Bootstraps, pulling oneself up by one's2007-05-19T22:57:00-07:002007-05-19T22:57:00-07:00Robert Helmertag:www.rhelmer.org,2007-05-19:/blog/bootstraps-pulling-oneself-up-by-ones.html<p>We've been using the release automation scripts, aka <a class="reference external" href="http://mxr.mozilla.org/mozilla/source/tools/release/">Bootstrap</a>, for
the past several releases. We've hit some bumps but overall we've
improved quality as we've been pushing changes into the scripts instead
of having to document and remember to re-check a large list of "gotchas"
whenever we run into problems …</p><p>We've been using the release automation scripts, aka <a class="reference external" href="http://mxr.mozilla.org/mozilla/source/tools/release/">Bootstrap</a>, for
the past several releases. We've hit some bumps but overall we've
improved quality as we've been pushing changes into the scripts instead
of having to document and remember to re-check a large list of "gotchas"
whenever we run into problems, or need to add new steps to the release
(e.g. DLL/EXE signing). Every release "step" has a set of verification
tests, which we've been augmenting and then not having to think about
next release.</p>
<p>Repeating the same set of steps every 6-8 weeks sounds pretty terrible;
the cycle is short enough that it feels like you've done it a million times
already, yet just long enough that you get a little fuzzy on the
details and have to constantly refer to the documentation. Even worse,
you have about a dozen individual files to edit, hundreds of commands to
run, and if any of it is incorrect you (generally) need to go back and
start over from there. After all, you can't build without a tag, repack
without a build, or generate (all of the) updates without repacks.</p>
<p>The part that's exciting, challenging and fun is when you start to break
down the big, scary problem into a set of small, manageable problems.
Abstract the small problems into discrete steps and automate them, so
that you don't need to worry about the individual details every time you
use them. You can examine each step separately, optimize it, test it,
and try to get as close to absolute consistency as possible. Do
paranoid, pedantic tests for correctness that would drive a person mad.</p>
<p>In short, it's basically refactoring and unit testing. When you start
doing it after-the-fact there's a high ramp-up cost, but once you've got
the ball rolling it starts picking up serious momentum.</p>
<p>The next big hurdle is end-to-end automation. Right now, with the
automation and infrastructure as-is, a human has to:</p>
<ul class="simple">
<li>log into Tag machine, kick off tag script</li>
<li>log into win32, linux and mac tinderboxes and kick off build script</li>
<li>verify builds and copy to the candidates directory</li>
<li>configure l10n, update generation/verification</li>
<li>kick off l10n build script on win32, linux and mac tinderboxes</li>
<li>verify l10n builds and copy to the candidates directory</li>
<li>sign win32 EXEs/DLLs</li>
<li>log into update machine, kick off patch generator</li>
<li>log into staging machine, kick off staging script</li>
<li>turn on test updates</li>
<li>sign installers</li>
<li>create bouncer links, push bits to mirrors</li>
<li>turn on updates</li>
</ul>
<p>This is not including the huge number of <a class="reference external" href="https://bugzilla.mozilla.org/show_bug.cgi?id=360034">config/version bumps</a> that go
along with all of these disparate systems. If you spend a lot of time
going over all of these files, it seems pretty obvious that we could be
putting in one set of info and generating all of this data.</p>
<p>We're actually now at the point where we can do all the tagging/version
bumping automatically, generate Tinderbox mozconfig/tinder-config.pl
based on the single bootstrap.cfg, and generate the patcher2 configs
(which creates partial updates and configures AUS).</p>
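<p>To give a flavor of what "one set of info in, all the configs out" means, here's a toy generator; the keys and file contents are hypothetical stand-ins, not the real bootstrap.cfg, Tinderbox, or patcher2 formats:</p>

```python
def bump_configs(release):
    """Render per-tool config fragments from a single source of truth.
    The file shapes below are illustrative stand-ins for the mozconfig,
    tinder-config.pl, and patcher2 config that get bumped each release."""
    mozconfig = ('ac_add_options --enable-official-branding\n'
                 'mk_add_options MOZ_CO_TAG={tag}\n').format(**release)
    tinder_config = ("$BuildTag = '{tag}';\n"
                     "$BuildName = '{product} {version}';\n").format(**release)
    patcher_config = '[release-{version}]\nto = {version}\n'.format(**release)
    return {'mozconfig': mozconfig,
            'tinder-config.pl': tinder_config,
            'patcher2.cfg': patcher_config}

release = {'product': 'Firefox', 'version': '2.0.0.7',
           'tag': 'FIREFOX_2_0_0_7_RELEASE'}
configs = bump_configs(release)
print(configs['tinder-config.pl'])
```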
<p>However, we still need to log into the individual machines described
above, check out/update the scripts, and run them. Each of these
processes generally takes between 1 and 4 hours, so having them run
back-to-back would not only reduce the total time to do a release
(it should be fine running all night, or over weekends), but also help to
reduce mistakes and eliminate the time-wasting polling that we currently
have to do (although bootstrap does support sending email notifications
now, so at least it can be event-driven).</p>
<p>We've been looking at <a class="reference external" href="http://buildbot.net">Buildbot</a> to help tie this into a seamless
process. Buildbot supports both the idea of BuildSets (e.g. win32, linux
and mac builders all operating as one pass/fail operation) as well as
dependent steps e.g. Tag -> Source -> BuildSet(linux,mac,win32) ->
Repack(linux,mac,win32) -> Updates -> Stage.</p>
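<p>To make the dependent-step idea concrete, here's a toy model (plain Python, not actual Buildbot configuration) where each stage is one pass/fail BuildSet across platforms, and downstream stages never fire after a failure:</p>

```python
def run_release(chain, platforms, run_step):
    """Toy model of dependent scheduling: each stage runs on every
    platform and the next stage fires only if all of them pass
    (one pass/fail BuildSet per stage)."""
    completed = []
    for stage in chain:
        results = {p: run_step(stage, p) for p in platforms}
        if not all(results.values()):
            return completed, stage  # downstream stages never fire
        completed.append(stage)
    return completed, None

chain = ['Tag', 'Source', 'Build', 'Repack', 'Updates', 'Stage']
# Hypothetical step runner where win32 breaks during Repack:
ok = lambda stage, platform: not (stage == 'Repack' and platform == 'win32')
done, failed_at = run_release(chain, ['linux', 'mac', 'win32'], ok)
print(done, failed_at)
```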
<p>My original idea for pushing this all together was to send Changes into
Buildbot from Bootstrap every time a new step was ready, but preed looked
into it more and realized that buildsets and dependent steps already do
what we need. This is great, because it moves us more incrementally from
"Human logging into 10 machines to run 1000 commands" to "Human logging
into 10 machines to run 1 script" to "Buildbot logging into 10 machines
to run 1 script on each", without us having to write any additional
code.</p>
<p>Anyway, we've got a lot of other things going on, but I'm really proud
of all the work we've done to get this far, and confident that we'll be
able to get this across the finish line soon. We've done it so
incrementally that I don't feel like we've built this giant cathedral;
it's more that we've just broken down our big problem into little bits
that we can improve more quickly.</p>
buildbot "try" support2007-02-09T16:49:00-08:002007-02-09T16:49:00-08:00Robert Helmertag:www.rhelmer.org,2007-02-09:/blog/buildbot-try-support.html<p>As many of you know by now, <a class="reference external" href="http://bhearsum.blogspot.com/">Ben Hearsum</a> has been doing awesome work
on <a class="reference external" href="http://buildbot.sf.net">Buildbot</a> integration, such as:</p>
<ul class="simple">
<li>bonsai support</li>
<li>publishing to tinderbox</li>
<li>setup and administration of the seneca cluster</li>
</ul>
<p>Now the awesomeness continues, as we're working on a <a class="reference external" href="http://buildbot.sourceforge.net/manual-0.7.5.html#try">Buildbot "try"
server</a> to allow developers to upload patches that …</p><p>As many of you know by now, <a class="reference external" href="http://bhearsum.blogspot.com/">Ben Hearsum</a> has been doing awesome work
on <a class="reference external" href="http://buildbot.sf.net">Buildbot</a> integration, such as:</p>
<ul class="simple">
<li>bonsai support</li>
<li>publishing to tinderbox</li>
<li>setup and administration of the seneca cluster</li>
</ul>
<p>Now the awesomeness continues, as we're working on a <a class="reference external" href="http://buildbot.sourceforge.net/manual-0.7.5.html#try">Buildbot "try"
server</a> to allow developers to upload patches that will generate
one-off builds, without having to check that patch in. This is great for
experimental builds for proof-of-concept type code, as well as making
sure your patch will compile on our supported OS environments.</p>
<p>Ben is doing <a class="reference external" href="http://ukm.spreadsheets.google.com/ccc?id=o09016850625816831570.6979258540609148598.01469457300266485452.6223612764698158517">most of the real work</a> here, and we're working hard to
document and publish details of our setup so others can benefit (and of
course, help us out when we hit problems!). Brian Warner just launched
<a class="reference external" href="http://buildbot.net">Buildbot.net</a> which looks much more useful to me than the old
sourceforge page, so expect to see more HOWTOs appearing there in the
near future.</p>
<p>Note that we're actually not using the "buildbot try" support, but
instead using "<a class="reference external" href="http://buildbot.sourceforge.net/manual-0.7.5.html#sendchange">buildbot sendchange</a>" (however it's worth reading about
"try", because it clearly explains what we're trying to do here). There
are several reasons for this; one is that "buildbot try" assumes that
Buildbot is able to check out directly from CVS (not through client.mk
like we do), and it also assumes that developers have buildbot installed
on their development machine and have direct access to the Buildbot
server on a special Buildbot-specific port. Finally, "buildbot try" does
not accept patches directly; it expects to be run in your checkout and
generate an appropriate patch itself.</p>
<p>We could work around the client.mk problem, but the rest are a bigger
deal; a lot of developers use different version control systems, and
Mozilla is definitely not new to managing patches.</p>
<p>So, the shortest path to happiness seems to be:</p>
<p>* give developers access to a "patch upload" web interface; the version
Ben put together looks like this:</p>
<div class="figure align-center">
<img alt="Try upload dialog" src="http://people.mozilla.org/%7Erhelmer/buildbot/try/try_upload.jpg" />
<p class="caption">Try upload dialog</p>
</div>
<p>* have a custom series of steps, which does:</p>
<ol class="arabic simple">
<li>clobber existing source tree</li>
<li>check out client.mk</li>
<li>download mozconfig</li>
<li>apply patch</li>
<li>configure</li>
<li>compile</li>
</ol>
<p>* upload the build to somewhere useful; for now we'll probably just
keep it on the "try" server, on the same webserver that hosts the patch
upload UI. Each "try" request gets a unique ID, so we can use that to
link to an appropriate output directory.</p>
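<p>The custom series of steps boils down to "run these in order, stop at the first failure." A rough sketch, with placeholder commands rather than the real try-server implementation:</p>

```python
import subprocess

# The six steps above, as (name, argv) pairs; the commands are
# illustrative placeholders, not the actual try-server scripts.
TRY_STEPS = [
    ('clobber',   ['rm', '-rf', 'mozilla']),
    ('checkout',  ['cvs', 'co', 'mozilla/client.mk']),
    ('mozconfig', ['cp', '/path/to/mozconfig', 'mozilla/.mozconfig']),
    ('patch',     ['patch', '-p0', '-i', 'try.patch']),
    ('configure', ['make', '-f', 'client.mk', 'configure']),
    ('compile',   ['make', '-f', 'client.mk', 'build']),
]

def run_try(steps, runner=subprocess.call):
    """Run each step in order; stop at the first nonzero exit so a bad
    patch fails fast instead of wasting a compile cycle."""
    for name, argv in steps:
        if runner(argv) != 0:
            return name   # the step that busted
    return None

# Exercise the control flow with a fake runner (the patch step fails):
fake = lambda argv: 1 if argv[0] == 'patch' else 0
print(run_try(TRY_STEPS, runner=fake))
```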
<p>What you'd see on the Buildbot server page is something like this (NOTE
- I hacked up this image to show you an interesting scenario, but didn't
feel like waiting around for a real checkin to coincide with my "try"
test... so the times not matching up 100% in the ETA section is OK.)</p>
<div class="figure align-center">
<img alt="Buildbot Try Waterfall example" src="http://people.mozilla.org/~rhelmer/buildbot/try/try_waterfall.jpg" />
<p class="caption">Buildbot Try Waterfall example</p>
</div>
<p>The left-most column is who made the change (the one from coop is from
the bonsaipoller, while the one from me is via the patch upload
interface). The left-most build column is a normal build triggered by
coop's change, while the right-most is triggered by my change.</p>
<p>In Buildbot, each build column represents a "Builder", behind which
there can be one or more buildslaves (buildslaves are the actual hosts,
similar to a tinderbox client). This means that you can have several
simultaneous builds on the same column that are actually happening on
different machines, and in fact you can share the same hosts between
different columns!</p>
<p>When we actually have Windows and Mac buildslaves hooked up to this,
you'll probably see one column for each OS, but that's it. We can add or
remove hosts as needed behind that, and Buildbot will handle queuing of
the incoming requests if all the available build machines are busy.</p>
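<p>That queuing behavior is easy to picture with a toy model: a builder with a pool of slaves, where requests wait in line only when every slave is busy (this is an illustration, not Buildbot's actual implementation):</p>

```python
from collections import deque

class Builder:
    """Toy model of one Buildbot column: several slaves can serve it,
    and requests queue up when every slave is busy."""
    def __init__(self, slaves):
        self.idle = deque(slaves)
        self.pending = deque()
        self.running = {}          # request -> slave working on it

    def submit(self, request):
        if self.idle:
            self.running[request] = self.idle.popleft()
        else:
            self.pending.append(request)   # queued until a slave frees up

    def finish(self, request):
        slave = self.running.pop(request)
        if self.pending:
            self.running[self.pending.popleft()] = slave
        else:
            self.idle.append(slave)

linux = Builder(['slave1', 'slave2'])
for change in ['c1', 'c2', 'c3']:   # three checkins, only two slaves
    linux.submit(change)
print(len(linux.running), list(linux.pending))
```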
The Automaton2006-11-01T15:24:00-08:002006-11-01T15:24:00-08:00Robert Helmertag:www.rhelmer.org,2006-11-01:/blog/the-automaton.html<p>The staging environment for the <a class="reference external" href="https://bugzilla.mozilla.org/show_bug.cgi?id=355309">release automation project</a> (aka
"bootstrap") is up and running. This includes a CVS mirror, so
everything from tagging to build to updates (with verification each step
along the way) can be done without affecting the production environment.</p>
<p>The setup/teardown of the environment is scripted …</p><p>The staging environment for the <a class="reference external" href="https://bugzilla.mozilla.org/show_bug.cgi?id=355309">release automation project</a> (aka
"bootstrap") is up and running. This includes a CVS mirror, so
everything from tagging to build to updates (with verification each step
along the way) can be done without affecting the production environment.</p>
<p>The setup/teardown of the environment is scripted out, and bootstrap can
now get through each individual release step and produce useful output
(well, using the patches in my private tree).</p>
<p>The first big deliverable of this system is to automate the currently
human-driven tasks we do as part of the Firefox and Thunderbird release
process, and it is almost there. The next phase is more about what kinds
of improvements and increased reliability/reproducibility can be brought
into the process, so I will have more to say on that later, once my
patches have all landed and we try actually using this thing.</p>
<p>I've been using this project as an opportunity to try out some more
interesting tools, the workflow I've been using is:</p>
<ul class="simple">
<li>keep a private repo in <a class="reference external" href="http://svk.bestpractical.com/view/HomePage">SVK</a> on my laptop, sync'd from Mozilla CVS</li>
<li>push to personal <a class="reference external" href="http://subversion.tigris.org/">Subversion</a> repo when ready for integration
testing</li>
<li><a class="reference external" href="http://buildbot.sf.net">Buildbot</a> on my laptop runs unit tests automatically when it sees
the Subversion checkin</li>
<li>pull changes from Subversion to staging environment and run staged
release</li>
</ul>
<p>When I am ready to post a patch to Bugzilla for review, I do it via "svk
diff" against the Mozilla CVS version. Admittedly this is a lot of overhead
for a small project, but each piece is simple enough that I can drop any one
of these tools if something goes horribly wrong without suffering a huge
setback.</p>
<p>Overall I like this workflow a lot, although I am thinking that next
time around I will try another version control system (most likely
<a class="reference external" href="http://www.selenic.com/mercurial/wiki/index.cgi">Mercurial</a>) instead of the SVK/SVN combo. I'd really like to try some
newer features like having a <a class="reference external" href="http://www.selenic.com/mercurial/wiki/index.cgi/MqExtension">patch queue</a>, and it'd be excellent to
write a few <a class="reference external" href="http://www.selenic.com/mercurial/wiki/index.cgi/ExtensionHowto">extensions</a> to do things like pull a patch from Bugzilla
and stick into the patch queue automatically (I do that now with "curl
| patch", but it'd be nicer to just "hg bzpatch 1234" or something like
that, and have it go into the queue).</p>
<p><a class="reference external" href="http://blog.vlad1.com/">Vlad</a> is currently importing the Mozilla CVS trunk (using <a class="reference external" href="http://www.cobite.com/cvsps/">cvsps</a> and
<a class="reference external" href="http://hg.beekhof.net/hg/cvs-import">hg-cvs-import</a>) into a testing Mercurial repo, which makes deciding
which system to try next an easier decision for me.</p>
<p>There is nothing wrong with the SVK/SVN combo and in fact I like it quite a
bit, but I really want to get away from the need to send patches around
before they are ready for review; I'd rather just give someone a URL to
my repo to pull from so we can stay integrated.</p>
vacation, state of release automation2006-10-13T22:23:00-07:002006-10-13T22:23:00-07:00Robert Helmertag:www.rhelmer.org,2006-10-13:/blog/vacation-state-of-release-automation.html<p>I will be on vacation from October 16th through the 23rd in Toronto, and
giving a <a class="reference external" href="http://zenit.senecac.on.ca/wiki/index.php/Guest_Lectures">lecture at Seneca</a> on the 20th. If you are in the area and
want to hang out, feel free to give me a buzz.</p>
<p>The release automation work is at a reasonably happy place …</p><p>I will be on vacation from October 16th through the 23rd in Toronto, and
giving a <a class="reference external" href="http://zenit.senecac.on.ca/wiki/index.php/Guest_Lectures">lecture at Seneca</a> on the 20th. If you are in the area and
want to hang out, feel free to give me a buzz.</p>
<p>The release automation work is at a reasonably happy place, I have
managed to write and test the following steps, and post patches for
review (the high-level steps are described in the app's <a class="reference external" href="http://lxr.mozilla.org/mozilla/source/tools/release/README">README</a> file):</p>
<ul class="simple">
<li>Tag</li>
<li>Build</li>
<li>Source</li>
<li>Repack</li>
<li>Updates</li>
</ul>
<p>The Stage step is actually mostly tested, but I keep running out of disk
space on the staging machine, so I'll need to get creative on that one.
Sign is fairly simple and mostly manual (which is desirable), and
Release is fairly simple (but obviously critical!) - it's the act of
copying the staged/signed bits to the official release directories.</p>
<p>One thing I feel I must mention is that this tool does not necessarily
support what we consider the ideal process - it instead supports the
process that we use, and that is known to work.</p>
<p>However, it is difficult to introduce beneficial changes and explore
alternatives since we haven't had a good staging environment or set of
verification tests to make sure that we haven't introduced any undesired
side-effects.</p>
<p>The trickiest bit of this isn't so much the steps themselves as having
some kind of automated verification that the step succeeded so we can
trust that running the next won't be a waste of time.</p>
<p>Our current process is very human-time-intensive, since a release
engineer needs to kick off and verify each step, and some of the steps
take several hours by themselves (builds and update
generation/verification, primarily). If something goes wrong (due to an
unexpected change in the product, a bug in one of the tools, or just
Murphy's Law) then we need to determine the last "good" step and restart
from there.</p>
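Determining the last "good" step and restarting from there is something the tool itself can do if it records each verified step as it completes. A hypothetical sketch, again not the real tooling; the state-file layout and function names are invented:

```python
# Hypothetical sketch of resume-from-last-good-step: the driver
# persists each verified step to a small state file, so a rerun
# skips everything already known to be good.

import json
import os
import tempfile

STATE = os.path.join(tempfile.gettempdir(), "release-state.json")
STEPS = ["Tag", "Build", "Source", "Repack", "Updates"]

def load_done():
    """Return the list of steps verified on previous runs."""
    if os.path.exists(STATE):
        with open(STATE) as f:
            return json.load(f)
    return []

def mark_done(done, name):
    """Record a step as verified, persisting state to disk."""
    done.append(name)
    with open(STATE, "w") as f:
        json.dump(done, f)

def run_release(fail_at=None):
    """Run the remaining steps; 'fail_at' simulates a mid-release failure."""
    done = load_done()
    ran = []
    for name in STEPS:
        if name in done:
            continue  # already verified on a previous run
        if name == fail_at:
            raise RuntimeError(f"simulated failure in {name}")
        ran.append(name)       # the real step work would happen here
        mark_done(done, name)  # only record it once it has verified
    return ran
```

With this shape, a release that dies during Repack can simply be rerun: the driver loads the state file, skips Tag/Build/Source, and picks up again at Repack, which is exactly the restart-from-last-good-step behavior a release engineer currently performs by hand.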
<p>Automated verification does of course have a point of diminishing
returns, and Mozilla-based products are complicated enough that this
doesn't really provide any direct QA benefit, besides not wasting our
testers' valuable time on something a dumb computer can catch (like a
bad tag, bad build, mismatched or nonexistent update paths, etc.).</p>
<p>The other big downside to a human operator being the default is that
humans function much better with sleep and time off (prolonged focus
being bad for overall concentration) and it's a bad use of creative
energies. An automated process doesn't need to pause between steps, and
won't introduce variation through attempts at creativity. The place to
be creative isn't in the scope of a release, but in thinking about and
improving the overall process (generally best done in between releases,
based on the lessons learned from the past).</p>
<p>It should of course be possible for a human to jump in and drive the
process if needed, especially fixing and rerunning steps which failed
for an intermittent reason, bug in the tools, etc. It should hopefully
not be the norm, but it's a reasonable use case for this kind of tool.</p>
<p>The ideal use case that I can think of right now would be: "code is
frozen; declare and obtain sign-off for names/numbers/locales/etc. and
kick off the release process". Respinning to pick up source changes is
achieved by a variant of the Tag step, and the process is restarted and
runs through the same Build-&gt;Stage steps until we're happy with it.</p>