Rails migrations and Capistrano don’t mix

Last night I learned the hard way what happens when Rails migrations break.

My main project, the Wiki Ed Dashboard, is set up for automatic deployment, via Capistrano and travis-ci, whenever we push new commits to the staging or production branch. It’s mostly nice.

But I ran some migrations yesterday that I shouldn’t have. In particular, one of them added three new columns to a table. When I pushed it to staging, the migration took about 5 minutes and then finished. Since the staging app was unresponsive during that time, I waited until the evening to deploy it to production. But things went much worse on production, which has a somewhat large database. The migration took more than 10 minutes, at which point travis-ci decided that the build had become unresponsive and killed it. The migration didn’t complete.

No problem, I thought, I’ll just run the migration again. Nope! It turns out that the first column from that migration actually made it into the MySQL database, and MySQL can’t roll back a schema change once it has been applied. Running the migration again triggered a duplicate column error. Hmmm… okay. Maybe all the columns got added, but the migration didn’t get recorded in the database? So I manually added the migration id to the schema_migrations table. Alas, no. Things were still broken, because the other two columns hadn’t actually been added.

That’s why Rails migrations have an up and a down version, right? I’ll just migrate that one down and back up. But with only one of three columns present, neither the up nor the down migration would run. I ended up writing an ad-hoc migration to add just the second and third columns, deploying it from my own machine, and then deleting the migration afterwards. I fixed production, but it wasn’t pretty.
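
For illustration, here is a rough sketch of the kind of guard that makes a repair like this safe to re-run: add each column only if it is missing. This is not the migration I actually ran. The table and column names are hypothetical, and FakeConnection is a made-up in-memory stand-in so the example runs without Rails. The only names borrowed from Rails are column_exists? and add_column, which are available inside an ActiveRecord migration.

```ruby
# Hypothetical names: the real table and columns are not given in this post.
TABLE = :courses
NEW_COLUMNS = { col_one: :integer, col_two: :integer, col_three: :integer }.freeze

# Add each column only if it is missing, so the step is safe to re-run after
# an interrupted deploy. `conn` must respond to column_exists?(table, column)
# and add_column(table, column, type), the same method names that are
# available inside an ActiveRecord migration.
def add_missing_columns(conn, table, columns)
  columns.each do |name, type|
    next if conn.column_exists?(table, name)

    conn.add_column(table, name, type)
  end
end

# In-memory stand-in for a database connection so the sketch runs without Rails.
class FakeConnection
  class DuplicateColumn < StandardError; end

  attr_reader :columns

  def initialize(existing = {})
    @columns = existing.transform_values(&:dup)
  end

  def column_exists?(table, name)
    @columns.fetch(table, {}).key?(name)
  end

  # Like MySQL, refuse to add a column that is already there.
  def add_column(table, name, type)
    raise DuplicateColumn, "#{table}.#{name} already exists" if column_exists?(table, name)

    (@columns[table] ||= {})[name] = type
  end
end

# Simulate production after the killed deploy: only the first column exists.
conn = FakeConnection.new({ courses: { col_one: :integer } })
add_missing_columns(conn, TABLE, NEW_COLUMNS)
puts conn.columns[TABLE].keys.inspect
```

Inside a real migration, the same guard would look something like `add_column :courses, :col_two, :integer unless column_exists?(:courses, :col_two)`, again with made-up names.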

My takeaway from this: if you deploy via Capistrano — and especially if you deploy straight from your continuous integration server — then write a separate migration for every little thing. When things go bad in the middle of deployment, you don’t want to be stuck with a half-completed migration.