A yak shave turned good: Switching from Poltergeist to Headless Chrome for Capybara browser tests

I just finished up migrating all the Capybara feature tests in my Rails/React app from Poltergeist to Headless Chrome. I ran into a few issues I didn’t see covered in other write-ups, so I thought I’d pull together what I learned and what I found useful.

This pull request shows all the changes I had to make.


Poltergeist is based on the no-longer-maintained PhantomJS headless browser. When we started using PhantomJS a few years ago, it was the only good way to run headless browser tests, but today there’s a good alternative in Selenium with Chrome’s headless mode. (Firefox has a headless mode now as well.) I briefly tried out a switch to headless Chrome a while ago, but I ran into too many problems early on and gave up.

This time, I decided to try it again after running into a weird bug — which I still don’t know the cause of. (Skip on down if you just want the migration guide.) This was a classic yak shave…

I got a request to add support for more langauges for one a particular feature, plots of the distribution of Wikipedia article quality before and after being worked on in class assignments. The graphs were generated with R and ggplot2, but this was really unreliable on the international version of the app. To get it working more reliably, I decided to try to reimplement the graphs client-side, using Vega.js. The type of graph I needed — kernel density estimation — was only available in a newer version of Vega, so first I needed to update all the other Vega plots in the app to work on the latest version of Vega, which had a lot of breaking changes from the version we’d been on. That was tough, but along the way I made some nice improvements to the other plots along with a great-looking implementation of the kernel density plot. But as I switched over to the minified versions of the Vega libraries and got ready to call it done, all of a sudden my feature specs were all failing. They passed fine with the non-minified versions, but the exact same assets in minified form — the distributed versions, straight from a CDN — caused Javascript errors that I couldn’t replicate in a normal browser. My best guess is that there’s some buggy interaction between Vega’s version of UglifyJS and the JS runtime in PhantomJS, which is triggered by something in Vega. In any case, after failing to find any other fixes, it seemed like the right time to take another shot at the Poltergeist → headless Chrome migration — which I’m happy to say, worked out nicely. So, after what started as an i18n support request, I’m happy to report that my app no longer relies on R (or rinruby to integrate between R and Ruby) and my feature tests are all running more smoothly and with fewer random failures on headless Chrome.


Using R in production was fun and interesting, but I definitely won’t be doing it again any time soon.

If you want to see that Vega plot that started it all, this is a good place to look. Just click ‘Change in Structural Completeness’. (Special thanks to Amit Joki, who added the interactive controls.)

Setting up Capybara

The basic setup is pretty simple: put selenium-webdriver and chromedriver-helper in the Gemfile, and then register the driver in the test setup file. For me it looked like this:

Capybara.register_driver :selenium do |app|
  options = Selenium::WebDriver::Chrome::Options.new(
    args: %w[headless no-sandbox disable-gpu --window-size=1024,1024]
  Capybara::Selenium::Driver.new(app, browser: :chrome, options: options)

Adding or removing the headless option makes it easy to switch between modes, so you can pop up a real browser to watch your tests run when you need to debug something.

Adding the chrome: stable addon in .travis.yml got it working on CI as well.

Dealing with the differences between Poltergeist and Selenium Chromedriver

You can run Capybara feature tests with a handful of different drivers, and the core features of the framework will work with any driver. But around the edges, there are some pretty big differences in behavior and capabilities between them. For Poltergeist vs. Selenium and Chrome, these are the main ones that I had to address during the migration:

More accurate rendering in Chrome

PhantomJS has some significant holes in CSS support, which is especially a problem when it comes to misrendering elements as overlapping when they should not be. Chrome does much more accurate rendering, closely matching what you’ll see using Chrome normally. Relatedly, Poltergeist implements .trigger('click'), which unlike the normal Capybara .click , can work even if the target element is underneath another one. A common error message with Poltergeist points you to try .trigger('click') when the normal .click fails, and I had to swap a lot of those back to .click.

Working with forms and inputs

The biggest problem I hit was interacting with date fields. In Poltergeist, I was using text input to set date fields, and this worked fine. Things started blowing up in Chrome, and it took me a while to figure out that I needed provide Capybara with Date objects instead of strings to make it work. Capybara maintainer Thomas Walpole (who is incredibly helpful) explained it to me:

fill_in with a string will send those keystrokes to the input — that works fine with poltergeist because it doesn’t actually support date inputs so they’re treated as standard text inputs so the with parameter is just entered into the field – Chrome however supports date inputs with it’s own UI.
By passing a date object, Capybara’s selenium driver will use JS to correctly set the date to the input across all locales the browser is run in. If you do want to send keystrokes to the input you’d need to send the keystrokes a user would have to type on the keyboard in the locale the browser is running in —In US English locale that would mean fill_in(‘campaign_start’, with: ’01/10/2016’)

Chromedriver is also pickier about which elements you can send input to. In particular, it must be focusable. With some trial and error, I was able to find focusable elements for all the inputs I was interacting with.

Handling Javascript errors

The biggest shortcoming with Selenium + Chrome is the lack of support for the js_errors: true option. With this option, a test will fail on any Javascript error that shows up in the console, even if the page is otherwise meeting the test requirements. Using that option was one of the main reasons we switched to Poltergeist in the first place, and it’s been extremely useful in preventing bugs and regressions in the React frontend.

Fortunately, there’s a fairly easy way to hack this feature back in with Chrome, as suggested by Alessandro Rodi. I modified Rodi’s version a bit, adding in the option to disable the error catching on individual tests — since a few of my tests involve testing error behavior. Here’s what it looks like, in my rails_helper.rb:

  # fail on javascript errors in feature specs
  config.after(:each, type: :feature, js: true) do |example|
    errors = page.driver.browser.manage.logs.get(:browser)
    # pass `js_error_expected: true` to skip JS error checking
    next if example.metadata[:js_error_expected]

    if errors.present?
      aggregate_failures 'javascript errrors' do
        errors.each do |error|
          # some specs test behavior for 4xx responses and other errors.
          # Don't fail on these.
          next if error.message =~ /Failed to load resource/

          expect(error.level).not_to eq('SEVERE'), error.message
          next unless error.level == 'WARNING'
          STDERR.puts 'WARN: javascript warning'
          STDERR.puts error.message

Different behavior using matchers to interact with specific parts of the page

Much of the Capybara API is driven with HTML/CSS selectors for isolating the part of the page you want to interact with.

I found a number of cases where these behaved differently between drivers, most often in the form of Chrome reporting an ambiguous selector that matches multiple elements when the same selector worked fine with Poltergeist. These were mostly cases where it was easy enough to write a more precise selector to get the intended element.

In a few cases with some of the intermittently failing specs, Selenium + Chrome also provided more precise and detailed error messages when a target element couldn’t be found — giving me enough information to fix the badly-specified selectors that were causing the occasional failures.

Blocking external urls

With Poltergeist, you can use the url_blacklist option to prevent loading specific domains. That’s not available with Chromedriver. We were using it just to reduce unnecessary network traffic and speed things up a bit, so I didn’t bother trying out the alternatives, the most popular of which seems to be to use Webmock to mock responses from the domains you want to block.

Status code

In Poltergeist, you can easily see what the HTTP status code for a webpage is: page.status_code. This feature is missing altogether in Selenium. I read about a few convoluted ways to get the status code, but for my test suite I decided to just do without explicit tests of status codes.

Other useful migration guides and resources

There are a bunch of blog posts and forum threads on this topic, but the two I found really useful are:

* https://about.gitlab.com/2017/12/19/moving-to-headless-chrome/
* https://github.com/teamcapybara/capybara/issues/1860

First impressions of Ruby branch coverage with DeepCover

Branch coverage for Ruby is finally on the horizon! The built-in coverage library is expected to ship in Ruby 2.5 with branch and method coverage options.

And a pure-Ruby gem is in development, too: DeepCover.

I gave DeepCover a try with my main project, the Wiki Education Dashboard, and the coverage of the Ruby portions of the app dropped from 100% to 96.75%. Most of that is just one-line guard clauses like this:

[ruby]return unless Features.wiki_ed?[/ruby]

But there are a few really useful things that DeepCover revealed. First off, unused scopes on Rails ActiveRecord models:

class Revision < ActiveRecord::Base
scope :after_date, ->(date) { where(‘date > ?’, date) }

Unlike default line coverage, DeepCover showed that this lambda wasn’t ever actually used. It was dead code that I removed today.

Similarly, DeepCover shows where default arguments in a method definition are never used, like this:
class Article < ActiveRecord::Base
def update(data={}, save=true)
self.attributes = data
self.save if save

This method was overriding the ActiveRecord update method, adding an option to update without saving. But we no longer use that second argument anywhere in the codebase, meaning I could delete the whole method and rely on the standard version of update.

Diderot — a Pebble watchface for finding nearby unillustrated Wikipedia articles

photo-nov-05-2-52-49-pmI published a watchface for Pebble smartwatches that shows you the nearest Wikipedia article that lacks a photograph. Have a Pebble and like to — or want to ­— contribute to Wikipedia? Try it out! It’s called Diderot. (Collaborators welcome!)

After using it myself for about a month and a half, I’ve finally added photographs to all the Wikipedia articles near my house within the range of Wikipedia’s ‘nearby’ API.

Extra thanks go to Albin Larrson, who built the WMF Labs API that my app uses to find nearby unillustrated articles. The great thing about it is that it filters out articles that have .png or .svg images, so you still find the articles that have only a map or logo rather than a real photograph.

Rails migrations and Capistrano don’t mix

Last night I learned the hard way what happens when Rails migrations break.

My main project, the Wiki Ed Dashboard, is set up for automatic deployment — via Capistrano and travis-ci— whenever we push new commits to the staging or production branch. It’s mostly nice.

But I ran some migrations yesterday that I shouldn’t have. In particular, this one added three new columns to a table. When I pushed it to staging, the migration took about 5 minutes and then finished. Since the staging app was unresponsive during that time, I waited until the evening to deploy it to production. But things went much worse on production, which has a somewhat large database. The migration took more than 10 minutes — at which point, travis-ci decides that the build has become unresponsive, and kills it. The migration didn’t complete.

No problem, I thought, I’ll just run the migration again. Nope! It tursn out that the first column from that migration actually made it into the MySQL database. Running it again triggered a duplicate column error. Hmmm… okay. Maybe all the columns got added, but the migration didn’t get added to the database? So I manually added the migration id to the schema_migrations table. Alas, no. Things are still broken, because the other two columns didn’t actually get added.

That’s why Rails migrations have an up and a down version, right? I’ll just migrate that one down and back up. But with only one of three columns present, neither the up nor the down migration would run. I ended up writing an ad-hoc migration to add just the second and third columns, deploying it from my own machine, and then deleting the migration afterwards. I fixed production, but it wasn’t pretty.

My takeaway from this: if you deploy via Capistrano — and especially if you deploy straight from your continuous integration server — then write a separate migration for every little thing. When things go bad in the middle of deployment, you don’t want to be stuck with a half-completed migration.

(among the) best programming podcasts

Since I started both running semi-regularly and also biking 30+minutes on the Burke a couple times per week, I’ve started listening to a lot of podcasts — mainly focusing on technology (especially free software, web development, and Ruby) and product management. I’ve listening to enough good, and enough bad, that I want to share some of the podcasts I’ve found most interesting and helpful.

Weekly podcasts

These are the most consistently good, consistently released ones I listen to. Not every episode is great, but they are worthwhile enough of the time that I usually at least sample a bit of each new episode.

  • Ruby Rogues – a great panel, featuring the awesome Coraline Ada Ehmke, among others. It’s excellent both for Rubyists specifically and as a general software discussion venue.
  • Talk Python To Me – a superb interview-based podcast. The focus is Python, but it’s very accessible even without a deep knowledge of the specific language, and I often get ideas from it that are relevant for my work. Episodes usually start with a lot of personal narrative about how the interviewee got to where they are, which is often really interesting.
  • CodePen Radio – this one is focused on the startup codepen.io, and is usually a fun listen. The range of topics — all drawn from running a web app-based startup — maps pretty nicely onto the things that are relevant for me, running the technology side of a small nonprofit. (I’ve still never used CodePen, and don’t feel like I need to in order to get value from the podcast.)
  • The Changelog – the best of the handful of free / open source software podcasts, this one is interview based, usually goes deeply into the background of each guest, and has consistently interesting guests.
  • Javascript Jabber – the javascript companion to Ruby Rogues, this one is a little more scattered and less consistently insightful, but still has a pretty high ratio of solid episodes.
  • Ruby5 – this short podcast comes out twice per week, and basically runs down interesting news and new releases in the Ruby and Rails worlds. It’s a little cheesy, but it’s worth your time if you work with Ruby or Rails.

Individual episodes

These are some of the podcast episodes that I recommend. Some come from the podcasts above, and others are individual episodes from podcasts that I otherwise don’t listen to regularly or wouldn’t recommend highly.

  • The Changelog: The Future of WordPress and Calypso with Matt Mullenweg – I wish I could just hang out all the time with Matt Mullenweg.
  • Data Skeptic: Wikipedia Revision Scoring as a Service – This interview with Aaron Halfaker is the best overview of Wikipedia’s editor trends that I’ve seen/heard.
  • Javascript Jabber: The Evolution of Flux Libraries – This late-2015 overview of React, Flux and Redux is the best of many React-related podcasts I’ve listened it. It helped clarify my thinking a lot.
  • Ruby Rogues: Neo4j – a nice Ruby-centered introduction to the concept of graph databases
  • Javascript Jabber: npm 3 – an interesting overview of the npm roadmap, which helped me understand a lot more about what npm does and what it’s trying to do
  • Ruby Rogues: Feature Toggles – a discussion of feature toggles as an key enabler of a trunk-based development git strategy
  • Ruby Rogues: The Crystal Programming Language – with the creator of Crystal, made me eager to start using Crystal
  • Ruby Rogues: The Leprechauns of Software Engineering – with the author of the book of the same title, super interesting