Track flaky specs automatically using this simple tweak in RSpec builds
Last edited Jan 2020
GoCardless’ Product Development team recently held a summer Hackathon, during which we developed a simple tweak for our RSpec builds to track flaky specs (that is, intermittently failing tests).
The problem
There’s a lot said about flaky specs because they often lead to wasted time and effort. At one point or another, we have probably all been about to merge or deploy changes when a required build check stopped us, only because a flaky spec acted up!
When this happens, there’s no option but to retrigger the build and wait for it to go green to unblock the change. We try to remind ourselves to write up a ticket and come back to the flaky spec, but this isn’t always possible when you have a number of other things to work on.
So, we needed an automated way to keep track of flaky specs so that they are visible and we are reminded about fixing them.
The solution
We depend on RSpec’s --only-failures option, which reruns only the failures from the last run. This option requires RSpec to be configured to persist spec execution results to a file:
RSpec.configure do |config|
  config.example_status_persistence_file_path = "/tmp/ci/example_status.txt"
end
Here’s what our build script looks like:
build_exit_status=0
execute_rspec() {
  bundle exec rspec
  build_exit_status=$?
  # copy rather than move, so that --only-failures can still read the statuses on the rerun
  cp /tmp/ci/example_status.txt /tmp/ci/example_status.txt.run1
  # return the RSpec status so that the rerun below is triggered only on failure
  return "$build_exit_status"
}
execute_rspec_failures_only() {
  bundle exec rspec --only-failures
  mv /tmp/ci/example_status.txt /tmp/ci/example_status.txt.run2
}
track_flaky_specs() {
  # find failing specs from example_status.txt.run1
  # check if they passed in example_status.txt.run2
  # if a spec passed on the second run, create a ticket to fix this flaky spec
  # if it failed again, the build fails because this is more likely to be a valid breakage
  grep "| failed" /tmp/ci/example_status.txt.run1 | cut -d" " -f1 \
    | xargs -I{} grep -F {} /tmp/ci/example_status.txt.run2 | grep "| passed" | cut -d"[" -f1 | uniq \
    | xargs -I{} bundle exec rake create_ticket\["Flaky spec: {}","Build: $BUILD_URL"\]
}
execute_rspec || (execute_rspec_failures_only && track_flaky_specs)
exit "$build_exit_status"
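For readers who find the grep and cut pipeline in track_flaky_specs hard to follow, here is a minimal Ruby sketch of the same comparison. It is not part of our build script. It assumes the pipe-delimited layout that RSpec writes to the status file (example_id, status, run_time), and the method names and sample spec paths are made up for illustration.

```ruby
# Sketch only: mirrors the comparison done by track_flaky_specs.
# Assumes each status-file row looks like:
#   ./spec/foo_spec.rb[1:1] | passed | 0.00123 seconds |

# Parse status-file text into a hash of example id => status.
def parse_statuses(text)
  text.each_line.with_object({}) do |line, statuses|
    id, status = line.split("|").map(&:strip)
    next if id.nil? || status.nil?                          # blank lines
    next if id == "example_id" || id.start_with?("---")     # header rows
    statuses[id] = status
  end
end

# Spec files with at least one example that failed in run 1 but passed in run 2.
def flaky_spec_files(run1_text, run2_text)
  run1 = parse_statuses(run1_text)
  run2 = parse_statuses(run2_text)

  run1.select { |id, status| status == "failed" && run2[id] == "passed" }
      .keys
      .map { |id| id.sub(/\[.*\]\z/, "") } # drop the "[1:2]" example index
      .uniq
end

# Hypothetical sample data, standing in for the .run1 and .run2 files.
RUN1 = <<~STATUS
  example_id                   | status | run_time        |
  ---------------------------- | ------ | --------------- |
  ./spec/payment_spec.rb[1:1]  | passed | 0.00123 seconds |
  ./spec/payment_spec.rb[1:2]  | failed | 0.00456 seconds |
  ./spec/refund_spec.rb[1:1]   | failed | 0.00789 seconds |
STATUS

RUN2 = <<~STATUS
  example_id                   | status | run_time        |
  ---------------------------- | ------ | --------------- |
  ./spec/payment_spec.rb[1:1]  | passed | 0.00123 seconds |
  ./spec/payment_spec.rb[1:2]  | passed | 0.00321 seconds |
  ./spec/refund_spec.rb[1:1]   | failed | 0.00654 seconds |
STATUS

puts flaky_spec_files(RUN1, RUN2)
```

In this sample only payment_spec.rb is reported as flaky. refund_spec.rb failed on both runs, so it is treated as a likely genuine breakage and not ticketed.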
The create_ticket rake task creates a ticket to remind us to fix this flaky spec. If there’s an existing ticket with that title, it just adds a comment to it about the recurrence. This tells us how often a flaky spec affects developers.
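The create-or-comment behaviour might look roughly like the sketch below. The Ticket struct, the TicketInbox class and the example build URLs are all hypothetical stand-ins. A real rake task would call the ticketing system's API and would not keep tickets in memory.

```ruby
# Hypothetical in-memory stand-in for a ticketing system (illustration only).
Ticket = Struct.new(:title, :body, :comments)

class TicketInbox
  attr_reader :tickets

  def initialize
    @tickets = []
  end

  # Create a ticket, or comment on the existing ticket with the same title.
  def create_or_comment(title, body)
    existing = @tickets.find { |ticket| ticket.title == title }
    if existing
      existing.comments << "Recurrence. #{body}"
      existing
    else
      Ticket.new(title, body, []).tap { |ticket| @tickets << ticket }
    end
  end
end

inbox = TicketInbox.new
title = "Flaky spec: ./spec/payment_spec.rb"
inbox.create_or_comment(title, "Build: https://ci.example.com/builds/1")
ticket = inbox.create_or_comment(title, "Build: https://ci.example.com/builds/2")

puts inbox.tickets.size   # distinct flaky specs reported
puts ticket.comments.size # recurrences after the first report
```

Keying on the title keeps one ticket per flaky spec file, and the comment count on that ticket doubles as a rough measure of how often the spec gets in people's way.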
GoCardless teams have a rotating First Responder role. A First Responder responds to tickets in the general Ticket Inbox coming from our deployed applications or teams outside of Product Development.
The flaky spec ticket is created in the Ticket Inbox. From here, the First Responder can assign it to the team that is best-placed to fix it.
The results
While we haven’t been running this on CI for long, we are already seeing the benefits.
The automatically created tickets have given us more visibility over flaky specs that would otherwise have gone unnoticed. Developers have actively fixed flaky specs they inadvertently introduced, knowing that these specs regularly affect fellow developers.
Having a list of flaky specs also gives us a chance to identify patterns behind misbehaving specs.
One such pattern we identified was related to our feature flagging library. It had been incorrectly configured to use in-memory storage. This resulted in feature flags being true in specs that did not expect them to be enabled at all.
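As an illustration of how this kind of misconfiguration can produce intermittent failures, here is a small sketch. It assumes the in-memory store is shared by every spec in the process, which is our guess at a plausible mechanism and not a description of the actual library. InMemoryFlags and the flag name are hypothetical.

```ruby
# Hypothetical process-wide, in-memory flag store (not the real library).
class InMemoryFlags
  def initialize
    @enabled = {}
  end

  def enable(name)
    @enabled[name] = true
  end

  def enabled?(name)
    @enabled.fetch(name, false)
  end

  # Forget every flag, e.g. between examples.
  def reset!
    @enabled.clear
  end
end

FLAGS = InMemoryFlags.new

# "Spec A" enables a flag and never cleans up.
FLAGS.enable(:new_checkout)

# "Spec B" runs later in the same process and expects the flag to be off.
leaked = FLAGS.enabled?(:new_checkout)

# Clearing the store between examples restores isolation.
FLAGS.reset!
after_reset = FLAGS.enabled?(:new_checkout)

puts "leaked=#{leaked} after_reset=#{after_reset}"
```

Under this assumption, whether "Spec B" fails depends on whether it happens to run after "Spec A". With randomised spec ordering, that would show up as a flaky spec.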
If you happen to try this out in your builds, we’d love to hear how that went @GoCardlessEng.