How to write your own rspec retry mechanism
Jan 17, 2016Imagine you have an rspec test suite filled with system tests. Each system test simulates how a real user would interact with your app by opening a browser session through which it fills out text fields, clicks on buttons, and sends data to public endpoints. Unfortunately, browser drivers are not without bugs and sometimes your tests will fail because of these. Wouldn’t it be nice if we could automatically retry these failed tests?
This article starts by investigating how rspec formatters can be used to help us keep track of failed tests. Next, we’ll use this information to take a first stab at creating a rake task that can automatically retry failed tests. Lastly, we’ll explore how to further improve our simple rake task so as to make it ready for use in production.
Note that any code shown in this post is only guaranteed to work with rspec 3.3. In the past I’ve written similar code for other rspec versions as well though. So don’t worry, it shouldn’t be too hard to get all of this to work on whatever rspec version you find yourself using.
Rspec formatters
Rspec generates its command line output by relying on formatters that receive messages on events like example_passed
and example_failed
. We can use these hooks to help us keep track of failed tests by having them write the descriptions of failed tests to a text file named tests_failed
. Our FailureFormatter
class does just that.
# failure_formatter.rb
require 'rspec/core/formatters/progress_formatter'
class FailureFormatter < RSpec::Core::Formatters::ProgressFormatter
RSpec::Core::Formatters.register self, :example_failed
def example_failed(notification)
super
File.open('tests_failed', 'a') do |file|
file.puts(notification.example.full_description)
end
end
end
We’ll soon have a look at how tests behave when we try to run them with the formatter shown above. But first, let’s prepare some example tests. We’ll create two tests. One of which will always pass, and another one which will always fail.
# my_fake_tests_spec.rb
describe 'my fake tests', :type => :feature do
it 'this scenario should pass' do
expect(true).to eq true
end
it 'this scenario should fail' do
expect(false).to eq true
end
end
Having done that, we can now run our tests with the FailureFormatter
we wrote earlier. As you can see below, we’ll have to pass both --require
and --format
params in order to get our formatter to work. I’m also using the --no-fail-fast
flag so as to prevent our test suite from exiting upon encountering its first failure.
$ bundle exec rspec --require ./spec/formatters/failure_formatter.rb --format FailureFormatter --no-fail-fast
.F
Failures:
1) my fake tests this scenario should fail
Failure/Error: expect(false).to eq true
expected: true
got: false
(compared using ==)
# ./spec/my_fake_tests_spec.rb:8:in `block (2 levels) in <top (required)>'
Finished in 0.02272 seconds (files took 0.0965 seconds to load)
2 examples, 1 failure
Failed examples:
rspec ./spec/my_fake_tests_spec.rb:7 # my fake tests this scenario should fail
After running this, we should now have a tests_failed
file that contains a single line describing our failed test. As we can see in the snippet below, this is indeed the case.
$ cat tests_failed
my fake tests this scenario should fail
Take a moment to reflect on what we have just done. By writing just a few lines of code we have effectively created a logging mechanism that will help us keep track of failed tests. In the next section we will look at how we can make use of this mechanism to automatically rerun failed tests.
First pass at creating the retry task
In this section we will create a rake task that runs our rspec test suite and automatically retries any failed tests. The finished rake task is shown below. For now, have a look at this code and then we’ll go over its details in the next few paragraphs.
require 'fileutils'
task :rspec_with_retries, [:max_tries] do |_, args|
max_tries = args[:max_tries].to_i
# construct initial rspec command
command = 'bundle exec rspec --require ./spec/formatters/failure_formatter.rb --format FailureFormatter --no-fail-fast'
max_tries.times do |t|
puts "\n"
puts '##########'
puts "### STARTING TEST RUN #{t + 1} OUT OF A MAXIMUM OF #{max_tries}"
puts "### executing command: #{command}"
puts '##########'
# delete tests_failed file left over by previous run
FileUtils.rm('tests_failed', :force => true)
# run tests
puts `#{command}`
# early out
exit 0 if $?.exitstatus.zero?
exit 1 if (t == max_tries - 1)
# determine which tests need to be run again
failed_tests = []
File.open('tests_failed', 'r') do |file|
failed_tests = file.readlines.map { |line| "\"#{line.strip}\"" }
end
# construct command to rerun just the failed tests
command = ['bundle exec rspec']
command += Array.new(failed_tests.length, '-e').zip(failed_tests).flatten
command += ['--require ./spec/formatters/failure_formatter.rb --format FailureFormatter --no-fail-fast']
command = command.join(' ')
end
end
The task executes the bundle exec rspec
command a max_tries
number of times. The first iteration runs the full rspec test suite with the FailureFormatter
class and writes the descriptions of failed tests to a tests_failed
file. Subsequent iterations read from this file and use the -e option to rerun the tests listed there.
Note that these subsequent iterations use the FailureFormatter
as well. This means that any tests that failed during the second iteration will get written to the tests_failed
file to be retried by the third iteration. This continues until we reach the max number of iterations or until one of our iterations has all its tests pass.
Every iteration deletes the tests_failed
file from the previous iteration. For this we use the FileUtils.rm
method with the :force
flag set to true
. This flag ensures that the program doesn’t crash in case the tests_failed
file doesn’t exist. The above code relies on backticks to execute the bundle exec rspec
subprocess. Because of this we need to use the global variable $?
to access the exit status of this subprocess.
Below you can see the output of a run of our rake task. Notice how the first iteration runs both of our tests, whereas the second and third iterations rerun just the failed test. This shows our retry mechanism is indeed working as expected.
$ rake rspec_with_retries[3]
##########
### STARTING TEST RUN 1 OUT OF A MAXIMUM OF 3
### executing command: bundle exec rspec --require ./spec/formatters/failure_formatter.rb --format FailureFormatter --no-fail-fast
##########
.F
Failures:
1) my fake tests this scenario should fail
Failure/Error: expect(false).to eq true
expected: true
got: false
(compared using ==)
# ./spec/my_fake_tests_spec.rb:8:in `block (2 levels) in <top (required)>'
Finished in 0.02272 seconds (files took 0.0965 seconds to load)
2 examples, 1 failure
Failed examples:
rspec ./spec/my_fake_tests_spec.rb:7 # my fake tests this scenario should fail
##########
### STARTING TEST RUN 2 OUT OF A MAXIMUM OF 3
### executing command: bundle exec rspec -e "my fake tests this scenario should fail" --require ./spec/formatters/failure_formatter.rb --format FailureFormatter --no-fail-fast
##########
Run options: include {:full_description=>/my\ fake\ tests\ this\ scenario\ should\ fail/}
F
Failures:
1) my fake tests this scenario should fail
Failure/Error: expect(false).to eq true
expected: true
got: false
(compared using ==)
# ./spec/my_fake_tests_spec.rb:8:in `block (2 levels) in <top (required)>'
Finished in 0.02286 seconds (files took 0.09094 seconds to load)
1 example, 1 failure
Failed examples:
rspec ./spec/my_fake_tests_spec.rb:7 # my fake tests this scenario should fail
##########
### STARTING TEST RUN 3 OUT OF A MAXIMUM OF 3
### executing command: bundle exec rspec -e "my fake tests this scenario should fail" --require ./spec/formatters/failure_formatter.rb --format FailureFormatter --no-fail-fast
##########
Run options: include {:full_description=>/my\ fake\ tests\ this\ scenario\ should\ fail/}
F
Failures:
1) my fake tests this scenario should fail
Failure/Error: expect(false).to eq true
expected: true
got: false
(compared using ==)
# ./spec/my_fake_tests_spec.rb:8:in `block (2 levels) in <top (required)>'
Finished in 0.02378 seconds (files took 0.09512 seconds to load)
1 example, 1 failure
Failed examples:
rspec ./spec/my_fake_tests_spec.rb:7 # my fake tests this scenario should fail
The goal of this section was to introduce the general idea behind our retry mechanism. There are however several shortcomings in the code that we’ve shown here. The next section will focus on identifying and fixing these.
Perfecting the retry task
The code in the previous section isn’t all that bad, but there are a few things related to the bundle exec rspec
subprocess that we can improve upon. In particular, using backticks to initiate subprocesses has several downsides:
- the standard output stream of the subprocess gets written into a buffer which we cannot print until the subprocess finishes
- the standard error stream does not even get written to this buffer
- the backticks approach does not return the id of the subprocess to us
This last downside is especially bad as not having the subprocess id makes it hard for us to cancel the subprocess in case the rake task gets terminated. This is why I prefer to use the childprocess gem for handling subprocesses instead.
require 'fileutils'
require 'childprocess'
task :rspec_with_retries, [:max_tries] do |_, args|
max_tries = args[:max_tries].to_i
# exit hook to ensure rspec process gets stopped when CTRL+C (SIGTERM is pressed)
# needs to be set outside the times loop as otherwise each iteration would add its
# own at_exit hook
process = nil
at_exit do
process.stop unless process.nil?
end
# construct initial rspec command
command = ['bundle', 'exec', 'rspec', '--require', './spec/formatters/failure_formatter.rb', '--format', 'FailureFormatter', '--no-fail-fast']
max_tries.times do |t|
puts "\n"
puts '##########'
puts "### STARTING TEST RUN #{t + 1} OUT OF A MAXIMUM OF #{max_tries}"
puts "### executing command: #{command}"
puts '##########'
# delete tests_failed file left over by previous run
FileUtils.rm('tests_failed', :force => true)
# run tests in separate process
process = ChildProcess.build(*command)
process.io.inherit!
process.start
process.wait
# early out
exit 0 if process.exit_code.zero?
exit 1 if (t == max_tries - 1)
# determine which tests need to be run again
failed_tests = []
File.open('tests_failed', 'r') do |file|
failed_tests = file.readlines.map { |line| line.strip }
end
# construct command to rerun just the failed tests
command = ['bundle', 'exec', 'rspec']
command += Array.new(failed_tests.length, '-e').zip(failed_tests).flatten
command += ['--require', './spec/formatters/failure_formatter.rb', '--format', 'FailureFormatter', '--no-fail-fast']
end
end
As we can see from the line process = ChildProcess.build(*command)
, this gem makes it trivial to obtain the subprocess id. This then allows us to write an at_exit
hook that shuts this subprocess down upon termination of our rake task. For example, using ctrl+c to cease the rake task will now cause the rspec subprocess to stop as well.
This gem also makes it super easy to inherit the stdout and stderr streams from the parent process (our rake task). This means that anything that gets written to the stdout and stderr streams of the subprocess will now be written directly to the stdout and stderr streams of our rake task. Or in other words, our rspec subprocess is now able to output directly to the rake task’s terminal session. Having made these improvements, our rspec_with_retries
task is now ready for use in production.
Conclusion
I hope this post helped some people out there who find themselves struggling to deal with flaky tests. Please note that a retry mechanism such as this is really only possible because of rspec’s powerful formatters. Get in touch if you have any examples of other cool things built on top of this somewhat underappreciated feature!