The disaster that is Ruby's timeout method
Dec 19, 2015
On paper, Ruby’s timeout method looks like an incredibly useful piece of code. Ever had a network request occasionally slow down your entire program because it just wouldn’t finish? That’s where timeout comes in. It provides a hard guarantee that a block of code will be finished within a specified amount of time.
require 'timeout'

Timeout.timeout(5) do
  # block of code that should be interrupted if it takes more than 5 seconds
end
There’s one thing the documentation doesn’t tell you though. If any of the lines in that block of code introduces side effects that rely on the execution of later lines of code to leave things in a stable state, then using the timeout method is a great way to introduce instability in your program. In practice, that covers pretty much any program that keeps state. Let’s have a closer look at this method to try and figure out what’s going on here exactly.
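As a small illustration of that claim, consider a transfer between two balances. The sketch below is hypothetical (the balances hash and the transfer method are made up for this example), and the sleep merely stands in for slow work such as a network call. The first line of the transfer relies on the last line to restore a consistent state, and the timeout fires in between.

```ruby
require 'timeout'

# Hypothetical state, for illustration only.
balances = { alice: 100, bob: 0 }

# Moves money in two steps; the state is only consistent after both ran.
def transfer(balances, amount)
  balances[:alice] -= amount
  sleep 0.5 # stands in for slow work between the two steps
  balances[:bob] += amount
end

begin
  Timeout.timeout(0.1) { transfer(balances, 30) }
rescue Timeout::Error
  puts 'transfer timed out'
end

# alice has been debited, but bob was never credited
puts balances.inspect
```

The Timeout::Error is rescued cleanly, yet 30 units have vanished from the system.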
Exceptions absolutely anywhere
The problem with timeout is that it relies upon Ruby’s questionable ability to have one thread raise an exception absolutely anywhere in an entirely different thread. The idea is that when you place code inside a timeout block, a second thread is started in the background that goes to sleep for the given number of seconds (5 in the example above) while your block keeps running. If the block is still running when the background thread wakes up, that thread grabs the thread executing your block and forcefully stops it by raising a Timeout::Error exception on it (actual implementation). The script below shows the underlying mechanism, with the main thread raising an exception in a background thread.
# raising_exceptions.rb
# threads can raise exceptions in other threads
thr = Thread.new do
  puts '...initializing resource'
  sleep 1
  puts '...using resource'
  sleep 1
  puts '...cleaning resource'
  sleep 1
end

sleep 1.5
thr.raise('raising an exception in the thread')
$ ruby raising_exceptions.rb
...initializing resource
...using resource
The problem with this approach is that the main thread does not care what code the background thread is executing when it raises the exception. This means that the engineer responsible for the code that gets executed by the background thread needs to assume an exception can get thrown from absolutely anywhere within her code. This is madness! No one can be expected to place exception catchers around every single block of code!
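To tie this back to timeout itself, here is a minimal sketch of how a timeout helper can be built on top of Thread#raise. The name naive_timeout is hypothetical, and the code is a simplification for illustration, not the standard library's implementation. A watcher thread sleeps for the given duration and then raises Timeout::Error on whichever thread called the helper, no matter what that thread is doing at that moment.

```ruby
require 'timeout'

# Hypothetical helper, for illustration only; not the stdlib implementation.
def naive_timeout(seconds)
  target = Thread.current
  watcher = Thread.new do
    sleep seconds
    # Interrupts the target thread at whatever line it happens to be on.
    target.raise(Timeout::Error, 'execution expired')
  end
  yield
ensure
  # Stop the watcher if the block finished (or failed) before the deadline.
  if watcher
    watcher.kill
    watcher.join
  end
end

result = begin
  naive_timeout(0.2) do
    sleep 1 # the block needs longer than the allowed 0.2 seconds
    :finished
  end
rescue Timeout::Error
  :timed_out
end

puts result
```

Note that the watcher has no idea which line of the block is executing when it fires, which is exactly the property the rest of this post takes issue with.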
The following code further illustrates the problem of being able to raise an exception absolutely anywhere. Turns out that absolutely anywhere includes locations like the inside of ensure blocks. These locations are generally not designed for handling any exceptions at all. I hope you weren’t using an ensure block to terminate your database connection!
# ensure_block.rb
# raising exceptions inside an ensure block of another thread
# note how we never finish cleaning the resource here
thr = Thread.new do
  begin
    puts '...initializing resource'
    sleep 1
    raise 'something went wrong'
    puts '...using resource'
    sleep 1
  ensure
    puts '...started cleaning resource'
    sleep 1
    puts '...finished cleaning resource'
  end
end

sleep 1.5
thr.raise('raising an exception in the thread')

# prevent program from immediately terminating after raising exception
sleep 5
$ ruby ensure_block.rb
...initializing resource
...started cleaning resource
Real world example
Recently, I spent a lot of time working with the curb http client. I ended up wrapping quite a few of my curb calls within timeout blocks because of tight time constraints. However, this caused great instability within the system I was working on. Sometimes a call would work, whereas other times that very same call would throw an exception about an invalid handle. It was this that caused me to start investigating the timeout method.
After having a bit of a think, I came up with a proof of concept that showed beyond a doubt that the timeout method was introducing instability in the very internals of my http client. The finished proof of concept code can look a bit complex, so rather than showing the final code straight away, I’ll run you through my thought process instead.
Let’s start with the basics and write some code that uses the http client to fetch a random Google page. A randomized parameter is added to the Google url in order to circumvent any client-side caching. The page fetch itself is wrapped inside a timeout block as we are interested in testing whether the timeout method is corrupting the http client.
# basics.rb
# timeout doesn't get triggered
require 'curb'
require 'timeout'

Timeout.timeout(1) do
  Curl.get("http://www.google.com?foo=#{rand}")
end
This code will rarely time out as a page fetch generally takes way less than one second to complete. This is why we’re going to wrap our page fetch inside an infinite while loop.
# infinite_loop.rb
# timeout gets triggered and Timeout::Error exception gets thrown
require 'curb'
require 'timeout'

Timeout.timeout(1) do
  while true
    Curl.get("http://www.google.com?foo=#{rand}")
  end
end
$ ruby infinite_loop.rb
/Users/vaneyckt/.rvm/gems/ruby-2.0.0-p594/gems/curb-0.8.8/lib/curl/easy.rb:68:
in 'perform': execution expired (Timeout::Error)
The above code is now timing out and throwing a Timeout::Error exception. Next we want to determine whether the timing out of a page fetch could corrupt the internal state of the http client, thereby causing problems for a subsequent page fetch. We’ll need to make lots of page fetches to test this, so we’re going to wrap all of our current code inside another infinite while loop. Furthermore, we don’t want any Timeout::Error exceptions to break us out of this while loop, so we’re going to catch and ignore these exceptions inside the while loop we just created. This gives us our finished proof of concept code.
# proof_of_concept.rb
# timeout corrupts the very internals of the curb http client
require 'curb'
require 'timeout'

while true
  begin
    Timeout.timeout(1) do
      while true
        Curl.get("http://www.google.com?foo=#{rand}")
      end
    end
  rescue Timeout::Error
    # ignore the timeout and start the next round of page fetches
  end
end
$ ruby proof_of_concept.rb
/Users/vaneyckt/.rvm/gems/ruby-2.0.0-p594/gems/curb-0.8.8/lib/curl/easy.rb:67:
in 'add': CURLError: The easy handle is already added to a multi handle
(Curl::Err::MultiAddedAlready)
Running the above program will result in an exception being thrown after a few seconds. At some point, the timeout method is causing a Timeout::Error exception to be raised inside a critical code path of the http client. This badly timed Timeout::Error exception leaves the client in an invalid state, which in turn causes the next page fetch to fail with the exception shown above. Hopefully this illustrates why you should avoid creating programs that can have Timeout::Error exceptions pop up absolutely anywhere.
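This kind of failure can be reproduced without curb or a network connection. The sketch below uses a hypothetical FakeClient class whose get method sets a flag, does some slow work, and clears the flag again. This is only an assumed stand-in for the sort of bookkeeping an http client might perform internally; it does not describe curb's actual code.

```ruby
require 'timeout'

# Hypothetical stand-in for an http client; not curb's real internals.
class FakeClient
  class AlreadyAdded < StandardError; end

  def initialize
    @handle_added = false
  end

  def get(duration)
    raise AlreadyAdded, 'handle is already added' if @handle_added

    @handle_added = true   # step 1: register the handle
    sleep duration         # step 2: perform the (simulated) request
    @handle_added = false  # step 3: unregister the handle
    :ok
  end
end

client = FakeClient.new

begin
  # The request needs 0.5s but only gets 0.1s, so it is cut off in step 2.
  Timeout.timeout(0.1) { client.get(0.5) }
rescue Timeout::Error
  # ignored, just like in the proof of concept
end

# The very next call on the same client now fails.
second_attempt = begin
  client.get(0)
rescue FakeClient::AlreadyAdded => e
  e
end

puts second_attempt.inspect
```

Moving step 3 into an ensure block would narrow the window but not close it, since the ensure block itself can be interrupted as shown earlier.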
Conclusion
I hope this has convinced you there is nothing you can do to prevent timeout from doing whatever it wants to your program’s internal state. There is just no way a program can deal with Timeout::Error exceptions being able to potentially pop up absolutely anywhere. The only time you can really get away with using timeouts is when writing functional code that does not rely on any state. In all other cases, it is best to just avoid timeouts entirely.