An intro to I/O-bound and CPU-bound solutions

Job&Talent Engineering
6 min read · Mar 28, 2023

Photo by Michael Dziedzic on Unsplash

Are you looking to optimise your Ruby on Rails program’s performance? In this article, I explore the definition of I/O-bound and CPU-bound applications, their differences, and their impact on resource allocation, hardware selection, and performance optimization. Whether you’re a beginner or an experienced Rails developer, this article will provide you with valuable insights.

By Ihor Pohasii

A bit of background

As a member of the UK Black Hole Team, I deal with different kinds of challenges, and one of them was recently solved with a multi-threading approach introduced by a colleague of mine.

That colleague showed me a solution that improved one of our endpoints using threads.

Threads are Ruby's primitive for concurrency. Seeing them in action made me revisit the topic for myself and, ultimately, write this article.
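
To illustrate the basic API, here is a minimal sketch (the URLs and the sleep standing in for network I/O are placeholders, not code from the original solution):

# Spawn one thread per task and wait for all of them to finish
urls = ["https://example.com/a", "https://example.com/b"]

threads = urls.map do |url|
  Thread.new(url) do |u|
    sleep(1) # simulate an I/O wait, e.g. an HTTP call
    puts "Fetched #{u}"
  end
end

threads.each(&:join)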

Reading the Ruby guidelines, we may encounter the following statement:

“As a result of using threads, you’ll have a multi-threaded Ruby program, which is able to get things done faster.

But one warning:

In the MRI (Matz’s Ruby Interpreter), the default way to run Ruby applications, you will only benefit from threads when running i/o bound applications”.

I remembered I/O bounds from my university days, and I immediately started to wonder what other kinds of bounds there are. This is how I ended up giving a presentation on the topic at one of Jobandtalent's bi-weekly backend meetings. This article is based on it.

Disclaimer: Please do treat this piece as an introduction to the topic, as there is much more than this to discover.

What are I/O and CPU-bound applications?

Let’s start with the definition.

I/O-bound applications are those whose execution time is limited mainly by the speed of input and output operations, such as disk or network access.

On the other hand, CPU-bound applications are limited by the processing power of a CPU (central processing unit).

Take a look at the example.

The diagram below shows the execution of two programs.

The grey rectangular units are CPU bursts, i.e. the time during which the CPU is busy doing calculations. The gaps between them are the time spent waiting for input and output.

If a program uses 90-100% of your CPU, it is CPU-bound. The upper part of the diagram represents such a program.

Advanced mathematical calculations, sorting algorithms and machine learning models are well-known examples of CPU-bound applications.

The lower part of the image depicts an I/O-bound application. As you can see in the picture, such programs have much shorter CPU bursts and a longer waiting period for input and output.

Real-life examples of I/O-bound applications include programs which interact with databases.

Applications which handle large amounts of data can also be I/O-bound if they need to read that data from disk or write it back.

Differences

That’s it for the high-level theory.

But why does even the Ruby documentation highlight the need to understand the difference between I/O-bound and CPU-bound solutions?

These differences actually make a huge impact when it comes to:

  • resource allocation (more informed decisions about assigning the right amount of memory, disk space, CPU power, etc.),
  • hardware selection (different programs have different hardware requirements, e.g. a CPU-bound application might benefit from a faster CPU with multiple cores, while an I/O-bound one might benefit from faster storage or more memory),
  • performance optimization (identifying potential bottlenecks, etc.).

Real-life examples

I/O bound application

Single-threaded code example

Let’s take a look at an example of Ruby I/O-bound code run in a single thread.

In our example, I will use a .txt file of ~400 MB. All executions are isolated in Docker containers, so that the resources available during execution can be limited.
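
For instance, a container with capped resources might be started like this (the image tag, script name and limits are assumptions for illustration, not the exact setup used for the measurements below); the single-threaded script itself follows:

docker run --rm --cpus="1" --memory="512m" -v "$PWD":/app -w /app ruby:3.2 ruby single_thread.rb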

# Start the timer
start = Time.now

# Read the whole file into memory and count the words
file = File.open("input.txt", "r")
contents = file.read
result = contents.scan(/\w+/).size
file.close

elapsed = Time.now - start

puts "Elapsed time: #{elapsed} seconds"
#
# Elapsed time: 59.997594584 seconds

Reading the file and iterating through its content in a single thread took almost a minute. As you can see, there were no expensive operations that would cause long CPU bursts; most of the time was spent reading the file. Relating the logic of this program to the diagram above, you could tell without hesitation that this is a typical example of I/O-bound code.
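
If you want to check where the time actually goes, a quick sketch (not part of the original measurement) is to time the read and the word count separately with Ruby's Benchmark module:

require 'benchmark'

contents = nil

# Time the I/O part: loading the file from disk
read_time = Benchmark.realtime do
  contents = File.read("input.txt")
end

# Time the CPU part: counting the words already in memory
scan_time = Benchmark.realtime do
  contents.scan(/\w+/).size
end

puts "Reading: #{read_time.round(2)}s, counting: #{scan_time.round(2)}s"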

Multithreaded code example

Let’s solve the very same problem as in the example above, but this time using a multithreaded approach.

# Define the number of threads
threads = 4

start = Time.now

# Split the input file into one segment per thread
file = File.open("input.txt", "r")
segment_size = (file.size.to_f / threads).ceil
segments = (0...threads).map do |i|
  start_pos = i * segment_size
  end_pos = [start_pos + segment_size - 1, file.size - 1].min
  file.seek(start_pos)
  file.read(end_pos - start_pos + 1)
end
file.close

# Start a new thread for each segment
threads.times.map do |i|
  Thread.new(segments[i]) do |segment|
    # Count the words of each segment concurrently
    result = segment.scan(/\w+/).size
  end
end.each(&:join)

elapsed = Time.now - start

# Print the elapsed time
puts "Elapsed time: #{elapsed} seconds"
#
# Elapsed time: 38.827828157 seconds

Now the elapsed time is much shorter.

Instead of reading the file from beginning to end in a single thread, we introduced 4 threads, split the file into 4 chunks (defining the start and end positions of each), and then scanned them concurrently.
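
Note that in the snippet above each thread computes its own count, but the results are never combined. If you need the total, one possible variation (a sketch, not part of the measured program) is to collect each thread's return value with Thread#value; keep in mind that a word straddling a segment boundary may be split in two, so the total can differ slightly from the single-threaded count:

# Collect the per-segment counts and sum them up
counts = threads.times.map do |i|
  Thread.new(segments[i]) do |segment|
    segment.scan(/\w+/).size
  end
end.map(&:value) # Thread#value joins the thread and returns the block's result

total = counts.sum
puts "Total words: #{total}"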

Before making any conclusions, let’s jump to the CPU-bound example.

CPU-bound solution

To show the direct impact of the CPU resources available to the execution environment, I will use Docker's CPU limits when running the containers.

From the docker documentation:

--cpuset-cpus: Limit the specific CPUs or cores a container can use.

A comma-separated list or hyphen-separated range of CPUs a container can use, if you have more than one CPU. The first CPU is numbered 0. A valid value might be 0-3 (to use the first, second, third, and fourth CPU) or 1,3 (to use the second and fourth CPU).

Example program:

require 'parallel'

# Start the timer
start = Time.now

# Perform a CPU-bound calculation
def expensive_operation
  iterations = 100_000_000
  sum = 0

  (1..iterations).each do |i|
    sum += Math.sqrt(i)
  end

  sum
end

result = Parallel.map(1..3, in_processes: 3) do |i|
  sum = expensive_operation

  puts "Worker: #{Parallel.worker_number} completed!"
end

elapsed = Time.now - start
puts "Elapsed time: #{elapsed} seconds"

The first thing I did here was introduce the Parallel gem which, by running the work in separate processes, lets us bypass the MRI limitation (the Global VM Lock) and get real parallelism.

You may have noticed that I asked for 3 processes in the code, but, in reality, the number of processes running simultaneously is always limited by the number of available CPUs.

In the code, I introduced an expensive operation, which should produce longer CPU bursts and shorter waiting times, and executed it three times in a Parallel block.
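
If you prefer not to hard-code the number of workers, one option (a sketch reusing the expensive_operation defined above, not part of the original program) is to size the pool from the number of CPUs the operating system reports:

require 'etc'
require 'parallel'

# Use one process per CPU reported by the OS
workers = Etc.nprocessors

Parallel.map(1..workers, in_processes: workers) do |i|
  expensive_operation
end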

Let's start the container for our program with the following limitation:

docker run --cpuset-cpus="1" ...

Running this code in a Docker environment limited to one available CPU gives us the following output:

# Worker: 0 completed!
# Worker: 2 completed!
# Worker: 1 completed!
# Elapsed time: 18.445319911 seconds

Let's execute the very same program, but extend the number of available CPUs to three:

docker run --cpuset-cpus="0-2" ...

Running this code in a Docker environment with three available CPUs gives us the following output:

# Worker: 0 completed!
# Worker: 2 completed!
# Worker: 1 completed!
# Elapsed time: 6.349463044 seconds

Increasing the number of available CPUs improved the execution time roughly threefold, without any changes to the code!

Summing up

Improving the performance of CPU-bound programs is more expensive, but easier to do: often all you need is to add more CPUs. Improving the performance of I/O-bound applications can be free in terms of hardware, but it tends to increase the complexity of the code.
