Side Effects in Programming

You may have heard the term “side effect” before without being entirely sure what it means or why it matters that we handle side effects correctly. Personally, I rarely hear side effects discussed outside of the Haskell community, which considers them important enough to track in the type system so that it is always clear when code is doing something out of the ordinary. So there has to be something to this, right?

Before we go any further, we need a working definition of a side effect in programming. A side effect is any operation that reads from or changes some “thing” outside of its local environment. Generally speaking, the local environment is the scope of a function. Examples include accessing a global variable, reading from or writing to the file system, making an HTTP call, using a random number generator, reading user input, and writing to the screen. These all have their uses. After all, a program that cannot perform any effects is probably not very useful: even a basic calculator needs some way of reading the user's input and displaying the output.

Some quick terminology:

Pure Function - A pure function is any function that does NOT perform any kind of side effect.
Impure Function - An impure function is any function that DOES perform a side effect.
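
As a minimal sketch of the distinction (the module and function names are hypothetical, chosen only for illustration), the first function below depends solely on its arguments, while the second reaches outside its own scope for a random number:

```elixir
defmodule Purity do
  # Pure: the result depends only on the arguments.
  def add(a, b), do: a + b

  # Impure: :rand.uniform/1 reads and updates random number generator
  # state that lives outside the function, so repeated calls with the
  # same argument can return different values.
  def add_random(a), do: a + :rand.uniform(10)
end
```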

But why does any of this matter? Functions are functions, right? Well, an impure function is able to call a pure function. However, as soon as an otherwise pure function calls an impure function it is now considered impure itself. But if you can still call functions from other functions, why should anyone care?

It matters because pure functions are extremely easy to test and reason about. A pure function will “do” nothing except transform its input(s) and return some value. In fact, given the same inputs it will ALWAYS return the same value. This is what makes them so easy to test. This will allow us to have a test such as

assert 1 + 2 == 3

Imagine if doing simple addition sometimes returned a value you were not expecting because it decided to multiply instead on every third Tuesday of the month between the hours of 6 and 10. This would make it incredibly hard to reason about, let alone test. The fact that +, a pure function, will do nothing other than return a value based on its inputs makes it much easier to understand. The same applies for all pure functions.
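
To make that thought experiment concrete, here is a hedged sketch. `Arithmetic`, `flaky_add/2`, and `add_or_multiply/3` are made-up names, and the rule is simplified to "multiply on Tuesdays". The first function reads the clock itself, so its result depends on when it runs. The second receives the date as an argument, so the same inputs always give the same result.

```elixir
defmodule Arithmetic do
  # Impure: reads the system clock, so the result depends on the day
  # the function happens to be called.
  def flaky_add(a, b), do: add_or_multiply(a, b, Date.utc_today())

  # Pure: the date is an ordinary argument.
  # Date.day_of_week/1 returns 2 for Tuesday.
  def add_or_multiply(a, b, %Date{} = date) do
    if Date.day_of_week(date) == 2, do: a * b, else: a + b
  end
end
```

With the date passed in, a call such as `Arithmetic.add_or_multiply(3, 4, ~D[2022-01-25])` (a Tuesday) can be asserted directly, without waiting for a particular day to come around.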

Let's take a look at a couple of different kinds of impure functions and the best way to handle them.

Getting The Current Date/Time

For dates and times, I like to pass them into my context modules and their functions. The reason for this is mostly testability. Let's say we had a system that needed to run some kind of daily report. If we had the following function,

defmodule MyApp.Insights do
  def daily_report() do
    now = DateTime.utc_now()
     
    ...
  end
end

this becomes fairly complex to test. In our tests, we would need to generate data for today, and also add to and subtract from the current date to generate data outside of the expected window, so that we can verify the implementation only consumes the data we expect.

The test may look something like

test "daily_report/0 only uses today" do
  now = DateTime.utc_now()
  start_of_day = Timex.beginning_of_day(now)
  end_of_day = Timex.end_of_day(now)
  
  tomorrow = Timex.shift(now, days: 1)
  yesterday = Timex.shift(now, days: -1)
  
  foo_fixture(%{started_at: now})
  foo_fixture(%{started_at: start_of_day})
  foo_fixture(%{started_at: end_of_day})

  foo_fixture(%{started_at: tomorrow})
  foo_fixture(%{started_at: yesterday})
  
  assert something = daily_report()
end

Contrast the above with passing the value into our function

defmodule MyApp.Insights do
  def daily_report(date) do
    ...
  end
end

This approach gives us two major benefits. The first is that we are now able to call this function with any arbitrary date we want. In this example it is for running reports, but it could be anything. The second benefit is that the function becomes a lot simpler to test because we are able to test with different inputs.

The equivalent test to the above would be

test "daily_report/1 returns different results per day" do
  foo_fixture(%{
    started_at: ~U[2022-01-26 00:00:00Z]
  })
  foo_fixture(%{
    started_at: ~U[2022-01-26 15:34:00Z]
  })
  foo_fixture(%{
    started_at: ~U[2022-01-26 23:59:59Z]
  })

  foo_fixture(%{
    started_at: ~U[2022-01-27 15:56:00Z]
  })
  foo_fixture(%{
    started_at: ~U[2022-01-25 15:56:00Z]
  })
  
  assert something = daily_report(~U[2022-01-26 00:00:00Z])
  
  assert something_else = daily_report(~U[2022-01-27 00:00:00Z])
end
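
If callers still want a convenience function that does not take a date, one option is to keep the clock read in a thin wrapper and leave the real work in the function that takes the date. The sketch below is an assumption about how that might look, not the implementation behind the examples above: `ReportSketch`, `same_day?/2`, and the in-memory list of maps stand in for whatever the real report queries.

```elixir
defmodule ReportSketch do
  # Impure wrapper: the only place that reads the clock.
  def daily_report(records), do: daily_report(records, DateTime.utc_now())

  # Pure core: keeps only the records that started on the given day.
  def daily_report(records, %DateTime{} = date) do
    Enum.filter(records, &same_day?(&1.started_at, date))
  end

  # Two datetimes fall on the same day when their dates are equal.
  defp same_day?(%DateTime{} = a, %DateTime{} = b) do
    DateTime.to_date(a) == DateTime.to_date(b)
  end
end
```

Only the one-line wrapper is impure, so nearly all of the testing can target the two-argument version with fixed dates.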

Calling An External Service

An external service in this case could be a few different things. A few examples would be

A database
An HTTP endpoint
The file system

For cases like these it rarely makes sense to pass the result of the call into the context function as this can very frequently be business logic that does not belong outside of the context. My general approach is to contain these types of operations to the context module that I am calling.

Let's take a quick look at an example. Let's say that we were writing a system to keep track of inventory and orders for a warehouse. For a system like this, we may have an Orders context module that allows us to create and track the different orders in the system. One function that we would need is one that creates an order. This would require the user to pay whatever the total is. It may look like this

defmodule MyApp.Orders do
  alias MyApp.Repo

  alias MyApp.Orders.Order
  
  # Read at compile time so tests can configure a different client.
  @payment_client Application.compile_env(:my_app, :payment_client)

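  def create_order(user, items) do
    changeset =
      Order.create_changeset(%Order{}, %{
        user_id: user.id,
        items: items
      })

    # Repo.insert/1 returns {:ok, order} or {:error, changeset}, so
    # unwrap the order before charging the user. On failure, the
    # {:error, changeset} tuple is returned to the caller unchanged.
    with {:ok, order} <- Repo.insert(changeset) do
      user
      |> @payment_client.process_payment(order)
      |> update_order_status(order)
    end
  end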
  
  defp update_order_status(payment_response, order) do
    order 
    |> Order.complete_order_changeset(%{
      status: payment_response.status
    })
    |> Repo.update()
  end
end

defmodule MyApp.Orders.Order do
  use Ecto.Schema
  import Ecto.Changeset
  
  schema "orders" do
    ...
  end
  
  def create_changeset(order, attrs) do
    ... 
  end
  
  def complete_order_changeset(order, attrs) do
    ... 
  end
end

In the above case, we have three different side effects split across two different functions. We first store the order in our database based on the supplied input. We then take the newly created order and send the information off to our payment processor. Lastly, we update our order's status in the database based on the response from the payment processor. Note how all of these side effects are constrained to the MyApp.Orders module, while the MyApp.Orders.Order module contains only pure functions. This should be fairly simple to test, as we only have a single public function that performs any side effects. The added complexity is contained within that one function, and the rest should be simple to test because they are pure.

If instead we allowed our MyApp.Orders.Order module to also perform side effects, we would have added complexity in both MyApp.Orders and MyApp.Orders.Order, because any function that calls an impure function is impure itself. In our tests, this would mean taking extra precautions to ensure we are doing the correct thing in both modules. The code also becomes harder to read because the effects may happen in several different places. The entire system becomes more complex for no gain.
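
One detail of MyApp.Orders worth spelling out is that the payment client is read from configuration, which is what lets a test replace the real processor. As a hedged sketch, assume the client is described by a behaviour with a single `process_payment/2` callback that returns a map with a `:status` key. The behaviour, the stub module, and the `:paid` value are all hypothetical; the original only shows the call site.

```elixir
defmodule PaymentClientSketch do
  # Hypothetical contract for anything configured as :payment_client.
  @callback process_payment(user :: map(), order :: map()) :: %{status: atom()}
end

defmodule StubPaymentClient do
  @behaviour PaymentClientSketch

  # Test double: never touches the network and always reports success.
  @impl true
  def process_payment(_user, _order), do: %{status: :paid}
end
```

In a test environment, the application config could then point at the stub, for example with `config :my_app, payment_client: StubPaymentClient` in `config/test.exs`, so that the tests for `create_order/2` only exercise the database.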

Given the above examples, I hope I have gotten you to think about your own projects and how you might handle side effects in the future. Keep in mind that these are meant to be guidelines and not hard rules. However, if we take the time to stop and think about the kinds of side effects our programs need to produce, we should be able to create cleaner, simpler, and more easily testable code.