Fun with flat_map (in Ruby)

How I accidentally learned about...

Let's look at computing squares and square-roots using map.

irb
> [2].map { |i| i*i }
=> [4]

> [4].map { |i| Math.sqrt(i) }
=> [2.0]

> [-1].map { |i| Math.sqrt(i) }
Math::DomainError (Numerical argument is out of domain - "sqrt")

How do we represent that there's no answer? Let's use an array so it may or may not have an element in it.

> [-1].map { |i| i >= 0 ? [Math.sqrt(i)] : [] }
=> [[]]

> [9].map { |i| i >= 0 ? [Math.sqrt(i)] : [] }
=> [[3.0]]

So our input is an array of one element and the output is an array of an array of zero or one element. Let's get rid of the extra nested array.

> [9].map { |i| i >= 0 ? [Math.sqrt(i)] : [] }.flatten
=> [3.0]

> [-1].map { |i| i >= 0 ? [Math.sqrt(i)] : [] }.flatten
=> []

What does flatten actually do?

> [1, [2, 3], [4, [5, 6], [7, 8], 9], [10, 11], 12].flatten
=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

Flatten will recursively undo the array nesting. We're only really dealing with one extra level of nesting so we can say flatten(1).

> [1, [2, 3], [4, [5, 6], [7, 8], 9], [10, 11], 12].flatten(1)
=> [1, 2, 3, 4, [5, 6], [7, 8], 9, 10, 11, 12]

> [9].map { |i| i >= 0 ? [Math.sqrt(i)] : [] }.flatten(1)
=> [3.0]

This idea of performing a map followed by a flatten(1) is a common pattern so it actually has its own combined operation: flat_map.

> [9].flat_map { |i| i >= 0 ? [Math.sqrt(i)] : [] }
=> [3.0]

Just for kicks, what if we were to give flat_map multiple values to perform computations upon?

> [4, -1, 9].flat_map { |i| i >= 0 ? [Math.sqrt(i)] : [] }
=> [2.0, 3.0]

It drops any empty array results from the output. Using map would collect into the output each array of each computation step, empty or otherwise. You can think of flat_map as concatenating each step's result to the final output array 'flatly' rather than creating nested array elements. Concatenating an empty array has no effect so does not appear in the final output.

Now let's look at the Optional type and see how it compares.

First set up a Ruby environment with bundler and the optional gem.

$ chruby 2.5.5
$ gem install bundler

Create a Gemfile with the following contents:

source 'gems' do
  gem 'optionals', '~> 1.0'
end

Install the bundler named gem.

$ bundle install

Now let's use it.

irb
> require 'optionals'
=> true
> Optionals::Optional
=> Optionals::Optional

> Optional.of(4)
=> #<Optionals::Some:0x00007fd7f214a2e0 @value=4>
> Optional.of(nil)
=> #<Optionals::None:0x00007fd7f48874c0>

> Optional.of(4).and_then { |i| Math.sqrt(i) }
=> 2.0
> Optional.of(-1).and_then { |i| Math.sqrt(i) }
Math::DomainError (Numerical argument is out of domain - "sqrt")

We have the similar situation as before with [-1].mapwith a block that returns numbers.

> Optional.of(4).and_then { |i| i >= 0 ? Optional.of(Math.sqrt(i)) : Optional.none }
=> #<Optionals::Some:0x00007fd7f2199d18 @value=2.0>
Optional.of(-1).and_then { |i| i >= 0 ? Optional.of(Math.sqrt(i)) : Optional.none }
=> #<SOptionals::None:0x00007fd7f48874c0>

Sure, we get all that, but what's the point of all this?

> maybe_root = { |i| i >= 0 ? Optional.of(Math.sqrt(i)) : Optional.none }
=> #<Proc:0x00007fd7f21c9658@(irb):17 (lambda)>
> maybe_root.call(4)
=> #<Optionals::Some:0x00007fd7f21daed0 @value=2.0>
> maybe_root.call(-1)
=> #<Optionals::None:0x00007fd7f48874c0>

> def maybe_root(i)
>   # for a known non-nil value, you should use `some` rather than `of`.
>   i >= 0 ? Optional.some(Math.sqrt(i)) : Optional.none
> end
=> :maybe_root

> Optional.of(16).and_then { |i| maybe_root(i) }
=> #<Optionals::Some:0x00007fbea8897e20 @value=4.0>
> Optional.of(16).and_then { |i| maybe_root(i) }.and_then { |i| maybe_root(i) }
=> #<Optionals::Some:0x00007fbea6991c60 @value=2.0>
> Optional.of(16).and_then { |i| maybe_root(i) }.and_then { |i| maybe_root(i) }.and_then { |i| maybe_root(i) }
=> #<Optionals::Some:0x00007fbea78405e0 @value=1.4142135623730951>
> Optional.none.and_then { |i| puts "I'm here!"; maybe_root(i) }
=> #<Optionals::None:0x00007fbea692ae98>

Notice now the last Optional.none.and_then never entered the block and merely returned None.

The purpose of making a function have the form that takes a plain argument and return a 'wrapped' value is that we can chain many operations without intervening checks for presence of values. If at any time during the chain of computation no value is produced, any further operations along the chain will continue to produce no value.

A better example than maybe_root as we defined it here would be a maybe_whole_root which only returns roots if the input is a square value. Then if the computation on the value 2 does not return a whole number, further chained computations will continue to return None.

There are other things than Optional which can be used with this behaviour. Using an array with a chain of flat_mapoperations has a similar effect.

> [16].flat_map { |i| [maybe_root(i).value_or(nil)].compact }.flat_map { |i| [maybe_root(i).value_or(nil)].compact }
=> [2.0]
> [1, -1, 16].flat_map { |i| [maybe_root(i).value_or(nil)].compact }.flat_map { |i| [maybe_root(i).value_or(nil)].compact }
=> [1.0, 2.0]

The Results::Result type is very much like an Optional with the added benefit that when there's no value, the reason for the missing value can be stored and later retrieved.

require 'results'

...to be continued?
> # Basically, Result also has an .and_then method which will preserve
> # the first error regardless of additional (not called) chained operations.

In addition to 'boxed' values there are other 'wrapped' forms which can be handled by functions that take a plain value and return a 'wrapped-in-some-context' value. Asynchronous computation is such another example.

In general, if we have a domain of input, say numbers denoted by A and a function that can turn any instance of A into M[A]where M is some context such as Optional, Result, Async, etc, then we can chain computations together using these function without examining or adapting their shapes in-between.

BTW, things like M are called Monads in Category Theory. Not only can you chain computations within the same context, you can compose contexts, e.g. M1[M2[M3[A]]] with each context handling a different computational aspect or concern. Some examples would be configuration, logging, generating metrics, sending notifications, and the list is quite endless when decomposing problems this way. In a highly-typed language, the type itself identifies the computations that exactly will have had to occurred (or at least attempted) to produce a value of the type.

PreviousWhat I'm Making and What I Use NextEnumerating Union Types in Elm

Last updated 5 years ago

Was this helpful?