Select and Map Are Good
This article argues that, where possible, you should break iteration over an array down into #map and #select rather than operating on the enumerable through a single #each.
The Examples
Throughout this article I will refer to the following, contrived, example:
You have an array of numbers [1, 2, 3, 4, 5] and you want to subtract 3 from each of the items and then remove all items that are 0.
A) Using #each you could express this as:
the_array = [1, 2, 3, 4, 5]
new_array = []
the_array.each do |item|
  new_item = item - 3
  if new_item != 0
    new_array << new_item
  end
end
B) Using #select and #map you could express this as:
the_array = [1, 2, 3, 4, 5]
new_array = the_array.map { |item| item - 3 }
new_array.select! { |item| item != 0 }
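To make the two steps of example B concrete, here is a minimal sketch that returns the array after each step. The helper name `map_then_select_steps` is made up for illustration, and it uses the non-mutating #select instead of #select! so that both intermediate arrays can be inspected.

```ruby
# Hypothetical helper: returns the array after each of the two steps.
def map_then_select_steps(the_array)
  after_map = the_array.map { |item| item - 3 }         # step 1: change every item
  after_select = after_map.select { |item| item != 0 }  # step 2: drop the zeros
  [after_map, after_select]
end

after_map, after_select = map_then_select_steps([1, 2, 3, 4, 5])
p after_map    # prints [-2, -1, 0, 1, 2]
p after_select # prints [-2, -1, 1, 2]
```

The final array is the same one that example A builds up inside its #each block.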
The Arguments
Better Separation of Logic
Example A is doing two things in one block whereas example B is doing just one thing in each of the two blocks. In general, the less there is in a block the easier that block is to understand. Breaking a problem down into map and select means that you have broken the problem up into two distinct parts.
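One consequence of having two distinct parts is that each part can be pulled out and given a name of its own. The following is only a sketch of that idea; the constants `SUBTRACT_THREE` and `NON_ZERO` are hypothetical names, not something the examples in this article define.

```ruby
# Each lambda does exactly one thing (names are hypothetical).
SUBTRACT_THREE = ->(item) { item - 3 }
NON_ZERO       = ->(item) { item != 0 }

the_array = [1, 2, 3, 4, 5]

# The & operator passes each lambda as the block for #map and #select.
new_array = the_array.map(&SUBTRACT_THREE).select(&NON_ZERO)
p new_array # prints [-2, -1, 1, 2]
```

Doing the same with example A is harder, because the change and the removal are interleaved inside one block.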
Clarity
Which brings us to what those two parts do: each part is named. If I am reading example B and I am trying to find the part where items are removed, then I look in the select block. If I am looking for the part where the items are changed, then I look in the map block. The method names tell you how each block is used.
When reading example A, I have to read all of the #each block whether I am looking for where the values are changed or where the items are removed.
Potential Counter Arguments
I think some people may argue that speed is a big issue. The idea is that you are iterating over the enumerable twice so it is using more time.
Okay, let us assume that the time to process item - 3 and assign it to a variable/add it to the array is a, that the time to check new_item/item != 0 and add it to the array is b, and that the time to set up each iteration of the array is c. We will also assume i is the number of iterations needed to traverse the array.
So example A will take i(a + b + c) time and example B will take i(a + c) + i(b + c). The difference between these two, B - A, is i(a + b) + 2ic - (i(a + b) + ic) = ic. In other words, the difference is the time it takes to set up the extra iterations.
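One way to see where the extra ic comes from is to count how many times a block is entered in each approach. The sketch below adds a counter to both examples; the method names are made up for illustration, and the counter stands in for the per-iteration setup cost c rather than measuring it.

```ruby
# Example A with a counter: one pass over the array.
def count_iterations_with_each(the_array)
  iterations = 0
  new_array = []
  the_array.each do |item|
    iterations += 1
    new_item = item - 3
    new_array << new_item if new_item != 0
  end
  iterations
end

# Example B with a counter: one pass for #map, one pass for #select!.
def count_iterations_with_map_and_select(the_array)
  iterations = 0
  new_array = the_array.map do |item|
    iterations += 1
    item - 3
  end
  new_array.select! do |item|
    iterations += 1
    item != 0
  end
  iterations
end

the_array = [1, 2, 3, 4, 5]
p count_iterations_with_each(the_array)           # prints 5
p count_iterations_with_map_and_select(the_array) # prints 10
```

For an array of i items, example A enters a block i times and example B enters one 2i times, which matches the extra ic term.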
Let us check the actual difference in time with a much bigger array:
require 'benchmark'

def with_each(the_array)
  new_array = []
  the_array.each do |item|
    new_item = item - 3
    if new_item != 0
      new_array << new_item
    end
  end
  new_array
end

def with_map_and_select(the_array)
  new_array = the_array.map { |item| item - 3 }
  new_array.select! { |item| item != 0 }
  # select! returns nil when nothing is removed, so return the array explicitly
  new_array
end

the_array = (-10000000..10000000).to_a

Benchmark.bmbm do |x|
  x.report("With #each") { with_each(the_array) }
  x.report("With #map and #select") { with_map_and_select(the_array) }
end
The output on my computer is:
                           user     system      total        real
With #each             1.830000   0.040000   1.870000 (  1.924606)
With #map and #select  2.440000   0.030000   2.470000 (  2.468480)
So example B takes roughly 28% longer in real time (2.47s versus 1.92s), and this is a really simple example. The difference as a percentage will fall as the complexity of the operations increases.
So the time difference does exist, but I don't think it is going to be a huge factor in most cases.
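The claim that the relative overhead shrinks as the per-item work grows can be tried out with a variation of the benchmark above. This is only a sketch: `slow_transform` is a hypothetical stand-in for a more expensive operation, the method names are invented, and the timings will vary from machine to machine, so no output is shown.

```ruby
require 'benchmark'

# Hypothetical stand-in for a more expensive per-item operation.
def slow_transform(item)
  100.times.reduce(item) { |acc, n| (acc * 31 + n) % 1_000_003 }
end

def heavy_with_each(the_array)
  new_array = []
  the_array.each do |item|
    new_item = slow_transform(item)
    new_array << new_item if new_item != 0
  end
  new_array
end

def heavy_with_map_and_select(the_array)
  new_array = the_array.map { |item| slow_transform(item) }
  new_array.select! { |item| item != 0 }
  new_array # select! returns nil when nothing is removed
end

the_array = (1..20_000).to_a

Benchmark.bmbm do |x|
  x.report("With #each") { heavy_with_each(the_array) }
  x.report("With #map and #select") { heavy_with_map_and_select(the_array) }
end
```

In terms of the earlier model, a grows while c stays the same, so ic becomes a smaller share of the total time.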