Introduction
Functional programming has been a topic of interest in the development sphere for many years, and many languages, including Python, have added ways of applying some of its principles. But what is functional programming? It is a programming paradigm based on the composition and application of functions. To fully use this paradigm, certain principles and elements need to be supported by the language. Some of the main functional principles are pure functions, immutability, higher-order functions, partial application, lazy evaluation, pattern matching, and type checking. All of these are supported in Python, but this article will focus on the first three. Using this paradigm may seem a bit confusing at first, but it comes with advantages that make code cleaner, more robust, and easier to maintain.
Pure functions
A function can be considered an operation that, given some input, returns an output. Pure functions are ones where the same input always returns the same output. They are predictable and, consequently, easy to work with.
In Python, creating pure functions is, of course, possible. Consider the following example:
def powerOfTwo(x):
    return x**2
# powerOfTwo(4) always returns 16
This function is pure: no matter how many times it is executed with the same input for x, the result will be the same. If the function is changed to:
y = 3
def powerOfTwo():
    return y**2
# powerOfTwo() will return 9 when y is 3
# powerOfTwo() will return 4 when y is 2
The function has become impure, as it uses a value external to itself, so executing it multiple times can produce different outputs as y changes.
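Purity can be restored by passing the external value in as an explicit argument. A minimal sketch (the function name is illustrative):

```python
def powerOfY(y):
    # the external value is now an explicit argument,
    # so the same input always yields the same output
    return y**2

# powerOfY(3) returns 9
# powerOfY(2) returns 4
```

The dependency on outside state is now visible in the function's signature, which is what makes the function easy to test and reason about.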
Immutability
Immutability is a property of a value: once defined, it can't be changed. This may reduce some flexibility in how a program is structured, but it is a way to avoid side effects and unexpected changes to the values used.
In Python, achieving immutability may seem quite easy, but it requires some attention to the way declarations are used. Python has two main groups of data types: mutable and immutable.
Immutable data types are those whose values, once created, cannot be changed. That does not mean a variable cannot be reassigned; it means the underlying value cannot be modified in place, so reassignment always produces a new value (a new object in memory). Consider the following:
text = "Hello"
text2 = text
# text and text2 have the same value,
# but changing one of them does not affect the other
Variables text and text2 will refer to the same value, but since strings cannot be modified in place, reassigning text2 to something like:
text2 = text2 + " World"
# text will continue to be 'Hello'
# text2 is now 'Hello World'
text won't be affected in any way. That may seem obvious, but it is not always the case.
Immutable data types include int, float, decimal.Decimal, bool, str, tuple, and range.
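The "cannot be changed" part can be demonstrated directly: attempting to modify an immutable value in place raises a TypeError. A small sketch:

```python
point = (1, 2)  # tuples are immutable

try:
    point[0] = 5  # in-place modification is not allowed
except TypeError as err:
    message = str(err)

# the tuple is left untouched and a TypeError is raised
```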
Mutable data types are the ones that can be changed after being declared. Lists are one of these data types; mutation allows adding, removing, or replacing values in the list. At the same time, assigning an existing list to a new variable does not create a new list: both names end up referring to the same list object (memory is shared), so changes made through one name will affect the other. Take the following:
list1 = [1,2,3]
list2 = list1
list2[1] = 5
# list1 was originally [1,2,3]
# with the changes in list2, list1 is [1,5,3]
# list2 is also [1,5,3]
Both list1 and list2 will end up being the same. This behavior can be an obstacle to achieving fully immutable code, so lists have to be used with care. Ways to avoid this are to make deep copies during assignment or to avoid declaring a list based on an existing one. Other mutable data types are dictionaries, sets, and user-defined classes.
If complete immutability is desired, mutable data types should be avoided, or, in the case of data classes, the frozen argument can be helpful. But that comes with a lot of limitations, so an alternative is to use techniques that avoid any mutations on mutable data types.
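The frozen argument mentioned above can be sketched like this (the class name is illustrative):

```python
from dataclasses import dataclass, FrozenInstanceError

@dataclass(frozen=True)
class Point:
    x: int
    y: int

p = Point(1, 2)
try:
    p.x = 5  # frozen=True blocks attribute assignment
except FrozenInstanceError:
    blocked = True
```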
Higher-order functions
Higher-order functions are functions that take other functions as arguments or return a function as a result. With their help, functions can be composed together and more complex behavior can be achieved. When functions can be passed around like any other value, the language is said to treat functions as first-class citizens, and Python is one of those languages. In Python, passing a function as a parameter to another is trivial and works like any other parameter. Take for example:
def sumFive(x):
    return x + 5
# sumFive(6) will return 11
def doTwice(func, *val):
    return func(func(*val))
doTwice(sumFive, 5)
# doTwice(sumFive, 5) will return 15
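The other half of the definition, returning a function as a result, can be sketched with a closure (the names here are illustrative):

```python
def makeAdder(n):
    # returns a new function that remembers n (a closure)
    def adder(x):
        return x + n
    return adder

addFive = makeAdder(5)
# addFive(6) will return 11
```

makeAdder builds specialized functions on demand, which is a common building block for function composition.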
Common Functions
Pure functions, immutability, and higher-order functions are all achievable in Python, so functional programming principles can be applied in the language. But if that were all that was available, using functional programming in Python would be complicated and would require more coding than expected. Fortunately, there are many useful common functions that follow functional principles and make it easy to work with data structures and program flow: map and filter are built in, while others, such as reduce, live in the functools module. The functions covered here are map, filter, and reduce, the main ones that allow for a functional approach while coding.
map
Map is a function that takes another function and applies it to each element of an iterable object. In addition, map supports multiple iterables as arguments. The function can be described as:
map(function, iterable, ...)
In the case of multiple iterables, the function passed to map should accept as many arguments as there are iterables, and map will stop iterating when the shortest iterable is consumed. Something to note is that map returns a lazy iterator, so the result usually needs to be converted into the desired collection (for example, with list()). As can be seen, this function allows for the creation of a new iterable from an existing one, and it has some clear advantages over a traditional for (or while) loop. Consider the following example:
list1 = [1,2,3,4,5,6,7]
for index, value in enumerate(list1):
    list1[index] = value*2
# The list is iterated over and mutated element by element
# The resulting list will be [2,4,6,8,10,12,14]
This code block takes a list and doubles each of its values. By itself, the code looks alright, but there are some things to consider. An existing declaration is clearly mutated, which is not acceptable under a functional programming approach, and this kind of in-place manipulation comes with risks if the code changes drastically in the future. Using map, the code would end up like this:
list1 = [1,2,3,4,5,6,7]
def byTwo(x):
    return x*2
list2 = list(map(byTwo,list1))
# list(map(...)) creates a new list
# list1 is iterated over and byTwo is applied to each element
# list1 is not mutated
# list2 will be equal to [2,4,6,8,10,12,14]
In this case, manipulations are not explicitly present. Instead, a new list is created with map. In addition, the code that creates the new list is shorter and more concise. It can even be said that it is easier to read.
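The multiple-iterables form mentioned earlier can be sketched with an anonymous (lambda) function; the variable names here are illustrative:

```python
list1 = [1, 2, 3, 4]
list2 = [10, 20, 30]

# with two iterables, the passed function takes two arguments;
# iteration stops when the shorter iterable is consumed
sums = list(map(lambda a, b: a + b, list1, list2))
# sums will be [11, 22, 33]
```

Note that the 4 in list1 is never processed because list2 runs out first.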
filter
Filter is quite similar to map: a function and an iterable are taken (this time, multiple iterables are not supported). But instead of applying the function to each element to change its value, the function determines whether the value will be included in the newly generated iterable. If the function returns a truthy value, the evaluated element stays; if it returns something falsy, the element is discarded. Filter is defined like the following:
filter(function, iterable)
In the same way as map, after filter is executed, the result still needs to be converted into the desired iterable. The advantages of using filter over a traditional for/while structure are the same as before. Consider the following:
list1 = [1,2,3,4,5,6,7]
list2 = []
for val in list1:
    if val % 2 == 0:
        list2.append(val)
# filtering is done with two lists
# list1 does not change
# list2 mutates on each loop to add the filtered items
# list2 will be equal to [2,4,6]
This time, all elements that are even are added to a new list. The approach works fine, and the first declaration is not manipulated, but mutation still exists in the second declaration (the empty list). Using filter avoids these issues and ends up with cleaner-looking code:
def isEven(x):
    return x % 2 == 0
list1 = [1,2,3,4,5,6,7]
list2 = list(filter(isEven, list1))
# using the filter function, no mutations appear in the code
# list2 is created by applying a predicate function to each element of list1
# list2 will be equal to [2,4,6]
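A small extra convenience worth knowing: passing None as the function makes filter keep only the truthy elements. A sketch:

```python
values = [0, 1, "", "hi", None, [], [3]]

# with None as the function, each element is tested for truthiness
truthy = list(filter(None, values))
# truthy will be [1, 'hi', [3]]
```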
reduce
Reduce is a quite special function. First of all, it is not readily available like map or filter; it needs to be imported from the functools module. Second, it is quite versatile: it allows transforming an iterable into any other desired structure (an int, an object, a dictionary, etc.). Reduce is defined as:
reduce(function, iterable[, initializer])
The way it is used is fairly simple: a function and an iterable are needed and, if useful, an initial value too. The function takes two arguments: the first is the accumulated value from the previous calls, and the second is the element at the current position of the iterable. As this can be a bit confusing, let's take the following example:
from functools import reduce
def alt_sum(prev, current):
    return prev + current
result = reduce(alt_sum, [1,2,3,4,5])
# all elements are reduced into one
# result should be equal to 15
result in this case will have a value of 15. Decomposing the whole operation, the following would be observed: alt_sum(alt_sum(alt_sum(alt_sum(1,2),3),4),5). The provided function is applied cumulatively over the whole iterable. When no initial value is provided, the first element of the iterable is used as the starting value.
With an initial value:
result = reduce(alt_sum, [1,2,3], 10)
# reduce is also able to start from an initial value
# in this case result should equal 16
The result would be equivalent to: alt_sum(alt_sum(alt_sum(10,1),2),3). As can be seen, the initial value is used in the first function call instead of the first element of the iterable.
In the same way as the previous functions explored, the advantage of using reduce is that it allows for shorter code and helps maintain some form of immutability in the data used.
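The claim that reduce can produce any structure, not just a number, can be sketched by folding a list of pairs into a dictionary (names here are illustrative):

```python
from functools import reduce

pairs = [("a", 1), ("b", 2), ("c", 3)]

def addPair(acc, pair):
    # build a new dict at each step instead of mutating the accumulator
    key, value = pair
    return {**acc, key: value}

result = reduce(addPair, pairs, {})
# result will be {'a': 1, 'b': 2, 'c': 3}
```

The initializer ({}) doubles as the empty starting structure, and each step returns a fresh dict, keeping the fold free of mutation.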
In addition, for some very simple operations that could be done with reduce, there are already some functions that simplify the process. For example, to sum all the values in a list, sum can be used:
sum([1,2,3])
Other functions that can be used are prod, max, min, len, all, any, etc. It is better to use these instead of reduce when they fit the need. Note that most of these are built into Python, while prod needs to be imported from the math module.
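A few of these shortcuts side by side (math.prod requires Python 3.8+):

```python
from math import prod

numbers = [1, 2, 3, 4]

total = sum(numbers)      # built-in: 10
product = prod(numbers)   # math.prod: 24
largest = max(numbers)    # built-in: 4
anyEven = any(n % 2 == 0 for n in numbers)  # built-in: True
```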
Comprehensions
As was seen previously, map and filter can be used as ways of creating new iterables from existing ones. They are pretty good ways to have a more functional approach in our code, but they are not the only way of achieving the same in Python. Comprehension is a way of building iterables in Python from existing ones with a simple and concise structure. There are four types of comprehensions in Python:
- List comprehension
- Dictionary comprehension
- Set comprehension
- Generator comprehension
Recall the earlier map example:
list1 = [1,2,3,4,5,6,7]
def byTwo(x):
    return x*2
list2 = list(map(byTwo,list1))
With list comprehension, this can be changed to:
list2 = [x*2 for x in list1]
As can be seen, it is much more concise (map can be as concise as this with the use of lambdas, also called anonymous functions) and, more importantly, transforming the result into a list is not needed, as that is implied in the structure. Breaking this down, three parts exist:
- [ ] represents the resulting iterable: [] is for lists, {} is for dictionaries and sets, and () is for generators.
- x*2 is the operation: it can be any kind of expression, even a very complex one (the simpler the better), but no statements are allowed. In the case of dictionaries, a key: value pair needs to be specified.
- for x in list1 is the iteration done. It can come in many forms, and iterations over any iterable are possible, even if the comprehension produces something different from it.
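The bracket variants described above can be sketched on the same input (variable names here are illustrative):

```python
list1 = [1, 2, 3, 4]

doubledDict = {x: x * 2 for x in list1}  # dictionary: key: value pairs
remainders = {x % 2 for x in list1}      # set: duplicates collapse
doubledGen = (x * 2 for x in list1)      # generator: evaluated lazily

# doubledDict is {1: 2, 2: 4, 3: 6, 4: 8}
# remainders is {0, 1}
# list(doubledGen) is [2, 4, 6, 8]
```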
One difference from map is that iterating multiple iterables at the same time is not as straightforward and requires a bit more work.
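One common workaround is to pair the iterables with zip first; a minimal sketch:

```python
list1 = [1, 2, 3]
list2 = [10, 20, 30]

# zip pairs the iterables so one comprehension can consume both
sums = [a + b for a, b in zip(list1, list2)]
# sums is [11, 22, 33]
```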
Now, what about filtering? It works fairly similarly; a conditional just needs to be added. For example:
list2 = [x for x in list1 if x % 2 == 0]
That will return all even numbers, in the same way as the filter function. The conditional adds another dimension to this, and if an else branch is wanted, the conditional expression moves before the for and the structure changes slightly, like this:
list2 = [x if x%2 == 0 else 0 for x in list1]
Overall, comprehensions are a good alternative to map and filter, and often even better, because most of the time they result in shorter and more concise code. But things can get very complex with comprehensions, and if one is not careful, the result can be very hard-to-read code. A good practice is to keep them simple; if complex operations are needed, it is best to call a separate function inside the comprehension.
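That last practice can be sketched like this (the function and data here are illustrative):

```python
def normalize(text):
    # the complex logic lives in a named function, keeping
    # the comprehension itself short and readable
    return text.strip().lower()

names = ["  Alice", "BOB  ", " Carol "]
clean = [normalize(n) for n in names]
# clean is ['alice', 'bob', 'carol']
```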
Some Considerations
Having a functional approach to code in Python can offer many advantages in how the code works and how easy it is to work with. Having pure functions, maintaining some sense of immutability, and making use of higher-order functions can make the code cleaner and avoid unexpected problems. The functions and tools presented (map, filter, reduce, comprehensions) are great ways of achieving this. But there are many other useful tools that were not covered (decorators, pattern matching, etc.), and, more importantly, these approaches do not invalidate more common ways of iterating or building code. Here are some points to consider:
- Maintaining a functional approach is mostly the responsibility of the programmer. The language (in this case Python) only gives tools and features that enable the approach but does not guarantee it. Comprehensions, map, filter, and reduce are really good for creating new structures, but when no mutation is involved, a classic for loop is also a good choice.
- Some functional patterns are fairly fast, sometimes a bit faster than a traditional approach, but they can carry memory overhead if one is not careful. For example, achieving immutability means creating new structures from existing ones, so starting from 1MB of data can end up with 2MB or more in use. When working with very big data sets, it is important to keep the amount of data stored in memory in check.
- More compact code can also be less readable if everything is in one line or uses undescriptive names. Comprehensions, filter, map, and reduce can make code shorter, but it is better to divide code into more functions so it is easier to read and understand. In addition, dividing the code into smaller pieces helps make it more testable and maintainable.
- Functional programming is not something that needs to be fully used or not used at all in Python. It can be considered to be something akin to a “style” of programming. Its principles can be used with other paradigms based on the problems that need to be solved. At the end of the day, Python is an Object-Oriented language by design, so everything will use objects under the hood.
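As a closing illustration of the memory point above: when only an aggregate result is needed, a generator expression keeps memory usage flat by producing one item at a time instead of materializing an intermediate list:

```python
# no million-element list is ever built; values stream into sum()
total = sum(x * 2 for x in range(1_000_000))
```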