The patch will not be noticeable if the pattern is skilfully matched.
—Idabelle McGlauflin, Handicraft for Girls
Pattern matching is everywhere in Haskell. It's simple, but very interesting: there's more to it than just from left to right and from top to bottom. One way to better understand pattern matching is to answer two questions:
- What is the syntax of pattern matching?
- What are the semantics of pattern matching?
In order to summarize the different types of patterns for pattern
matching, we can study the definitions of three standard Haskell
functions: span
, unzip
, and fromMaybe
.
First, let's consider the span
function, which splits a list
into the longest prefix of elements that satisfy a predicate and the
remainder of the list, as defined in GHC.List
:
span :: (a -> Bool) -> [a] -> ([a],[a])
span _ xs@[] = (xs,xs)
span p xs@(x:xs')
| p x = let (ys,zs) = span p xs' in (x:ys,zs)
| otherwise = ([],xs)
The span
function has:
-
A variable,
p
, which matches anything and bindsp
to the value matched against. -
A wildcard pattern,
_
, which matches anything and binds nothing. -
Two as patterns,
xs@[]
andxs@(x:xs')
, which match something and bindxs
to[]
or appropriate values if[]
and(x:xs')
match something, respectively.
Now, let's take a look at the unzip
function, which
"transforms a list of pairs into a list of first components and a list
of second components," as defined in GHC.List
:
unzip :: [(a,b)] -> ([a],[b])
unzip = foldr (\(a,b) ~(as,bs) -> (a:as,b:bs)) ([],[])
The unzip
function contains:
- A lazy pattern,
~(as,bs)
, which matches anything, and bindsas
andbs
to appropriate values if(as,bs)
matches something, or toundefined
otherwise.
Finally, let's consider the fromMaybe
function, which
takes a default value and a Maybe
, and returns the value contained
in the Maybe
or the default value if there's nothing there, as
defined in Data.Maybe
:
fromMaybe :: a -> Maybe a -> a
fromMaybe d mx =
case mx of
Nothing -> d
Just x -> x
The fromMaybe
function contains regular patterns inside a case
expression. In reality, all patterns are transformed to case
expressions, and the (formal) semantics of pattern matching are
actually the semantics of case
expressions, as described in the
Haskell 2010 Language Report.
However, we can study how pattern matching works in terms of patterns
other than the ones inside case
expressions, that is, the informal
semantics of pattern matching.
In short, the informal semantics of pattern matching are as follows:
- Patterns are matched against values.
- Pattern matching may
- succeed,
- fail, or
- diverge (that is,
undefined
). - Pattern matching proceeds
- from left to right and
- from top to bottom.
- Patterns can be
- irrefutable, which always succeed, even when matched against
undefined
, or - refutable, which may succeed or fail, but always diverge when
matched against
undefined
.
Additionally, there are five different types of irrefutable patterns:
variables, wildcards, as patterns where the actual pattern is
irrefutable, and lazy patterns1. All other patterns (for instance,
[]
and Nothing
, which are constructors defined by data
) are
refutable.
(For more information about pattern matching and the (formal and informal) semantics of pattern matching, see section 3.7 of the Haskell 2010 Language Report or section 4 of A Gentle Introduction to Haskell 98.)
There are subtle differences between refutable and irrefutable patterns. As examples, let's analyze two computations taken from the Haskell 2010 Language Report:
> (\ ~(x,y) -> 0) undefined
0
Here, matching the lazy pattern ~(x,y)
against undefined
succeeds,
and binds both x
and y
to undefined
because matching the pattern
(x,y)
against undefined
would diverge. Since binding does not
imply evaluation and neither x
nor y
are used anywhere else, the
result of the computation is 0
.
What happens if we remove the lazy pattern?
> (\ (x,y) -> 0) undefined
undefined
In this case, matching (x,y)
against undefined
diverges and, for
that reason, the computation diverges, that is, undefined
.
> (\ (x:xs) -> x:x:xs) undefined
undefined
Here, the computation diverges because matching (x:xs)
against
undefined
diverges.
What happens if we add a lazy pattern?:
> (\ ~(x:xs) -> x:x:xs) undefined
undefined:undefined:undefined
Matching ~(x:xs)
against undefined
succeeds. Since matching
(x:xs)
against undefined
would diverge, both x
and xs
are
bound to undefined
. Thus, x:x:xs
becomes
undefined:undefined:undefined
, a list which GHCi would start
printing and stop after the first undefined
, that is, [undefined
.
In section 4.2 of A Gentle Introduction to Haskell
98, the authors discuss two slightly different
versions of the take
function to show how subtle changes in patterns
yield different functions.
To wrap up, let's compare two slightly different versions of the
drop
function:
drop1 :: Int -> [a] -> [a]
drop1 n xs | n <= 0 = xs
drop1 _ [] = []
drop1 n (x:xs) = drop1 (n - 1) xs
drop2 :: Int -> [a] -> [a]
drop2 _ [] = []
drop2 n xs | n <= 0 = xs
drop2 n (x:xs) = drop2 (n - 1) xs
(drop1
is just drop
as defined in GHC.List
.)
What happens if the first argument of drop
is undefined
?
> drop1 undefined []
undefined
> drop2 undefined []
[]
drop2
is more defined with respect to its first argument. In other
words, drop1
is strict in its first argument, whereas drop2
is
nonstrict2.
What happens if the second argument of drop
is undefined
?
> drop1 0 undefined
undefined
> drop2 0 undefined
undefined
In this case, both functions are equally undefined, that is, both are strict in its second argument.
Even though both functions satisfy the specification of the drop
function (they return the suffix of a list after a given number of
elements), their meaning is slightly different. As Hudak, Peterson,
and Fasel suggest, "it is difficult to say (...) which definition is
better. (...) In certain applications, it may make a difference."
This is by no means a thorough examination of pattern matching in
Haskell. If anything, these examples show that thinking in terms of
the semantics of pattern matching, and, more specifically, in terms of
undefined
, irrefutable and refutable patterns, and nonstrict and
strict functions, is interesting and can really help in better
understanding pattern matching and Haskell in general.
Happy pattern matching!