The Idris 2 Programming Language
by Stefan Höck, Nathan McCarty, and others
Welcome to the community Idris 2 tutorial! This book aims to be a comprehensive resource for learning the Idris 2 programming language.
This book is rendered from a collection of Idris source files structured as a normal Idris project, which you can download and play around with.
Introduction
Many of the Markdown files making up this book (those with a .md
file extension) are literate Idris files, consisting of a mixture of Markdown and Idris code, and can be type checked and built just like regular code by the Idris compiler. You can identify a document as a literate Idris document if it contains a module
declaration, like so:
module Tutorial.Intro
Even though this file (src/Tutorial/Intro.md
) has no actual code in it, by including that module
declaration, it qualifies as a literate Idris file. A module name consists of a list of identifiers separated by dots and must reflect the folder structure plus the module file's name, starting from the source directory. For instance, as this file's path, from the root of the src directory, is Tutorial/Intro.md, its module name must be Tutorial.Intro.
Before starting this book, make sure you have the Idris compiler installed on your computer. While it is technically possible to work through this book without it, we recommend that you have the pack package manager installed and have a skeleton package set up as described in the Getting Started with pack and Idris2 appendix, as such a setup is assumed.
Later in the book, you will encounter various exercises. The solutions to these exercises can be found as regular Idris files in the src/Solutions
directory of the git repository, or in syntax-highlighted form in the "Exercise Solutions" section at the bottom of the navigation sidebar.
About the Idris Programming Language
Idris is a pure, dependently typed, total functional programming language.
Let's break that down and explore what each of those terms means on its own.
Functional Programming
In functional programming languages, functions are first-class constructs, meaning that they can be assigned to variables, passed as arguments to other functions, and returned as results from functions, just like any other value in the language. Unlike in, for instance, object-oriented languages, functions are the main form of abstraction in functional programming.
Whenever we find a common pattern or (almost) identical code in several parts of a project, we try to implement an abstraction over it to avoid writing the same code multiple times. In functional programming, we do this by introducing one or more new functions implementing the required behavior, often trying to be as general as possible to maximize the versatility and reusability of our functions.
Functional programming languages are concerned with the evaluation of functions, unlike imperative languages, which are concerned with the execution of statements.
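As a minimal sketch of what "first class" means in practice, the following definitions treat functions as ordinary values. The names increment, next, applyToZero, and adder are made up for this illustration and do not appear elsewhere in the book; the syntax is explained in detail in later sections.

```idris
-- An ordinary function (hypothetical example).
increment : Integer -> Integer
increment n = n + 1

-- A function assigned to a new name, just like any other value.
next : Integer -> Integer
next = increment

-- A function taking another function as an argument.
applyToZero : (Integer -> Integer) -> Integer
applyToZero f = f 0

-- A function returning a function as its result.
adder : Integer -> (Integer -> Integer)
adder k = \n => n + k
```

With these definitions, applyToZero (adder 5) evaluates adder 5 to a new function and then applies that function to 0.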
Pure Functional Programming
Pure functional programming languages come with an additional important guarantee:
Functions don't have side effects, like writing to a file or mutating global state. They can only compute a result from their arguments, possibly by invoking other pure functions, and nothing else. Given the same input, a pure function will always generate the same output; this property is known as referential transparency.
Pure functions have several advantages:
- They are easy to test by specifying (possibly randomly generated) sets of input arguments alongside the expected results.
- They are thread-safe. Since they don't mutate global state, they can be used in several computations running in parallel without interfering with each other.
There are, of course, also some disadvantages:
- Some algorithms are hard to implement efficiently with only pure functions.
- Writing programs that actually do something (have some observable effect) is a bit tricky, but certainly possible.
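To make the distinction a little more concrete, here is a small sketch that separates a pure computation from the part of a program with an observable effect. The function name greeting is an assumption made for this example; putStrLn and the type IO () are the same ones used for the Hello World program later in this chapter.

```idris
-- Pure: the result depends only on the argument, and nothing else happens.
greeting : String -> String
greeting name = "Hello, " ++ name ++ "!"

-- The observable effect (printing to the terminal) is confined to a
-- value of type `IO ()`.
main : IO ()
main = putStrLn (greeting "Idris")
```

Because greeting is pure, it can be tested simply by comparing its result for a given input with the expected string.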
Dependent Types
Idris is a strongly, statically typed programming language. Every expression is given a type (for instance: integer, list of strings, boolean, function from integer to boolean, etc.), and types are verified at compile time to rule out certain common programming errors.
For instance, if a function expects an argument of type String
(a sequence of unicode characters, such as "Hello123"
), it is a type error to invoke this function with an argument of type Integer
, and Idris will refuse to compile such an ill-typed program.
Being statically typed means that Idris will catch type errors at compile time, before it generates an executable program that can be run. This stands in contrast with dynamically typed languages such as Python, which check for type errors at runtime, while a program is already being executed. It is the goal of statically typed languages to catch as many type errors as possible before there even is a program that can be run.
Furthermore, Idris is dependently typed, which is one of its most characteristic properties in comparison to other programming languages. In Idris, types are first class: Types can be passed as arguments to functions, and functions can return types as their results. Types can also depend on other values; for example, the return type of a function can depend on the value of one of its arguments. This is a quite abstract statement that may be difficult to grasp at first, but we will be exploring its meaning and the profound impact it has on programming through example as we move through this book.
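As a first taste, the following sketch shows a function whose return type depends on the value of its argument. The names IntOrString and intOrString are hypothetical and chosen only for this illustration; the syntax used here is covered properly in later chapters.

```idris
-- A function that computes a type from a Bool value.
IntOrString : Bool -> Type
IntOrString True  = Integer
IntOrString False = String

-- The return type depends on the value of the argument `b`:
-- for True we must return an Integer, for False a String.
intOrString : (b : Bool) -> IntOrString b
intOrString True  = 42
intOrString False = "forty-two"
```

Idris checks each equation against the type computed for that particular argument, so swapping the two right-hand sides would be rejected at compile time.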
Total Functions
A total function is a pure function which is guaranteed to return a value of its return type for every possible set of inputs in a finite number of computational steps. A total function will never fail with an exception or loop infinitely, although it can still take arbitrarily long to compute its result.
Idris comes with a totality checker built-in, which allows us to verify that the functions we write are provably total. Totality in Idris is opt-in, as checking the totality of an arbitrary computer program is undecidable in the general case (a dilemma you may recognize as the halting problem). However, if we annotate a function with the total
keyword, and the totality checker is unable to verify that the function is, indeed, total, Idris will fail with a type error. Notably, failing to determine a function is total is not the same as judging the function to be non-total.
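The following sketch illustrates the opt-in nature of the check. The function names countdown and loop are assumptions made for this example, and Nat is the type of natural numbers from the Prelude, which has not been introduced yet.

```idris
-- Annotated with `total`: every recursive call is made on a structurally
-- smaller argument, so the totality checker can verify termination.
total
countdown : Nat -> Nat
countdown Z     = Z
countdown (S k) = countdown k

-- Not annotated with `total`: this function never terminates.
-- Adding a `total` annotation here would cause Idris to reject it.
loop : Integer -> Integer
loop n = loop (n + 1)
```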
Using the REPL
Idris comes with a REPL (Read Evaluate Print Loop), which is useful for tinkering with small ideas, and for quickly experimenting with the code we just wrote. To start a REPL session, run the following command in a terminal:
pack repl
Idris should now be ready to accept your commands:
     ____    __     _         ___
    /  _/___/ /____(_)____   |__ \
    / // __  / ___/ / ___/   __/ /     Version 0.5.1-3c532ea35
  _/ // /_/ / /  / (__  )   / __/      https://www.idris-lang.org
 /___/\__,_/_/  /_/____/   /____/      Type :? for help
Welcome to Idris 2. Enjoy yourself!
Main>
We can go ahead and enter some simple arithmetic expressions; Idris will evaluate them and print the results:
Main> 2 * 4
8
Main> 3 * (7 + 100)
321
Since every expression in Idris has a type, we might want to inspect those as well:
Main> :t 2
2 : Integer
:t
is a command specific to the Idris REPL (it is not part of the Idris programming language), and it is used to inspect the type of an expression:
Main> :t 2 * 4
2 * 4 : Integer
Whenever we perform calculations involving integer literals without explicitly specifying the types involved, Idris will assume the Integer type by default. Integer is an arbitrary precision (there is no hard-coded maximum value) signed integer type. It is one of the primitive types built into the language. Other primitives include fixed precision signed and unsigned integral types (Bits8, Bits16, Bits32, Bits64, Int8, Int16, Int32, and Int64), double precision (64 bit) floating point numbers (Double), unicode characters (Char) and strings of unicode characters (String).
A First Idris Program
module Tutorial.Intro.FirstIdrisProgram
While we will often start with the REPL for tinkering with small parts of the Idris language, for reading some documentation, or for inspecting the content of an Idris module, let's go ahead and write a minimal Idris program to get started with the language.
Here comes the mandatory Hello World:
main : IO ()
main = putStrLn "Hello World!"
We will inspect the code above in some detail in a moment, but first we'd like to compile and run it. If you have checked out this book's source code, you can run the following from the root directory:
pack -o hello exec src/Tutorial/Intro/FirstIdrisProgram.md
This will create an executable called hello
in the build/exec
directory, which can be invoked from the command-line like so (without the dollar prefix; this is used here to distinguish the terminal command from its output):
$ build/exec/hello
Hello World!
The pack program requires an .ipkg file to be in scope (in the current directory or one of its parent directories), which provides other settings like the source directory to use (src in our case). The optional -o option provides a name to use for the executable to be generated. Pack comes up with a name of its own if this is not provided. Type pack help for a list of available command-line options and commands, and pack help <cmd> for help with a specific command.
You can also load this source file in a REPL session and invoke function main
from there:
pack repl src/Tutorial/Intro/FirstIdrisProgram.md
Tutorial.Intro> :exec main
Hello World!
Go ahead and try both ways of building and running main
on your system!
The Shape of an Idris Definition
module Tutorial.Intro.ShapeOfADef
Now that we have executed our first Idris program, let's talk a bit more about the code we had to write to define it.
A typical top level function in Idris consists of three things:
- The function's name (main in our case)
- Its type (IO ())
- Its implementation (putStrLn "Hello World!")
Let's explore these parts through a couple of examples, starting out by defining a constant for the largest unsigned 8 bit integer:
maxBits8 : Bits8
maxBits8 = 255
The first line can be read as:
We'd like to declare a (nullary, or zero argument) function maxBits8. It is of type Bits8.
This is called the function declaration: we declare that there shall be a function of the given name and type.
The second line reads:
The result of invoking maxBits8 should be 255. (As you can see, we can use integer literals for other integral types and not just Integer.)
This is called the function definition: the function maxBits8 should behave as described here when being evaluated.
We can inspect this at the REPL. Load this source file into an Idris REPL (as described in the previous section, this time using src/Tutorial/Intro/ShapeOfADef.md as the source file), and try running the following tests:
Tutorial.Intro> maxBits8
255
Tutorial.Intro> :t maxBits8
Tutorial.Intro.maxBits8 : Bits8
We can also use maxBits8
as part of another expression:
Tutorial.Intro> maxBits8 - 100
155
We previously described maxBits8
as a nullary function, which is just a fancy word for a constant. Let's write and test our first real function:
distanceToMax : Bits8 -> Bits8
distanceToMax n = maxBits8 - n
This introduces some new syntax and a new kind of type: Function types.
distanceToMax : Bits8 -> Bits8
can be read as:
distanceToMax is a function of one argument, with type Bits8, which returns a result of type Bits8.
In the implementation, the argument is given a local identifier (a fancy term for "name") n
, which is then used in the calculation on the right hand side. Go ahead and try this function at the REPL:
Tutorial.Intro> distanceToMax 12
243
Tutorial.Intro> :t distanceToMax
Tutorial.Intro.distanceToMax : Bits8 -> Bits8
Tutorial.Intro> :t distanceToMax 12
distanceToMax 12 : Bits8
As a final example, let's implement a function that calculates the square of an integer:
square : Integer -> Integer
square n = n * n
We now learn a very important aspect of programming in Idris: Idris is a statically typed programming language. We are not allowed to freely mix types as we please; doing so will result in an error message from the type checker (which is part of Idris's compilation process). For instance, if we try the following at the REPL, we will get a type error:
Tutorial.Intro> square maxBits8
Error: ...
This is because square
expects an argument of type Integer
, but maxBits8
is of type Bits8
. Many primitive types can be converted back and forth between each other (sometimes with the risk of loss of precision) using function cast
(we will cover cast
in further detail in the section on Interfaces in the Prelude):
Tutorial.Intro> square (cast maxBits8)
65025
Notice that the above result is much larger than maxBits8
. This is because maxBits8
is first converted to an Integer
of the same value, which is then squared. If we instead squared maxBits8
directly, the result would be truncated to still fit in the range of valid Bits8 values:
Tutorial.Intro> maxBits8 * maxBits8
1
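The two behaviours described above can also be captured in a pair of small definitions. This is only a sketch: the names squareWide and squareNarrow are invented for this example, and square is repeated so that the snippet is self-contained.

```idris
square : Integer -> Integer
square n = n * n

-- Converts to Integer first, so the result is not limited to the
-- range of Bits8.
squareWide : Bits8 -> Integer
squareWide n = square (cast n)

-- Stays within Bits8, so the result is truncated to fit that type.
squareNarrow : Bits8 -> Bits8
squareNarrow n = n * n
```

Applied to 255, squareWide corresponds to the square (cast maxBits8) example and squareNarrow to maxBits8 * maxBits8.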
Where to get Help
There are several resources available online and in print, where you can find help and documentation about the Idris programming language. Here is a non-comprehensive list:
- Type-Driven Development with Idris
  The Idris book! This describes in great detail the core concepts for using Idris and dependent types to write robust and concise code. It uses Idris 1 in its examples, so parts of it have to be slightly adjusted when using Idris 2. There is also a list of required updates.
- The official Idris 2 tutorial. A comprehensive but dense explanation of all features of Idris 2. I find this to be useful as a reference, and as such it is highly accessible. However, it is not an introduction to functional programming or type-driven development in general.
- Look here for detailed installation instructions and some introductory material. There is also a wiki, where you can find a list of editor plugins, a list of external backends, and other useful information.
- This is the listing of all the libraries included in pack's collection, which is currently the most comprehensive source of community contributed libraries for Idris 2.
- If you get stuck with a piece of code, want to ask about some obscure language feature, want to promote your new library, or want to just hang out with other Idris programmers, this is the place to go. The discord channel is pretty active and very friendly towards newcomers.
- The Idris REPL
  Finally, a lot of useful information can be provided by Idris itself. Many users tend to keep at least one REPL open while working on an Idris project. Text editors can be set up to use the language server for Idris 2, which is incredibly useful. In the REPL,
  - use :t to inspect the type of an expression or meta variable (hole): :t foldl,
  - use :ti to inspect the type of a function including implicit arguments: :ti foldl,
  - use :m to list all meta variables (holes) in scope,
  - use :doc to access the documentation of a top level function (:doc the), a data type plus all its constructors and available hints (:doc Bool), a language feature (:doc case, :doc let, :doc interface, :doc record, or even :doc ?), or an interface (:doc Uninhabited),
  - use :module to import a module from one of the available packages: :module Data.Vect,
  - use :browse to list the names and types of all functions exported by a loaded module: :browse Data.Vect,
  - use :help to get a list of other commands plus a short description for each.
Conclusion
In this introduction we learned about the most basic features of the Idris programming language. We used the REPL to tinker with our ideas and inspect the types of things in our code, and we used the Idris compiler to compile an Idris source file to an executable.
We also learned about the basic shape of a top level definition in Idris, which always consists of an identifier (its name), a type, and an implementation.
What's next?
In the next chapter, we start programming in Idris for real. We learn how to write our own pure functions, how functions compose, and how we can treat functions just like other values and pass them around as arguments to other functions.
Introduction To Functions
Idris is a functional programming language: functions are its main form of abstraction (unlike, for instance, in an object-oriented language like Java, where objects and classes are the main form of abstraction). Thus, we expect Idris to make it very easy for us to compose and combine functions to create new functions. In fact, in Idris functions are first class: functions can take other functions as arguments and can return functions as their results.
This chapter will explore some of the basic tools Idris provides for combining and producing functions.
Functions with more than one Argument
module Tutorial.Functions1.FunctionsWithMultipleArguments
Let's implement a function which checks if its three Integer arguments form a Pythagorean triple. We'll need to use a new operator for this: ==, the equality operator.
export
isTriple : Integer -> Integer -> Integer -> Bool
isTriple x y z = x * x + y * y == z * z
Let's give this a spin at the REPL before we talk about the types:
Tutorial.Functions1> isTriple 1 2 3
False
Tutorial.Functions1> isTriple 3 4 5
True
As this example demonstrates, the type of a function of several arguments consists of a sequence of argument types (also called input types) chained by function arrows (->
), terminated by an output type (Bool
in this case).
The implementation looks like a mathematical equation: The arguments are listed on the left hand side of the =
, and the computation(s) to perform with them are described on the right hand side.
Function implementations in functional programming languages often have a more mathematical look compared to implementations in imperative languages, which often describe not what to compute, but instead how to compute it by describing an algorithm as a sequence of imperative statements. This imperative style is also available in Idris, and we will explore it in later chapters, but we prefer the declarative style whenever possible.
As shown in the above example, functions can be invoked by passing the arguments separated by whitespace. No parentheses are necessary, unless one of the expressions we pass as the function's arguments contains its own additional whitespace. This syntax provides for particularly ergonomic partial function application, a concept we will cover in a later section.
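The rule about parentheses can be seen in the following sketch. The names simpleArgs and compoundArgs are made up for this illustration, and isTriple is repeated so that the snippet is self-contained.

```idris
isTriple : Integer -> Integer -> Integer -> Bool
isTriple x y z = x * x + y * y == z * z

-- Simple arguments are separated by whitespace, no parentheses required.
simpleArgs : Bool
simpleArgs = isTriple 3 4 5

-- Arguments that are themselves compound expressions must be wrapped
-- in parentheses.
compoundArgs : Bool
compoundArgs = isTriple (1 + 2) (2 * 2) (10 - 5)
```

Both definitions invoke isTriple with the values 3, 4, and 5.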
Note that, unlike Integer or Bits8, Bool is not a primitive data type built into the Idris language but just a normal data type that you could have written yourself. We will cover data type definitions in the next chapter.
Function Composition
module Tutorial.Functions1.FunctionComposition
Functions can be combined in several ways, the most direct probably being the dot (.
) operator:
export
square : Integer -> Integer
square n = n * n
times2 : Integer -> Integer
times2 n = 2 * n
squareTimes2 : Integer -> Integer
squareTimes2 = times2 . square
Give this a try at the REPL! Does it do what you'd expect?
We could have implemented squareTimes2
without using the dot operator as follows:
squareTimes2' : Integer -> Integer
squareTimes2' n = times2 (square n)
To get a better insight into how the dot operator works, let's implement our own version of it, instead called <.>
to avoid name collision with the built-in dot operator:
private infixr 9 <.>
(<.>) : (b -> c) -> (a -> b) -> a -> c
f <.> g = \x => (f (g x))
We'll cover more about functions that take other functions as arguments in the next section, but for now, it suffices to know that our <.>
is identical to the built-in .
, and can be used the same way:
squareTimes2'' : Integer -> Integer
squareTimes2'' = times2 <.> square
It is important to note that functions chained by the dot operator are invoked from right to left: times2 . square
is the same as \n => times2 (square n)
and not \n => square (times2 n)
. This can be seen in our definition of <.>
.
We can conveniently chain several functions using the dot operator to write more complex functions:
dotChain : Integer -> String
dotChain = reverse . show . square . square . times2 . times2
This will first multiply the argument by four, then square it twice before converting it to a string (show
) and reversing the resulting String
(functions show
and reverse
are part of the Idris Prelude and as such are available in every Idris program).
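For comparison, here is a sketch of the same function written without the dot operator. The name dotChain' is an assumption made for this example; the helper functions are repeated so that the snippet is self-contained.

```idris
square : Integer -> Integer
square n = n * n

times2 : Integer -> Integer
times2 n = 2 * n

dotChain : Integer -> String
dotChain = reverse . show . square . square . times2 . times2

-- The same computation with explicit argument and nested parentheses.
-- Reading from the inside out gives the order of evaluation.
dotChain' : Integer -> String
dotChain' n = reverse (show (square (square (times2 (times2 n)))))
```

The innermost call in dotChain' corresponds to the rightmost function in dotChain, which matches the right-to-left order described above.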
Higher-order Functions
module Tutorial.Functions1.HigherOrder
import Tutorial.Functions1.FunctionComposition
Functions can take other functions as arguments. This is an incredibly powerful concept which can be taken to an extreme very easily, but to keep things simple, we'll start slowly:
isEven : Integer -> Bool
isEven n = mod n 2 == 0
testSquare : (Integer -> Bool) -> Integer -> Bool
testSquare fun n = fun (square n)
In the above definition, isEven
uses the mod
function to check if an integer is divisible by two, and is defined in the same straightforward manner as the other functions we have defined so far.
testSquare
, however, is more interesting. It takes two arguments, the first argument having the type of a function from Integer
to Bool
, and the second having type Integer
. The second argument is squared before being passed to the first argument.
Let's give this a go at the REPL:
Tutorial.Functions1> testSquare isEven 12
True
Take your time to understand what's going on here. We pass the function isEven
as the first argument to testSquare
. The second argument is an integer, which will first be squared and then passed to isEven
. While this particular example is not very interesting, we will cover lots of use cases for passing functions as arguments to other functions as we continue.
As noted earlier, things can go to an extreme pretty easily. Consider the following example:
twice : (Integer -> Integer) -> Integer -> Integer
twice f n = f (f n)
And at the REPL:
Tutorial.Functions1> twice square 2
16
Tutorial.Functions1> (twice . twice) square 2
65536
Tutorial.Functions1> (twice . twice . twice . twice) square 2
*** huge number ***
You might be surprised about this behavior, so let's break it down. The following two expressions are identical in their behavior:
expr1 : Integer -> Integer
expr1 = (twice . twice . twice . twice) square
expr2 : Integer -> Integer
expr2 = twice (twice (twice (twice square)))
Let's walk through this:
- square raises its argument to the 2nd power
- twice square applies square twice, raising its argument to the 4th power
- twice (twice square) raises it to the 16th power, by invoking twice square twice
- And so on until twice (twice (twice (twice square))), which raises its argument to the 65536th power, giving an impressively huge result
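The first steps of this walkthrough can be written down as named definitions. This is just a sketch: fourthPower and sixteenthPower are hypothetical names introduced for this example, and square and twice are repeated so that the snippet is self-contained.

```idris
square : Integer -> Integer
square n = n * n

twice : (Integer -> Integer) -> Integer -> Integer
twice f n = f (f n)

-- Applies square twice: (n ^ 2) ^ 2
fourthPower : Integer -> Integer
fourthPower = twice square

-- Applies fourthPower twice: (n ^ 4) ^ 4
sixteenthPower : Integer -> Integer
sixteenthPower = twice (twice square)
```

Applying sixteenthPower to 2 gives the same result as (twice . twice) square 2 in the REPL session above.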
Currying
module Tutorial.Functions1.Currying
import Tutorial.Functions1.FunctionsWithMultipleArguments
Once we start using higher-order functions, the concept of partial function application (also called currying after mathematician and logician Haskell Curry) becomes very important.
Load this file in a REPL session and try the following:
Tutorial.Functions1.Currying> :t testSquare isEven
testSquare isEven : Integer -> Bool
Tutorial.Functions1.Currying> :t isTriple 1
isTriple 1 : Integer -> Integer -> Bool
Tutorial.Functions1.Currying> :t isTriple 1 2
isTriple 1 2 : Integer -> Bool
Notice how in Idris we can partially apply a function with more than one argument and, as a result, get a new function back. For instance, isTriple 1
applies argument 1
to function isTriple
and returns a new function of type Integer -> Integer -> Bool
. We can even use the result of such a partially applied function in a new top level definition:
partialExample : Integer -> Bool
partialExample = isTriple 3 4
And at the REPL:
Tutorial.Functions1.Currying> partialExample 5
True
We already used partial function application in our twice
examples above to get some impressive results with very little code.
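One way to see why partial application works is to note that the function arrow associates to the right. The following sketch uses a made-up function add3 (not part of this book's sources) to show the explicitly parenthesized type together with two partial applications.

```idris
add3 : Integer -> Integer -> Integer -> Integer
add3 x y z = x + y + z

-- The same type with the implicit parentheses written out:
-- a function of one argument returning a function of the rest.
add3' : Integer -> (Integer -> (Integer -> Integer))
add3' = add3

-- Supplying one argument leaves a function of two arguments.
partial1 : Integer -> Integer -> Integer
partial1 = add3 1

-- Supplying two arguments leaves a function of one argument.
partial2 : Integer -> Integer
partial2 = add3 1 2
```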
Anonymous Functions
module Tutorial.Functions1.Lambdas
import Tutorial.Functions1.FunctionsWithMultipleArguments
Sometimes we'd like to pass a small custom function to a higher-order function, but without the hassle of writing a top level definition. For instance, in the following example, function someTest
is very specific and probably not very useful in general, but we'd still like to pass it to higher-order function testSquare
:
someTest : Integer -> Bool
someTest n = n >= 3 || n <= 10
Here's how to pass it to testSquare
:
Tutorial.Functions1> testSquare someTest 100
True
Instead of defining and using someTest
, we can use an anonymous function:
Tutorial.Functions1> testSquare (\n => n >= 3 || n <= 10) 100
True
For clarity, let's use an anonymous function to reproduce the above definition:
someTest' : Integer -> Bool
someTest' = \n => n >= 3 || n <= 10
Anonymous functions are sometimes also called lambdas (from lambda calculus), and the backslash is chosen since it resembles the Greek letter lambda.
The \n =>
syntax introduces a new anonymous function of one argument called n
, the implementation of which is given on the right hand side of the function arrow. Like top level functions, lambdas can have more than one argument, separated by commas: \x,y => x * x + y
. When we pass lambdas as arguments to higher-order functions, they typically need to be wrapped in parentheses or separated by the dollar operator ($)
(see the next section about this).
Note that, in a lambda, arguments are not annotated with types, so Idris has to be able to infer them from the current context.
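Here is a short sketch of lambdas in positions where Idris can infer the argument types from the surrounding type signatures. All names (triple, squarePlus, withSeven, sevenMinusFive) are hypothetical and only serve this illustration.

```idris
-- A lambda of one argument; `n` is inferred to be an Integer
-- from the declared type.
triple : Integer -> Integer
triple = \n => 3 * n

-- A lambda of two arguments, separated by a comma.
squarePlus : Integer -> Integer -> Integer
squarePlus = \x,y => x * x + y

-- A higher-order function applying its argument to 7.
withSeven : (Integer -> Integer) -> Integer
withSeven f = f 7

-- A lambda passed directly as an argument, wrapped in parentheses.
sevenMinusFive : Integer
sevenMinusFive = withSeven (\n => n - 5)
```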
Operators
module Tutorial.Functions1.Operators
In Idris, infix operators like .
, *
or +
are not built into the language; they are instead just regular Idris functions with some special support for using them in infix notation. When we use operators outside of infix notation, we have to wrap them in parentheses.
As an example, let us define a custom operator for sequencing functions of type Bits8 -> Bits8
:
infixr 4 >>>
(>>>) : (Bits8 -> Bits8) -> (Bits8 -> Bits8) -> Bits8 -> Bits8
f1 >>> f2 = f2 . f1
foo : Bits8 -> Bits8
foo n = 2 * n + 3
test : Bits8 -> Bits8
test = foo >>> foo >>> foo >>> foo
In addition to declaring and defining the operator itself, we also have to specify its fixity: infixr 4 >>> means that (>>>) associates to the right (meaning that f >>> g >>> h is to be interpreted as f >>> (g >>> h)) with a priority of 4. You can also have a look at the fixity of operators exported by the Prelude in the REPL:
Tutorial.Functions1> :doc (.)
Prelude.. : (b -> c) -> (a -> b) -> a -> c
Function composition.
Totality: total
Fixity Declaration: infixr operator, level 9
When you mix infix operators in an expression, those with a higher priority bind more tightly. For instance, (+) is left associative with a priority of 8, while (*) is left associative with a priority of 9. Hence, a * b + c is the same as (a * b) + c instead of a * (b + c).
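The effect of priority and associativity can be sketched with a few small functions. The names below are invented for this example. The sketch also assumes that (-) is left associative, like (+); this is not stated in the text above.

```idris
-- (*) has a higher priority than (+), so this is (a * b) + c.
mixed : Integer -> Integer -> Integer -> Integer
mixed a b c = a * b + c

-- Explicit parentheses are needed to get the other grouping.
grouped : Integer -> Integer -> Integer -> Integer
grouped a b c = a * (b + c)

-- With a left associative operator, this is (a - b) - c.
leftAssoc : Integer -> Integer -> Integer -> Integer
leftAssoc a b c = a - b - c
```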
Operator Sections
Operators can be partially applied just like regular functions. In this case, the whole expression has to be wrapped in parentheses and is called an operator section. Here are two examples:
Tutorial.Functions1> testSquare (< 10) 5
False
Tutorial.Functions1> testSquare (10 <) 5
True
As you can see, there is a difference between (< 10)
and (10 <)
. The first tests whether its argument is less than 10, and the second tests whether 10 is less than its argument.
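The difference between the two sections becomes clearer when each is written next to an equivalent anonymous function. The definition names in this sketch are hypothetical.

```idris
-- The section (< 10) puts its argument on the left of the operator.
lessThanTen : Integer -> Bool
lessThanTen = (< 10)

lessThanTen' : Integer -> Bool
lessThanTen' = \n => n < 10

-- The section (10 <) puts its argument on the right of the operator.
moreThanTen : Integer -> Bool
moreThanTen = (10 <)

moreThanTen' : Integer -> Bool
moreThanTen' = \n => 10 < n
```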
One exception where operator sections will not work is with the minus operator (-)
. Here is an example to demonstrate this:
applyToTen : (Integer -> Integer) -> Integer
applyToTen f = f 10
This is just a higher-order function applying its function argument to the number ten. This works very well in the following example:
Tutorial.Functions1> applyToTen (* 2)
20
However, if we want to subtract five from ten, the following will fail:
Tutorial.Functions1> applyToTen (- 5)
Error: Can't find an implementation for Num (Integer -> Integer).
(Interactive):1:12--1:17
1 | applyToTen (- 5)
The problem here is that Idris treats - 5
as an integer literal instead of an operator section. In this special case, we have to use an anonymous function instead:
Tutorial.Functions1> applyToTen (\x => x - 5)
5
Infix Notation for Non-Operators
In Idris, it is possible to use infix notation for regular binary functions by wrapping them in backticks. It is even possible to define a precedence (fixity) for these and use them in operator sections, just like regular operators:
infixl 8 `plus`
infixl 9 `mult`
plus : Integer -> Integer -> Integer
plus = (+)
mult : Integer -> Integer -> Integer
mult = (*)
arithTest : Integer
arithTest = 5 `plus` 10 `mult` 12
arithTest' : Integer
arithTest' = 5 + 10 * 12
Operators exported by the Prelude
Here is a list of important operators exported by the Prelude:
- (.): Function composition
- (+): Addition
- (*): Multiplication
- (-): Subtraction
- (/): Division
- (==): True, if two values are equal
- (/=): True, if two values are not equal
- (<=), (>=), (<), and (>): Comparison operators
- ($): Function application
Most of these are constrained, that is, they work only for types implementing a certain interface. Don't worry about this right now. We will learn about interfaces in their own chapter later, and the operators behave as they intuitively should. For instance, addition and multiplication work for all numeric types, and comparison operators work for almost all types in the Prelude with the exception of functions.
The most special of the above is the last one, ($)
. It has a priority of 0, so all other operators bind more tightly. In addition, function application binds more tightly, so this can be used to reduce the number of parentheses required in an expression. For instance, instead of writing isTriple 3 4 (2 + 3 * 1)
we can write isTriple 3 4 $ 2 + 3 * 1
, with exactly the same meaning. Sometimes this helps readability, other times it doesn't. You will naturally build an intuition for which form of a given expression is more readable with experience, especially from rereading your own code after some time has passed. The important thing to remember is that fun $ x y
is just the same as fun (x y)
.
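The following sketch places both spellings side by side. The names withParens, withDollar, nestedParens, and nestedDollar are assumptions made for this example; isTriple and square are repeated so that the snippet is self-contained.

```idris
isTriple : Integer -> Integer -> Integer -> Bool
isTriple x y z = x * x + y * y == z * z

square : Integer -> Integer
square n = n * n

-- The last argument wrapped in parentheses.
withParens : Integer -> Bool
withParens n = isTriple 3 4 (2 + 3 * n)

-- The same expression using the dollar operator.
withDollar : Integer -> Bool
withDollar n = isTriple 3 4 $ 2 + 3 * n

-- Nested parentheses.
nestedParens : Integer -> String
nestedParens n = show (square (n + 1))

-- The same expression with a chain of dollar operators.
nestedDollar : Integer -> String
nestedDollar n = show $ square $ n + 1
```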
Introductory Function Exercises
module Tutorial.Functions1.Exercises
The solutions to these exercises can be found in src/Solutions/Functions1.idr
.
Exercise 1
Reimplement functions testSquare
and twice
by using the dot operator and dropping the second arguments (have a look at the implementation of squareTimes2
to get an idea where this should lead you). This highly concise way of writing function implementations is sometimes called point-free style and is often the preferred way of writing small utility functions.
Exercise 2
Declare and implement function isOdd
by combining functions isEven
from above and not
(from the Idris Prelude). Use point-free style.
Exercise 3
Declare and implement function isSquareOf
, which checks whether its first Integer
argument is the square of the second argument.
Exercise 4
Declare and implement function isSmall
, which checks whether its Integer
argument is less than or equal to 100. Use one of the comparison operators <=
or >=
in your implementation.
Exercise 5
Declare and implement function absIsSmall
, which checks whether the absolute value of its Integer
argument is less than or equal to 100. Use functions isSmall
and abs
(from the Idris Prelude) in your implementation, which should be in point-free style.
Exercise 6
In this slightly extended exercise we are going to implement some utilities for working with Integer
predicates (functions from Integer
to Bool
). Implement the following higher-order functions (use boolean operators &&
, ||
, and function not
in your implementations):
-- return true, if and only if both predicates hold
and : (Integer -> Bool) -> (Integer -> Bool) -> Integer -> Bool
-- return true, if and only if at least one predicate holds
or : (Integer -> Bool) -> (Integer -> Bool) -> Integer -> Bool
-- return true, if the predicate does not hold
negate : (Integer -> Bool) -> Integer -> Bool
After solving this exercise, give it a go in the REPL. In the example below, we use binary function and
in infix notation by wrapping it in backticks. This is just a syntactic convenience to make certain function applications more readable:
Tutorial.Functions1> negate (isSmall `and` isOdd) 73
False
Exercise 7
As explained above, Idris allows us to define our own infix operators. Even better, Idris supports overloading of function names, that is, two functions or operators can have the same name, but different types and implementations. Idris will make use of the types to distinguish between equally named operators and functions.
This allows us to reimplement functions and
, or
, and negate
from Exercise 6 by using the existing operator and function names from boolean algebra:
-- return true, if and only if both predicates hold
(&&) : (Integer -> Bool) -> (Integer -> Bool) -> Integer -> Bool
x && y = and x y
-- return true, if and only if at least one predicate holds
(||) : (Integer -> Bool) -> (Integer -> Bool) -> Integer -> Bool
-- return true, if the predicate does not hold
not : (Integer -> Bool) -> Integer -> Bool
Implement the other two functions and test them at the REPL:
Tutorial.Functions1> not (isSmall && isOdd) 73
False
Conclusion
In this chapter, we learned:
- A function in Idris can take an arbitrary number of arguments, separated by `->` in the function's type.
- Functions can be combined sequentially using the dot operator `(.)`, which leads to concise code.
- Functions can be partially applied by passing them fewer arguments than they expect, resulting in a new function expecting the remaining arguments. This technique is called partial application.
- Functions can be passed as arguments to other functions, which allows us to easily combine small units of code to create more complex behavior.
- If writing a corresponding top level function would be too cumbersome, we can pass anonymous functions (lambdas) to higher-order functions.
- Idris allows us to define our own infix operators. These have to be written in parentheses unless they are being used in infix notation.
- Infix operators can also be partially applied. These operator sections have to be wrapped in parentheses, and the position of the argument determines whether it is used as the operator's first or second argument.
- Idris supports name overloading: functions can have the same name but different implementations. Idris will decide which function to use based on the types involved.
Please note that function and operator names within a single module must be unique. In order to define two functions with the same name, they have to be declared in distinct modules. If Idris is not able to decide which of the two functions to use, we can help name resolution by prefixing a function with (a part of) its namespace:
Tutorial.Functions1> :t Prelude.not
Prelude.not : Bool -> Bool
Tutorial.Functions1> :t Functions1.not
Tutorial.Functions1.not : (Integer -> Bool) -> Integer -> Bool
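Several of the points summarized in this conclusion can be seen together in a small sketch. All names below (`surround`, `starred`, `hello`, `exclaim`, `greeting`) are hypothetical and were chosen only for this illustration:

```idris
-- A hypothetical two-argument function.
surround : String -> String -> String
surround s x = s ++ x ++ s

-- Partial application: `surround "*"` still expects the second argument.
starred : String -> String
starred = surround "*"

-- Operator sections: the position of the argument decides whether
-- it becomes the operator's first or second operand.
hello : String -> String
hello = ("Hello, " ++)

exclaim : String -> String
exclaim = (++ "!")

-- Function composition in point-free style: first `hello`, then `exclaim`.
greeting : String -> String
greeting = exclaim . hello
```

At the REPL, `greeting "Idris"` should evaluate to `"Hello, Idris!"`, and `starred "x"` to `"*x*"`.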
What's next
In the next section, we will learn how to define our own data types and how to construct and deconstruct values of these new types. We will also learn about generic types and functions.
Algebraic Data Types
In the previous chapter, we learned how to write our own functions and combine them to create more complex functionality. Of equal importance is the ability to define our own data types and use them as arguments and results in functions.
This is a lengthy chapter, densely packed with information. If you are new to Idris and functional programming, make sure to follow along slowly, experimenting with the examples, and possibly coming up with your own. Make sure to try and solve all exercises.
module Tutorial.DataTypes
Enumerations
module Tutorial.DataTypes.Enumerations
Let's start with a data type for the days of the week as an example.
public export
data Weekday = Monday
             | Tuesday
             | Wednesday
             | Thursday
             | Friday
             | Saturday
             | Sunday
The declaration above defines a new type (Weekday
) and several new values (Monday
to Sunday
) of the given type. Go ahead, and verify this at the REPL:
Tutorial.DataTypes> :t Monday
Tutorial.DataTypes.Monday : Weekday
Tutorial.DataTypes> :t Weekday
Tutorial.DataTypes.Weekday : Type
So, Monday
is of type Weekday
, while Weekday
itself is of type Type
.
It is important to note that a value of type Weekday
can only ever be one of the values listed above. It is a type error to use anything else where a Weekday
is expected.
Pattern Matching
In order to use our new data type as a function argument, we need to learn about an important concept in functional programming languages: Pattern matching. Let's implement a function which calculates the successor of a weekday:
total
next : Weekday -> Weekday
next Monday = Tuesday
next Tuesday = Wednesday
next Wednesday = Thursday
next Thursday = Friday
next Friday = Saturday
next Saturday = Sunday
next Sunday = Monday
In order to inspect a Weekday
argument, we match on the different possible values and return a result for each of them. This is a very powerful concept, as it allows us to match on and extract values from deeply nested data structures. The different cases in a pattern match are inspected from top to bottom, each being compared against the current function argument. Once a matching pattern is found, the computation on the right hand side of this pattern is evaluated. Later patterns are then ignored.
For instance, if we invoke next
with argument Thursday
, the first three patterns (Monday
, Tuesday
, and Wednesday
) will be checked against the argument, but they do not match. The fourth pattern is a match, and the result Friday
is returned. Later patterns are then ignored, even if they would also match the input (this becomes relevant with catch-all patterns, which we will talk about in a moment).
The function above is provably total. Idris knows about the possible values of type Weekday
, and can therefore figure out that our pattern match covers all possible cases. We can therefore annotate the function with the total
keyword, and Idris will answer with a type error if it can't verify the function's totality. (Go ahead, and try removing one of the clauses in next
to get an idea of what an error message from the coverage checker looks like.)
Please remember that these are very strong guarantees from the type checker: Given enough resources, a provably total function will always return a result of the given type in a finite amount of time (resources here meaning computational resources like memory or, in case of recursive functions, stack space).
Catch-all Patterns
Sometimes, it is convenient to only match on a subset of the possible values and collect the remaining possibilities in a catch-all clause:
export
total
isWeekend : Weekday -> Bool
isWeekend Saturday = True
isWeekend Sunday = True
isWeekend _ = False
The final line with the catch-all pattern is only invoked if the argument is not equal to Saturday
or Sunday
. Remember: Patterns in a pattern match are matched against the input from top to bottom, and the first match decides which path on the right hand side will be taken.
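To make the top-to-bottom matching order concrete, here is a small self-contained sketch. The function `mood` is hypothetical, and `Weekday` is repeated only so that the snippet type checks on its own:

```idris
data Weekday = Monday | Tuesday | Wednesday | Thursday | Friday | Saturday | Sunday

total
mood : Weekday -> String
mood Monday = "tired"    -- checked first
mood Friday = "excited"  -- checked only if the argument is not `Monday`
mood _      = "neutral"  -- reached only if none of the clauses above matched
```

Here, `mood Friday` skips the first clause and matches the second, while `mood Tuesday` falls through to the catch-all clause. If the catch-all clause were moved to the top, it would match every argument and the two explicit clauses could never be reached.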
We can use catch-all patterns to implement an equality test for Weekday
(we will not yet use the ==
operator for this; this will have to wait until we learn about interfaces):
total
eqWeekday : Weekday -> Weekday -> Bool
eqWeekday Monday Monday = True
eqWeekday Tuesday Tuesday = True
eqWeekday Wednesday Wednesday = True
eqWeekday Thursday Thursday = True
eqWeekday Friday Friday = True
eqWeekday Saturday Saturday = True
eqWeekday Sunday Sunday = True
eqWeekday _ _ = False
Enumeration Types in the Prelude
Data types like Weekday
consisting of a finite set of values are sometimes called enumerations. The Idris Prelude defines some common enumerations for us: for instance, Bool
and Ordering
. As with Weekday
, we can use pattern matching when implementing functions on these types:
-- this is how `not` is implemented in the *Prelude*
total
negate : Bool -> Bool
negate False = True
negate True = False
The Ordering
data type describes an ordering relation between two values. For instance:
total
compareBool : Bool -> Bool -> Ordering
compareBool False False = EQ
compareBool False True = LT
compareBool True True = EQ
compareBool True False = GT
Here, LT
means that the first argument is less than the second, EQ
means that the two arguments are equal and GT
means that the first argument is greater than the second.
Case Expressions
Sometimes we need to perform a computation with one of the arguments and want to pattern match on the result of this computation. We can use case expressions in this situation:
-- returns the larger of the two arguments
total
maxBits8 : Bits8 -> Bits8 -> Bits8
maxBits8 x y =
  case compare x y of
    LT => y
    _  => x
The first line of the case expression (case compare x y of
) will invoke function compare
with arguments x
and y
. On the following (indented) lines, we pattern match on the result of this computation. This is of type Ordering
, so we expect one of the three constructors LT
, EQ
, or GT
as the result. On the first line, we handle the LT
case explicitly, while the other two cases are handled with an underscore as a catch-all pattern.
Note that indentation matters here: The case block as a whole must be indented (if it starts on a new line), and the different cases must also be indented by the same amount of whitespace.
Function compare
is overloaded for many data types. We will learn how this works when we talk about interfaces.
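As a further sketch of a case expression, the hypothetical function `describeSign` below matches on all three `Ordering` constructors explicitly instead of using a catch-all pattern:

```idris
total
describeSign : Integer -> String
describeSign n =
  case compare n 0 of
    LT => "negative"
    EQ => "zero"
    GT => "positive"
```

Because all three constructors are listed, the coverage checker can still verify that the function is total, and it would complain if one of the cases were removed.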
If Then Else
When working with Bool
, there is an alternative to pattern matching common to most programming languages:
total
maxBits8' : Bits8 -> Bits8 -> Bits8
maxBits8' x y = if compare x y == LT then y else x
Note that the if then else
expression always returns a value and, therefore, the else
branch cannot be dropped. This is different from the behavior in typical imperative languages, where if
is a statement with possible side effects.
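The following sketch shows the same hypothetical function written twice, once with `if then else` and once with a case expression on the `Bool` result, to illustrate that the two forms are interchangeable (the names `capAt100` and `capAt100'` are assumptions made for this example):

```idris
total
capAt100 : Integer -> Integer
capAt100 n = if n > 100 then 100 else n

-- The same logic, written as a pattern match on the `Bool` result.
total
capAt100' : Integer -> Integer
capAt100' n = case n > 100 of
  True  => 100
  False => n
```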
Naming Conventions: Identifiers
While we are free to use lower-case and upper-case identifiers for function names, type- and data constructors must be given upper-case identifiers in order not to confuse Idris (operators are also fine). For instance, the following data definition is not valid, and Idris will complain that it expected upper-case identifiers:
data foo = bar | baz
The same goes for similar data definitions like records and sum types (both will be explained below):
-- not valid Idris
record Foo where
  constructor mkfoo
On the other hand, we typically use lower-case identifiers for function names, unless we plan to use them mostly during type checking (more on this later). This is not enforced by Idris, however, so if you are working in a domain where upper-case identifiers are preferable, feel free to use those:
foo : Bits32 -> Bits32
foo = (* 2)
Bar : Bits32 -> Bits32
Bar = foo
Exercises part 1
module Tutorial.DataTypes.Exercises1
- Use pattern matching to implement your own versions of boolean operators `(&&)` and `(||)`, calling them `and` and `or` respectively.

  Note: One way to go about this is to enumerate all four possible combinations of two boolean values and give the result for each. However, there is a shorter, more clever way, requiring only two pattern matches for each of the two functions.

- Define your own data type representing different units of time (seconds, minutes, hours, days, weeks), and implement the following functions for converting between time spans using different units. Hint: Use integer division (`div`) when going from seconds to some larger unit like hours.

      data UnitOfTime = Second -- add additional values

      -- calculate the number of seconds from a
      -- number of steps in the given unit of time
      total
      toSeconds : UnitOfTime -> Integer -> Integer

      -- Given a number of seconds, calculate the
      -- number of steps in the given unit of time
      total
      fromSeconds : UnitOfTime -> Integer -> Integer

      -- convert the number of steps in a given unit of time
      -- to the number of steps in another unit of time.
      -- use `fromSeconds` and `toSeconds` in your implementation
      total
      convert : UnitOfTime -> Integer -> UnitOfTime -> Integer

- Define a data type for representing a subset of the chemical elements: Hydrogen (H), Carbon (C), Nitrogen (N), Oxygen (O), and Fluorine (F).

  Declare and implement function `atomicMass`, which for each element returns its atomic mass in dalton:

      Hydrogen : 1.008
      Carbon   : 12.011
      Nitrogen : 14.007
      Oxygen   : 15.999
      Fluorine : 18.9984
Sum Types
module Tutorial.DataTypes.SumTypes
Assume we'd like to write some web form, where users of our web application can decide how they like to be addressed. We give them a choice between two common predefined forms of address (Mr and Mrs), but also allow them to decide on a customized form. The possible choices can be encapsulated in an Idris data type:
public export
data Title = Mr | Mrs | Other String
This looks almost like an enumeration type, with the exception that there is a new thing, called a data constructor, which accepts a String
argument (actually, the values in an enumeration are also called (nullary) data constructors). If we inspect the types at the REPL, we learn the following:
Tutorial.DataTypes> :t Mr
Tutorial.DataTypes.Mr : Title
Tutorial.DataTypes> :t Other
Tutorial.DataTypes.Other : String -> Title
So, Other
is a function from String
to Title
. This means that we can pass Other
a String
argument and get a Title
as the result:
public export
total
dr : Title
dr = Other "Dr."
Again, a value of type Title
can only consist of one of the three choices listed above, and again, we can use pattern matching to implement functions on the Title
data type in a provably total way:
export
total
showTitle : Title -> String
showTitle Mr = "Mr."
showTitle Mrs = "Mrs."
showTitle (Other x) = x
Note how in the last pattern match, the string value stored in the Other
data constructor is bound to local variable x
. Also, the Other x
pattern has to be wrapped in parentheses, as otherwise Idris would think Other
and x
were two distinct function arguments.
This is a very common way to extract the values from data constructors. We can use showTitle
to implement a function for creating a courteous greeting:
export
total
greet : Title -> String -> String
greet t name = "Hello, " ++ showTitle t ++ " " ++ name ++ "!"
In the implementation of greet
, we use string literals and the string concatenation operator (++)
to assemble the greeting from its parts.
At the REPL:
Tutorial.DataTypes> greet dr "Höck"
"Hello, Dr. Höck!"
Tutorial.DataTypes> greet Mrs "Smith"
"Hello, Mrs. Smith!"
Data types like Title
are called sum types as they consist of the sum of their different parts: A value of type Title
is either a Mr
, a Mrs
, or a String
wrapped up in Other
.
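As an additional sketch of the same idea, here is a hypothetical sum type whose data constructors carry one or two arguments. The names `Shape`, `Square`, `Rectangle`, and `area` are assumptions made for this illustration:

```idris
-- A shape is either a square (side length) or a rectangle (width and height).
data Shape = Square Integer | Rectangle Integer Integer

total
area : Shape -> Integer
area (Square s)      = s * s  -- `s` is bound to the stored side length
area (Rectangle w h) = w * h  -- `w` and `h` are bound to the two stored values
```

As with `Title`, each constructor pattern has to be wrapped in parentheses, and the values stored in the constructor are bound to local variables.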
Here's another (drastically simplified) example of a sum type. Assume we allow two forms of authentication in our web application: Either by entering a username plus a password (for which we'll use an unsigned 64 bit integer here), or by providing username plus a (very complex) secret key. Here's a data type to encapsulate this use case:
data Credentials = Password String Bits64 | Key String String
As an example of a very primitive login function, we can hard-code some known credentials:
total
login : Credentials -> String
login (Password "Anderson" 6665443) = greet Mr "Anderson"
login (Key "Y" "xyz") = greet (Other "Agent") "Y"
login _ = "Access denied!"
As can be seen in the example above, we can also pattern match against primitive values by using integer and string literals. Give login
a go at the REPL:
Tutorial.DataTypes> login (Password "Anderson" 6665443)
"Hello, Mr. Anderson!"
Tutorial.DataTypes> login (Key "Y" "xyz")
"Hello, Agent Y!"
Tutorial.DataTypes> login (Key "Y" "foo")
"Access denied!"
Exercises part 2
module Tutorial.DataTypes.Exercises2
- Implement an equality test for `Title` (you can use the equality operator `(==)` for comparing two `String`s):

      total
      eqTitle : Title -> Title -> Bool

- For `Title`, implement a simple test to check whether a custom title is being used:

      total
      isOther : Title -> Bool

- Given our simple `Credentials` type, there are three ways for authentication to fail:

  - An unknown username was used.
  - The password given does not match the one associated with the username.
  - An invalid key was used.

  Encapsulate these three possibilities in a sum type called `LoginError`, but make sure not to disclose any confidential information: An invalid username should be stored in the corresponding error value, but an invalid password or key should not.

- Implement function `showError : LoginError -> String`, which can be used to display an error message to the user who unsuccessfully tried to log into our web application.
Records
module Tutorial.DataTypes.Records
import Tutorial.DataTypes.Enumerations
import Tutorial.DataTypes.SumTypes
It is often useful to group together several values as a logical unit. For instance, in our web application we might want to group information about a user in a single data type. Such data types are often called product types (see below for an explanation). The most common and convenient way to define them is the record
construct:
record User where
  constructor MkUser
  name  : String
  title : Title
  age   : Bits8
The declaration above creates a new type called User
, and a new data constructor called MkUser
. As usual, have a look at their types in the REPL:
Tutorial.DataTypes> :t User
Tutorial.DataTypes.User : Type
Tutorial.DataTypes> :t MkUser
Tutorial.DataTypes.MkUser : String -> Title -> Bits8 -> User
We can use MkUser
(which is a function from String
to Title
to Bits8
to User
) to create values of type User
:
total
agentY : User
agentY = MkUser "Y" (Other "Agent") 51
total
drNo : User
drNo = MkUser "No" dr 73
We can also use pattern matching to extract the fields from a User
value (they can again be bound to local variables):
total
greetUser : User -> String
greetUser (MkUser n t _) = greet t n
In the example above, the name
and title
field are bound to two new local variables (n
and t
respectively), which can then be used on the right hand side of greetUser
's implementation. For the age
field, which is not used on the right hand side, we can use an underscore as a catch-all pattern.
Note how Idris will prevent us from making a common mistake: If we confuse the order of arguments, the implementation will no longer type check. We can verify this by putting the erroneous code in a failing
block: This is an indented code block, which will lead to an error during elaboration (type checking). We can give part of the expected error message as an optional string argument to a failing block. If this does not match part of the error message (or the whole code block does not fail to type check) the failing
block itself fails to type check. This is a useful tool to demonstrate that type safety works in two directions: We can show that valid code type checks but also that invalid code is rejected by the Idris elaborator:
failing "Mismatch between: String and Title"
  greetUser' : User -> String
  greetUser' (MkUser n t _) = greet n t
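Here is a second, self-contained sketch of a `failing` block, this time without the optional error message. The function names are hypothetical:

```idris
-- Valid code: `not` expects a `Bool`.
total
flipBool : Bool -> Bool
flipBool b = not b

-- Invalid code: `not` cannot be applied to a `String`, so the
-- definition is rejected and the `failing` block is accepted.
failing
  flipString : String -> Bool
  flipString s = not s
```

If the code inside the block were changed so that it type checks, the `failing` block itself would be reported as an error.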
In addition, for every record field, Idris creates an extractor function of the same name. This can either be used as a regular function, or it can be used in postfix notation by appending it to a variable of the record type separated by a dot. Here are two examples for extracting the age from a user:
getAgeFunction : User -> Bits8
getAgeFunction u = age u
getAgePostfix : User -> Bits8
getAgePostfix u = u.age
Syntactic Sugar for Records
As was already mentioned in the introduction, Idris is a pure functional programming language. In pure functions, we are not allowed to modify global mutable state. As such, if we want to modify a record value, we will always create a new value with the original value remaining unchanged: Records and other Idris values are immutable. While this can have a slight impact on performance, it has the benefit that we can freely pass a record value to different functions, without fear of the functions modifying the value by in-place mutation. These are, again, very strong guarantees, which makes it drastically easier to reason about our code.
There are several ways to modify a record, the most general being to pattern match on the record and adjust each field as desired. If, for instance, we'd like to increase the age of a User
by one, we could do the following:
total
incAge : User -> User
incAge (MkUser name title age) = MkUser name title (age + 1)
That's a lot of code for such a simple thing, so Idris offers several syntactic conveniences for this. For instance, using record syntax, we can just access and update the age
field of a value:
total
incAge2 : User -> User
incAge2 u = { age := u.age + 1 } u
Assignment operator :=
assigns a new value to the age
field in u
. Remember that this will create a new User
value. The original value u
remains unaffected by this.
We can access a record field, either by using the field name as a projection function (age u
; also have a look at :t age
in the REPL), or by using dot syntax: u.age
. This is special syntax and not related to the dot operator for function composition ((.)
).
The use case of modifying a record field is so common that Idris provides special syntax for this as well:
total
incAge3 : User -> User
incAge3 u = { age $= (+ 1) } u
Here, I used an operator section ((+ 1)
) to make the code more concise. As an alternative to an operator section, we could have used an anonymous function like so:
total
incAge4 : User -> User
incAge4 u = { age $= \x => x + 1 } u
Finally, since our function's argument u
is only used once at the very end, we can drop it altogether, to get the following, highly concise version:
total
incAge5 : User -> User
incAge5 = { age $= (+ 1) }
As usual, we should have a look at the result at the REPL:
Tutorial.DataTypes> incAge5 drNo
MkUser "No" (Other "Dr.") 74
It is possible to use this syntax to set and/or update several record fields at once:
total
drNoJunior : User
drNoJunior = { name $= (++ " Jr."), title := Mr, age := 17 } drNo
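The update syntax is not specific to `User`. The sketch below applies it to a hypothetical record `Point` (the record and all function names are assumptions made for this example):

```idris
record Point where
  constructor MkPoint
  xCoord : Integer
  yCoord : Integer

total
origin : Point
origin = MkPoint 0 0

-- `$=` applies a function to the current value of a field.
total
moveRight : Point -> Point
moveRight = { xCoord $= (+ 1) }

-- `:=` replaces the value of a field.
total
resetY : Point -> Point
resetY p = { yCoord := 0 } p

-- Dot syntax reads the fields of a record value.
total
sumCoords : Point -> Integer
sumCoords p = p.xCoord + p.yCoord
```

Note that `moveRight origin` returns a new `Point`, while `origin` itself still holds `MkPoint 0 0`.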
Tuples
I wrote above that a record is also called a product type. This is quite obvious when we consider the number of possible values inhabiting a given type. For instance, consider the following custom record:
record Foo where
  constructor MkFoo
  wd   : Weekday
  bool : Bool
How many possible values of type Foo
are there? The answer is 7 * 2 = 14
, as we can pair every possible Weekday
(seven in total) with every possible Bool
(two in total). So, the number of possible values of a record type is the product of the number of possible values for each field.
The canonical product type is the Pair
, which is available from the Prelude:
total
weekdayAndBool : Weekday -> Bool -> Pair Weekday Bool
weekdayAndBool wd b = MkPair wd b
Since it is quite common to return several values from a function wrapped in a Pair
or larger tuple, Idris provides some syntactic sugar for working with these. Instead of Pair Weekday Bool
, we can just write (Weekday, Bool)
. Likewise, instead of MkPair wd b
, we can just write (wd, b)
(the space is optional):
total
weekdayAndBool2 : Weekday -> Bool -> (Weekday, Bool)
weekdayAndBool2 wd b = (wd, b)
This works also for nested tuples:
total
triple : Pair Bool (Pair Weekday String)
triple = MkPair False (Friday, "foo")
total
triple2 : (Bool, Weekday, String)
triple2 = (False, Friday, "foo")
In the example above, triple2
is converted to the form used in triple
by the Idris compiler.
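One way to convince ourselves that both notations describe the same type is to assign a value written in one form to a name declared with the other. The names `flat` and `nested` below are hypothetical:

```idris
total
flat : (Bool, Nat, String)
flat = (True, 3, "foo")

-- This type checks without any conversion, because both
-- signatures describe the same type.
total
nested : Pair Bool (Pair Nat String)
nested = flat
```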
We can even use tuple syntax in pattern matches:
total
bar : Bool
bar = case triple of
  (b,wd,_) => b && isWeekend wd
As Patterns
Sometimes, we'd like to take apart a value by pattern matching on it but still retain the value as a whole for using it in further computations:
total
baz : (Bool,Weekday,String) -> (Nat,Bool,Weekday,String)
baz t@(_,_,s) = (length s, t)
In baz
, variable t
is bound to the triple as a whole, which is then reused to construct the resulting quadruple. Remember that (Nat,Bool,Weekday,String)
is just sugar for Pair Nat (Bool,Weekday,String)
, and (length s, t)
is just sugar for MkPair (length s) t
. Hence, the implementation above is correct as is confirmed by the type checker.
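As patterns are not limited to tuples. In the following sketch, a hypothetical function `orDefault` inspects the string stored in a data constructor but returns the original value unchanged in most cases (`Title` is repeated here only to keep the snippet self-contained):

```idris
data Title = Mr | Mrs | Other String

-- Replaces an empty custom title with a placeholder and
-- returns every other title as it is.
total
orDefault : Title -> Title
orDefault t@(Other s) = if s == "" then Other "n/a" else t
orDefault t           = t
```

Without the as pattern, the first clause would have to rebuild the value as `Other s` in the `else` branch.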
Exercises part 3
- Define a record type for time spans by pairing a `UnitOfTime` with an integer representing the duration of the time span in the given unit of time. Also define a function for converting a time span to an `Integer` representing the duration in seconds.
- Implement an equality check for time spans: Two time spans should be considered equal, if and only if they correspond to the same number of seconds.
- Implement a function for pretty printing time spans: The resulting string should display the time span in its given unit, plus show the number of seconds in parentheses, if the unit is not already seconds.
- Implement a function for adding two time spans. If the two time spans use different units of time, use the smaller unit of time to ensure a lossless conversion.
Generic Data Types
module Tutorial.DataTypes.GenericDataTypes
import Tutorial.DataTypes.Enumerations
Sometimes, a concept is general enough that we'd like to apply it not only to a single type, but to all kinds of types. For instance, we might not want to define data types for lists of integers, lists of strings, and lists of booleans, as this would lead to a lot of code duplication. Instead, we'd like to have a single generic list type parameterized by the type of values it stores. This section explains how to define and use generic types.
Maybe
Consider the case of parsing a Weekday
from user input. Surely, such a function should return Saturday
, if the string input was "Saturday"
, but what if the input was "sdfkl332"
? We have several options here. For instance, we could just return a default result (Sunday
perhaps?). But is this the behavior programmers expect when using our library? Maybe not. To silently continue with a default value in the face of invalid user input is hardly ever the best choice and may lead to a lot of confusion.
In an imperative language, our function would probably throw an exception. We could do this in Idris as well (there is function idris_crash
in the Prelude for this), but doing so, we would abandon totality! A high price to pay for such a common thing as a parsing error.
In languages like Java, our function might also return some kind of null
value (leading to the dreaded NullPointerException
s if not handled properly in client code). Our solution will be similar, but instead of silently returning null
, we will make the possibility of failure visible in the types! We define a custom data type, which encapsulates the possibility of failure. Defining new data types in Idris is very cheap (in terms of the amount of code needed), therefore this is often the way to go in order to increase type safety. Here's an example how to do this:
data MaybeWeekday = WD Weekday | NoWeekday
total
readWeekday : String -> MaybeWeekday
readWeekday "Monday" = WD Monday
readWeekday "Tuesday" = WD Tuesday
readWeekday "Wednesday" = WD Wednesday
readWeekday "Thursday" = WD Thursday
readWeekday "Friday" = WD Friday
readWeekday "Saturday" = WD Saturday
readWeekday "Sunday" = WD Sunday
readWeekday _ = NoWeekday
But assume now, we'd also like to read Bool
values from user input. We'd now have to write a custom data type MaybeBool
and so on for all types we'd like to read from String
, and the conversion of which might fail.
Idris, like many other programming languages, allows us to generalize this behavior by using generic data types. Here's an example:
data Option a = Some a | None
total
readBool : String -> Option Bool
readBool "True" = Some True
readBool "False" = Some False
readBool _ = None
It is important to go to the REPL and look at the types:
Tutorial.DataTypes> :t Some
Tutorial.DataTypes.Some : a -> Option a
Tutorial.DataTypes> :t None
Tutorial.DataTypes.None : Option a
Tutorial.DataTypes> :t Option
Tutorial.DataTypes.Option : Type -> Type
We need to introduce some jargon here. Option
is what we call a type constructor. It is not yet a saturated type: It is a function from Type
to Type
. However, Option Bool
is a type, as is Option Weekday
. Even Option (Option Bool)
is a valid type. Option
is a type constructor parameterized over a parameter of type Type
. Some
and None
are Option
's data constructors: the functions used to create values of type Option a
for a type a
.
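The following sketch shows the type constructor applied to different type arguments. `Option` is repeated to keep the snippet self-contained, and the value names are hypothetical:

```idris
data Option a = Some a | None

-- `Option` applied to `Bool` is a type with values `Some True`,
-- `Some False`, and `None`.
someBool : Option Bool
someBool = Some True

-- `None` can be used at any type `Option a`.
noString : Option String
noString = None

-- Type constructors can be nested: here, the outer `Some`
-- wraps the inner value `None : Option Bool`.
nestedOption : Option (Option Bool)
nestedOption = Some None
```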
Let's see some other use cases for Option
. Below is a safe division operation:
total
safeDiv : Integer -> Integer -> Option Integer
safeDiv n 0 = None
safeDiv n k = Some (n `div` k)
The possibility of returning some kind of null value in the face of invalid input is so common, that there is a data type like Option
already in the Prelude: Maybe
, with data constructors Just
and Nothing
.
It is important to understand the difference between returning Maybe Integer
in a function, which might fail, and returning null
in languages like Java: In the former case, the possibility of failure is visible in the types. The type checker will force us to treat Maybe Integer
differently than Integer
: Idris will not allow us to forget to eventually handle the failure case. Not so, if null
is silently returned without adjusting the types. Programmers may (and often will) forget to handle the null
case, leading to unexpected and sometimes hard to debug runtime exceptions.
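A short sketch of what this means in practice, using `Maybe` from the Prelude. The function names are hypothetical, and the `failing` block demonstrates that a `Maybe Integer` cannot be used as if it were an `Integer`:

```idris
total
safeDiv : Integer -> Integer -> Maybe Integer
safeDiv _ 0 = Nothing
safeDiv n k = Just (n `div` k)

-- To get at the result, we have to pattern match and
-- thus decide what to do in the `Nothing` case.
total
describe : Maybe Integer -> String
describe Nothing  = "undefined"
describe (Just n) = "result: " ++ show n

-- Treating the result as a plain `Integer` is rejected.
failing
  wrong : Integer
  wrong = safeDiv 10 2 + 1
```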
Either
While Maybe
is very useful to quickly provide a default value to signal some kind of failure, this value (Nothing
) is not very informative. It will not tell us what exactly went wrong. For instance, in case of our Weekday
reading function, it might be interesting later on to know the value of the invalid input string. And just like with Maybe
and Option
above, this concept is general enough that we might encounter other types of invalid values. Here's a data type to encapsulate this:
data Validated e a = Invalid e | Valid a
Validated
is a type constructor parameterized over two type parameters e
and a
. Its data constructors are Invalid
and Valid
, the former holding a value describing some error condition, the latter the result in case of a successful computation. Let's see this in action:
total
readWeekdayV : String -> Validated String Weekday
readWeekdayV "Monday" = Valid Monday
readWeekdayV "Tuesday" = Valid Tuesday
readWeekdayV "Wednesday" = Valid Wednesday
readWeekdayV "Thursday" = Valid Thursday
readWeekdayV "Friday" = Valid Friday
readWeekdayV "Saturday" = Valid Saturday
readWeekdayV "Sunday" = Valid Sunday
readWeekdayV s = Invalid ("Not a weekday: " ++ s)
Again, this is such a general concept that a data type similar to Validated
is already available from the Prelude: Either
with data constructors Left
and Right
. It is very common for functions to encapsulate the possibility of failure by returning an Either err val
, where err
is the error type and val
is the desired return type. This is the type safe (and total!) alternative to throwing a catchable exception in an imperative language.
Note, however, that the semantics of Either
are not always "Left
is an error and Right
a success". A function returning an Either
just means that it can have two different types of results, each of which is tagged with the corresponding data constructor.
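For the common error-handling use case, a minimal sketch with `Either` from the Prelude could look as follows. The functions `readBoolE` and `report` are hypothetical and follow the convention of putting the error in `Left` and the result in `Right`:

```idris
total
readBoolE : String -> Either String Bool
readBoolE "True"  = Right True
readBoolE "False" = Right False
readBoolE s       = Left ("Not a Bool: " ++ s)

-- Client code has to handle both constructors.
total
report : Either String Bool -> String
report (Left err) = "Error: " ++ err
report (Right b)  = if b then "yes" else "no"
```

At the REPL, `report (readBoolE "maybe")` should yield `"Error: Not a Bool: maybe"`.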
List
One of the most important data structures in pure functional programming is the singly linked list. Here is its definition (called Seq
in order for it not to collide with List
, which is of course already available from the Prelude):
data Seq a = Nil | (::) a (Seq a)
This calls for some explanations. Seq
consists of two data constructors: Nil
(representing an empty sequence of values) and (::)
(also called the cons operator), which prepends a new value of type a
to an already existing list of values of the same type. As you can see, we can also use operators as data constructors, but please do not overuse this. Use clear names for your functions and data constructors and only introduce new operators when it truly helps readability!
Here is an example of how to use the List
constructors (I use List
here, as this is what you should use in your own code):
total
ints : List Int64
ints = 1 :: 2 :: -3 :: Nil
However, there is a more concise way of writing the above. Idris accepts special syntax for constructing data types consisting exactly of the two constructors Nil
and (::)
:
total
ints2 : List Int64
ints2 = [1, 2, -3]
total
ints3 : List Int64
ints3 = []
The two definitions ints
and ints2
are treated identically by the compiler. Note that list syntax can also be used in pattern matches.
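To illustrate list syntax in pattern matches, here is a small sketch. The function describe is a hypothetical example and not part of the tutorial's code.

```idris
-- List syntax on the left-hand side: `[]` matches `Nil`,
-- `[_]` matches a list with exactly one element, and so on.
total
describe : List Int64 -> String
describe []     = "empty"
describe [_]    = "one element"
describe [_, _] = "two elements"
describe _      = "three or more elements"
```

The pattern [_, _] is just a shorter way of writing _ :: _ :: Nil.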
There is another thing that's special about Seq
and List
: Each of them is defined in terms of itself (the cons operator accepts a value and another Seq
as arguments). We call such data types recursive data types, and their recursive nature means that in order to decompose or consume them, we typically require recursive functions. In an imperative language, we might use a for loop or similar construct to iterate over the values of a List
or a Seq
, but these things do not exist in a language without in-place mutation. Here's how to sum a list of integers:
total
intSum : List Integer -> Integer
intSum Nil = 0
intSum (n :: ns) = n + intSum ns
Recursive functions can be hard to grasp at first, so I'll break this down a bit. If we invoke intSum
with the empty list, the first pattern matches and the function returns zero immediately. If, however, we invoke intSum
with a non-empty list - [7,5,9]
for instance - the following happens:
-
The second pattern matches and splits the list into two parts: Its head (7) is bound to variable n and its tail ([5,9]) is bound to ns: 7 + intSum [5,9]
-
In a second invocation, intSum is called with a new list: [5,9]. The second pattern matches and n is bound to 5 and ns is bound to [9]: 7 + (5 + intSum [9])
-
In a third invocation, intSum is called with list [9]. The second pattern matches and n is bound to 9 and ns is bound to []: 7 + (5 + (9 + intSum []))
-
In a fourth invocation, intSum is called with list [] and returns 0 immediately: 7 + (5 + (9 + 0))
-
In the third invocation, 9 and 0 are added and 9 is returned: 7 + (5 + 9)
-
In the second invocation, 5 and 9 are added and 14 is returned: 7 + 14
-
Finally, our initial invocation of intSum adds 7 and 14 and returns 21.
Thus, the recursive implementation of intSum
leads to a sequence of nested calls to intSum
, which terminates once the argument is the empty list.
Generic Functions
In order to fully appreciate the versatility that comes with generic data types, we also need to talk about generic functions. Like generic types, these are parameterized over one or more type parameters.
Consider for instance the case of breaking out of the Option
data type. In case of a Some
, we'd like to return the stored value, while for the None
case we provide a default value. Here's how to do this, specialized to Integer
s:
total
integerFromOption : Integer -> Option Integer -> Integer
integerFromOption _ (Some y) = y
integerFromOption x None = x
It's pretty obvious that this, again, is not general enough. Surely, we'd also like to break out of Option Bool
or Option String
in a similar fashion. That's exactly what the generic function fromOption
does:
total
fromOption : a -> Option a -> a
fromOption _ (Some y) = y
fromOption x None = x
The lower-case a
is again a type parameter. You can read the type signature as follows: "For any type a
, given a value of type a
, and an Option a
, we can return a value of type a
." Note that fromOption
knows nothing else about a
, other than it being a type. It is therefore not possible to conjure a value of type a
out of thin air. We must have a value available to deal with the None
case.
The counterpart to fromOption
for Maybe
is called fromMaybe
and is available from module Data.Maybe
from the base library.
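A short usage sketch of fromMaybe follows. The function ageOrZero is a made-up example, and the import line has to go at the top of the module that uses it.

```idris
import Data.Maybe

-- `fromMaybe` takes the default value first, then the `Maybe` to unwrap.
total
ageOrZero : Maybe Integer -> Integer
ageOrZero = fromMaybe 0
```

Just like fromOption, this needs a default value for the Nothing case, since there is no other way to come up with an Integer.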
Sometimes, fromOption
is not general enough. Assume we'd like to print the value of a freshly parsed Bool
, giving some generic error message in case of a None
. We can't use fromOption
for this, as we have an Option Bool
and we'd like to return a String
. Here's how to do this:
total
option : b -> (a -> b) -> Option a -> b
option _ f (Some y) = f y
option x _ None = x
total
handleBool : Option Bool -> String
handleBool = option "Not a boolean value." show
Function option
is parameterized over two type parameters: a
represents the type of values stored in the Option
, while b
is the return type. In case of a Some
, we need a way to convert the stored a
to a b
, and that's done using the function argument of type a -> b
.
In Idris, lower-case identifiers in function types are treated as type parameters, while upper-case identifiers are treated as types or type constructors that must be in scope.
Exercises part 4
If this is your first time programming in a purely functional language, the exercises below are very important. Do not skip any of them! Take your time and work through them all. In most cases, the types should be enough to explain what's going on, even though they might appear cryptic in the beginning. Otherwise, have a look at the comments (if any) of each exercise.
Remember that lower-case identifiers in a function signature are treated as type parameters.
-
Implement the following generic functions for
Maybe
:
-- make sure to map a `Just` to a `Just`.
total
mapMaybe : (a -> b) -> Maybe a -> Maybe b

-- Example: `appMaybe (Just (+2)) (Just 20) = Just 22`
total
appMaybe : Maybe (a -> b) -> Maybe a -> Maybe b

-- Example: `bindMaybe (Just 12) Just = Just 12`
total
bindMaybe : Maybe a -> (a -> Maybe b) -> Maybe b

-- keep the value in a `Just` only if the given predicate holds
total
filterMaybe : (a -> Bool) -> Maybe a -> Maybe a

-- keep the first value that is not a `Nothing` (if any)
total
first : Maybe a -> Maybe a -> Maybe a

-- keep the last value that is not a `Nothing` (if any)
total
last : Maybe a -> Maybe a -> Maybe a

-- this is another general way to extract a value from a `Maybe`.
-- Make sure the following holds:
-- `foldMaybe (+) 5 Nothing = 5`
-- `foldMaybe (+) 5 (Just 12) = 17`
total
foldMaybe : (acc -> el -> acc) -> acc -> Maybe el -> acc
-
Implement the following generic functions for
Either
:
total
mapEither : (a -> b) -> Either e a -> Either e b

-- In case of both `Either`s being `Left`s, keep the
-- value stored in the first `Left`.
total
appEither : Either e (a -> b) -> Either e a -> Either e b

total
bindEither : Either e a -> (a -> Either e b) -> Either e b

-- Keep the first value that is not a `Left`
-- If both `Either`s are `Left`s, use the given accumulator
-- for the error values
total
firstEither : (e -> e -> e) -> Either e a -> Either e a -> Either e a

-- Keep the last value that is not a `Left`
-- If both `Either`s are `Left`s, use the given accumulator
-- for the error values
total
lastEither : (e -> e -> e) -> Either e a -> Either e a -> Either e a

total
fromEither : (e -> c) -> (a -> c) -> Either e a -> c
-
Implement the following generic functions for
List
:
total
mapList : (a -> b) -> List a -> List b

total
filterList : (a -> Bool) -> List a -> List a

-- re-implement list concatenation (++) such that e.g. (++) [1, 2] [3, 4] = [1, 2, 3, 4]
-- note that because this function conflicts with the standard
-- Prelude.List.(++), if you use it then you will need to prefix it with
-- the name of your module, like DataTypes.(++) or Ch3.(++). alternatively
-- you could simply call the function something unique like myListConcat or concat'
total
(++) : List a -> List a -> List a

-- return the first value of a list, if it is non-empty
total
headMaybe : List a -> Maybe a

-- return everything but the first value of a list, if it is non-empty
total
tailMaybe : List a -> Maybe (List a)

-- return the last value of a list, if it is non-empty
total
lastMaybe : List a -> Maybe a

-- return everything but the last value of a list,
-- if it is non-empty
total
initMaybe : List a -> Maybe (List a)

-- accumulate the values in a list using the given
-- accumulator function and initial value
--
-- Examples:
-- `foldList (+) 10 [1,2,7] = 20`
-- `foldList String.(++) "" ["Hello","World"] = "HelloWorld"`
-- `foldList last Nothing (mapList Just [1,2,3]) = Just 3`
total
foldList : (acc -> el -> acc) -> acc -> List el -> acc
-
Assume we store user data for our web application in the following record:
record Client where
  constructor MkClient
  name          : String
  title         : Title
  age           : Bits8
  passwordOrKey : Either Bits64 String
Using LoginError from an earlier exercise, implement function login, which, given a list of Clients plus a value of type Credentials, will return either a LoginError in case no valid credentials were provided, or the first Client for whom the credentials match.
-
Using your data type for chemical elements from an earlier exercise, implement a function for calculating the molar mass of a molecular formula.
Use a list of elements each paired with its count (a natural number) for representing formulae. For instance:
ethanol : List (Element,Nat)
ethanol = [(C,2),(H,6),(O,1)]
Hint: You can use function cast to convert a natural number to a Double.
Alternative Syntax for Data Definitions
module Tutorial.DataTypes.AltSyntax
While the examples in the section about parameterized data types are short and concise, there is a slightly more verbose but much more general form for writing such definitions, which makes it much clearer what's going on. In my opinion, this more general form should be preferred in all but the most simple data definitions.
Here are the definitions of Option
, Validated
, and Seq
again, using this more general form (I put them in their own namespace, so Idris will not complain about identical names in the same source file):
-- GADT is an acronym for "generalized algebraic data type"
namespace GADT
  data Option : Type -> Type where
    Some : a -> Option a
    None : Option a

  data Validated : Type -> Type -> Type where
    Invalid : e -> Validated e a
    Valid : a -> Validated e a

  data Seq : Type -> Type where
    Nil : Seq a
    (::) : a -> GADT.Seq a -> Seq a
Here, Option
is clearly declared as a type constructor (a function of type Type -> Type
), while Some
is a generic function of type a -> Option a
(where a
is a type parameter) and None
is a nullary generic function of type Option a
(a
again being a type parameter). Likewise for Validated
and Seq
. Note that in case of Seq
we had to disambiguate between the different Seq
definitions in the recursive case. Since we will usually not define several data types with the same name in a source file, this is not necessary most of the time.
Conclusion
We covered a lot of ground in this chapter, so I'll summarize the most important points below:
-
Enumerations are data types consisting of a finite number of possible values.
-
Sum types are data types with more than one data constructor, where each constructor describes a choice that can be made.
-
Product types are data types with a single constructor used to group several values of possibly different types.
-
We use pattern matching to deconstruct immutable values in Idris. The possible patterns correspond to a data type's data constructors.
-
We can bind variables to values in a pattern or use an underscore as a placeholder for a value that's not needed on the right hand side of an implementation.
-
We can pattern match on an intermediary result by introducing a case block.
-
The preferred way to define new product types is to define them as records, since these come with additional syntactic conveniences for setting and modifying individual record fields.
-
Generic types and functions allow us to generalize certain concepts and make them available for many types by using type parameters instead of concrete types in function and type signatures.
-
Common concepts like optional values (
Maybe
), computations that might fail with some error condition (Either
), and handling collections of values of the same type at once (List
) are example use cases of generic types and functions already provided by the Prelude.
What's next
In the next section, we will introduce interfaces, another approach to function overloading.
Interfaces
Function overloading - the definition of functions with the same name but different implementations - is a concept found in many programming languages. Idris natively supports overloading of functions: Two functions with the same name can be defined in different modules or namespaces, and Idris will try to disambiguate between these based on the types involved. Here is an example:
module Tutorial.Interfaces
%default total
namespace Bool
  export
  size : Bool -> Integer
  size True = 1
  size False = 0

namespace Integer
  export
  size : Integer -> Integer
  size = id

namespace List
  export
  size : List a -> Integer
  size = cast . length
Here, we defined three different functions called size
, each in its own namespace. We can disambiguate between these by prefixing them with their namespace:
Tutorial.Interfaces> :t Bool.size
Tutorial.Interfaces.Bool.size : Bool -> Integer
However, this is usually not necessary:
mean : List Integer -> Integer
mean xs = sum xs `div` size xs
As you can see, Idris can disambiguate between the different size
functions, since xs
is of type List Integer
, which unifies only with List a
, the argument type of List.size
.
Interface Basics
module Tutorial.Interfaces.Basics
While function overloading as described above works well, there are use cases where this form of overloaded functions leads to a lot of code duplication.
As an example, consider a function cmp
(short for compare, which is already exported by the Prelude), for describing an ordering for the values of type String
:
cmp : String -> String -> Ordering
We'd also like to have similar functions for many other data types. Function overloading allows us to do just that, but cmp
is not an isolated piece of functionality. From it, we can derive functions like greaterThan'
, lessThan'
, minimum'
, maximum'
, and many others:
lessThan' : String -> String -> Bool
lessThan' s1 s2 = LT == cmp s1 s2
greaterThan' : String -> String -> Bool
greaterThan' s1 s2 = GT == cmp s1 s2
minimum' : String -> String -> String
minimum' s1 s2 =
  case cmp s1 s2 of
    LT => s1
    _ => s2
maximum' : String -> String -> String
maximum' s1 s2 =
  case cmp s1 s2 of
    GT => s1
    _ => s2
We'd need to implement all of these again for the other types with a cmp
function, and most if not all of these implementations would be identical to the ones written above. That's a lot of code repetition.
One way to solve this is to use higher-order functions. For instance, we could define function minimumBy
, which takes a comparison function as its first argument and returns the smaller of the two remaining arguments:
minimumBy : (a -> a -> Ordering) -> a -> a -> a
minimumBy f a1 a2 =
  case f a1 a2 of
    LT => a1
    _ => a2
This solution is another demonstration of how higher-order functions allow us to reduce code duplication. However, the need to explicitly pass around the comparison function all the time can get tedious as well. It would be nice if we could teach Idris to come up with such a function on its own.
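To see what passing the comparison function explicitly looks like at a call site, consider the following sketch. The definition of minimumBy is repeated so the snippet stands on its own, while smaller and larger are made-up names for this illustration.

```idris
total
minimumBy : (a -> a -> Ordering) -> a -> a -> a
minimumBy f a1 a2 =
  case f a1 a2 of
    LT => a1
    _  => a2

-- Passing `compare` picks the smaller of the two numbers: 7.
total
smaller : Integer
smaller = minimumBy compare 12 7

-- Flipping the comparison makes the same function pick the larger one: 12.
total
larger : Integer
larger = minimumBy (flip compare) 12 7
```

The same function can serve many orderings, but every call has to spell out which comparison to use.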
Interfaces solve exactly this issue. Here's an example:
public export
interface Comp a where
  comp : a -> a -> Ordering

export
implementation Comp Bits8 where
  comp = compare

export
implementation Comp Bits16 where
  comp = compare
The code above defines interface Comp
providing function comp
for calculating the ordering for two values of a type a
, followed by two implementations of this interface for types Bits8
and Bits16
. Note that the implementation
keyword is optional.
The comp
implementations for Bits8
and Bits16
both use function compare
, which is part of a similar interface from the Prelude called Ord
.
The next step is to look at the type of comp
at the REPL:
Tutorial.Interfaces> :t comp
Tutorial.Interfaces.comp : Comp a => a -> a -> Ordering
The interesting part in the type signature of comp
is the initial Comp a =>
argument. Here, Comp
is a constraint on type parameter a
. This signature can be read as: "For any type a
, given an implementation of interface Comp
for a
, we can compare two values of type a
and return an Ordering
for these." Whenever we invoke comp
, we expect Idris to come up with a value of type Comp a
on its own, hence the new =>
arrow. If Idris fails to do so, it will answer with a type error.
We can now use comp
in the implementations of related functions. All we have to do is to also prefix these derived functions with a Comp
constraint:
lessThan : Comp a => a -> a -> Bool
lessThan s1 s2 = LT == comp s1 s2
greaterThan : Comp a => a -> a -> Bool
greaterThan s1 s2 = GT == comp s1 s2
minimum : Comp a => a -> a -> a
minimum s1 s2 =
  case comp s1 s2 of
    LT => s1
    _ => s2
maximum : Comp a => a -> a -> a
maximum s1 s2 =
  case comp s1 s2 of
    GT => s1
    _ => s2
Note how the definition of minimum
is almost identical to minimumBy
. The only difference is that in case of minimumBy
we had to pass the comparison function as an explicit argument, while for minimum
it is provided as part of the Comp
implementation, which is passed around by Idris for us.
Thus, we have defined all these utility functions once and for all for every type with an implementation of interface Comp
.
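The following sketch shows a call site of such a constrained function. The interface, one implementation, and lessThan are repeated so that the snippet is self-contained, and check is a hypothetical name used only here.

```idris
interface Comp a where
  comp : a -> a -> Ordering

implementation Comp Bits8 where
  comp = compare

lessThan : Comp a => a -> a -> Bool
lessThan s1 s2 = LT == comp s1 s2

-- No comparison function is passed explicitly: Idris looks up the
-- `Comp Bits8` implementation from the argument types.
check : Bool
check = lessThan (the Bits8 3) 10
```

Calling lessThan with two strings would be rejected in this snippet, because no Comp String implementation is in scope.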
Exercises Part 1
-
Implement function anyLarger, which should return True if and only if a list of values contains at least one element larger than a given reference value. Use interface Comp in your implementation.
-
Implement function allLarger, which should return True if and only if a list of values contains only elements larger than a given reference value. Note that this is trivially true for the empty list. Use interface Comp in your implementation.
-
Implement function maxElem, which tries to extract the largest element from a list of values with a Comp implementation. Likewise for minElem, which tries to extract the smallest element. Note that the possibility of the list being empty must be considered when deciding on the output type.
-
Define an interface Concat for values like lists or strings, which can be concatenated. Provide implementations for lists and strings.
-
Implement function concatList for concatenating the values in a list holding values with a Concat implementation. Make sure to reflect the possibility of the list being empty in your output type.
More About Interfaces
module Tutorial.Interfaces.More
import Tutorial.Interfaces.Basics
In the last section, we learned about the very basics of interfaces: Why they are useful and how to define and implement them. In this section, we will learn about some slightly more advanced concepts: Extending interfaces, interfaces with constraints, and default implementations.
Extending Interfaces
Some interfaces form a kind of hierarchy. For instance, for the Concat
interface used in exercise 4, there might be a child interface called Empty
, for those types that have a neutral element with respect to concatenation. In such a case, we make an implementation of Concat
a prerequisite for implementing Empty
:
interface Concat a where
  concat : a -> a -> a

implementation Concat String where
  concat = (++)

interface Concat a => Empty a where
  empty : a

implementation Empty String where
  empty = ""
Concat a => Empty a
should be read as: "An implementation of Concat
for type a
is a prerequisite for there being an implementation of Empty
for a
." But this also means that, whenever we have an implementation of interface Empty
, we must also have an implementation of Concat
and can invoke the corresponding functions:
concatListE : Empty a => List a -> a
concatListE [] = empty
concatListE (x :: xs) = concat x (concatListE xs)
Note how in the type of concatListE
we only used an Empty
constraint, and how in the implementation we were still able to invoke both empty
and concat
.
Constrained Implementations
Sometimes, it is only possible to implement an interface for a generic type, if its type parameters implement this interface as well. For instance, implementing interface Comp
for Maybe a
makes sense only if type a
itself implements Comp
. We can constrain interface implementations with the same syntax we use for constrained functions:
implementation Comp a => Comp (Maybe a) where
  comp Nothing Nothing = EQ
  comp (Just _) Nothing = GT
  comp Nothing (Just _) = LT
  comp (Just x) (Just y) = comp x y
This is not the same as extending an interface, although the syntax looks very similar. Here, the constraint lies on a type parameter instead of the full type. The last line in the implementation of Comp (Maybe a)
compares the values stored in the two Just
s. This is only possible if there is a Comp
implementation for these values as well. Go ahead and remove the Comp a
constraint from the above implementation, then have a look at the resulting error message. Learning to read and understand Idris' type errors is important for fixing them.
The good thing is that Idris will solve all these constraints for us:
maxTest : Maybe Bits8 -> Ordering
maxTest = comp (Just 12)
Here, Idris tries to find an implementation for Comp (Maybe Bits8)
. In order to do so, it needs an implementation for Comp Bits8
. Go ahead and replace Bits8
in the type of maxTest
with Bits64
, and have a look at the error message Idris produces.
Default Implementations
Sometimes, we'd like to pack several related functions in an interface to allow programmers to implement each in the most efficient way, although they could be implemented in terms of each other. For instance, consider an interface Equals
for comparing two values for equality, with functions eq
returning True
if two values are equal and neq
returning True
if they are not. Surely, we can implement neq
in terms of eq
, so most of the time when implementing Equals
, we will only implement the latter. In this case, we can give an implementation for neq
already in the definition of Equals
:
interface Equals a where
  eq : a -> a -> Bool
  neq : a -> a -> Bool
  neq a1 a2 = not (eq a1 a2)
If in an implementation of Equals
we only implement eq
, Idris will use the default implementation for neq
as shown above:
Equals String where
  eq = (==)
If on the other hand we'd like to provide explicit implementations for both functions, we can do so as well:
Equals Bool where
  eq True True = True
  eq False False = True
  eq _ _ = False
  neq True False = True
  neq False True = True
  neq _ _ = False
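As a small sketch of the default implementation at work, the snippet below calls neq on strings even though only eq was implemented for them. The interface is repeated to keep the snippet self-contained, and differ is a made-up name.

```idris
interface Equals a where
  eq : a -> a -> Bool

  neq : a -> a -> Bool
  neq a1 a2 = not (eq a1 a2)

Equals String where
  eq = (==)

-- Uses the default `neq`, that is, `not (eq "foo" "bar")`.
differ : Bool
differ = neq "foo" "bar"
```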
Exercises part 2
-
Implement interfaces Equals, Comp, Concat, and Empty for pairs, constraining your implementations as necessary. (Note that multiple constraints can be given sequentially like other function arguments: Comp a => Comp b => Comp (a,b).)
-
Below is an implementation of a binary tree. Implement interfaces Equals and Concat for this type.
data Tree : Type -> Type where
  Leaf : a -> Tree a
  Node : Tree a -> Tree a -> Tree a
Interfaces in the Prelude
module Tutorial.Interfaces.Prelude
The Idris Prelude provides several interfaces plus implementations that are useful in almost every non-trivial program. I'll introduce the basic ones here. The more advanced ones will be discussed in later chapters.
Most of these interfaces come with associated mathematical laws, and implementations are assumed to adhere to these laws. These laws will be given here as well.
Eq
Probably the most often used interface, Eq
corresponds to interface Equals
we used above as an example. Instead of eq
and neq
, Eq
provides two operators (==)
and (/=)
for comparing two values of the same type for being equal or not. Most of the data types defined in the Prelude come with an implementation of Eq
, and whenever programmers define their own data types, Eq
is typically one of the first interfaces they implement.
Eq
Laws
We expect the following laws to hold for all implementations of Eq
:
-
(==) is reflexive: x == x = True for all x. This means that every value is equal to itself.
-
(==) is symmetric: x == y = y == x for all x and y. This means that the order of arguments passed to (==) does not matter.
-
(==) is transitive: From x == y = True and y == z = True follows x == z = True.
-
(/=) is the negation of (==): x == y = not (x /= y) for all x and y.
In theory, Idris has the power to verify these laws at compile time for many non-primitive types. However, out of pragmatism this is not required when implementing Eq
, since writing such proofs can be quite involved.
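Here is a minimal sketch of an Eq implementation for a small enumeration. Color, isRed, and notRed are hypothetical names chosen for this example. Only (==) is implemented. (/=) is left to the default implementation that Eq provides, in the same way neq was derived from eq in Equals.

```idris
data Color = Red | Green | Blue

-- Each value is equal only to itself, so the laws for reflexivity,
-- symmetry, and transitivity hold by construction.
Eq Color where
  Red   == Red   = True
  Green == Green = True
  Blue  == Blue  = True
  _     == _     = False

total
isRed : Color -> Bool
isRed c = c == Red

total
notRed : Color -> Bool
notRed c = c /= Red
```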
Ord
The counterpart to Comp
in the Prelude is interface Ord
. In addition to compare
, which is identical to our own comp
, it provides comparison operators (>=)
, (>)
, (<=)
, and (<)
, as well as utility functions max
and min
. Unlike Comp
, Ord
extends Eq
, so whenever there is an Ord
constraint, we also have access to operators (==)
and (/=)
and related functions.
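The following sketch uses only an Ord constraint and still has access to (==). Both clamp and onBoundary are hypothetical helper functions written for this illustration.

```idris
-- Restrict a value to the range between a lower and an upper bound
-- (first and second argument) using `max` and `min` from `Ord`.
total
clamp : Ord a => a -> a -> a -> a
clamp low high x = max low (min high x)

-- `(==)` comes from `Eq`, which every type with an `Ord`
-- implementation must implement as well.
total
onBoundary : Ord a => a -> a -> a -> Bool
onBoundary low high x = x == low || x == high
```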
Ord
Laws
We expect the following laws to hold for all implementations of Ord
:
-
(<=) is reflexive and transitive.
-
(<=) is antisymmetric: From x <= y = True and y <= x = True follows x == y = True.
-
x <= y = y >= x.
-
x < y = not (y <= x)
-
x > y = not (y >= x)
-
compare x y = EQ => x == y = True
-
compare x y == GT = x > y
-
compare x y == LT = x < y
Semigroup
and Monoid
Semigroup
is the counterpart to our example interface Concat
, with operator (<+>)
(also called append) corresponding to function concat
.
Likewise, Monoid
corresponds to Empty
, with neutral
corresponding to empty
.
These are incredibly important interfaces, which can be used to combine two or more values of a data type into a single value of the same type. Examples include but are not limited to addition or multiplication of numeric types, concatenation of sequences of data, or sequencing of computations.
As an example, consider a data type for representing distances in a geometric application. We could just use Double
for this, but that's not very type safe. It would be better to use a single field record wrapping values of type Double
, to give such values clear semantics:
record Distance where
  constructor MkDistance
  meters : Double
There is a natural way for combining two distances: We sum up the values they hold. This immediately leads to an implementation of Semigroup
:
Semigroup Distance where
  x <+> y = MkDistance $ x.meters + y.meters
It is also immediately clear, that zero is the neutral element of this operation: Adding zero to any value does not affect the value at all. This allows us to implement Monoid
as well:
Monoid Distance where
  neutral = MkDistance 0
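With both implementations in place, a whole list of distances can be combined in one go. The sketch below repeats the definitions from above and uses concat from the Prelude, which combines the values of a container with (<+>) and neutral. The names totalDistance and trip are made up for this example.

```idris
record Distance where
  constructor MkDistance
  meters : Double

Semigroup Distance where
  x <+> y = MkDistance $ x.meters + y.meters

Monoid Distance where
  neutral = MkDistance 0

-- Works for lists of any length, including the empty list, which
-- yields `neutral`.
totalDistance : List Distance -> Distance
totalDistance = concat

-- 1.5 + 2.5 meters
trip : Distance
trip = totalDistance [MkDistance 1.5, MkDistance 2.5]
```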
Semigroup
and Monoid
Laws
We expect the following laws to hold for all implementations of Semigroup
and Monoid
:
(<+>)
is associative: x <+> (y <+> z) = (x <+> y) <+> z, for all values x, y, and z.
neutral is the neutral element with relation to (<+>): neutral <+> x = x <+> neutral = x, for all x.
Show
The Show
interface is mainly used for debugging purposes, and is supposed to display values of a given type as a string, typically closely resembling the Idris code used to create the value. This includes the proper wrapping of arguments in parentheses where necessary. For instance, experiment with the output of the following function at the REPL:
showExample : Maybe (Either String (List (Maybe Integer))) -> String
showExample = show
And at the REPL:
Tutorial.Interfaces> showExample (Just (Right [Just 12, Nothing]))
"Just (Right [Just 12, Nothing])"
We will learn how to implement instances of Show
in an exercise.
Overloaded Literals
Literal values in Idris, such as integer literals (12001
), string literals ("foo bar"
), floating point literals (12.112
), and character literals ('$'
) can be overloaded. This means that we can create values of types other than String
from just a string literal. The exact workings of this have to wait for another section, but for many common cases, it is sufficient for a type to implement interfaces FromString
(for using string literals), FromChar
(for using character literals), or FromDouble
(for using floating point literals). The case of integer literals is special, and will be discussed in the next section.
Here is an example of using FromString
. Assume we write an application where users can identify themselves with a username and password. Both consist of strings of characters, so it is pretty easy to confuse and mix up the two things, although they clearly have very different semantics. In these cases, it is advisable to come up with new types for the two, especially since getting these things wrong is a security concern.
Here are three example record types to do this:
record UserName where
  constructor MkUserName
  name : String

record Password where
  constructor MkPassword
  value : String

record User where
  constructor MkUser
  name : UserName
  password : Password
In order to create a value of type User
, even for testing, we'd have to wrap all strings using the given constructors:
hock : User
hock = MkUser (MkUserName "hock") (MkPassword "not telling")
This is rather cumbersome, and some people might think this to be too high a price to pay just for an increase in type safety (I'd tend to disagree). Luckily, we can get the convenience of string literals back very easily:
FromString UserName where
  fromString = MkUserName
FromString Password where
  fromString = MkPassword
hock2 : User
hock2 = MkUser "hock" "not telling"
Numeric Interfaces
The Prelude also exports several interfaces providing the usual arithmetic operations. Below is a comprehensive list of the interfaces and the functions each provides:
-
Num
  - (+): Addition
  - (*): Multiplication
  - fromInteger: Overloaded integer literals
-
Neg
  - negate: Negation
  - (-): Subtraction
-
Integral
  - div: Integer division
  - mod: Modulo operation
-
Fractional
  - (/): Division
  - recip: Calculates the reciprocal of a value
As you can see, we need to implement interface Num
to use integer literals for a given type. In order to use negative integer literals like -12
, we also have to implement interface Neg
.
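As an illustration, here is a sketch of a small two-component vector type that supports integer literals. The type V2, its fields, and the choice of component-wise multiplication are assumptions made purely for this example.

```idris
record V2 where
  constructor MkV2
  dx : Integer
  dy : Integer

Num V2 where
  (+) (MkV2 a b) (MkV2 c d) = MkV2 (a + c) (b + d)
  (*) (MkV2 a b) (MkV2 c d) = MkV2 (a * c) (b * d)
  -- An integer literal fills both components with the same value.
  fromInteger n = MkV2 n n

Neg V2 where
  negate (MkV2 a b) = MkV2 (negate a) (negate b)
  (-) (MkV2 a b) (MkV2 c d) = MkV2 (a - c) (b - d)

-- Thanks to `Num`, an integer literal can stand for a `V2`.
origin : V2
origin = 0

-- A negative literal additionally relies on `negate` from `Neg`.
backwards : V2
backwards = -1
```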
Cast
The last interface we will quickly discuss in this section is Cast
. It is used to convert values of one type to values of another via function cast
. Cast
is special, since it is parameterized over two type parameters, unlike the other interfaces we looked at so far, which have only one.
So far, Cast
is mainly used for interconversion between primitive types in the standard libraries, especially numeric types. When you look at the implementations exported from the Prelude (for instance, by invoking :doc Cast
at the REPL), you'll see that there are dozens of implementations for most pairings of primitive types.
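The sketch below shows cast in action, first between primitive numeric types and then with a custom implementation. The function average as well as the wrapper Percent and its conversion are hypothetical and only serve as an illustration.

```idris
-- Which `cast` is used depends on source and target type: here
-- `Integer` to `Double` for the sum and `Nat` to `Double` for the length.
average : List Integer -> Double
average [] = 0
average xs = cast (sum xs) / cast (length xs)

-- A made-up wrapper type with its own `Cast` implementation.
record Percent where
  constructor MkPercent
  value : Double

Cast Percent Double where
  cast p = p.value / 100

-- 25 percent as a plain fraction
fraction : Double
fraction = cast (MkPercent 25)
```

Because Cast has two type parameters, Idris needs to know both the source and the target type to pick an implementation, which is why the type signatures matter here.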
Although Cast
would also be useful for other conversions (for going from Maybe
to List
or for going from Either e
to Maybe
, for instance), the Prelude and base seem not to introduce these consistently. For instance, there are Cast
implementations for going from SnocList
to List
and vice versa, but not for going from Vect n
to List
, or for going from List1
to List
, although these would be just as feasible.
Exercises part 3
These exercises are meant to make you comfortable with implementing interfaces for your own data types, as you will have to do so regularly when writing Idris code.
While it is immediately clear why interfaces like Eq
, Ord
, or Num
are useful, the usability of Semigroup
and Monoid
may be harder to appreciate at first. Therefore, there are several exercises where you'll implement different instances for these.
-
Define a record type Complex for complex numbers, by pairing two values of type Double. Implement interfaces Eq, Num, Neg, and Fractional for Complex.
-
Implement interface Show for Complex. Have a look at data type Prec and function showPrec and how these are used in the Prelude to implement instances for Either and Maybe.
Verify the correct behavior of your implementation by wrapping a value of type Complex in a Just and show the result at the REPL.
-
Consider the following wrapper for optional values:
record First a where
  constructor MkFirst
  value : Maybe a
Implement interfaces Eq, Ord, Show, FromString, FromChar, FromDouble, Num, Neg, Integral, and Fractional for First a. All of these will require corresponding constraints on type parameter a. Consider implementing and using the following utility functions where they make sense:
pureFirst : a -> First a
mapFirst : (a -> b) -> First a -> First b
mapFirst2 : (a -> b -> c) -> First a -> First b -> First c
-
Implement interfaces
Semigroup
andMonoid
forFirst a
in such a way, that(<+>)
will return the first non-nothing argument andneutral
is the corresponding neutral element. There must be no constraints on type parametera
in these implementations. -
Repeat exercises 3 and 4 for record
Last
. TheSemigroup
implementation should return the last non-nothing value.record Last a where constructor MkLast value : Maybe a
-
Function
foldMap
allows us to map a function returning aMonoid
over a list of values and accumulate the result using(<+>)
at the same time. This is a very powerful way to accumulate the values stored in a list. UsefoldMap
andLast
to extract the last element (if any) from a list.Note, that the type of
foldMap
is more general and not specialized to lists only. It works also forMaybe
,Either
and other container types we haven't looked at so far. We will learn about interfaceFoldable
in a later section. -
Consider record wrappers
Any
andAll
for boolean values:record Any where constructor MkAny any : Bool record All where constructor MkAll all : Bool
Implement
Semigroup
andMonoid
forAny
, so that the result of(<+>)
isTrue
, if and only if at least one of the arguments isTrue
. Make sure thatneutral
is indeed the neutral element for this operation.Likewise, implement
Semigroup
andMonoid
forAll
, so that the result of(<+>)
isTrue
, if and only if both of the arguments areTrue
. Make sure thatneutral
is indeed the neutral element for this operation. -
Implement functions
anyElem
andallElems
usingfoldMap
andAny
orAll
, respectively:-- True, if the predicate holds for at least one element anyElem : (a -> Bool) -> List a -> Bool -- True, if the predicate holds for all elements allElems : (a -> Bool) -> List a -> Bool
-
Record wrappers
Sum
andProduct
are mainly used to hold numeric types.record Sum a where constructor MkSum value : a record Product a where constructor MkProduct value : a
Given an implementation of
Num a
, implementSemigroup (Sum a)
andMonoid (Sum a)
, so that(<+>)
corresponds to addition.Likewise, implement
Semigroup (Product a)
andMonoid (Product a)
, so that(<+>)
corresponds to multiplication.When implementing
neutral
, remember that you can use integer literals when working with numeric types. -
Implement
sumList
andproductList
by usingfoldMap
together with the wrappers from Exercise 9:sumList : Num a => List a -> a productList : Num a => List a -> a
-
To appreciate the power and versatility of
foldMap
, after solving exercises 6 to 10 (or by loadingSolutions.Inderfaces
in a REPL session), run the following at the REPL, which will - in a single list traversal! - calculate the first and last element of the list as well as the sum and product of all values.> foldMap (\x => (pureFirst x, pureLast x, MkSum x, MkProduct x)) [3,7,4,12] (MkFirst (Just 3), (MkLast (Just 12), (MkSum 26, MkProduct 1008)))
Note, that there are also
Semigroup
implementations for types with anOrd
implementation, which will return the smaller or larger of two values. In case of types with an absolute minimum or maximum (for instance, 0 for natural numbers, or 0 and 255 forBits8
), these can even be extended toMonoid
. -
In an earlier exercise, you implemented a data type representing chemical elements and wrote a function for calculating their atomic masses. Define a new single field record type for representing atomic masses, and implement interfaces
Eq
,Ord
,Show
,FromDouble
,Semigroup
, andMonoid
for this. -
Use the new data type from exercise 12 to calculate the atomic mass of an element and compute the molecular mass of a molecule given by its formula.
Hint: With a suitable utility function, you can use
foldMap
once again for this.
Final notes: If you are new to functional programming, make sure to give your implementations of exercises 6 to 10 a try at the REPL. Note, how we can implement all of these functions with a minimal amount of code and how, as shown in exercise 11, these behaviors can be combined in a single list traversal.
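The remark in exercise 11 about semigroups that keep the larger of two values can be sketched as follows. This is only one possible encoding: the wrapper MaxNat and the function largest are hypothetical names chosen for this illustration, and the sketch is specialized to Nat, where 0 serves as the neutral element.

```idris
-- Hypothetical wrapper: combining two values keeps the larger one.
record MaxNat where
  constructor MkMaxNat
  value : Nat

Semigroup MaxNat where
  x <+> y = MkMaxNat (max x.value y.value)

-- 0 is the smallest natural number, so it is neutral with respect to `max`.
Monoid MaxNat where
  neutral = MkMaxNat 0

-- Largest element of a list, or 0 if the list is empty.
largest : List Nat -> Nat
largest xs = value (foldMap MkMaxNat xs)
```

Note that returning 0 for the empty list is a design choice that only works because Nat has an absolute minimum.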
Conclusion
- Interfaces allow us to implement the same function with different behavior for different types.
- Functions taking one or more interface implementations as arguments are called constrained functions.
- Interfaces can be organized hierarchically by extending other interfaces.
- Interface implementations can themselves be constrained, requiring other implementations to be available.
- Interface functions can be given a default implementation, which can be overridden by implementers, for instance for reasons of efficiency.
- Certain interfaces allow us to use literal values such as string or integer literals for our own data types.
Note, that I did not yet tell the whole story about literal values in this section. More details for using literals with types that accept only a restricted set of values can be found in the chapter about primitives.
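To tie some of these points together, here is a compact sketch. The interface Pretty and all functions below are hypothetical and exist only for this illustration; they show a default implementation, a constrained implementation, and a constrained function side by side.

```idris
-- Hypothetical interface, for illustration only.
interface Pretty a where
  pretty : a -> String

  -- Default implementation, which implementers may override.
  prettyAll : List a -> String
  prettyAll xs = concatMap pretty xs

Pretty Bool where
  pretty True  = "yes"
  pretty False = "no"

-- Constrained implementation: it requires `Pretty a` to be available.
Pretty a => Pretty (Maybe a) where
  pretty Nothing  = "-"
  pretty (Just x) = pretty x

-- Constrained function: it works for all types with a `Pretty` implementation.
prettyPair : Pretty a => Pretty b => (a, b) -> String
prettyPair (x, y) = pretty x ++ "/" ++ pretty y
```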
What's next
In the next chapter, we take a closer look at functions and their types. We will learn about named arguments, implicit arguments, and erased arguments as well as some constructs for implementing more complex functions.
Functions Part 2
So far, we learned about the core features of the Idris language, which it has in common with several other pure, strongly typed programming languages like Haskell: (Higher-order) Functions, algebraic data types, pattern matching, parametric polymorphism (generic types and functions), and ad hoc polymorphism (interfaces and constrained functions).
In this chapter, we start to dissect Idris functions and their types for real. We learn about implicit arguments, named arguments, as well as erasure and quantities. But first, we'll look at let
bindings and where
blocks, which help us implement functions too complex to fit on a single line of code. Let's get started!
module Tutorial.Functions2
%default total
Let Bindings and Local Definitions
module Tutorial.Functions2.LetBindings
%default total
The functions we looked at so far were simple enough to be implemented directly via pattern matching without the need of additional auxiliary functions or variables. This is not always the case, and there are two important language constructs for introducing and reusing new local variables and functions. We'll look at these in two case studies.
Use Case 1: Arithmetic Mean and Standard Deviation
In this example, we'd like to calculate the arithmetic mean and the standard deviation of a list of floating point values. There are several things we need to consider.
First, we need a function for calculating the sum of a list of numeric values. The Prelude exports function sum
for this:
Main> :t sum
Prelude.sum : Num a => Foldable t => t a -> a
This is - of course - similar to sumList
from Exercise 10 of the last section, but generalized to all container types with a Foldable
implementation. We will learn about interface Foldable
in a later section.
In order to also calculate the variance, we need to convert every value in the list to a new value, as we have to subtract the mean from every value in the list and square the result. In the previous section's exercises, we defined function mapList
for this. The Prelude - of course - already exports a similar function called map
, which is again more general and works also like our mapMaybe
for Maybe
and mapEither
for Either e
. Here's its type:
Main> :t map
Prelude.map : Functor f => (a -> b) -> f a -> f b
Interface Functor
is another one we'll talk about in a later section.
Finally, we need a way to calculate the length of a list of values. We use function length
for this:
Main> :t List.length
Prelude.List.length : List a -> Nat
Here, Nat
is the type of natural numbers (unbounded, unsigned integers). Nat
is actually not a primitive data type but a sum type defined in the Prelude with data constructors Z : Nat
(for zero) and S : Nat -> Nat
(for successor). It might seem highly inefficient to define natural numbers this way, but the Idris compiler treats these and several other number-like types specially, and replaces them with primitive integers during code generation.
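Since Nat is an ordinary sum type, we can take its values apart by pattern matching on Z and S. The following is a small sketch; the names isEven and three are made up for this example.

```idris
-- Pattern matching on the two data constructors of Nat.
isEven : Nat -> Bool
isEven Z         = True
isEven (S Z)     = False
isEven (S (S k)) = isEven k

-- Integer literals can be used for Nat as well:
-- the literal 3 denotes the same value as the expression below.
three : Nat
three = S (S (S Z))
```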
We are now ready to give the implementation of mean
a go. Since this is Idris, and we care about clear semantics, we will quickly define a custom record type instead of just returning a tuple of Double
s. This makes it clearer which floating point number corresponds to which statistical quantity:
square : Double -> Double
square n = n * n

record Stats where
  constructor MkStats
  mean      : Double
  variance  : Double
  deviation : Double

stats : List Double -> Stats
stats xs =
  let len      := cast (length xs)
      mean     := sum xs / len
      variance := sum (map (\x => square (x - mean)) xs) / len
   in MkStats mean variance (sqrt variance)
As usual, we first try this at the REPL:
Tutorial.Functions2> stats [2,4,4,4,5,5,7,9]
MkStats 5.0 4.0 2.0
Seems to work, so let's digest this step by step. We introduce several new local variables (len
, mean
, and variance
), which all will be used more than once in the remainder of the implementation. To do so, we use a let
binding. This consists of the let
keyword, followed by one or more variable assignments, followed by the final expression, which has to be prefixed by in
. Note, that whitespace is significant again: We need to properly align the three variable names. Go ahead, and try out what happens if you remove a space in front of mean
or variance
. Note also, that the alignment of assignment operators :=
is optional. I do this, since I think it helps readability.
Let's also quickly look at the different variables and their types. len
is the length of the list cast to a Double
, since this is what's needed later on, where we divide other values of type Double
by the length. Idris is very strict about this: We are not allowed to mix up numeric types without explicit casts. Please note, that in this case Idris is able to infer the type of len
from the surrounding context. mean
is straightforward: We sum
up the values stored in the list and divide by the list's length. variance
is the most involved of the three: We map each item in the list to a new value using an anonymous function to subtract the mean and square the result. We then sum up the new terms and divide again by the number of values.
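Here is a second, smaller sketch of a let binding. It is not part of the statistics example: the function triangleArea is a hypothetical helper that computes the area of a triangle from its side lengths using Heron's formula. The point is that s is bound once and then reused several times.

```idris
-- Area of a triangle with side lengths a, b, and c (Heron's formula).
-- `s` (half the perimeter) is bound once and used four times in `prod`.
triangleArea : Double -> Double -> Double -> Double
triangleArea a b c =
  let s    := (a + b + c) / 2
      prod := s * (s - a) * (s - b) * (s - c)
   in sqrt prod
```

For a triangle with sides 3, 4, and 5, this yields 6.0. Later bindings may refer to earlier ones, just like variance refers to mean in the example above.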
Use Case 2: Simulating a Simple Web Server
In the second use case, we are going to write a slightly larger application. This should give you an idea about how to design data types and functions around some business logic you'd like to implement.
Assume we run a music streaming web server, where users can buy whole albums and listen to them online. We'd like to simulate a user connecting to the server and getting access to one of the albums they bought.
We first define a bunch of record types:
record Artist where
  constructor MkArtist
  name : String

record Album where
  constructor MkAlbum
  name   : String
  artist : Artist

record Email where
  constructor MkEmail
  value : String

record Password where
  constructor MkPassword
  value : String

record User where
  constructor MkUser
  name     : String
  email    : Email
  password : Password
  albums   : List Album
Most of these should be self-explanatory. Note, however, that in several cases (Email
, Artist
, Password
) we wrap a single value in a new record type. Of course, we could have used the unwrapped String
type instead, but we'd have ended up with many String
fields, which can be hard to disambiguate. In order not to confuse an email string with a password string, it can therefore be helpful to wrap both of them in a new record type to drastically increase type safety at the cost of having to reimplement some interfaces. Utility function on
from the Prelude is very useful for this. Don't forget to inspect its type at the REPL, and try to understand what's going on here.
Eq Artist where (==) = (==) `on` name
Eq Email where (==) = (==) `on` value
Eq Password where (==) = (==) `on` value
Eq Album where (==) = (==) `on` \a => (a.name, a.artist)
In case of Album
, we wrap the two fields of the record in a Pair
, which already comes with an implementation of Eq
. This allows us to again use function on
, which is very convenient.
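To see that this pattern is not limited to (==), here is a self-contained sketch with a hypothetical record Track (not part of the web server example). The function on first applies the given projection to both arguments and then combines the two results with the binary function.

```idris
-- Hypothetical record, used only for this illustration.
record Track where
  constructor MkTrack
  title   : String
  seconds : Nat

-- Two tracks are equal if both of their fields are equal.
Eq Track where
  (==) = (==) `on` \t => (t.title, t.seconds)

-- The same trick works for other binary functions, for instance `compare`:
-- tracks are compared by their duration only.
compareByLength : Track -> Track -> Ordering
compareByLength = compare `on` seconds
```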
Next, we have to define the data types representing server requests and responses:
record Credentials where
  constructor MkCredentials
  email    : Email
  password : Password

record Request where
  constructor MkRequest
  credentials : Credentials
  album       : Album

data Response : Type where
  UnknownUser     : Email -> Response
  InvalidPassword : Response
  AccessDenied    : Email -> Album -> Response
  Success         : Album -> Response
For server responses, we use a custom sum type encoding the possible outcomes of a client request. In practice, the Success
case would return some kind of connection to start the actual album stream, but we just wrap up the album we found to simulate this behavior.
We can now go ahead and simulate the handling of a request at the server. To emulate our user data base, a simple list of users will do. Here's the type of the function we'd like to implement:
DB : Type
DB = List User
handleRequest : DB -> Request -> Response
Note, how we defined a short alias for List User
called DB
. This is often useful to make lengthy type signatures more readable and communicate the meaning of a type in the given context. However, this will not introduce a new type, nor will it increase type safety: DB
is identical to List User
, and as such, a value of type DB
can be used wherever a List User
is expected and vice versa. In more complex programs it is therefore usually preferable to define new types by wrapping values in single-field records.
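The difference between an alias and a single-field record can be sketched as follows. All names in this example (Meters, Seconds, speed, fastest) are hypothetical and unrelated to the web server.

```idris
-- A type alias: `Meters` and `Double` are interchangeable.
Meters : Type
Meters = Double

-- A single-field record: a genuinely new type.
record Seconds where
  constructor MkSeconds
  value : Double

speed : Meters -> Seconds -> Double
speed m s = m / s.value

-- Accepted: a plain Double can be used where `Meters` is expected,
-- but the duration has to be wrapped explicitly.
fastest : Double
fastest = speed 100.0 (MkSeconds 9.58)

-- Rejected by the type checker if uncommented,
-- since a value of type Double is not a `Seconds`:
-- notOk : Double -> Double
-- notOk d = speed 100.0 d
```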
The implementation will proceed as follows: It will first try to look up a User
by its email address in the database. If this is successful, it will compare the provided password with the user's actual password. If the two match, it will look up the requested album in the user's list of albums. If all of these steps succeed, the result will be an Album
wrapped in a Success
. If any of the steps fails, the result will describe exactly what went wrong.
Here's a possible implementation:
handleRequest db (MkRequest (MkCredentials email pw) album) =
  case lookupUser db of
    Just (MkUser _ _ password albums) =>
      if password == pw then lookupAlbum albums else InvalidPassword

    Nothing => UnknownUser email

  where lookupUser : List User -> Maybe User
        lookupUser []        = Nothing
        lookupUser (x :: xs) =
          if x.email == email then Just x else lookupUser xs

        lookupAlbum : List Album -> Response
        lookupAlbum []        = AccessDenied email album
        lookupAlbum (x :: xs) =
          if x == album then Success album else lookupAlbum xs
I'd like to point out several things in this example. First, note how we can extract values from nested records in a single pattern match. Second, we defined two local functions in a where
block: lookupUser
, and lookupAlbum
. Both of these have access to all variables in the surrounding scope. For instance, lookupUser
uses the email
variable from the pattern match in the implementation's first line. Likewise, lookupAlbum
makes use of the album
variable.
A where
block introduces new local definitions, accessible only from the surrounding scope and from other functions defined later in the same where
block. These need to be explicitly typed and indented by the same amount of whitespace.
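As a smaller, self-contained sketch of the same idea, consider the following hypothetical function countOccurrences. Its local helper go refers to target from the enclosing scope, so it does not need to take it as an argument.

```idris
-- Counts how often `target` occurs in a list of strings.
countOccurrences : String -> List String -> Nat
countOccurrences target xs = go xs
  where
    -- Local definition with an explicit type, as required in a where block.
    go : List String -> Nat
    go []        = 0
    go (y :: ys) = if y == target then S (go ys) else go ys
```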
Local definitions can also be introduced before a function's implementation by using the let
keyword. This usage of let
is not to be confused with let bindings described above, which are used to bind and reuse the results of intermediate computations. Below is how we could have implemented handleRequest
with local definitions introduced by the let
keyword. Again, all definitions have to be properly typed and indented:
handleRequest' : DB -> Request -> Response
handleRequest' db (MkRequest (MkCredentials email pw) album) =
  let lookupUser : List User -> Maybe User
      lookupUser []        = Nothing
      lookupUser (x :: xs) =
        if x.email == email then Just x else lookupUser xs

      lookupAlbum : List Album -> Response
      lookupAlbum []        = AccessDenied email album
      lookupAlbum (x :: xs) =
        if x == album then Success album else lookupAlbum xs

   in case lookupUser db of
        Just (MkUser _ _ password albums) =>
          if password == pw then lookupAlbum albums else InvalidPassword

        Nothing => UnknownUser email
Exercises
The exercises in this section are supposed to increase your experience in writing purely functional code. In some cases it might be useful to use let
expressions or where
blocks, but this will not always be required.
Exercise 4 is again of utmost importance. traverseList
is a specialized version of the more general traverse
, one of the most powerful and versatile functions available in the Prelude (check out its type!).
1. Module Data.List in base exports functions find and elem. Inspect their types and use these in the implementation of handleRequest. This should allow you to completely get rid of the where block.

2. Refactor handleRequest to use Either, such that handleRequest : DB -> Request -> Either Failure Album, where

       data Failure : Type where
         UnknownUser     : Email -> Failure
         InvalidPassword : Failure
         AccessDenied    : Email -> Album -> Failure

   Hint: You may find nested case statements helpful.

3. Define an enumeration type listing the four nucleobases occurring in DNA strands. Define also a type alias DNA for lists of nucleobases. Declare and implement function readBase for converting a single character (type Char) to a nucleobase. You can use character literals in your implementation like so: 'A', 'a'. Note that this function might fail, so adjust the result type accordingly.

4. Implement the following function, which tries to convert all values in a list with a function, which might fail. The result should be a Just holding the list of converted values in unmodified order, if and only if every single conversion was successful.

       traverseList : (a -> Maybe b) -> List a -> Maybe (List b)

   You can verify that the function behaves correctly with the following test: traverseList Just [1,2,3] = Just [1,2,3].

5. Implement function readDNA : String -> Maybe DNA using the functions and types defined in exercises 3 and 4. You will also need function unpack from the Prelude.

6. Implement function complement : DNA -> DNA to calculate the complement of a strand of DNA.
The Truth about Function Arguments
module Tutorial.Functions2.TheTruth
%default total
So far, when we defined a top level function, it looked something like the following:
zipEitherWith : (a -> b -> c) -> Either e a -> Either e b -> Either e c
zipEitherWith f (Right va) (Right vb) = Right (f va vb)
zipEitherWith f (Left e) _ = Left e
zipEitherWith f _ (Left e) = Left e
Function zipEitherWith
is a generic higher-order function combining the values stored in two Either
s via a binary function. If either of the Either
arguments is a Left
, the result is also a Left
.
This is a generic function with type parameters a
, b
, c
, and e
. However, there is a more verbose type for zipEitherWith
, which is visible in the REPL when entering :ti zipEitherWith
(the i
here tells Idris to include implicit
arguments). You will get a type similar to this:
zipEitherWith' :  {0 a : Type}
               -> {0 b : Type}
               -> {0 c : Type}
               -> {0 e : Type}
               -> (a -> b -> c)
               -> Either e a
               -> Either e b
               -> Either e c
In order to understand what's going on here, we will have to talk about named arguments, implicit arguments, and quantities.
Named Arguments
In a function type, we can give each argument a name. Like so:
fromMaybe : (deflt : a) -> (ma : Maybe a) -> a
fromMaybe deflt Nothing = deflt
fromMaybe _ (Just x) = x
Here, the first argument is given name deflt
, the second ma
. These names can be reused in a function's implementation, as was done for deflt
, but this is not mandatory: We are free to use different names in the implementation. There are several reasons, why we'd choose to name our arguments: It can serve as documentation, but it also allows us to pass the arguments to a function in arbitrary order when using the following syntax:
extractBool : Maybe Bool -> Bool
extractBool v = fromMaybe { ma = v, deflt = False }
Or even:
extractBool2 : Maybe Bool -> Bool
extractBool2 = fromMaybe { deflt = False }
The arguments in a record's constructor are automatically named in accordance with the field names:
record Dragon where
  constructor MkDragon
  name      : String
  strength  : Nat
  hitPoints : Int16

gorgar : Dragon
gorgar = MkDragon { strength = 150, name = "Gorgar", hitPoints = 10000 }
For the use cases described above, named arguments are merely a convenience and completely optional. However, Idris is a dependently typed programming language: Types can be calculated from and depend on values. For instance, the result type of a function can depend on the value of one of its arguments. Here's a contrived example:
IntOrString : Bool -> Type
IntOrString True = Integer
IntOrString False = String
intOrString : (v : Bool) -> IntOrString v
intOrString False = "I'm a String"
intOrString True = 1000
If you see such a thing for the first time, it can be hard to understand what's going on here. First, function IntOrString
computes a Type
from a Bool
value: If the argument is True
, it returns type Integer
, if the argument is False
it returns String
. We use this to calculate the return type of function intOrString
based on its boolean argument v
: If v
is True
, the return type is (in accordance with IntOrString True = Integer
) Integer
, otherwise it is String
.
Note, how in the type signature of intOrString
, we must give the argument of type Bool
a name (v
) in order to reference it in the result type IntOrString v
.
You might wonder at this moment, why this is useful and why we would ever want to define a function with such a strange type. We will see lots of very useful examples in due time! For now, suffice to say that in order to express dependent function types, we need to name at least some of the function's arguments and refer to them by name in the types of other arguments.
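As a small sketch of an argument type depending on an earlier argument, consider the hypothetical function render below. To keep the example self-contained, the definition of IntOrString is repeated from above.

```idris
-- Repeated from above, so that this example is self-contained.
IntOrString : Bool -> Type
IntOrString True  = Integer
IntOrString False = String

-- Here, the type of the *second argument* depends on the value of the first.
-- After matching on True, Idris knows that `n` is an Integer;
-- after matching on False, it knows that `s` is a String.
render : (v : Bool) -> IntOrString v -> String
render True  n = "an integer: " ++ show n
render False s = "a string: " ++ s
```

Again, the first argument has to be named in order to refer to it in the type of the second.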
Implicit Arguments
Implicit arguments are arguments, the values of which the compiler should infer and fill in for us automatically. For instance, in the following function signature, we expect the compiler to infer the value of type parameter a
automatically from the types of the other arguments (ignore the 0 quantity for the moment; I'll explain it in the next subsection):
maybeToEither : {0 a : Type} -> Maybe a -> Either String a
maybeToEither Nothing = Left "Nope"
maybeToEither (Just x) = Right x
-- Please remember, that the above is
-- equivalent to the following:
maybeToEither' : Maybe a -> Either String a
maybeToEither' Nothing = Left "Nope"
maybeToEither' (Just x) = Right x
As you can see, implicit arguments are wrapped in curly braces, unlike explicit named arguments, which are wrapped in parentheses. Inferring the value of an implicit argument is not always possible. For instance, if we enter the following at the REPL, Idris will fail with an error:
Tutorial.Functions2> show (maybeToEither Nothing)
Error: Can't find an implementation for Show (Either String ?a).
Idris is unable to find an implementation of Show (Either String a)
without knowing what a
actually is. Note the question mark in front of the type parameter: ?a
. If this happens, there are several ways to help the type checker. We could, for instance, pass a value for the implicit argument explicitly. Here's the syntax to do this:
Tutorial.Functions2> show (maybeToEither {a = Int8} Nothing)
"Left \"Nope\""
As you can see, we use the same syntax as shown above for explicit named arguments and the two forms of argument passing can be mixed.
We could also specify the type of the whole expression using utility function the
from the Prelude:
Tutorial.Functions2> show (the (Either String Int8) (maybeToEither Nothing))
"Left \"Nope\""
It is instructive to have a look at the type of the
:
Tutorial.Functions2> :ti the
Prelude.the : (0 a : Type) -> a -> a
Compare this with the identity function id
:
Tutorial.Functions2> :ti id
Prelude.id : {0 a : Type} -> a -> a
The only difference between the two: In case of the
, the type parameter a
is an explicit argument, while in case of id
, it is an implicit argument. Although the two functions have almost identical types (and implementations!), they serve quite different purposes: the
is used to help type inference, while id
is used whenever we'd like to return an argument without modifying it at all (which, in the presence of higher-order functions, happens surprisingly often).
Both ways to improve type inference shown above are used quite often, and must be understood by Idris programmers.
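To make the comparison concrete, here is a sketch with our own versions of the two functions. The primed names the' and id' as well as small and same are chosen only for this example, to avoid clashing with the Prelude.

```idris
-- Explicit type argument: the caller has to pass the type.
the' : (0 a : Type) -> a -> a
the' _ x = x

-- Implicit type argument: Idris infers the type from the context.
id' : {0 a : Type} -> a -> a
id' x = x

-- Here, we tell Idris explicitly that the literal is a Bits8 ...
small : Bits8
small = the' Bits8 12

-- ... while here, the type is inferred from the argument.
same : Bits8
same = id' small
```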
Multiplicities
Finally, we need to talk about the zero multiplicity, which appeared in several of the type signatures in this section. Idris 2, unlike its predecessor Idris 1, is based on a core language called quantitative type theory (QTT): Every variable in Idris 2 is associated with one of three possible multiplicities:
- 0, meaning that the variable is erased at runtime.
- 1, meaning that the variable is used exactly once at runtime.
- Unrestricted (the default), meaning that the variable is used an arbitrary number of times at runtime.
We will not talk about the most complex of the three, multiplicity 1
, here. We are, however, often interested in multiplicity 0
: A variable with multiplicity 0
is only relevant at compile time. It will not make any appearance at runtime, and the computation of such a variable will never affect a program's runtime performance.
In the type signature of maybeToEither
we see that type parameter a
has multiplicity 0
, and will therefore be erased and is only relevant at compile time, while the Maybe a
argument has unrestricted multiplicity.
It is also possible to annotate explicit arguments with multiplicities, in which case the argument must again be put in parentheses. For an example, look again at the type signature of the
.
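The following sketch contrasts an erased explicit argument with an unrestricted one. The functions emptyOf, noStrings, and twice are hypothetical names used only here.

```idris
-- The explicit argument `a` is erased: it only guides type checking.
emptyOf : (0 a : Type) -> List a
emptyOf _ = []

noStrings : List String
noStrings = emptyOf String

-- An unrestricted argument may be used any number of times at runtime.
twice : (x : Nat) -> Nat
twice x = x + x

-- Rejected by Idris if uncommented: an erased argument cannot be
-- used in a position that is relevant at runtime.
-- twiceErased : (0 x : Nat) -> Nat
-- twiceErased x = x + x
```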
Underscores
It is often desirable to write only as much code as necessary and let Idris figure out the rest. We have already learned about one such occasion: Catch-all patterns. If a variable in a pattern match is not used on the right hand side, we can't just drop it, as this would make it impossible for Idris to know which of several arguments we were planning to drop, but we can use an underscore as a placeholder instead:
isRight : Either a b -> Bool
isRight (Right _) = True
isRight _ = False
But when we look at the type signature of isRight
, we will note that type parameters a
and b
are also only used once, and are therefore of no importance. Let's get rid of them:
isRight' : Either _ _ -> Bool
isRight' (Right _) = True
isRight' _ = False
In the detailed type signature of zipEitherWith
, it should be obvious for Idris that the implicit arguments are of type Type
. After all, all of them are later on passed as arguments to the Either
type constructor, which is of type Type -> Type -> Type
. Let's get rid of them:
zipEitherWith'' :  {0 a : _}
                -> {0 b : _}
                -> {0 c : _}
                -> {0 e : _}
                -> (a -> b -> c)
                -> Either e a
                -> Either e b
                -> Either e c
Consider the following contrived example:
foo : Integer -> String
foo n = show (the (Either String Integer) (Right n))
Since we wrap an Integer
in a Right
, it is obvious that the second argument in Either String Integer
is Integer
. Only the String
argument can't be inferred by Idris. Even better, the Either
itself is obvious! Let's get rid of the unnecessary noise:
foo' : Integer -> String
foo' n = show (the (_ String _) (Right n))
Please note, that using underscores as in foo'
is not always desirable, as it can quite drastically obfuscate the written code. Always use a syntactic convenience to make code more readable, and not to show people how clever you are.
Programming with Holes
module Tutorial.Functions2.Holes
%default total
Solved all the exercises so far? Got angry at the type checker for always complaining and never being really helpful? It's time to change that. Idris comes with several highly useful interactive editing features. Sometimes, the compiler is able to implement complete functions for us (if the types are specific enough). Even if that's not possible, there's an incredibly useful and important feature, which can help us when the types are getting too complicated: Holes. Holes are variables, the names of which are prefixed with a question mark. We can use them as placeholders whenever we plan to implement a piece of functionality at a later time. In addition, their types and the types and quantities of all other variables in scope can be inspected at the REPL (or in your editor, if you set up the necessary plugin). Let's see holes in action.
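As a tiny warm-up, here is a sketch of a function with one finished clause and one hole. The function safeDiv and the hole name nonZeroCase are made up for this illustration. The file type checks even though the second clause is not implemented yet.

```idris
-- Division that is supposed to fail for a zero divisor.
safeDiv : Integer -> Integer -> Maybe Integer
safeDiv x 0 = Nothing
-- The hole stands in for the missing right-hand side.
safeDiv x y = ?nonZeroCase
```

After loading this in a REPL session, the hole can be inspected in the same way as the holes in the larger example that follows.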
Remember the traverseList
example from an Exercise earlier in this section? If this was your first encounter with applicative list traversals, this might have been a nasty bit of work. Well, let's just make it a wee bit harder still. We'd like to implement the same piece of functionality for functions returning Either e
, where e
is a type with a Semigroup
implementation, and we'd like to accumulate the values in all Left
s we meet along the way.
Here's the type of the function:
traverseEither :  Semigroup e
               => (a -> Either e b)
               -> List a
               -> Either e (List b)
As an optional exercise, you may wish to attempt this yourself first. You've seen everything you need. Consider:
- semigroups have an append operation <+> : e -> e -> e that combines two values into one
- the empty list will succeed vacuously
- if any of the function applications fail, you'll return a consolidation of all of the errors e
- if all of the function applications succeed, you'll return a list with all of the results b
- if you get it to compile, there are some test functions and variables at the bottom of this section for you to confirm that it's working as intended
Now, in order to follow along, you might want to start your own Idris source file, load it into a REPL session and adjust the code as described here. The first thing we'll do, is write a skeleton implementation with a hole on the right hand side:
traverseEither fun as = ?impl
When you now go to the REPL and reload the file using command :r
, you can enter :m
to list all the metavariables:
Tutorial.Functions2> :m
1 hole:
Tutorial.Functions2.impl : Either e (List b)
Next, we'd like to display the hole's type (including all variables in the surrounding context plus their types):
Tutorial.Functions2> :t impl
0 b : Type
0 a : Type
0 e : Type
as : List a
fun : a -> Either e b
------------------------------
impl : Either e (List b)
So, we have some erased type parameters (a
, b
, and e
), a value of type List a
called as
, and a function from a
to Either e b
called fun
. Our goal is to come up with a value of type Either e (List b)
.
We could just return a Right []
, but that only makes sense if our input list is indeed the empty list. We therefore should start with a pattern match on the list:
traverseEither fun [] = ?impl_0
traverseEither fun (x :: xs) = ?impl_1
The result is two holes, which must be given distinct names. When inspecting impl_0
, we get the following result:
Tutorial.Functions2> :t impl_0
0 b : Type
0 a : Type
0 e : Type
fun : a -> Either e b
------------------------------
impl_0 : Either e (List b)
Now, this is an interesting situation. We are supposed to come up with a value of type Either e (List b)
with nothing to work with. We know nothing about a
, so we can't provide an argument with which to invoke fun
. Likewise, we know nothing about e
or b
either, so we can't produce any values of these either. The only option we have is to replace impl_0
with an empty list wrapped in a Right
:
traverseEither fun [] = Right []
The non-empty case is of course slightly more involved. Here's the context of ?impl_1
:
Tutorial.Functions2> :t impl_1
0 b : Type
0 a : Type
0 e : Type
x : a
xs : List a
fun : a -> Either e b
------------------------------
impl_1 : Either e (List b)
Since x
is of type a
, we can either use it as an argument to fun
or drop and ignore it. xs
, on the other hand, is the remainder of the list of type List a
. We could again drop it or process it further by invoking traverseEither
recursively. Since the goal is to try and convert all values, we should drop neither. Since in case of two Left
s we are supposed to accumulate the values, we eventually need to run both computations anyway (invoking fun
, and recursively calling traverseEither
). We therefore can do both at the same time and analyze the results in a single pattern match by wrapping both in a Pair
:
traverseEither fun (x :: xs) =
  case (fun x, traverseEither fun xs) of
    p => ?impl_2
Once again, we inspect the context:
Tutorial.Functions2> :t impl_2
0 b : Type
0 a : Type
0 e : Type
xs : List a
fun : a -> Either e b
x : a
p : (Either e b, Either e (List b))
------------------------------
impl_2 : Either e (List b)
We'll definitely need to pattern match on pair p
next to figure out which of the two computations succeeded:
traverseEither fun (x :: xs) =
  case (fun x, traverseEither fun xs) of
    (Left y,  Left z)  => ?impl_6
    (Left y,  Right _) => ?impl_7
    (Right _, Left z)  => ?impl_8
    (Right y, Right z) => ?impl_9
At this point we might have forgotten what we actually wanted to do (at least to me, this happens annoyingly often), so we'll just quickly check what our goal is:
Tutorial.Functions2> :t impl_6
0 b : Type
0 a : Type
0 e : Type
xs : List a
fun : a -> Either e b
x : a
y : e
z : e
------------------------------
impl_6 : Either e (List b)
So, we are still looking for a value of type Either e (List b)
, and we have two values of type e
in scope. According to the spec we want to accumulate these using e
's Semigroup
implementation. We can proceed for the other cases in a similar manner, remembering that we should return a Right
if and only if all conversions were successful:
traverseEither fun (x :: xs) =
  case (fun x, traverseEither fun xs) of
    (Left y,  Left z)  => Left (y <+> z)
    (Left y,  Right _) => Left y
    (Right _, Left z)  => Left z
    (Right y, Right z) => Right (y :: z)
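For reference, the two clauses can be assembled into one standalone sketch. The type signature is not repeated in this part of the text, so the one below is an assumption reconstructed from the hole contexts (fun : a -> Either e b, a list of type List a, goal Either e (List b)) and from the use of (<+>). The names traverseEitherSketch and nonZero are hypothetical and exist only for this illustration.

```idris
-- Assumed signature, reconstructed from the hole contexts shown above.
traverseEitherSketch :  Semigroup e
                     => (a -> Either e b)
                     -> List a
                     -> Either e (List b)
traverseEitherSketch fun []        = Right []
traverseEitherSketch fun (x :: xs) =
  case (fun x, traverseEitherSketch fun xs) of
    (Left y,  Left z)  => Left (y <+> z)   -- both failed: accumulate the errors
    (Left y,  Right _) => Left y           -- only the head failed
    (Right _, Left z)  => Left z           -- only the tail failed
    (Right y, Right z) => Right (y :: z)   -- everything succeeded

-- Hypothetical validator, used only to exercise the function.
nonZero : Nat -> Either (List String) Nat
nonZero Z = Left ["zero"]
nonZero n = Right n
```

Because List String has a Semigroup implementation based on list concatenation, applying traverseEitherSketch nonZero to a list containing several zeros should report one error per zero.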
To reap the fruits of our labour, let's show off with a small example:
data Nucleobase = Adenine | Cytosine | Guanine | Thymine
readNucleobase : Char -> Either (List String) Nucleobase
readNucleobase 'A' = Right Adenine
readNucleobase 'C' = Right Cytosine
readNucleobase 'G' = Right Guanine
readNucleobase 'T' = Right Thymine
readNucleobase c = Left ["Unknown nucleobase: " ++ show c]
DNA : Type
DNA = List Nucleobase
readDNA : String -> Either (List String) DNA
readDNA = traverseEither readNucleobase . unpack
Let's try this at the REPL:
Tutorial.Functions2> readDNA "CGTTA"
Right [Cytosine, Guanine, Thymine, Thymine, Adenine]
Tutorial.Functions2> readDNA "CGFTAQ"
Left ["Unknown nucleobase: 'F'", "Unknown nucleobase: 'Q'"]
Interactive Editing
There are plugins available for several editors and programming environments, which facilitate interacting with the Idris compiler when implementing your functions. One editor, which is well supported in the Idris community, is Neovim. Since I am a Neovim user myself, I added some examples of what's possible to the appendix. Now would be a good time to start using the utilities discussed there.
If you use a different editor, probably with less support for the Idris programming language, you should at the very least have a REPL session open all the time, where the source file you are currently working on is loaded. This allows you to introduce new metavariables and inspect their types and context as you develop your code.
Conclusion
We again covered a lot of ground in this section. I can't stress enough that you should get yourselves accustomed to programming with holes and let the type checker help you figure out what to do next.
- When in need of local utility functions, consider defining them as local definitions in a where block.
- Use let expressions to define and reuse local variables.
- Function arguments can be given a name, which can serve as documentation, can be used to pass arguments in any order, and is used to refer to them in dependent types.
- Implicit arguments are wrapped in curly braces. The compiler is supposed to infer them from the context. If that's not possible, they can be passed explicitly like other named arguments.
- Whenever possible, Idris adds implicit erased arguments for all type parameters automatically.
- Quantities allow us to track how often a function argument is used. Quantity 0 means that the argument is erased at runtime.
- Use holes as placeholders for pieces of code you plan to fill in at a later time. Use the REPL (or your editor) to inspect the types of holes together with the names, types, and quantities of all variables in their context.
What's next
In the next chapter we'll start using dependent types to help us write provably correct code. Having a good understanding of how to read Idris' type signatures will be of paramount importance there. Whenever you feel lost, add one or more holes and inspect their context to decide what to do next.
Dependent Types
The ability to calculate types from values, pass them as arguments to functions, and return them as results from functions - in short, being a dependently typed language - is one of the most distinguishing features of Idris. Many of the more advanced type level extensions of languages like Haskell (and quite a bit more) can be treated in one fell swoop with dependent types.
module Tutorial.Dependent
%default total
Consider the following functions:
bogusMapList : (a -> b) -> List a -> List b
bogusMapList _ _ = []
bogusZipList : (a -> b -> c) -> List a -> List b -> List c
bogusZipList _ _ _ = []
The implementations type check, and still, they are obviously not what users of our library would expect. In the first example, we'd expect the implementation to apply the function argument to all values stored in the list, without dropping any of them or changing their order. The second is trickier: The two list arguments might be of different length. What are we supposed to do when that's the case? Return a list of the same length as the smaller of the two? Return an empty list? Or shouldn't we in most use cases expect the two lists to be of the same length? How could we even describe such a precondition?
Length-Indexed Lists
module Tutorial.Dependent.LengthIndexedLists
%default total
The answer to the issues described above is of course: Dependent types. Before we proceed to our example, first consider how Idris recursively defines the natural numbers (here affixed with apostrophes to avoid introducing a conflict with the actual definition of Nat
, which is defined in the Prelude):
data Nat' : Type where
  Z' : Nat'
  S' : Nat' -> Nat'
In this scheme, 0 is represented by Z
, 1 is represented by S Z
, 2 is represented by S (S Z)
, and so on. Idris does this automatically so if you enter Z
or S Z
into the REPL, it will return 0
or 1
. Note that the only function inherently available to act on a value of type Nat
is our data constructor S
, which represents the successor function, i.e. adding 1.
Also note that in Idris, every Nat
can be represented as either a Z
or an S n
where n
is another Nat
. Much as every List a
can be represented as either a Nil
or an x :: xs
(where x
is an a
and xs
is a List a
), this informs our pattern matching when solving problems.
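As a small illustration of how this shape drives pattern matching, here is a sketch of a function defined by one clause per constructor. The name double is hypothetical and not part of the Prelude.

```idris
-- Hypothetical example: every Nat is either Z or S k, so two clauses cover all cases.
double : Nat -> Nat
double Z     = Z
double (S k) = S (S (double k))   -- each S on the input yields two on the output
```

The recursive call is made on k, which is structurally smaller than S k, so the totality checker should accept the definition.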
Now we can consider the textbook introductory example of dependent types, the vector, which is a list indexed by its length:
public export
data Vect : (len : Nat) -> (a : Type) -> Type where
  Nil  : Vect 0 a
  (::) : (x : a) -> (xs : Vect n a) -> Vect (S n) a
Before we move on, please compare this with the implementation of Seq
in the section about algebraic data types. The constructors are exactly the same: Nil
and (::)
. But there is an important difference: Vect
, unlike Seq
or List
, is not a function from Type
to Type
, it is a function from Nat
to Type
to Type
. Go ahead! Open the REPL and verify this! The Nat
argument (also called an index) represents the length of the vector here. Nil
has type Vect 0 a
: A vector of length zero. Cons has type a -> Vect n a -> Vect (S n) a
: It is exactly one element longer (S n
) than its second argument, which is of length n
.
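The claim about the type of Vect can also be checked in a source file. The following standalone sketch is meant for a scratch file rather than this module: it assumes Data.Vect from the base library in place of the Vect defined above, and the names vectKind, Triple, and coords are hypothetical.

```idris
import Data.Vect

-- This only type checks if Vect is a function from Nat to Type to Type.
vectKind : Nat -> Type -> Type
vectKind = Vect

-- Fixing the index leaves a function from Type to Type, just like List.
Triple : Type -> Type
Triple = Vect 3

coords : Triple Integer
coords = [1, 2, 3]
```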
Let's experiment with this idea to gain a better understanding. There is only one way to come up with a vector of length zero:
ex1 : Vect 0 Integer
ex1 = Nil
The following, on the other hand, leads to a type error (a pretty complicated one, actually):
failing "Mismatch between: S ?n and 0."
  ex2 : Vect 0 Integer
  ex2 = [12]
The problem: [12]
gets desugared to 12 :: Nil
, but this has the wrong type! Since Nil
has type Vect 0 Integer
here, 12 :: Nil
has type Vect (S 0) Integer
, which is identical to Vect 1 Integer
. Idris verifies, at compile time, that our vector is of the correct length!
ex3 : Vect 1 Integer
ex3 = [12]
So, we found a way to encode the length of a list-like data structure in its type, and it is a type error if the number of elements in a vector does not agree with the length given in its type. We will shortly see several use cases where this additional piece of information allows us to be more precise in the types and rule out additional programming mistakes. But first, we need to quickly clarify some terminology.
Type Indices versus Type Parameters
Vect
is not only a generic type, parameterized over the type of elements it holds, it is actually a family of types, each of them associated with a natural number representing its length. We also say that the type family Vect
is indexed by its length.
The difference between a type parameter and an index is that the latter can and does change across data constructors, while the former is the same for all data constructors. Or, put differently, we can learn about the value of an index by pattern matching on a value of the type family, while this is not possible with a type parameter.
Let's demonstrate this with a contrived example:
data Indexed : Nat -> Type where
  I0 : Indexed 0
  I3 : Indexed 3
  I4 : String -> Indexed 4
Here, Indexed
is indexed over its Nat
argument, as the value of the index changes across constructors (I chose some arbitrary value for each constructor), and we can learn about these values by pattern matching on Indexed
values. We can use this, for instance, to create a Vect
of the same length as the index of Indexed
:
fromIndexed : Indexed n -> a -> Vect n a
Go ahead, and try implementing this yourself! Work with holes, pattern match on the Indexed
argument, and learn about the expected output type in each case by inspecting the holes and their context.
Here is my implementation:
fromIndexed I0 va = []
fromIndexed I3 va = [va, va, va]
fromIndexed (I4 _) va = [va, va, va, va]
As you can see, by pattern matching on the value of the Indexed n
argument, we learned about the value of the n
index itself, which was necessary to return a Vect
of the correct length.
Length-Preserving map
Function bogusMapList
behaved unexpectedly, because it always returned the empty list. With Vect
, we need to be true to the types here. If we map over a Vect
, the argument and output type contain a length index, and these length indices will tell us exactly, if and how the lengths of our vectors are modified:
map3_1 : (a -> b) -> Vect 3 a -> Vect 1 b
map3_1 f [_,y,_] = [f y]
map5_0 : (a -> b) -> Vect 5 a -> Vect 0 b
map5_0 f _ = []
map5_10 : (a -> b) -> Vect 5 a -> Vect 10 b
map5_10 f [u,v,w,x,y] = [f u, f u, f v, f v, f w, f w, f x, f x, f y, f y]
While these examples are quite interesting, they are not really useful, are they? That's because they are too specialized. We'd like to have a general function for mapping vectors of any length. Instead of using concrete lengths in type signatures, we can also use variables as already seen in the definition of Vect
. This allows us to declare the general case:
mapVect' : (a -> b) -> Vect n a -> Vect n b
This type describes a length-preserving map. It is actually more instructive (but not necessary) to include the implicit arguments as well:
mapVect : {0 a,b : _} -> {0 n : Nat} -> (a -> b) -> Vect n a -> Vect n b
We ignore the two type parameters a
, and b
, as these just describe a generic function (note, however, that we can group arguments of the same type and quantity in a single pair of curly braces; this is optional, but it sometimes helps making type signatures a bit shorter). The implicit argument of type Nat
, however, tells us that the input and output Vect
are of the same length. It is a type error to not uphold to this contract. When implementing mapVect
, it is very instructive to follow along and use some holes. In order to get any information about the length of the Vect
argument, we need to pattern match on it:
mapVect _ Nil = ?impl_0
mapVect f (x :: xs) = ?impl_1
At the REPL, we learn the following:
Tutorial.Dependent> :t impl_0
0 a : Type
0 b : Type
0 n : Nat
------------------------------
impl_0 : Vect 0 b
Tutorial.Dependent> :t impl_1
0 a : Type
0 b : Type
x : a
xs : Vect n a
f : a -> b
0 n : Nat
------------------------------
impl_1 : Vect (S n) b
The first hole, impl_0
is of type Vect 0 b
. There is only one such value, as discussed above:
mapVect _ Nil = Nil
The second case is again more interesting. We note, that xs
is of type Vect n a
, for an arbitrary length n
(given as an erased argument), while the result is of type Vect (S n) b
. So, the result has to be one element longer than xs
. Luckily, we already have a value of type a
(bound to variable x
) and a function from a
to b
(bound to variable f
), so we can apply f
to x
and prepend the result to a yet unknown remainder:
mapVect f (x :: xs) = f x :: ?rest
Let's inspect the new hole at the REPL:
Tutorial.Dependent> :t rest
0 a : Type
0 b : Type
x : a
xs : Vect n a
f : a -> b
0 n : Nat
------------------------------
rest : Vect n b
Now, we have a Vect n a
and need a Vect n b
, without knowing anything else about n
. We could learn more about n
by pattern matching further on xs
, but this would quickly lead us down a rabbit hole, since after such a pattern match, we'd end up with another Nil
case and another cons case, with a new tail of unknown length. Instead, we can invoke mapVect
recursively to convert the remainder (xs
) to a Vect n b
. The type checker guarantees, that the lengths of xs
and mapVect f xs
are the same, so the whole expression type checks and we are done:
mapVect f (x :: xs) = f x :: mapVect f xs
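To see that the type really rules out length-changing mistakes, consider the two deliberately wrong variants sketched below. This is a standalone scratch-file sketch that assumes Data.Vect from the base library in place of this chapter's Vect; the names shrinkingMap and growingMap are hypothetical. Each failing block compiles only if its content is rejected.

```idris
import Data.Vect

failing
  -- Forgetting the head yields a Vect n b where a Vect (S n) b is required.
  shrinkingMap : (a -> b) -> Vect n a -> Vect n b
  shrinkingMap _ Nil       = Nil
  shrinkingMap f (_ :: xs) = shrinkingMap f xs

failing
  -- Duplicating the head yields one element too many.
  growingMap : (a -> b) -> Vect n a -> Vect n b
  growingMap _ Nil       = Nil
  growingMap f (x :: xs) = f x :: f x :: growingMap f xs
```

With List in place of Vect, both variants would type check without complaint.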
Zipping Vectors
Let us now have a look at bogusZipList
: We'd like to pairwise merge two lists holding elements of (possibly) distinct types through a given binary function. As discussed above, the most reasonable thing to do is to expect the two lists as well as the result to be of equal length. With Vect
, this can be expressed and implemented as follows:
export
zipWith : (a -> b -> c) -> Vect n a -> Vect n b -> Vect n c
zipWith f [] [] = Nil
zipWith f (x :: xs) (y :: ys) = f x y :: zipWith f xs ys
Now, here is an interesting thing: The totality checker (activated throughout this source file due to the initial %default total
pragma) accepts the above implementation as being total, although it is missing two more cases. This works, because Idris can figure out on its own, that the other two cases are impossible. From the pattern match on the first Vect
argument, Idris learns whether n
is zero or the successor of another natural number. But from this it can derive, whether the second vector, being also of length n
, is a Nil
or a cons. Still, it can be informative to add the impossible cases explicitly. We can use keyword impossible
to do so:
zipWith _ [] (_ :: _) impossible
zipWith _ (_ :: _) [] impossible
It is - of course - a type error to annotate a case in a pattern match with impossible
, if Idris cannot verify that this case is indeed impossible. We will learn in a later section what to do, when we think we are right about an impossible case and Idris is not.
Let's give zipWith
a spin at the REPL:
Tutorial.Dependent> zipWith (*) [1,2,3] [10,20,30]
[10, 40, 90]
Tutorial.Dependent> zipWith (\x,y => x ++ ": " ++ show y) ["The answer"] [42]
["The answer: 42"]
Tutorial.Dependent> zipWith (*) [1,2,3] [10,20]
... Nasty type error ...
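The same behaviour can be recorded in a source file. The sketch below is a standalone scratch-file example that assumes zipWith and Vect from Data.Vect in the base library, which are expected to behave like the versions defined in this chapter; the names products and mismatch are hypothetical.

```idris
import Data.Vect

-- Equal lengths: accepted.
products : Vect 3 Integer
products = zipWith (*) [1, 2, 3] [10, 20, 30]

failing
  -- Lengths 3 and 2 cannot be unified, so this is rejected at compile time.
  mismatch : Vect 2 Integer
  mismatch = zipWith (*) [1, 2, 3] [10, 20]
```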
Simplifying Type Errors
It is amazing to experience the amount of work Idris can do for us and the amount of things it can infer on its own when things go well. When things don't go well, however, the error messages we get from Idris can be quite long and hard to understand, especially for programmers new to the language. For instance, the error message in the last REPL example above was pretty long, listing different things Idris tried to do together with the reason why each of them failed.
If this happens, it often means that a combination of a type error and an ambiguity resulting from overloaded function names is at work. In the example above, the two vectors are of distinct length, which leads to a type error if we interpret the list literals as vectors. However, list literals are overloaded to work with all data types with constructors Nil
and (::)
, so Idris will now try other data constructors than those of Vect
(the ones of List
and Stream
from the Prelude in this case), each of which will again fail with a type error since zipWith
expects arguments of type Vect
, and neither List
nor Stream
will work.
If this happens, prefixing overloaded function names with their namespaces can often simplify things, as Idris no longer needs to disambiguate these functions:
Tutorial.Dependent> zipWith (*) (Dependent.(::) 1 Dependent.Nil) Dependent.Nil
Error: When unifying:
Vect 0 ?c
and:
Vect 1 ?c
Mismatch between: 0 and 1.
Here, the message is much clearer: Idris can't unify the lengths of the two vectors. Unification means that Idris tries, at compile time, to convert two expressions to the same normal form. If this succeeds, the two expressions are considered to be equivalent; if it doesn't, Idris fails with a unification error.
As an alternative to prefixing overloaded functions with their namespace, we can use function the
to help with type inference:
Tutorial.Dependent> zipWith (*) (the (Vect 3 _) [1,2,3]) (the (Vect 2 _) [10,20])
Error: When unifying:
Vect 2 ?c
and:
Vect 3 ?c
Mismatch between: 0 and 1.
It is interesting to note, that the error above is not "Mismatch between: 2 and 3" but "Mismatch between: 0 and 1" instead. Here's what's going on: Idris tries to unify integer literals 2
and 3
, which are first converted to the corresponding Nat
values S (S Z)
and S (S (S Z))
, respectively. The two patterns match until we arrive at Z
vs S Z
, corresponding to values 0
and 1
, which is the discrepancy reported in the error message.
Creating Vectors
So far, we were able to learn something about the lengths of vectors by pattern matching on them. In the Nil
case, it was clear that the length is 0, while in the cons case the length was the successor of another natural number. This is not possible when we want to create a new vector:
failing "Mismatch between: S ?n and n."
  fill : a -> Vect n a
You will have a hard time implementing fill
. The following, for instance, leads to a type error:
  fill va = [va,va]
The problem is that the callers of our function decide on the length of the resulting vector. The full type of fill
is actually the following:
fill' : {0 a : Type} -> {0 n : Nat} -> a -> Vect n a
You can read this type as follows: For every type a
and for every natural number n
(about which I know nothing at runtime, since it has quantity zero), given a value of type a
, I'll give you a vector holding exactly n
elements of type a
. This is like saying: "Think about a natural number n
, and I'll give you n
apples without you telling me the value of n
". Idris is powerful, but it is not a clairvoyant.
In order to implement fill
, we need to know what n
actually is: We need to pass n
as an explicit, unerased argument, which will allow us to pattern match on it and decide - based on this pattern match - which constructors of Vect
to use:
export
replicate : (n : Nat) -> a -> Vect n a
Now, replicate
has a dependent function type: The output type depends on the value of one of the arguments. It is straightforward to implement replicate
by pattern matching on n
:
replicate 0 _ = []
replicate (S k) va = va :: replicate k va
This is a pattern that comes up often when working with indexed types: We can learn about the values of the indices by pattern matching on the values of the type family. However, in order to return a value of the type family from a function, we need to either know the values of the indices at compile time (see constants ex1
or ex3
, for instance), or we need to have access to the values of the indices at runtime, in which case we can pattern match on them and learn from this, which constructor(s) of the type family to use.
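The second option, having the index available at runtime, can be sketched with another small function. This standalone example assumes Data.Vect from the base library in place of this chapter's Vect, and the name countdown is hypothetical.

```idris
import Data.Vect

-- Hypothetical example: n is an explicit, unerased argument, so we may match on it.
-- Each clause tells Idris which constructor of Vect produces the required length.
countdown : (n : Nat) -> Vect n Nat
countdown 0     = []
countdown (S k) = S k :: countdown k
```

In the first clause the result type is Vect 0 Nat, so only Nil fits; in the second it is Vect (S k) Nat, so the result must be a cons whose tail has length k.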
Exercises part 1
- Implement a function len : List a -> Nat for calculating the length of a List. For example, len [1, 1, 1] produces 3.
- Implement function head for non-empty vectors:
  head : Vect (S n) a -> a
  Note how we can describe non-emptiness by using a pattern in the length of Vect. This rules out the Nil case, and we can return a value of type a without having to wrap it in a Maybe! Make sure to add an impossible clause for the Nil case (although this is not strictly necessary here).
- Using head as a reference, declare and implement function tail for non-empty vectors. The types should reflect that the output is exactly one element shorter than the input.
- Implement zipWith3. If possible, try to do so without looking at the implementation of zipWith:
  zipWith3 : (a -> b -> c -> d) -> Vect n a -> Vect n b -> Vect n c -> Vect n d
- Declare and implement a function foldSemi for accumulating the values stored in a List through Semigroup's append operator ((<+>)). (Make sure to only use a Semigroup constraint, as opposed to a Monoid constraint.)
- Do the same as in Exercise 5, but for non-empty vectors. How does a vector's non-emptiness affect the output type?
- Given an initial value v of type a and a function f : a -> a, we'd like to generate Vects of values of type a, the first value of which is v, the second value being f v, the third being f (f v) and so on.
  For instance, if v is 1 and f is (* 2), we'd like to get results similar to the following: [1,2,4,8,16,...].
  Declare and implement function iterate, which should encapsulate this behavior. Get some inspiration from replicate if you don't know where to start.
- Given an initial value of a state type s and a function fun : s -> (s,a), we'd like to generate Vects of values of type a. Declare and implement function generate, which should encapsulate this behavior. Make sure to use the updated state in every new invocation of fun.
  Here's an example of how this can be used to generate the first n Fibonacci numbers:
  generate 10 (\(x,y) => let z = x + y in ((y,z),z)) (0,1)
  [1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
- Implement function fromList, which converts a list of values to a Vect of the same length. Use holes if you get stuck:
  fromList : (as : List a) -> Vect (length as) a
  Note how, in the type of fromList, we can calculate the length of the resulting vector by passing the list argument to function length.
- Consider the following declarations:
  maybeSize : Maybe a -> Nat
  fromMaybe : (m : Maybe a) -> Vect (maybeSize m) a
  Choose a reasonable implementation for maybeSize and implement fromMaybe afterwards.
Fin
: Safe Indexing into Vectors
module Tutorial.Dependent.Fin
import Tutorial.Dependent.LengthIndexedLists
%default total
Consider function indexList
, which tries to extract a value from a List
at the given position:
indexList : (pos : Nat) -> List a -> Maybe a
indexList _ [] = Nothing
indexList 0 (x :: _) = Just x
indexList (S k) (_ :: xs) = indexList k xs
Now, here is a thing to consider when writing functions like indexList
: Do we want to express the possibility of failure in the output type, or do we want to restrict the accepted arguments, so the function can no longer fail? These are important design decisions, especially in larger applications. Returning a Maybe
or Either
from a function forces client code to eventually deal with the Nothing
or Left
case, and until this happens, all intermediary results will carry the Maybe
or Either
stain, which will make it more cumbersome to run calculations with these intermediary results. On the other hand, restricting the values accepted as input will complicate the argument types and will put the burden of input validation on our functions' callers, (although, at compile time we can get help from Idris, as we will see when we talk about auto implicits) while keeping the output pure and clean.
Languages without dependent types (like Haskell) can often only take the first of these two routes: wrapping the result in a Maybe
or Either
. However, in Idris we can often refine the input types to restrict the set of accepted values, thus ruling out the possibility of failure.
Assume, as an example, we'd like to extract a value from a Vect n a
at (zero-based) index k
. Surely, this can succeed if and only if k
is a natural number strictly smaller than the length n
of the vector. Luckily, we can express this precondition in an indexed type:
data Fin : (n : Nat) -> Type where
  FZ : {0 n : Nat} -> Fin (S n)
  FS : (k : Fin n) -> Fin (S n)
Fin n
is the type of natural numbers strictly smaller than n
. It is defined inductively: FZ
corresponds to natural number zero, which, as can be seen in its type, is strictly smaller than S n
for any natural number n
. FS
is the inductive case: If k
is strictly smaller than n
(k
being of type Fin n
), then FS k
is strictly smaller than S n
.
Let's come up with some values of type Fin
:
fin0_5 : Fin 5
fin0_5 = FZ
fin0_7 : Fin 7
fin0_7 = FZ
fin1_3 : Fin 3
fin1_3 = FS FZ
fin4_5 : Fin 5
fin4_5 = FS (FS (FS (FS FZ)))
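The upper bound is enforced by the type checker. The following standalone sketch assumes Fin from Data.Fin in the base library, which has the same constructors as the definition above; the names two and three are hypothetical. The failing block compiles only if its content is rejected.

```idris
import Data.Fin

-- The largest inhabitant of Fin 3 corresponds to the natural number 2.
two : Fin 3
two = FS (FS FZ)

failing
  -- Three applications of FS need an index of at least 4.
  three : Fin 3
  three = FS (FS (FS FZ))
```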
Note, that there is no value of type Fin 0
. We will learn in a later section how to express "there is no value of type x
" in a type.
Let us now check, whether we can use Fin
to safely index into a Vect
:
index : Fin n -> Vect n a -> a
Before you continue, try to implement index
yourself, making use of holes if you get stuck.
index FZ (x :: _) = x
index (FS k) (_ :: xs) = index k xs
Note, how there is no Nil
case and the totality checker is still happy. That's because Nil
is of type Vect 0 a
, but there is no value of type Fin 0
! We can verify this by adding the missing impossible clauses:
index FZ Nil impossible
index (FS _) Nil impossible
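From the caller's point of view, safe indexing looks as follows. This is a standalone scratch-file sketch that assumes index, Fin, and Vect from Data.Fin and Data.Vect in the base library, which are expected to match the definitions in this chapter; all other names are hypothetical.

```idris
import Data.Fin
import Data.Vect

letters : Vect 3 Char
letters = ['a', 'b', 'c']

-- Position 1 is in bounds for a vector of length 3, so no Maybe is needed.
secondLetter : Char
secondLetter = index (FS FZ) letters

failing
  -- Position 3 would need a vector of length at least 4.
  outOfBounds : Char
  outOfBounds = index (FS (FS (FS FZ))) letters

failing
  -- An empty vector cannot be indexed at all: Fin 0 has no values.
  fromEmpty : Char
  fromEmpty = index FZ (the (Vect 0 Char) [])
```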
Exercises part 2
- Implement function update, which, given a function of type a -> a, updates the value in a Vect n a at position k < n.
- Implement function insert, which inserts a value of type a at position k <= n in a Vect n a. Note that k is the index of the freshly inserted value, so that the following holds:
  index k (insert k v vs) = v
- Implement function delete, which deletes a value from a vector at the given index.
  This is trickier than Exercises 1 and 2, as we have to properly encode in the types that the vector is getting one element shorter.
- We can use Fin to implement safe indexing into Lists as well. Try to come up with a type and implementation for safeIndexList.
  Note: If you don't know how to start, look at the type of fromList for some inspiration. You might also need to give the arguments in a different order than for index.
- Implement function finToNat, which converts a Fin n to the corresponding natural number, and use this to declare and implement function take for splitting off the first k elements of a Vect n a with k <= n.
- Implement function minus for subtracting a value k from a natural number n with k <= n.
- Use minus from Exercise 6 to declare and implement function drop, for dropping the first k values from a Vect n a, with k <= n.
- Implement function splitAt for splitting a Vect n a at position k <= n, returning the prefix and suffix of the vector wrapped in a pair.
  Hint: Use take and drop in your implementation.
Hint: Since Fin n
consists of the values strictly smaller than n
, Fin (S n)
consists of the values smaller than or equal to n
.
Note: Functions take
, drop
, and splitAt
, while correct and provably total, are rather cumbersome to type. There is an alternative way to declare their types, as we will see in the next section.
Compile-Time Computations
module Tutorial.Dependent.Comptime
import Tutorial.Dependent.LengthIndexedLists
%default total
In the last section - especially in some of the exercises - we started more and more to use compile time computations to describe the types of our functions and values. This is a very powerful concept, as it allows us to compute output types from input types. Here's an example:
It is possible to concatenate two List
s with the (++)
operator. Surely, this should also be possible for Vect
. But Vect
is indexed by its length, so we have to reflect in the types exactly how the lengths of the inputs affect the lengths of the output. Here's how to do this:
(++) : Vect m a -> Vect n a -> Vect (m + n) a
(++) [] ys = ys
(++) (x :: xs) ys = x :: (xs ++ ys)
Note, how we keep track of the lengths at the type-level, again ruling out certain common programming errors like inadvertently dropping some values.
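A short usage sketch shows the length computation at work. It is a standalone scratch-file example that assumes (++) and Vect from Data.Vect in the base library, which are expected to have the same type as the definition above; the names joined and wrongLength are hypothetical.

```idris
import Data.Vect

-- Idris evaluates 2 + 3 at compile time and checks it against 5.
joined : Vect 5 Nat
joined = [1, 2] ++ [3, 4, 5]

failing
  -- 2 + 3 does not evaluate to 4, so this declaration is rejected.
  wrongLength : Vect 4 Nat
  wrongLength = [1, 2] ++ [3, 4, 5]
```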
We can also use type-level computations as patterns on the input types. Here is an alternative type and implementation for drop
, which you implemented in the exercises by using a Fin n
argument:
drop' : (m : Nat) -> Vect (m + n) a -> Vect n a
drop' 0 xs = xs
drop' (S k) (_ :: xs) = drop' k xs
Note that changing the order from (m + n)
to (n + m)
in the second parameter will cause an error at the second xs
:
While processing right hand side of drop'. Can't solve constraint between: plus n 0 and n.
You will learn why in the next section.
Limitations
After all the examples and exercises in this section you might have come to the conclusion that we can use arbitrary expressions in the types and Idris will happily evaluate and unify all of them for us.
I'm afraid that's not even close to the truth. The examples in this section were hand-picked because they are known to just work. The reason is that there was always a direct link between our own pattern matches and the implementations of functions we used at compile time.
For instance, here is the implementation of addition of natural numbers:
add : Nat -> Nat -> Nat
add Z n = n
add (S k) n = S $ add k n
As you can see, add
is implemented via a pattern match on its first argument, while the second argument is never inspected. Note, how this is exactly how (++)
for Vect
is implemented: There, we also pattern match on the first argument, returning the second unmodified in the Nil
case, and prepending the head to the result of appending the tail in the cons case. Since there is a direct correspondence between the two pattern matches, it is possible for Idris to unify 0 + n
with n
in the Nil
case, and (S k) + n
with S (k + n)
in the cons case.
Here is a simple example, where Idris will no longer be convinced without some help from us:
failing "Can't solve constraint"
  reverse : Vect n a -> Vect n a
  reverse []        = []
  reverse (x :: xs) = reverse xs ++ [x]
When we type-check the above, Idris will fail with the following error message: "Can't solve constraint between: plus n 1 and S n." Here's what's going on: From the pattern match on the left hand side, Idris knows that the length of the vector is S n
, for some natural number n
corresponding to the length of xs
. The length of the vector on the right hand side is n + 1
, according to the type of (++)
and the lengths of xs
and [x]
. Overloaded operator (+)
is implemented via function Prelude.plus
, that's why Idris replaces (+)
with plus
in the error message.
As you can see from the above, Idris can't verify on its own that 1 + n
is the same thing as n + 1
. It can accept some help from us, though. If we come up with a proof that the above equality holds (or - more generally - that our implementation of addition for natural numbers is commutative), we can use this proof to rewrite the types on the right hand side of reverse
. Writing proofs and using rewrite
will require some in-depth explanations and examples. Therefore, these things will have to wait until another chapter.
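The asymmetry between the two expressions can be observed without writing any real proof. In the sketch below, Refl succeeds exactly when both sides of the equality evaluate to the same normal form; the names onePlus and plusOne are hypothetical, and the failing block compiles only if its content is rejected.

```idris
-- plus matches on its first argument, so 1 + n evaluates to S n on its own.
onePlus : (n : Nat) -> 1 + n = S n
onePlus n = Refl

failing
  -- Here plus is stuck on the variable n, so evaluation alone does not help.
  plusOne : (n : Nat) -> n + 1 = S n
  plusOne n = Refl
```

The second equality is still true for every n; Idris just cannot establish it by evaluation alone.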
Unrestricted Implicits
In functions like replicate
, we pass a natural number n
as an explicit, unrestricted argument from which we infer the length of the vector to return. In some circumstances, n
can be inferred from the context. For instance, in the following example it is tedious to pass n
explicitly:
ex4 : Vect 3 Integer
ex4 = zipWith (*) (replicate 3 10) (replicate 3 11)
The value n
is clearly derivable from the context, which can be confirmed by replacing it with underscores:
ex5 : Vect 3 Integer
ex5 = zipWith (*) (replicate _ 10) (replicate _ 11)
We therefore can implement an alternative version of replicate
, where we pass n
as an implicit argument of unrestricted quantity:
replicate' : {n : _} -> a -> Vect n a
replicate' = replicate n
Note how, in the implementation of replicate'
, we can refer to n
and pass it as an explicit argument to replicate
.
Deciding whether to pass potentially inferable arguments to a function implicitly or explicitly is a question of how often the arguments actually are inferable by Idris. Sometimes it might even be useful to have both versions of a function. Remember, however, that even in case of an implicit argument we can still pass the value explicitly:
ex6 : Vect ? Bool
ex6 = replicate' {n = 2} True
In the type signature above, the question mark (?) tells Idris to figure out the value on its own by unification. Since the type no longer determines the length of the vector, we have to specify n explicitly on the right hand side of ex6.
Pattern Matching on Implicits
The implementation of replicate'
makes use of function replicate
, where we could pattern match on the explicit argument n
. However, it is also possible to pattern match on implicit, named arguments of non-zero quantity:
replicate'' : {n : _} -> a -> Vect n a
replicate'' {n = Z} _ = Nil
replicate'' {n = S _} v = v :: replicate'' v
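As a further small sketch of what an unrestricted implicit gives us: because n is not erased, a function can simply return it. The helper len below is a hypothetical name used only for illustration, and we assume Vect comes from Data.Vect.

```idris
import Data.Vect

-- `len` is a hypothetical helper. Since `n` is an implicit argument of
-- unrestricted quantity, it is available at runtime and can be returned
-- without traversing the vector.
len : {n : Nat} -> Vect n a -> Nat
len _ = n

-- Idris fills in `n = 3` by unification.
three : Nat
three = len (the (Vect 3 Integer) [1,2,3])
```

Had we declared n with quantity zero (the default for implicits that are not written out), the right hand side of len would be rejected, as n would be erased at runtime.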
Exercises part 3
1. Here is a function declaration for flattening a List of Lists:

       flattenList : List (List a) -> List a

   Implement flattenList and declare and implement a similar function flattenVect for flattening vectors of vectors.

2. Implement functions take' and splitAt' like in the exercises of the previous section but using the technique shown for drop'.

3. Implement function transpose for converting an m x n-matrix (represented as a Vect m (Vect n a)) to an n x m-matrix.

   Note: This might be a challenging exercise, but make sure to give it a try. As usual, make use of holes if you get stuck!

   Here is an example of how this should work in action:

       Solutions.Dependent> transpose [[1,2,3],[4,5,6]]
       [[1, 4], [2, 5], [3, 6]]
Conclusion
- Dependent types allow us to calculate types from values. This makes it possible to encode properties of values at the type-level and verify these properties at compile time.
- Length-indexed lists (vectors) let us rule out certain implementation errors, by forcing us to be precise about the lengths of input and output vectors.
- We can use patterns in type signatures, for instance to express that the length of a vector is non-zero and therefore, the vector is non-empty.
- When creating values of a type family, the values of the indices need to be known at compile time, or they need to be passed as arguments to the function creating the values, where we can pattern match on them to figure out which constructors to use.
- We can use Fin n, the type of natural numbers strictly smaller than n, to safely index into a vector of length n.
- Sometimes, it is convenient to pass inferable arguments as non-erased implicits, in which case we can still inspect them by pattern matching or pass them to other functions, while Idris will try and fill in the values for us.
Note, that data type Vect
together with many of the functions we implemented here is available from module Data.Vect
from the base library. Likewise, Fin
is available from Data.Fin
from base.
What's next
In the next section, it is time to learn how to write effectful programs and how to do this while still staying pure.
IO: Programming with Side Effects
So far, all our examples and exercises dealt with pure, total functions. We didn't read or write content from or to files, nor did we write any messages to the standard output. It is time to change that and learn how we can write effectful programs in Idris.
Pure Side Effects?
module Tutorial.IO.PureSideEffects
import Data.List1
import Data.String
import Data.Vect
%default total
If we once again look at the hello world example from the introduction, it had the following type and implementation:
hello : IO ()
hello = putStrLn "Hello World!"
If you load this module in a REPL session and evaluate hello
, you'll get the following:
Tutorial.IO> hello
MkIO (prim__putStr "Hello World!")
This might not be what you expected, given that we'd actually wanted the program to just print "Hello World!". In order to explain what's going on here, we need to quickly look at how evaluation at the REPL works.
When we evaluate some expression at the REPL, Idris tries to reduce it to a value until it gets stuck somewhere. In the above case, Idris gets stuck at function prim__putStr
. This is a foreign function defined in the Prelude, which has to be implemented by each backend in order to be available there. At compile time (and at the REPL), Idris knows nothing about the implementations of foreign functions and therefore can't reduce foreign function calls, unless they are built into the compiler itself. But even then, values of type IO a
(a
being a type parameter) are typically not reduced.
It is important to understand that values of type IO a
describe a program, which, when being executed, will return a value of type a
, after performing arbitrary side effects along the way. For instance, putStrLn
has type String -> IO ()
. Read this as: "putStrLn
is a function, which, when given a String
argument, will return a description of an effectful program with an output type of ()
". (()
is syntactic sugar for type Unit
, the empty tuple defined in the Prelude, which has only one value called MkUnit
, for which we can also use ()
in our code.)
Since values of type IO a
are mere descriptions of effectful computations, functions returning such values or taking such values as arguments are still pure and thus referentially transparent. It is, however, not possible to extract a value of type a
from a value of type IO a
, that is, there is no generic function IO a -> a
, as such a function would inevitably execute the side effects when extracting the result from its argument, thus breaking referential transparency. (Actually, there is such a function called unsafePerformIO
. Do not ever use it in your code unless you know what you are doing.)
Do Blocks
If you are new to pure functional programming, you might now - rightfully - mumble something about how useless it is to have descriptions of effectful programs without being able to run them. So please, hear me out. While we are not able to run values of type IO a
when writing programs, that is, there is no function of type IO a -> a
, we are able to chain such computations and describe more complex programs. Idris provides special syntax for this: Do blocks. Here's an example:
export
readHello : IO ()
readHello = do
  name <- getLine
  putStrLn $ "Hello " ++ name ++ "!"
Before we talk about what's going on here, let's give this a go at the REPL:
Tutorial.IO> :exec readHello
Stefan
Hello Stefan!
This is an interactive program, which will read a line from standard input (getLine
), assign the result to variable name
, and then use name
to create a friendly greeting and write it to standard output.
Note the do
keyword at the beginning of the implementation of readHello
: It starts a do block, where we can chain IO
computations and bind intermediary results to variables using arrows pointing to the left (<-
), which can then be used in later IO
actions. This concept is powerful enough to let us encapsulate arbitrary programs with side effects in a single value of type IO
. Such a description can then be returned by function main
, the main entry point to an Idris program, which is being executed when we run a compiled Idris binary.
The Difference between Program Description and Execution
In order to better understand the difference between describing an effectful computation and executing or running it, here is a small program:
launchMissiles : IO ()
launchMissiles = putStrLn "Boom! You're dead."
export
friendlyReadHello : IO ()
friendlyReadHello = do
  _ <- putStrLn "Please enter your name."
  readHello
actions : Vect 3 (IO ())
actions = [launchMissiles, friendlyReadHello, friendlyReadHello]
runActions : Vect (S n) (IO ()) -> IO ()
runActions (_ :: xs) = go xs
  where go : Vect k (IO ()) -> IO ()
        go []        = pure ()
        go (y :: ys) = do
          _ <- y
          go ys
readHellos : IO ()
readHellos = runActions actions
Before I explain what the code above does, please note function pure
used in the implementation of runActions
. It is a constrained function, about which we will learn in the next chapter. Specialized to IO
, it has generic type a -> IO a
: It allows us to wrap a value in an IO
action. The resulting IO
program will just return the wrapped value without performing any side effects. We can now look at the big picture of what's going on in readHellos
.
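Here is a minimal sketch of pure in isolation, separate from runActions. The names answer and printAnswer are hypothetical and only serve as an illustration.

```idris
-- `answer` is a hypothetical action: it performs no side effects and
-- just returns the wrapped value when executed.
answer : IO Nat
answer = pure 42

-- `printAnswer` binds the result of `answer` and prints it.
printAnswer : IO ()
printAnswer = do
  n <- answer
  putStrLn ("The answer is " ++ show n)
```

Running printAnswer with :exec should only print a single line, since answer itself contributes no observable effect.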
First, we define a friendlier version of readHello
: When executed, this will ask for our name explicitly. Since we will not use the result of putStrLn
any further, we can use an underscore as a catch-all pattern here. Afterwards, readHello
is invoked. We also define launchMissiles
, which, when being executed, will lead to the destruction of planet earth.
Now, runActions
is the function we use to demonstrate that describing an IO
action is not the same as running it. It will drop the first action from the non-empty vector it takes as its argument and return a new IO
action, which describes the execution of the remaining IO
actions in sequence. If this behaves as expected, the first IO
action passed to runActions
should be silently dropped together with all its potential side effects.
When we execute readHellos
at the REPL, we will be asked for our name twice, although actions
also contains launchMissiles
at the beginning. Luckily, although we described how to destroy the planet, the action was not executed, and we are (probably) still here.
From this example we learn several things:
- Values of type IO a are pure descriptions of programs, which, when being executed, perform arbitrary side effects before returning a value of type a.
- Values of type IO a can be safely returned from functions and passed around as arguments or in data structures, without the risk of them being executed.
- Values of type IO a can be safely combined in do blocks to describe new IO actions.
- An IO action will only ever get executed when it's passed to :exec at the REPL, or when it is the main function of a compiled Idris program that is being executed.
- It is not possible to ever break out of the IO context: There is no function of type IO a -> a, as such a function would need to execute its argument in order to extract the final result, and this would break referential transparency.
Combining Pure Code with IO Actions
The title of this subsection is somewhat misleading. IO
actions are pure values, but what is typically meant here, is that we combine non-IO
functions with effectful computations.
As a demonstration, in this section we are going to write a small program for evaluating arithmetic expressions. We are going to keep things simple and allow only expressions with a single operator and two arguments, both of which must be integers, for instance 12 + 13
.
We are going to use function split
from Data.String
in base to tokenize arithmetic expressions. We are then trying to parse the two integer values and the operator. These operations might fail, since user input can be invalid, so we also need an error type. We could actually just use String
, but I consider it to be good practice to use custom sum types for erroneous conditions.
public export
data Error : Type where
  NotAnInteger    : (value : String) -> Error
  UnknownOperator : (value : String) -> Error
  ParseError      : (input : String) -> Error
dispError : Error -> String
dispError (NotAnInteger v) = "Not an integer: " ++ v ++ "."
dispError (UnknownOperator v) = "Unknown operator: " ++ v ++ "."
dispError (ParseError v) = "Invalid expression: " ++ v ++ "."
In order to parse integer literals, we use function parseInteger
from Data.String
:
export
readInteger : String -> Either Error Integer
readInteger s = maybe (Left $ NotAnInteger s) Right $ parseInteger s
Likewise, we declare and implement a function for parsing arithmetic operators:
export
readOperator : String -> Either Error (Integer -> Integer -> Integer)
readOperator "+" = Right (+)
readOperator "*" = Right (*)
readOperator s = Left (UnknownOperator s)
We are now ready to parse and evaluate simple arithmetic expressions. This consists of several steps (splitting the input string, parsing each literal), each of which can fail. Later, when we learn about monads, we will see that do blocks can be used in such occasions just as well. However, in this case we can use an alternative syntactic convenience: Pattern matching in let bindings. Here is the code:
eval : String -> Either Error Integer
eval s =
  let [x,y,z]  := forget $ split isSpace s | _ => Left (ParseError s)
      Right v1 := readInteger x  | Left e => Left e
      Right op := readOperator y | Left e => Left e
      Right v2 := readInteger z  | Left e => Left e
   in Right $ op v1 v2
Let's break this down a bit. On the first line, we split the input string at all whitespace occurrences. Since split
returns a List1
(a type for non-empty lists exported from Data.List1
in base) but pattern matching on List
is more convenient, we convert the result using Data.List1.forget
. Note, how we use a pattern match on the left hand side of the assignment operator :=
. This is a partial pattern match (partial meaning, that it doesn't cover all possible cases), therefore we have to deal with the other possibilities as well, which is done after the vertical line. This can be read as follows: "If the pattern match on the left hand side is successful, and we get a list of exactly three tokens, continue with the let
expression, otherwise return a ParseError
in a Left
immediately".
The other three lines behave exactly the same: Each has a partial pattern match on the left hand side with instructions what to return in case of invalid input after the vertical bar. We will later see, that this syntax is also available in do blocks.
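The same technique can be tried out on a smaller function. The following is a sketch with a single partial pattern match; half is a hypothetical example function, and the type annotation via the is only there to pin the result of parseInteger to Integer.

```idris
import Data.String

-- `half` is a hypothetical example: it parses an integer and halves it.
-- If the partial match on `Just n` fails, the alternative after the
-- vertical bar is returned immediately.
half : String -> Either String Integer
half s =
  let Just n := the (Maybe Integer) (parseInteger s)
        | Nothing => Left ("Not an integer: " ++ s)
   in Right (n `div` 2)
```

Note that the alternative must have the same type as the whole let expression, here Either String Integer.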
Note, how all of the functionality implemented so far is pure, that is, it does not describe computations with side effects. (One could argue that already the possibility of failure is an observable effect, but even then, the code above is still referentially transparent, can be easily tested at the REPL, and evaluated at compile time, which is the important thing here.)
Finally, we can wrap this functionality in an IO
action, which reads a string from standard input and tries to evaluate the arithmetic expression:
exprProg : IO ()
exprProg = do
  s <- getLine
  case eval s of
    Left err  => do
      putStrLn "An error occurred:"
      putStrLn (dispError err)
    Right res => putStrLn (s ++ " = " ++ show res)
Note, how in exprProg
we were forced to deal with the possibility of failure and handle both constructors of Either
differently in order to print a result. Note also, that do blocks are ordinary expressions, and we can, for instance, start a new do block on the right hand side of a case expression.
Exercises part 1
In these exercises, you are going to implement some small command-line applications. Some of these will potentially run forever, as they will only stop when the user enters a keyword for quitting the application. Such programs are no longer provably total. If you added the %default total
pragma at the top of your source file, you'll need to annotate these functions with covering
, meaning that you covered all cases in all pattern matches but your program might still loop due to unrestricted recursion.
1. Implement function rep, which will read a line of input from the terminal, evaluate it using the given function, and print the result to standard output:

       rep : (String -> String) -> IO ()

2. Implement function repl, which behaves just like rep but will repeat itself forever (or until being forcefully terminated):

       covering
       repl : (String -> String) -> IO ()

3. Implement function replTill, which behaves just like repl but will only continue looping if the given function returns a Right. If it returns a Left, replTill should print the final message wrapped in the Left and then stop.

       covering
       replTill : (String -> Either String String) -> IO ()

4. Write a program, which reads arithmetic expressions from standard input, evaluates them using eval, and prints the result to standard output. The program should loop until the user stops it by entering "done", in which case the program should terminate with a friendly greeting. Use replTill in your implementation.

5. Implement function replWith, which behaves just like repl but uses some internal state to accumulate values. At each iteration (including the very first one!), the current state should be printed to standard output using function dispState, and the next state should be computed using function next. The loop should terminate in case of a Left and print a final message using dispResult:

       covering
       replWith :  (state      : s)
                -> (next       : s -> String -> Either res s)
                -> (dispState  : s -> String)
                -> (dispResult : res -> s -> String)
                -> IO ()

6. Use replWith from Exercise 5 to write a program for reading natural numbers from standard input and printing the accumulated sum of these numbers. The program should terminate in case of invalid input and if a user enters "done".
Do Blocks, Desugared
module Tutorial.IO.DoUnsugared
import Data.List1
import Data.String
import Data.Vect
import Tutorial.IO.PureSideEffects
%default total
Here's an important piece of information: There is nothing special about do blocks. They are just syntactic sugar, which is converted to a sequence of operator applications. By syntactic sugar, we mean syntax in a programming language that makes it easier to express certain things in that language without making the language itself any more powerful or expressive. Here, it means you could write all the IO
programs without using do
notation, but the code you'll write will sometimes be harder to read, so do blocks provide nicer syntax for these occasions.
Consider the following example program:
sugared1 : IO ()
sugared1 = do
  str1 <- getLine
  str2 <- getLine
  str3 <- getLine
  putStrLn (str1 ++ str2 ++ str3)
The compiler will convert this to the following program before disambiguating function names and type checking:
desugared1 : IO ()
desugared1 =
  getLine >>= (\str1 =>
    getLine >>= (\str2 =>
      getLine >>= (\str3 =>
        putStrLn (str1 ++ str2 ++ str3)
      )
    )
  )
There is a new operator ((>>=)
) called bind in the implementation of desugared1
. If you look at its type at the REPL, you'll see the following:
Main> :t (>>=)
Prelude.>>= : Monad m => m a -> (a -> m b) -> m b
This is a constrained function requiring an interface called Monad
. We will talk about Monad
and some of its friends in the next chapter. Specialized to IO
, bind has the following type:
Main> :t (>>=) {m = IO}
>>= : IO a -> (a -> IO b) -> IO b
This describes a sequencing of IO
actions. Upon execution, the first IO
action is being run and its result is being passed as an argument to the function generating the second IO
action, which is then also being executed.
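The smallest possible use of bind is probably the following sketch, where the result of the first action is passed directly to the function producing the second one. The names echo and echo' are hypothetical.

```idris
-- `echo` is a hypothetical program: read a line, then print it back.
-- `getLine : IO String` is run first, and its result is passed
-- to `putStrLn : String -> IO ()`.
echo : IO ()
echo = getLine >>= putStrLn

-- The same program written as a do block.
echo' : IO ()
echo' = do
  s <- getLine
  putStrLn s
```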
You might remember, that you already implemented something similar in an earlier exercise: In Algebraic Data Types, you implemented bind for Maybe
and Either e
. We will learn in the next chapter, that Maybe
and Either e
too come with an implementation of Monad
. For now, suffice to say that Monad
allows us to run computations with some kind of effect in sequence by passing the result of the first computation to the function returning the second computation. In desugared1
you can see, how we first perform an IO
action and use its result to compute the next IO
action and so on. The code is somewhat hard to read, since we use several layers of nested anonymous functions. That's why, in such cases, do blocks are a nice alternative to express the same functionality.
Since do blocks are always desugared to sequences of applied bind operators, we can use them to chain any monadic computation. For instance, we can rewrite function eval
by using a do block like so:
evalDo : String -> Either Error Integer
evalDo s = case forget $ split isSpace s of
  [x,y,z] => do
    v1 <- readInteger x
    op <- readOperator y
    v2 <- readInteger z
    Right $ op v1 v2
  _       => Left (ParseError s)
Don't worry, if this doesn't make too much sense yet. We will see many more examples, and you'll get the hang of this soon enough. The important thing to remember is how do blocks are always converted to sequences of bind operators as shown in desugared1
.
Binding Unit
Remember our implementation of friendlyReadHello
? Here it is again:
friendlyReadHello' : IO ()
friendlyReadHello' = do
  _ <- putStrLn "Please enter your name."
  readHello
The underscore in there is a bit ugly and unnecessary. In fact, a common use case is to just chain effectful computations with result type Unit
(()
), merely for the side effects they perform. For instance, we could repeat friendlyReadHello
three times, like so:
friendly3 : IO ()
friendly3 = do
  _ <- friendlyReadHello
  _ <- friendlyReadHello
  friendlyReadHello
This is such a common thing to do, that Idris allows us to drop the bound underscores altogether:
friendly4 : IO ()
friendly4 = do
  friendlyReadHello
  friendlyReadHello
  friendlyReadHello
  friendlyReadHello
Note, however, that the above gets desugared slightly differently:
friendly4Desugared : IO ()
friendly4Desugared =
  friendlyReadHello >>
  friendlyReadHello >>
  friendlyReadHello >>
  friendlyReadHello
Operator (>>)
has the following type:
Main> :t (>>)
Prelude.>> : Monad m => m () -> Lazy (m b) -> m b
Note the Lazy
keyword in the type signature. This means that the wrapped argument will be lazily evaluated. This makes sense on many occasions. For instance, if the Monad
in question is Maybe
the result will be Nothing
if the first argument is Nothing
, in which case there is no need to even evaluate the second argument.
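The Maybe case can be sketched as follows. The names skipped and kept are hypothetical and only used for illustration.

```idris
-- The first argument is `Nothing`, so the (lazy) second argument
-- is never needed and the result is `Nothing`.
skipped : Maybe Nat
skipped = Nothing >> Just 1

-- The first argument holds a unit value, so the second argument
-- is evaluated and returned.
kept : Maybe Nat
kept = Just () >> Just 1
```

Note that the first argument must have type Maybe () here, as required by the type of (>>).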
Do, Overloaded
Because Idris supports function and operator overloading, we can write custom bind operators, which allows us to use do notation for types without an implementation of Monad
. For instance, here is a custom implementation of (>>=)
for sequencing computations returning vectors. Every value in the first vector (of length m
) will be converted to a vector of length n
, and the results will be concatenated leading to a vector of length m * n
:
flatten : Vect m (Vect n a) -> Vect (m * n) a
flatten [] = []
flatten (x :: xs) = x ++ flatten xs
(>>=) : Vect m a -> (a -> Vect n b) -> Vect (m * n) b
as >>= f = flatten (map f as)
It is not possible to write an implementation of Monad
, which encapsulates this behavior, as the types wouldn't match: Monadic bind specialized to Vect
has type Vect k a -> (a -> Vect k b) -> Vect k b
. As you see, the sizes of all three occurrences of Vect
have to be the same, which is not what we expressed in our custom version of bind. Here is an example to see this in action:
modString : String -> Vect 4 String
modString s = [s, reverse s, toUpper s, toLower s]
testDo : Vect 24 String
testDo = DoUnsugared.do
  s1 <- ["Hello", "World"]
  s2 <- [1, 2, 3]
  modString (s1 ++ show s2)
Try to figure out how testDo
works by desugaring it manually and then comparing its result with what you expected at the REPL. Note, how we helped Idris disambiguate, which version of the bind operator to use by prefixing the do
keyword with part of the operator's namespace. In this case, this wasn't strictly necessary, although Vect k
does have an implementation of Monad
, but it is still good to know that it is possible to help the compiler with disambiguating do blocks.
Of course, we can (and should!) overload (>>)
in the same manner as (>>=)
, if we want to overload the behavior of do blocks.
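The following is one possible sketch of such an overloaded (>>). To keep the example self-contained, the operators are placed in a namespace of their own; the names VectDo and twice are assumptions made for this illustration and do not appear in the code above, where the operators live at the top level of the module.

```idris
import Data.Vect

namespace VectDo
  public export
  flatten : Vect m (Vect n a) -> Vect (m * n) a
  flatten []        = []
  flatten (x :: xs) = x ++ flatten xs

  public export
  (>>=) : Vect m a -> (a -> Vect n b) -> Vect (m * n) b
  as >>= f = flatten (map f as)

  -- Sequencing without binding: the second vector is repeated once
  -- for every unit value in the first vector.
  public export
  (>>) : Vect m () -> Vect n b -> Vect (m * n) b
  as >> bs = flatten (map (const bs) as)

-- The unbound first statement is desugared to a call of `VectDo.(>>)`.
twice : Vect 4 String
twice = VectDo.do
  [(), ()]
  ["a", "b"]
```

Unlike the version from the Prelude, this sketch does not wrap the second argument in Lazy, since both vectors are always needed to compute the result.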
Modules and Namespaces
Every data type, function, or operator can be unambiguously identified by prefixing it with its namespace. A function's namespace typically is the same as the module where it was defined. For instance, the fully qualified name of function eval
would be Tutorial.IO.eval
. Function and operator names must be unique in their namespace.
As we already learned, Idris can often disambiguate between functions with the same name but defined in different namespaces based on the types involved. If this is not possible, we can help the compiler by prefixing the function or operator name with a suffix of the full namespace. Let's demonstrate this at the REPL:
Tutorial.IO> :t (>>=)
Prelude.>>= : Monad m => m a -> (a -> m b) -> m b
Tutorial.IO.>>= : Vect m a -> (a -> Vect n b) -> Vect (m * n) b
As you can see, if we load this module in a REPL session and inspect the type of (>>=)
, we get two results as two operators with this name are in scope. If we only want the REPL to print the type of our custom bind operator, it is sufficient to prefix it with IO
, although we could also prefix it with its full namespace:
Tutorial.IO> :t IO.(>>=)
Tutorial.IO.>>= : Vect m a -> (a -> Vect n b) -> Vect (m * n) b
Tutorial.IO> :t Tutorial.IO.(>>=)
Tutorial.IO.>>= : Vect m a -> (a -> Vect n b) -> Vect (m * n) b
Since function names must be unique in their namespace and we still may want to define two overloaded versions of a function in an Idris module, Idris makes it possible to add additional namespaces to modules. For instance, in order to define another function called eval
, we need to add it to its own namespace (note, that all definitions in a namespace must be indented by the same amount of whitespace):
namespace Foo
  export
  eval : Nat -> Nat -> Nat
  eval = (*)
-- prefixing `eval` with its namespace is not strictly necessary here
testFooEval : Nat
testFooEval = Foo.eval 12 100
Now, here is an important thing: For functions and data types to be accessible from outside their namespace or module, they need to be exported by annotating them with the export
or public export
keywords.
The difference between export
and public export
is the following: A function annotated with export
exports its type and can be called from other namespaces. A data type annotated with export
exports its type constructor but not its data constructors. A function annotated with public export
also exports its implementation. This is necessary to use the function in compile-time computations. A data type annotated with public export
exports its data constructors as well.
In general, consider annotating data types with public export
, since otherwise you will not be able to create values of these types or deconstruct them in pattern matches. Likewise, unless you plan to use your functions in compile-time computations, annotate them with export
.
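The effect of the two keywords on compile-time computations can be sketched with a small experiment. The namespace Visibility and the functions double and triple are hypothetical names chosen for this illustration; the behavior shown is what we would expect from the rules above.

```idris
import Data.Vect

namespace Visibility
  -- Only the type of `double` is exported.
  export
  double : Nat -> Nat
  double n = n + n

  -- The implementation of `triple` is exported as well.
  public export
  triple : Nat -> Nat
  triple n = n + n + n

-- Accepted: `triple 1` can be reduced to 3 during type checking.
ok : Vect (triple 1) Bool
ok = [True, False, True]

-- Rejected: outside of its namespace, `double 1` cannot be reduced,
-- so Idris can't verify that the vector has the correct length.
failing
  notOk : Vect (double 1) Bool
  notOk = [True, False]
```

Both functions can still be called at runtime from outside the namespace. The difference only shows up when a function appears in a type.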
Bind, with a Bang
Sometimes, even do blocks are too noisy to express a combination of effectful computations. In this case, we can prefix the effectful parts with an exclamation mark (wrapping them in parentheses if they contain additional whitespace), while leaving pure expressions unmodified:
getHello : IO ()
getHello = putStrLn $ "Hello " ++ !getLine ++ "!"
The above gets desugared to the following do block:
getHello' : IO ()
getHello' = do
  s <- getLine
  putStrLn $ "Hello " ++ s ++ "!"
Here is another example:
bangExpr : String -> String -> String -> Maybe Integer
bangExpr s1 s2 s3 =
  Just $ !(parseInteger s1) + !(parseInteger s2) * !(parseInteger s3)
And here is the desugared do block:
bangExpr' : String -> String -> String -> Maybe Integer
bangExpr' s1 s2 s3 = do
  x1 <- parseInteger s1
  x2 <- parseInteger s2
  x3 <- parseInteger s3
  Just $ x1 + x2 * x3
Please remember the following: Syntactic sugar has been introduced to make code more readable or more convenient to write. If it is abused just to show how clever you are, you make things harder for other people (including your future self!) reading and trying to understand your code.
Exercises part 2
1. Reimplement the following do blocks, once by using bang notation, and once by writing them in their desugared form with nested binds:

       ex1a : IO String
       ex1a = do
         s1 <- getLine
         s2 <- getLine
         s3 <- getLine
         pure $ s1 ++ reverse s2 ++ s3

       ex1b : Maybe Integer
       ex1b = do
         n1 <- parseInteger "12"
         n2 <- parseInteger "300"
         Just $ n1 + n2 * 100

2. Below is the definition of an indexed family of types, the index of which keeps track of whether the value in question is possibly empty or provably non-empty:

       data List01 : (nonEmpty : Bool) -> Type -> Type where
         Nil  : List01 False a
         (::) : a -> List01 False a -> List01 ne a

   Please note, that the Nil case must have the nonEmpty tag set to False, while with the cons case, this is optional. So, a List01 False a can be empty or non-empty, and we'll only find out, which is the case, by pattern matching on it. A List01 True a on the other hand must be a cons, as for the Nil case the nonEmpty tag is always set to False.

   1. Declare and implement function head for non-empty lists:

          head : List01 True a -> a

   2. Declare and implement function weaken for converting any List01 ne a to a List01 False a of the same length and order of values.

   3. Declare and implement function tail for extracting the possibly empty tail from a non-empty list.

   4. Implement function (++) for concatenating two values of type List01. Note, how we use a type-level computation to make sure the result is non-empty if and only if at least one of the two arguments is non-empty:

          (++) : List01 b1 a -> List01 b2 a -> List01 (b1 || b2) a

   5. Implement utility function concat' and use it in the implementation of concat. Note, that in concat the two boolean tags are passed as unrestricted implicits, since you will need to pattern match on these to determine whether the result is provably non-empty or not:

          concat' : List01 ne1 (List01 ne2 a) -> List01 False a

          concat :  {ne1, ne2 : _}
                 -> List01 ne1 (List01 ne2 a)
                 -> List01 (ne1 && ne2) a

   6. Implement map01:

          map01 : (a -> b) -> List01 ne a -> List01 ne b

   7. Implement a custom bind operator in namespace List01 for sequencing computations returning List01s.

      Hint: Use map01 and concat in your implementation and make sure to use unrestricted implicits where necessary.

      You can use the following examples to test your custom bind operator:

          -- this and lf are necessary to make sure, which tag to use
          -- when using list literals
          lt : List01 True a -> List01 True a
          lt = id

          lf : List01 False a -> List01 False a
          lf = id

          test : List01 True Integer
          test = List01.do
            x  <- lt [1,2,3]
            y  <- lt [4,5,6,7]
            op <- lt [(*), (+), (-)]
            [op x y]

          test2 : List01 False Integer
          test2 = List01.do
            x  <- lt [1,2,3]
            y  <- Nil {a = Integer}
            op <- lt [(*), (+), (-)]
            lt [op x y]
Some notes on Exercise 2: Here, we combined the capabilities of List
and Data.List1
in a single indexed type family. This allowed us to treat list concatenation correctly: If at least one of the arguments is provably non-empty, the result is also non-empty. To tackle this correctly with List
and List1
, a total of four concatenation functions would have to be written. So, while it is often possible to define distinct data types instead of indexed families, the latter allow us to perform type-level computations to be more precise about the pre- and postconditions of the functions we write, at the cost of more-complex type signatures. In addition, sometimes it's not possible to derive the values of the indices from pattern matching on the data values alone, so they have to be passed as unerased (possibly implicit) arguments.
Please remember, that do blocks are first desugared, before type-checking, disambiguating which bind operator to use, and filling in implicit arguments. It is therefore perfectly fine to define bind operators with arbitrary constraints or implicit arguments as was shown above. Idris will handle all the details, after desugaring the do blocks.
Working with Files
module Tutorial.IO.Files
import Data.List1
import Data.String
import Data.Vect
import System.File
%default total
Module System.File from the base library exports the utilities necessary to work with file handles and to read from and write to files. When you have a file path (for instance "/home/hock/idris/tutorial/tutorial.ipkg"), the first thing you will typically do is try to create a file handle (of type System.File.File) by calling openFile.
Here is a program for counting all empty lines in a Unix/Linux-file:
covering
countEmpty : (path : String) -> IO (Either FileError Nat)
countEmpty path = openFile path Read >>= either (pure . Left) (go 0)
  where covering go : Nat -> File -> IO (Either FileError Nat)
        go k file = do
          False <- fEOF file | True => closeFile file $> Right k
          Right "\n" <- fGetLine file
            | Right _  => go k file
            | Left err => closeFile file $> Left err
          go (k + 1) file
In the example above, I invoked (>>=) without starting a do block. Make sure you understand what's going on here. Reading concise functional code is important in order to understand other people's code. Have a look at function either at the REPL, try figuring out what (pure . Left) does, and note how we use a partially applied version of go (go 0, which still expects the file handle) as the second argument to either.
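As a small warm-up for reading the pipeline above, here is a sketch that applies either to pure values. The names describe and describe' are made up for this illustration and are not part of the chapter's code.

```idris
-- `either` takes one handler per constructor of `Either`:
-- the first one handles `Left`, the second one handles `Right`.
describe : Either String Nat -> String
describe = either (\err => "failed: " ++ err) (\n => "count: " ++ show n)

-- The same function written with explicit pattern matching.
describe' : Either String Nat -> String
describe' (Left err) = "failed: " ++ err
describe' (Right n)  = "count: " ++ show n
```

In countEmpty, (pure . Left) plays the role of the first handler (it rewraps the error and lifts it into IO), while go 0 plays the role of the second one: it is a function still waiting for the File stored in the Right.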
Function go calls for some additional explanation. First, note how we used the same syntax for pattern matching on intermediary results that we already saw for let bindings. As you can see, we can use several vertical bars to handle more than one additional pattern. In order to read a single line from a file, we use function fGetLine. As with most operations working with the file system, this function might fail with a FileError, which we have to handle correctly. Note also that fGetLine returns the line including its trailing newline character '\n', so in order to check for empty lines, we have to match against "\n" instead of the empty string "".
Finally, go is not provably total, and rightfully so. Files like /dev/urandom or /dev/zero provide infinite streams of data, so countEmpty will never terminate when invoked with such a file path.
Safe Resource Handling
Note how we had to manually open and close the file handle in countEmpty. This is error-prone and tedious. Resource handling is a big topic, and we definitely won't be going into the details here, but there is a convenient function exported from System.File: withFile, which handles the opening, closing, and handling of file errors for us.
covering
countEmpty' : (path : String) -> IO (Either FileError Nat)
countEmpty' path = withFile path Read pure (go 0)
  where covering go : Nat -> File -> IO (Either FileError Nat)
        go k file = do
          False <- fEOF file | True => pure (Right k)
          Right "\n" <- fGetLine file
            | Right _  => go k file
            | Left err => pure (Left err)
          go (k + 1) file
Go ahead and have a look at the type of withFile, then have a look at how we use it to simplify the implementation of countEmpty'. Reading and understanding slightly more complex function types is important when learning to program in Idris.
Interface HasIO
When you look at the IO functions we used so far, you'll notice that most if not all of them actually don't work with IO itself but with a type parameter io with a constraint of HasIO. This interface allows us to lift a value of type IO a into another context. We will see use cases for this in later chapters, especially when we talk about monad transformers. For now, you can treat these io parameters as being specialized to IO.
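To make this a bit more concrete, here is a minimal sketch of functions that are generic over HasIO. The names greet and greetLifted are hypothetical; putStrLn and liftIO come from the Prelude.

```idris
-- Works in any context `io` that can run `IO` actions.
-- When called from `main`, `io` is simply specialized to `IO`.
greet : HasIO io => String -> io ()
greet name = putStrLn ("Hello, " ++ name ++ "!")

-- `liftIO` lifts a concrete `IO` action into such a context.
greetLifted : HasIO io => IO String -> io ()
greetLifted getName = liftIO getName >>= greet

main : IO ()
main = greet "Idris" >> greetLifted (pure "again")
```

Since every HasIO is also a Monad, the usual bind operator is available in greetLifted without any additional constraint.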
Exercises part 3
- As we have seen in the examples above, IO actions working with file handles often come with the risk of failure. We can therefore simplify things by writing some utility functions and a custom bind operator to work with these nested effects. In a new namespace IOErr, implement the following utility functions and use these to further clean up the implementation of countEmpty':

    pure : a -> IO (Either e a)

    fail : e -> IO (Either e a)

    lift : IO a -> IO (Either e a)

    catch : IO (Either e1 a) -> (e1 -> IO (Either e2 a)) -> IO (Either e2 a)

    (>>=) : IO (Either e a) -> (a -> IO (Either e b)) -> IO (Either e b)

    (>>) : IO (Either e ()) -> Lazy (IO (Either e a)) -> IO (Either e a)

- Write a function countWords for counting the words in a file. Consider using Data.String.words and the utilities from exercise 1 in your implementation.

- We can generalize the functionality used in countEmpty and countWords by implementing a helper function for iterating over the lines in a file and accumulating some state along the way. Implement withLines and use it to reimplement countEmpty and countWords:

    covering
    withLines :  (path : String)
              -> (accum : s -> String -> s)
              -> (initialState : s)
              -> IO (Either FileError s)

- We often use a Monoid for accumulating values. It is therefore convenient to specialize withLines for this case. Use withLines to implement foldLines according to the type given below:

    covering
    foldLines :  Monoid s
              => (path : String)
              -> (f : String -> s)
              -> IO (Either FileError s)

- Implement function wordCount for counting the number of lines, words, and characters in a text document. Define a custom record type together with an implementation of Monoid for storing and accumulating these values and use foldLines in your implementation of wordCount.
How IO is Implemented
In this final section of an already lengthy chapter, we will risk a glance at how IO is implemented in Idris. It is interesting to note that IO is not a built-in type but a regular data type with only one minor peculiarity. Let's learn about it at the REPL:
Tutorial.IO> :doc IO
data PrimIO.IO : Type -> Type
  Totality: total
  Constructor: MkIO : (1 _ : PrimIO a) -> IO a
  Hints:
    Applicative IO
    Functor IO
    HasLinearIO IO
    Monad IO
Here, we learn that IO has a single data constructor called MkIO, which takes a single argument of type PrimIO a with quantity 1. We are not going to talk about the quantities here, as they are in fact not important for understanding how IO works.
Now, PrimIO a is a type alias for the following function type:
Tutorial.IO> :printdef PrimIO
PrimIO.PrimIO : Type -> Type
PrimIO a = (1 _ : %World) -> IORes a
Again, don't mind the quantities. There is only one piece of the puzzle missing: IORes a, which is a publicly exported record type:
Solutions.IO> :doc IORes
data PrimIO.IORes : Type -> Type
  Totality: total
  Constructor: MkIORes : a -> (1 _ : %World) -> IORes a
So, to put this all together, IO is a wrapper around something similar to the following function type:
%World -> (a, %World)
You can think of type %World as a placeholder for the state of the outside world of a program (file system, memory, network connections, and so on). Conceptually, to execute an IO a action, we pass it the current state of the world, and in return get an updated world state plus a result of type a. The world state being updated represents all the side effects describable in a computer program.
Now, it is important to understand that there is no such thing as the state of the world. The %World type is just a placeholder, which is converted to some kind of constant that's passed around and never inspected at runtime. So, if we had a value of type %World, we could pass it to an IO a action and execute it, and this is exactly what happens at runtime: A single value of type %World (an uninteresting placeholder like null, 0, or, in case of the JavaScript backends, undefined) is passed to the main function, thus setting the whole program in motion. However, it is impossible to programmatically create a value of type %World (it is an abstract, primitive type), and therefore we cannot ever extract a value of type a from an IO a action (modulo unsafePerformIO).
Once we talk about monad transformers and the state monad, you will see that IO is nothing but a state monad in disguise, with an abstract state type that makes it impossible for us to run the stateful computation.
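The world-passing idea can be sketched with a toy model. Everything below (ToyIO, MkToyIO, runToy, pureToy, bindToy, tick, and twoTicks) is hypothetical and only mimics the shape of IO: instead of the abstract %World, the "world" is a plain Nat counter, which is exactly why we are able to run it.

```idris
-- Toy model: a wrapper around `world -> (result, updated world)`,
-- with a concrete `Nat` standing in for the abstract `%World`.
record ToyIO a where
  constructor MkToyIO
  runToy : Nat -> (a, Nat)

-- Return a value without touching the world.
pureToy : a -> ToyIO a
pureToy v = MkToyIO (\w => (v, w))

-- Run the first action, then feed its result and the
-- updated world to the second one.
bindToy : ToyIO a -> (a -> ToyIO b) -> ToyIO b
bindToy (MkToyIO act) f = MkToyIO $ \w =>
  let (v, w2) = act w
   in runToy (f v) w2

-- An "effect": return the current counter and increment the world.
tick : ToyIO Nat
tick = MkToyIO (\w => (w, S w))

twoTicks : ToyIO (Nat, Nat)
twoTicks = bindToy tick (\x => bindToy tick (\y => pureToy (x, y)))
```

Evaluating runToy twoTicks 0 at the REPL yields ((0, 1), 2): the world is threaded through both ticks in sequence. With the real IO there is no counterpart to passing in that initial 0, because we cannot create a value of type %World ourselves.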
Conclusion
- Values of type IO a describe programs with side effects, which will eventually result in a value of type a.
- While we cannot safely extract a value of type a from an IO a, we can use several combinators and syntactic constructs to combine IO actions and build more complex programs.
- Do blocks offer a convenient way to run and combine IO actions sequentially.
- Do blocks are desugared to nested applications of bind operators ((>>=)).
- Bind operators, and thus do blocks, can be overloaded to achieve custom behavior instead of the default (monadic) bind.
- Under the hood, IO actions are stateful computations operating on a symbolic %World state.
What's next
Now that we have had a glimpse at monads and the bind operator, it is time to introduce Monad and some related interfaces for real in the next chapter.
Functor and Friends
Programming, like mathematics, is about abstraction. We try to model parts of the real world, reusing recurring patterns by abstracting over them.
In this chapter, we will learn about several related interfaces, which are all about abstraction and therefore can be hard to understand at the beginning. Especially figuring out why they are useful and when to use them will take time and experience. This chapter therefore comes with tons of exercises, most of which can be solved with only a few short lines of code. Don't skip them. Come back to them several times until these things start feeling natural to you. You will then realize that their initial complexity has vanished.
Functor
module Tutorial.Functor.Functor
import Data.List1
import Data.String
import Data.Vect
%default total
What do type constructors like List, List1, Maybe, or IO have in common? First, all of them are of type Type -> Type. Second, they all put values of a given type in a certain context. With List, the context is non-determinism: We know there to be zero or more values, but we don't know the exact number until we start taking the list apart by pattern matching on it. Likewise for List1, though we know for sure that there is at least one value. For Maybe, we are still not sure about how many values there are, but there are far fewer possibilities: Zero or one. With IO, the context is a different one: Arbitrary side effects.
Although the type constructors discussed above are quite different in how they behave and when they are useful, there are certain operations that keep coming up when working with them. The first such operation is mapping a pure function over the data type, without affecting its underlying structure.
For instance, given a list of numbers, we'd like to multiply each number by two, without changing their order or removing any values:
multBy2List : Num a => List a -> List a
multBy2List [] = []
multBy2List (x :: xs) = 2 * x :: multBy2List xs
But we might just as well convert every string in a list of strings to upper case characters:
toUpperList : List String -> List String
toUpperList [] = []
toUpperList (x :: xs) = toUpper x :: toUpperList xs
Sometimes, the type of the stored value changes. In the next example, we calculate the lengths of the strings stored in a list:
toLengthList : List String -> List Nat
toLengthList [] = []
toLengthList (x :: xs) = length x :: toLengthList xs
I'd like you to appreciate just how boring these functions are. They are almost identical, with the only interesting part being the function we apply to each element. Surely, there must be a pattern to abstract over:
mapList : (a -> b) -> List a -> List b
mapList f [] = []
mapList f (x :: xs) = f x :: mapList f xs
This is often the first step of abstraction in functional programming: Write a (possibly generic) higher-order function. We can now concisely implement all examples shown above in terms of mapList:
multBy2List' : Num a => List a -> List a
multBy2List' = mapList (2 *)
toUpperList' : List String -> List String
toUpperList' = mapList toUpper
toLengthList' : List String -> List Nat
toLengthList' = mapList length
But surely we'd like to do the same kind of thing with List1 and Maybe! After all, they are just container types like List, the only difference being some detail about the number of values they can or can't hold:
mapMaybe : (a -> b) -> Maybe a -> Maybe b
mapMaybe f Nothing = Nothing
mapMaybe f (Just v) = Just (f v)
Even with IO, we'd like to be able to map pure functions over effectful computations. The implementation is a bit more involved, due to the nested layers of data constructors, but if in doubt, the types will surely guide us. Note, however, that the data constructor of IO is not publicly exported, so it is unavailable to us. We can use functions toPrim and fromPrim, however, for converting IO from and to PrimIO, which we can freely dissect:
mapIO : (a -> b) -> IO a -> IO b
mapIO f io = fromPrim $ mapPrimIO (toPrim io)
  where mapPrimIO : PrimIO a -> PrimIO b
        mapPrimIO prim w =
          let MkIORes va w2 = prim w
           in MkIORes (f va) w2
From the concept of mapping a pure function over values in a context follow some derived functions, which are often useful. Here are some of them for IO:
mapConstIO : b -> IO a -> IO b
mapConstIO = mapIO . const
forgetIO : IO a -> IO ()
forgetIO = mapConstIO ()
Of course, we'd want to implement mapConst and forget as well for List, List1, and Maybe (and dozens of other type constructors with some kind of mapping function), and they'd all look the same and be equally boring.
When we come upon a recurring class of functions with several useful derived functions, we should consider defining an interface. But how should we go about this here? When you look at the types of mapList, mapMaybe, and mapIO, you'll see that it's the List, Maybe, and IO types we need to get rid of. These are not of type Type but of type Type -> Type. Luckily, there is nothing preventing us from parametrizing an interface over something other than a Type.
The interface we are looking for is called Functor. Here is its definition and an example implementation (I appended a tick at the end of the names for them not to overlap with the interface and functions exported by the Prelude):
public export
interface Functor' (0 f : Type -> Type) where
  map' : (a -> b) -> f a -> f b

export
implementation Functor' Maybe where
  map' _ Nothing  = Nothing
  map' f (Just v) = Just $ f v
Note that we had to give the type of parameter f explicitly, and in that case it needs to be annotated with quantity zero if you want it to be erased at runtime (which you almost always want).
Now, reading type signatures consisting only of type parameters like the one of map' can take some time to get used to, especially when some type parameters are applied to other parameters as in f a. It can be very helpful to inspect these signatures together with all implicit arguments at the REPL (I formatted the output to make it more readable):
Tutorial.Functor> :ti map'
Tutorial.Functor.map' :  {0 b : Type}
                      -> {0 a : Type}
                      -> {0 f : Type -> Type}
                      -> Functor' f
                      => (a -> b)
                      -> f a
                      -> f b
It can also be helpful to replace type parameter f with a concrete value of the same type:
Tutorial.Functor> :t map' {f = Maybe}
map' : (?a -> ?b) -> Maybe ?a -> Maybe ?b
Remember, being able to interpret type signatures is paramount to understanding what's going on in an Idris declaration. You must practice this and make use of the tools and utilities given to you.
Derived Functions
There are several functions and operators directly derivable from interface Functor. Eventually, you should know and remember all of them as they are highly useful. Here they are together with their types:
Tutorial.Functor> :t (<$>)
Prelude.<$> : Functor f => (a -> b) -> f a -> f b
Tutorial.Functor> :t (<&>)
Prelude.<&> : Functor f => f a -> (a -> b) -> f b
Tutorial.Functor> :t ($>)
Prelude.$> : Functor f => f a -> b -> f b
Tutorial.Functor> :t (<$)
Prelude.<$ : Functor f => b -> f a -> f b
Tutorial.Functor> :t ignore
Prelude.ignore : Functor f => f a -> f ()
(<$>) is an operator alias for map and allows you to sometimes drop some parentheses. For instance:
tailShowReversNoOp : Show a => List1 a -> List String
tailShowReversNoOp xs = map (reverse . show) (tail xs)
tailShowReverse : Show a => List1 a -> List String
tailShowReverse xs = reverse . show <$> tail xs
(<&>) is an alias for (<$>) with the arguments flipped. The other three (ignore, ($>), and (<$)) are all used to replace the values in a context with a constant. They are often useful when you don't care about the values themselves but want to keep the underlying structure.
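The following sketch applies each of the derived functions to small Maybe and List values. The names on the left are made up for this illustration; the operators themselves are the ones from the Prelude listed above.

```idris
-- Mapping with the operator alias of `map` ...
doubled : Maybe Nat
doubled = (2 *) <$> Just 21

-- ... and with its flipped variant.
doubledFlipped : Maybe Nat
doubledFlipped = Just 21 <&> (2 *)

-- Replacing every value with a constant keeps the structure:
-- the result still has three elements.
replaced : List String
replaced = the (List Nat) [1, 2, 3] $> "x"

replaced' : List String
replaced' = "x" <$ the (List Nat) [1, 2, 3]

-- `ignore` replaces the values with `()`.
forgotten : Maybe ()
forgotten = ignore doubled
```

Here, doubled and doubledFlipped both evaluate to Just 42, replaced and replaced' to ["x", "x", "x"], and forgotten to Just ().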
Functors with more than one Type Parameter
The type constructors we looked at so far were all of type Type -> Type. However, we can also implement Functor for other type constructors. The only prerequisite is that the type parameter we'd like to change with function map must be the last in the argument list. For instance, here is the Functor implementation for Either e (note that Either e has, of course, type Type -> Type as required):
implementation Functor' (Either e) where
  map' _ (Left ve)  = Left ve
  map' f (Right va) = Right $ f va
Here is another example, this time for a type constructor of type Bool -> Type -> Type (you might remember this from the exercises in the last chapter):
data List01 : (nonEmpty : Bool) -> Type -> Type where
  Nil  : List01 False a
  (::) : a -> List01 False a -> List01 ne a

implementation Functor (List01 ne) where
  map _ []        = []
  map f (x :: xs) = f x :: map f xs
Functor Composition
The nice thing about functors is how they can be paired and nested with other functors and the results are functors again:
record Product (f,g : Type -> Type) (a : Type) where
  constructor MkProduct
  fst : f a
  snd : g a

implementation Functor f => Functor g => Functor (Product f g) where
  map f (MkProduct l r) = MkProduct (map f l) (map f r)
The above allows us to conveniently map over a pair of functors. Note, however, that Idris needs some help with inferring the types involved:
toPair : Product f g a -> (f a, g a)
toPair (MkProduct fst snd) = (fst, snd)
fromPair : (f a, g a) -> Product f g a
fromPair (x,y) = MkProduct x y
productExample :  Show a
               => (Either e a, List a)
               -> (Either e String, List String)
productExample = toPair . map show . fromPair {f = Either e, g = List}
More often, we'd like to map over several layers of nested functors at once. Here's how to do this with an example:
record Comp (f,g : Type -> Type) (a : Type) where
  constructor MkComp
  unComp : f (g a)

implementation Functor f => Functor g => Functor (Comp f g) where
  map f (MkComp v) = MkComp $ map f <$> v
compExample : Show a => List (Either e a) -> List (Either e String)
compExample = unComp . map show . MkComp {f = List, g = Either e}
Named Implementations
Sometimes, there is more than one way to implement an interface for a given type. For instance, for numeric types we can have a Monoid representing addition and one representing multiplication. Likewise, for nested functors, map can be interpreted as a mapping over only the first layer of values, or a mapping over several layers of values.
One way to go about this is to define single-field wrappers as shown with data type Comp above. However, Idris also allows us to define additional interface implementations, which must then be given a name. For instance:
[Compose'] Functor f => Functor g => Functor (f . g) where
  map f = (map . map) f
Note that this defines a new implementation of Functor, which will not be considered during implicit resolution in order to avoid ambiguities. However, it is possible to explicitly choose to use this implementation by passing it as an explicit argument to map, prefixed with an @:
compExample2 : Show a => List (Either e a) -> List (Either e String)
compExample2 = map @{Compose} show
In the example above, we used Compose instead of Compose', since the former is already exported by the Prelude.
Functor Laws
Implementations of Functor are supposed to adhere to certain laws, just like implementations of Eq or Ord. Again, these laws are not verified by Idris, although it would be possible (and often cumbersome) to do so.
- map id = id: Mapping the identity function over a functor must not have any visible effect such as changing a container's structure or affecting the side effects performed when running an IO action.
- map (f . g) = map f . map g: Sequencing two mappings must be identical to a single mapping using the composition of the two functions.
Both of these laws require that map preserves the structure of values. This is easier to understand with container types like List, Maybe, or Either e, where map is not allowed to add or remove any wrapped value, nor, in case of List, change their order. With IO, this can best be described as map not performing additional side effects.
Exercises part 1
- Write your own implementations of Functor' for Maybe, List, List1, Vect n, Either e, and Pair a.

- Write a named implementation of Functor for pairs of functors (similar to the one implemented for Product).

- Implement Functor for data type Identity (which is available from Control.Monad.Identity in base):

    record Identity a where
      constructor Id
      value : a

- Here is a curious one: Implement Functor for Const e (which is also available from Control.Applicative.Const in base). You might be confused about the fact that the second type parameter has absolutely no relevance at runtime, as there is no value of that type. Such types are sometimes called phantom types. They can be quite useful for tagging values with additional typing information.

  Don't let the above confuse you: There is only one possible implementation. As usual, use holes and let the compiler guide you if you get lost.

    record Const (e,a : Type) where
      constructor MkConst
      value : e

- Here is a sum type for describing CRUD operations (Create, Read, Update, and Delete) in a data store:

    data Crud : (i : Type) -> (a : Type) -> Type where
      Create : (value : a) -> Crud i a
      Update : (id : i) -> (value : a) -> Crud i a
      Read   : (id : i) -> Crud i a
      Delete : (id : i) -> Crud i a

  Implement Functor for Crud i.

- Here is a sum type for describing responses from a data server:

    data Response : (e, i, a : Type) -> Type where
      Created : (id : i) -> (value : a) -> Response e i a
      Updated : (id : i) -> (value : a) -> Response e i a
      Found   : (values : List a) -> Response e i a
      Deleted : (id : i) -> Response e i a
      Error   : (err : e) -> Response e i a

  Implement Functor for Response e i.

- Implement Functor for Validated e:

    data Validated : (e,a : Type) -> Type where
      Invalid : (err : e) -> Validated e a
      Valid   : (val : a) -> Validated e a
Applicative
module Tutorial.Functor.Applicative
import Tutorial.Functor.Functor
import Data.List1
import Data.String
import Data.Vect
%default total
While Functor allows us to map a pure, unary function over a value in a context, it doesn't allow us to combine n such values under an n-ary function.
For instance, consider the following functions:
liftMaybe2 : (a -> b -> c) -> Maybe a -> Maybe b -> Maybe c
liftMaybe2 f (Just va) (Just vb) = Just $ f va vb
liftMaybe2 _ _ _ = Nothing
liftVect2 : (a -> b -> c) -> Vect n a -> Vect n b -> Vect n c
liftVect2 _ [] [] = []
liftVect2 f (x :: xs) (y :: ys) = f x y :: liftVect2 f xs ys
liftIO2 : (a -> b -> c) -> IO a -> IO b -> IO c
liftIO2 f ioa iob = fromPrim $ go (toPrim ioa) (toPrim iob)
  where go : PrimIO a -> PrimIO b -> PrimIO c
        go pa pb w =
          let MkIORes va w2 = pa w
              MkIORes vb w3 = pb w2
           in MkIORes (f va vb) w3
This behavior is not covered by Functor, yet it is a very common thing to do. For instance, we might want to read two numbers from standard input (both operations might fail) and calculate the product of the two. Here's the code:
multNumbers : Num a => Neg a => IO (Maybe a)
multNumbers = do
  s1 <- getLine
  s2 <- getLine
  pure $ liftMaybe2 (*) (parseInteger s1) (parseInteger s2)
And it won't stop here. We might just as well want to have liftMaybe3 for ternary functions and three Maybe arguments and so on, for arbitrary numbers of arguments.
But there is more: We'd also like to lift pure values into the context in question. With this, we could do the following:
liftMaybe3 : (a -> b -> c -> d) -> Maybe a -> Maybe b -> Maybe c -> Maybe d
liftMaybe3 f (Just va) (Just vb) (Just vc) = Just $ f va vb vc
liftMaybe3 _ _ _ _ = Nothing
pureMaybe : a -> Maybe a
pureMaybe = Just
multAdd100 : Num a => Neg a => String -> String -> Maybe a
multAdd100 s t = liftMaybe3 calc (parseInteger s) (parseInteger t) (pure 100)
  where calc : a -> a -> a -> a
        calc x y z = x * y + z
As you have probably guessed, I am now going to present a new interface to encapsulate this behavior. It's called Applicative. Here is its definition and an example implementation:
public export
interface Functor' f => Applicative' f where
  app   : f (a -> b) -> f a -> f b
  pure' : a -> f a

export
implementation Applicative' Maybe where
  app (Just fun) (Just val) = Just $ fun val
  app _          _          = Nothing

  pure' = Just
Interface Applicative is of course already exported by the Prelude. There, function app is an operator sometimes called app or apply: (<*>).
You may wonder how functions like liftMaybe2 or liftIO2 are related to operator apply. Let me demonstrate this:
liftA2 : Applicative f => (a -> b -> c) -> f a -> f b -> f c
liftA2 fun fa fb = pure fun <*> fa <*> fb
liftA3 : Applicative f => (a -> b -> c -> d) -> f a -> f b -> f c -> f d
liftA3 fun fa fb fc = pure fun <*> fa <*> fb <*> fc
It is really important for you to understand what's going on here, so let's break these down. If we specialize liftA2 to use Maybe for f, pure fun is of type Maybe (a -> b -> c). Likewise, pure fun <*> fa is of type Maybe (b -> c), as (<*>) will apply the function stored in pure fun to the value stored in fa (currying!).
You'll often see such chains of applications of apply, the number of applies corresponding to the arity of the function we lift. You'll sometimes also see the following, which allows us to drop the initial call to pure, and use the operator version of map instead:
liftA2' : Applicative f => (a -> b -> c) -> f a -> f b -> f c
liftA2' fun fa fb = fun <$> fa <*> fb
liftA3' : Applicative f => (a -> b -> c -> d) -> f a -> f b -> f c -> f d
liftA3' fun fa fb fc = fun <$> fa <*> fb <*> fc
So, interface Applicative allows us to lift values (and functions!) into computational contexts and apply them to values in the same contexts. Before we look at an extended example of why this is useful, I'll quickly introduce some syntactic sugar for working with applicative functors.
Idiom Brackets
The programming style used for implementing liftA2' and liftA3' is also referred to as applicative style and is used a lot in Haskell for combining several effectful computations with a single pure function.
In Idris, there is an alternative to using such chains of operator applications: Idiom brackets. Here's another reimplementation of liftA2 and liftA3:
liftA2'' : Applicative f => (a -> b -> c) -> f a -> f b -> f c
liftA2'' fun fa fb = [| fun fa fb |]
liftA3'' : Applicative f => (a -> b -> c -> d) -> f a -> f b -> f c -> f d
liftA3'' fun fa fb fc = [| fun fa fb fc |]
The above implementations will be desugared to the ones given for liftA2 and liftA3, again before disambiguating, type checking, and filling in implicit values. As with the bind operator, we can therefore write custom implementations for pure and (<*>), and Idris will use these if it can disambiguate between the overloaded function names.
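As a concrete instance of this desugaring, here is a small sketch specialized to Maybe. The function volume and its lifted variants are hypothetical examples; the second variant spells out the chain of applies that the idiom bracket stands for.

```idris
-- A pure function of three arguments.
volume : Nat -> Nat -> Nat -> Nat
volume l w h = l * w * h

-- Lifted with idiom brackets ...
volumeM : Maybe Nat -> Maybe Nat -> Maybe Nat -> Maybe Nat
volumeM l w h = [| volume l w h |]

-- ... and with the applicative-style chain it corresponds to.
volumeM' : Maybe Nat -> Maybe Nat -> Maybe Nat -> Maybe Nat
volumeM' l w h = pure volume <*> l <*> w <*> h
```

Both volumeM (Just 2) (Just 3) (Just 4) and its primed counterpart evaluate to Just 24, and both return Nothing as soon as one of the arguments is Nothing.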
Use Case: CSV Reader
In order to understand the power and versatility that comes with applicative functors, we will look at a slightly extended example. We are going to write some utilities for parsing and decoding content from CSV files. These are files where each line holds a list of values separated by commas (or some other delimiter). Typically, they are used to store tabular data, for instance from spreadsheet applications. What we would like to do is convert the lines in a CSV file and store the results in custom records, where each record field corresponds to a column in the table.
For instance, here is a simple example file, containing tabular user information from a web store: First name, last name, age (optional), email address, gender, and password.
Jon,Doe,42,jon@doe.ch,m,weijr332sdk
Jane,Doe,,jane@doe.ch,f,aa433sd112
Stefan,Hoeck,,nope@goaway.ch,m,password123
And here are the Idris data types necessary to hold this information at runtime. We again use custom string wrappers for increased type safety and because it will allow us to define for each data type what we consider to be valid input:
data Gender = Male | Female | Other

public export
record Name where
  constructor MkName
  value : String

record Email where
  constructor MkEmail
  value : String

record Password where
  constructor MkPassword
  value : String

record User where
  constructor MkUser
  firstName : Name
  lastName  : Name
  age       : Maybe Nat
  email     : Email
  gender    : Gender
  password  : Password
We start by defining an interface for reading fields in a CSV file and writing implementations for the data types we'd like to read:
public export
interface CSVField a where
  read : String -> Maybe a
Below are implementations for Gender and Bool. In these cases, I decided to encode each value with a single lowercase character:
export
CSVField Gender where
  read "m" = Just Male
  read "f" = Just Female
  read "o" = Just Other
  read _   = Nothing

export
CSVField Bool where
  read "t" = Just True
  read "f" = Just False
  read _   = Nothing
For numeric types, we can use the parsing functions from Data.String:
export
CSVField Nat where
  read = parsePositive

export
CSVField Integer where
  read = parseInteger

export
CSVField Double where
  read = parseDouble
For optional values, the stored type must itself come with an implementation of CSVField. We can then treat the empty string "" as Nothing, while a non-empty string will be passed to the encapsulated type's field reader. (Remember that (<$>) is an alias for map.)
export
CSVField a => CSVField (Maybe a) where
  read "" = Just Nothing
  read s  = Just <$> read s
Finally, for our string wrappers, we need to decide what we consider to be valid values. For simplicity, I decided to limit the length of allowed strings and the set of valid characters.
readIf : (String -> Bool) -> (String -> a) -> String -> Maybe a
readIf p mk s = if p s then Just (mk s) else Nothing

isValidName : String -> Bool
isValidName s =
  let len = length s
   in 0 < len && len <= 100 && all isAlpha (unpack s)

export
CSVField Name where
  read = readIf isValidName MkName

isEmailChar : Char -> Bool
isEmailChar '.' = True
isEmailChar '@' = True
isEmailChar c   = isAlphaNum c

isValidEmail : String -> Bool
isValidEmail s =
  let len = length s
   in 0 < len && len <= 100 && all isEmailChar (unpack s)

CSVField Email where
  read = readIf isValidEmail MkEmail

isPasswordChar : Char -> Bool
isPasswordChar ' ' = True
-- please note that isSpace also holds for characters other than ' ',
-- e.g. for the non-breaking space: isSpace '\160' = True,
-- but only ' ' shall be allowed in passwords
isPasswordChar c   = not (isControl c) && not (isSpace c)

isValidPassword : String -> Bool
isValidPassword s =
  let len = length s
   in 8 < len && len <= 100 && all isPasswordChar (unpack s)

CSVField Password where
  read = readIf isValidPassword MkPassword
In a later chapter, we will learn about refinement types and how to store an erased proof of validity together with a validated value.
We can now start to decode whole lines in a CSV file. In order to do so, we first introduce a custom error type encapsulating how things can go wrong:
public export
data CSVError : Type where
FieldError : (line, column : Nat) -> (str : String) -> CSVError
UnexpectedEndOfInput : (line, column : Nat) -> CSVError
ExpectedEndOfInput : (line, column : Nat) -> CSVError
We can now use CSVField to read a single field at a given line and position in a CSV file, and return a FieldError in case of a failure.
export
readField : CSVField a => (line, column : Nat) -> String -> Either CSVError a
readField line col str =
  maybe (Left $ FieldError line col str) Right (read str)
If we know in advance the number of fields we need to read, we can try and convert a list of strings to a Vect of the given length. This facilitates reading record values of a known number of fields, as we get the correct number of string variables when pattern matching on the vector:
toVect : (n : Nat) -> (line, col : Nat) -> List a -> Either CSVError (Vect n a)
toVect 0 line _ [] = Right []
toVect 0 line col _ = Left (ExpectedEndOfInput line col)
toVect (S k) line col [] = Left (UnexpectedEndOfInput line col)
toVect (S k) line col (x :: xs) = (x ::) <$> toVect k line (S col) xs
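The benefit of converting to a Vect of known length can be seen in isolation with a simplified sketch. It uses Maybe instead of CSVError to stay self-contained; the names toVectMaybe and fullName are hypothetical and not part of the tutorial's code:

```idris
import Data.Vect

%default total

-- Simplified variant of toVect that signals a length mismatch with Nothing.
toVectMaybe : (n : Nat) -> List a -> Maybe (Vect n a)
toVectMaybe 0     []        = Just []
toVectMaybe 0     _         = Nothing
toVectMaybe (S k) []        = Nothing
toVectMaybe (S k) (x :: xs) = (x ::) <$> toVectMaybe k xs

-- The match on [fn, ln] is covering: a Vect 2 String always holds exactly
-- two elements, so no fallback clause is required.
fullName : List String -> Maybe String
fullName ss = do
  [fn, ln] <- toVectMaybe 2 ss
  pure (fn ++ " " ++ ln)
```

Here, fullName ["Joe", "Foo"] evaluates to Just "Joe Foo", while lists with fewer or more than two entries result in Nothing.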
Finally, we can implement function readUser
to try and convert a single line in a CSV-file to a value of type User
:
readUser' : (line : Nat) -> List String -> Either CSVError User
readUser' line ss = do
  [fn,ln,a,em,g,pw] <- toVect 6 line 0 ss
  [| MkUser (readField line 1 fn)
            (readField line 2 ln)
            (readField line 3 a)
            (readField line 4 em)
            (readField line 5 g)
            (readField line 6 pw) |]

readUser : (line : Nat) -> String -> Either CSVError User
readUser line = readUser' line . forget . split (',' ==)
Let's give this a go at the REPL:
Tutorial.Functor> readUser 1 "Joe,Foo,46,j@f.ch,m,pw1234567"
Right (MkUser (MkName "Joe") (MkName "Foo")
(Just 46) (MkEmail "j@f.ch") Male (MkPassword "pw1234567"))
Tutorial.Functor> readUser 7 "Joe,Foo,46,j@f.ch,m,shortPW"
Left (FieldError 7 6 "shortPW")
Note how, in the implementation of readUser', we used an idiom bracket to map a function of six arguments (MkUser) over six values of type Either CSVError. This succeeds if and only if all six fields were parsed successfully. Implementing readUser' with a succession of six nested pattern matches would have been cumbersome and would have resulted in much less readable code.
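To make the comparison concrete, here is a small sketch with only two fields, written once with nested pattern matches and once with an idiom bracket. The record Point and the helper readNat are hypothetical and exist only for this illustration; parsePositive comes from Data.String in base:

```idris
import Data.String

-- Hypothetical two-field record used only in this sketch.
record Point where
  constructor MkPoint
  px : Nat
  py : Nat

-- Hypothetical helper: parse a natural number or report the offending input.
readNat : String -> Either String Nat
readNat s = maybe (Left ("not a number: " ++ s)) Right (parsePositive s)

-- Nested pattern matches: every failure case is spelled out by hand.
pointNested : String -> String -> Either String Point
pointNested sx sy =
  case readNat sx of
    Left err => Left err
    Right a  => case readNat sy of
      Left err => Left err
      Right b  => Right (MkPoint a b)

-- Idiom bracket: the same logic, with error propagation handled by (<*>).
pointIdiom : String -> String -> Either String Point
pointIdiom sx sy = [| MkPoint (readNat sx) (readNat sy) |]
```

Both functions return Right (MkPoint 3 4) for the inputs "3" and "4", and both stop at the first invalid field. With six fields instead of two, the nested version would grow to six levels of indentation.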
However, the idiom bracket in readUser' still looks quite repetitive. Surely, we can do better?
A Case for Heterogeneous Lists
It is time to learn about a family of types, which can be used as a generic representation for record types, and which will allow us to represent and read rows in heterogeneous tables with a minimal amount of code: Heterogeneous lists.
namespace HList
  public export
  data HList : (ts : List Type) -> Type where
    Nil  : HList Nil
    (::) : (v : t) -> (vs : HList ts) -> HList (t :: ts)
A heterogeneous list is a list type indexed over a list of types. This allows us to store, at each position, a value of the type found at the same position in the list index. For instance, here is a variant, which stores three values of types Bool, Nat, and Maybe String (in that order):
hlist1 : HList [Bool, Nat, Maybe String]
hlist1 = [True, 12, Nothing]
You could argue that heterogeneous lists are just tuples storing values of the given types. That's right, of course. However, as you'll learn the hard way in the exercises, we can use the list index to perform compile-time computations on HList, for instance to keep track of the types stored in the result when concatenating two such lists.
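As a small illustration of how the list index guides pattern matching, consider the following sketch. The function describe is a hypothetical example and not part of the tutorial's code; HList is repeated so that the block stands on its own:

```idris
namespace HList
  public export
  data HList : (ts : List Type) -> Type where
    Nil  : HList Nil
    (::) : (v : t) -> (vs : HList ts) -> HList (t :: ts)

-- Hypothetical example: the index fixes both the number of elements and
-- the type at each position, so this single clause is covering.
describe : HList [String, Nat, Bool] -> String
describe [name, age, admin] =
  name ++ " (" ++ show age ++ ")" ++ (if admin then " [admin]" else "")
```

Applying describe to ["Jane", 42, True] should yield "Jane (42) [admin]", whereas an argument such as ["Jane", 42] is rejected by the type checker, because its index does not match.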
But first, we'll make use of HList
as a means to concisely parse CSV-lines. In order to do that, we need to introduce a new interface for types corresponding to whole lines in a CSV-file:
public export
interface CSVLine a where
  decodeAt : (line, col : Nat) -> List String -> Either CSVError a
We'll now write two implementations of CSVLine
for HList
: One for the Nil
case, which will succeed if and only if the current list of strings is empty. The other for the cons case, which will try and read a single field from the head of the list and the remainder from its tail. We again use an idiom bracket to combine the results:
export
CSVLine (HList []) where
  decodeAt _ _ [] = Right Nil
  decodeAt l c _  = Left (ExpectedEndOfInput l c)

export
CSVField t => CSVLine (HList ts) => CSVLine (HList (t :: ts)) where
  decodeAt l c []        = Left (UnexpectedEndOfInput l c)
  decodeAt l c (s :: ss) = [| readField l c s :: decodeAt l (S c) ss |]
And that's it! All we need to add is two utility functions for decoding whole lines before they have been split into tokens, one of which is specialized to HList
and takes an erased list of types as argument to make it more convenient to use at the REPL:
decode : CSVLine a => (line : Nat) -> String -> Either CSVError a
decode line = decodeAt line 1 . forget . split (',' ==)

hdecode :  (0 ts : List Type)
        -> CSVLine (HList ts)
        => (line : Nat)
        -> String
        -> Either CSVError (HList ts)
hdecode _ = decode
It's time to reap the fruits of our labour and give this a go at the REPL:
Tutorial.Functor> hdecode [Bool,Nat,Double] 1 "f,100,12.123"
Right [False, 100, 12.123]
Tutorial.Functor> hdecode [Name,Name,Gender] 3 "Idris,,f"
Left (FieldError 3 2 "")
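The trick used in hdecode, an erased explicit argument whose only purpose is to tell Idris which interface implementation to use, works for other interfaces as well. The following sketch applies it to Monoid from the Prelude; the name neutralOf is hypothetical and chosen just for this example:

```idris
-- Hypothetical helper: the erased argument only selects the Monoid
-- implementation. It has quantity 0 and is not available at runtime.
neutralOf : (0 a : Type) -> Monoid a => a
neutralOf _ = neutral
```

At the REPL, neutralOf String should evaluate to the empty string and neutralOf (List Nat) to the empty list, without any need for a type annotation via the.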
Applicative Laws
Again, Applicative
implementations must follow certain laws. Here they are:
- pure id <*> fa = fa: Lifting and applying the identity function has no visible effect.

- [| f . g |] <*> v = f <*> (g <*> v): It must not matter, whether we compose our functions first and then apply them, or whether we apply our functions first and then compose them.

  The above might be hard to understand, so here they are again with explicit types and implementations:

      compL : Maybe (b -> c) -> Maybe (a -> b) -> Maybe a -> Maybe c
      compL f g v = [| f . g |] <*> v

      compR : Maybe (b -> c) -> Maybe (a -> b) -> Maybe a -> Maybe c
      compR f g v = f <*> (g <*> v)

  The second applicative law states, that the two implementations compL and compR should behave identically.

- pure f <*> pure x = pure (f x). This is also called the homomorphism law. It should be pretty self-explanatory.

- f <*> pure v = pure ($ v) <*> f. This is called the law of interchange.

  This should again be explained with a concrete example:

      interL : Maybe (a -> b) -> a -> Maybe b
      interL f v = f <*> pure v

      interR : Maybe (a -> b) -> a -> Maybe b
      interR f v = pure ($ v) <*> f

  Note, that ($ v) has type (a -> b) -> b, so this is a function being applied to f, which has a function of type a -> b wrapped in a Maybe context.

  The law of interchange states that it must not matter whether we apply a pure value from the left or right of the apply operator.
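The four laws can be spot-checked on concrete values. The sketch below does this for Maybe with a handful of hypothetical helper values (inc, dbl, five, mf, and mg are made up for this example). Such a check is no proof, it merely shows both sides of each law agreeing for one choice of inputs:

```idris
inc : Nat -> Nat
inc = (+ 1)

dbl : Nat -> Nat
dbl = (* 2)

five : Maybe Nat
five = Just 5

mf : Maybe (Nat -> Nat)
mf = Just inc

mg : Maybe (Nat -> Nat)
mg = Just dbl

-- identity: pure id <*> fa = fa
identityHolds : Bool
identityHolds = (pure id <*> five) == five

-- composition: [| f . g |] <*> v = f <*> (g <*> v)
compositionHolds : Bool
compositionHolds = ([| mf . mg |] <*> five) == (mf <*> (mg <*> five))

-- homomorphism: pure f <*> pure x = pure (f x)
homomorphismHolds : Bool
homomorphismHolds = the (Maybe Nat) (pure inc <*> pure 5) == pure (inc 5)

-- interchange: f <*> pure v = pure ($ v) <*> f
interchangeHolds : Bool
interchangeHolds = (mf <*> pure 5) == (pure ($ 5) <*> mf)

allLawsHold : Bool
allLawsHold =
  identityHolds && compositionHolds && homomorphismHolds && interchangeHolds
```

Evaluating allLawsHold at the REPL should give True. Replacing five, mf, or mg with Nothing is a quick way to confirm that the laws also hold in the failure cases.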
Exercises part 2
1. Implement Applicative' for Either e and Identity.

2. Implement Applicative' for Vect n. Note: In order to implement pure, the length must be known at runtime. This can be done by passing it as an unerased implicit to the interface implementation:

       implementation {n : _} -> Applicative' (Vect n) where

3. Implement Applicative' for Pair e, with e having a Monoid constraint.

4. Implement Applicative for Const e, with e having a Monoid constraint.

5. Implement Applicative for Validated e, with e having a Semigroup constraint. This will allow us to use (<+>) to accumulate errors in case of two Invalid values in the implementation of apply.

6. Add an additional data constructor of type CSVError -> CSVError -> CSVError to CSVError and use this to implement Semigroup for CSVError.

7. Refactor our CSV-parsers and all related functions so that they return Validated instead of Either. This will only work, if you solved exercise 6.

   Two things to note: You will have to adjust very little of the existing code, as we can still use applicative syntax with Validated. Also, with this change, we enhanced our CSV-parsers with the ability of error accumulation. Here are some examples from a REPL session:

       Solutions.Functor> hdecode [Bool,Nat,Gender] 1 "t,12,f"
       Valid [True, 12, Female]
       Solutions.Functor> hdecode [Bool,Nat,Gender] 1 "o,-12,f"
       Invalid (App (FieldError 1 1 "o") (FieldError 1 2 "-12"))
       Solutions.Functor> hdecode [Bool,Nat,Gender] 1 "o,-12,foo"
       Invalid (App (FieldError 1 1 "o") (App (FieldError 1 2 "-12") (FieldError 1 3 "foo")))

   Behold the power of applicative functors and heterogeneous lists: With only a few lines of code we wrote a pure, type-safe, and total parser with error accumulation for lines in CSV-files, which is very convenient to use at the same time!

8. Since we introduced heterogeneous lists in this chapter, it would be a pity not to experiment with them a little.

   This exercise is meant to sharpen your skills in type wizardry. It therefore comes with very few hints. Try to decide yourself what behavior you'd expect from a given function, how to express this in the types, and how to implement it afterwards. If your types are correct and precise enough, the implementations will almost come for free. Don't give up too early if you get stuck. Only if you truly run out of ideas should you have a glance at the solutions (and then, only at the types at first!)

   1. Implement head for HList.

   2. Implement tail for HList.

   3. Implement (++) for HList.

   4. Implement index for HList. This might be harder than the other three. Go back and look how we implemented indexList in an earlier exercise and start from there.

   5. Package contrib, which is part of the Idris project, provides Data.HVect.HVect, a data type for heterogeneous vectors. The only difference to our own HList is, that HVect is indexed over a vector of types instead of a list of types. This makes it easier to express certain operations at the type level.

      Write your own implementation of HVect together with functions head, tail, (++), and index.

   6. For a real challenge, try implementing a function for transposing a Vect m (HVect ts). You'll first have to be creative about how to even express this in the types.

      Note: In order to implement this, you'll need to pattern match on an erased argument in at least one case to help Idris with type inference. Pattern matching on erased arguments is forbidden (they are erased after all, so we can't inspect them at runtime), unless the structure of the value being matched on can be derived from another, un-erased argument.

      Also, don't worry if you get stuck on this one. It took me several tries to figure it out. But I enjoyed the experience, so I just had to include it here. :-)

      Note, however, that such a function might be useful when working with CSV-files, as it allows us to convert a table represented as rows (a vector of tuples) to one represented as columns (a tuple of vectors).

9. Show, that the composition of two applicative functors is again an applicative functor by implementing Applicative for Comp f g.

10. Show, that the product of two applicative functors is again an applicative functor by implementing Applicative for Prod f g.
Monad
module Tutorial.Functor.Monad
import Tutorial.Functor.Functor
import Tutorial.Functor.Applicative
import Data.List1
import Data.String
import Data.Vect
%default total
Finally, Monad
. A lot of ink has been spilled about this one. However, after what we already saw in the chapter about IO
, there is not much left to discuss here. Monad
extends Applicative
and adds two new related functions: The bind operator ((>>=)
) and function join
. Here is its definition:
interface Applicative' m => Monad' m where
  bind  : m a -> (a -> m b) -> m b
  join' : m (m a) -> m a
Implementers of Monad
are free to choose to either implement (>>=)
or join
or both. You will show in an exercise, how join
can be implemented in terms of bind and vice versa.
The big difference between Monad
and Applicative
is, that the former allows a computation to depend on the result of an earlier computation. For instance, we could decide based on a string read from standard input whether to delete a file or play a song. The result of the first IO
action (reading some user input) will affect, which IO
action to run next. This is not possible with the apply operator:
(<*>) : IO (a -> b) -> IO a -> IO b
The two IO
actions have already been decided on when they are being passed as arguments to (<*>)
. The result of the first cannot - in the general case - affect which computation to run in the second. (Actually, with IO
this would theoretically be possible via side effects: The first action could write some command to a file or overwrite some mutable state, and the second action could read from that file or state, thus deciding on the next thing to do. But this is a speciality of IO
, not of applicative functors in general. If the functor in question was Maybe
, List
, or Vect n
, no such thing would be possible.)
Let's demonstrate the difference with an example. Assume we'd like to enhance our CSV-reader with the ability to decode a line of tokens to a sum type. For instance, we'd like to decode CRUD requests from the lines of a CSV-file:
data Crud : (i : Type) -> (a : Type) -> Type where
  Create : (value : a) -> Crud i a
  Update : (id : i) -> (value : a) -> Crud i a
  Read   : (id : i) -> Crud i a
  Delete : (id : i) -> Crud i a
We need a way to decide, on each line, which data constructor to use for decoding. One way to do this is to put the name of the data constructor (or some other identifying tag) in the first column of the CSV-file:
hlift : (a -> b) -> HList [a] -> b
hlift f [x] = f x

hlift2 : (a -> b -> c) -> HList [a,b] -> c
hlift2 f [x,y] = f x y

decodeCRUD :  CSVField i
           => CSVField a
           => (line : Nat)
           -> (s : String)
           -> Either CSVError (Crud i a)
decodeCRUD l s =
  let h ::: t = split (',' ==) s
   in do
     MkName n <- readField l 1 h
     case n of
       "Create" => hlift  Create <$> decodeAt l 2 t
       "Update" => hlift2 Update <$> decodeAt l 2 t
       "Read"   => hlift  Read   <$> decodeAt l 2 t
       "Delete" => hlift  Delete <$> decodeAt l 2 t
       _        => Left (FieldError l 1 n)
I added two utility functions to help with type inference and to get slightly nicer syntax. The important thing to note is how we pattern match on the result of the first parsing function to decide on the data constructor and thus the next parsing function to use.
Here's how this works at the REPL:
Tutorial.Functor> decodeCRUD {i = Nat} {a = Email} 1 "Create,jon@doe.ch"
Right (Create (MkEmail "jon@doe.ch"))
Tutorial.Functor> decodeCRUD {i = Nat} {a = Email} 1 "Update,12,jane@doe.ch"
Right (Update 12 (MkEmail "jane@doe.ch"))
Tutorial.Functor> decodeCRUD {i = Nat} {a = Email} 1 "Delete,jon@doe.ch"
Left (FieldError 1 2 "jon@doe.ch")
To conclude, Monad
, unlike Applicative
, allows us to chain computations sequentially, where intermediary results can affect the behavior of later computations. So, if you have n unrelated effectful computations and want to combine them under a pure, n-ary function, Applicative
will be sufficient. If, however, you want to decide based on the result of an effectful computation what computation to run next, you need a Monad
.
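The distinction can be seen in miniature with Maybe and a lookup table. The following sketch uses lookup from Data.List; the table managers and both functions are hypothetical and serve only as an illustration:

```idris
import Data.List

-- Hypothetical table mapping each person to their manager.
managers : List (String, String)
managers = [("alice", "bob"), ("bob", "carol")]

-- Applicative: the two lookups are unrelated. Both keys are known up front,
-- and the results are combined with a pure binary function (MkPair).
bothManagers : String -> String -> Maybe (String, String)
bothManagers x y = [| MkPair (lookup x managers) (lookup y managers) |]

-- Monad: the key of the second lookup is the result of the first one.
grandManager : String -> Maybe String
grandManager name = do
  boss <- lookup name managers
  lookup boss managers
```

Here, grandManager "alice" evaluates to Just "carol", while grandManager "bob" evaluates to Nothing, because "carol" has no entry in the table. A function like grandManager cannot be written with pure and (<*>) alone, since the second lookup needs the value produced by the first.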
Note, however, that Monad
has one important drawback compared to Applicative
: In general, monads don't compose. For instance, there is no Monad
instance for Either e . IO
. We will later learn about monad transformers, which can be composed with other monads.
Monad Laws
Without further ado, here are the laws for Monad
:
- ma >>= pure = ma and pure v >>= f = f v. These are monad's identity laws. Here they are as concrete examples:

      id1L : Maybe a -> Maybe a
      id1L ma = ma >>= pure

      id2L : a -> (a -> Maybe b) -> Maybe b
      id2L v f = pure v >>= f

      id2R : a -> (a -> Maybe b) -> Maybe b
      id2R v f = f v

  These two laws state that pure should behave neutrally w.r.t. bind.

- (m >>= f) >>= g = m >>= (f >=> g). This is the law of associativity for monad. You might not have seen the second operator (>=>). It can be used to sequence effectful computations and has the following type:

      Tutorial.Functor> :t (>=>)
      Prelude.>=> : Monad m => (a -> m b) -> (b -> m c) -> a -> m c
The above are the official monad laws. However, we need to consider a third one, given that in Idris (and Haskell) Monad
extends Applicative
: As (<*>)
can be implemented in terms of (>>=)
, the actual implementation of (<*>)
must behave the same as the implementation in terms of (>>=)
:
mf <*> ma = mf >>= (\fun => map (fun $) ma)
.
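As with the applicative laws, the identity and associativity laws can be spot-checked on a few concrete values. The following sketch does so for Maybe; safePred and smallOnly are hypothetical helper functions invented for this example, and testing a handful of inputs is of course no substitute for a proof:

```idris
-- Hypothetical effectful functions used to exercise the laws.
safePred : Nat -> Maybe Nat
safePred 0     = Nothing
safePred (S k) = Just k

smallOnly : Nat -> Maybe Nat
smallOnly n = if n < 10 then Just n else Nothing

-- left identity: pure v >>= f = f v
leftIdentity : Nat -> Bool
leftIdentity v = (pure v >>= safePred) == safePred v

-- right identity: ma >>= pure = ma
rightIdentity : Maybe Nat -> Bool
rightIdentity ma = (ma >>= pure) == ma

-- associativity: (m >>= f) >>= g = m >>= (f >=> g)
associativity : Maybe Nat -> Bool
associativity m =
  ((m >>= safePred) >>= smallOnly) == (m >>= (safePred >=> smallOnly))

lawsHold : Bool
lawsHold =
     all leftIdentity [0, 1, 20]
  && all rightIdentity [Nothing, Just 3]
  && all associativity [Nothing, Just 0, Just 5, Just 20]
```

Evaluating lawsHold at the REPL should give True.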
Exercises part 3
1. Applicative extends Functor, because every Applicative is also a Functor. Prove this by implementing map in terms of pure and (<*>).

2. Monad extends Applicative, because every Monad is also an Applicative. Prove this by implementing (<*>) in terms of (>>=) and pure.

3. Implement (>>=) in terms of join and other functions in the Monad hierarchy.

4. Implement join in terms of (>>=) and other functions in the Monad hierarchy.

5. There is no lawful Monad implementation for Validated e. Why?

6. In this slightly extended exercise, we are going to simulate CRUD operations on a data store. We will use a mutable reference (imported from Data.IORef from the base library) holding a list of Users paired with a unique ID of type Nat as our user database:

       DB : Type
       DB = IORef (List (Nat,User))

   Most operations on a database come with a risk of failure: When we try to update or delete a user, the entry in question might no longer be there. When we add a new user, a user with the given email address might already exist. Here is a custom error type to deal with this:

       data DBError : Type where
         UserExists        : Email -> Nat -> DBError
         UserNotFound      : Nat -> DBError
         SizeLimitExceeded : DBError

   In general, our functions will therefore have a type similar to the following:

       someDBProg : arg1 -> arg2 -> DB -> IO (Either DBError a)

   We'd like to abstract over this, by introducing a new wrapper type:

       record Prog a where
         constructor MkProg
         runProg : DB -> IO (Either DBError a)

   We are now ready to write some utility functions. Make sure to follow these business rules when implementing the functions below:

   - Email addresses in the DB must be unique. (Consider implementing Eq Email to verify this).

   - The size limit of 1000 entries must not be exceeded.

   - Operations trying to look up a user by their ID must fail with UserNotFound in case no entry was found in the DB.

   You'll need the following functions from Data.IORef when working with mutable references: newIORef, readIORef, and writeIORef. In addition, functions Data.List.lookup and Data.List.find might be useful to implement some of the functions below.

   1. Implement interfaces Functor, Applicative, and Monad for Prog.

   2. Implement interface HasIO for Prog.

   3. Implement the following utility functions:

          throw : DBError -> Prog a

          getUsers : Prog (List (Nat,User))

          -- check the size limit!
          putUsers : List (Nat,User) -> Prog ()

          -- implement this in terms of `getUsers` and `putUsers`
          modifyDB : (List (Nat,User) -> List (Nat,User)) -> Prog ()

   4. Implement function lookupUser. This should fail with an appropriate error, if a user with the given ID cannot be found.

          lookupUser : (id : Nat) -> Prog User

   5. Implement function deleteUser. This should fail with an appropriate error, if a user with the given ID cannot be found. Make use of lookupUser in your implementation.

          deleteUser : (id : Nat) -> Prog ()

   6. Implement function addUser. This should fail, if a user with the given Email already exists, or if the database's size limit of 1000 entries is exceeded. In addition, this should create and return a unique ID for the new user entry.

          addUser : (new : User) -> Prog Nat

   7. Implement function updateUser. This should fail, if the user in question cannot be found or a user with the updated user's Email already exists. The returned value should be the updated user.

          updateUser : (id : Nat) -> (mod : User -> User) -> Prog User

   8. Data type Prog is actually too specific. We could just as well abstract over the error type and the DB environment:

          record Prog' env err a where
            constructor MkProg'
            runProg' : env -> IO (Either err a)

      Verify, that all interface implementations you wrote for Prog can be used verbatim to implement the same interfaces for Prog' env err. The same goes for throw with only a slight adjustment in the function's type.
Background and further Reading
Concepts like functor and monad have their origin in category theory, a branch of mathematics. That is also where their laws come from. Category theory was found to have applications in programming language theory, especially functional programming. It is a highly abstract topic, but there is a pretty accessible introduction for programmers, written by Bartosz Milewski.
The usefulness of applicative functors as a middle ground between functor and monad was discovered several years after monads had already been in use in Haskell. They were introduced in the article Applicative Programming with Effects, which is freely available online and a highly recommended read.
Conclusion
- Interfaces Functor, Applicative, and Monad abstract over programming patterns that come up when working with type constructors of type Type -> Type. Such data types are also referred to as values in a context, or effectful computations.
- Functor allows us to map over values in a context without affecting the context's underlying structure.
- Applicative allows us to apply n-ary functions to n effectful computations and to lift pure values into a context.
- Monad allows us to chain effectful computations, where the intermediary results can affect which computation to run further down the chain.
- Unlike Monad, Functor and Applicative compose: The product and composition of two functors or applicatives are again functors or applicatives, respectively.
- Idris provides syntactic sugar for working with some of the interfaces presented here: Idiom brackets for Applicative, do blocks and the bang operator for Monad.
What's next?
In the next chapter we get to learn more about recursion, totality checking, and an interface for collapsing container types: Foldable
.
Recursion and Folds
In this chapter, we are going to have a closer look at the computations we typically perform with container types: Parameterized data types like List
, Maybe
, or Identity
, holding zero or more values of the parameter's type. Many of these functions are recursive in nature, so we start with a discourse about recursion in general, and tail recursion as an important optimization technique in particular. Most recursive functions in this part will describe pure iterations over lists.
It is recursive functions, for which totality is hard to determine, so we will next have a quick look at the totality checker and learn, when it will refuse to accept a function as being total and what to do about this.
Finally, we will start looking for common patterns in the recursive functions from the first part and will eventually introduce a new interface for consuming container types: Interface Foldable
.
Recursion
module Tutorial.Folds.Recursion
import Data.List1
import Data.Maybe
import Data.Vect
import Debug.Trace
%default total
In this section, we are going to have a closer look at recursion in general and at tail recursion in particular.
Recursive functions are functions, which call themselves to repeat a task or calculation until a certain aborting condition (called the base case) holds. Please note, that it is recursive functions, which make it hard to verify totality: Non-recursive functions, which are covering (they cover all possible cases in their pattern matches) are automatically total if they only invoke other total functions.
Here is an example of a recursive function: It generates a list of the given length filling it with identical values:
replicateList : Nat -> a -> List a
replicateList 0 _ = []
replicateList (S k) x = x :: replicateList k x
As you can see (this module has the %default total
pragma at the top), this function is provably total. Idris verifies, that the Nat
argument gets strictly smaller in each recursive call, and that therefore, the function must eventually come to an end. Of course, we can do the same thing for Vect
, where we can even show that the length of the resulting vector matches the given natural number:
replicateVect : (n : Nat) -> a -> Vect n a
replicateVect 0 _ = []
replicateVect (S k) x = x :: replicateVect k x
While we often use recursion to create values of data types like List
or Vect
, we also use recursion, when we consume such values. For instance, here is a function for calculating the length of a list:
len : List a -> Nat
len [] = 0
len (_ :: xs) = 1 + len xs
Again, Idris can verify that len
is total, as the list we pass in the recursive case is strictly smaller than the original list argument.
But when is a recursive function non-total? Here is an example: The following function creates a sequence of values until the given generation function (gen
) returns a Nothing
. Note, how we use a state value (of generic type s
) and use gen
to calculate a value together with the next state:
covering
unfold : (gen : s -> Maybe (s,a)) -> s -> List a
unfold gen vs = case gen vs of
  Just (vs',va) => va :: unfold gen vs'
  Nothing       => []
With unfold
, Idris can't verify that any of its arguments is converging towards the base case. It therefore rightfully refuses to accept that unfold
is total. And indeed, the following function produces an infinite list (so please, don't try to inspect this at the REPL, as doing so will consume all your computer's memory):
fiboHelper : (Nat,Nat) -> ((Nat,Nat),Nat)
fiboHelper (f0,f1) = ((f1, f0 + f1), f0)
covering
fibonacci : List Nat
fibonacci = unfold (Just . fiboHelper) (1,1)
In order to safely create a (finite) sequence of Fibonacci numbers, we need to make sure the function generating the sequence will stop after a finite number of steps, for instance by limiting the length of the list:
unfoldTot : Nat -> (gen : s -> Maybe (s,a)) -> s -> List a
unfoldTot 0     _   _  = []
unfoldTot (S k) gen vs = case gen vs of
  Just (vs',va) => va :: unfoldTot k gen vs'
  Nothing       => []

fibonacciN : Nat -> List Nat
fibonacciN n = unfoldTot n (Just . fiboHelper) (1,1)
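Note that unfoldTot can stop for two different reasons: Either the Nat argument (often called fuel) reaches zero, or the generator returns Nothing. The following sketch shows the second case. The names countdownStep and countdown are hypothetical, and unfoldTot is repeated so that the block stands on its own:

```idris
%default total

unfoldTot : Nat -> (gen : s -> Maybe (s,a)) -> s -> List a
unfoldTot 0     _   _  = []
unfoldTot (S k) gen vs = case gen vs of
  Just (vs',va) => va :: unfoldTot k gen vs'
  Nothing       => []

-- Hypothetical generator: emits the current state and decrements it,
-- returning Nothing once the state has reached zero.
countdownStep : Nat -> Maybe (Nat, Nat)
countdownStep 0     = Nothing
countdownStep (S k) = Just (k, S k)

-- S n units of fuel are always enough, so the generator ends the list.
countdown : Nat -> List Nat
countdown n = unfoldTot (S n) countdownStep n
```

Here, countdown 3 evaluates to [3, 2, 1], with the generator ending the sequence. In contrast, unfoldTot 2 countdownStep 5 runs out of fuel first and evaluates to [5, 4].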
The Call Stack
In order to demonstrate what tail recursion is about, we require the following main
function:
main : IO ()
main = printLn . len $ replicateList 10000 10
If you have Node.js installed on your system, you might try the following experiment. Compile and run this module using the Node.js backend of Idris instead of the default Chez Scheme backend and run the resulting JavaScript source file with the Node.js binary:
idris2 --cg node -o test.js --find-ipkg src/Tutorial/Folds/Recursion.md
node build/exec/test.js
Node.js will fail with the following error message and a lengthy stack trace: RangeError: Maximum call stack size exceeded
. What's going on here? How can it be that main
fails with an exception although it is provably total?
First, remember that a function being total means that it will eventually produce a value of the given type in a finite amount of time, given enough resources like computer memory. Here, main
hasn't been given enough resources as Node.js has a very small size limit on its call stack. The call stack can be thought of as a stack data structure (first in, last out), where nested function calls are put. In case of recursive functions, the stack size increases by one with every recursive function call. In case of our main
function, we create and consume a list of length 10'000, so the call stack will hold at least 10'000 function calls before they are being invoked and the stack's size is reduced again. This exceeds Node.js's stack size limit by far, hence the overflow error.
Now, before we look at a solution how to circumvent this issue, please note that this is a very serious and limiting source of bugs when using the JavaScript backends of Idris. In Idris, having no access to control structures like for
or while
loops, we always have to resort to recursion in order to describe iterative computations. Luckily (or should I say "unfortunately", since otherwise this issue would already have been addressed with all seriousness), the Scheme backends don't have this issue, as their stack size limit is much larger and they perform all kinds of optimizations internally to prevent the call stack from overflowing.
Tail Recursion
A recursive function is said to be tail recursive, if all recursive calls occur at tail position: The last function call in a (sub)expression. For instance, the following version of len
is tail recursive:
lenOnto : Nat -> List a -> Nat
lenOnto k [] = k
lenOnto k (_ :: xs) = lenOnto (k + 1) xs
Compare this to len
as defined above: There, the last function call is an invocation of operator (+)
, and the recursive call happens in one of its arguments:
len (_ :: xs) = 1 + len xs
We can use lenOnto
as a utility to implement a tail recursive version of len
without the additional Nat
argument:
lenTR : List a -> Nat
lenTR = lenOnto 0
This is a common pattern when writing tail recursive functions: We typically add an additional function argument for accumulating intermediary results, which is then passed on explicitly at each recursive call. For instance, here is a tail recursive version of replicateList
:
replicateListTR : Nat -> a -> List a
replicateListTR n v = go Nil n
  where go : List a -> Nat -> List a
        go xs 0     = xs
        go xs (S k) = go (v :: xs) k
The big advantage of tail recursive functions is, that they can be easily converted to efficient, imperative loops by the Idris compiler, and are thus stack safe: Recursive function calls are not added to the call stack, thus avoiding the dreaded stack overflow errors.
main1 : IO ()
main1 = printLn . lenTR $ replicateListTR 10000 10
We can again run main1
using the Node.js backend. This time, we use slightly different syntax to execute a function other than main
(Remember: The dollar prefix is only there to distinguish a terminal command from its output. It is not part of the command you enter in a terminal session.):
$ idris2 --cg node --exec main1 --find-ipkg src/Tutorial/Folds/Recursion.md
10000
As you can see, this time the computation finished without overflowing the call stack.
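The accumulator pattern carries over to other list traversals. As a further sketch (the names sumOnto and sumTR are hypothetical and follow the naming of lenOnto and lenTR), here is a tail recursive function for summing up a list of natural numbers:

```idris
%default total

-- The first argument accumulates the sum of the values seen so far.
-- The recursive call is the last thing happening in the second clause,
-- so it is in tail position.
sumOnto : Nat -> List Nat -> Nat
sumOnto acc []        = acc
sumOnto acc (x :: xs) = sumOnto (acc + x) xs

sumTR : List Nat -> Nat
sumTR = sumOnto 0
```

For instance, sumTR [1, 2, 3] evaluates to 6. Compare this with a naive version whose recursive clause reads x + sum xs: There, the addition is the last function call, so the recursive call is not in tail position.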
Tail recursive functions are allowed to consist of (possibly nested) pattern matches, with recursive calls at tail position in several of the branches. Here is an example:
countTR : (a -> Bool) -> List a -> Nat
countTR p = go 0
  where go : Nat -> List a -> Nat
        go k []        = k
        go k (x :: xs) = case p x of
          True  => go (S k) xs
          False => go k xs
Note, how each invocation of go
is in tail position in its branch of the case expression.
Mutual Recursion
It is sometimes convenient to implement several related functions, which call each other recursively. In Idris, unlike in many other programming languages, a function must be declared in a source file before it can be called by other functions, as in general a function's implementation must be available during type checking (because Idris has dependent types). There are two ways around this, which actually result in the same internal representation in the compiler. Our first option is to write down the functions' declarations first with the implementations following after. Here's a silly example:
even : Nat -> Bool
odd : Nat -> Bool
even 0 = True
even (S k) = odd k
odd 0 = False
odd (S k) = even k
As you can see, function even
is allowed to call function odd
in its implementation, since odd
has already been declared (but not yet implemented).
If you're like me and want to keep declarations and implementations next to each other, you can introduce a mutual
block, which has the same effect. Like with other code blocks, functions in a mutual
block must all be indented by the same amount of whitespace:
mutual
  even' : Nat -> Bool
  even' 0     = True
  even' (S k) = odd' k

  odd' : Nat -> Bool
  odd' 0     = False
  odd' (S k) = even' k
Just like with single recursive functions, mutually recursive functions can be optimized to imperative loops if all recursive calls occur at tail position. This is the case with functions even
and odd
, as can again be verified at the Node.js backend:
main2 : IO ()
main2 =  printLn (even 100000)
      >> printLn (odd 100000)
$ idris2 --cg node --exec main2 --find-ipkg src/Tutorial/Folds/Recursion.md
True
False
Final Remarks
In this section, we learned about several important aspects of recursion and totality checking, which are summarized here:
- In pure functional programming, recursion is the way to implement iterative procedures.
- Recursive functions pass the totality checker, if it can verify that one of the arguments is getting strictly smaller in every recursive function call.
- Arbitrary recursion can lead to stack overflow exceptions on backends with small stack size limits.
- The JavaScript backends of Idris perform mutual tail call optimization: Tail recursive functions are converted to stack safe, imperative loops.
Note, that not all Idris backends you will come across in the wild will perform tail call optimization. Please check the corresponding documentation.
Note also that most recursive functions in the core libraries (prelude and base) do not yet make use of tail recursion. There is an important reason for this: In many cases, non-tail recursive functions are easier to use in compile-time proofs, as they unify more naturally than their tail recursive counterparts. Compile-time proofs are an important aspect of programming in Idris (as we will see in later chapters), so there is a compromise to be made between what performs well at runtime and what works well at compile time. Eventually, the way to go might be to provide two implementations for most recursive functions with a transform rule telling the compiler to use the optimized version at runtime whenever programmers use the non-optimized version in their code. Such transform rules have - for instance - already been written for functions pack and unpack (which use fastPack and fastUnpack at runtime; see the corresponding rules in the following source file).
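To make the idea of a transform rule more concrete, here is a minimal sketch. The functions sumSimple and sumTR as well as the rule name "sumTR" are hypothetical and only serve as an illustration; the %transform pragma is assumed to behave like the rules written for pack and unpack, replacing the left-hand side with the right-hand side in compiled code.

```idris
-- Readable, non-tail recursive definition: convenient in proofs.
sumSimple : List Nat -> Nat
sumSimple []        = 0
sumSimple (x :: xs) = x + sumSimple xs

-- Tail recursive counterpart: stack safe on backends with
-- tail call optimization.
sumTR : List Nat -> Nat
sumTR = go 0
  where go : Nat -> List Nat -> Nat
        go acc []        = acc
        go acc (x :: xs) = go (acc + x) xs

-- Hypothetical rule: use `sumTR` at runtime wherever
-- `sumSimple` appears in the code.
%transform "sumTR" sumSimple = sumTR
```

At compile time, proofs still see the simple definition, so both versions must compute the same results for the rule to be safe. Nothing checks this for us.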
Exercises part 1
In these exercises you are going to implement several recursive functions. Make sure to use tail recursion whenever possible and quickly verify the correct behavior of all functions at the REPL.
1. Implement functions anyList and allList, which return True if any element (or all elements in case of allList) in a list fulfills the given predicate:
   anyList : (a -> Bool) -> List a -> Bool
   allList : (a -> Bool) -> List a -> Bool
2. Implement function findList, which returns the first value (if any) fulfilling the given predicate:
   findList : (a -> Bool) -> List a -> Maybe a
3. Implement function collectList, which returns the first value (if any) for which the given function returns a Just:
   collectList : (a -> Maybe b) -> List a -> Maybe b
   Implement lookupList in terms of collectList:
   lookupList : Eq a => a -> List (a,b) -> Maybe b
4. For functions like map or filter, which must loop over a list without affecting the order of elements, it is harder to write a tail recursive implementation. The safest way to do so is by using a SnocList (a reverse kind of list that's built from head to tail instead of from tail to head) to accumulate intermediate results. Its two constructors are Lin and (:<) (called the snoc operator). Module Data.SnocList exports two tail recursive operators called fish and chips ((<><) and (<>>)) for going from SnocList to List and vice versa. Have a look at the types of all new data constructors and operators before continuing with the exercise.
   Implement a tail recursive version of map for List by using a SnocList to reassemble the mapped list. Then use the chips operator with a Nil argument to convert the SnocList back to a List at the end.
   mapTR : (a -> b) -> List a -> List b
5. Implement a tail recursive version of filter, which only keeps those values in a list that fulfill the given predicate. Use the same technique as described in exercise 4.
   filterTR : (a -> Bool) -> List a -> List a
6. Implement a tail recursive version of mapMaybe, which only keeps those values in a list for which the given function argument returns a Just:
   mapMaybeTR : (a -> Maybe b) -> List a -> List b
   Implement catMaybesTR in terms of mapMaybeTR:
   catMaybesTR : List (Maybe a) -> List a
7. Implement a tail recursive version of list concatenation:
   concatTR : List a -> List a -> List a
8. Implement tail recursive versions of bind and join for List:
   bindTR : List a -> (a -> List b) -> List b
   joinTR : List (List a) -> List a
Notes on Totality Checking
module Tutorial.Folds.Totality
%default total
The totality checker in Idris verifies that at least one (possibly erased!) argument in a recursive call converges towards a base case. For instance, with natural numbers, if the base case is zero (corresponding to data constructor Z), and we continue with k after pattern matching on S k, Idris can derive from Nat's constructors that k is strictly smaller than S k and therefore the recursive call must converge towards a base case. Exactly the same reasoning is used when pattern matching on a list and continuing only with its tail in the recursive call.
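As a small illustration of this reasoning, the following sketch shows two functions the totality checker accepts. Both countDown and everyOther are hypothetical names made up for this example.

```idris
%default total

-- Accepted: the recursive call uses `k`, which is strictly
-- smaller than the matched pattern `S k`.
countDown : Nat -> List Nat
countDown 0     = [0]
countDown (S k) = S k :: countDown k

-- Accepted: the recursive call continues with `xs`, which is a
-- strict sublist of the matched pattern `x :: _ :: xs`.
everyOther : List a -> List a
everyOther (x :: _ :: xs) = x :: everyOther xs
everyOther xs             = xs
```

In both cases, the argument of the recursive call was obtained by pattern matching, so Idris can see that it is structurally smaller than the original argument.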
While this works in many cases, it doesn't always go as expected. Below, I'll show you a couple of examples where totality checking fails, although we know that the functions in question are definitely total.
Case 1: Recursion over a Primitive
Idris doesn't know anything about the internal structure of primitive data types. So the following function, although being obviously total, will not be accepted by the totality checker:
covering
replicatePrim : Bits32 -> a -> List a
replicatePrim 0 v = []
replicatePrim x v = v :: replicatePrim (x - 1) v
Unlike with natural numbers (Nat), which are defined as an inductive data type and are only converted to integer primitives during compilation, Idris can't tell here that x - 1 is strictly smaller than x, and so it fails to verify that this must converge towards the base case. (The reason is that x - 1 is implemented in terms of primitive function prim__sub_Bits32, which is built into the compiler and must be implemented by each backend individually. The totality checker knows about data types, constructors, and functions defined in Idris, but not about (primitive) functions and foreign functions implemented at the backends. While it is theoretically possible to also define and use laws for primitive and foreign functions, this hasn't yet been done for most of them.)
Since non-totality is highly contagious (all functions invoking a partial function are themselves considered to be partial by the totality checker), there is a utility function called assert_smaller, which we can use to convince the totality checker and still annotate our functions with the total keyword:
replicatePrim' : Bits32 -> a -> List a
replicatePrim' 0 v = []
replicatePrim' x v = v :: replicatePrim' (assert_smaller x $ x - 1) v
Please note, though, that whenever you use assert_smaller to silence the totality checker, the burden of proving totality rests on your shoulders. Failing to do so can lead to arbitrary and unpredictable program behavior (which is the default with most other programming languages).
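In this particular case, there is a way to avoid assert_smaller altogether, sketched below: Convert the primitive to a Nat once and let a structurally recursive function do the rest. The name replicateViaNat is hypothetical; the sketch assumes the Cast Bits32 Nat implementation from the Prelude and function replicate from Data.List.

```idris
import Data.List

%default total

-- Convert the primitive to a `Nat` once, then let `replicate`
-- (from `Data.List`) recurse structurally over that `Nat`.
replicateViaNat : Bits32 -> a -> List a
replicateViaNat n v = replicate (cast n) v
```

This is not always an option (the conversion has a cost, and not every recursion over a primitive maps that easily to an inductive type), but when it is, totality follows without any assertions.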
Ex Falso Quodlibet
Below - as a demonstration - is a simple proof of Void. Void is an uninhabited type: a type with no values. Proving Void means that we implement a function accepted by the totality checker, which returns a value of type Void, although this is supposed to be impossible as there is no such value. Doing so allows us to completely disable the type system together with all the guarantees it provides. Here's the code and its dire consequences:
-- In order to prove `Void`, we just loop forever, using
-- `assert_smaller` to silence the totality checker.
proofOfVoid : Bits8 -> Void
proofOfVoid n = proofOfVoid (assert_smaller n n)
-- From a value of type `Void`, anything follows!
-- This function is safe and total, as there is no
-- value of type `Void`!
exFalsoQuodlibet : Void -> a
exFalsoQuodlibet _ impossible
-- By passing our proof of void to `exFalsoQuodlibet`
-- (exported by the *Prelude* by the name of `void`), we
-- can coerce any value to a value of any other type.
-- This renders type checking completely useless, as
-- we can freely convert between values of different
-- types.
coerce : a -> b
coerce _ = exFalsoQuodlibet (proofOfVoid 0)
-- Finally, we invoke `putStrLn` with a number instead
-- of a string. `coerce` allows us to do just that.
pain : IO ()
pain = putStrLn $ coerce 0
Please take a moment to marvel at provably total function coerce: It claims to convert any value to a value of any other type. And it is completely safe, as it only uses total functions in its implementation. The problem is - of course - that proofOfVoid should never ever have been a total function.
In pain we use coerce to conjure a string from an integer. In the end, we get what we deserve: The program crashes with an error. While things could have been much worse, it can still be quite time consuming and annoying to locate the source of such an error.
$ idris2 --cg node --exec pain --find-ipkg src/Tutorial/Folds.md
ERROR: No clauses
So, with a single thoughtless placement of assert_smaller we wrought havoc within our pure and total codebase, sacrificing totality and type safety in one fell swoop. Therefore: Use at your own risk!
Note: I do not expect you to understand all the dark magic at work in the code above. I'll explain the details in due time in another chapter.
Second note: Ex falso quodlibet, also called the principle of explosion, is a law in logic: From a contradiction, any statement can be proven. In our case, the contradiction was our proof of Void: The claim that we wrote a total function producing such a value, although Void is an uninhabited type. You can verify this by inspecting Void at the REPL with :doc Void: It has no data constructors.
Case 2: Recursion via Function Calls
Below is an implementation of a rose tree. Rose trees can represent search paths in computer algorithms, for instance in graph theory.
record Tree a where
  constructor Node
  value  : a
  forest : List (Tree a)
Forest : Type -> Type
Forest = List . Tree
We could try and compute the size of such a tree as follows:
covering
size : Tree a -> Nat
size (Node _ forest) = S . sum $ map size forest
In the code above, the recursive call happens within map. We know that we are using only subtrees in the recursive calls (since we know how map is implemented for List), but Idris can't know this (teaching a totality checker how to figure this out on its own seems to be an open research question). So it will refuse to accept the function as being total.
There are two ways to handle the case above. If we don't mind writing a bit of otherwise unneeded boilerplate code, we can use explicit recursion. In fact, since we often also work with search forests, this is the preferable way here.
mutual
  treeSize : Tree a -> Nat
  treeSize (Node _ forest) = S $ forestSize forest
  forestSize : Forest a -> Nat
  forestSize [] = 0
  forestSize (x :: xs) = treeSize x + forestSize xs
In the case above, Idris can verify that we don't blow up our trees behind its back as we are explicit about what happens in each recursive step. This is the safe, preferable way of going about this, especially if you are new to the language and totality checking in general.
However, sometimes the solution presented above is just too cumbersome to write. For instance, here is an implementation of Show for rose trees:
Show a => Show (Tree a) where
  showPrec p (Node v ts) =
    assert_total $ showCon p "Node" (showArg v ++ showArg ts)
In this case, we'd have to manually reimplement Show for lists of trees: A tedious task - and error-prone on its own. Instead, we resort to using the mighty sledgehammer of totality checking: assert_total. Needless to say, this comes with the same risks as assert_smaller, so be very careful.
Exercises part 2
Implement the following functions in a provably total way without "cheating". Note: It is not necessary to implement these in a tail recursive way.
1. Implement function depth for rose trees. This should return the maximal number of Node constructors from the current node to the farthest child node. For instance, the current node should be at depth one, all its direct child nodes are at depth two, their immediate child nodes at depth three and so on.
2. Implement interface Eq for rose trees.
3. Implement interface Functor for rose trees.
4. For the fun of it: Implement interface Show for rose trees.
5. In order not to forget how to program with dependent types, implement function treeToVect for converting a rose tree to a vector of the correct size.
   Hint: Make sure to follow the same recursion scheme as in the implementation of treeSize. Otherwise, this might be very hard to get to work.
Interface Foldable
module Tutorial.Folds.Foldable
import Debug.Trace
%default total
When looking back at all the exercises we solved in the section about recursion, most tail recursive functions on lists were of the following pattern: Iterate over all list elements from head to tail while passing along some state for accumulating intermediate results. At the end of the list, return the final state or convert it with an additional function call.
Left Folds
This is functional programming, and we'd like to abstract over such recurring patterns. In order to tail recursively iterate over a list, all we need is an accumulator function and some initial state. But what should be the type of the accumulator? Well, it combines the current state with the list's next element and returns an updated state: state -> elem -> state. Surely, we can come up with a higher-order function to encapsulate this behavior:
leftFold : (acc : state -> el -> state) -> (st : state) -> List el -> state
leftFold _ st [] = st
leftFold acc st (x :: xs) = leftFold acc (acc st x) xs
We call this function a left fold, as it iterates over the list from left to right (head to tail), collapsing (or folding) the list until just a single value remains. This new value might still be a list or other container type, but the original list has been consumed from head to tail. Note how leftFold is tail recursive, and therefore all functions implemented in terms of leftFold are tail recursive (and thus, stack safe!) as well.
Here are a few examples:
sumLF : Num a => List a -> a
sumLF = leftFold (+) 0
reverseLF : List a -> List a
reverseLF = leftFold (flip (::)) Nil
-- this is more natural than `reverseLF`!
toSnocListLF : List a -> SnocList a
toSnocListLF = leftFold (:<) Lin
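To see how the state is threaded through the iteration, here is one more small sketch together with an evaluation trace. The definition of leftFold is repeated from above to keep the example self-contained; lengthLF is a hypothetical name.

```idris
leftFold : (acc : state -> el -> state) -> (st : state) -> List el -> state
leftFold _   st []        = st
leftFold acc st (x :: xs) = leftFold acc (acc st x) xs

-- Count the elements by ignoring them and incrementing the state.
lengthLF : List a -> Nat
lengthLF = leftFold (\n, _ => S n) 0

-- Evaluation trace for `lengthLF [7, 8, 9]`, writing `f` for
-- the accumulator function:
--   leftFold f 0 [7, 8, 9]
--   leftFold f 1 [8, 9]
--   leftFold f 2 [9]
--   leftFold f 3 []
--   3
```

Every step is a single tail call with an updated state, which is exactly what allows a backend to turn the fold into a loop.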
Right Folds
The example functions we implemented in terms of leftFold had to always completely traverse the whole list, as every single element was required to compute the result. This is not always necessary, however. For instance, if you look at findList from the exercises, we could abort iterating over the list as soon as our search was successful. It is not possible to implement this more efficient behavior in terms of leftFold: There, the result will only be returned when our pattern match reaches the Nil case.
Interestingly, there is another, non-tail recursive fold, which reflects the list structure more naturally and which we can use for breaking out early from an iteration. We call this a right fold. Here is its implementation:
rightFold : (acc : el -> state -> state) -> state -> List el -> state
rightFold acc st [] = st
rightFold acc st (x :: xs) = acc x (rightFold acc st xs)
Now, it might not immediately be obvious how this differs from leftFold. In order to see this, we will have to talk about lazy evaluation first.
Lazy Evaluation in Idris
For some computations, it is not necessary to evaluate all function arguments in order to return a result. For instance, consider boolean operator (&&): If the first argument evaluates to False, we already know that the result is False without even looking at the second argument. In such a case, we don't want to unnecessarily evaluate the second argument, as this might include a lengthy computation.
Consider the following REPL session:
Tutorial.Folds> False && (length [1..10000000000] > 100)
False
If the second argument were evaluated, this computation would most certainly blow up your computer's memory, or at least take a very long time to run to completion. However, in this case, the result False is printed immediately. If you look at the type of (&&), you'll see the following:
Tutorial.Folds> :t (&&)
Prelude.&& : Bool -> Lazy Bool -> Bool
As you can see, the second argument is wrapped in a Lazy type constructor. This is a built-in type, and the details are handled by Idris automatically most of the time. For instance, when passing arguments to (&&), we don't have to manually wrap the values in some data constructor. A lazy function argument will only be evaluated at the moment it is required in the function's implementation, for instance, because it is being pattern matched on, or it is being passed as a strict argument to another function. In the implementation of (&&), the pattern match happens on the first argument, so the second will only be evaluated if the first argument is True and the second is returned as the function's (strict) result.
There are two utility functions for working with lazy evaluation: Function delay wraps a value in the Lazy data type. Note that the argument of delay is strict, so the following might take several seconds to print its result:
Tutorial.Folds> False && (delay $ length [1..10000] > 100)
False
In addition, there is function force, which forces evaluation of a Lazy value.
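Here is a minimal sketch of a user-defined function with lazy arguments, followed by an explicit use of delay and force. Both choose and roundTrip are hypothetical names used for illustration only.

```idris
-- Hypothetical helper: only the selected branch is evaluated.
-- Idris inserts the required forcing of `t` and `e` for us.
choose : Bool -> Lazy Nat -> Lazy Nat -> Nat
choose True  t _ = t
choose False _ e = e

-- Wrapping a value explicitly and forcing it again gives back
-- the original value.
roundTrip : Nat -> Nat
roundTrip n = force (delay n)
```

When calling choose, the arguments can be written as ordinary expressions of type Nat, just like the second argument of (&&).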
Lazy Evaluation and Right Folds
We will now learn how to make use of rightFold and lazy evaluation to implement folds that can break out from iteration early. Note that in the implementation of rightFold the result of folding over the remainder of the list is passed as an argument to the accumulator (instead of the result of invoking the accumulator being used in the recursive call):
rightFold acc st (x :: xs) = acc x (rightFold acc st xs)
If the second argument of acc were lazily evaluated, it would be possible to abort the computation of acc's result without having to iterate till the end of the list:
foldHead : List a -> Maybe a
foldHead = force . rightFold first Nothing
  where first : a -> Lazy (Maybe a) -> Lazy (Maybe a)
        first v _ = Just v
Note how Idris takes care of the bookkeeping of laziness most of the time. (It doesn't handle the curried invocation of rightFold correctly, though, so we must either pass on the list argument of foldHead explicitly, or compose the curried function with force to get the types right.)
In order to verify that this works correctly, we need a debugging utility called trace from module Debug.Trace. This "function" allows us to print debugging messages to the console at certain points in our pure code. Please note that this is for debugging purposes only and should never be left lying around in production code, as, strictly speaking, printing stuff to the console breaks referential transparency.
Here is an adjusted version of foldHead, which prints "folded" to standard output every time utility function first is being invoked:
foldHeadTraced : List a -> Maybe a
foldHeadTraced = force . rightFold first Nothing
  where first : a -> Lazy (Maybe a) -> Lazy (Maybe a)
        first v _ = trace "folded" (Just v)
In order to test this at the REPL, we need to know that trace uses unsafePerformIO internally and therefore will not reduce during evaluation. We have to resort to the :exec command to see this in action at the REPL:
Tutorial.Folds> :exec printLn $ foldHeadTraced [1..10]
folded
Just 1
As you can see, although the list holds ten elements, first is only called once, resulting in a considerable increase in efficiency.
Let's see what happens if we change the implementation of first to use strict evaluation:
foldHeadTracedStrict : List a -> Maybe a
foldHeadTracedStrict = rightFold first Nothing
  where first : a -> Maybe a -> Maybe a
        first v _ = trace "folded" (Just v)
Although we don't use the second argument in the implementation of first, it is still being evaluated before evaluating the body of first, because Idris - unlike Haskell! - defaults to strict semantics. Here's how this behaves at the REPL:
Tutorial.Folds> :exec printLn $ foldHeadTracedStrict [1..10]
folded
folded
folded
folded
folded
folded
folded
folded
folded
folded
Just 1
While this technique can sometimes lead to very elegant code, always remember that rightFold is not stack safe in the general case. So, unless your accumulator is guaranteed to return a result after not too many iterations, consider implementing your function tail recursively with an explicit pattern match. Your code will be slightly more verbose, but with the guaranteed benefit of stack safety.
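The same technique works for other searches. The sketch below follows the pattern of foldHead to test whether a list contains a given value. The definition of rightFold is repeated from above to keep the example self-contained; elemRF and check are hypothetical names.

```idris
rightFold : (acc : el -> state -> state) -> state -> List el -> state
rightFold acc st []        = st
rightFold acc st (x :: xs) = acc x (rightFold acc st xs)

-- Since `check` returns a lazy value and `(||)` is lazy in its
-- second argument, `rest` is only forced if `x == v` is `False`.
elemRF : Eq a => a -> List a -> Bool
elemRF v = force . rightFold check False
  where check : a -> Lazy Bool -> Lazy Bool
        check x rest = x == v || rest
```

As with foldHead, comparisons are only carried out up to the first match. The remark about stack safety applies here as well.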
Folds and Monoids
Left and right folds share a common pattern: In both cases, we start with an initial state value and use an accumulator function for combining the current state with the current element. This principle of combining values after starting from an initial value lies at the heart of an interface we've already learned about: Monoid. It therefore makes sense to fold a list over a monoid:
foldMapList : Monoid m => (a -> m) -> List a -> m
foldMapList f = leftFold (\vm,va => vm <+> f va) neutral
Note how, with foldMapList, we no longer need to pass an accumulator function. All we need is a conversion from the element type to a type with an implementation of Monoid. As we have already seen in the chapter about interfaces, there are many monoids in functional programming, and therefore, foldMapList is an incredibly useful function.
We could make this even shorter: If the elements in our list already are of a type with a monoid implementation, we don't even need a conversion function to collapse the list:
concatList : Monoid m => List m -> m
concatList = foldMapList id
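As a quick illustration, here are two folds over different monoids from the Prelude: lists and strings, both of which use concatenation as their append operation. The definitions of leftFold and foldMapList are repeated to keep the sketch self-contained; duplicateAll and joinShown are hypothetical names.

```idris
leftFold : (acc : state -> el -> state) -> (st : state) -> List el -> state
leftFold _   st []        = st
leftFold acc st (x :: xs) = leftFold acc (acc st x) xs

foldMapList : Monoid m => (a -> m) -> List a -> m
foldMapList f = leftFold (\vm, va => vm <+> f va) neutral

-- Monoid `List a`: every element is converted to a two-element
-- list, and the results are concatenated.
duplicateAll : List a -> List a
duplicateAll = foldMapList (\x => [x, x])

-- Monoid `String`: every number is converted to a string, and
-- the results are concatenated.
joinShown : List Nat -> String
joinShown = foldMapList show
```

The fold itself is the same in both cases; only the conversion function and the monoid in use differ.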
Stop Using List for Everything
And here we are, finally, looking at a large pile of utility functions all dealing in some way with the concept of collapsing (or folding) a list of values into a single result. But all of these folding functions are just as useful when working with vectors, with non-empty lists, with rose trees, even with single-value containers like Maybe, Either e, or Identity. Heck, for the sake of completeness, they are even useful when working with zero-value containers like Control.Applicative.Const e! And since there are so many of these functions, we'd better look out for an essential set of them in terms of which we can implement all the others, and wrap up the whole bunch in an interface. This interface is called Foldable, and is available from the Prelude. When you look at its definition in the REPL (:doc Foldable), you'll see that it consists of six essential functions:
- foldr, for folds from the right
- foldl, for folds from the left
- null, for testing if the container is empty or not
- foldlM, for effectful folds in a monad
- toList, for converting the container to a list of values
- foldMap, for folding over a monoid
For a minimal implementation of Foldable, it is sufficient to only implement foldr. However, consider implementing all six functions manually, because folds over container types are often performance critical operations, and each of them should be optimized accordingly. For instance, implementing toList in terms of foldr for List just makes no sense, as this is a non-tail recursive function running in linear time complexity, while a hand-written implementation can just return its argument without any modifications.
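To give an impression of what a complete implementation looks like, here is a sketch for a deliberately trivial container. The data type Two is hypothetical and not part of any library; it always holds exactly two values.

```idris
-- Hypothetical container holding exactly two values.
data Two a = MkTwo a a

Foldable Two where
  foldr f acc (MkTwo x y) = f x (f y acc)

  foldl f acc (MkTwo x y) = f (f acc x) y

  -- A `Two` is never empty.
  null _ = False

  foldlM f acc (MkTwo x y) = f acc x >>= \st => f st y

  toList (MkTwo x y) = [x, y]

  foldMap f (MkTwo x y) = f x <+> f y
```

None of these functions is recursive, so each of them runs in constant time and without any risk of overflowing the stack. For recursive container types, more care is needed.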
Exercises part 3
In these exercises, you are going to implement Foldable for different data types. Make sure to try and manually implement all six functions of the interface.
1. Implement Foldable for Crud i:
   data Crud : (i : Type) -> (a : Type) -> Type where
     Create : (value : a) -> Crud i a
     Update : (id : i) -> (value : a) -> Crud i a
     Read   : (id : i) -> Crud i a
     Delete : (id : i) -> Crud i a
2. Implement Foldable for Response e i:
   data Response : (e, i, a : Type) -> Type where
     Created : (id : i) -> (value : a) -> Response e i a
     Updated : (id : i) -> (value : a) -> Response e i a
     Found   : (values : List a) -> Response e i a
     Deleted : (id : i) -> Response e i a
     Error   : (err : e) -> Response e i a
3. Implement Foldable for List01. Use tail recursion in the implementations of toList, foldMap, and foldl.
   data List01 : (nonEmpty : Bool) -> Type -> Type where
     Nil  : List01 False a
     (::) : a -> List01 False a -> List01 ne a
4. Implement Foldable for Tree. There is no need to use tail recursion in your implementations, but your functions must be accepted by the totality checker, and you are not allowed to cheat by using assert_smaller or assert_total.
   Hint: You can test the correct behavior of your implementations by running the same folds on the result of treeToVect and verifying that the outcome is the same.
5. Like Functor and Applicative, Foldable composes: The product and composition of two foldable container types are again foldable container types. Prove this by implementing Foldable for Comp and Product:
   record Comp (f,g : Type -> Type) (a : Type) where
     constructor MkComp
     unComp : f (g a)
   record Product (f,g : Type -> Type) (a : Type) where
     constructor MkProduct
     fst : f a
     snd : g a
Conclusion
We learned a lot about recursion, totality checking, and folds in this chapter, all of which are important concepts in pure functional programming in general. Wrapping one's head around recursion takes time and experience. Therefore - as usual - try to solve as many exercises as you can.
In the next chapter, we are taking the concept of iterating over container types one step further and look at effectful data traversals.
Effectful Traversals
In this chapter, we are going to bring our treatment of the higher-kinded interfaces in the Prelude to an end. In order to do so, we will continue developing the CSV reader we started implementing in chapter Functor and Friends. I moved some of the data types and interfaces from that chapter to their own modules, so we can import them here without the need to start from scratch.
Note that unlike in our original CSV reader, we will use Validated instead of Either for handling exceptions, since this will allow us to accumulate all errors when reading a CSV file.
Reading CSV Tables
module Tutorial.Traverse.CSV
import Data.HList
import Data.IORef
import Data.List1
import Data.String
import Data.Validated
import Data.Vect
import Text.CSV
%default total
We stopped developing our CSV reader with function hdecode, which allows us to read a single line in a CSV file and decode it to a heterogeneous list. As a reminder, here is how to use hdecode at the REPL:
Tutorial.Traverse> hdecode [Bool,String,Bits8] 1 "f,foo,12"
Valid [False, "foo", 12]
The next step will be to parse a whole CSV table, represented as a list of strings, where each string corresponds to one of the table's rows. We will go about this stepwise as there are several aspects about doing this properly. What we are looking for - eventually - is a function of the following type (we are going to implement several versions of this function, hence the numbering):
hreadTable1 :  (0 ts : List Type)
            -> CSVLine (HList ts)
            => List String
            -> Validated CSVError (List $ HList ts)
In our first implementation, we are not going to care about line numbers:
hreadTable1 _ [] = pure []
hreadTable1 ts (s :: ss) = [| hdecode ts 0 s :: hreadTable1 ts ss |]
Note how we can just use applicative syntax in the implementation of hreadTable1. To make this clearer, I used pure [] on the first line instead of the more specific Valid []. In fact, if we used Either or Maybe instead of Validated for error handling, the implementation of hreadTable1 would look exactly the same.
The question is: Can we extract a pattern to abstract over from this observation? What we do in hreadTable1 is run an effectful computation of type String -> Validated CSVError (HList ts) over a list of strings, so that the result is a list of HList ts wrapped in a Validated CSVError. The first step of abstraction should be to use type parameters for the input and output: Run a computation of type a -> Validated CSVError b over a list List a:
traverseValidatedList :  (a -> Validated CSVError b)
                      -> List a
                      -> Validated CSVError (List b)
traverseValidatedList _ [] = pure []
traverseValidatedList f (x :: xs) = [| f x :: traverseValidatedList f xs |]
hreadTable2 :  (0 ts : List Type)
            -> CSVLine (HList ts)
            => List String
            -> Validated CSVError (List $ HList ts)
hreadTable2 ts = traverseValidatedList (hdecode ts 0)
But our observation was that the implementation of hreadTable1 would be exactly the same if we used Either CSVError or Maybe as our effect types instead of Validated CSVError. So, the next step should be to abstract over the effect type. We note that we used applicative syntax (idiom brackets and pure) in our implementation, so we will need to write a function with an Applicative constraint on the effect type:
traverseList : Applicative f => (a -> f b) -> List a -> f (List b)
traverseList _ [] = pure []
traverseList f (x :: xs) = [| f x :: traverseList f xs |]
hreadTable3 :  (0 ts : List Type)
            -> CSVLine (HList ts)
            => List String
            -> Validated CSVError (List $ HList ts)
hreadTable3 ts = traverseList (hdecode ts 0)
Note how the implementation of traverseList is exactly the same as the one of traverseValidatedList, but the types are more general and therefore, traverseList is much more powerful.
Let's give this a go at the REPL:
Tutorial.Traverse> hreadTable3 [Bool,Bits8] ["f,12","t,0"]
Valid [[False, 12], [True, 0]]
Tutorial.Traverse> hreadTable3 [Bool,Bits8] ["f,12","t,1000"]
Invalid (FieldError 0 2 "1000")
Tutorial.Traverse> hreadTable3 [Bool,Bits8] ["1,12","t,1000"]
Invalid (Append (FieldError 0 1 "1") (FieldError 0 2 "1000"))
This works very well already, but note how our error messages do not yet print the correct line numbers. That's not surprising, as we are using a dummy constant in our call to hdecode. We will look at how we can come up with the line numbers on the fly when we talk about stateful computations later in this chapter. For now, we could just manually annotate the lines with their numbers and pass a list of pairs to hreadTable:
hreadTable4 :  (0 ts : List Type)
            -> CSVLine (HList ts)
            => List (Nat, String)
            -> Validated CSVError (List $ HList ts)
hreadTable4 ts = traverseList (uncurry $ hdecode ts)
If this is the first time you have come across function uncurry, make sure you have a look at its type and try to figure out why it is used here. There are several utility functions like this in the Prelude, such as curry, uncurry, flip, or even id, all of which can be very useful when working with higher-order functions.
While not perfect, this version at least allows us to verify at the REPL that the line numbers are passed to the error messages correctly:
Tutorial.Traverse> hreadTable4 [Bool,Bits8] [(1,"t,1000"),(2,"1,100")]
Invalid (Append (FieldError 1 2 "1000") (FieldError 2 1 "1"))
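Since traverseList only requires an Applicative, it is not tied to Validated at all. The following sketch runs it with Maybe and Either String as the effect types. The definition of traverseList is repeated to keep the example self-contained; safePred and safePredE are hypothetical functions made up for this illustration.

```idris
traverseList : Applicative f => (a -> f b) -> List a -> f (List b)
traverseList _ []        = pure []
traverseList f (x :: xs) = [| f x :: traverseList f xs |]

-- Hypothetical effectful function: fails for zero.
safePred : Nat -> Maybe Nat
safePred 0     = Nothing
safePred (S k) = Just k

-- The same computation, with an error message in case of failure.
safePredE : Nat -> Either String Nat
safePredE 0     = Left "zero has no predecessor"
safePredE (S k) = Right k

-- traverseList safePred  [1, 2, 3]  evaluates to  Just [0, 1, 2]
-- traverseList safePred  [1, 0, 3]  evaluates to  Nothing
-- traverseList safePredE [1, 0, 3]  evaluates to
--   Left "zero has no predecessor"
```

Unlike Validated, neither Maybe nor Either accumulates errors: The result only tells us that some element failed (and, with Either, holds the first error encountered).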
Interface Traversable
Now, here is an interesting observation: We can implement a function like traverseList for other container types as well. You might think that's obvious, given that we can convert container types to lists via function toList from interface Foldable. However, while going via List might be feasible on some occasions, it is undesirable in general, as we lose typing information. For instance, here is such a function for Vect:
traverseVect' : Applicative f => (a -> f b) -> Vect n a -> f (List b)
traverseVect' fun = traverseList fun . toList
Note how we lost all information about the structure of the original container type. What we are looking for is a function like traverseVect', which keeps this type level information: The result should be a vector of the same length as the input.
traverseVect : Applicative f => (a -> f b) -> Vect n a -> f (Vect n b)
traverseVect _ [] = pure []
traverseVect fun (x :: xs) = [| fun x :: traverseVect fun xs |]
That's much better! And as I wrote above, we can easily get the same for other container types like List1, SnocList, Maybe, and so on. As usual, some derived functions will follow immediately from traverseXY. For instance:
sequenceList : Applicative f => List (f a) -> f (List a)
sequenceList = traverseList id
All of this calls for a new interface, which is called Traversable and is exported from the Prelude. Here is its definition (with primes for disambiguation):
interface Functor t => Foldable t => Traversable' t where
  traverse' : Applicative f => (a -> f b) -> t a -> f (t b)
Function traverse is one of the most abstract and versatile functions available from the Prelude. Just how powerful it is will only become clear once you start using it over and over again in your code. However, it will be the goal of the remainder of this chapter to show you several diverse and interesting use cases.
For now, we will quickly focus on the degree of abstraction. Function traverse
is parameterized over no less than four parameters: The container type t
(List
, Vect n
, Maybe
, to just name a few), the effect type (Validated e
, IO
, Maybe
, and so on), the input element type a
, and the output element type b
. Considering that the libraries bundled with the Idris project export more than 30 data types with an implementation of Applicative
and more than ten traversable container types, there are literally hundreds of combinations for traversing a container with an effectful computation. This number gets even larger once we realize that traversable containers - like applicative functors - are closed under composition (see the exercises and the final section in this chapter).
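To make the four parameters a bit more tangible, here is a minimal sketch in which the same call to traverse is used with different container and effect types. The names parseList, parseVect, and greet are made up for this illustration; parsePositive comes from Data.String in base.

```idris
import Data.String
import Data.Vect

-- Container: List, effect: Maybe, a = String, b = Nat
parseList : List String -> Maybe (List Nat)
parseList = traverse parsePositive

-- Container: Vect n, effect: Maybe. The length index is preserved.
parseVect : Vect n String -> Maybe (Vect n Nat)
parseVect = traverse parsePositive

-- Container: Maybe, effect: IO, a = String, b = ()
greet : Maybe String -> IO (Maybe ())
greet = traverse putStrLn
```

At the REPL, parseList ["1", "2"] should evaluate to Just [1, 2], while a single unparsable element (for instance "x") turns the whole result into Nothing.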
Traversable Laws
There are two laws function traverse
must obey:
- traverse (Id . f) = Id . map f: Traversing over the Identity monad is just functor map.
- traverse (MkComp . map f . g) = MkComp . map (traverse f) . traverse g: Traversing with a composition of effects must be the same when being done in a single traversal (left hand side) or a sequence of two traversals (right hand side).
Since map id = id
(functor's identity law), we can derive from the first law that traverse Id = Id
. This means that traverse
must not change the size or shape of the container type, nor is it allowed to change the order of elements.
Exercises part 1
1. It is interesting that Traversable has a Functor constraint. Prove that every Traversable is automatically a Functor by implementing map in terms of traverse.
   Hint: Remember Control.Monad.Identity.
2. Likewise, prove that every Traversable is a Foldable by implementing foldMap in terms of traverse.
   Hint: Remember Control.Applicative.Const.
3. To gain some routine, implement Traversable' for List1, Either e, and Maybe.
4. Implement Traversable for List01 ne:
   data List01 : (nonEmpty : Bool) -> Type -> Type where
     Nil  : List01 False a
     (::) : a -> List01 False a -> List01 ne a
5. Implement Traversable for rose trees. Try to satisfy the totality checker without cheating.
   record Tree a where
     constructor Node
     value  : a
     forest : List (Tree a)
6. Implement Traversable for Crud i:
   data Crud : (i : Type) -> (a : Type) -> Type where
     Create : (value : a) -> Crud i a
     Update : (id : i) -> (value : a) -> Crud i a
     Read   : (id : i) -> Crud i a
     Delete : (id : i) -> Crud i a
7. Implement Traversable for Response e i:
   data Response : (e, i, a : Type) -> Type where
     Created : (id : i) -> (value : a) -> Response e i a
     Updated : (id : i) -> (value : a) -> Response e i a
     Found   : (values : List a) -> Response e i a
     Deleted : (id : i) -> Response e i a
     Error   : (err : e) -> Response e i a
8. Like Functor, Applicative and Foldable, Traversable is closed under composition. Prove this by implementing Traversable for Comp and Product:
   record Comp (f,g : Type -> Type) (a : Type) where
     constructor MkComp
     unComp  : f (g a)
   record Product (f,g : Type -> Type) (a : Type) where
     constructor MkProduct
     fst : f a
     snd : g a
Programming with State
module Tutorial.Traverse.State
import Data.HList
import Data.IORef
import Data.List1
import Data.String
import Data.Validated
import Data.Vect
import Text.CSV
%default total
Let's go back to our CSV reader. In order to get reasonable error messages, we'd like to tag each line with its index:
zipWithIndex : List a -> List (Nat, a)
It is, of course, very easy to come up with an ad hoc implementation for this:
zipWithIndex = go 1
  where go : Nat -> List a -> List (Nat,a)
        go _ []        = []
        go n (x :: xs) = (n,x) :: go (S n) xs
While this is perfectly fine, we should still note that we might want to do the same thing with the elements of trees, vectors, non-empty lists and so on. And again, we are interested in whether there is some form of abstraction we can use to describe such computations.
Mutable References in Idris
Let us for a moment think about how we'd do such a thing in an imperative language. There, we'd probably define a local (mutable) variable to keep track of the current index, which would then be increased while iterating over the list in a for
- or while
-loop.
In Idris, there is no such thing as mutable state. Or is there? Remember how we used a mutable reference to simulate a database connection in an earlier exercise. There, we actually used some truly mutable state. However, since accessing or modifying a mutable variable is not a referentially transparent operation, such actions have to be performed within IO
. Other than that, nothing keeps us from using mutable variables in our code. The necessary functionality is available from module Data.IORef
from the base library.
As a quick exercise, try to implement a function, which - given an IORef Nat
- pairs a value with the current index and increases the index afterwards.
Here's how I would do this:
pairWithIndexIO : IORef Nat -> a -> IO (Nat,a)
pairWithIndexIO ref va = do
  ix <- readIORef ref
  writeIORef ref (S ix)
  pure (ix,va)
Note that every time we run pairWithIndexIO ref
, the natural number stored in ref
is incremented by one. Also, look at the type of pairWithIndexIO ref
: a -> IO (Nat,a)
. We want to apply this effectful computation to each element in a list, which should lead to a new list wrapped in IO
, since all of this describes a single computation with side effects. But this is exactly what function traverse
does: Our input type is a
, our output type is (Nat,a)
, our container type is List
, and the effect type is IO
!
zipListWithIndexIO : IORef Nat -> List a -> IO (List (Nat,a))
zipListWithIndexIO ref = traverse (pairWithIndexIO ref)
Now this is really powerful: We could apply the same function to any traversable data structure. It therefore makes absolutely no sense to specialize zipListWithIndexIO
to lists only:
zipWithIndexIO : Traversable t => IORef Nat -> t a -> IO (t (Nat,a))
zipWithIndexIO ref = traverse (pairWithIndexIO ref)
To please our intellectual minds even more, here is the same function in point-free style:
zipWithIndexIO' : Traversable t => IORef Nat -> t a -> IO (t (Nat,a))
zipWithIndexIO' = traverse . pairWithIndexIO
All that's left to do now is to initialize a new mutable variable before passing it to zipWithIndexIO
:
zipFromZeroIO : Traversable t => t a -> IO (t (Nat,a))
zipFromZeroIO ta = newIORef 0 >>= (`zipWithIndexIO` ta)
Quickly, let's give this a go at the REPL:
> :exec zipFromZeroIO {t = List} ["hello", "world"] >>= printLn
[(0, "hello"), (1, "world")]
> :exec zipFromZeroIO (Just 12) >>= printLn
Just (0, 12)
> :exec zipFromZeroIO {t = Vect 2} ["hello", "world"] >>= printLn
[(0, "hello"), (1, "world")]
Thus, we solved the problem of tagging each element with its index once and for all for all traversable container types.
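For readers who prefer a compiled program over the REPL, the following is a minimal standalone sketch of the same idea. It repeats pairWithIndexIO from above so that the file compiles on its own; the main function and its sample list are made up for this illustration.

```idris
import Data.IORef

-- Same definition as in the text above.
pairWithIndexIO : IORef Nat -> a -> IO (Nat,a)
pairWithIndexIO ref va = do
  ix <- readIORef ref
  writeIORef ref (S ix)
  pure (ix,va)

main : IO ()
main = do
  ref <- newIORef (the Nat 0)
  xs  <- traverse (pairWithIndexIO ref) (the (List String) ["a", "b", "c"])
  printLn xs
  -- The reference is still around after the traversal and holds
  -- the next free index.
  readIORef ref >>= printLn
```

Running this should print [(0, "a"), (1, "b"), (2, "c")] followed by 3. The second line shows that the mutation is observable outside the traversal, which is exactly why all of this has to live in IO.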
The State Monad
Alas, while the solution presented above is elegant and performs very well, it still carries its IO
stain, which is fine if we are already in IO
land, but unacceptable otherwise. We do not want to make our otherwise pure functions much harder to test and reason about just for a simple case of stateful element tagging.
Luckily, there is an alternative to using a mutable reference, which allows us to keep our computations pure and untainted. However, it is not easy to come upon this alternative on one's own, and it can be hard to figure out what's going on here, so I'll try to introduce this slowly. We first need to ask ourselves what the essence of a "stateful" but otherwise pure computation is. There are two essential ingredients:
- Access to the current state. In case of a pure function, this means that the function should take the current state as one of its arguments.
- Ability to communicate the updated state to later stateful computations. In case of a pure function, this means that the function will return a pair of values: The computation's result plus the updated state.
These two prerequisites lead to the following generic type for a pure, stateful computation operating on state type st
and producing values of type a
:
Stateful : (st : Type) -> (a : Type) -> Type
Stateful st a = st -> (st, a)
Our use case is pairing elements with indices, which can be implemented as a pure, stateful computation like so:
pairWithIndex' : a -> Stateful Nat (Nat,a)
pairWithIndex' v index = (S index, (index,v))
Note how we increment the index and return the incremented value as the new state, while pairing the first argument with the original index.
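To see what "communicating the updated state" means in practice, here is a small sketch that runs two such computations in sequence by hand. Stateful and pairWithIndex' are repeated from above; tagTwo is a hypothetical helper introduced only for this illustration.

```idris
Stateful : (st : Type) -> (a : Type) -> Type
Stateful st a = st -> (st, a)

pairWithIndex' : a -> Stateful Nat (Nat,a)
pairWithIndex' v index = (S index, (index,v))

-- Hypothetical helper: tag two values, passing the state along manually.
-- Every intermediate state (s0, s1, s2) is used exactly once.
tagTwo : a -> a -> Stateful Nat ((Nat,a),(Nat,a))
tagTwo x y s0 =
  let (s1, px) = pairWithIndex' x s0
      (s2, py) = pairWithIndex' y s1
   in (s2, (px, py))
```

Here, tagTwo "a" "b" 0 evaluates to (2, ((0, "a"), (1, "b"))). Accidentally passing s0 instead of s1 to the second call would still type check but tag both values with the same index, which is the kind of bookkeeping mistake a dedicated abstraction can rule out.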
Now, here is an important thing to note: While Stateful
is a useful type alias, Idris in general does not resolve interface implementations for function types. If we want to write a small library of utility functions around such a type, it is therefore best to wrap it in a single-constructor data type and use this as our building block for writing more complex computations. We therefore introduce record State
as a wrapper for pure, stateful computations:
public export
record State st a where
  constructor ST
  runST : st -> (st,a)
We can now implement pairWithIndex
in terms of State
like so:
export
pairWithIndex : a -> State Nat (Nat,a)
pairWithIndex v = ST $ \index => (S index, (index, v))
In addition, we can define some more utility functions. Here's one for getting the current state without modifying it (this corresponds to readIORef
):
get : State st st
get = ST $ \s => (s,s)
Here are two others, for overwriting the current state. These correspond to writeIORef
and modifyIORef
:
put : st -> State st ()
put v = ST $ \_ => (v,())
modify : (st -> st) -> State st ()
modify f = ST $ \v => (f v,())
Finally, we can define three functions in addition to runST
for running stateful computations:
runState : st -> State st a -> (st, a)
runState = flip runST
export
evalState : st -> State st a -> a
evalState s = snd . runState s
execState : st -> State st a -> st
execState s = fst . runState s
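The difference between the three runners is easiest to see on a tiny computation. The sketch below repeats the record and the runners from above so that it compiles on its own; tick is a hypothetical example computation that returns the current counter and increments it.

```idris
record State st a where
  constructor ST
  runST : st -> (st,a)

runState : st -> State st a -> (st, a)
runState = flip runST

evalState : st -> State st a -> a
evalState s = snd . runState s

execState : st -> State st a -> st
execState s = fst . runState s

-- Hypothetical example: return the current counter and increment it.
tick : State Nat Nat
tick = ST $ \n => (S n, n)
```

With an initial state of 5, runState 5 tick yields (6, 5), evalState 5 tick keeps only the result 5, and execState 5 tick keeps only the final state 6.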
All of these are useful on their own, but the real power of State s
comes from the observation that it is a monad. Before you go on, please spend some time and try implementing Functor
, Applicative
, and Monad
for State s
yourself. Even if you don't succeed, you will have an easier time understanding how the implementations below work.
export
Functor (State st) where
  map f (ST run) = ST $ \s => let (s2,va) = run s in (s2, f va)
export
Applicative (State st) where
  pure v = ST $ \s => (s,v)
  ST fun <*> ST val = ST $ \s =>
    let (s2, f)  = fun s
        (s3, va) = val s2
     in (s3, f va)
export
Monad (State st) where
  ST val >>= f = ST $ \s =>
    let (s2, va) = val s
     in runST (f va) s2
This may take some time to digest, so we will come back to it in a slightly advanced exercise. The most important thing to note is that we use every state value exactly once. We must make sure that the updated state is passed to later computations, otherwise the information about state updates is lost. This can best be seen in the implementation of Applicative
: The initial state, s
, is used in the computation of the function value, which will also return an updated state, s2
, which is then used in the computation of the function argument. This will again return an updated state, s3
, which is passed on to later stateful computations together with the result of applying f
to va
.
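With a Monad implementation available, stateful code can be written in do notation and handed directly to traverse. The following sketch uses State from Control.Monad.State in base instead of the record defined above, purely so that it compiles as a standalone file; it assumes that the get, put, and evalState exported there behave like the functions of the same names in this section. The names pairWithIndexS, zipWithIndexS, indexedList, and indexedVect are made up for this illustration.

```idris
import Control.Monad.State
import Data.Vect

-- Read the current index, store its successor, return the tagged value.
pairWithIndexS : a -> State Nat (Nat, a)
pairWithIndexS v = do
  ix <- get
  put (S ix)
  pure (ix, v)

-- Pure index tagging for any traversable container, starting at 1.
zipWithIndexS : Traversable t => t a -> t (Nat, a)
zipWithIndexS = evalState 1 . traverse pairWithIndexS

indexedList : List (Nat, String)
indexedList = zipWithIndexS ["hello", "world"]

indexedVect : Vect 2 (Nat, String)
indexedVect = zipWithIndexS ["hello", "world"]
```

Both indexedList and indexedVect should evaluate to [(1, "hello"), (2, "world")]. Unlike the IORef version, no IO appears anywhere in the types.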
Exercises part 2
This section consists of two extended exercises, the aim of which is to increase your understanding of the state monad. In the first exercise, we will look at random value generation, a classical application of stateful computations. In the second exercise, we will look at an indexed version of a state monad, which allows us to not only change the state's value but also its type during computations.
1. Below is the implementation of a simple pseudo-random number generator. We call this a pseudo-random number generator, because the numbers look pretty random but are generated predictably. If we initialize a series of such computations with a truly random seed, most users of our library will not be able to predict the outcome of our computations.
   rnd : Bits64 -> Bits64
   rnd seed = fromInteger $ (437799614237992725 * cast seed) `mod` 2305843009213693951
   The idea here is that the next pseudo-random number gets calculated from the previous one. But once we think about how we can use these numbers as seeds for computing random values of other types, we realize that these are just stateful computations. We can therefore write down an alias for random value generators as stateful computations:
   Gen : Type -> Type
   Gen = State Bits64
   Before we begin, please note that rnd is not a very strong pseudo-random number generator. It will not generate values in the full 64-bit range, nor is it safe to use in cryptographic applications. It is sufficient for our purposes in this chapter, however. Note also that we could replace rnd with a stronger generator without any changes to the functions you will implement as part of this exercise.
   1. Implement bits64 in terms of rnd. This should return the current state, updating it afterwards by invoking function rnd. Make sure the state is properly updated, otherwise this won't behave as expected.
      bits64 : Gen Bits64
      This will be our only primitive generator, from which we will derive all the others. Therefore, before you continue, quickly test your implementation of bits64 at the REPL:
      Solutions.Traverse> runState 100 bits64
      (2274787257952781382, 100)
   2. Implement range64 for generating random values in the range [0,upper]. Hint: Use bits64 and mod in your implementation but make sure to deal with the fact that mod x upper produces values in the range [0,upper).
      range64 : (upper : Bits64) -> Gen Bits64
      Likewise, implement interval64 for generating values in the range [min a b, max a b]:
      interval64 : (a,b : Bits64) -> Gen Bits64
      Finally, implement interval for arbitrary integral types.
      interval : Num n => Cast n Bits64 => (a,b : n) -> Gen n
      Note that interval will not generate all possible values in the given interval but only such values with a Bits64 representation in the range [0,2305843009213693950].
   3. Implement a generator for random boolean values.
   4. Implement a generator for Fin n. You'll have to think carefully about getting this one to typecheck and be accepted by the totality checker without cheating. Note: Have a look at function Data.Fin.natToFin.
   5. Implement a generator for selecting a random element from a vector of values. Use the generator from exercise 4 in your implementation.
   6. Implement vect and list. In case of list, the first argument should be used to randomly determine the length of the list.
      vect : {n : _} -> Gen a -> Gen (Vect n a)
      list : Gen Nat -> Gen a -> Gen (List a)
      Use vect to implement utility function testGen for testing your generators at the REPL:
      testGen : Bits64 -> Gen a -> Vect 10 a
   7. Implement choice.
      choice : {n : _} -> Vect (S n) (Gen a) -> Gen a
   8. Implement either.
      either : Gen a -> Gen b -> Gen (Either a b)
   9. Implement a generator for printable ASCII characters. These are characters with ASCII codes in the interval [32,126]. Hint: Function chr from the Prelude will be useful here.
   10. Implement a generator for strings. Hint: Function pack from the Prelude might be useful for this.
       string : Gen Nat -> Gen Char -> Gen String
   11. We shouldn't forget about our ability to encode interesting things in the types in Idris, so, for a challenge and without further ado, implement hlist (note the distinction between HListF and HList). If you are rather new to dependent types, this might take a moment to digest, so don't forget to use holes.
       data HListF : (f : Type -> Type) -> (ts : List Type) -> Type where
         Nil  : HListF f []
         (::) : (x : f t) -> (xs : HListF f ts) -> HListF f (t :: ts)
       hlist : HListF Gen ts -> Gen (HList ts)
   12. Generalize hlist to work with any applicative functor, not just Gen.
   If you arrived here, please realize how we can now generate pseudo-random values for most primitives, as well as regular sum- and product types. Here is an example REPL session:
   > testGen 100 $ hlist [bool, printableAscii, interval 0 127]
   [[True, ';', 5],
    [True, '^', 39],
    [False, 'o', 106],
    [True, 'k', 127],
    [False, ' ', 11],
    [False, '~', 76],
    [True, 'M', 11],
    [False, 'P', 107],
    [True, '5', 67],
    [False, '8', 9]]
   Final remarks: Pseudo-random value generators play an important role in property based testing libraries like QuickCheck or Hedgehog. The idea of property based testing is to test predefined properties of pure functions against a large number of randomly generated arguments, to get strong guarantees about these properties to hold for all possible arguments. One example would be a test for verifying that the result of reversing a list twice equals the original list. While it is possible to prove many of the simpler properties in Idris directly without the need for tests, this is no longer possible as soon as functions are involved that don't reduce during unification, such as foreign function calls or functions not publicly exported from other modules.
2. While State s a gives us a convenient way to talk about stateful computations, it only allows us to mutate the state's value but not its type. For instance, the following function cannot be encapsulated in State because the type of the state changes:
   uncons : Vect (S n) a -> (Vect n a, a)
   uncons (x :: xs) = (xs, x)
   Your task is to come up with a new state type allowing for such changes (sometimes referred to as an indexed state data type). The goal of this exercise is to also sharpen your skills in expressing things at the type level including derived function types and interfaces. Therefore, I will give only little guidance on how to go about this. If you get stuck, feel free to peek at the solutions but make sure to only look at the types at first.
   1. Come up with a parameterized data type for encapsulating stateful computations where the input and output state type can differ. It must be possible to wrap uncons in a value of this type.
   2. Implement Functor for your indexed state type.
   3. It is not possible to implement Applicative for this indexed state type (but see also exercise 2.vii). Still, implement the necessary functions to use it with idiom brackets.
   4. It is not possible to implement Monad for this indexed state type. Still, implement the necessary functions to use it in do blocks.
   5. Generalize the functions from exercises 3 and 4 with two new interfaces IxApplicative and IxMonad and provide implementations of these for your indexed state data type.
   6. Implement functions get, put, modify, runState, evalState, and execState for the indexed state data type. Make sure to adjust the type parameters where necessary.
   7. Show that your indexed state type is strictly more powerful than State by implementing Applicative and Monad for it.
      Hint: Keep the input and output state identical. Note also that you might need to implement join manually if Idris has trouble inferring the types correctly.
   Indexed state types can be useful when we want to make sure that stateful computations are combined in the correct sequence, or that scarce resources get cleaned up properly. We might get back to such use cases in later examples.
The Power of Composition
module Tutorial.Traverse.Composition
import Tutorial.Traverse.State
import Data.HList
import Data.IORef
import Data.List1
import Data.String
import Data.Validated
import Data.Vect
import Text.CSV
%default total
After our excursion into the realms of stateful computations, we will go back and combine mutable state with error accumulation to tag and read CSV lines in a single traversal. We already defined pairWithIndex
for tagging lines with their indices. We also have uncurry $ hdecode ts
for decoding single tagged lines. We can now combine the two effects in a single computation:
tagAndDecode :  (0 ts : List Type)
             -> CSVLine (HList ts)
             => String
             -> State Nat (Validated CSVError (HList ts))
tagAndDecode ts s = uncurry (hdecode ts) <$> pairWithIndex s
Now, as we learned before, applicative functors are closed under composition, and the result of tagAndDecode
is a nesting of two applicatives: State Nat
and Validated CSVError
. The Prelude exports a corresponding named interface implementation (Prelude.Applicative.Compose
), which we can use for traversing a list of strings with tagAndDecode
. Remember that we have to provide named implementations explicitly. Since traverse
has the applicative functor as its second constraint, we also need to provide the first constraint (Traversable
) explicitly. But this is going to be the unnamed default implementation! To get our hands on such a value, we can use the %search
pragma:
readTable :  (0 ts : List Type)
          -> CSVLine (HList ts)
          => List String
          -> Validated CSVError (List $ HList ts)
readTable ts = evalState 1 . traverse @{%search} @{Compose} (tagAndDecode ts)
This tells Idris to use the default implementation for the Traversable
constraint, and Prelude.Applicative.Compose
for the Applicative
constraint. While this syntax is not very nice, it doesn't come up too often, and if it does, we can improve things by providing custom functions for better readability:
traverseComp :  Traversable t
             => Applicative f
             => Applicative g
             => (a -> f (g b))
             -> t a
             -> f (g (t b))
traverseComp = traverse @{%search} @{Compose}
readTable' :  (0 ts : List Type)
           -> CSVLine (HList ts)
           => List String
           -> Validated CSVError (List $ HList ts)
readTable' ts = evalState 1 . traverseComp (tagAndDecode ts)
Note how this allows us to combine two computational effects (mutable state and error accumulation) in a single list traversal.
But I am not yet done demonstrating the power of composition. As you showed in one of the exercises, Traversable
is also closed under composition, so a nesting of traversables is again a traversable. Consider the following use case: When reading a CSV file, we'd like to allow lines to be annotated with additional information. Such annotations could be mere comments, but formatting instructions or other custom data tags are conceivable as well. Annotations are supposed to be separated from the rest of the content by a single hash character (#
). We want to keep track of these optional annotations so we come up with a custom data type encapsulating this distinction:
data Line : Type -> Type where
  Annotated : String -> a -> Line a
  Clean     : a -> Line a
This is just another container type and we can easily implement Traversable
for Line
(do this yourself as a quick exercise):
Functor Line where
  map f (Annotated s x) = Annotated s $ f x
  map f (Clean x)       = Clean $ f x
Foldable Line where
  foldr f acc (Annotated _ x) = f x acc
  foldr f acc (Clean x)       = f x acc
Traversable Line where
  traverse f (Annotated s x) = Annotated s <$> f x
  traverse f (Clean x)       = Clean <$> f x
Below is a function for parsing a line and putting it in its correct category. For simplicity, we just split the line on hashes: If the result consists of exactly two strings, we treat the second part as an annotation, otherwise we treat the whole line as untagged CSV content.
readLine : String -> Line String
readLine s = case split ('#' ==) s of
  h ::: [t] => Annotated t h
  _         => Clean s
We are now going to implement a function for reading whole CSV tables, keeping track of line annotations:
readCSV :  (0 ts : List Type)
        -> CSVLine (HList ts)
        => String
        -> Validated CSVError (List $ Line $ HList ts)
readCSV ts = evalState 1
           . traverse @{Compose} @{Compose} (tagAndDecode ts)
           . map readLine
           . lines
Let's digest this monstrosity. This is written in point-free style, so we have to read it from end to beginning. First, we split the whole string at line breaks, getting a list of strings (function Data.String.lines
). Next, we analyze each line, keeping track of optional annotations (map readLine
). This gives us a value of type List (Line String)
. Since this is a nesting of traversables, we invoke traverse
with a named instance from the Prelude: Prelude.Traversable.Compose
. Idris can disambiguate this based on the types, so we can drop the namespace prefix. But the effectful computation we run over the list of lines results in a composition of applicative functors, so we also need the named implementation for compositions of applicatives in the second constraint (again without need of an explicit prefix, which would be Prelude.Applicative
here). Finally, we evaluate the stateful computation with evalState 1
.
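The nesting of traversables can also be tried in isolation, without state or CSV decoding. The sketch below is a deliberately small example and not part of the CSV reader: parseNested is a made-up name, and I assume that, as described above, Idris picks Prelude.Traversable.Compose for the first constraint based on the types, while the Applicative constraint is resolved as usual.

```idris
import Data.String

-- Hypothetical example: the container is a List of Maybe values, the
-- effect is Maybe (parsing may fail). `@{Compose}` makes `traverse`
-- look through both container layers at once.
parseNested : List (Maybe String) -> Maybe (List (Maybe Nat))
parseNested = traverse @{Compose} parsePositive
```

Under these assumptions, parseNested [Just "1", Nothing, Just "2"] should give Just [Just 1, Nothing, Just 2]: missing entries are kept as they are, while an entry that fails to parse makes the overall result Nothing.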
Honestly, I wrote all of this without verifying if it works, so let's give it a go at the REPL. I'll provide two example strings for this, a valid one without errors, and an invalid one. I use multiline string literals here, about which I'll talk in more detail in a later chapter. For the moment, note that these allow us to conveniently enter string literals with line breaks:
validInput : String
validInput = """
f,12,-13.01#this is a comment
t,100,0.0017
t,1,100.8#color: red
f,255,0.0
f,24,1.12e17
"""
invalidInput : String
invalidInput = """
o,12,-13.01#another comment
t,100,0.0017
t,1,abc
f,256,0.0
f,24,1.12e17
"""
And here's how it goes at the REPL:
Tutorial.Traverse> readCSV [Bool,Bits8,Double] validInput
Valid [Annotated "this is a comment" [False, 12, -13.01],
Clean [True, 100, 0.0017],
Annotated "color: red" [True, 1, 100.8],
Clean [False, 255, 0.0],
Clean [False, 24, 1.12e17]]
Tutorial.Traverse> readCSV [Bool,Bits8,Double] invalidInput
Invalid (Append (FieldError 1 1 "o")
(Append (FieldError 3 3 "abc") (FieldError 4 2 "256")))
It is pretty amazing how we wrote dozens of lines of code, always being guided by the type- and totality checkers, arriving eventually at a function for parsing properly typed CSV tables with automatic line numbering and error accumulation, all of which just worked on first try.
Exercises part 3
The Prelude provides three additional interfaces for container types parameterized over two type parameters such as Either
or Pair
: Bifunctor
, Bifoldable
, and Bitraversable
. In the following exercises we get some hands-on experience working with these. You are supposed to look up what functions they provide and how to implement and use them yourself.
1. Assume we'd like to not only interpret CSV content but also the optional comment tags in our CSV files. For this, we could use a data type such as Tagged:
   data Tagged : (tag, value : Type) -> Type where
     Tag  : tag -> value -> Tagged tag value
     Pure : value -> Tagged tag value
   Implement interfaces Functor, Foldable, and Traversable but also Bifunctor, Bifoldable, and Bitraversable for Tagged.
2. Show that the composition of a bifunctor with two functors such as Either (List a) (Maybe b) is again a bifunctor by defining a dedicated wrapper type for such compositions and writing a corresponding implementation of Bifunctor. Likewise for Bifoldable/Foldable and Bitraversable/Traversable.
3. Show that the composition of a functor with a bifunctor such as List (Either a b) is again a bifunctor by defining a dedicated wrapper type for such compositions and writing a corresponding implementation of Bifunctor. Likewise for Bifoldable/Foldable and Bitraversable/Traversable.
4. We are now going to adjust readCSV in such a way that it decodes comment tags and CSV content in a single traversal. We need a new error type to include invalid tags for this:
   data TagError : Type where
     CE         : CSVError -> TagError
     InvalidTag : (line : Nat) -> (tag : String) -> TagError
     Append     : TagError -> TagError -> TagError
   Semigroup TagError where
     (<+>) = Append
   For testing, we also define a simple data type for color tags:
   data Color = Red | Green | Blue
   You should now implement the following functions, but please note that while readColor will need to access the current line number in case of an error, it must not increase it, as otherwise line numbers will be wrong in the invocation of tagAndDecodeTE.
   readColor : String -> State Nat (Validated TagError Color)
   readTaggedLine : String -> Tagged String String
   tagAndDecodeTE :  (0 ts : List Type)
                  -> CSVLine (HList ts)
                  => String
                  -> State Nat (Validated TagError (HList ts))
   Finally, implement readTagged by using the wrapper type from exercise 3 as well as readColor and tagAndDecodeTE in a call to bitraverse. The implementation will look very similar to readCSV but with some additional wrapping and unwrapping at the right places.
   readTagged :  (0 ts : List Type)
              -> CSVLine (HList ts)
              => String
              -> Validated TagError (List $ Tagged Color $ HList ts)
   Test your implementation with some example strings at the REPL.
You can find more examples for functor/bifunctor compositions in Haskell's bifunctors package.
Conclusion
Interface Traversable
and its main function traverse
are incredibly powerful forms of abstraction - even more so, because both Applicative
and Traversable
are closed under composition. If you are interested in additional use cases, the publication that introduced Traversable
to Haskell is a highly recommended read: The Essence of the Iterator Pattern.
The base library provides an extended version of the state monad in module Control.Monad.State
. We will look at this in more detail when we talk about monad transformers. Please also note that IO
itself is implemented as a simple state monad over an abstract, primitive state type: %World
.
Here's a short summary of what we learned in this chapter:
- Function traverse is used to run effectful computations over container types without affecting their size or shape.
- We can use IORef as mutable references in stateful computations running in IO.
- For referentially transparent computations with "mutable" state, the State monad is extremely useful.
- Applicative functors are closed under composition, so we can run several effectful computations in a single traversal.
- Traversables are also closed under composition, so we can use traverse to operate on a nesting of containers.
For now, this concludes our introduction of the Prelude's higher-kinded interfaces, which started with the introduction of Functor
, Applicative
, and Monad
, before moving on to Foldable
, and - last but definitely not least - Traversable
. There's one still missing - Alternative
- but this will have to wait a bit longer, because we need to first make our brains smoke with some more type-level wizardry.
Sigma Types
So far in our examples of dependently typed programming, type indices such as the length of vectors were known at compile time or could be calculated from values known at compile time. In real applications, however, such information is often not available until runtime, where values depend on the decisions made by users or the state of the surrounding world. For instance, if we store a file's content as a vector of lines of text, the length of this vector is in general unknown until the file has been loaded into memory. As a consequence, the types of values we work with depend on other values only known at runtime, and we can often only figure out these types by pattern matching on the values they depend on. To express these dependencies, we need so called sigma types: Dependent pairs and their generalization, dependent records.
Dependent Pairs
module Tutorial.DPair.DPair
import Data.DPair
import Data.Either
import Data.HList
import Data.List
import Data.List1
import Data.Singleton
import Data.String
import Data.Vect
import Text.CSV
%default total
We've already seen several examples of how useful the length index of a vector is to describe more precisely in the types what a function can and can't do. For instance, map
or traverse
operating on a vector will return a vector of exactly the same length. The types guarantee that this is true, therefore the following function is perfectly safe and provably total:
parseAndDrop : Vect (3 + n) String -> Maybe (Vect n Nat)
parseAndDrop = map (drop 3) . traverse parsePositive
Since the argument of traverse parsePositive
is of type Vect (3 + n) String
, its result will be of type Maybe (Vect (3 + n) Nat)
. It is therefore safe to use this in a call to drop 3
. Note, how all of this is known at compile time: We encoded the prerequisite that the first argument is a vector of at least three elements in the length index and could derive the length of the result from this.
Vectors of Unknown Length
However, this is not always possible. Consider the following function, defined on List
and exported by Data.List
:
Tutorial.Relations> :t takeWhile
Data.List.takeWhile : (a -> Bool) -> List a -> List a
This will take the longest prefix of the list argument, for which the given predicate returns True
. How long this prefix will be depends on the list elements and the predicate. Can we write such a function for vectors? Let's give it a try:
takeWhile' : (a -> Bool) -> Vect n a -> Vect m a
Go ahead and try to implement this. Don't spend too long on it, as you will not be able to do so in a provably total way. The question is: What is the problem here? In order to understand this, we have to realize what the type of takeWhile'
promises: "For all predicates operating on values of type a
, and for all vectors holding values of this type, and for all lengths m
, I give you a vector of length m
holding values of type a
". All of these are said to be universally quantified: the caller of our function is free to choose the predicate, the input vector, the type of values the vector holds, and the length of the output vector. Don't believe me? See here:
-- This looks like trouble: We got a non-empty vector of `Void`...
voids : Vect 7 Void
voids = takeWhile' (const True) []
-- ...from which immediately follows a proof of `Void`
proofOfVoid : Void
proofOfVoid = head voids
See how I could freely decide on the value of m
when invoking takeWhile'
? Although I passed takeWhile'
an empty vector (the only existing vector holding values of type Void
), the function's type promises me to return a possibly non-empty vector holding values of the same type, from which I freely extracted the first one.
Luckily, Idris doesn't allow this: We won't be able to implement takeWhile'
without cheating (for instance, by turning totality checking off and looping forever). So, the question remains, how to express the result of takeWhile'
in a type. The answer to this is: "Use a dependent pair", a vector paired with a value corresponding to its length.
record AnyVect a where
  constructor MkAnyVect
  length : Nat
  vect   : Vect length a
This corresponds to existential quantification in predicate logic: There is a natural number, which corresponds to the length of the vector I have here. Note how, from the outside of AnyVect a
, the length of the wrapped vector is no longer visible at the type level but we can still inspect it and learn something about it at runtime, since it is wrapped up together with the actual vector. We can implement takeWhile
in such a way that it returns a value of type AnyVect a
:
takeWhile : (a -> Bool) -> Vect n a -> AnyVect a
takeWhile f [] = MkAnyVect 0 []
takeWhile f (x :: xs) = case f x of
  False => MkAnyVect 0 []
  True  => let MkAnyVect n ys = takeWhile f xs in MkAnyVect (S n) (x :: ys)
This works in a provably total way, because callers of this function can no longer choose the length of the resulting vector themselves. Our function, takeWhile
, decides on this length and returns it together with the vector, and the type checker verifies that we make no mistakes when pairing the two values. In fact, the length can be inferred automatically by Idris, so we can replace it with underscores, if we so desire:
takeWhile2 : (a -> Bool) -> Vect n a -> AnyVect a
takeWhile2 f [] = MkAnyVect _ []
takeWhile2 f (x :: xs) = case f x of
  False => MkAnyVect 0 []
  True  => let MkAnyVect _ ys = takeWhile2 f xs in MkAnyVect _ (x :: ys)
To summarize: Parameters in generic function types are universally quantified, and their values can be decided on at the call site of such functions. Dependent record types allow us to describe existentially quantified values. Callers cannot choose such values freely: They are returned as part of a function's result.
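The difference between the two kinds of quantification also shows up at the call site. The following standalone sketch repeats `AnyVect` and `takeWhile` from above and adds two hypothetical helpers, `prefixLength` and `evens`: the caller chooses the predicate and the input vector, but has to pattern match on the result to learn how long the returned vector is.

```idris
import Data.Vect

record AnyVect a where
  constructor MkAnyVect
  length : Nat
  vect   : Vect length a

takeWhile : (a -> Bool) -> Vect n a -> AnyVect a
takeWhile f []        = MkAnyVect 0 []
takeWhile f (x :: xs) = case f x of
  False => MkAnyVect 0 []
  True  => let MkAnyVect n ys = takeWhile f xs in MkAnyVect (S n) (x :: ys)

-- The length is part of the result, so we learn it by pattern matching.
prefixLength : AnyVect a -> Nat
prefixLength (MkAnyVect n _) = n

-- The caller picks predicate and input; the function picks the length.
-- Here the prefix is [2, 4], so `prefixLength evens` is 2.
evens : AnyVect Nat
evens = takeWhile (< 5) [2, 4, 5, 6]
```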
Note that Idris allows us to be explicit about universal quantification. The type of takeWhile'
can also be written like so:
takeWhile'' : forall a, n, m . (a -> Bool) -> Vect n a -> Vect m a
Universally quantified arguments are desugared to implicit erased arguments by Idris. The above is a less verbose version of the following function type, the likes of which we have seen before:
takeWhile''' : {0 a : _}
             -> {0 n : _}
             -> {0 m : _}
             -> (a -> Bool)
             -> Vect n a
             -> Vect m a
In Idris, we are free to choose whether we want to be explicit about universal quantification. Sometimes being explicit helps with understanding what's going on at the type level. Other languages, for instance PureScript, are stricter about this: There, explicit annotations on universally quantified parameters are mandatory.
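The erasure of universally quantified arguments has a practical consequence, which the following small sketch tries to illustrate (the function names are hypothetical): an erased implicit argument cannot be returned as a runtime value, while an unerased one can.

```idris
import Data.Vect

-- An unerased implicit argument is available at runtime.
lengthOf : {n : Nat} -> Vect n a -> Nat
lengthOf _ = n

-- With quantity zero, `n` exists only at compile time and
-- must not be used as a runtime value.
failing
  erasedLength : {0 n : Nat} -> Vect n a -> Nat
  erasedLength _ = n
```

Writing `forall n .` or leaving `n` unbound in the signature gives the erased variant, which is why the second definition is rejected.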
The Essence of Dependent Pairs
It can take some time and experience to understand what's going on here. At least in my case, it took many sessions of programming in Idris before I figured out what dependent pairs are about: They pair a value of some type with a second value of a type calculated from the first value. For instance, a natural number n
(the value) paired with a vector of length n
(the second value, the type of which depends on the first value). This is such a fundamental concept of programming with dependent types, that a general dependent pair type is provided by the Prelude. Here is its implementation (primed for disambiguation):
record DPair' (a : Type) (p : a -> Type) where
  constructor MkDPair'
  fst : a
  snd : p fst
It is essential to understand what's going on here. There are two parameters: A type a
, and a function p
, calculating a type from a value of type a
. Such a value (fst
) is then used to calculate the type of the second value (snd
). For instance, here is AnyVect a
represented as a DPair
:
AnyVect' : (a : Type) -> Type
AnyVect' a = DPair Nat (\n => Vect n a)
Note how \n => Vect n a
is a function from Nat
to Type
. Idris provides special syntax for describing dependent pairs, as they are important building blocks for programming in languages with first class types:
AnyVect'' : (a : Type) -> Type
AnyVect'' a = (n : Nat ** Vect n a)
We can check at the REPL that the right-hand side of AnyVect''
gets desugared to the right-hand side of AnyVect'
:
Tutorial.Relations> (n : Nat ** Vect n Int)
DPair Nat (\n => Vect n Int)
Idris can infer that n
must be of type Nat
, so we can drop this information. (We still need to put the whole expression in parentheses.)
AnyVect3 : (a : Type) -> Type
AnyVect3 a = (n ** Vect n a)
This allows us to pair a natural number n
with a vector of length n
, which is exactly what we did with AnyVect
. We can therefore rewrite takeWhile
to return a DPair
instead of our custom type AnyVect
. Note that, as with regular pairs, we can use the same syntax (x ** y)
for creating and pattern matching on dependent pairs:
takeWhile3 : (a -> Bool) -> Vect m a -> (n ** Vect n a)
takeWhile3 f [] = (_ ** [])
takeWhile3 f (x :: xs) = case f x of
  False => (_ ** [])
  True  => let (_ ** ys) = takeWhile3 f xs in (_ ** x :: ys)
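As a small standalone sketch of this syntax outside of `takeWhile3`, the following hypothetical definitions build the same dependent pair twice, once with an explicit length and once with an inferred one, and then read the length back by pattern matching.

```idris
import Data.Vect

-- Explicit first component.
threeInts : (n : Nat ** Vect n Int)
threeInts = (3 ** [1, 2, 3])

-- The same value, with the length inferred by Idris.
threeInts' : (n ** Vect n Int)
threeInts' = (_ ** [1, 2, 3])

-- Pattern matching brings the length into scope as a runtime value.
pairLength : (n ** Vect n a) -> Nat
pairLength (n ** _) = n
```

Both `pairLength threeInts` and `pairLength threeInts'` evaluate to `3`.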
Just like with regular pairs, we can use the dependent pair syntax to define dependent triples and larger tuples:
AnyMatrix : (a : Type) -> Type
AnyMatrix a = (m ** n ** Vect m (Vect n a))
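A value of such a nested pair can be built and taken apart with the same syntax. The sketch below repeats `AnyMatrix` and adds two hypothetical definitions, `twoByThree` and `dimensions`.

```idris
import Data.Vect

AnyMatrix : (a : Type) -> Type
AnyMatrix a = (m ** n ** Vect m (Vect n a))

-- Both dimensions are inferred from the nested vector literal.
twoByThree : AnyMatrix Nat
twoByThree = (_ ** _ ** [[1, 2, 3], [4, 5, 6]])

-- Matching on the nested pair yields both dimensions at runtime.
dimensions : AnyMatrix a -> (Nat, Nat)
dimensions (m ** n ** _) = (m, n)
```

Here, `dimensions twoByThree` evaluates to `(2, 3)`.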
Erased Existentials
Sometimes, it is possible to determine the value of an index by pattern matching on a value of the indexed type. For instance, by pattern matching on a vector, we can learn about its length index. In these cases, it is not strictly necessary to carry around the index at runtime, and we can write a special version of a dependent pair where the first argument has quantity zero. Module Data.DPair
from base exports data type Exists
for this use case.
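The following sketch (with hypothetical function names) tries to show what the zero quantity means in practice: the witness stored in an `Exists` cannot be returned at runtime, but the second component is still fully available.

```idris
import Data.DPair
import Data.Vect

-- The first argument of `Evidence` has quantity zero, so it cannot
-- be used as a runtime value.
failing
  leak : Exists (\n => Vect n a) -> Nat
  leak (Evidence n _) = n

-- The payload is still there; its length can be recomputed by
-- traversing the vector.
payloadLength : Exists (\n => Vect n a) -> Nat
payloadLength (Evidence _ xs) = length xs
```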
As an example, here is a version of takeWhile
returning a value of type Exists
:
takeWhileExists : (a -> Bool) -> Vect m a -> Exists (\n => Vect n a)
takeWhileExists f []        = Evidence _ []
takeWhileExists f (x :: xs) = case f x of
  True  => let Evidence _ ys = takeWhileExists f xs
            in Evidence _ (x :: ys)
  -- stop at the first element that fails the predicate
  False => Evidence _ []
In order to restore an erased value, data type Singleton
from base module Data.Singleton
can be useful. It is indexed by the value it stores:
true : Singleton True
true = Val True
This is called a singleton type: A type corresponding to exactly one value. It is a type error to return any other value for constant true
, and Idris knows this:
true' : Singleton True
true' = Val _
We can use this to conjure the (erased!) length of a vector out of thin air:
vectLength : Vect n a -> Singleton n
vectLength [] = Val 0
vectLength (x :: xs) = let Val k = vectLength xs in Val (S k)
This function comes with much stronger guarantees than Data.Vect.length
: The latter claims to just return any natural number, while vectLength
must return exactly n
in order to type check. As a demonstration, here is a well-typed bogus implementation of length
:
bogusLength : Vect n a -> Nat
bogusLength = const 0
This would not be accepted as a valid implementation of vectLength
, as you may quickly verify yourself.
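For reference, here is a standalone sketch of that check. It repeats `vectLength` and wraps the bogus variant (a hypothetical name) in a `failing` block, which only compiles if its content does not type check.

```idris
import Data.Singleton
import Data.Vect

vectLength : Vect n a -> Singleton n
vectLength []        = Val 0
vectLength (x :: xs) = let Val k = vectLength xs in Val (S k)

-- `Val 0` has type `Singleton 0`, which matches `Singleton n` only
-- if `n` is zero, so the constant function is rejected.
failing
  bogusVectLength : Vect n a -> Singleton n
  bogusVectLength _ = Val 0
```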
With the help of vectLength
(but not with Data.Vect.length
) we can convert an erased existential to a proper dependent pair:
toDPair : Exists (\n => Vect n a) -> (m ** Vect m a)
toDPair (Evidence _ as) = let Val m = vectLength as in (m ** as)
Again, as a quick exercise, try implementing toDPair
in terms of length
, and note how Idris will fail to unify the result of length
with the actual length of the vector.
Exercises part 1
1. Declare and implement a function for filtering a vector similar to `Data.List.filter`.
2. Declare and implement a function for mapping a partial function over the values of a vector similar to `Data.List.mapMaybe`.
3. Declare and implement a function similar to `Data.List.dropWhile` for vectors. Use `Data.DPair.Exists` as your return type.
4. Repeat exercise 3 but return a proper dependent pair. Use the function from exercise 3 in your implementation.
Use Case: Nucleic Acids
module Tutorial.DPair.DNA
import Data.DPair
import Data.Either
import Data.HList
import Data.List
import Data.List1
import Data.Singleton
import Data.String
import Data.Vect
import Text.CSV
%default total
We'd like to come up with a small, simplified library for running computations on nucleic acids: RNA and DNA. These are built from five types of nucleobases: three are used in both types of nucleic acids, and the remaining two are each specific to one type of acid. We'd like to make sure that only valid bases are in strands of nucleic acids. Here's a possible encoding:
data BaseType = DNABase | RNABase
data Nucleobase : BaseType -> Type where
  Adenine  : Nucleobase b
  Cytosine : Nucleobase b
  Guanine  : Nucleobase b
  Thymine  : Nucleobase DNABase
  Uracile  : Nucleobase RNABase
NucleicAcid : BaseType -> Type
NucleicAcid = List . Nucleobase
RNA : Type
RNA = NucleicAcid RNABase
DNA : Type
DNA = NucleicAcid DNABase
encodeBase : Nucleobase b -> Char
encodeBase Adenine = 'A'
encodeBase Cytosine = 'C'
encodeBase Guanine = 'G'
encodeBase Thymine = 'T'
encodeBase Uracile = 'U'
encode : NucleicAcid b -> String
encode = pack . map encodeBase
It is a type error to use Uracile
in a strand of DNA:
failing "Mismatch between: RNABase and DNABase."
  errDNA : DNA
  errDNA = [Uracile, Adenine]
Note how we used a variable for nucleobases Adenine
, Cytosine
, and Guanine
: These are again universally quantified, and client code is free to choose a value here. This allows us to use these bases in strands of DNA and RNA:
dna1 : DNA
dna1 = [Adenine, Cytosine, Guanine]
rna1 : RNA
rna1 = [Adenine, Cytosine, Guanine]
With Thymine
and Uracile
, we are more restrictive: Thymine
is only allowed in DNA, while Uracile
is restricted to be used in RNA strands. Let's write parsers for strands of DNA and RNA:
readAnyBase : Char -> Maybe (Nucleobase b)
readAnyBase 'A' = Just Adenine
readAnyBase 'C' = Just Cytosine
readAnyBase 'G' = Just Guanine
readAnyBase _ = Nothing
readRNABase : Char -> Maybe (Nucleobase RNABase)
readRNABase 'U' = Just Uracile
readRNABase c = readAnyBase c
readDNABase : Char -> Maybe (Nucleobase DNABase)
readDNABase 'T' = Just Thymine
readDNABase c = readAnyBase c
readRNA : String -> Maybe RNA
readRNA = traverse readRNABase . unpack
readDNA : String -> Maybe DNA
readDNA = traverse readDNABase . unpack
Again, in case of the bases appearing in both kinds of strands, users of the universally quantified readAnyBase
are free to choose what base type they want, but they will never get a Thymine
or Uracile
value.
We can now implement some simple calculations on sequences of nucleobases. For instance, we can come up with the complementary strand:
complementRNA' : RNA -> RNA
complementRNA' = map calc
  where calc : Nucleobase RNABase -> Nucleobase RNABase
        calc Guanine  = Cytosine
        calc Cytosine = Guanine
        calc Adenine  = Uracile
        calc Uracile  = Adenine

complementDNA' : DNA -> DNA
complementDNA' = map calc
  where calc : Nucleobase DNABase -> Nucleobase DNABase
        calc Guanine  = Cytosine
        calc Cytosine = Guanine
        calc Adenine  = Thymine
        calc Thymine  = Adenine
Ugh, code repetition! Not too bad here, but imagine there were dozens of bases with only a few specialized ones. Surely, we can do better? Unfortunately, the following won't work:
complementBase' : Nucleobase b -> Nucleobase b
complementBase' Adenine = ?what_now
complementBase' Cytosine = Guanine
complementBase' Guanine = Cytosine
complementBase' Thymine = Adenine
complementBase' Uracile = Adenine
All goes well with the exception of the Adenine
case. Remember: Parameter b
is universally quantified, and the callers of our function can decide what b
is supposed to be. We therefore can't just return Thymine
: Idris will respond with a type error since callers might want a Nucleobase RNABase
instead. One way to go about this is to take an additional unerased argument (explicit or implicit) representing the base type:
complementBase : (b : BaseType) -> Nucleobase b -> Nucleobase b
complementBase DNABase Adenine = Thymine
complementBase RNABase Adenine = Uracile
complementBase _ Cytosine = Guanine
complementBase _ Guanine = Cytosine
complementBase _ Thymine = Adenine
complementBase _ Uracile = Adenine
This is again an example of a dependent function type (also called a pi type): The input and output types both depend on the value of the first argument. We can now use this to calculate the complement of any nucleic acid:
complement : (b : BaseType) -> NucleicAcid b -> NucleicAcid b
complement b = map (complementBase b)
Now, here is an interesting use case: We'd like to read a sequence of nucleobases from user input, accepting two strings: the first telling us whether the user plans to enter a DNA or RNA sequence, the second being the sequence itself. What should be the type of such a function? Well, we're describing computations with side effects, so something involving IO
seems about right. User input almost always needs to be validated or translated, so something might go wrong and we need an error type for this case. Finally, our users can decide whether they want to enter a strand of RNA or DNA, so this distinction should be encoded as well.
Of course, it is always possible to write a custom sum type for such a use case:
data Result : Type where
  UnknownBaseType : String -> Result
  InvalidSequence : String -> Result
  GotDNA          : DNA -> Result
  GotRNA          : RNA -> Result
This has all possible outcomes encoded in a single data type. However, it is lacking in terms of flexibility. If we want to handle errors early on and just extract a strand of RNA or DNA, we need yet another data type:
data RNAOrDNA = ItsRNA RNA | ItsDNA DNA
This might be the way to go, but for results with many options, this can get cumbersome quickly. Also: Why come up with a custom data type when we already have the tools to deal with this at hand?
Here is how we can encode this with a dependent pair:
namespace InputError
  public export
  data InputError : Type where
    UnknownBaseType : String -> InputError
    InvalidSequence : String -> InputError

readAcid : (b : BaseType) -> String -> Either InputError (NucleicAcid b)
readAcid b str =
  let err = InvalidSequence str
   in case b of
        DNABase => maybeToEither err $ readDNA str
        RNABase => maybeToEither err $ readRNA str

getNucleicAcid : IO (Either InputError (b ** NucleicAcid b))
getNucleicAcid = do
  baseString <- getLine
  case baseString of
    "DNA" => map (MkDPair _) . readAcid DNABase <$> getLine
    "RNA" => map (MkDPair _) . readAcid RNABase <$> getLine
    _     => pure $ Left (UnknownBaseType baseString)
Note how we paired the base type with the nucleic acid sequence. Assume now we implement a function for transcribing a strand of DNA to RNA, and we'd like to convert a sequence of nucleobases from user input to the corresponding RNA sequence. Here's how to do this:
transcribeBase : Nucleobase DNABase -> Nucleobase RNABase
transcribeBase Adenine = Uracile
transcribeBase Cytosine = Guanine
transcribeBase Guanine = Cytosine
transcribeBase Thymine = Adenine
transcribe : DNA -> RNA
transcribe = map transcribeBase
printRNA : RNA -> IO ()
printRNA = putStrLn . encode
transcribeProg : IO ()
transcribeProg = do
  Right (b ** seq) <- getNucleicAcid
    | Left (InvalidSequence str) => putStrLn $ "Invalid sequence: " ++ str
    | Left (UnknownBaseType str) => putStrLn $ "Unknown base type: " ++ str
  case b of
    DNABase => printRNA $ transcribe seq
    RNABase => printRNA seq
By pattern matching on the first value of the dependent pair, we could determine whether the second value is a DNA or an RNA sequence. In the first case, we had to transcribe the sequence first; in the second case, we could invoke printRNA
directly.
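The same refinement works without `IO`. The following standalone sketch repeats the relevant definitions and adds a hypothetical pure function `toRNA`, which matches directly on the first component of the pair. In each clause, the type of `seq` is refined according to the base type that was matched.

```idris
data BaseType = DNABase | RNABase

data Nucleobase : BaseType -> Type where
  Adenine  : Nucleobase b
  Cytosine : Nucleobase b
  Guanine  : Nucleobase b
  Thymine  : Nucleobase DNABase
  Uracile  : Nucleobase RNABase

NucleicAcid : BaseType -> Type
NucleicAcid = List . Nucleobase

transcribeBase : Nucleobase DNABase -> Nucleobase RNABase
transcribeBase Adenine  = Uracile
transcribeBase Cytosine = Guanine
transcribeBase Guanine  = Cytosine
transcribeBase Thymine  = Adenine

-- In the first clause `seq` is a strand of DNA and must be transcribed;
-- in the second clause it already is a strand of RNA.
toRNA : (b ** NucleicAcid b) -> NucleicAcid RNABase
toRNA (DNABase ** seq) = map transcribeBase seq
toRNA (RNABase ** seq) = seq
```

Swapping the right-hand sides of the two clauses would be a type error, which is exactly the kind of mistake the precise types are meant to catch.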
In a more interesting scenario, we would translate the RNA sequence to the corresponding protein sequence. Still, this example shows how to deal with a simplified real-world scenario: Data may be encoded differently and come from different sources. By using precise types, we are forced to first convert values to the correct format. Failing to do so leads to a compile-time error instead of an error at runtime or, even worse, the program silently running a bogus computation.
Dependent Records vs Sum Types
Dependent records as shown for AnyVect a
are a generalization of dependent pairs: We can have an arbitrary number of fields and use the values stored therein to calculate the types of other values. For very simple cases like the example with nucleobases, it doesn't matter too much whether we use a DPair
, a custom dependent record, or even a sum type. In fact, the three encodings are equally expressive:
Acid1 : Type
Acid1 = (b ** NucleicAcid b)
record Acid2 where
  constructor MkAcid2
  baseType : BaseType
  sequence : NucleicAcid baseType

data Acid3 : Type where
  SomeRNA : RNA -> Acid3
  SomeDNA : DNA -> Acid3
It is trivial to write lossless conversions between these encodings, and with each encoding we can decide with a simple pattern match whether we currently have a sequence of RNA or DNA. However, dependent types can depend on more than one value, as we will see in the exercises. In such cases, sum types and dependent pairs quickly become unwieldy, and you should go for an encoding as a dependent record.
Exercises part 2
Sharpen your skills in using dependent pairs and dependent records! In exercises 2 to 7 you have to decide yourself, when a function should return a dependent pair or record, when a function requires additional arguments, on which you can pattern match, and what other utility functions might be necessary.
1. Prove that the three encodings for nucleobases are isomorphic (meaning: of the same structure) by writing lossless conversion functions from `Acid1` to `Acid2` and back. Likewise for `Acid1` and `Acid3`.
2. Sequences of nucleobases can be encoded in one of two directions: Sense and antisense. Declare a new data type to describe the sense of a sequence of nucleobases, and add this as an additional parameter to type `Nucleobase` and types `DNA` and `RNA`.
3. Refine the types of `complement` and `transcribe`, so that they reflect the changing of sense. In case of `transcribe`, a strand of antisense DNA is converted to a strand of sense RNA.
4. Define a dependent record storing the base type and sense together with a sequence of nucleobases.
5. Adjust `readRNA` and `readDNA` in such a way that the sense of a sequence is read from the input string. Sense strands are encoded like so: "5´-CGGTAG-3´". Antisense strands are encoded like so: "3´-CGGTAG-5´".
6. Adjust `encode` in such a way that it includes the sense in its output.
7. Enhance `getNucleicAcid` and `transcribeProg` in such a way that the sense and base type are stored together with the sequence, and that `transcribeProg` always prints the sense RNA strand (after transcription, if necessary).
8. Enjoy the fruits of your labour and test your program at the REPL.
Note: Instead of using a dependent record, we could again have used a sum type of four constructors to encode the different types of sequences. However, the number of constructors required corresponds to the product of the number of values of each type level index. Therefore, this number can grow quickly and sum type encodings can lead to lengthy blocks of pattern matches in these cases.
Use Case: CSV Files with a Schema
module Tutorial.DPair.CSV
import Control.Monad.State
import Data.DPair
import Data.Either
import Data.HList
import Data.List
import Data.List1
import Data.Singleton
import Data.String
import Data.Vect
import Text.CSV
%default total
In this section, we are going to look at an extended example based on our previous work on CSV parsers. We'd like to write a small command-line program, where users can specify a schema for the CSV tables they'd like to parse and load into memory. Before we begin, here is a REPL session running the final program, which you will complete in the exercises:
Solutions.DPair> :exec main
Enter a command: load resources/example
Table loaded. Schema: str,str,fin2023,str?,boolean?
Enter a command: get 3
Row 3:
str | str | fin2023 | str? | boolean?
------------------------------------------
Floor | Jansen | 1981 | | t
Enter a command: add Mikael,Stanne,1974,,
Row prepended:
str | str | fin2023 | str? | boolean?
-------------------------------------------
Mikael | Stanne | 1974 | |
Enter a command: get 1
Row 1:
str | str | fin2023 | str? | boolean?
-------------------------------------------
Mikael | Stanne | 1974 | |
Enter a command: delete 1
Deleted row: 1.
Enter a command: get 1
Row 1:
str | str | fin2023 | str? | boolean?
-----------------------------------------
Rob | Halford | 1951 | |
Enter a command: quit
Goodbye.
This example was inspired by a similar program used as an example in the Type-Driven Development with Idris book.
We'd like to focus on several things here:
- Purity: With the exception of the main program loop, all functions used in the implementation should be pure, which in this context means "not running in any monad with side effects such as `IO`".
- Fail early: With the exception of the command parser, all functions updating the table and handling queries should be typed and implemented in such a way that they cannot fail.
We are often well advised to adhere to these two guidelines, as they can make the majority of our functions easier to implement and test.
Since we allow users of our library to specify a schema (order and types of columns) for the table they work with, this information is not known until runtime. The same goes for the current size of the table. We will therefore store both values as fields in a dependent record.
Encoding the Schema
We need to inspect the table schema at runtime. Although theoretically possible, it is not advisable to operate on Idris types directly here. We'd rather use a closed custom data type describing the types of columns we understand. In a first try, we only support some Idris primitives:
data ColType = I64 | Str | Boolean | Float
Schema : Type
Schema = List ColType
Next, we need a way to convert a Schema
to a list of Idris types, which we will then use as the index of a heterogeneous list representing the rows in our table:
IdrisType : ColType -> Type
IdrisType I64 = Int64
IdrisType Str = String
IdrisType Boolean = Bool
IdrisType Float = Double
Row : Schema -> Type
Row = HList . map IdrisType
We can now describe a table as a dependent record storing the table's content as a vector of rows. In order to safely index rows of the table and parse new rows to be added, the current schema and size of the table must be known at runtime:
record Table where
  constructor MkTable
  schema : Schema
  size   : Nat
  rows   : Vect size (Row schema)
Finally, we define an indexed data type describing commands operating on the current table. Using the current table as the command's index allows us to make sure that indices for accessing and deleting rows are within bounds and that new rows agree with the current schema. This is necessary to uphold our second design principle: All functions operating on tables must do so without the possibility of failure.
data Command : (t : Table) -> Type where
  PrintSchema : Command t
  PrintSize   : Command t
  New         : (newSchema : Schema) -> Command t
  Prepend     : Row (schema t) -> Command t
  Get         : Fin (size t) -> Command t
  Delete      : Fin (size t) -> Command t
  Quit        : Command t
We can now implement the main application logic: How user entered commands affect the application's current state. As promised, this comes without the risk of failure, so we don't have to wrap the return type in an Either
:
applyCommand : (t : Table) -> Command t -> Table
applyCommand t PrintSchema = t
applyCommand t PrintSize = t
applyCommand _ (New ts) = MkTable ts _ []
applyCommand (MkTable ts n rs) (Prepend r) = MkTable ts _ $ r :: rs
applyCommand t (Get x) = t
applyCommand t Quit = t
applyCommand (MkTable ts n rs) (Delete x) = case n of
  S k => MkTable ts k (deleteAt x rs)
  Z   => absurd x
Note that the constructors of Command t
are typed in such a way that indices are always within bounds (constructors Get
and Delete
), and new rows adhere to the table's current schema (constructor Prepend
).
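To see these guarantees at work, here is a standalone sketch with a concrete table. It repeats the definitions it needs, uses a minimal `HList` as a stand-in for the one from `Data.HList`, and keeps only the two `Command` constructors relevant here. The table `OneRow` and the commands are hypothetical examples. `OneRow` is given an uppercase name because a lowercase name in a type signature would be treated as an implicitly bound variable.

```idris
import Data.Fin
import Data.Vect

data ColType = I64 | Str | Boolean | Float

Schema : Type
Schema = List ColType

IdrisType : ColType -> Type
IdrisType I64     = Int64
IdrisType Str     = String
IdrisType Boolean = Bool
IdrisType Float   = Double

-- Minimal heterogeneous list (assumed stand-in for `Data.HList`).
data HList : List Type -> Type where
  Nil  : HList []
  (::) : t -> HList ts -> HList (t :: ts)

Row : Schema -> Type
Row = HList . map IdrisType

record Table where
  constructor MkTable
  schema : Schema
  size   : Nat
  rows   : Vect size (Row schema)

-- Only the two constructors needed for this sketch.
data Command : (t : Table) -> Type where
  Prepend : Row (schema t) -> Command t
  Get     : Fin (size t) -> Command t

OneRow : Table
OneRow = MkTable [Str, I64] 1 [["Floor", 1981]]

-- Index 0 is within bounds for a table of size 1.
getFirst : Command OneRow
getFirst = Get FZ

-- The new row matches the schema: a string followed by an `Int64`.
addRow : Command OneRow
addRow = Prepend ["Rob", 1951]

-- Index 1 is out of bounds: `FS FZ` is not a value of type `Fin 1`.
failing
  getSecond : Command OneRow
  getSecond = Get (FS FZ)

-- The columns are in the wrong order for this schema.
failing
  badRow : Command OneRow
  badRow = Prepend [1951, "Rob"]
```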
One thing you might not have seen so far is the call to absurd
on the last line. This is a derived function of the Uninhabited
interface, which is used to describe types such as Void
or - in the case above - Fin 0
, of which there can be no value. Function absurd
is then just another manifestation of the principle of explosion. If this doesn't make too much sense yet, don't worry. We will look at Void
and its uses in the next chapter.
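As a minimal standalone sketch (with hypothetical function names), here are two ways to express that a value of type `Fin 0` cannot exist: via `absurd`, and via impossible patterns.

```idris
import Data.Fin

-- There is no value of type `Fin 0`, so we may promise any result type.
fromEmpty : Fin 0 -> a
fromEmpty x = absurd x

-- The same fact, expressed with impossible patterns instead.
fromEmpty' : Fin 0 -> a
fromEmpty' FZ     impossible
fromEmpty' (FS _) impossible
```

Neither function can ever be called at runtime, since no caller can produce the required argument.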
Parsing Commands
User input validation is an important topic when writing applications. If it happens early, you can keep larger parts of your application pure (which - in this context - means: "without the possibility of failure") and provably total. If done properly, this step encodes and handles most if not all ways in which things can go wrong in your program, allowing you to come up with clear error messages telling users exactly what caused an issue. As you surely have experienced yourself, there are few things more frustrating than a non-trivial computer program terminating with an unhelpful "There was an error" message.
So, in order to treat this important topic with all due respect, we are first going to implement a custom error type. This is not strictly necessary for small programs, but once your software gets more complex, it can be tremendously helpful for keeping track of what can go wrong where. In order to figure out what can possibly go wrong, we first need to decide on how the commands should be entered. Here, we use a single keyword for each command, together with an optional number of arguments separated from the keyword by a single space character. For instance: "new i64,boolean,str,str"
, for initializing an empty table with a new schema. With this settled, here is a list of things that can go wrong, and the messages we'd like to print:
- A bogus command is entered. We repeat the input with a message that we don't know the command plus a list of commands we know about.
- An invalid schema was entered. In this case, we list the position of the first unknown type, the string we found there, and a list of types we know about.
- An invalid CSV encoding of a row was entered. We list the erroneous position, the string encountered there, plus the expected type. In case of a too small or too large number of fields, we also print a corresponding error message.
- An index was out of bounds. This can happen, when users try to access or delete specific rows. We print the current number of rows plus the value entered.
- A value not representing a natural number was entered as an index. We print a corresponding error message.
That's a lot of stuff to keep track of, so let's encode this in a sum type:
data Error : Type where
  UnknownCommand : String -> Error
  UnknownType    : (pos : Nat) -> String -> Error
  InvalidField   : (pos : Nat) -> ColType -> String -> Error
  ExpectedEOI    : (pos : Nat) -> String -> Error
  UnexpectedEOI  : (pos : Nat) -> String -> Error
  OutOfBounds    : (size : Nat) -> (index : Nat) -> Error
  NoNat          : String -> Error
In order to conveniently construct our error messages, it is best to use Idris' string interpolation facilities: We can embed arbitrary string expressions in a string literal by enclosing them in curly braces, the opening one of which must be preceded by a backslash. Like so: "foo \{myExpr a b c}"
. We can pair this with multiline string literals to get nicely formatted error messages.
showColType : ColType -> String
showColType I64 = "i64"
showColType Str = "str"
showColType Boolean = "boolean"
showColType Float = "float"
showSchema : Schema -> String
showSchema = concat . intersperse "," . map showColType
allTypes : String
allTypes = concat
         . List.intersperse ", "
         . map showColType
         $ [I64,Str,Boolean,Float]
showError : Error -> String
showError (UnknownCommand x) = """
  Unknown command: \{x}.
  Known commands are: clear, schema, size, new, add, get, delete, quit.
  """

showError (UnknownType pos x) = """
  Unknown type at position \{show pos}: \{x}.
  Known types are: \{allTypes}.
  """

showError (InvalidField pos tpe x) = """
  Invalid value at position \{show pos}.
  Expected type: \{showColType tpe}.
  Value found: \{x}.
  """

showError (ExpectedEOI k x) = """
  Expected end of input.
  Position: \{show k}
  Input: \{x}
  """

showError (UnexpectedEOI k x) = """
  Unexpected end of input.
  Position: \{show k}
  Input: \{x}
  """

showError (OutOfBounds size index) = """
  Index out of bounds.
  Size of table: \{show size}
  Index: \{show index}
  Note: Indices start at 1.
  """

showError (NoNat x) = "Not a natural number: \{x}"
We can now write parsers for the different commands. We need facilities to parse vector indices, schemata, and CSV rows. Since we are using a CSV format for encoding and decoding rows, it makes sense to also encode the schema as a comma-separated list of values:
zipWithIndex : Traversable t => t a -> t (Nat, a)
zipWithIndex = evalState 1 . traverse pairWithIndex
  where pairWithIndex : a -> State Nat (Nat,a)
        pairWithIndex v = (,v) <$> get <* modify S
fromCSV : String -> List String
fromCSV = forget . split (',' ==)
readColType : Nat -> String -> Either Error ColType
readColType _ "i64" = Right I64
readColType _ "str" = Right Str
readColType _ "boolean" = Right Boolean
readColType _ "float" = Right Float
readColType n s = Left $ UnknownType n s
readSchema : String -> Either Error Schema
readSchema = traverse (uncurry readColType) . zipWithIndex . fromCSV
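To illustrate how the position ends up in the error, here is a standalone sketch of the schema parser with two sample inputs. It uses a reduced `Error` type with only the constructor needed here, and the names `ok` and `bad` are hypothetical.

```idris
import Control.Monad.State
import Data.List1
import Data.String

data ColType = I64 | Str | Boolean | Float

-- Reduced error type with only the case needed in this sketch.
data Error : Type where
  UnknownType : (pos : Nat) -> String -> Error

zipWithIndex : Traversable t => t a -> t (Nat, a)
zipWithIndex = evalState 1 . traverse pairWithIndex
  where pairWithIndex : a -> State Nat (Nat,a)
        pairWithIndex v = (,v) <$> get <* modify S

fromCSV : String -> List String
fromCSV = forget . split (',' ==)

readColType : Nat -> String -> Either Error ColType
readColType _ "i64"     = Right I64
readColType _ "str"     = Right Str
readColType _ "boolean" = Right Boolean
readColType _ "float"   = Right Float
readColType n s         = Left $ UnknownType n s

readSchema : String -> Either Error (List ColType)
readSchema = traverse (uncurry readColType) . zipWithIndex . fromCSV

-- Right [I64, Str]
ok : Either Error (List ColType)
ok = readSchema "i64,str"

-- Left (UnknownType 2 "foo"): positions are counted from 1.
bad : Either Error (List ColType)
bad = readSchema "i64,foo,str"
```

Since `traverse` in `Either` stops at the first `Left`, only the first unknown type is reported.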
We also need to decode CSV content based on the current schema. Note how we can do so in a type-safe manner by pattern matching on the schema, which will not be known until runtime. Unfortunately, we need to reimplement CSV-parsing, because we want to add the expected type to the error messages (a thing that would be much harder to do with interface CSVLine
and error type CSVError
).
decodeField : Nat -> (c : ColType) -> String -> Either Error (IdrisType c)
decodeField k c s =
  let err = InvalidField k c s
   in case c of
        I64     => maybeToEither err $ read s
        Str     => maybeToEither err $ read s
        Boolean => maybeToEither err $ read s
        Float   => maybeToEither err $ read s

decodeRow : {ts : _} -> String -> Either Error (Row ts)
decodeRow s = go 1 ts $ fromCSV s
  where go : Nat -> (cs : Schema) -> List String -> Either Error (Row cs)
        go k []        []        = Right []
        go k []        (_ :: _)  = Left $ ExpectedEOI k s
        go k (_ :: _)  []        = Left $ UnexpectedEOI k s
        go k (c :: cs) (s :: ss) = [| decodeField k c s :: go (S k) cs ss |]
There is no hard and fast rule about whether to pass an index as an implicit argument or not. Some considerations:
- Pattern matching on explicit arguments comes with less syntactic overhead.
- If an argument can be inferred from the context most of the time, consider passing it as an implicit to make your function nicer to use in client code.
- Use explicit (possibly erased) arguments for values that can't be inferred by Idris most of the time.
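The trade-offs listed above can be sketched with a small, self-contained example. The names fillE, fillI, zeros, and emptyOf are hypothetical and not part of the table editor; they only illustrate the three styles of passing an argument.

```idris
import Data.Vect

%default total

-- Explicit argument: we can pattern match on the length directly.
fillE : (n : Nat) -> a -> Vect n a
fillE Z     _ = []
fillE (S k) x = x :: fillE k x

-- Implicit argument: matching needs the `{n = ...}` syntax, but callers
-- may omit the length whenever the expected type determines it.
fillI : {n : Nat} -> a -> Vect n a
fillI {n = Z}   _ = []
fillI {n = S k} x = x :: fillI x

-- Here the length is inferred from the type signature.
zeros : Vect 3 Nat
zeros = fillI 0

-- Explicit erased argument: the element type cannot be inferred from
-- an empty list, so the caller has to spell it out.
emptyOf : (0 a : Type) -> List a
emptyOf _ = []

main : IO ()
main = do
  printLn (fillE 2 'x')
  printLn zeros
  printLn (emptyOf Nat)
```

Both fillE and fillI compute the same vector; the difference lies only in how pleasant they are to implement and to call.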
All that is missing now is a way to parse indices for accessing the current table's rows. We use the convention that indices start at one instead of zero, which feels more natural to most non-programmers.
readFin : {n : _} -> String -> Either Error (Fin n)
readFin s = do
S k <- maybeToEither (NoNat s) $ parsePositive {a = Nat} s
| Z => Left $ OutOfBounds n Z
maybeToEither (OutOfBounds n $ S k) $ natToFin k n
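To see the one-based convention in isolation, here is a minimal sketch without any string parsing or error type. The helper fromOneBased is hypothetical; only natToFin comes from Data.Fin.

```idris
import Data.Fin

%default total

-- Hypothetical helper: converts a one-based position into a
-- zero-based index of type `Fin n`.
fromOneBased : (n : Nat) -> Nat -> Maybe (Fin n)
fromOneBased _ Z     = Nothing       -- position 0 is never valid
fromOneBased n (S k) = natToFin k n  -- position k + 1 becomes index k

main : IO ()
main = do
  printLn (fromOneBased 3 0)
  printLn (fromOneBased 3 1)
  printLn (fromOneBased 3 3)
  printLn (fromOneBased 3 4)
```

For a table of size 3, positions 1 to 3 are mapped to the indices FZ to FS (FS FZ), while positions 0 and 4 are rejected. This mirrors the two failure cases that readFin reports as OutOfBounds.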
We are finally able to implement a parser for user commands. Function Data.String.words
is used for splitting a string at space characters. In most cases, we expect the name of the command plus a single argument without additional spaces. CSV rows can have additional space characters, however, so we use Data.String.unwords
on the split string.
readCommand : (t : Table) -> String -> Either Error (Command t)
readCommand _ "schema" = Right PrintSchema
readCommand _ "size" = Right PrintSize
readCommand _ "quit" = Right Quit
readCommand (MkTable ts n _) s = case words s of
["new", str] => New <$> readSchema str
"add" :: ss => Prepend <$> decodeRow (unwords ss)
["get", str] => Get <$> readFin str
["delete", str] => Delete <$> readFin str
_ => Left $ UnknownCommand s
Running the Application
All that's left to do is to write functions for printing the results of commands to users and run the application in a loop until command "quit"
is entered.
encodeField : (t : ColType) -> IdrisType t -> String
encodeField I64 x = show x
encodeField Str x = show x
encodeField Boolean True = "t"
encodeField Boolean False = "f"
encodeField Float x = show x
encodeRow : (ts : List ColType) -> Row ts -> String
encodeRow ts = concat . intersperse "," . go ts
where go : (cs : List ColType) -> Row cs -> Vect (length cs) String
go [] [] = []
go (c :: cs) (v :: vs) = encodeField c v :: go cs vs
result : (t : Table) -> Command t -> String
result t PrintSchema = "Current schema: \{showSchema t.schema}"
result t PrintSize = "Current size: \{show t.size}"
result _ (New ts) = "Created table. Schema: \{showSchema ts}"
result t (Prepend r) = "Row prepended: \{encodeRow t.schema r}"
result _ (Delete x) = "Deleted row: \{show $ FS x}."
result _ Quit = "Goodbye."
result t (Get x) =
"Row \{show $ FS x}: \{encodeRow t.schema (index x t.rows)}"
covering
runProg : Table -> IO ()
runProg t = do
putStr "Enter a command: "
str <- getLine
case readCommand t str of
Left err => putStrLn (showError err) >> runProg t
Right Quit => putStrLn (result t Quit)
Right cmd => putStrLn (result t cmd) >>
runProg (applyCommand t cmd)
covering
main : IO ()
main = runProg $ MkTable [] _ []
Exercises part 3
The challenges presented here all deal with enhancing our table editor in several interesting ways. Some of them are more a matter of style and less a matter of learning to write dependently typed programs, so feel free to solve these as you please. Exercises 1 to 3 should be considered to be mandatory.
1. Add support for storing Idris types Integer and Nat in CSV columns.
2. Add support for Fin n to CSV columns. Note: We need runtime access to n in order for this to work.
3. Add support for optional types to CSV columns. Since missing values should be encoded by empty strings, it makes no sense to allow for nested optional types, meaning that types like Maybe Nat should be allowed while Maybe (Maybe Nat) should not.
   Hint: There are several ways to encode these, one being to add a boolean index to ColType.
4. Add a command for printing the whole table. Bonus points if all columns are properly aligned.
5. Add support for simple queries: Given a column number and a value, list all rows where entries match the given value.
   This might be a challenge, as the types get pretty interesting.
6. Add support for loading and saving tables from and to disk. A table should be stored in two files: One for the schema and one for the CSV content.
   Note: Reading files in a provably total way can be pretty hard and will be a topic for another day. For now, just use function readFile exported from System.File in base for reading a file as a whole. This function is partial, because it will not terminate when used with an infinite input stream such as /dev/urandom or /dev/zero. It is important to not use assert_total here. Using partial functions like readFile might well pose a security risk in a real-world application, so eventually, we'd have to deal with this and allow for some way to limit the size of accepted input. It is therefore best to make this partiality visible and annotate all downstream functions accordingly.
You can find an implementation of these additions in the solutions. A small example table can be found in folder resources
.
Note: There are of course tons of projects to pursue from here, such as writing a proper query language, calculating new rows from existing ones, accumulating values in a column, concatenating and zipping tables, and so on. We will stop for now, probably coming back to this in later examples.
Conclusion
Dependent pairs and records are necessary for inspecting, at runtime, the values that define the types we work with. By pattern matching on these values, we learn about the types and possible shapes of other values, allowing us to reduce the number of potential bugs in our programs.
In the next chapter, we start learning how to write data types that we use as proofs that certain contracts between values hold. These will eventually allow us to define pre- and postconditions for our function arguments and output types.
Propositional Equality
In the last chapter, we learned how dependent pairs and records can be used to calculate types from values only known at runtime by pattern matching on these values. We will now look at how we can describe relations - or contracts - between values as types, and how we can use values of these types as proofs that the contracts hold.
Equality as a Type
module Tutorial.Eq.Eq
import Data.Either
import Data.HList
import Data.Vect
import Data.String
%default total
Imagine we'd like to concatenate the contents of two CSV files, both of which we stored on disk as tables together with their schemata, as shown in our discussion about dependent pairs:
public export
data ColType = I64 | Str | Boolean | Float
public export
Schema : Type
Schema = List ColType
IdrisType : ColType -> Type
IdrisType I64 = Int64
IdrisType Str = String
IdrisType Boolean = Bool
IdrisType Float = Double
Row : Schema -> Type
Row = HList . map IdrisType
record Table where
constructor MkTable
schema : Schema
size : Nat
rows : Vect size (Row schema)
concatTables1 : Table -> Table -> Maybe Table
We will not be able to implement concatTables1
by appending the two row vectors, unless we can somehow verify that the two schemata are identical. "Well," I hear you say, "that shouldn't be a big issue! Just implement Eq
for ColType
". Let's give this a try:
Eq ColType where
I64 == I64 = True
Str == Str = True
Boolean == Boolean = True
Float == Float = True
_ == _ = False
concatTables1 (MkTable s1 m rs1) (MkTable s2 n rs2) = case s1 == s2 of
True => ?what_now
False => Nothing
Somehow, this doesn't seem to work. If we inspect the context of hole what_now
, Idris still thinks that s1
and s2
are different, and if we go ahead and invoke Vect.(++)
anyway in the True
case, Idris will respond with a type error.
Tutorial.Relations> :t what_now
m : Nat
s1 : List ColType
rs1 : Vect m (HList (map IdrisType s1))
n : Nat
s2 : List ColType
rs2 : Vect n (HList (map IdrisType s2))
------------------------------
what_now : Maybe Table
The problem is that there is no reason for Idris to unify the two values, even though (==) returned True, because the result of (==) carries no information beyond being a value of type Bool. We may think that the two values must be identical if the result is True, but Idris is not convinced. In fact, the following implementation of Eq ColType would be perfectly fine as far as the type checker is concerned:
Eq ColType where
_ == _ = True
So Idris is right in not trusting us. You might expect it to inspect the implementation of (==) and figure out on its own what the True result means, but this is not how these things work in general, because most of the time the number of computational paths to check would be far too large. As a consequence, Idris is able to evaluate functions during unification, but it will not trace back information about function arguments from a function's result for us. We can do so manually, however, as we will see later.
A Type for equal Schemata
The problem described above is similar to what we saw when we talked about the benefit of singleton types: The types are not precise enough. What we are going to do now is something we'll repeat time and again for different use cases: We encode a contract between values in an indexed data type:
public export
data SameSchema : (s1 : Schema) -> (s2 : Schema) -> Type where
Same : SameSchema s s
First, note how SameSchema
is a family of types indexed over two values of type Schema
. But note also that the sole constructor restricts the values we allow for s1
and s2
: The two indices must be identical.
Why is this useful? Well, imagine we had a function for checking the equality of two schemata, which would try and return a value of type SameSchema s1 s2
:
sameSchema : (s1, s2 : Schema) -> Maybe (SameSchema s1 s2)
We could then use this function to implement concatTables
:
concatTables : Table -> Table -> Maybe Table
concatTables (MkTable s1 m rs1) (MkTable s2 n rs2) = case sameSchema s1 s2 of
Just Same => Just $ MkTable s1 _ (rs1 ++ rs2)
Nothing => Nothing
It worked! What's going on here? Well, let's inspect the types involved:
concatTables2 : Table -> Table -> Maybe Table
concatTables2 (MkTable s1 m rs1) (MkTable s2 n rs2) = case sameSchema s1 s2 of
Just Same => ?almost_there
Nothing => Nothing
At the REPL, we get the following context for almost_there
:
Tutorial.Relations> :t almost_there
m : Nat
s2 : List ColType
rs1 : Vect m (HList (map IdrisType s2))
n : Nat
rs2 : Vect n (HList (map IdrisType s2))
s1 : List ColType
------------------------------
almost_there : Maybe Table
See how the types of rs1 and rs2 unify? Value Same, coming as the result of sameSchema s1 s2, is a witness that s1 and s2 are actually identical, because this is what we specified in the definition of Same.
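The same mechanism can be observed on a smaller scale with vector lengths. The following is a minimal sketch: the relation SameLen, its constructor IsSame, and the function addAll are hypothetical names chosen for illustration only.

```idris
import Data.Vect

%default total

-- Hypothetical relation: the sole constructor forces both indices
-- to be the same natural number.
data SameLen : (m, n : Nat) -> Type where
  IsSame : SameLen k k

-- Without the witness, `zipWith` would be rejected, because `m` and `n`
-- could be different. Matching on `IsSame` unifies the two lengths.
addAll : SameLen m n -> Vect m Nat -> Vect n Nat -> Vect m Nat
addAll IsSame xs ys = zipWith (+) xs ys

main : IO ()
main = printLn (addAll IsSame [1, 2, 3] [10, 20, 30])
```

Replacing the IsSame pattern with a plain variable would leave m and n distinct, and the call to zipWith would no longer type check.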
All that remains to do is to implement sameSchema
. For this, we will write another data type for specifying when two values of type ColType
are identical:
public export
data SameColType : (c1, c2 : ColType) -> Type where
SameCT : SameColType c1 c1
We can now define several utility functions. First, one for figuring out if two column types are identical:
sameColType : (c1, c2 : ColType) -> Maybe (SameColType c1 c2)
sameColType I64 I64 = Just SameCT
sameColType Str Str = Just SameCT
sameColType Boolean Boolean = Just SameCT
sameColType Float Float = Just SameCT
sameColType _ _ = Nothing
This will convince Idris, because in each pattern match, the return type will be adjusted according to the values we matched on. For instance, on the first line, the output type is Maybe (SameColType I64 I64)
as you can easily verify yourself by inserting a hole and checking its type at the REPL.
We will need two additional utilities: Functions for creating values of type SameSchema for the nil and cons cases. Note how the implementations are trivial. Still, we often have to quickly write such small proofs (I'll explain in the next section why I call them proofs), which will then be used to convince the type checker about some fact we already take for granted but Idris does not.
sameNil : SameSchema [] []
sameNil = Same
sameCons : SameColType c1 c2
-> SameSchema s1 s2
-> SameSchema (c1 :: s1) (c2 :: s2)
sameCons SameCT Same = Same
As usual, it can help to understand what's going on by replacing the right-hand side of sameCons with a hole and checking its type and context at the REPL. The presence of values SameCT and Same on the left-hand side forces Idris to unify c1 and c2 as well as s1 and s2, from which the unification of c1 :: s1 and c2 :: s2 immediately follows. With these, we can finally implement sameSchema:
sameSchema [] [] = Just sameNil
sameSchema (x :: xs) (y :: ys) =
[| sameCons (sameColType x y) (sameSchema xs ys) |]
sameSchema (x :: xs) [] = Nothing
sameSchema [] (x :: xs) = Nothing
What we described here is a far stronger form of equality than what is provided by interface Eq and the (==) operator: Equality of values that is accepted by the type checker when trying to unify type-level indices. This is also called propositional equality: We will see below that we can view types as mathematical propositions, and values of these types as proofs that these propositions hold.
Type Equal
Propositional equality is such a fundamental concept that the Prelude already exports a general data type for it: Equal, with its only data constructor Refl. In addition, there is a built-in operator for expressing propositional equality, which gets desugared to Equal: (=). This can sometimes lead to some confusion, because the equals symbol is also used for definitional equality: Describing in function implementations that the left-hand side and right-hand side are defined to be equal. If you want to disambiguate propositional from definitional equality, you can also use operator (===) for the former.
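As a small illustration of the different spellings, here is the same proposition stated three times. The names viaEquals, viaTriple, and viaEqual are hypothetical, and the parentheses and type annotations are only there to avoid relying on operator precedence and literal defaulting.

```idris
%default total

-- Propositional equality via the built-in `(=)` operator.
viaEquals : the Nat 2 + 2 = 4
viaEquals = Refl

-- The same proposition via `(===)`.
viaTriple : (the Nat 2 + 2) === 4
viaTriple = Refl

-- The same proposition via the data type `Equal` itself.
viaEqual : Equal (the Nat 2 + 2) (the Nat 4)
viaEqual = Refl

main : IO ()
main = putStrLn "All three proofs type check."
```

In each definition, the equals sign in the first line is propositional (it is part of a type), while the one in the second line is definitional (it defines the value).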
Here is another implementation of concatTables
:
eqColType : (c1,c2 : ColType) -> Maybe (c1 = c2)
eqColType I64 I64 = Just Refl
eqColType Str Str = Just Refl
eqColType Boolean Boolean = Just Refl
eqColType Float Float = Just Refl
eqColType _ _ = Nothing
eqCons : {0 c1,c2 : a}
-> {0 s1,s2 : List a}
-> c1 = c2 -> s1 = s2 -> c1 :: s1 = c2 :: s2
eqCons Refl Refl = Refl
eqSchema : (s1,s2 : Schema) -> Maybe (s1 = s2)
eqSchema [] [] = Just Refl
eqSchema (x :: xs) (y :: ys) = [| eqCons (eqColType x y) (eqSchema xs ys) |]
eqSchema (x :: xs) [] = Nothing
eqSchema [] (x :: xs) = Nothing
concatTables3 : Table -> Table -> Maybe Table
concatTables3 (MkTable s1 m rs1) (MkTable s2 n rs2) = case eqSchema s1 s2 of
Just Refl => Just $ MkTable _ _ (rs1 ++ rs2)
Nothing => Nothing
Exercises part 1
In the following exercises, you are going to implement some very basic properties of equality proofs. You'll have to come up with the types of the functions yourself, as the implementations will be incredibly simple.
Note: If you can't remember what the terms "reflexive", "symmetric", and "transitive" mean, quickly read about equivalence relations.
1. Show that SameColType is a reflexive relation.
2. Show that SameColType is a symmetric relation.
3. Show that SameColType is a transitive relation.
4. Let f be a function of type ColType -> a for an arbitrary type a. Show that from a value of type SameColType c1 c2 it follows that f c1 and f c2 are equal.
   For (=) the above properties are available from the Prelude as functions sym, trans, and cong. Reflexivity comes from the data constructor Refl itself.
5. Implement a function for verifying that two natural numbers are identical. Try using cong in your implementation.
6. Use the function from exercise 5 for zipping two Tables if they have the same number of rows.
   Hint: Use Vect.zipWith. You will need to implement custom function appRows for this, since Idris will not automatically figure out that the types unify when using HList.(++):
       appRows : {ts1 : _} -> Row ts1 -> Row ts2 -> Row (ts1 ++ ts2)
We will later learn how to use rewrite rules to circumvent the need of writing custom functions like appRows
and use (++)
in zipWith
directly.
Programs as Proofs
module Tutorial.Eq.ProgramsAsProofs
import Tutorial.Eq.Eq
import Data.Either
import Data.HList
import Data.Vect
import Data.String
%default total
A famous observation by mathematician Haskell Curry and logician William Alvin Howard leads to the conclusion that we can view a type in a programming language with a sufficiently rich type system as a mathematical proposition and a total program calculating a value of this type as a proof that the proposition holds. This is also known as the Curry-Howard isomorphism.
For instance, here is a simple proof that one plus one equals two:
onePlusOne : the Nat 1 + 1 = 2
onePlusOne = Refl
The above proof is trivial, as Idris solves this by unification. But we already stated some more interesting things in the exercises. For instance, the symmetry and transitivity of SameColType
:
sctSymmetric : SameColType c1 c2 -> SameColType c2 c1
sctSymmetric SameCT = SameCT
sctTransitive : SameColType c1 c2 -> SameColType c2 c3 -> SameColType c1 c3
sctTransitive SameCT SameCT = SameCT
Note that a type alone is not a proof. For instance, we are free to state that one plus one equals three:
onePlusOneWrong : the Nat 1 + 1 = 3
We will, however, have a hard time implementing this in a provably total way. We say: "The type the Nat 1 + 1 = 3 is uninhabited", meaning that there is no value of this type.
When Proofs replace Tests
We will see several different use cases for compile-time proofs, a very straightforward one being to show that our functions behave as they should by proving some properties about them. For instance, here is a proposition that map on lists does not change the number of elements in the list:
mapListLength : (f : a -> b) -> (as : List a) -> length as = length (map f as)
Read this as a universally quantified statement: For all functions f from a to b and for all lists as holding values of type a, the length of map f as is the same as the length of the original list.
We can implement mapListLength
by pattern matching on as
. The Nil
case will be trivial: Idris solves this by unification. It knows the value of the input list (Nil
), and since map
is implemented by pattern matching on the input as well, it follows immediately that the result will be Nil
as well:
mapListLength f [] = Refl
The cons case is more involved, and we will do this stepwise. First, note that we can prove that the length of a map over the tail will stay the same by means of recursion:
mapListLength f (x :: xs) = case mapListLength f xs of
prf => ?mll1
Let's inspect the types and context we have here:
0 b : Type
0 a : Type
xs : List a
f : a -> b
x : a
prf : length xs = length (map f xs)
------------------------------
mll1 : S (length xs) = S (length (map f xs))
So, we have a proof of type length xs = length (map f xs)
, and from the implementation of map
Idris concludes that what we are actually looking for is a result of type S (length xs) = S (length (map f xs))
. This is exactly what function cong
from the Prelude is for ("cong" is an abbreviation for congruence). We can thus implement the cons case concisely like so:
mapListLength f (x :: xs) = cong S $ mapListLength f xs
Please take a moment to appreciate what we achieved here: A proof in the mathematical sense that our function will not affect the length of our list. We no longer need a unit test or similar program to verify this.
Before we continue, please note an important thing: In our case expression, we used a variable for the result from the recursive call:
mapListLength f (x :: xs) = case mapListLength f xs of
prf => cong S prf
Here, we did not want the two lengths to unify, because we needed the distinction in our call to cong
. Therefore: If you need a proof of type x = y
in order for two variables to unify, use the Refl
data constructor in the pattern match. If, on the other hand, you need to run further computations on such a proof, use a variable and the left and right-hand sides will remain distinct.
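The two styles can be contrasted in a short sketch. The names plusZeroRight, castVect, and castBack are hypothetical and only serve as an illustration: the first binds the recursive proof to a variable and passes it to cong, the second matches on Refl because it needs the two lengths to unify.

```idris
import Data.Vect

%default total

-- Proof kept as a variable: both sides of `prf` stay distinct, and
-- `cong S` wraps each of them in a successor.
plusZeroRight : (n : Nat) -> n + 0 = n
plusZeroRight Z     = Refl
plusZeroRight (S k) = case plusZeroRight k of
  prf => cong S prf

-- Proof matched with `Refl`: `m` and `n` unify, so `xs` already has
-- the desired type.
castVect : m = n -> Vect m a -> Vect n a
castVect Refl xs = xs

-- Combining the two: Idris does not reduce `n + 0` for an unknown `n`,
-- so we need the proof to change the index.
castBack : {n : Nat} -> Vect (n + 0) a -> Vect n a
castBack xs = castVect (plusZeroRight n) xs

main : IO ()
main = printLn (castBack {n = 2} ["a", "b"])
```

Note that matching on Refl in plusZeroRight's recursive call would not help: we would still need a way to get from the tail's proof to the proof for the whole number, which is what cong provides.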
Here is another example from the last chapter: We want to show that parsing and printing column types behaves correctly. Writing proofs about parsers can be very hard in general, but here it can be done with a mere pattern match:
showColType : ColType -> String
showColType I64 = "i64"
showColType Str = "str"
showColType Boolean = "boolean"
showColType Float = "float"
readColType : String -> Maybe ColType
readColType "i64" = Just I64
readColType "str" = Just Str
readColType "boolean" = Just Boolean
readColType "float" = Just Float
readColType s = Nothing
showReadColType : (c : ColType) -> readColType (showColType c) = Just c
showReadColType I64 = Refl
showReadColType Str = Refl
showReadColType Boolean = Refl
showReadColType Float = Refl
Such simple proofs give us quick but strong guarantees that we did not make any stupid mistakes.
The examples we saw so far were very easy to implement. In general, this is not the case, and we will have to learn about several additional techniques in order to prove interesting things about our programs. However, when we use Idris as a general purpose programming language and not as a proof assistant, we are free to choose whether some aspect of our code needs such strong guarantees or not.
A Note of Caution: Lowercase Identifiers in Function Types
When writing down the types of proofs as we did above, one has to be very careful not to fall into the following trap: In general, Idris will treat lowercase identifiers in function types as type parameters (erased implicit arguments). For instance, here is an attempt at proving the identity functor law for Maybe:
mapMaybeId1 : (ma : Maybe a) -> map id ma = ma
mapMaybeId1 Nothing = Refl
mapMaybeId1 (Just x) = ?mapMaybeId1_rhs
You will not be able to implement the Just
case, because Idris treats id
as an implicit argument as can easily be seen when inspecting the context of mapMaybeId1_rhs
:
Tutorial.Relations> :t mapMaybeId1_rhs
0 a : Type
0 id : a -> a
x : a
------------------------------
mapMaybeId1_rhs : Just (id x) = Just x
As you can see, id
is an erased argument of type a -> a
. And in fact, when type-checking this module, Idris will issue a warning that parameter id
is shadowing an existing function:
Warning: We are about to implicitly bind the following lowercase names.
You may be unintentionally shadowing the associated global definitions:
id is shadowing Prelude.Basics.id
The same is not true for map
: Since we explicitly pass arguments to map
, Idris treats this as a function name and not as an implicit argument.
You have several options here. For instance, you could use an uppercase identifier, as these will never be treated as implicit arguments:
Id : a -> a
Id = id
mapMaybeId2 : (ma : Maybe a) -> map Id ma = ma
mapMaybeId2 Nothing = Refl
mapMaybeId2 (Just x) = Refl
As an alternative - and this is the preferred way to handle this case - you can prefix id
with part of its namespace, which will immediately resolve the issue:
mapMaybeId : (ma : Maybe a) -> map Prelude.id ma = ma
mapMaybeId Nothing = Refl
mapMaybeId (Just x) = Refl
Note: If you have semantic highlighting turned on in your editor (for instance, by using the idris2-lsp plugin), you will note that map
and id
in mapMaybeId1
get highlighted differently: map
as a function name, id
as a bound variable.
Exercises part 2
In these exercises, you are going to prove several simple properties of small functions. When writing proofs, it is even more important to use holes to figure out what Idris expects from you next. Use the tools given to you, instead of trying to find your way in the dark!
1. Prove that map id on an Either e returns the value unmodified.
2. Prove that map id on a list returns the list unmodified.
3. Prove that complementing a strand of a nucleobase (see the previous chapter) twice leads to the original strand.
   Hint: Prove this for single bases first, and use cong2 from the Prelude in your implementation for sequences of nucleic acids.
4. Implement function replaceVect:
       replaceVect : (ix : Fin n) -> a -> Vect n a -> Vect n a
   Now prove that after replacing an element in a vector using replaceVect, accessing the same element using index will return the value we just added.
5. Implement function insertVect:
       insertVect : (ix : Fin (S n)) -> a -> Vect n a -> Vect (S n) a
   Use a similar proof as in exercise 4 to show that this behaves correctly.
Note: Functions replaceVect
and insertVect
are available from Data.Vect
as replaceAt
and insertAt
.
Into the Void
module Tutorial.Eq.Void
import Tutorial.Eq.Eq
import Data.Either
import Data.HList
import Data.Vect
import Data.String
%default total
Remember function onePlusOneWrong
from above? This was definitely a wrong statement: One plus one does not equal three. Sometimes, we want to express exactly this: That a certain statement is false and does not hold. Consider for a moment what it means to prove a statement in Idris: Such a statement (or proposition) is a type, and a proof of the statement is a value or expression of this type: The type is said to be inhabited. If a statement is not true, there can be no value of the given type. We say the given type is uninhabited. If we still manage to get our hands on a value of an uninhabited type, that is a logical contradiction and from this, anything follows (remember ex falso quodlibet).
So this is how to express that a proposition does not hold: We state that if it held, this would lead to a contradiction. The most natural way to express a contradiction in Idris is to return a value of type Void:
onePlusOneWrongProvably : the Nat 1 + 1 = 3 -> Void
onePlusOneWrongProvably Refl impossible
See how this is a provably total implementation of the given type: A function from 1 + 1 = 3
to Void
. We implement this by pattern matching, and there is only one constructor to match on, which leads to an impossible case.
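Such refutations are not only of theoretical interest. In the following sketch (all names are hypothetical), a function from an equality to Void is used to rule out a case that can never occur, with void from the Prelude turning the contradiction into a value of the required type.

```idris
%default total

-- A successor is never equal to zero: the only constructor, `Refl`,
-- cannot be used here, so the case is impossible.
succNotZero : S n = 0 -> Void
succNotZero Refl impossible

-- A predecessor function that demands a refutation of `n = 0`.
-- In the `Z` case, we can construct a proof of `Z = 0`, which
-- contradicts the refutation, and from a contradiction anything follows.
predOf : (n : Nat) -> (n = 0 -> Void) -> Nat
predOf Z     notZero = void (notZero Refl)
predOf (S k) _       = k

main : IO ()
main = printLn (predOf 3 succNotZero)
```

The Z clause of predOf can never be reached at runtime, but we still have to convince the coverage checker, and the refutation lets us do so without resorting to a partial function.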
We can also use contradictory statements to prove other such statements. For instance, here is a proof that if the lengths of two lists are not the same, then the two lists can't be the same either:
notSameLength1 : (List.length as = length bs -> Void) -> as = bs -> Void
notSameLength1 f prf = f (cong length prf)
This is cumbersome to write and pretty hard to read, so there is function Not in the Prelude to express the same thing more naturally:
notSameLength : Not (List.length as = length bs) -> Not (as = bs)
notSameLength f prf = f (cong length prf)
Actually, this is just a specialized version of the contraposition of cong
: If from a = b
follows f a = f b
, then from not (f a = f b)
follows not (a = b)
:
contraCong : {0 f : _} -> Not (f a = f b) -> Not (a = b)
contraCong fun x = fun $ cong f x
Interface Uninhabited
There is an interface in the Prelude for uninhabited types: Uninhabited with its sole function uninhabited. Have a look at its documentation at the REPL. You will see that there is already an impressive number of implementations available, many of which involve data type Equal.
We can use Uninhabited, for instance, to express that the empty schema is not equal to a non-empty schema:
Uninhabited (SameSchema [] (h :: t)) where
uninhabited Same impossible
Uninhabited (SameSchema (h :: t) []) where
uninhabited Same impossible
There is a related function you need to know about: absurd
, which combines uninhabited
with void
:
Tutorial.Eq> :printdef absurd
Prelude.absurd : Uninhabited t => t -> a
absurd h = void (uninhabited h)
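A typical use of absurd is to dismiss a case that the types rule out. The sketch below uses a hypothetical predicate NotEmpty (with constructor IsNotEmpty) together with a hypothetical function headOf; only Uninhabited and absurd come from the Prelude.

```idris
%default total

-- Hypothetical predicate: the list has at least one element.
data NotEmpty : List a -> Type where
  IsNotEmpty : NotEmpty (x :: xs)

-- The empty list can never satisfy the predicate.
Uninhabited (NotEmpty []) where
  uninhabited IsNotEmpty impossible

-- In the `[]` case we hold a value of an uninhabited type, so `absurd`
-- lets us produce a result of any type without inventing a default.
headOf : (xs : List a) -> NotEmpty xs -> a
headOf (x :: _) _   = x
headOf []       prf = absurd prf

main : IO ()
main = printLn (headOf [1, 2, 3] IsNotEmpty)
```

Since absurd works for every type with an Uninhabited implementation, the same one-liner handles impossible cases for other predicates as well.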
Decidable Equality
When we implemented sameColType, we got a proof that two column types are indeed the same, from which we could figure out whether two schemata are identical. The types guarantee we do not generate any false positives: If we generate a value of type SameSchema s1 s2
, we have a proof that s1
and s2
are indeed identical. However, sameColType
and thus sameSchema
could theoretically still produce false negatives by returning Nothing
although the two values are identical. For instance, we could implement sameColType
in such a way that it always returns Nothing
. This would be in agreement with the types, but definitely not what we want. So, here is what we'd like to do in order to get yet stronger guarantees: We'd either want to return a proof that the two schemata are the same, or return a proof that the two schemata are not the same. (Remember that Not a
is an alias for a -> Void
).
We call a property that either holds or leads to a contradiction a decidable property, and the Prelude exports data type Dec prop, which encapsulates this distinction.
Here is a way to encode this for ColType
:
decSameColType : (c1,c2 : ColType) -> Dec (SameColType c1 c2)
decSameColType I64 I64 = Yes SameCT
decSameColType I64 Str = No $ \case SameCT impossible
decSameColType I64 Boolean = No $ \case SameCT impossible
decSameColType I64 Float = No $ \case SameCT impossible
decSameColType Str I64 = No $ \case SameCT impossible
decSameColType Str Str = Yes SameCT
decSameColType Str Boolean = No $ \case SameCT impossible
decSameColType Str Float = No $ \case SameCT impossible
decSameColType Boolean I64 = No $ \case SameCT impossible
decSameColType Boolean Str = No $ \case SameCT impossible
decSameColType Boolean Boolean = Yes SameCT
decSameColType Boolean Float = No $ \case SameCT impossible
decSameColType Float I64 = No $ \case SameCT impossible
decSameColType Float Str = No $ \case SameCT impossible
decSameColType Float Boolean = No $ \case SameCT impossible
decSameColType Float Float = Yes SameCT
First, note how we could use a pattern match in a single-argument lambda directly. This is sometimes called the lambda case style, named after an extension of the Haskell programming language. If we use the SameCT
constructor in the pattern match, Idris is forced to try and unify for instance Float
with I64
. This is not possible, so the case as a whole is impossible.
Yet, this was pretty cumbersome to implement. In order to convince Idris we did not miss a case, there is no way around treating every possible pairing of constructors explicitly. However, we get much stronger guarantees out of this: We can no longer create false positives or false negatives, and therefore, decSameColType
is provably correct.
Doing the same thing for schemata requires some utility functions, the types of which we can figure out by placing some holes:
decSameSchema' : (s1, s2 : Schema) -> Dec (SameSchema s1 s2)
decSameSchema' [] [] = Yes Same
decSameSchema' [] (y :: ys) = No ?decss1
decSameSchema' (x :: xs) [] = No ?decss2
decSameSchema' (x :: xs) (y :: ys) = case decSameColType x y of
Yes SameCT => case decSameSchema' xs ys of
Yes Same => Yes Same
No contra => No $ \prf => ?decss3
No contra => No $ \prf => ?decss4
The first two cases are not too hard. The type of decss1
is SameSchema [] (y :: ys) -> Void
, which you can easily verify at the REPL. But that's just uninhabited
, specialized to SameSchema [] (y :: ys)
, and this we already implemented further above. The same goes for decss2
.
The other two cases are harder, so I already filled in as much stuff as possible. We know that we want to return a No
, if either the heads or tails are provably distinct. The No
holds a function, so I already added a lambda, leaving a hole only for the return value. Here are the type and - more important - context of decss3
:
Tutorial.Relations> :t decss3
y : ColType
xs : List ColType
ys : List ColType
x : ColType
contra : SameSchema xs ys -> Void
prf : SameSchema (y :: xs) (y :: ys)
------------------------------
decss3 : Void
The types of contra
and prf
are what we need here: If xs
and ys
are distinct, then y :: xs
and y :: ys
must be distinct as well. This is the contraposition of the following statement: If x :: xs
is the same as y :: ys
, then xs
and ys
are the same as well. We must therefore implement a lemma, which proves that the cons constructor is injective:
consInjective : SameSchema (c1 :: cs1) (c2 :: cs2)
-> (SameColType c1 c2, SameSchema cs1 cs2)
consInjective Same = (SameCT, Same)
We can now pass prf
to consInjective
to extract a value of type SameSchema xs ys
, which we then pass to contra
in order to get the desired value of type Void
. With these observations and utilities, we can now implement decSameSchema
:
decSameSchema : (s1, s2 : Schema) -> Dec (SameSchema s1 s2)
decSameSchema []        []        = Yes Same
decSameSchema []        (y :: ys) = No absurd
decSameSchema (x :: xs) []        = No absurd
decSameSchema (x :: xs) (y :: ys) = case decSameColType x y of
  Yes SameCT => case decSameSchema xs ys of
    Yes Same  => Yes Same
    No contra => No $ contra . snd . consInjective
  No contra  => No $ contra . fst . consInjective
There is an interface called DecEq
exported by module Decidable.Equality
for types for which we can implement a decision procedure for propositional equality. We can implement this to figure out if two values are equal or not.
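As a first taste of DecEq, here is a minimal sketch that uses the existing implementation for Nat. The helper names sameNat and sameLength are made up for this illustration and are not part of the base library:

```idris
import Decidable.Equality

%default total

-- Hypothetical helper: keep the proof of equality (if any)
-- and forget the proof of inequality.
sameNat : (m, n : Nat) -> Maybe (m = n)
sameNat m n = case decEq m n of
  Yes prf => Just prf
  No  _   => Nothing

-- Hypothetical usage: on success we do not just get `True`,
-- we get a proof that the two lengths are equal.
sameLength : (xs : List a) -> (ys : List b) -> Maybe (length xs = length ys)
sameLength xs ys = sameNat (length xs) (length ys)
```

Unlike a Bool returned by (==), the Yes case hands us evidence we can pass on to other functions, for instance to the rewrite rules discussed in the next section.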
Exercises part 3
- Show that there can be no non-empty vector of Void by writing a corresponding implementation of uninhabited.
- Generalize exercise 1 for all uninhabited element types.
- Show that if a = b cannot hold, then b = a cannot hold either.
- Show that if a = b holds, and b = c cannot hold, then a = c cannot hold either.
- Implement Uninhabited for Crud i a. Try to be as general as possible.
  data Crud : (i : Type) -> (a : Type) -> Type where
    Create : (value : a) -> Crud i a
    Update : (id : i) -> (value : a) -> Crud i a
    Read   : (id : i) -> Crud i a
    Delete : (id : i) -> Crud i a
- Implement DecEq for ColType.
- Implementations such as the one from exercise 6 are cumbersome to write, as they require a quadratic number of pattern matches in relation to the number of data constructors. Here is a trick to make this more bearable.
  - Implement a function ctNat, which assigns every value of type ColType a unique natural number.
  - Prove that ctNat is injective. Hint: You will need to pattern match on the ColType values, but four matches should be enough to satisfy the coverage checker.
  - In your implementation of DecEq for ColType, use decEq on the result of applying both column types to ctNat, thus reducing it to only two lines of code.
  We will later talk about with rules: special forms of dependent pattern matches that allow us to learn something about the shape of function arguments by performing computations on them. These will allow us to use a technique similar to the one shown here to implement DecEq requiring only n pattern matches for arbitrary sum types with n data constructors.
Rewrite Rules
module Tutorial.Eq.Rewrite
import Data.Either
import Data.HList
import Data.Vect
import Data.String
%default total
One of the most important use cases of propositional equality is to replace or rewrite existing types, which Idris can't unify automatically otherwise. For instance, the following is no problem: Idris knows that 0 + n
equals n
, because plus
on natural numbers is implemented by pattern matching on the first argument. The two vector lengths therefore unify just fine.
leftZero :  List (Vect n Nat)
         -> List (Vect (0 + n) Nat)
         -> List (Vect n Nat)
leftZero = (++)
However, the example below can't be implemented as easily (try it!), because Idris can't figure out on its own that the two lengths unify.
rightZero' :  List (Vect n Nat)
           -> List (Vect (n + 0) Nat)
           -> List (Vect n Nat)
Probably for the first time, we realize just how little Idris knows about the laws of arithmetic. Idris is able to unify values when
- all values in a computation are known at compile time
- one expression follows directly from the other due to the pattern matches used in a function's implementation.
In the expression n + 0
, not all values are known (n
is a variable), and (+)
is implemented by pattern matching on the first argument, about which we know nothing here.
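To see the two unification rules at work, here is a small sketch. The function plus' is a hypothetical re-implementation that mirrors how addition on Nat pattern matches on its first argument; it is only defined here so that the example is self-contained:

```idris
%default total

-- Hypothetical copy of addition on `Nat`:
-- it pattern matches on the first argument only.
plus' : Nat -> Nat -> Nat
plus' Z     n = n
plus' (S k) n = S (plus' k n)

-- All values are known at compile time: Idris just computes.
allKnown : plus' 2 3 = 5
allKnown = Refl

-- Follows directly from the first clause of `plus'`,
-- even though `n` is a variable.
zeroLeft : (n : Nat) -> plus' 0 n = n
zeroLeft n = Refl

-- Here Idris is stuck: `n` is unknown, so no clause of `plus'` applies.
failing
  zeroRight : (n : Nat) -> plus' n 0 = n
  zeroRight n = Refl
```

The failing block only type checks if its content does not, so this file as a whole is accepted by Idris.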
However, we can teach Idris. If we can prove that the two expressions are equivalent, we can substitute one expression for the other, so that the two unify again. Here is a lemma and its proof that n + 0
equals n
, for all natural numbers n
.
addZeroRight : (n : Nat) -> n + 0 = n
addZeroRight 0 = Refl
addZeroRight (S k) = cong S $ addZeroRight k
Note how the base case is trivial: Since there are no variables left, Idris can immediately figure out that 0 + 0 = 0
. In the recursive case, it can be instructive to replace cong S
with a hole and look at its type and context to figure out how to proceed.
The Prelude exports function replace
for substituting one variable in a term by another, based on a proof of equality. Make sure to inspect its type first before looking at the example below:
replaceVect : Vect (n + 0) a -> Vect n a
replaceVect as = replace {p = \k => Vect k a} (addZeroRight n) as
As you can see, we replace a value of type p x
with a value of type p y
based on a proof that x = y
, where p
is a function from some type t
to Type
, and x
and y
are values of type t
. In our replaceVect
example, t
equals Nat
, x
equals n + 0
, y
equals n
, and p
equals \k => Vect k a
.
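The same pattern works with other equality proofs. The following sketch uses plusCommutative from Data.Nat; the function name swapLen is made up for this illustration. Here, x is m + n, y is n + m, and p is again \k => Vect k a:

```idris
import Data.Nat
import Data.Vect

%default total

-- Hypothetical example: the vector itself is left untouched,
-- only the way its length is written in the type changes.
swapLen : (0 m : Nat) -> (0 n : Nat) -> Vect (m + n) a -> Vect (n + m) a
swapLen m n xs = replace {p = \k => Vect k a} (plusCommutative m n) xs
```

Note that m and n can be erased: they are only needed to compute the proof, which is itself an erased argument of replace.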
Using replace
directly is not very convenient, because Idris often cannot infer the value of p
on its own. Indeed, we had to pass it explicitly in replaceVect
. Idris therefore provides special syntax for such rewrite rules, which will get desugared to calls to replace
with all the details filled in for us. Here is an implementation of replaceVect
with a rewrite rule:
rewriteVect : Vect (n + 0) a -> Vect n a
rewriteVect as = rewrite sym (addZeroRight n) in as
One source of confusion is that rewrite uses proofs of equality the other way round: Given a proof of y = x
it replaces p x
with p y
. Hence the need to call sym
in our implementation above.
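To get a feeling for the direction of a rewrite, it can help to look at both conversions side by side. This is a small sketch; the names padZero and dropZero are made up for this illustration:

```idris
import Data.Vect

%default total

addZeroRight : (n : Nat) -> n + 0 = n
addZeroRight 0     = Refl
addZeroRight (S k) = cong S $ addZeroRight k

-- The goal is `Vect (n + 0) a`. The proof has type `n + 0 = n`,
-- so the rewrite turns the goal into `Vect n a`, the type of `as`.
padZero : Vect n a -> Vect (n + 0) a
padZero as = rewrite addZeroRight n in as

-- The goal is `Vect n a`. To turn it into `Vect (n + 0) a`, we need
-- a proof of type `n = n + 0`, hence the call to `sym`.
dropZero : Vect (n + 0) a -> Vect n a
dropZero as = rewrite sym (addZeroRight n) in as
```

A rule of thumb: The left-hand side of the equality must occur in the type we want to produce, and the right-hand side describes the type of the expression we already have.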
Use Case: Reversing Vectors
Rewrite rules are often required when we perform interesting type-level computations. For instance, we have already seen many interesting examples of functions operating on Vect
, which allowed us to keep track of the exact lengths of the vectors involved, but one key functionality has been missing from our discussions so far, and for good reasons: Function reverse
. Here is a possible implementation, which is how reverse
is implemented for lists:
revOnto' : Vect m a -> Vect n a -> Vect (m + n) a
revOnto' xs [] = xs
revOnto' xs (x :: ys) = revOnto' (x :: xs) ys
reverseVect' : Vect n a -> Vect n a
reverseVect' = revOnto' []
As you might have guessed, this will not compile as the length indices in the two clauses of revOnto'
do not unify.
The nil case is a case we've already seen above: Here n
is zero, because the second vector is empty, so we have to convince Idris once again that m + 0 = m
:
revOnto : Vect m a -> Vect n a -> Vect (m + n) a
revOnto xs [] = rewrite addZeroRight m in xs
The second case is more complex. Here, Idris fails to unify S (m + len)
with m + S len
, where len
is the length of ys
, the tail of the second vector. Module Data.Nat
provides many proofs about arithmetic operations on natural numbers, one of which is plusSuccRightSucc
. Here's its type:
Tutorial.Eq> :t plusSuccRightSucc
Data.Nat.plusSuccRightSucc :  (left : Nat)
                           -> (right : Nat)
                           -> S (left + right) = left + S right
In our case, we want to replace S (m + len)
with m + S len
, so we will need the version with arguments flipped. However, there is one more obstacle: We need to invoke plusSuccRightSucc
with the length of ys
, which is not given as an implicit function argument of revOnto
. We therefore need to pattern match on n
(the length of the second vector), in order to bind the length of the tail to a variable. Remember, that we are allowed to pattern match on an erased argument only if the constructor used follows from a match on another, unerased, argument (ys
in this case). Here's the implementation of the second case:
revOnto {n = S len} xs (x :: ys) =
  rewrite sym (plusSuccRightSucc m len) in revOnto (x :: xs) ys
I know from my own experience that this can be highly confusing at first. If you use Idris as a general purpose programming language and not as a proof assistant, you probably will not have to use rewrite rules too often. Still, it is important to know that they exist, as they allow us to teach complex equivalences to Idris.
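For reference, here are the pieces from this section assembled into one self-contained sketch. The name reverseVect is chosen here to avoid a clash with reverse from Data.Vect, and Data.Nat is imported explicitly for plusSuccRightSucc:

```idris
import Data.Nat
import Data.Vect

%default total

addZeroRight : (n : Nat) -> n + 0 = n
addZeroRight 0     = Refl
addZeroRight (S k) = cong S $ addZeroRight k

revOnto : Vect m a -> Vect n a -> Vect (m + n) a
-- goal `Vect (m + 0) a` is rewritten to `Vect m a`
revOnto xs [] = rewrite addZeroRight m in xs
-- goal `Vect (m + S len) a` is rewritten to `Vect (S (m + len)) a`
revOnto {n = S len} xs (x :: ys) =
  rewrite sym (plusSuccRightSucc m len) in revOnto (x :: xs) ys

-- `0 + n` reduces to `n`, so no rewrite is needed here.
reverseVect : Vect n a -> Vect n a
reverseVect xs = revOnto [] xs
```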
A Note on Erasure
Single value data types like Unit
, Equal
, or SameSchema
have no runtime relevance, as values of these types are always identical. We can therefore always use them as erased function arguments while still being able to pattern match on these values. For instance, when you look at the type of replace
, you will see that the equality proof is an erased argument. This allows us to run arbitrarily complex computations to produce such values without fear of these computations slowing down the compiled Idris program.
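As a minimal sketch of this, the following function (castLen is a made-up name) takes an erased proof of equality and still pattern matches on it. After the match on Refl, Idris knows that m and n are the same, so the vector can be returned as is:

```idris
import Data.Vect

%default total

-- Hypothetical example: `prf` has quantity zero, but since `Equal`
-- has only one constructor, we are allowed to match on it.
castLen : (0 prf : m = n) -> Vect m a -> Vect n a
castLen Refl xs = xs
```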
Exercises part 4
- Implement plusSuccRightSucc yourself.
- Prove that minus n n equals zero for all natural numbers n.
- Prove that minus n 0 equals n for all natural numbers n.
- Prove that n * 1 = n and 1 * n = n for all natural numbers n.
- Prove that addition of natural numbers is commutative.
- Implement a tail-recursive version of map for vectors.
- Prove the following proposition:
  mapAppend :  (f : a -> b)
            -> (xs : List a)
            -> (ys : List a)
            -> map f (xs ++ ys) = map f xs ++ map f ys
- Use the proof from exercise 7 to implement again a function for zipping two Tables, this time using a rewrite rule plus Data.HList.(++) instead of custom function appRows.
Conclusion
The concept of types as propositions, values as proofs is a very powerful tool for writing provably correct programs. We will therefore spend some more time defining data types for describing contracts between values, and values of these types as proofs that the contracts hold. This will allow us to describe necessary pre- and postconditions for our functions, thus reducing the need to return a Maybe
or other failure type, because due to the restricted input, our functions can no longer fail.
Predicates and Proof Search
In the last chapter we learned about propositional equality, which allowed us to prove that two values are equal. Equality is a relation between values, and we used an indexed data type to encode this relation by limiting the degrees of freedom of the indices in the sole data constructor. There are other relations and contracts we can encode this way. This will allow us to restrict the values we accept as a function's arguments or the values returned by functions.
Preconditions
module Tutorial.Predicates.Preconditions
import Data.Either
import Data.List1
import Data.String
import Data.Vect
import Data.HList
import Decidable.Equality
import Text.CSV
import System.File
%default total
Often, when we implement functions operating on values of a given type, not all values are considered to be valid arguments for the function in question. For instance, we typically do not allow division by zero, as the result is undefined in the general case. This concept of putting a precondition on a function argument comes up pretty often, and there are several ways to go about this.
A very common operation when working with lists or other container types is to extract the first value in the sequence. This function, however, cannot work in the general case, because in order to extract a value from a list, the list must not be empty. Here are a couple of ways to encode and implement this, each with its own advantages and disadvantages:
- Wrap the result in a failure type, such as a Maybe or Either e with some custom error type e. This makes it immediately clear that the function might not be able to return a result. It is a natural way to deal with unvalidated input from unknown sources. The drawback of this approach is that results will carry the Maybe stain, even in situations when we know that the nil case is impossible, for instance because we know the value of the list argument at compile-time, or because we already refined the input value in such a way that we can be sure it is not empty (due to an earlier pattern match, for instance).
- Define a new data type for non-empty lists and use this as the function's argument. This is the approach taken in module Data.List1. It allows us to return a pure value (meaning "not wrapped in a failure type" here), because the function cannot possibly fail, but it comes with the burden of reimplementing many of the utility functions and interfaces we already implemented for List. For a very common data structure this can be a valid option, but for rare use cases it is often too cumbersome.
- Use an index to keep track of the property we are interested in. This was the approach we took with type family List01, which we saw in several examples and exercises in this guide so far. This is also the approach taken with vectors, where we use the exact length as our index, which is even more expressive. While this allows us to implement many functions only once and with greater precision at the type level, it also comes with the burden of keeping track of changes in the types, making for more complex function types and forcing us to at times return existentially quantified wrappers (for instance, dependent pairs), because the outcome of a computation is not known until runtime.
- Fail with a runtime exception. This is a popular solution in many programming languages (even Haskell), but in Idris we try to avoid this, because it breaks totality in a way, which also affects client code. Luckily, we can make use of our powerful type system to avoid this situation in general.
- Take an additional (possibly erased) argument of a type we can use as a witness that the input value is of the correct kind or shape. This is the solution we will discuss in this chapter in great detail. It is an incredibly powerful way to talk about restrictions on values without having to replicate a lot of already existing functionality.
There is a time and place for most if not all of the solutions listed above in Idris, but we will often turn to the last one and refine function arguments with predicates (so-called preconditions), because it makes our functions nice to use at runtime and compile time.
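To make the first two options from the list above concrete, here is a minimal sketch. The names headM and headL1 are made up for this illustration; List1 and its constructor (:::) come from Data.List1:

```idris
import Data.List1

%default total

-- Option 1: wrap the result in a failure type.
headM : List a -> Maybe a
headM []       = Nothing
headM (x :: _) = Just x

-- Option 2: accept only lists that are non-empty by construction.
headL1 : List1 a -> a
headL1 (x ::: _) = x
```

The first version accepts every list but leaves the caller with a Maybe to unwrap. The second returns a plain value but shifts the work to whoever has to come up with a List1 in the first place.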
Example: Non-empty Lists
Remember how we implemented an indexed data type for propositional equality: We restricted the valid values of the indices in the constructors. We can do the same thing for a predicate for non-empty lists:
data NotNil : (as : List a) -> Type where
  IsNotNil : NotNil (h :: t)
This is a single-value data type, so we can always use it as an erased function argument and still pattern match on it. We can now use this to implement a safe and pure head
function:
head1 : (as : List a) -> (0 _ : NotNil as) -> a
head1 (h :: _) _ = h
head1 [] IsNotNil impossible
Note how the value IsNotNil
is a witness that its index, which corresponds to our list argument, is indeed non-empty, because this is what we specified in its type. The impossible case in the implementation of head1
is not strictly necessary here. It was given above for completeness.
We call NotNil
a predicate on lists, as it restricts the values allowed in the index. We can express a function's preconditions by adding additional (possibly erased) predicates to the function's list of arguments.
The first really cool thing is how we can safely use head1
, if we can at compile-time show that our list argument is indeed non-empty:
headEx1 : Nat
headEx1 = head1 [1,2,3] IsNotNil
It is a bit cumbersome that we have to pass the IsNotNil
proof manually. Before we scratch that itch, we will first discuss what to do with lists, the values of which are not known until runtime. For these cases, we have to try and produce a value of the predicate programmatically by inspecting the runtime list value. In the most simple case, we can wrap the proof in a Maybe
, but if we can show that our predicate is decidable, we can get even stronger guarantees by returning a Dec
:
Uninhabited (NotNil []) where
  uninhabited IsNotNil impossible

nonEmpty : (as : List a) -> Dec (NotNil as)
nonEmpty (x :: xs) = Yes IsNotNil
nonEmpty []        = No uninhabited
With this, we can implement function headMaybe1
, which is to be used with lists of unknown origin:
headMaybe1 : List a -> Maybe a
headMaybe1 as = case nonEmpty as of
  Yes prf => Just $ head1 as prf
  No  _   => Nothing
Of course, for trivial functions like headMaybe1
it makes more sense to implement them directly by pattern matching on the list argument, but we will soon see examples of predicates the values of which are more cumbersome to create.
Auto Implicits
Having to manually pass a proof of being non-empty to head1
makes this function unnecessarily verbose to use at compile time. Idris allows us to define implicit function arguments, the values of which it tries to assemble on its own by means of a technique called proof search. This is not to be confused with type inference, which means inferring values or types from the surrounding context. It's best to look at some examples to explain the difference.
Let us first have a look at the following implementation of replicate
for vectors:
replicate' : {n : _} -> a -> Vect n a
replicate' {n = 0} _ = []
replicate' {n = S _} v = v :: replicate' v
Function replicate'
takes an unerased implicit argument. The value of this argument must be derivable from the surrounding context. For instance, in the following example it is immediately clear that n
equals three, because that is the length of the vector we want:
replicateEx1 : Vect 3 Nat
replicateEx1 = replicate' 12
In the next example, the value of n
is not known at compile time, but it is available as an unerased implicit, so this can again be passed as is to replicate'
:
replicateEx2 : {n : _} -> Vect n Nat
replicateEx2 = replicate' 12
However, in the following example, the value of n
can't be inferred, as the intermediary vector is immediately converted to a list of unknown length. Although Idris could try and insert any value for n
here, it won't do so, because it can't be sure that this is the length we want. We therefore have to pass the length explicitly:
replicateEx3 : List Nat
replicateEx3 = toList $ replicate' {n = 17} 12
Note, how the value of n
had to be inferable in these examples, which means it had to make an appearance in the surrounding context. With auto implicit arguments, this works differently. Here is the head
example, this time with an auto implicit:
head : (as : List a) -> {auto 0 prf : NotNil as} -> a
head (x :: _) = x
head [] impossible
Note the auto
keyword before the quantity of implicit argument prf
. This means we want Idris to construct this value on its own, without it being visible in the surrounding context. In order to do so, Idris has to know, at compile time, the structure of the list argument as
. It will then try and build such a value from the data type's constructors. If it succeeds, this value will then be automatically filled in as the desired argument, otherwise, Idris will fail with a type error.
Let's see this in action:
headEx3 : Nat
headEx3 = Preconditions.head [1,2,3]
The following example fails with an error:
failing "Can't find an implementation\nfor NotNil []."
  errHead : Nat
  errHead = Preconditions.head []
Wait! "Can't find an implementation for..."? Is this not the error message we get for missing interface implementations? That's correct, and I'll show you that interface resolution is just proof search at the end of this chapter. What I can show you already, is that writing the lengthy {auto prf : t} ->
all the time can be cumbersome. Idris therefore allows us to use the same syntax as for constrained functions instead: (prf : t) =>
, or even t =>
, if we don't need to name the constraint. As usual, we can then access a constraint in the function body by its name (if any). Here is another implementation of head
:
head' : (as : List a) -> (0 _ : NotNil as) => a
head' (x :: _) = x
head' [] impossible
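Auto implicits can still be passed manually if proof search does not find what we want. The following sketch shows the available options; it uses the made-up name safeHead instead of head to avoid having to disambiguate from the Prelude function:

```idris
%default total

data NotNil : (as : List a) -> Type where
  IsNotNil : NotNil (h :: t)

-- Hypothetical name, same implementation as `head` above.
safeHead : (as : List a) -> {auto 0 prf : NotNil as} -> a
safeHead (x :: _) = x
safeHead [] impossible

-- Proof search fills in the argument for us.
viaSearch : Nat
viaSearch = safeHead [1,2,3]

-- The argument is passed by name, like any other implicit.
viaName : Nat
viaName = safeHead [1,2,3] {prf = IsNotNil}

-- The argument is passed positionally using the `@{..}` syntax.
viaAt : Nat
viaAt = safeHead [1,2,3] @{IsNotNil}
```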
During proof search, Idris will also look for values of the required type in the current function context. This allows us to implement headMaybe
without having to pass on the NotNil
proof manually:
headMaybe : List a -> Maybe a
headMaybe as = case nonEmpty as of
  -- `prf` is available during proof search
  Yes prf => Just $ Preconditions.head as
  No  _   => Nothing
To conclude: Predicates allow us to restrict the values a function accepts as arguments. At runtime, we need to build such witnesses by pattern matching on the function arguments. These operations can typically fail. At compile time, we can let Idris try and build these values for us using a technique called proof search. This allows us to make functions safe and convenient to use at the same time.
Exercises part 1
In these exercises, you'll have to implement several functions making use of auto implicits, to constrain the values accepted as function arguments. The results should be pure, that is, not wrapped in a failure type like Maybe
.
- Implement tail for lists.
- Implement concat1 and foldMap1 for lists. These should work like concat and foldMap, but taking only a Semigroup constraint on the element type.
- Implement functions for returning the largest and smallest element in a list.
- Define a predicate for strictly positive natural numbers and use it to implement a safe and provably total division function on natural numbers.
- Define a predicate for a non-empty Maybe and use it to safely extract the value stored in a Just. Show that this predicate is decidable by implementing a corresponding conversion function.
- Define and implement functions for safely extracting values from a Left and a Right by using suitable predicates. Show again that these predicates are decidable.
The predicates you implemented in these exercises are already available in the base library: Data.List.NonEmpty
, Data.Maybe.IsJust
, Data.Either.IsLeft
, Data.Either.IsRight
, and Data.Nat.IsSucc
.
Contracts between Values
module Tutorial.Predicates.Contracts
import Data.Either
import Data.List1
import Data.String
import Data.Vect
import Data.HList
import Decidable.Equality
import Text.CSV
import System.File
%default total
The predicates we saw so far restricted the values of a single type, but it is also possible to define predicates describing contracts between several values of possibly distinct types.
The Elem
Predicate
Assume we'd like to extract a value of a given type from a heterogeneous list:
get' : (0 t : Type) -> HList ts -> t
This can't work in general: If we could implement this we would immediately have a proof of void:
voidAgain : Void
voidAgain = get' Void []
The problem is obvious: The type of which we'd like to extract a value must be an element of the index of the heterogeneous list. Here is a predicate, with which we can express this:
public export
data Elem : (elem : a) -> (as : List a) -> Type where
  Here  : Elem x (x :: xs)
  There : Elem x xs -> Elem x (y :: xs)
This is a predicate describing a contract between two values: A value of type a
and a list of a
s. Values of this predicate are witnesses that the value is an element of the list. Note, how this is defined recursively: The case where the value we look for is at the head of the list is handled by the Here
constructor, where the same variable (x
) is used for the element and the head of the list. The case where the value is deeper within the list is handled by the There
constructor. This can be read as follows: If x
is an element of xs
, then x
is also an element of y :: xs
for any value y
. Let's write down some examples to get a feel for these:
MyList : List Nat
MyList = [1,3,7,8,4,12]
oneElemMyList : Elem 1 MyList
oneElemMyList = Here
sevenElemMyList : Elem 7 MyList
sevenElemMyList = There $ There Here
Now, Elem
is just another way of indexing into a list of values. Instead of using a Fin
index, which is limited by the list's length, we use a proof that a value can be found at a certain position.
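To illustrate the correspondence with positions in the list, here is a small sketch that converts a proof into the index it describes. The function position is a made-up name, and Elem is repeated to keep the example self-contained:

```idris
%default total

data Elem : (elem : a) -> (as : List a) -> Type where
  Here  : Elem x (x :: xs)
  There : Elem x xs -> Elem x (y :: xs)

-- Hypothetical helper: every `There` moves us one step further
-- into the list, so counting them yields the position.
position : Elem x xs -> Nat
position Here      = 0
position (There p) = S (position p)

MyList : List Nat
MyList = [1,3,7,8,4,12]

sevenElemMyList : Elem 7 MyList
sevenElemMyList = There $ There Here

-- Two `There`s before `Here`: 7 sits at index 2.
sevenPosition : Nat
sevenPosition = position sevenElemMyList
```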
We can use the Elem
predicate to extract a value of the desired type from a heterogeneous list:
get : (0 t : Type) -> HList ts -> (prf : Elem t ts) => t
It is important to note that the auto implicit must not be erased in this case. This is no longer a single-value data type, and we must be able to pattern match on this value in order to figure out how far within the heterogeneous list our value is stored:
get t (v :: vs) {prf = Here} = v
get t (v :: vs) {prf = There p} = get t vs
get _ [] impossible
It can be instructive to implement get
yourself, using holes on the right hand side to see the context and types of values Idris infers based on the value of the Elem
predicate.
Let's give this a spin at the REPL:
Tutorial.Predicates> get Nat ["foo", Just "bar", S Z]
1
Tutorial.Predicates> get Nat ["foo", Just "bar"]
Error: Can't find an implementation for Elem Nat [String, Maybe String].
(Interactive):1:1--1:28
1 | get Nat ["foo", Just "bar"]
^^^^^^^^^^^^^^^^^^^^^^^^^^^
With this example we start to appreciate what proof search actually means: Given a value v
and a list of values vs
, Idris tries to find a proof that v
is an element of vs
. Now, before we continue, please note that proof search is not a silver bullet. The search algorithm has a reasonably limited search depth and will give up if this limit is exceeded. For instance:
Tps : List Type
Tps = List.replicate 50 Nat ++ [Maybe String]

hlist : HList Tps
hlist = [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
        , 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
        , 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
        , 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
        , 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
        , Nothing ]
And at the REPL:
Tutorial.Predicates> get (Maybe String) hlist
Error: Can't find an implementation for Elem (Maybe String) [Nat,...
As you can see, Idris fails to find a proof that Maybe String
is an element of Tps
. The search depth can be increased with the %auto_implicit_depth
directive, which will hold for the rest of the source file or until set to a different value. The default value is set at 25. In general, it is not advisable to set this to a too large value as this can drastically increase compile times.
%auto_implicit_depth 100
aMaybe : Maybe String
aMaybe = get _ hlist
%auto_implicit_depth 25
Use Case: A nicer Schema
In the chapter about sigma types, we introduced a schema for CSV files. This was not very nice to use, because we had to use natural numbers to access a certain column. Even worse, users of our small library had to do the same. There was no way to define a name for each column and access columns by name. We are going to change this. Here is an encoding for this use case:
public export
data ColType = I64 | Str | Boolean | Float

public export
IdrisType : ColType -> Type
IdrisType I64     = Int64
IdrisType Str     = String
IdrisType Boolean = Bool
IdrisType Float   = Double

public export
record Column where
  constructor MkColumn
  name : String
  type : ColType

infixr 8 :>

public export
(:>) : String -> ColType -> Column
(:>) = MkColumn

public export
Schema : Type
Schema = List Column

export
Show ColType where
  show I64     = "I64"
  show Str     = "Str"
  show Boolean = "Boolean"
  show Float   = "Float"

Show Column where
  show (MkColumn n ct) = "\{n}:\{show ct}"

export
showSchema : Schema -> String
showSchema = concat . intersperse "," . map show
As you can see, in a schema we now pair a column's type with its name. Here is an example schema for a CSV file holding information about employees in a company:
EmployeeSchema : Schema
EmployeeSchema = [ "firstName"  :> Str
                 , "lastName"   :> Str
                 , "email"      :> Str
                 , "age"        :> I64
                 , "salary"     :> Float
                 , "management" :> Boolean
                 ]
Such a schema could of course again be read from user input, but we will wait with implementing a parser until later in this chapter. Using this new schema with an HList
directly led to issues with type inference, so I quickly wrote a custom row type: a heterogeneous list indexed over a schema.
public export
data Row : Schema -> Type where
  Nil  : Row []
  (::) :  {0 name : String}
       -> {0 type : ColType}
       -> (v : IdrisType type)
       -> Row ss
       -> Row (name :> type :: ss)
In the signature of cons, I list the erased implicit arguments explicitly. This is good practice, as otherwise Idris will often issue shadowing warnings when using such data constructors in client code.
We can now define a type alias for CSV rows representing employees:
0 Employee : Type
Employee = Row EmployeeSchema
hock : Employee
hock = [ "Stefan", "Höck", "hock@foo.com", 46, 5443.2, False ]
Note, how I gave Employee
a zero quantity. This means, we are only ever allowed to use this function at compile time but never at runtime. This is a safe way to make sure our type-level functions and aliases do not leak into the executable when we build our application. We are allowed to use zero-quantity functions and values in type signatures and when computing other erased values, but not for runtime-relevant computations.
We would now like to access a value in a row based on the name given. For this, we write a custom predicate, which serves as a witness that a column with the given name is part of the schema. Now, here is an important thing to note: In this predicate we include an index for the type of the column with the given name. We need this, because when we access a column by name, we need a way to figure out the return type. But during proof search, this type will have to be derived by Idris based on the column name and schema in question (otherwise, the proof search will fail unless the return type is known in advance). We must therefore tell Idris not to include this type in the list of search criteria; otherwise, it will try to infer the column type from the context (using type inference) before running the proof search. This can be done by listing the indices to be used in the search like so: [search name schema]
.
public export
data InSchema :  (name    : String)
              -> (schema  : Schema)
              -> (colType : ColType)
              -> Type where
  [search name schema]
  IsHere  : InSchema n (n :> t :: ss) t
  IsThere : InSchema n ss t -> InSchema n (fld :: ss) t

export
Uninhabited (InSchema n [] c) where
  uninhabited IsHere impossible
  uninhabited (IsThere _) impossible
With this, we are now ready to access the value at a given column based on the column's name:
export
getAt :  {0 ss : Schema}
      -> (name : String)
      -> (row : Row ss)
      -> (prf : InSchema name ss c)
      => IdrisType c
getAt name (v :: vs) {prf = IsHere}    = v
getAt name (_ :: vs) {prf = IsThere p} = getAt name vs
Below is an example of how to use this at compile time. Note the amount of work Idris performs for us: It first comes up with proofs that firstName
, lastName
, and age
are indeed valid names in the Employee
schema. From these proofs it automatically figures out the return types of the calls to getAt
and extracts the corresponding values from the row. All of this happens in a provably total and type safe way.
shoeck : String
shoeck =  getAt "firstName" hock
       ++ " "
       ++ getAt "lastName" hock
       ++ ": "
       ++ show (getAt "age" hock)
       ++ " years old."
In order to specify a column name at runtime, we need a way of computing values of type InSchema
by comparing the column names with the schema in question. Since we have to compare two string values for being propositionally equal, we use the DecEq
implementation for String
here (Idris provides DecEq
implementations for all primitives). We extract the column type at the same time and pair this (as a dependent pair) with the InSchema
proof:
export
inSchema : (ss : Schema) -> (n : String) -> Maybe (c ** InSchema n ss c)
inSchema []                    _ = Nothing
inSchema (MkColumn cn t :: xs) n = case decEq cn n of
  Yes Refl   => Just (t ** IsHere)
  No  contra => case inSchema xs n of
    Just (t ** prf) => Just $ (t ** IsThere prf)
    Nothing         => Nothing
At the end of this chapter we will use InSchema
in our CSV command-line application to list all values in a column.
Exercises part 2
- Show that InSchema is decidable by changing the output type of inSchema to Dec (c ** InSchema n ss c).
- Declare and implement a function for modifying a field in a row based on the column name given.
- Define a predicate to be used as a witness that one list contains only elements in the second list in the same order and use this predicate to extract several columns from a row at once.
  For instance, [2,4,5] contains elements from [1,2,3,4,5,6] in the correct order, but [4,2,5] does not.
- Improve the functionality from exercise 3 by defining a new predicate, witnessing that all strings in a list correspond to column names in a schema (in arbitrary order). Use this to extract several columns from a row at once in arbitrary order.
  Hint: Make sure to include the resulting schema as an index, but search only based on the list of names and the input schema.
Use Case: Flexible Error Handling
module Tutorial.Predicates.ErrorHandling
import Tutorial.Predicates.Contracts
import Data.Either
import Data.List1
import Data.String
import Data.Vect
import Data.HList
import Decidable.Equality
import Text.CSV
import System.File
%default total
A recurring pattern when writing larger applications is the combination of different parts of a program each with their own failure types in a larger effectful computation. We saw this, for instance, when implementing a command-line tool for handling CSV files. There, we read and wrote data from and to files, we parsed column types and schemata, we parsed row and column indices and command-line commands. All these operations came with the potential of failure and might be implemented in different parts of our application. In order to unify these different failure types, we wrote a custom sum type encapsulating each of them, and wrote a single handler for this sum type. This approach was alright then, but it does not scale well and is lacking in terms of flexibility. We are therefore trying a different approach here. Before we continue, we quickly implement a couple of functions with the potential of failure plus some custom error types:
public export
record NoNat where
constructor MkNoNat
str : String
readNat' : String -> Either NoNat Nat
readNat' s = maybeToEither (MkNoNat s) $ parsePositive s
public export
record NoColType where
constructor MkNoColType
str : String
readColType' : String -> Either NoColType ColType
readColType' "I64" = Right I64
readColType' "Str" = Right Str
readColType' "Boolean" = Right Boolean
readColType' "Float" = Right Float
readColType' s = Left $ MkNoColType s
However, if we wanted to parse a Fin n
, there would already be two ways in which this could fail: The string in question might not represent a natural number (leading to a NoNat
error), or it could be out of bounds (leading to an OutOfBounds
error). We have to somehow encode these two possibilities in the return type, for instance, by using an Either
as the error type:
public export
record OutOfBounds where
constructor MkOutOfBounds
size : Nat
index : Nat
readFin' : {n : _} -> String -> Either (Either NoNat OutOfBounds) (Fin n)
readFin' s = do
ix <- mapFst Left (readNat' s)
maybeToEither (Right $ MkOutOfBounds n ix) $ natToFin ix n
This is incredibly ugly. A custom sum type might have been slightly better, but we still would have to use mapFst
when invoking readNat'
, and writing custom sum types for every possible combination of errors will get cumbersome very quickly as well. What we are looking for is a generalized sum type: A type indexed by a list of types (the possible choices) holding a single value of exactly one of the types in question. Here is a first naive try:
data Sum : List Type -> Type where
MkSum : (val : t) -> Sum ts
However, there is a crucial piece of information missing: We have not verified that t
is an element of ts
, nor which type it actually is. In fact, this is another case of an erased existential, and at runtime we will have no way to learn anything about t
. What we need to do is pair the value with a proof that its type t
is an element of ts
. We could use Elem
again for this, but for some use cases we will require access to the number of types in the list. We will therefore use a vector instead of a list as our index. Here is a predicate similar to Elem
but for vectors:
public export
data Has : (v : a) -> (vs : Vect n a) -> Type where
Z : Has v (v :: vs)
S : Has v vs -> Has v (w :: vs)
export
Uninhabited (Has v []) where
uninhabited Z impossible
uninhabited (S _) impossible
A value of type Has v vs
is a witness that v
is an element of vs
. With this, we can now implement an indexed sum type (also called an open union):
public export
data Union : Vect n Type -> Type where
U : (ix : Has t ts) -> (val : t) -> Union ts
export
Uninhabited (Union []) where
uninhabited (U ix _) = absurd ix
Note the difference between HList
and Union
. HList
is a generalized product type: It holds a value for each type in its index. Union
is a generalized sum type: It holds only a single value, which must be of a type listed in the index. With this we can now define a much more flexible error type:
public export
0 Err : Vect n Type -> Type -> Type
Err ts t = Either (Union ts) t
A function returning an Err ts a
describes a computation that can fail with one of the errors listed in ts
. We first need some utility functions.
inject : (prf : Has t ts) => (v : t) -> Union ts
inject v = U prf v
export
fail : Has t ts => (err : t) -> Err ts a
fail err = Left $ inject err
failMaybe : Has t ts => (err : Lazy t) -> Maybe a -> Err ts a
failMaybe err = maybeToEither (inject err)
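As a small illustration of these utilities, here is a self-contained sketch. The definitions of Has, Union, Err, inject, and fail are repeated from the text, while the error types EmptyName and TooLong and the function checkName are hypothetical and serve only as an example:

```idris
import Data.Vect

%default total

-- Definitions repeated from the text so the sketch is self-contained.
data Has : (v : a) -> (vs : Vect n a) -> Type where
  Z : Has v (v :: vs)
  S : Has v vs -> Has v (w :: vs)

data Union : Vect n Type -> Type where
  U : (ix : Has t ts) -> (val : t) -> Union ts

0 Err : Vect n Type -> Type -> Type
Err ts t = Either (Union ts) t

inject : (prf : Has t ts) => (v : t) -> Union ts
inject v = U prf v

fail : Has t ts => (err : t) -> Err ts a
fail err = Left $ inject err

-- Hypothetical error types, for illustration only.
record EmptyName where
  constructor MkEmptyName

record TooLong where
  constructor MkTooLong
  len : Nat

-- Hypothetical validator: accepts non-empty names of at most eight
-- characters. For each call to `fail`, Idris searches for the position
-- of the error type in the index.
checkName : String -> Err [EmptyName, TooLong] String
checkName s =
  let n = length s
   in if n == 0 then fail MkEmptyName
      else if n > 8 then fail (MkTooLong n)
      else Right s
```

Note that checkName never mentions Z or S: the proofs of type Has EmptyName ts and Has TooLong ts are assembled by proof search.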
Next, we can write more flexible versions of the parsers we wrote above:
readNat : Has NoNat ts => String -> Err ts Nat
readNat s = failMaybe (MkNoNat s) $ parsePositive s
readColType : Has NoColType ts => String -> Err ts ColType
readColType "I64" = Right I64
readColType "Str" = Right Str
readColType "Boolean" = Right Boolean
readColType "Float" = Right Float
readColType s = fail $ MkNoColType s
Before we implement readFin
, we introduce a shortcut for specifying that several error types must be present:
public export
0 Errs : List Type -> Vect n Type -> Type
Errs [] _ = ()
Errs (x :: xs) ts = (Has x ts, Errs xs ts)
Function Errs
returns a tuple of constraints. This can be used as a witness that all listed types are present in the vector of types: Idris will automatically extract the proofs from the tuple as needed.
export
readFin : {n : _} -> Errs [NoNat, OutOfBounds] ts => String -> Err ts (Fin n)
readFin s = do
S ix <- readNat s | Z => fail (MkOutOfBounds n Z)
failMaybe (MkOutOfBounds n (S ix)) $ natToFin ix n
As a last example, here are parsers for schemata and CSV rows:
fromCSV : String -> List String
fromCSV = forget . split (',' ==)
public export
record InvalidColumn where
constructor MkInvalidColumn
str : String
readColumn : Errs [InvalidColumn, NoColType] ts => String -> Err ts Column
readColumn s = case forget $ split (':' ==) s of
[n,ct] => MkColumn n <$> readColType ct
_ => fail $ MkInvalidColumn s
export
readSchema : Errs [InvalidColumn, NoColType] ts => String -> Err ts Schema
readSchema = traverse readColumn . fromCSV
public export
data RowError : Type where
InvalidField : (row, col : Nat) -> (ct : ColType) -> String -> RowError
UnexpectedEOI : (row, col : Nat) -> RowError
ExpectedEOI : (row, col : Nat) -> RowError
decodeField : Has RowError ts
=> (row,col : Nat)
-> (c : ColType)
-> String
-> Err ts (IdrisType c)
decodeField row col c s =
let err = InvalidField row col c s
in case c of
I64 => failMaybe err $ read s
Str => failMaybe err $ read s
Boolean => failMaybe err $ read s
Float => failMaybe err $ read s
export
decodeRow : Has RowError ts
=> {s : _}
-> (row : Nat)
-> (str : String)
-> Err ts (Row s)
decodeRow row = go 1 s . fromCSV
where go : Nat -> (cs : Schema) -> List String -> Err ts (Row cs)
go k [] [] = Right []
go k [] (_ :: _) = fail $ ExpectedEOI row k
go k (_ :: _) [] = fail $ UnexpectedEOI row k
go k (MkColumn n c :: cs) (s :: ss) =
[| decodeField row k c s :: go (S k) cs ss |]
Here is an example REPL session, where I test readSchema
. I defined variable ts
using the :let
command to make this more convenient. Note how the order of error types is of no importance, as long as types InvalidColumn
and NoColType
are present in the list of errors:
Tutorial.Predicates> :let ts = the (Vect 3 _) [NoColType,NoNat,InvalidColumn]
Tutorial.Predicates> readSchema {ts} "foo:bar"
Left (U Z (MkNoColType "bar"))
Tutorial.Predicates> readSchema {ts} "foo:Float"
Right [MkColumn "foo" Float]
Tutorial.Predicates> readSchema {ts} "foo Float"
Left (U (S (S Z)) (MkInvalidColumn "foo Float"))
Error Handling
There are several techniques for handling errors, all of which are useful at times. For instance, we might want to handle some errors early on and individually, while dealing with others much later in our application. Or we might want to handle them all in one fell swoop. We look at both approaches here.
First, in order to handle a single error individually, we need to split a union into one of two possibilities: A value of the error type in question or a new union, holding one of the other error types. We need a new predicate for this, which not only encodes the presence of a value in a vector but also the result of removing that value:
data Rem : (v : a) -> (vs : Vect (S n) a) -> (rem : Vect n a) -> Type where
[search v vs]
RZ : Rem v (v :: rem) rem
RS : Rem v vs rem -> Rem v (w :: vs) (w :: rem)
Once again, we want to use one of the indices (rem
) in our functions' return types, so we only use the other indices during proof search. Here is a function for splitting off a value from an open union:
split : (prf : Rem t ts rem) => Union ts -> Either t (Union rem)
split {prf = RZ} (U Z val) = Left val
split {prf = RZ} (U (S x) val) = Right (U x val)
split {prf = RS p} (U Z val) = Right (U Z val)
split {prf = RS p} (U (S x) val) = case split {prf = p} (U x val) of
Left vt => Left vt
Right (U ix y) => Right $ U (S ix) y
This tries to extract a value of type t
from a union. If it works, the result is wrapped in a Left
, otherwise a new union is returned in a Right
, but this one has t
removed from its list of possible types.
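The following self-contained sketch shows split in action. Has, Union, Rem, and split are repeated from the text; the error types NotFound and Denied and the two wrapper functions are hypothetical. In both wrappers the type signature alone tells Idris which error to split off:

```idris
import Data.Vect

%default total

-- Definitions repeated from the text so the sketch is self-contained.
data Has : (v : a) -> (vs : Vect n a) -> Type where
  Z : Has v (v :: vs)
  S : Has v vs -> Has v (w :: vs)

data Union : Vect n Type -> Type where
  U : (ix : Has t ts) -> (val : t) -> Union ts

data Rem : (v : a) -> (vs : Vect (S n) a) -> (rem : Vect n a) -> Type where
  [search v vs]
  RZ : Rem v (v :: rem) rem
  RS : Rem v vs rem -> Rem v (w :: vs) (w :: rem)

split : (prf : Rem t ts rem) => Union ts -> Either t (Union rem)
split {prf = RZ}   (U Z     val) = Left val
split {prf = RZ}   (U (S x) val) = Right (U x val)
split {prf = RS p} (U Z     val) = Right (U Z val)
split {prf = RS p} (U (S x) val) = case split {prf = p} (U x val) of
  Left vt        => Left vt
  Right (U ix y) => Right $ U (S ix) y

-- Hypothetical error types, for illustration only.
record NotFound where
  constructor MkNotFound
  path : String

record Denied where
  constructor MkDenied
  user : String

-- Splits off the first error type; only Denied remains in the union.
splitNotFound : Union [NotFound, Denied] -> Either NotFound (Union [Denied])
splitNotFound u = split u

-- Splits off the second error type; only NotFound remains in the union.
splitDenied : Union [NotFound, Denied] -> Either Denied (Union [NotFound])
splitDenied u = split u
```

Because of the [search v vs] annotation, the remaining vector is computed by proof search from the error type and the original index, and only then compared with the one in the signature.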
With this, we can implement a handler for single errors. Error handling often happens in an effectful context (we might want to print a message to the console or write the error to a log file), so we use an applicative effect type to handle errors in.
handle : Applicative f
=> Rem t ts rem
=> (h : t -> f a)
-> Err ts a
-> f (Err rem a)
handle h (Left x) = case split x of
Left v => Right <$> h v
Right err => pure $ Left err
handle _ (Right x) = pure $ Right x
For handling all errors at once, we can use a handler type indexed by the vector of errors, and parameterized by the output type:
namespace Handler
public export
data Handler : (ts : Vect n Type) -> (a : Type) -> Type where
Nil : Handler [] a
(::) : (t -> a) -> Handler ts a -> Handler (t :: ts) a
extract : Handler ts a -> Has t ts -> t -> a
extract (f :: _) Z val = f val
extract (_ :: fs) (S y) val = extract fs y val
extract [] ix _ = absurd ix
handleAll : Applicative f => Handler ts (f a) -> Err ts a -> f a
handleAll _ (Right v) = pure v
handleAll h (Left $ U ix v) = extract h ix v
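Here is a self-contained sketch of how handleAll might be used to collapse an open union into a single error message. The definitions of Has, Union, Err, Handler, extract, handleAll, NoNat, and OutOfBounds are repeated from the text; noNatMsg, outOfBoundsMsg, messages, and toMessage are hypothetical names introduced for this example, and Either String plays the role of the applicative effect:

```idris
import Data.Vect

%default total

-- Definitions repeated from the text so the sketch is self-contained.
data Has : (v : a) -> (vs : Vect n a) -> Type where
  Z : Has v (v :: vs)
  S : Has v vs -> Has v (w :: vs)

Uninhabited (Has v []) where
  uninhabited Z impossible
  uninhabited (S _) impossible

data Union : Vect n Type -> Type where
  U : (ix : Has t ts) -> (val : t) -> Union ts

0 Err : Vect n Type -> Type -> Type
Err ts t = Either (Union ts) t

namespace Handler
  public export
  data Handler : (ts : Vect n Type) -> (a : Type) -> Type where
    Nil  : Handler [] a
    (::) : (t -> a) -> Handler ts a -> Handler (t :: ts) a

extract : Handler ts a -> Has t ts -> t -> a
extract (f :: _)  Z     val = f val
extract (_ :: fs) (S y) val = extract fs y val
extract []        ix    _   = absurd ix

handleAll : Applicative f => Handler ts (f a) -> Err ts a -> f a
handleAll _ (Right v)       = pure v
handleAll h (Left (U ix v)) = extract h ix v

record NoNat where
  constructor MkNoNat
  str : String

record OutOfBounds where
  constructor MkOutOfBounds
  size  : Nat
  index : Nat

-- Hypothetical handlers, one per error type.
noNatMsg : NoNat -> Either String a
noNatMsg e = Left "not a natural number: \{e.str}"

outOfBoundsMsg : OutOfBounds -> Either String a
outOfBoundsMsg e = Left "index out of bounds: \{show e.index}"

-- The order of handlers must match the order of types in the index.
messages : Handler [NoNat, OutOfBounds] (Either String a)
messages = [noNatMsg, outOfBoundsMsg]

-- Converts every possible error into a plain message.
toMessage : Err [NoNat, OutOfBounds] a -> Either String a
toMessage res = handleAll messages res
```

If we forget a handler or list the handlers in the wrong order, messages no longer type checks, so we cannot accidentally leave an error unhandled.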
Below, we will see an additional way of handling all errors at once by defining a custom interface for error handling.
Exercises part 3
1. Implement the following utility functions for Union:
   project : (0 t : Type) -> (prf : Has t ts) => Union ts -> Maybe t
   project1 : Union [t] -> t
   safe : Err [] a -> a
2. Implement the following two functions for embedding an open union in a larger set of possibilities. Note the unerased implicit in extend!
   weaken : Union ts -> Union (ts ++ ss)
   extend : {m : _} -> {0 pre : Vect m _} -> Union ts -> Union (pre ++ ts)
3. Find a general way to embed a Union ts in a Union ss, so that the following is possible:
   embedTest : Err [NoNat,NoColType] a -> Err [FileError, NoColType, OutOfBounds, NoNat] a
   embedTest = mapFst embed
4. Make handle more powerful by letting the handler convert the error in question to an f (Err rem a).
The Truth about Interfaces
module Tutorial.Predicates.Truth
import Tutorial.Predicates.Contracts
import Tutorial.Predicates.ErrorHandling
import Data.Either
import Data.List1
import Data.String
import Data.Vect
import Data.HList
import Decidable.Equality
import Text.CSV
import System.File
%default total
Well, here it finally is: The truth about interfaces. Internally, an interface is just a record data type, with its fields corresponding to the members of the interface. An interface implementation is a value of such a record, annotated with a %hint
pragma (see below) to make the value available during proof search. Finally, a constrained function is just a function with one or more auto implicit arguments. For instance, here is the same function for looking up an element in a list, once with the known syntax for constrained functions, and once with an auto implicit argument. The code produced by Idris is the same in both cases:
isElem1 : Eq a => a -> List a -> Bool
isElem1 v [] = False
isElem1 v (x :: xs) = x == v || isElem1 v xs
isElem2 : {auto _ : Eq a} -> a -> List a -> Bool
isElem2 v [] = False
isElem2 v (x :: xs) = x == v || isElem2 v xs
Since interfaces are mere records, we can also take them as regular function arguments and dissect them with a pattern match:
eq : Eq a -> a -> a -> Bool
eq (MkEq feq fneq) = feq
A manual Interface Definition
I'll now demonstrate how we can achieve the same behavior with proof search as with a regular interface definition plus implementations. Since I want to finish the CSV example with our new error handling tools, we are going to implement some error handlers. First, an interface is just a record:
record Print a where
constructor MkPrint
print' : a -> String
In order to access the record in a constrained function, we use the %search
keyword, which will try to conjure a value of the desired type (Print a
in this case) by means of a proof search:
print : Print a => a -> String
print = print' %search
As an alternative, we could use a named constraint, and access it directly via its name:
print2 : (impl : Print a) => a -> String
print2 = print' impl
As yet another alternative, we could use the syntax for auto implicit arguments:
print3 : {auto impl : Print a} -> a -> String
print3 = print' impl
All three versions of print
behave exactly the same at runtime. So, whenever we write {auto x : Foo} ->
we can just as well write (x : Foo) =>
and vice versa.
Interface implementations are just values of the given record type, but in order to be available during proof search, these need to be annotated with a %hint
pragma:
%hint
noNatPrint : Print NoNat
noNatPrint = MkPrint $ \e => "Not a natural number: \{e.str}"
%hint
noColTypePrint : Print NoColType
noColTypePrint = MkPrint $ \e => "Not a column type: \{e.str}"
%hint
outOfBoundsPrint : Print OutOfBounds
outOfBoundsPrint = MkPrint $ \e => "Index is out of bounds: \{show e.index}"
%hint
rowErrorPrint : Print RowError
rowErrorPrint = MkPrint $
\case InvalidField r c ct s =>
"Not a \{show ct} in row \{show r}, column \{show c}. \{s}"
UnexpectedEOI r c =>
"Unexpected end of input in row \{show r}, column \{show c}."
ExpectedEOI r c =>
"Expected end of input in row \{show r}, column \{show c}."
We can also write an implementation of Print
for a union of errors. For this, we first come up with a proof that all types in the union's index come with an implementation of Print
:
0 All : (f : a -> Type) -> Vect n a -> Type
All f [] = ()
All f (x :: xs) = (f x, All f xs)
unionPrintImpl : All Print ts => Union ts -> String
unionPrintImpl (U Z val) = print val
unionPrintImpl (U (S x) val) = unionPrintImpl $ U x val
%hint
unionPrint : All Print ts => Print (Union ts)
unionPrint = MkPrint unionPrintImpl
Defining interfaces this way can be an advantage, as there is much less magic going on, and we have more fine grained control over the types and values of our fields. Note also, that all of the magic comes from the search hints, with which our "interface implementations" were annotated. These made the corresponding values and functions available during proof search.
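To see the whole mechanism in one place, here is a compact, self-contained sketch using a hypothetical record Pretty (a stand-in for Print, renamed so that it does not depend on the other definitions of this chapter). It includes a hint that itself carries a constraint, which is the manual counterpart of a constrained interface implementation:

```idris
%default total

-- Hypothetical "interface": just a record.
record Pretty a where
  constructor MkPretty
  pretty' : a -> String

-- Constrained function: the record value is found by proof search.
pretty : Pretty a => a -> String
pretty = pretty' %search

-- "Implementation" for Bool.
%hint
prettyBool : Pretty Bool
prettyBool = MkPretty $ \b => if b then "yes" else "no"

-- A hint with a constraint of its own: it is only applicable
-- if a value of type `Pretty a` can be found as well.
%hint
prettyMaybe : Pretty a => Pretty (Maybe a)
prettyMaybe = MkPretty go
  where
    go : Maybe a -> String
    go Nothing  = "nothing"
    go (Just v) = "just " ++ pretty v

-- Proof search combines `prettyMaybe` and `prettyBool` here.
example : String
example = pretty (Just True)
```

Removing the %hint pragma from prettyBool should make example fail to type check, since proof search would no longer find a value of type Pretty Bool.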
Parsing CSV Commands
To conclude this chapter, we reimplement our CSV command parser, using the flexible error handling approach from the last section. While not necessarily less verbose than the original parser, this approach decouples the handling of errors and printing of error messages from the rest of the application: Functions with a possibility of failure are reusable in different contexts, as are the pretty printers we use for the error messages.
First, we repeat some stuff from earlier chapters. I sneaked in a new command for printing all values in a column:
record Table where
constructor MkTable
schema : Schema
size : Nat
rows : Vect size (Row schema)
data Command : (t : Table) -> Type where
PrintSchema : Command t
PrintSize : Command t
New : (newSchema : Schema) -> Command t
Prepend : Row (schema t) -> Command t
Get : Fin (size t) -> Command t
Delete : Fin (size t) -> Command t
Col : (name : String)
-> (tpe : ColType)
-> (prf : InSchema name t.schema tpe)
-> Command t
Quit : Command t
applyCommand : (t : Table) -> Command t -> Table
applyCommand t PrintSchema = t
applyCommand t PrintSize = t
applyCommand _ (New ts) = MkTable ts _ []
applyCommand (MkTable ts n rs) (Prepend r) = MkTable ts _ $ r :: rs
applyCommand t (Get x) = t
applyCommand t Quit = t
applyCommand t (Col _ _ _) = t
applyCommand (MkTable ts n rs) (Delete x) = case n of
S k => MkTable ts k (deleteAt x rs)
Z => absurd x
Next, here is the reimplemented command parser. In total, it can fail in seven different ways, at least some of which might also be possible in other parts of a larger application.
record UnknownCommand where
constructor MkUnknownCommand
str : String
%hint
unknownCommandPrint : Print UnknownCommand
unknownCommandPrint = MkPrint $ \v => "Unknown command: \{v.str}"
record NoColName where
constructor MkNoColName
str : String
%hint
noColNamePrint : Print NoColName
noColNamePrint = MkPrint $ \v => "Unknown column: \{v.str}"
0 CmdErrs : Vect 7 Type
CmdErrs = [ InvalidColumn
, NoColName
, NoColType
, NoNat
, OutOfBounds
, RowError
, UnknownCommand ]
readCommand : (t : Table) -> String -> Err CmdErrs (Command t)
readCommand _ "schema" = Right PrintSchema
readCommand _ "size" = Right PrintSize
readCommand _ "quit" = Right Quit
readCommand (MkTable ts n _) s = case words s of
["new", str] => New <$> readSchema str
"add" :: ss => Prepend <$> decodeRow 1 (unwords ss)
["get", str] => Get <$> readFin str
["delete", str] => Delete <$> readFin str
["column", str] => case inSchema ts str of
Just (ct ** prf) => Right $ Col str ct prf
Nothing => fail $ MkNoColName str
_ => fail $ MkUnknownCommand s
Note how we could invoke functions like readFin
or readSchema
directly, because the necessary error types are part of our list of possible errors.
To conclude this section, here is the functionality for printing the result of a command plus the application's main loop. Most of this is repeated from earlier chapters, but note how we can handle all errors at once with a single call to print
:
encodeField : (t : ColType) -> IdrisType t -> String
encodeField I64 x = show x
encodeField Str x = show x
encodeField Boolean True = "t"
encodeField Boolean False = "f"
encodeField Float x = show x
encodeRow : (s : Schema) -> Row s -> String
encodeRow s = concat . intersperse "," . go s
where go : (s' : Schema) -> Row s' -> Vect (length s') String
go [] [] = []
go (MkColumn _ c :: cs) (v :: vs) = encodeField c v :: go cs vs
encodeCol : (name : String)
-> (c : ColType)
-> InSchema name s c
=> Vect n (Row s)
-> String
encodeCol name c = unlines . toList . map (\r => encodeField c $ getAt name r)
result : (t : Table) -> Command t -> String
result t PrintSchema = "Current schema: \{showSchema t.schema}"
result t PrintSize = "Current size: \{show t.size}"
result _ (New ts) = "Created table. Schema: \{showSchema ts}"
result t (Prepend r) = "Row prepended: \{encodeRow t.schema r}"
result _ (Delete x) = "Deleted row: \{show $ FS x}."
result _ Quit = "Goodbye."
result t (Col n c prf) = "Column \{n}:\n\{encodeCol n c t.rows}"
result t (Get x) =
"Row \{show $ FS x}: \{encodeRow t.schema (index x t.rows)}"
covering
runProg : Table -> IO ()
runProg t = do
putStr "Enter a command: "
str <- getLine
case readCommand t str of
Left err => putStrLn (print err) >> runProg t
Right Quit => putStrLn (result t Quit)
Right cmd => putStrLn (result t cmd) >>
runProg (applyCommand t cmd)
covering
main : IO ()
main = runProg $ MkTable [] _ []
Here is an example REPL session:
Tutorial.Predicates> :exec main
Enter a command: new name:Str,age:Int64,salary:Float
Not a column type: Int64
Enter a command: new name:Str,age:I64,salary:Float
Created table. Schema: name:Str,age:I64,salary:Float
Enter a command: add John Doe,44,3500
Row prepended: "John Doe",44,3500.0
Enter a command: add Jane Doe,50,4000
Row prepended: "Jane Doe",50,4000.0
Enter a command: get 1
Row 1: "Jane Doe",50,4000.0
Enter a command: column salary
Column salary:
4000.0
3500.0
Enter a command: quit
Goodbye.
Conclusion
Predicates allow us to describe contracts between types and to refine the values we accept as valid function arguments. They allow us to make a function safe and convenient to use at runtime and compile time by using them as auto implicit arguments, which Idris should try to construct on its own if it has enough information about the structure of a function's arguments.
Primitives
In the topics we covered so far, we hardly ever talked about primitive types in Idris. They were around and we used them in some computations, but I never really explained how they work and where they come from, nor did I show in detail what we can and can't do with them.
How Primitives are Implemented
module Tutorial.Prim.Prim
import Data.Bits
import Data.String
%default total
A Short Note on Backends
According to Wikipedia, a compiler is "a computer program that translates computer code written in one programming language (the source language) into another language (the target language)". The Idris compiler is exactly that: A program translating programs written in Idris into programs written in Chez Scheme. This scheme code is then parsed and interpreted by a Chez Scheme interpreter, which must be installed on the computers we use to run compiled Idris programs.
But that's only part of the story. Idris 2 was from the beginning designed to support different code generators (so called backends), which allows us to write Idris code to target different platforms, and your Idris installation comes with several additional backends available. You can specify the backend to use with the --cg
command-line argument (cg
stands for code generator). For instance:
idris2 --cg racket
Here is a non-comprehensive list of the backends available with a standard Idris installation (the name to be used in the command-line argument is given in parentheses):
- Racket Scheme (racket): This is a different flavour of the Scheme programming language, which can be useful when Chez Scheme is not available on your operating system.
- Node.js (node): This converts an Idris program to JavaScript.
- Browser (javascript): Another JavaScript backend, which allows you to write web applications in Idris that run in the browser.
- RefC (refc): A backend compiling Idris to C code, which is then further compiled by a C compiler.
I plan to at least cover the JavaScript backends in some more detail in another part of this Idris guide, as I use them pretty often myself.
There are also several external backends not officially supported by the Idris project, amongst which are backends for compiling Idris code to Java and Python. You can find a list of external backends on the Idris Wiki.
The Idris Primitives
A primitive data type is a type that is built into the Idris compiler together with a set of primitive functions, which are used to perform calculations on the primitives. You will therefore not find a definition of a primitive type or function in the source code of the Prelude.
Here is again the list of primitive types in Idris:
- Signed, fixed precision integers:
  - Int8: Integer in the range [-128,127]
  - Int16: Integer in the range [-32768,32767]
  - Int32: Integer in the range [-2147483648,2147483647]
  - Int64: Integer in the range [-9223372036854775808,9223372036854775807]
- Unsigned, fixed precision integers:
  - Bits8: Integer in the range [0,255]
  - Bits16: Integer in the range [0,65535]
  - Bits32: Integer in the range [0,4294967295]
  - Bits64: Integer in the range [0,18446744073709551615]
- Integer: A signed, arbitrary precision integer.
- Double: A double precision (64 bit) floating point number.
- Char: A unicode character.
- String: A sequence of unicode characters.
- %World: A symbolic representation of the current world state. We learned about this when I showed you how IO is implemented. Most of the time, you will not handle values of this type in your own code.
- Int: This one is special. It is a fixed precision, signed integer, but the bit size is somewhat dependent on the backend and (maybe) platform we use. For instance, if you use the default Chez Scheme backend, Int is a 64 bit signed integer, while on the JavaScript backends it is a 32 bit signed integer for performance reasons. Therefore, Int comes with very few guarantees, and you should use one of the well specified integer types listed above whenever possible.
It can be instructive to learn where in the compiler's source code the primitive types and functions are defined. This source code can be found in folder src
of the Idris project and the primitive types are the constant constructors of data type Core.TT.Constant
.
Primitive Functions
All calculations operating on primitives are based on two kinds of primitive functions: The ones built into the compiler (see below) and the ones defined by programmers via the foreign function interface (FFI), about which I'll talk in another chapter.
Built-in primitive functions are functions known to the compiler whose definitions cannot be found in the Prelude. They define the core functionality available for the primitive types. Typically, you do not invoke these directly (although it is perfectly fine to do so in most cases) but via functions and interfaces exported by the Prelude or the base library.
For instance, the primitive function for adding two eight bit unsigned integers is prim__add_Bits8
. You can inspect its type and behavior at the REPL:
Tutorial.Prim> :t prim__add_Bits8
prim__add_Bits8 : Bits8 -> Bits8 -> Bits8
Tutorial.Prim> prim__add_Bits8 12 100
112
If you look at the source code implementing interface Num
for Bits8
, you will see that the plus operator just invokes prim__add_Bits8
internally. The same goes for most of the other functions in primitive interface implementations. For instance, every primitive type with the exception of %World
comes with primitive comparison functions. For Bits8
, these are: prim__eq_Bits8
, prim__gt_Bits8
, prim__lt_Bits8
, prim__gte_Bits8
, and prim__lte_Bits8
. Note that these functions do not return a Bool
(which is not a primitive type in Idris), but an Int
. They are therefore not as safe or convenient to use as the corresponding operator implementations from interfaces Eq
and Ord
. On the other hand, they do not go via a conversion to Bool
and might therefore perform slightly better in performance critical code (which you can only identify after some serious profiling).
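The following sketch contrasts the two ways of comparing values of type Bits8. The names eqPrim, eqIface, agreeEqual, and agreeDistinct are hypothetical. I assume here that any non-zero result of the primitive stands for "true", which is why the result is compared with zero instead of with a specific value:

```idris
%default total

-- Uses the primitive directly. Its result is an Int, which we have
-- to convert to a Bool ourselves (non-zero is treated as "equal").
eqPrim : Bits8 -> Bits8 -> Bool
eqPrim x y = prim__eq_Bits8 x y /= 0

-- Uses the operator from interface Eq, which hides the conversion.
eqIface : Bits8 -> Bits8 -> Bool
eqIface x y = x == y

-- For values known at compile time, both versions reduce to the
-- same result, so these proofs are accepted.
agreeEqual : eqPrim 12 12 = eqIface 12 12
agreeEqual = Refl

agreeDistinct : eqPrim 12 100 = eqIface 12 100
agreeDistinct = Refl
```

Unless profiling tells you otherwise, the version going via Eq is the one to prefer: nothing stops us from accidentally adding or multiplying the Int returned by the primitive.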
As with primitive types, the primitive functions are listed as constructors in a data type (Core.TT.PrimFn
) in the compiler sources. We will look at most of these in the following sections.
Consequences of being Primitive
Primitive functions and types are opaque to the compiler in most regards: They have to be defined and implemented by each backend individually. The compiler therefore knows nothing about the inner structure of a primitive value nor about the inner workings of primitive functions. For instance, in the following recursive function, we know that the argument in the recursive call must be converging towards the base case (unless there is a bug in the backend we use), but the compiler does not:
covering
replicateBits8' : Bits8 -> a -> List a
replicateBits8' 0 _ = []
replicateBits8' n v = v :: replicateBits8' (n - 1) v
In these cases, we either must be content with just a covering function, or we use assert_smaller
to convince the totality checker (the preferred way):
replicateBits8 : Bits8 -> a -> List a
replicateBits8 0 _ = []
replicateBits8 n v = v :: replicateBits8 (assert_smaller n $ n - 1) v
I have shown you the risks of using assert_smaller
before, so we must be extra careful in making sure that the new function argument is indeed smaller with relation to the base case.
While Idris knows nothing about the internal workings of primitives and related functions, most of these functions still reduce during evaluation when fed with values known at compile time. For instance, we can trivially prove that for Bits8
the following equation holds:
zeroBits8 : the Bits8 0 = 255 + 1
zeroBits8 = Refl
Having no clue about the internal structure of a primitive nor about the implementations of primitive functions, Idris can't help us prove any general properties of such functions and values. Here is an example to demonstrate this. Assume we'd like to wrap a list in a data type indexed by the list's length:
data LenList : (n : Nat) -> Type -> Type where
MkLenList : (as : List a) -> LenList (length as) a
When we concatenate two LenList
s, the length indices should be added. That's how list concatenation affects the length of lists. We can safely teach Idris that this is true:
0 concatLen : (xs,ys : List a) -> length xs + length ys = length (xs ++ ys)
concatLen [] ys = Refl
concatLen (x :: xs) ys = cong S $ concatLen xs ys
With the above lemma, we can implement concatenation of LenList
:
(++) : LenList m a -> LenList n a -> LenList (m + n) a
MkLenList xs ++ MkLenList ys =
rewrite concatLen xs ys in MkLenList (xs ++ ys)
The same is not possible for strings. There are applications where pairing a string with its length would be useful (for instance, if we wanted to make sure that strings are getting strictly shorter during parsing and will therefore eventually be wholly consumed), but Idris cannot help us get these things right. There is no way to implement, and thus prove, the following lemma in a safe way:
0 concatLenStr : (a,b : String) -> length a + length b = length (a ++ b)
Believe Me!
In order to implement concatLenStr
, we have to abandon all safety and use the ten ton wrecking ball of type coercion: believe_me
. This primitive function allows us to freely coerce a value of any type into a value of any other type. Needless to say, this is only safe if we really know what we are doing:
concatLenStr a b = believe_me $ Refl {x = length a + length b}
The explicit assignment of variable x
in {x = length a + length b}
is necessary, because otherwise Idris will complain about an unsolved hole: It can't infer the type of parameter x
in the Refl
constructor. We could assign any type to x
here, because we are passing the result to believe_me
anyway, but I consider it to be good practice to assign one of the two sides of the equality to make our intention clear.
The higher the complexity of a primitive type, the riskier it is to assume even the most basic properties for it to hold. For instance, we might act under the delusion that floating point addition is associative:
0 doubleAddAssoc : (x,y,z : Double) -> x + (y + z) = (x + y) + z
doubleAddAssoc x y z = believe_me $ Refl {x = x + (y + z)}
Well, guess what: That's a lie. And lies lead us straight into the Void
:
Tiny : Double
Tiny = 0.0000000000000001
One : Double
One = 1.0
wrong : (0 _ : 1.0000000000000002 = 1.0) -> Void
wrong Refl impossible
boom : Void
boom = wrong (doubleAddAssoc One Tiny Tiny)
Here's what happens in the code above: The call to doubleAddAssoc
returns a proof that One + (Tiny + Tiny)
is equal to (One + Tiny) + Tiny
. But One + (Tiny + Tiny)
equals 1.0000000000000002
, while (One + Tiny) + Tiny
equals 1.0
. We can therefore pass our (wrong) proof to wrong
, because it is of the correct type, and from this follows a proof of Void
.
Working with Strings
module Tutorial.Prim.Strings
import Data.Bits
import Data.String
%default total
Module Data.String in base offers a rich set of functions for working with strings. All these are based on the following primitive operations built into the compiler:
- prim__strLength: Returns the length of a string.
- prim__strHead: Extracts the first character from a string.
- prim__strTail: Removes the first character from a string.
- prim__strCons: Prepends a character to a string.
- prim__strAppend: Appends two strings.
- prim__strIndex: Extracts a character at the given position from a string.
- prim__strSubstr: Extracts the substring between the given positions.
Needless to say, not all of these functions are total. Therefore, Idris must make sure that invalid calls do not reduce during compile time, as otherwise the compiler would crash. If, however, we force the evaluation of a partial primitive function by compiling and running the corresponding program, this program will crash with an error:
Tutorial.Prim> prim__strTail ""
prim__strTail ""
Tutorial.Prim> :exec putStrLn (prim__strTail "")
Exception in substring: 1 and 0 are not valid start/end indices for ""
Note how prim__strTail "" is not reduced at the REPL and how the same expression leads to a runtime exception if we compile and execute the program. Valid calls to prim__strTail are reduced just fine, however:
tailExample : prim__strTail "foo" = "oo"
tailExample = Refl
Pack and Unpack
Two of the most important functions for working with strings are unpack and pack, which convert a string to a list of characters and vice versa. This allows us to conveniently implement many string operations by iterating or folding over the list of characters instead. This might not always be the most efficient thing to do, but unless you plan to handle very large amounts of text, functions implemented this way work and perform reasonably well.
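As a minimal sketch of this approach, here are three small utilities that all go through the list of characters. The names countChar, isPalindrome, and squeeze are made up for this illustration and are not part of base:

```idris
%default total

-- Hypothetical helper: counts the occurrences of a character
-- by folding over the unpacked list of characters.
countChar : Char -> String -> Nat
countChar c = foldl (\n, x => if x == c then S n else n) 0 . unpack

-- Hypothetical helper: a string is a palindrome if its list of
-- characters equals the reversed list.
isPalindrome : String -> Bool
isPalindrome s = let cs = unpack s in cs == reverse cs

-- Hypothetical helper: collapses runs of identical characters
-- into a single character. `pack` converts the result back to a string.
squeeze : String -> String
squeeze = pack . go . unpack
  where
    go : List Char -> List Char
    go (x :: t@(y :: _)) = if x == y then go t else x :: go t
    go cs                = cs
```

All three functions allocate an intermediate list, which is the price we pay for the convenience of pattern matching and folding.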
String Interpolation
Idris allows us to include arbitrary string expressions in a string literal by wrapping them in curly braces, the first of which has to be escaped with a backslash. For instance:
interpEx1 : Bits64 -> Bits64 -> String
interpEx1 x y = "\{show x} + \{show y} = \{show $ x + y}"
This is a very convenient way to assemble complex strings from values of different types. In addition, there is interface Interpolation, which allows us to use values in interpolated strings without having to convert them to strings first:
data Element = H | He | C | N | O | F | Ne
Formula : Type
Formula = List (Element,Nat)
Interpolation Element where
  interpolate H  = "H"
  interpolate He = "He"
  interpolate C  = "C"
  interpolate N  = "N"
  interpolate O  = "O"
  interpolate F  = "F"
  interpolate Ne = "Ne"
Interpolation (Element,Nat) where
  interpolate (_, 0) = ""
  interpolate (x, 1) = "\{x}"
  interpolate (x, k) = "\{x}\{show k}"
Interpolation Formula where
  interpolate = foldMap interpolate
ethanol : String
ethanol = "The formula of ethanol is: \{[(C,2),(H,6),(O, the Nat 1)]}"
Raw and Multiline String Literals
In string literals, we have to escape certain characters like quotes, backslashes or new line characters. For instance:
escapeExample : String
escapeExample = "A quote: \". \nThis is on a new line.\nA backslash: \\"
Idris allows us to enter raw string literals, where there is no need to escape quotes and backslashes, by pre- and postfixing the wrapping quote characters with the same number of hash characters. For instance:
rawExample : String
rawExample = #"A quote: ". A backslash: \"#
rawExample2 : String
rawExample2 = ##"A quote: ". A backslash: \"##
With raw string literals, it is still possible to use string interpolation, but the opening curly brace has to be prefixed with a backslash and the same number of hashes as are being used for opening and closing the string literal:
rawInterpolExample : String
rawInterpolExample = ##"An interpolated "string": \##{rawExample}"##
Finally, Idris also allows us to conveniently write multiline strings. These can be pre- and postfixed with hashes if we want raw multiline string literals, and they also can be combined with string interpolation. Multiline literals are opened and closed with triple quote characters. Indenting the closing triple quotes allows us to indent the whole multiline literal. Whitespace used for indentation will not appear in the resulting string. For instance:
multiline1 : String
multiline1 = """
  And I raise my head and stare
  Into the eyes of a stranger
  I've always known that the mirror never lies
  People always turn away
  From the eyes of a stranger
  Afraid to see what hides behind the stare
  """
multiline2 : String
multiline2 = #"""
  An example for a simple expression:
  "foo" ++ "bar".
  This is reduced to "\#{"foo" ++ "bar"}".
  """#
Make sure to look at the example strings at the REPL to see the effect of interpolation and raw string literals and compare it with the syntax we used.
Exercises part 1
In these exercises, you are supposed to implement a bunch of utility functions for consuming and converting strings. I don't give the expected types here, because you are supposed to come up with those yourself.
- Implement functions similar to map, filter, and mapMaybe for strings. The output type of these should always be a string.
- Implement functions similar to foldl and foldMap for strings.
- Implement a function similar to traverse for strings. The output type should be a wrapped string.
- Implement the bind operator for strings. The output type should again be a string.
Integers
module Tutorial.Prim.Integers
import Data.Bits
import Data.String
%default total
As listed at the beginning of this chapter, Idris provides different fixed-precision signed and unsigned integer types as well as Integer, an arbitrary precision signed integer type. All of them come with the following primitive functions (given here for Bits8 as an example):
- prim__add_Bits8: Integer addition.
- prim__sub_Bits8: Integer subtraction.
- prim__mul_Bits8: Integer multiplication.
- prim__div_Bits8: Integer division.
- prim__mod_Bits8: Modulo function.
- prim__shl_Bits8: Bitwise left shift.
- prim__shr_Bits8: Bitwise right shift.
- prim__and_Bits8: Bitwise and.
- prim__or_Bits8: Bitwise or.
- prim__xor_Bits8: Bitwise xor.
Typically, you use the functions for addition and multiplication through the operators from interface Num, the function for subtraction through interface Neg, and the functions for division (div and mod) through interface Integral. The bitwise operations are available through interfaces Data.Bits.Bits and Data.Bits.FiniteBits.
For all integral types, the following laws are assumed to hold for numeric operations (x, y, and z are arbitrary values of the same primitive integral type):
- x + y = y + x: Addition is commutative.
- x + (y + z) = (x + y) + z: Addition is associative.
- x + 0 = x: Zero is the neutral element of addition.
- x - x = x + (-x) = 0: -x is the additive inverse of x.
- x * y = y * x: Multiplication is commutative.
- x * (y * z) = (x * y) * z: Multiplication is associative.
- x * 1 = x: One is the neutral element of multiplication.
- x * (y + z) = x * y + x * z: The distributive law holds.
- y * (x `div` y) + (x `mod` y) = x (for y /= 0).
Please note that the officially supported backends use Euclidean modulus for calculating mod: For y /= 0, x `mod` y is always a non-negative value strictly smaller than abs y, so that the law given above does hold. If x or y are negative numbers, this is different from what many other languages do, but for good reasons as explained in the following article.
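The division law and the range of mod can be spot-checked with a few lines of code. This is only a sketch: divModLaw, modInRange, and samples are hypothetical names, and every function that uses div or mod is marked partial here, because these two operations are not total.

```idris
-- The law from the list above, checked for a single pair of values.
-- The divisor `y` is assumed to be non-zero.
partial
divModLaw : Integer -> Integer -> Bool
divModLaw x y = y * (x `div` y) + (x `mod` y) == x

-- With Euclidean modulus, the remainder lies in the interval [0, abs y).
partial
modInRange : Integer -> Integer -> Bool
modInRange x y = let r = x `mod` y in 0 <= r && r < abs y

-- Sample pairs covering all combinations of signs.
samples : List (Integer, Integer)
samples = [(7, 3), (-7, 3), (7, -3), (-7, -3)]

-- Prints one flag per property, each telling whether the
-- property holds for all sample pairs.
partial
main : IO ()
main = printLn (all (uncurry divModLaw) samples, all (uncurry modInRange) samples)
```

On a backend that implements the behavior described above, both flags are expected to be True. A backend using truncated division would still satisfy the first property but not the second.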
Unsigned Integers
The unsigned fixed precision integer types (Bits8, Bits16, Bits32, and Bits64) come with implementations of all integral interfaces (Num, Neg, and Integral) and the two interfaces for bitwise operations (Bits and FiniteBits). All functions with the exception of div and mod are total. Overflows are handled by calculating the remainder modulo 2^bitsize. For instance, for Bits8, all operations calculate their results modulo 256:
Main> the Bits8 255 + 1
0
Main> the Bits8 255 + 255
254
Main> the Bits8 128 * 2 + 7
7
Main> the Bits8 12 - 13
255
Signed Integers
Like the unsigned integer types, the signed fixed precision integer types (Int8, Int16, Int32, and Int64) come with implementations of all integral interfaces and the two interfaces for bitwise operations (Bits and FiniteBits). Overflows are handled by calculating the remainder modulo 2^bitsize and subtracting 2^bitsize if the result is still out of range. For instance, for Int8, all operations calculate their results modulo 256, subtracting 256 if the result is still out of bounds:
Main> the Int8 2 * 127
-2
Main> the Int8 3 * 127
125
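The wrapping rule can be modeled on plain integers. The following is a sketch under the assumption that mod returns a non-negative remainder, as described earlier for the officially supported backends. The name wrapInt8 is hypothetical, and the function is marked partial because mod is:

```idris
-- Hypothetical model of `Int8` overflow handling:
-- take the remainder modulo 256, then subtract 256 if the
-- result does not fit into the range [-128, 127].
partial
wrapInt8 : Integer -> Integer
wrapInt8 n = let r = n `mod` 256 in if r > 127 then r - 256 else r
```

For the two products from the REPL session, 2 * 127 = 254 leaves remainder 254, which is out of range and becomes 254 - 256 = -2, while 3 * 127 = 381 leaves remainder 125, which is already in range.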
Bitwise Operations
Module Data.Bits exports interfaces for performing bitwise operations on integral types. I'm going to show a couple of examples on unsigned 8-bit numbers (Bits8) to explain the concept to readers new to bitwise arithmetic. Note that this is much easier to grasp for unsigned integer types than for the signed versions. Those have to include information about the sign of numbers in their bit pattern, and it is assumed that signed integers in Idris use a two's complement representation, about which I will not go into the details here.
An unsigned 8-bit binary number is represented internally as a sequence of eight bits (with values 0 or 1), each of which corresponds to a power of 2. For instance, the number 23 (= 16 + 4 + 2 + 1) is represented as 0001 0111:
23 in binary:    0   0   0   1   0   1   1   1
Bit number:      7   6   5   4   3   2   1   0
Decimal value: 128  64  32  16   8   4   2   1
We can use function testBit to check if the bit at the given position is set or not:
Tutorial.Prim> testBit (the Bits8 23) 0
True
Tutorial.Prim> testBit (the Bits8 23) 1
True
Tutorial.Prim> testBit (the Bits8 23) 3
False
Likewise, we can use functions setBit and clearBit to set or unset a bit at a certain position:
Tutorial.Prim> setBit (the Bits8 23) 3
31
Tutorial.Prim> clearBit (the Bits8 23) 2
19
There are also operators (.&.) (bitwise and) and (.|.) (bitwise or) as well as function xor (bitwise exclusive or) for performing boolean operations on integral values. For instance, x .&. y has exactly those bits set which both x and y have set, while x .|. y has all bits set that are either set in x or y (or both), and x `xor` y has those bits set that are set in exactly one of the two values:
23 in binary:          0 0 0 1 0 1 1 1
11 in binary:          0 0 0 0 1 0 1 1
23 .&. 11 in binary:   0 0 0 0 0 0 1 1
23 .|. 11 in binary:   0 0 0 1 1 1 1 1
23 `xor` 11 in binary: 0 0 0 1 1 1 0 0
And here are the examples at the REPL:
Tutorial.Prim> the Bits8 23 .&. 11
3
Tutorial.Prim> the Bits8 23 .|. 11
31
Tutorial.Prim> the Bits8 23 `xor` 11
28
Finally, it is possible to shift all bits to the right or left by a certain number of steps by using functions shiftR and shiftL, respectively (overflowing bits will just be dropped). A left shift can therefore be viewed as a multiplication by a power of two, while a right shift can be seen as a division by a power of two:
22 in binary:            0 0 0 1 0 1 1 0
22 `shiftL` 2 in binary: 0 1 0 1 1 0 0 0
22 `shiftR` 1 in binary: 0 0 0 0 1 0 1 1
And at the REPL:
Tutorial.Prim> the Bits8 22 `shiftL` 2
88
Tutorial.Prim> the Bits8 22 `shiftR` 1
11
Bitwise operations are often used in specialized code or certain high-performance applications. As programmers, we have to know they exist and how they work.
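A typical application is packing several small values into one larger value. The following sketch stores three 8-bit color channels in a Bits32 and extracts one of them again. The names rgb, green, and widen as well as the layout 0x00RRGGBB are assumptions made for this example:

```idris
import Data.Bits

%default total

-- Hypothetical example: pack three 8-bit color channels into
-- the lower 24 bits of a `Bits32` value (layout: 0x00RRGGBB).
rgb : (r, g, b : Bits8) -> Bits32
rgb r g b = (widen r `shiftL` 16) .|. (widen g `shiftL` 8) .|. widen b
  where
    widen : Bits8 -> Bits32
    widen = cast

-- Extract the green channel: shift it down to the lowest eight
-- bits and mask out everything above them.
green : Bits32 -> Bits8
green c = cast ((c `shiftR` 8) .&. 0xff)
```

Shifting moves a channel to its position, (.|.) combines channels that do not overlap, and (.&.) with a mask of eight set bits isolates a single channel.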
Integer Literals
So far, we always required an implementation of Num in order to be able to use integer literals for a given type. However, it is actually only necessary to implement a function fromInteger converting an Integer to the type in question. As we will see in the last section, such a function can even restrict the values allowed as valid literals.
For instance, assume we'd like to define a data type for representing the charge of a chemical molecule. Such a value can be positive or negative and (theoretically) of almost arbitrary magnitude:
record Charge where
  constructor MkCharge
  value : Integer
It makes sense to be able to sum up charges, but not to multiply them. They should therefore have an implementation of Monoid but not of Num. Still, we'd like to have the convenience of integer literals when using constant charges at compile time. Here's how to do this:
fromInteger : Integer -> Charge
fromInteger = MkCharge
Semigroup Charge where
  x <+> y = MkCharge $ x.value + y.value
Monoid Charge where
  neutral = 0
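To see the literals in action, here is a self-contained sketch that repeats the definitions from above and adds two hypothetical values, charges and netCharge. The integer literals in the list are desugared to calls of our own fromInteger, since no implementation of Num Charge exists:

```idris
%default total

record Charge where
  constructor MkCharge
  value : Integer

fromInteger : Integer -> Charge
fromInteger = MkCharge

Semigroup Charge where
  x <+> y = MkCharge $ x.value + y.value

Monoid Charge where
  neutral = 0

-- Hypothetical example data: the charges of three ions,
-- written as plain integer literals.
charges : List Charge
charges = [2, 1, 3]

-- `concat` combines all values using the `Monoid` implementation.
netCharge : Charge
netCharge = concat charges
```

An expression like netCharge * 2 would be rejected by the type checker, which is exactly the restriction we wanted.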
Alternative Bases
In addition to the well known decimal literals, it is also possible to use integer literals in binary, octal, or hexadecimal representation. These have to be prefixed with a zero followed by a b, o, or x for binary, octal, and hexadecimal, respectively:
Tutorial.Prim> 0b1101
13
Tutorial.Prim> 0o773
507
Tutorial.Prim> 0xffa2
65442
Exercises part 2
- Define a wrapper record for integral values and implement Monoid so that (<+>) corresponds to (.&.).
  Hint: Have a look at the functions available from interface Bits to find a value suitable as the neutral element.
- Define a wrapper record for integral values and implement Monoid so that (<+>) corresponds to (.|.).
- Use bitwise operations to implement a function, which tests if a given value of type Bits64 is even or not.
- Convert a value of type Bits64 to a string in binary representation.
- Convert a value of type Bits64 to a string in hexadecimal representation.
  Hint: Use shiftR and (.&. 15) to access subsequent packages of four bits.
Refined Primitives
module Tutorial.Prim.Refined
import Data.Bits
import Data.String
%default total
We often do not want to allow all values of a type in a certain context. For instance, String, as an arbitrary sequence of UTF-8 characters (several of which are not even printable), is too general most of the time. Therefore, it is usually advisable to rule out invalid values early on, by pairing a value with an erased proof of validity.
We have learned how we can write elegant predicates, with which we can prove our functions to be total, and from which we can, in the ideal case, derive other, related predicates. However, when we define predicates on primitives they are to a certain degree doomed to live in isolation, unless we come up with a set of primitive axioms (implemented most likely using believe_me), with which we can manipulate our predicates.
Use Case: ASCII Strings
String encoding is a difficult topic, so in many low level routines it makes sense to rule out most characters from the beginning. Assume therefore that we'd like to make sure the strings we accept in our application only consist of ASCII characters:
isAsciiChar : Char -> Bool
isAsciiChar c = ord c <= 127
isAsciiString : String -> Bool
isAsciiString = all isAsciiChar . unpack
We can now refine a string value by pairing it with an erased proof of validity:
record Ascii where
  constructor MkAscii
  value : String
  0 prf : isAsciiString value === True
It is now impossible to create a value of type Ascii, be it at runtime or at compile time, without first validating the wrapped string. With this, it is already pretty easy to safely wrap strings at compile time in a value of type Ascii:
hello : Ascii
hello = MkAscii "Hello World!" Refl
And yet, it would be much more convenient to still use string literals for this, without having to sacrifice the comfort of safety. To do so, we can't use interface FromString, as its function fromString would force us to convert any string, even an invalid one. However, we actually don't need an implementation of FromString to support string literals, just like we didn't require an implementation of Num to support integer literals. What we really need is a function named fromString. When string literals are desugared, they are converted to invocations of fromString with the given string value as its argument. For instance, literal "Hello" gets desugared to fromString "Hello". This happens before type checking and filling in of (auto) implicit values. It is therefore perfectly fine to define a custom fromString function with an erased auto implicit argument as a proof of validity:
fromString : (s : String) -> {auto 0 prf : isAsciiString s === True} -> Ascii
fromString s = MkAscii s prf
With this, we can use (valid) string literals for coming up with values of type Ascii directly:
hello2 : Ascii
hello2 = "Hello World!"
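It is instructive to also look at a literal that is rejected. The following self-contained sketch repeats the definitions from above and adds two hypothetical values, greeting and umlauts. It assumes a compiler version that supports failing blocks, which type check only if the code inside them does not:

```idris
%default total

isAsciiChar : Char -> Bool
isAsciiChar c = ord c <= 127

isAsciiString : String -> Bool
isAsciiString = all isAsciiChar . unpack

record Ascii where
  constructor MkAscii
  value : String
  0 prf : isAsciiString value === True

fromString : (s : String) -> {auto 0 prf : isAsciiString s === True} -> Ascii
fromString s = MkAscii s prf

-- Accepted: all characters are in the ASCII range, so the
-- auto implicit proof is found by reducing `isAsciiString`.
greeting : Ascii
greeting = "Hello World!"

-- Rejected: 'ü' and 'ß' lie outside the ASCII range, so no
-- proof of validity can be found for this literal.
failing
  umlauts : Ascii
  umlauts = "Grüße"
```

Removing the failing keyword and unindenting the definition of umlauts turns the invalid literal into a compile time error.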
In order to create values of type Ascii at runtime from strings of an unknown source, we can use a refinement function returning some kind of failure type:
test : (b : Bool) -> Dec (b === True)
test True = Yes Refl
test False = No absurd
ascii : String -> Maybe Ascii
ascii x = case test (isAsciiString x) of
  Yes prf   => Just $ MkAscii x prf
  No contra => Nothing
Disadvantages of Boolean Proofs
For many use cases, what we described above for ASCII strings can take us very far. However, one drawback of this approach is that we can't safely perform any computations with the proofs at hand.
For instance, we know it will be perfectly fine to concatenate two ASCII strings, but in order to convince Idris of this, we will have to use believe_me, because we will not be able to prove the following lemma otherwise:
0 allAppend :  (f : Char -> Bool)
            -> (s1,s2 : String)
            -> (p1 : all f (unpack s1) === True)
            -> (p2 : all f (unpack s2) === True)
            -> all f (unpack (s1 ++ s2)) === True
allAppend f s1 s2 p1 p2 = believe_me $ Refl {x = True}
namespace Ascii
  export
  (++) : Ascii -> Ascii -> Ascii
  MkAscii s1 p1 ++ MkAscii s2 p2 =
    MkAscii (s1 ++ s2) (allAppend isAsciiChar s1 s2 p1 p2)
The same goes for all operations extracting a substring from a given string: We will have to implement the corresponding rules using believe_me. Finding a reasonable set of axioms to conveniently deal with refined primitives can therefore be challenging at times, and whether such axioms are even required very much depends on the use case at hand.
Use Case: Sanitized HTML
Assume you write a simple web application for scientific discourse between registered users. To keep things simple, we only consider unformatted text input here. Users can write arbitrary text in a text field and upon hitting Enter, the message is displayed to all other registered users.
Assume now a user decides to enter the following text:
<script>alert("Hello World!")</script>
Well, it could have been (much) worse. Still, unless we take measures to prevent this from happening, this might embed a JavaScript program in our web page we never intended to have there! What I described here is a well known security vulnerability called cross-site scripting. It allows users of web pages to enter malicious JavaScript code in text fields, which will then be included in the page's HTML structure and executed when it is being displayed to other users.
We want to make sure that this cannot happen on our own web page. In order to protect us from this attack, we could for instance disallow certain characters like '<' or '>' completely (although this might not be enough!), but if our chat service is targeted at programmers, this will be overly restrictive. An alternative is to escape certain characters before rendering them on the page.
escape : String -> String
escape = concat . map esc . unpack
  where esc : Char -> String
        esc '<'  = "&lt;"
        esc '>'  = "&gt;"
        esc '"'  = "&quot;"
        esc '&'  = "&amp;"
        esc '\'' = "&apos;"
        esc c    = singleton c
What we now want to do is to store a string together with a proof that it was properly escaped. This is another form of existential quantification: "Here is a string, and there once existed another string, which we passed to escape and arrived at the string we have now". Here's how to encode this:
record Escaped where
  constructor MkEscaped
  value    : String
  0 origin : String
  0 prf    : escape origin === value
Whenever we now embed a string of unknown origin in our web page, we can request a value of type Escaped and have the very strong guarantee that we are no longer vulnerable to cross-site scripting attacks. Even better, it is also possible to safely embed string literals known at compile time without the need to escape them first:
namespace Escaped
  export
  fromString : (s : String) -> {auto 0 prf : escape s === s} -> Escaped
  fromString s = MkEscaped s s prf
escaped : Escaped
escaped = "Hello World!"
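For strings that are only known at runtime, a value of type Escaped can be built by running escape and keeping the input as the erased origin. The sketch below is self-contained and therefore repeats the definitions from above. The names escapeString and script are hypothetical, the entity used for the apostrophe is an assumption, and the failing block assumes a compiler version that supports this feature:

```idris
import Data.String

%default total

escape : String -> String
escape = concat . map esc . unpack
  where
    esc : Char -> String
    esc '<'  = "&lt;"
    esc '>'  = "&gt;"
    esc '"'  = "&quot;"
    esc '&'  = "&amp;"
    esc '\'' = "&apos;"
    esc c    = singleton c

record Escaped where
  constructor MkEscaped
  value    : String
  0 origin : String
  0 prf    : escape origin === value

-- Hypothetical smart constructor for strings of unknown origin:
-- the proof is trivial, because `value` is defined as `escape s`.
escapeString : String -> Escaped
escapeString s = MkEscaped (escape s) s Refl

namespace Escaped
  export
  fromString : (s : String) -> {auto 0 prf : escape s === s} -> Escaped
  fromString s = MkEscaped s s prf

-- A literal containing characters that need escaping is rejected:
-- `escape` changes the string, so no proof of `escape s === s` exists.
failing
  script : Escaped
  script = "<script>alert(1)</script>"
```

With this, user input goes through escapeString, while trusted literals are checked once by the compiler and cost nothing at runtime.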
Exercises part 3
In this massive set of exercises, you are going to build a small library for working with predicates on primitives. We want to keep the following goals in mind:
- We want to use the usual operations of propositional logic to combine predicates: Negation, conjunction (logical and), and disjunction (logical or).
- All predicates should be erased at runtime. If we prove something about a primitive number, we want to make sure not to carry around a huge proof of validity.
- Calculations on predicates should make no appearance at runtime (with the exception of decide; see below).
- Recursive calculations on predicates should be tail recursive if they are used in implementations of decide. This might be tough to achieve. If you can't find a tail recursive solution for a given problem, use what feels most natural instead.
A note on efficiency: In order to be able to run computations on our predicates, we try to convert primitive values to algebraic data types as often and as soon as possible: Unsigned integers will be converted to Nat using cast, and strings will be converted to List Char using unpack. This allows us to work with proofs on Nat and List most of the time, and such proofs can be implemented without resorting to believe_me or other cheats. However, the one advantage of primitive types over algebraic data types is that they often perform much better. This is especially critical when comparing integral types with Nat: Operations on natural numbers often run with O(n) time complexity, where n is the size of one of the natural numbers involved, while with Bits64, for instance, many operations run in fast constant time (O(1)). Luckily, the Idris compiler optimizes many functions on natural numbers to use the corresponding Integer operations at runtime. This has the advantage that we can still use proper induction to prove things about natural numbers at compile time, while getting the benefit of fast integer operations at runtime. However, operations on Nat do run with O(n) time complexity at compile time. Proofs working on large natural numbers will therefore drastically slow down the compiler. A way out of this is discussed at the end of this section of exercises.
Enough talk, let's begin! To start with, you are given the following utilities:
-- Like `Dec` but with erased proofs. Constructors `Yes0`
-- and `No0` will be converted to constants `0` and `1` by
-- the compiler!
data Dec0 : (prop : Type) -> Type where
  Yes0 : (0 prf : prop) -> Dec0 prop
  No0  : (0 contra : prop -> Void) -> Dec0 prop
-- For interfaces with more than one parameter (`a` and `p`
-- in this example) sometimes one parameter can be determined
-- by knowing the other. For instance, if we know what `p` is,
-- we will most certainly also know what `a` is. We therefore
-- specify that proof search on `Decidable` should only be
-- based on `p` by listing `p` after a vertical bar: `| p`.
-- This is like specifying the search parameter(s) of
-- a data type with `[search p]` as was shown in the chapter
-- about predicates.
-- Specifying a single search parameter as shown here can
-- drastically help with type inference.
interface Decidable (0 a : Type) (0 p : a -> Type) | p where
  decide : (v : a) -> Dec0 (p v)
-- We often have to pass `p` explicitly in order to help Idris with
-- type inference. In such cases, it is more convenient to use
-- `decideOn pred` instead of `decide {p = pred}`.
decideOn : (0 p : a -> Type) -> Decidable a p => (v : a) -> Dec0 (p v)
decideOn _ = decide
-- Some primitive predicates can only be reasonably implemented
-- using boolean functions. This utility helps with decidability
-- on such proofs.
test0 : (b : Bool) -> Dec0 (b === True)
test0 True = Yes0 Refl
test0 False = No0 absurd
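To illustrate how test0 is meant to be used, here is a small sketch of a refined type based on a boolean check. It repeats Dec0 and test0 to be self-contained. The record Percent and the function percent are hypothetical and not part of the exercises:

```idris
%default total

data Dec0 : (prop : Type) -> Type where
  Yes0 : (0 prf : prop) -> Dec0 prop
  No0  : (0 contra : prop -> Void) -> Dec0 prop

test0 : (b : Bool) -> Dec0 (b === True)
test0 True  = Yes0 Refl
test0 False = No0 absurd

-- Hypothetical refined type: a number in the range [0,100]
-- paired with an erased proof of validity.
record Percent where
  constructor MkPercent
  value : Bits8
  0 prf : (value <= 100) === True

-- Runtime refinement: the erased proof from `Yes0` can be stored
-- in the erased field of the record, so nothing but the number
-- itself is kept at runtime.
percent : Bits8 -> Maybe Percent
percent v = case test0 (v <= 100) of
  Yes0 prf => Just (MkPercent v prf)
  No0  _   => Nothing
```

In contrast to Dec, pattern matching on a Dec0 never gives us a proof we could use at runtime, which is precisely the point.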
We also want to run decidable computations at compile time. This is often much more efficient than running a direct proof search on an inductive type. We therefore come up with a predicate witnessing that a Dec0 value is actually a Yes0 together with two utility functions:
data IsYes0 : (d : Dec0 prop) -> Type where
  ItIsYes0 : {0 prf : _} -> IsYes0 (Yes0 prf)
0 fromYes0 : (d : Dec0 prop) -> (0 prf : IsYes0 d) => prop
fromYes0 (Yes0 x) = x
fromYes0 (No0 contra) impossible
0 safeDecideOn :  (0 p : a -> Type)
               -> Decidable a p
               => (v : a)
               -> (0 prf : IsYes0 (decideOn p v))
               => p v
safeDecideOn p v = fromYes0 $ decideOn p v
Finally, as we are planning to refine mostly primitives, we will at times require some sledge hammer to convince Idris that we know what we are doing:
-- only use this if you are sure that `decideOn p v`
-- will return a `Yes0`!
0 unsafeDecideOn : (0 p : a -> Type) -> Decidable a p => (v : a) -> p v
unsafeDecideOn p v = case decideOn p v of
  Yes0 prf => prf
  No0  _   =>
    assert_total $ idris_crash "Unexpected refinement failure in `unsafeDecideOn`"
- We start with equality proofs. Implement Decidable for Equal v.
  Hint: Use DecEq from module Decidable.Equality as a constraint and make sure that v is available at runtime.
- We want to be able to negate a predicate:
    data Neg : (p : a -> Type) -> a -> Type where
      IsNot : {0 p : a -> Type} -> (contra : p v -> Void) -> Neg p v
  Implement Decidable for Neg p using a suitable constraint.
- We want to describe the conjunction of two predicates:
    data (&&) : (p,q : a -> Type) -> a -> Type where
      Both : {0 p,q : a -> Type} -> (prf1 : p v) -> (prf2 : q v) -> (&&) p q v
  Implement Decidable for (p && q) using suitable constraints.
- Come up with a data type called (||) for the disjunction (logical or) of two predicates and implement Decidable using suitable constraints.
- Prove De Morgan's laws by implementing the following propositions:
    negOr : Neg (p || q) v -> (Neg p && Neg q) v
    andNeg : (Neg p && Neg q) v -> Neg (p || q) v
    orNeg : (Neg p || Neg q) v -> Neg (p && q) v
  The last of De Morgan's implications is harder to type and prove, as we need a way to come up with values of type p v and q v and show that not both can exist. Here is a way to encode this (annotated with quantity 0 as we will need to access an erased contraposition):
    0 negAnd : Decidable a p => Decidable a q => Neg (p && q) v -> (Neg p || Neg q) v
  When you implement negAnd, remember that you can freely access erased (implicit) arguments, because negAnd itself can only be used in an erased context.
So far, we implemented the tools to algebraically describe and combine several predicates. It is now time to come up with some examples. As a first use case, we will focus on limiting the valid range of natural numbers. For this, we use the following data type:
-- Proof that m <= n
data (<=) : (m,n : Nat) -> Type where
  ZLTE : 0 <= n
  SLTE : m <= n -> S m <= S n
This is similar to Data.Nat.LTE but I find operator notation often to be clearer. We also can define and use the following aliases:
(>=) : (m,n : Nat) -> Type
m >= n = n <= m
(<) : (m,n : Nat) -> Type
m < n = S m <= n
(>) : (m,n : Nat) -> Type
m > n = n < m
LessThan : (m,n : Nat) -> Type
LessThan m = (< m)
To : (m,n : Nat) -> Type
To m = (<= m)
GreaterThan : (m,n : Nat) -> Type
GreaterThan m = (> m)
From : (m,n : Nat) -> Type
From m = (>= m)
FromTo : (lower,upper : Nat) -> Nat -> Type
FromTo l u = From l && To u
Between : (lower,upper : Nat) -> Nat -> Type
Between l u = GreaterThan l && LessThan u
- Coming up with a value of type m <= n by pattern matching on m and n is highly inefficient for large values of m, as it will require m iterations to do so. However, while in an erased context, we don't need to hold a value of type m <= n. We only need to show that such a value follows from a more efficient computation. Such a computation is compare for natural numbers: Although this is implemented in the Prelude with a pattern match on its arguments, it is optimized by the compiler to a comparison of integers which runs in constant time even for very large numbers. Since Prelude.(<=) for natural numbers is implemented in terms of compare, it runs just as efficiently.
  We therefore need to prove the following two lemmas (make sure to not confuse Prelude.(<=) with Prim.(<=) in these declarations):
    0 fromLTE : (n1,n2 : Nat) -> (n1 <= n2) === True -> n1 <= n2
    0 toLTE : (n1,n2 : Nat) -> n1 <= n2 -> (n1 <= n2) === True
  They come with a quantity of 0, because they are just as inefficient as the other computations we discussed above. We therefore want to make absolutely sure that they will never be used at runtime!
  Now, implement Decidable Nat (<= n), making use of test0, fromLTE, and toLTE. Likewise, implement Decidable Nat (m <=), because we require both kinds of predicates.
  Note: You should by now figure out yourself that n must be available at runtime and how to make sure that this is the case.
Proof that
(<=)
is reflexive and transitive by declaring and implementing corresponding propositions. As we might require the proof of transitivity to chain several values of type(<=)
, it makes sense to also define a short operator alias for this. -
Proof that from
n > 0
followsIsSucc n
and vise versa. -
Declare and implement safe division and modulo functions for
Bits64
, by requesting an erased proof that the denominator is strictly positive when cast to a natural number. In case of the modulo function, return a refined value carrying an erased proof that the result is strictly smaller than the modulus:safeMod : (x,y : Bits64) -> (0 prf : cast y > 0) => Subset Bits64 (\v => cast v < cast y)
-
We will use the predicates and utilities we defined so far to convert a value of type
Bits64
to a string of digits in baseb
with2 <= b && b <= 16
. To do so, implement the following skeleton definitions:-- this will require some help from `assert_total` -- and `idris_crash`. digit : (v : Bits64) -> (0 prf : cast v < 16) => Char record Base where constructor MkBase value : Bits64 0 prf : FromTo 2 16 (cast value) base : Bits64 -> Maybe Base namespace Base public export fromInteger : (v : Integer) -> {auto 0 _ : IsJust (base $ cast v)} -> Base
Finally, implement
digits
, usingsafeDiv
andsafeMod
in your implementation. This might be challenging, as you will have to manually transform some proofs to satisfy the type checker. You might also requireassert_smaller
in the recursive step.digits : Bits64 -> Base -> String
We will now turn our focus to strings. Two of the most obvious ways in which we can restrict the strings we accept are by limiting the set of characters and limiting their lengths. More advanced refinements might require strings to match a certain pattern or regular expression. In such cases, we might either go for a boolean check or use a custom data type representing the different parts of the pattern, but we will not cover these topics here.
- Implement the following aliases for useful predicates on characters.
  Hint: Use cast to convert characters to natural numbers, use (<=) and InRange to specify regions of characters, and use (||) to combine regions of characters.
    -- Characters <= 127
    IsAscii : Char -> Type
    -- Characters <= 255
    IsLatin : Char -> Type
    -- Characters in the interval ['A','Z']
    IsUpper : Char -> Type
    -- Characters in the interval ['a','z']
    IsLower : Char -> Type
    -- Lower or upper case characters
    IsAlpha : Char -> Type
    -- Characters in the range ['0','9']
    IsDigit : Char -> Type
    -- Digits or characters from the alphabet
    IsAlphaNum : Char -> Type
    -- Characters in the ranges [0,31] or [127,159]
    IsControl : Char -> Type
    -- An ASCII character that is not a control character
    IsPlainAscii : Char -> Type
    -- A latin character that is not a control character
    IsPlainLatin : Char -> Type
- The advantage of this more modular approach to predicates on primitives is that we can safely run calculations on our predicates and get the strong guarantees from the existing proofs on inductive types like Nat and List. Here are some examples of such calculations and conversions, all of which can be implemented without cheating:
    0 plainToAscii : IsPlainAscii c -> IsAscii c
    0 digitToAlphaNum : IsDigit c -> IsAlphaNum c
    0 alphaToAlphaNum : IsAlpha c -> IsAlphaNum c
    0 lowerToAlpha : IsLower c -> IsAlpha c
    0 upperToAlpha : IsUpper c -> IsAlpha c
    0 lowerToAlphaNum : IsLower c -> IsAlphaNum c
    0 upperToAlphaNum : IsUpper c -> IsAlphaNum c
  The following (asciiToLatin) is trickier. Remember that (<=) is transitive. However, in your invocation of the proof of transitivity, you will not be able to apply direct proof search using %search because the search depth is too small. You could increase the search depth, but it is much more efficient to use safeDecideOn instead.
    0 asciiToLatin : IsAscii c -> IsLatin c
    0 plainAsciiToPlainLatin : IsPlainAscii c -> IsPlainLatin c
Before we turn our full attention to predicates on strings, we have to cover lists first, because we will often treat strings as lists of characters.
- Implement Decidable for Head:
  data Head : (p : a -> Type) -> List a -> Type where
    AtHead : {0 p : a -> Type} -> (0 prf : p v) -> Head p (v :: vs)
- Implement Decidable for Length:
  data Length : (p : Nat -> Type) -> List a -> Type where
    HasLength : {0 p : Nat -> Type} -> (0 prf : p (List.length vs)) -> Length p vs
- The following predicate is a proof that all values in a list of values fulfill the given predicate. We will use this to limit the valid set of characters in a string.
  data All : (p : a -> Type) -> (as : List a) -> Type where
    Nil  : All p []
    (::) : {0 p : a -> Type} -> (0 h : p v) -> (0 t : All p vs) -> All p (v :: vs)
  Implement Decidable for All.
  For a real challenge, try to make your implementation of decide tail recursive. This will be important for real world applications on the JavaScript backends, where we might want to refine strings of thousands of characters without overflowing the stack at runtime. In order to come up with a tail recursive implementation, you will need an additional data type AllSnoc witnessing that a predicate holds for all elements in a SnocList.
- It's time to come to an end here. An identifier in Idris is a sequence of alphanumeric characters, possibly separated by underscore characters (_). In addition, all identifiers must start with a letter. Given this specification, implement predicate IdentChar, from which we can define a new wrapper type for identifiers:
  0 IdentChars : List Char -> Type
  record Identifier where
    constructor MkIdentifier
    value : String
    0 prf : IdentChars (unpack value)
  Implement a factory method identifier for converting strings of unknown source at runtime:
  identifier : String -> Maybe Identifier
  In addition, implement fromString for Identifier and verify that the following is a valid identifier:
  testIdent : Identifier
  testIdent = "fooBar_123"
Final remarks: Proving things about the primitives can be challenging, both when deciding on what axioms to use and when trying to make things perform well at runtime and compile time. I'm experimenting with a library that deals with these issues. It is not yet finished, but you can have a look at it here.
Getting Started with pack and Idris2
Here I describe what I find to be the most convenient way to get up and running with Idris2. We are going to install the pack package manager, which will install a recent version of the Idris compiler along the way. However, this means that you need access to a Unix-like operating system such as Linux or macOS. Windows users can make use of WSL to get access to a Linux environment on their system. As a prerequisite, it is assumed that readers know how to start a terminal session on their system, and how to run commands from the terminal's command-line. In addition, readers need to know how to add directories to the $PATH
variable on their system.
Installing pack
In order to install the pack package manager together with a recent version of the Idris2 compiler, follow the instructions on pack's GitHub page.
If all goes well, I suggest you take a moment to inspect the default settings available in your global pack.toml file, which can be found at $HOME/.pack/user/pack.toml (unless you explicitly set the $PACK_DIR environment variable to a different directory). If possible, I suggest you install the rlwrap tool and change the following setting in your global pack.toml file to true:
repl.rlwrap = true
This will lead to a nicer experience when running REPL sessions. You might also want to set up your editor to make use of the interactive editing features provided by Idris. Instructions to do this for Neovim can be found here.
Updating pack and Idris
Both projects, pack and the Idris compiler, are still being actively developed. It is therefore a good idea to update them regularly. To update pack itself, just run the following command:
pack update
To build and install the latest commit of the Idris compiler and use the latest package collection, run
pack switch latest
Setting up your Playground
If you are going to solve the exercises in this tutorial (you should!), you'll have to write a lot of code. It is best to set up a small playground project for tinkering with Idris. In a directory of your choice, run the following command:
pack new lib tut
This will set up a minimal Idris package in directory tut together with an .ipkg file called tut.ipkg, a directory to put your Idris sources called src, and a minimal Idris module at src/Tut.idr.
In addition, it sets up a minimal test suite in directory test. All of this is put together and made accessible to pack in a pack.toml file in the project's root directory. Take a moment to inspect the content of every file created by pack: The .idr files contain Idris source code. The .ipkg files contain detailed descriptions of packages for the Idris compiler including where the sources are located, the modules a package makes available to other projects, and a list of packages the project itself depends on. Finally, the pack.toml file informs pack about the local packages in the current project.
With this in place, here are a few things you can do. First, make sure you are in the project's root directory (called tut if you followed my suggestion) or one of its child folders when running these commands.
To typecheck the library sources, run
pack typecheck tut
To build and execute the test suite, run
pack test tut
To start a REPL session with src/Tut.idr
loaded, run
pack repl src/Tut.idr
Conclusion
In this very short tutorial you set up an environment for working on Idris projects and following along with the main part of the tutorial. You are now ready to start with the first chapter, or - if you already wrote some Idris code - to learn about the details of the Idris module system.
Please note that this tutorial itself is set up as a pack project: It contains a pack.toml and tutorial.ipkg file in its root directory (have a look at them to get a feel for how such projects are set up) and a lot of Idris sources in the subfolders of directory src.
Interactive Editing in Neovim
Idris provides extensive capabilities to interactively analyze the types of values and expressions in our programs and fill out skeleton implementations and sometimes even whole programs for us based on the types provided. These interactive editing features are available via plugins in different editors. Since I am a Neovim user, I explain the Idris related parts of my own setup in detail here.
The main component required to get all these features to run in Neovim is an executable provided by the idris2-lsp project. This executable makes use of the Idris compiler API (application programming interface) internally and can check the syntax and types of the source code we are working on. It communicates with Neovim via the language server protocol (LSP). This communication is set up through the idris2-nvim plugin.
As we will see in this tutorial, the idris2-lsp executable not only supports syntax and type checking, but also comes with additional interactive editing features. Finally, the Idris compiler API supports semantic highlighting of Idris source code: Identifiers and keywords are highlighted not only based on the language's syntax (that would be syntax highlighting, a feature expected from all modern programming environments and editors), but also based on their semantics. For instance, a local variable in a function implementation gets highlighted differently than the name of a top level function, although syntactically these are both just identifiers.
module Appendices.Neovim
import Data.Vect
%default total
Setup
In order to make full use of interactive Idris editing in Neovim, at least the following tools need to be installed:
- A recent version of Neovim (version 0.5 or later).
- A recent version of the Idris compiler (at least version 0.5.1).
- The Idris compiler API.
- The idris2-lsp package.
- The following Neovim plugins:
The idris2-lsp
project gives detailed instructions about how to install Idris 2 together with its standard libraries and compiler API. Make sure to follow these instructions so that your compiler and idris2-lsp
executable are in sync.
If you are new to Neovim, you might want to use the init.vim file provided in the resources folder. In that case, the necessary Neovim plugins are already included, but you need to install vim-plug, a plugin manager. Afterwards, copy all or parts of resources/init.vim to your own init.vim file. (Use :help init.vim from within Neovim in order to find out where to look for this file.) After setting up your init.vim file, restart Neovim and run :PlugUpdate to install the necessary plugins.
A Typical Workflow
In order to check out the interactive editing features available to us, we will reimplement some small utilities from the Prelude. To follow along, you should have already worked through the Introduction, Functions Part 1, and at least parts of Algebraic Data Types, otherwise it will be hard to understand what's going on here.
Before we begin, note that the commands and actions shown in this tutorial might not work correctly after you edited a source file but did not write your changes to disk. Therefore, the first thing you should try if the things described here do not work, is to quickly save the current file (:w
).
Let's start with negation of a boolean value:
negate1 : Bool -> Bool
Typically, when writing Idris code we follow the mantra "types first". Although you might already have an idea about how to implement a certain piece of functionality, you still need to provide an accurate type before you can start writing your implementation. This means, when programming in Idris, we have to mentally keep track of the implementation of an algorithm and the types involved at the same time, both of which can become arbitrarily complex. Or do we? Remember that Idris knows at least as much about the variables and their types available in the current context of a function implementation as we do, so we probably should ask it for guidance instead of trying to do everything on our own.
So, in order to proceed, we ask Idris for a skeleton function body: In normal editor mode, move your cursor on the line where negate1 is declared and enter <LocalLeader>a in quick succession. <LocalLeader> is a special key that can be specified in the init.vim file. If you use the init.vim from the resources folder, it is set to the comma character (,), in which case the above command consists of a comma quickly followed by the lowercase letter "a". See also :help leader and :help localleader in Neovim.
Idris will generate a skeleton implementation similar to the following:
negate2 : Bool -> Bool
negate2 x = ?negate2_rhs
Note that on the left hand side a new variable with name x was introduced, while on the right hand side Idris added a metavariable (also called a hole). This is an identifier prefixed with a question mark. It signals to Idris that we will implement this part of the function at a later time. The great thing about holes is that we can hover over them and inspect their types and the types of values in the surrounding context. You can do so by placing the cursor on the identifier of a hole and entering K (the uppercase letter) in normal mode. This will open a popup displaying the type of the variable under the cursor plus the types and quantities of the variables in the surrounding context. You can also have this information displayed in a separate window: Enter <LocalLeader>so to open this window and repeat the hovering. The information will appear in the new window and, as an additional benefit, it will be semantically highlighted. Enter <LocalLeader>sc to close this window again. Go ahead and check out the type and context of ?negate2_rhs.
Most functions in Idris are implemented by pattern matching on one or more of the arguments. Idris, knowing the data constructors of all non-primitive data types, can write such pattern matches for us (a process also called case splitting). To give this a try, move the cursor onto the x
in the skeleton implementation of negate2
, and enter <LocalLeader>c
in normal mode. The result will look as follows:
negate3 : Bool -> Bool
negate3 False = ?negate3_rhs_0
negate3 True = ?negate3_rhs_1
As you can see, Idris inserted a hole for each of the cases on the right hand side. We can again inspect their types or replace them with a proper implementation directly.
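To round off the example, here is a minimal sketch of what the function could look like once both holes are replaced by hand. The name negate4 is made up for this illustration (chosen only to avoid clashing with the earlier versions); it is not generated by the editor.

```idris
-- Hypothetical final version: each hole from the case split
-- has been replaced with the opposite boolean value.
negate4 : Bool -> Bool
negate4 False = True
negate4 True  = False
```

Evaluating negate4 True at the REPL (for instance via <LocalLeader>x) should then yield False.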
This concludes the introduction of the (in my opinion) core features of interactive editing: Hovering on metavariables, adding skeleton function implementations, and case splitting (which also works in case blocks and for nested pattern matches). You should start using these all the time now!
Expression Search
Sometimes, Idris knows enough about the types involved to come up with a function implementation on its own. For instance, let us implement function either
from the Prelude. After giving its type, creating a skeleton implementation, and case splitting on the Either
argument, we arrive at something similar to the following:
either2 : (a -> c) -> (b -> c) -> Either a b -> c
either2 f g (Left x) = ?either2_rhs_0
either2 f g (Right x) = ?either2_rhs_1
Idris can come up with expressions for the two metavariables on its own, because the types are specific enough. Move the cursor onto one of the metavariables and enter <LocalLeader>o
in normal mode. You will be given a selection of possible expressions (only one in this case), of which you can choose a fitting one (or abort with q
).
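As a rough sketch of where this ends up, the following shows the function after accepting the only candidate for each hole. The name either3 is an assumption made for this illustration; the expressions are the ones forced by the types, since f is the only way to turn an a into a c and g the only way to turn a b into a c.

```idris
-- Hypothetical result of running expression search on both holes.
either3 : (a -> c) -> (b -> c) -> Either a b -> c
either3 f g (Left x)  = f x
either3 f g (Right x) = g x
```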
Here is another example: A reimplementation of function maybe. If you run an expression search on ?maybe2_rhs_1, you will get a larger list of choices.
maybe2 : b -> (a -> b) -> Maybe a -> b
maybe2 x f Nothing = x
maybe2 x f (Just y) = ?maybe2_rhs_1
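The reason for the larger list is that several expressions of type b are in scope here: the default value x type checks just as well as f y. A sketch of the version we most likely want (the name maybe3 is made up for this illustration) picks the candidate that actually uses the wrapped value:

```idris
-- Hypothetical final version: of the candidates offered by
-- expression search, we choose the one applying f to y.
maybe3 : b -> (a -> b) -> Maybe a -> b
maybe3 x f Nothing  = x
maybe3 x f (Just y) = f y
```

This is a good reminder that expression search only finds something that type checks, not necessarily the intended behavior.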
Idris is also sometimes capable of coming up with complete function implementations based on a function's type. For this to work well in practice, the number of possible implementations satisfying the type checker must be pretty small. As an example, here is function zipWith
for vectors. You might not have heard about vectors yet: They will be introduced in the chapter about dependent types. You can still give this a go to check out its effect. Just move the cursor on the line declaring zipWithV
, enter <LocalLeader>gd
and select the first option. This will automatically generate the whole function body including case splits and implementations.
zipWithV : (a -> b -> c) -> Vect n a -> Vect n b -> Vect n c
Expression search only works well if the types are specific enough. If you feel like that might be the case, go ahead and give it a go, either by running <LocalLeader>o
on a metavariable, or by trying <LocalLeader>gd
on a function declaration.
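For reference, here is a sketch of the kind of definition one would expect for the vector example above. The exact output of the plugin may differ in naming and layout, and zipWithV2 is a made-up name to keep it apart from the declaration in the text.

```idris
import Data.Vect

-- Hypothetical generated body: the shared length index n rules out
-- mixed cases such as an empty and a non-empty vector.
zipWithV2 : (a -> b -> c) -> Vect n a -> Vect n b -> Vect n c
zipWithV2 f [] [] = []
zipWithV2 f (x :: xs) (y :: ys) = f x y :: zipWithV2 f xs ys
```

Because both vectors have the same length n, only these two clauses are needed for the function to be covering.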
More Code Actions
There are other shortcuts available for generating part of your code, two of which I'll explain here.
First, it is possible to add a new case block by entering <LocalLeader>mc
in normal mode when on a metavariable. For instance, here is part of an implementation of filterList
, which appears in an exercise in the chapter about algebraic data types. I arrived at this by letting Idris generate a skeleton implementation followed by a case split and an expression search on the first metavariable:
filterList : (a -> Bool) -> List a -> List a
filterList f [] = []
filterList f (x :: xs) = ?filterList_rhs_1
Next, we have to pattern match on the result of applying f to x. Idris can introduce a new case block for us if we move the cursor onto metavariable ?filterList_rhs_1 and enter <LocalLeader>mc in normal mode. We can then continue with our implementation by first giving the expression to use in the case block (f x) followed by a case split on the new variable in the case block. This will lead us to an implementation similar to the following (I had to fix the indentation, though):
filterList2 : (a -> Bool) -> List a -> List a
filterList2 f [] = []
filterList2 f (x :: xs) = case f x of
  False => ?filterList2_rhs_2
  True  => ?filterList2_rhs_3
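A possible way to fill the two remaining holes is sketched below. The name filterList3 is an assumption for this illustration: elements for which the predicate returns False are dropped, all others are kept.

```idris
-- Hypothetical completed version of the filter function.
filterList3 : (a -> Bool) -> List a -> List a
filterList3 f [] = []
filterList3 f (x :: xs) = case f x of
  False => filterList3 f xs        -- drop x
  True  => x :: filterList3 f xs   -- keep x
```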
Sometimes, we want to extract a utility function from an implementation we are working on. For instance, this is often useful or even necessary when we write proofs about our code (see chapters Propositional Equality and Predicates, for instance). In order to do so, we can move the cursor on a metavariable, and enter <LocalLeader>ml
. Give this a try with ?whatNow
in the following example (this will work better in a regular Idris source file instead of the literate file I use for this tutorial):
traverseEither : (a -> Either e b) -> List a -> Either e (List b)
traverseEither f [] = Right []
traverseEither f (x :: xs) = ?whatNow x xs f (f x) (traverseEither f xs)
Idris will create a new function declaration with the type and name of ?whatNow
, which takes as arguments all variables currently in scope. It also replaces the hole in traverseEither
with a call to this new function. Typically, you will have to manually remove unneeded arguments afterwards. This led me to the following version:
whatNow2 : Either e b -> Either e (List b) -> Either e (List b)
traverseEither2 : (a -> Either e b) -> List a -> Either e (List b)
traverseEither2 f [] = Right []
traverseEither2 f (x :: xs) = whatNow2 (f x) (traverseEither2 f xs)
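The extracted utility still lacks an implementation. One way to write it is sketched here, with whatNow3 and traverseEither3 being made-up names for this illustration: the first error encountered wins, and otherwise the new value is prepended to the already traversed tail.

```idris
-- Hypothetical implementation of the extracted helper.
whatNow3 : Either e b -> Either e (List b) -> Either e (List b)
whatNow3 (Left err) _          = Left err
whatNow3 (Right _)  (Left err) = Left err
whatNow3 (Right v)  (Right vs) = Right (v :: vs)

traverseEither3 : (a -> Either e b) -> List a -> Either e (List b)
traverseEither3 f []        = Right []
traverseEither3 f (x :: xs) = whatNow3 (f x) (traverseEither3 f xs)
```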
Getting Information
The idris2-lsp executable and, through it, the idris2-nvim plugin support more than the code actions described above. Here is a non-comprehensive list of other capabilities. I suggest you try out each of them from within this source file.
- Typing K when on an identifier or operator in normal mode shows its type and namespace (if any). In case of a metavariable, variables in the current context are displayed as well together with their types and quantities (quantities will be explained in Functions Part 2). If you don't like popups, enter <LocalLeader>so to open a new window where this information is displayed and semantically highlighted instead.
- Typing gd on a function, operator, data constructor or type constructor in normal mode jumps to the item's definition. For external modules, this works only if the module in question has been installed together with its source code (by using the idris2 --install-with-src command).
- Typing <LocalLeader>mm opens a popup window listing all metavariables in the current module. You can place the cursor on an entry and jump to its location by pressing <Enter>.
- Typing <LocalLeader>mn (or <LocalLeader>mp) jumps to the next (or previous) metavariable in the current module.
- Typing <LocalLeader>br opens a popup where you can enter a namespace. Idris will then show all functions (plus their types) exported from that namespace in a popup window, and you can jump to a function's definition by pressing enter on one of the entries. Note: The module in question must be imported in the current source file.
- Typing <LocalLeader>x opens a popup where you can enter a REPL command or Idris expression, and the plugin will reply with a response from the REPL. Whenever REPL examples are shown in the main part of this guide, you can try them from within Neovim with this shortcut if you like.
- Typing <LocalLeader><LocalLeader>e will display the error message from the current line in a popup window. This can be highly useful if error messages are too long to fit on a single line. Likewise, <LocalLeader><LocalLeader>el will list all error messages from the current buffer in a new window. You can then select an error message and jump to its origin by pressing <Enter>.
Other use cases and examples are described on the GitHub page of the idris2-nvim
plugin and can be included as described there.
The %name
Pragma
When you ask Idris for a skeleton implementation with <LocalLeader>a
or a case split with <LocalLeader>c
, it has to decide on what names to use for the new variables it introduces. If these variables already have predefined names (from the function's signature, record fields, or named data constructor arguments), those names will be used, but otherwise Idris will as a default use names x
, y
, and z
, followed by other letters. You can change this default behavior by specifying a list of names to use for such occasions for any data type.
For instance:
data Element = H | He | C | N | O | F | Ne
%name Element e,f
Idris will then use these names (followed by these names postfixed with increasing integers), when it has to come up with variable names of this type on its own. For instance, here is a test function and the result of adding a skeleton definition to it:
test : Element -> Element -> Element -> Element -> Element -> Element
test e f e1 f1 e2 = ?test_rhs
Conclusion
Neovim, together with the idris2-lsp
executable and the idris2-nvim
editor plugin, provides extensive utilities for interactive editing when programming in Idris. Similar functionality is available for some other editors, so feel free to ask what's available for your editor of choice, for instance on the Idris 2 Discord channel.
Structuring Idris Projects
In this section I'm going to show how to organize, install, and depend on larger Idris projects. We will have a look at Idris packages, the module system, visibility of types and functions, writing comments and doc strings, and using pack for managing our libraries.
This section should be useful for all readers who have already written a bit of Idris code. We will not do any fancy type level wizardry in here, but I'll demonstrate several concepts using failing
code blocks, which you might not have seen before. This rather new addition to the language allows us to write code that is expected to fail during elaboration (type checking). For instance:
failing "Can't find an implementation for FromString Bits8."
  ohno : Bits8
  ohno = "Oh no!"
As part of a failing block, we can give a substring of the compiler's error message for documentation purposes and to make sure the block fails with the expected error.
Modules
Every Idris source file defines a module, typically starting with a module header like the one below:
module Appendices.Projects
A module's name consists of several upper case identifiers separated by dots, which must reflect the path of the .idr
file where the module is stored. For instance, this module is stored in file Appendices/Projects.md
, so the module's name is Appendices.Projects
.
"But wait!", I hear you say, "What about the parent folder(s) of Appendices
? Why aren't those part of the module's name?" In order to understand this, we must talk about the concept of the source directory. The source directory is where Idris looks for source files. It defaults to the directory from which the Idris executable is run. For instance, when in folder src
of this project, you can open this source file like so:
idris2 Appendices/Projects.md
This will not work, however, if you try the same thing from this project's root folder:
$ idris2 src/Appendices/Projects.md
...
Error: Module name Appendices.Projects does not match file name "src/Appendices/Projects.md"
...
So, which folder names to include in a module name depends on the parent folder we consider to be our source directory. It is common practice to name the source directory src
, although this is not mandatory (as I said above, the default is actually the directory from which we run Idris). It is possible to change the source directory with the --source-dir
command-line option. The following works from within this project's root directory:
idris2 --source-dir src src/Appendices/Projects.md
And the following would work from a parent directory (assuming this tutorial is stored in folder tutorial
):
idris2 --source-dir tutorial/src tutorial/src/Appendices/Projects.md
Most of the time, however, you will specify an .ipkg
file for your project (see later in this section) and define the source directory there. Afterwards, you can use pack (instead of the idris2
executable) to start REPL sessions and load your source files.
Module Imports
You often need to import functions and data types from other modules when writing Idris code. This can be done with an import statement. Here are several examples showing what these might look like:
import Data.String
import Data.List
import Text.CSV
import public Appendices.Neovim
import Data.Vect as V
import public Data.List1 as L
The first two lines import modules from another package (we will learn about packages below): Data.String and Data.List from the base package, which will be installed as part of your Idris installation.
The line importing Text.CSV refers to a module from within our own source directory src. It is always possible to import modules that are part of the same source directory as the file we are working on.
The line importing Appendices.Neovim again refers to a module from our own source directory. Note, however, that this import statement comes with an additional public keyword. This allows us to re-export a module, so that it is available from within other modules in addition to the current module: If another module imports Appendices.Projects, module Appendices.Neovim will be imported as well without the need of an additional import statement. This is useful when we split some complex functionality across different modules and want to import the lot via a single catch-all module. See module Control.Monad.State in base for an example. You can look at the Idris sources on GitHub or locally after cloning the Idris2 project. The base library can be found in the libs/base subfolder.
It often happens that in order to make use of functions from some module A
we also require utilities from another module B
, so A
should re-export B
. For instance, Data.Vect
in base re-exports Data.Fin
, because the latter is often required when working with vectors.
The line importing Data.Vect gives the module a new name V, to be used as a shorter prefix. If you often need to disambiguate identifiers by prefixing them with a module's name, this can help make your code more concise:
vectSum : Nat
vectSum = sum $ V.fromList [1..10]
Finally, on the last line we publicly import a module and give it a new name. This name will then be the one seen when we transitively import Data.List1 via Appendices.Projects. To see this, start a REPL session (after type checking the tutorial) without loading a source file from this project's root folder:
pack typecheck tutorial
pack repl
Now load module Appendices.Projects and check out the type of singleton:
Main> :module Appendices.Projects
Imported module Appendices.Projects
Main> :t singleton
Data.String.singleton : Char -> String
Data.List.singleton : a -> List a
L.singleton : a -> List1 a
As you can see, the List1
version of singleton
is now prefixed with L
instead of Data.List1
. It is still possible to use the "official" prefix, though:
Main> List1.singleton 12
12 ::: []
Main> L.singleton 12
12 ::: []
Namespaces
At times, we want to define several functions or data types with the same name in a single module. Idris does not allow this, because every name must be unique in its namespace, and the namespace of a module is just the fully qualified module name. However, it is possible to define additional namespaces within a module by using the namespace
keyword followed by the name of the namespace. All functions which should belong to this namespace must then be indented by the same amount of whitespace.
Here's an example:
data HList : List Type -> Type where
  Nil  : HList []
  (::) : (v : t) -> (vs : HList ts) -> HList (t :: ts)

head : HList (t :: ts) -> t
head (v :: _) = v

tail : HList (t :: ts) -> HList ts
tail (_ :: vs) = vs

namespace HVect
  public export
  data HVect : Vect n Type -> Type where
    Nil  : HVect []
    (::) : (v : t) -> (vs : HVect ts) -> HVect (t :: ts)

  public export
  head : HVect (t :: ts) -> t
  head (v :: _) = v

  public export
  tail : HVect (t :: ts) -> HVect ts
  tail (_ :: vs) = vs
Function names HVect.head
and HVect.tail
as well as constructors HVect.Nil
and HVect.(::)
would clash with functions and constructors of the same names from the outer namespace (Appendices.Projects
), so we had to put them in their own namespace. In order to be able to use them from outside their namespace, they need to be exported (see the section on visibility below). In case we need to disambiguate between these names, we can prefix them with part of their namespace. For instance, the following fails with a disambiguation error, because there are several functions called head
in scope and it is not clear from head
's argument (some data type supporting list syntax, of which again several are in scope), which version we want:
failing "Ambiguous elaboration."
  whatHead : Nat
  whatHead = head [12,"foo"]
By prefixing head
with part of its namespace, we can resolve both ambiguities. It is now immediately clear, that [12,"foo"]
must be an HVect
, because that's the type of HVect.head
's argument:
thisHead : Nat
thisHead = HVect.head [12,"foo"]
In the following subsection I'll make use of namespaces to demonstrate the principles of visibility.
Visibility
In order to use functions and data types outside of the module or namespace they were defined in, we need to change their visibility. The default visibility is private
: Such a function or data type is not visible from outside its module or namespace:
namespace Foo
  foo : Nat
  foo = 12

failing "Name Appendices.Projects.Foo.foo is private."
  bar : Nat
  bar = 2 * foo
To make a function visible, annotate it with the export
keyword:
namespace Square
  export
  square : Num a => a -> a
  square v = v * v
This will allow us to invoke function square
from within other modules or namespaces (after importing Appendices.Projects
):
OneHundred : Bits8
OneHundred = square 10
However, the implementation of square
will not be exported, so square
will not reduce during elaboration:
failing "Can't solve constraint between: 100 and square 10."
  checkOneHundred : OneHundred === 100
  checkOneHundred = Refl
For this to work, we need to publicly export square
:
namespace SquarePub
  public export
  squarePub : Num a => a -> a
  squarePub v = v * v

OneHundredAgain : Bits8
OneHundredAgain = squarePub 10

checkOneHundredAgain : OneHundredAgain === 100
checkOneHundredAgain = Refl
Therefore, if you need a function to reduce during elaboration, annotate it with public export instead of export. This is especially important if you use a function to compute a type. Such functions must reduce during elaboration, otherwise they are completely useless:
namespace Stupid
  export
  0 NatOrString : Type
  NatOrString = Either String Nat

failing "Can't solve constraint between: Either String ?b and NatOrString."
  natOrString : NatOrString
  natOrString = Left "foo"
If we publicly export our type alias, everything type checks fine:
namespace Better
  public export
  0 NatOrString : Type
  NatOrString = Either String Nat

natOrString : Better.NatOrString
natOrString = Left "bar"
Visibility of Data Types
Visibility of data types behaves slightly differently. If set to private (the default), neither the type constructor nor the data constructors are visible outside of the namespace they were defined in. If annotated with export, the type constructor is exported but not the data constructors:
namespace Export
  export
  data Foo : Type where
    Foo1 : String -> Foo
    Foo2 : Nat -> Foo

  export
  mkFoo1 : String -> Export.Foo
  mkFoo1 = Foo1

foo1 : Export.Foo
foo1 = mkFoo1 "foo"
As you can see, we can use the type Foo
as well as function mkFoo1
outside of namespace Export
. However, we cannot use the Foo1
constructor to create a value of type Foo
directly:
failing "Export.Foo1 is private."
  foo : Export.Foo
  foo = Foo1 "foo"
This changes when we publicly export the data type:
namespace PublicExport
  public export
  data Foo : Type where
    Foo1 : String -> PublicExport.Foo
    Foo2 : Nat -> PublicExport.Foo
foo2 : PublicExport.Foo
foo2 = Foo2 12
The same goes for interfaces: If they are publicly exported, the interface (a type constructor) plus all its functions are exported and you can write implementations outside the namespace where they were defined:
namespace PEI
  public export
  interface Sized a where
    size : a -> Nat
Sized Nat where size = id
sumSizes : Foldable t => Sized a => t a -> Nat
sumSizes = foldl (\n,e => n + size e) 0
If they are not publicly exported, you will not be able to write implementations outside the namespace they were defined in (but you can still use the type and its functions in your code):
namespace EI
  export
  interface Empty a where
    empty : a -> Bool
  export
  Empty (List a) where
    empty [] = True
    empty _ = False
failing
  Empty Nat where
    empty Z = True
    empty (S _) = False
nonEmpty : Empty a => a -> Bool
nonEmpty = not . empty
Child Namespaces
Sometimes, it is necessary to access a private function in another module or namespace. This is possible from within child namespaces (for want of a better name): Modules and namespaces sharing the parent module's or namespace's prefix. For instance:
namespace Inner
  testEmpty : Bool
  testEmpty = nonEmpty (the (List Nat) [12])
As you can see, we can access function nonEmpty
from within namespace Appendices.Projects.Inner
, although it is a private function of module Appendices.Projects
. This is even possible for modules: If we were to write a module Data.List.Magic
, we'd have access to private utility functions defined in module Data.List
in base. Actually, I did just that and added module Data.List.Magic
demonstrating this quirk of the Idris module system (go have a look!). In general, this is a rather hacky way to work around visibility constraints, but it can be useful at times.
Parameter Blocks
In this subsection, we are going to have a look at a language construct called a parameters
block, which enables us to share a set of common read-only arguments (parameters) across several functions, thus allowing us to write more concise function signatures. I'm going to demonstrate their usability with a small example program.
The most basic way to make some piece of external information available to a function is by passing it as an additional argument. In object-oriented programming, this principle is sometimes called dependency injection. A lot of fuss is made about it, and whole libraries and frameworks have been built around it.
In functional programming, we can be perfectly relaxed about all of this: Need access to some configuration data for your application? Pass it as an additional argument to your functions. Want to use some local mutable state? Pass the corresponding IORef
as an additional argument to your functions. This is both highly efficient and incredibly simple. The only drawback it has: It can blow up our function signatures. There is even a monad for abstracting over this concept, called the Reader
monad. It can be found in module Control.Monad.Reader
, in the base library.
In Idris, however, there is an even simpler approach: We can use proof search with auto implicit arguments for dependency injection. Here's some example code:
data Error : Type where
  NoNat : String -> Error
  NoBool : String -> Error
record Console where
  constructor MkConsole
  read : IO String
  put : String -> IO ()
record ErrorHandler where
  constructor MkHandler
  handle : Error -> IO ()
getCount' : (h : ErrorHandler) => (c : Console) => IO Nat
getCount' = do
  str <- c.read
  case parsePositive str of
    Nothing => h.handle (NoNat str) $> 0
    Just n => pure n
getText' : (h : ErrorHandler) => (c : Console) => (n : Nat) -> IO (Vect n String)
getText' n = sequence $ replicate n c.read
prog' : ErrorHandler => (c : Console) => IO ()
prog' = do
  c.put "Please enter the number of lines to read."
  n <- getCount'
  c.put "Please enter \{show n} lines of text."
  ls <- getText' n
  c.put "Read \{show n} lines and \{show . sum $ map length ls} characters."
The example program reads input from and prints output to some Console
type, the implementation of which is left to the caller of the function. This is a typical example of dependency injection: Our IO
actions know nothing about how to read and write lines of text (they do not, for instance, invoke putStrLn
or getLine
directly), but rely on an external object to handle these tasks for us. This allows us to use a simple mock object during testing, while using - for instance - two file handles or database connections when running the application for real. These are typical techniques often found in object-oriented programming, and in fact, this example emulates typical object-oriented patterns in a purely functional programming language: A type like Console
can be viewed as a class providing pieces of functionality (methods read
and put
), and a value of type Console
can be viewed as an object of this class, on which we can invoke those methods.
The same goes for error handling: Our error handler could just silently ignore any error that occurs, or it could print it to stderr
and write it to a log file at the same time. Whatever it does, our functions need not care.
Note, however, that even in this very simple example we already introduced two additional function arguments. In a real-world application we might need many more of those, which would quickly blow up our function signatures. Luckily, there is a very clean and simple solution to this in Idris: parameters
blocks. These allow us to specify lists of parameters (unchanging function arguments) shared by all functions listed inside the block. These arguments then no longer need to be listed with each function, thus decluttering our function signatures. Here's the example from above in a parameter block:
parameters {auto c : Console} {auto h : ErrorHandler}
  getCount : IO Nat
  getCount = do
    str <- c.read
    case parsePositive str of
      Nothing => h.handle (NoNat str) $> 0
      Just n => pure n
  getText : (n : Nat) -> IO (Vect n String)
  getText n = sequence $ replicate n c.read
  prog : IO ()
  prog = do
    c.put "Please enter the number of lines to read."
    n <- getCount
    c.put "Please enter \{show n} lines of text."
    ls <- getText n
    c.put "Read \{show n} lines and \{show . sum $ map length ls} characters."
We are free to list arbitrary arguments (implicit, explicit, auto-implicit, named and unnamed) of any quantity as the parameters in a parameters
block, but it works best with implicit and auto implicit arguments. Explicit arguments will have to be passed explicitly to functions in a parameter block, even when invoking them from other parameter blocks with the same explicit argument. This can be rather confusing.
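To make the remark about explicit arguments more concrete, here is a minimal sketch of a parameters block with a single explicit parameter. The names factor, scale, scaleAll, and doubled are made up for this illustration and do not appear elsewhere in the tutorial. Inside the block, scale can be used without mentioning factor, but a caller outside the block has to pass it as the first argument:

```idris
parameters (factor : Nat)
  -- Multiplies a single number by the shared parameter.
  scale : Nat -> Nat
  scale n = factor * n

  -- Inside the block, `scale` is used without passing `factor`.
  scaleAll : List Nat -> List Nat
  scaleAll = map scale

-- Outside the block, the explicit parameter has to be passed first.
doubled : List Nat
doubled = scaleAll 2 [1, 2, 3]
```

With an auto-implicit parameter, as in the Console example above, the call site would not have to mention the argument at all, which is why those tend to be the more convenient choice.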
To complete this example, here is a main function for running the program. Note how we explicitly assemble the Console
and ErrorHandler
to be used when invoking prog
.
main : IO ()
main =
  let cons := MkConsole (trim <$> getLine) putStrLn
      err := MkHandler (const $ putStrLn "It didn't work")
   in prog
Dependency injection via auto-implicit arguments is only one possible application of parameter blocks. They are useful in general whenever we have repeating argument lists for several functions.
Documentation
Documentation is key. Be it for other programmers using a library we wrote, or for people (including our future selves) trying to understand our code, it is important to annotate our code with comments explaining non-trivial implementation details and docstrings describing the intent and functionality of exported data types and functions.
Comments
Writing a comment in an Idris source file is as simple as adding some text after two hyphens:
-- this is a truly boring comment
boring : Bits8 -> Bits8
boring a = a -- probably I should just use `id` from the Prelude
Whenever a line contains two hyphens that are not part of a string literal, the remainder of the line will be interpreted as a comment by Idris.
It is also possible to write multiline comments using delimiters {-
and -}
:
{-
This is a multiline comment. It can be used to comment
out whole blocks of code, for instance if we get several
type errors in a larger source file.
-}
Doc Strings
While comments are targeted at programmers reading and trying to understand our source code, doc strings provide documentation for exported functions and data types, explaining their intent and behavior to others.
Here's an example of a documented function:
||| Tries to extract the first two elements from the beginning
||| of a list.
|||
||| Returns a pair of values wrapped in a `Just` if the list has
||| two elements or more. Returns `Nothing` if the list has fewer
||| than two elements.
export
firstTwo : List a -> Maybe (a,a)
firstTwo (x :: y :: _) = Just (x,y)
firstTwo _ = Nothing
We can view a doc string at the REPL:
Appendices.Projects> :doc firstTwo
Appendices.Projects.firstTwo : List a -> Maybe (a,a)
Tries to extract the first two elements from the beginning
of a list.
Returns a pair of values wrapped in a `Just` if the list has
two elements or more. Returns `Nothing` if the list has fewer
than two elements.
Visibility: export
We can document data types and their constructors in a similar manner:
||| A binary tree indexed by the number of values it holds.
|||
||| @param `n` : Number of values stored in the `Tree`
||| @param `a` : Type of values stored in the `Tree`
public export
data Tree : (n : Nat) -> (a : Type) -> Type where
  ||| A single value stored at the leaf of a binary tree.
  Leaf : (v : a) -> Tree 1 a
  ||| A branch unifying two subtrees.
  Branch : Tree m a -> Tree n a -> Tree (m + n) a
Go ahead and have a look at the doc strings this generates at the REPL.
Documenting our code is very important. You will realize this once you try to understand other people's code, or when you come back to a non-trivial piece of source code you wrote yourself a couple of months ago and haven't looked at since. If it is not well documented, this can be an unpleasant experience. Idris provides us with the tools necessary to document and annotate our code, so we should take our time and do so. It is time well spent.
Packages
Idris packages allow us to assemble several modules into a logical unit and make them available to other Idris projects by installing the packages. In this section, we are going to learn about the structure of an Idris package and how to depend on other packages in our projects.
The .ipkg
File
At the heart of an Idris package lies its .ipkg
file, which is usually but not necessarily stored at a project's root directory. For instance, for this Idris tutorial, there is file tutorial.ipkg
at the tutorial's root directory.
An .ipkg
file consists of several key-value pairs (most of them optional), the most important of which I'll describe here. By far the easiest way to set up a new Idris project is by letting pack or Idris itself do it for you. Just run
pack new lib pkgname
to create the skeleton of a new library or
pack new bin appname
to set up a new application. In addition to creating a new directory plus a suitable .ipkg
file, these commands will also add a pack.toml
file, which we will discuss further below.
Dependencies
One of the most important aspects of an .ipkg
file is listing the packages the library depends on in the depends
field. Here is an example from the hedgehog package, a framework for writing property tests in Idris:
depends = base >= 0.5.1
, contrib >= 0.5.1
, elab-util >= 0.5.0
, pretty-show >= 0.5.0
, sop >= 0.5.0
As you can see, hedgehog depends on base and contrib, both of which are part of every Idris installation, but also on elab-util, a library of utilities for writing elaborator scripts (a powerful technique for creating Idris declarations by writing Idris code; it comes with its own lengthy tutorial if you are interested), sop, a library for generically deriving interface implementations via a sum of products representation (this is a useful thing you might want to check out some day), and pretty-show, a library for pretty printing Idris values (hedgehog makes use of this in case a test fails).
So, before you can actually use hedgehog to write some property tests for your own project, you will need to install the packages it depends on before installing hedgehog itself. Since this can be tedious to do manually, it is best to let a package manager like pack handle this task for you.
Dependency Versions
You might want to specify a certain version (or a range) Idris should use for your dependencies. This might be useful if you have several versions of the same package installed and not all of them are compatible with your project. Here are several examples:
depends = base == 0.5.1
, contrib == 0.5.1
, elab-util >= 0.5.0
, pretty-show
, sop >= 0.5.0 && < 0.6.0
This will look for packages base and contrib of exactly the given version, package elab-util of a version greater than or equal to 0.5.0
, package pretty-show of any version, and package sop of a version in the given range. In all cases, if several installed versions of a package match the specified range, the latest version will be used.
In order to make use of this for your own packages, every .ipkg
file should give the package's name and current version:
package tutorial
version = 0.1.0
As I'll show below, package versions play a much less crucial role when using pack and its curated package collection. But even then you might want to consider restricting the versions of packages you accept in order to make sure you catch any breaking changes introduced upstream.
Library Modules
Many if not most Idris packages available on GitHub are programming libraries: They implement some piece of functionality and make it available to all projects depending on the given package. This is unlike Idris applications, which are supposed to be compiled to an executable that can then be run on your computer. The Idris project itself provides both: The Idris compiler application, which we use to type check and build other Idris libraries and applications, and several libraries like prelude, base, and contrib, which provide basic data types and functions useful in most Idris projects.
In order to type check and install the modules you wrote in a library, you must list them in the .ipkg
file's modules
field. Here is an excerpt from the sop package:
modules = Data.Lazy
, Data.SOP
, Data.SOP.Interfaces
, Data.SOP.NP
, Data.SOP.NS
, Data.SOP.POP
, Data.SOP.SOP
, Data.SOP.Utils
Modules missing from this list will not be installed and hence will not be available for other packages depending on the sop library.
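Putting the fields discussed in this section together, a small library's .ipkg file might look roughly like the following sketch. The package name, author, module names, and version bounds are hypothetical placeholders rather than an existing package; the file generated by pack new may differ in its details:

```
-- mylib.ipkg (hypothetical example)
package mylib
version = 0.1.0
authors = "Jane Doe"
brief   = "A hypothetical example library"

-- directory containing the Idris sources
sourcedir = "src"

depends = base >= 0.5.1
        , elab-util

modules = MyLib
        , MyLib.Internal
```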
Pack and its curated Collection of Packages
When the dependency graph of your project is getting large and complex, that is, when your project depends on many libraries, which themselves depend on yet other libraries, it can happen that two packages both depend on different - and, possibly, incompatible - versions of a third package. This situation can be nigh impossible to resolve, and can lead to a lot of frustration when working with conflicting libraries.
It is therefore the philosophy of the pack project to avoid such a situation from the very beginning by making use of curated package collections. A pack collection consists of a specific Git commit of the Idris compiler and a set of packages, again each at a specific Git commit, all of which have been tested to work well and without issues together. You can see a list of packages available to pack here.
Whenever a project you are working on depends on one of the libraries listed in pack's package collection, pack will automatically install it and all of its dependencies for you. However, you might also want to depend on a library that is not yet part of pack's collection. In that case, you must specify the library in question in one of your pack.toml
files - the global one found at $HOME/.pack/user/pack.toml
, or one local to your current project or one of its parent directories (if any). There, you can either specify a dependency local to your system or a Git project (local or remote). An example for each is shown below:
[custom.all.foo]
type = "local"
path = "/path/to/foo"
ipkg = "foo.ipkg"
[custom.all.bar]
type = "github"
url = "https://github.com/me/bar"
commit = "latest:main"
ipkg = "bar.ipkg"
As you can see, in both cases you have to specify where the project can be found as well as the name and location of its .ipkg
file. In case of a Git project, you also need to tell pack the commit it should use. In the example above, we want to use the latest commit from the main
branch. We can use pack fetch
to fetch and store the currently latest commit hash.
Entries like the ones given above are all that is needed to add support for custom libraries to pack. You can now list these libraries as dependencies in your own project's .ipkg
file and pack will automatically install them for you.
Conclusion
This concludes our section about structuring Idris projects. We have learned about several types of code blocks - failing
blocks for showing that a piece of code fails to elaborate, namespace
s for having overloaded names in the same source file, and parameter blocks for sharing lists of parameters between functions - and how to group several source files into an Idris library or application. Finally, we learned how to include external libraries in an Idris project and how to use pack to help us keep track of these dependencies.
A Deep Dive into Quantitative Type Theory
This section was guest-written by Kiana Sheibani.
In the tutorial proper, when discussing functions, Idris 2's quantity system was introduced. The description was intentionally a bit simplified - the inner workings of quantities are complicated, and that complication would have only confused any newcomers to Idris 2.
Here, I'll provide a more proper and thorough treatment of Quantitative Type Theory (QTT), including how quantity checking is performed and the theory behind it. Most of the information here will be unnecessary for understanding and writing Idris programs, and you are free to keep thinking about quantities like they were explained before. When working with quantities in their full complexity, however, a better understanding of how they work can be helpful to avoid misconceptions.
The Quantity Semiring
Quantitative Type Theory, as you probably already know, uses a set of quantities. The core theory allows for any quantities to be used, but Idris 2 in particular has three: erased, linear, and unrestricted. These are usually written as 0
, 1
, and ω
(the Greek lowercase omega) respectively.
As QTT requires, these three quantities are equipped with the structure of an ordered semiring. The exact mathematical details of what that means aren't important; what it means for us is that quantities can be added and multiplied together, and that there is an ordering relation on them. Here are the tables for each of these operations, where the first argument is on the left and the second is on the top:
Addition
+ | 0 | 1 | ω |
---|---|---|---|
0 | 0 | 1 | ω |
1 | 1 | ω | ω |
ω | ω | ω | ω |
Multiplication
* | 0 | 1 | ω |
---|---|---|---|
0 | 0 | 0 | 0 |
1 | 0 | 1 | ω |
ω | 0 | ω | ω |
Order
≤ | 0 | 1 | ω |
---|---|---|---|
0 | true | false | true |
1 | false | true | true |
ω | false | false | true |
These operations behave mostly how you might expect, with 0
and 1
being the usual numbers and ω
being a sort of "infinity" value. (We have 1 + 1 = ω
instead of 2
because there isn't a 2
quantity in our system.)
There is one big difference in our ordering, though: 0 ≤ 1
is false! We have that 0 ≤ ω
and 1 ≤ ω
, but not 0 ≤ 1
, or 1 ≤ 0
for that matter. In the language of mathematics, we say that 0
and 1
are incomparable. We'll get into why this is the case later, when we talk about what these operations mean and how they're used.
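The three tables above can also be written down as ordinary Idris functions. The following is only a sketch for experimenting at the REPL: the names Quantity, qAdd, qMul, and qLte are invented for this illustration and are not how the Idris compiler itself represents quantities.

```idris
-- The three quantities: erased (0), linear (1), unrestricted (ω).
data Quantity = Zero | One | Omega

-- Addition: 0 is neutral, everything else adds up to ω.
qAdd : Quantity -> Quantity -> Quantity
qAdd Zero q    = q
qAdd q    Zero = q
qAdd _    _    = Omega

-- Multiplication: 0 annihilates, 1 is neutral, ω * ω = ω.
qMul : Quantity -> Quantity -> Quantity
qMul Zero  _     = Zero
qMul _     Zero  = Zero
qMul One   q     = q
qMul q     One   = q
qMul Omega Omega = Omega

-- Ordering: every quantity is below itself and below ω,
-- but 0 and 1 are incomparable.
qLte : Quantity -> Quantity -> Bool
qLte Zero Zero  = True
qLte One  One   = True
qLte _    Omega = True
qLte _    _     = False
```

Evaluating qAdd One One yields Omega, mirroring the 1 + 1 = ω entry in the addition table, and both qLte Zero One and qLte One Zero yield False.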
Variables and Contexts
In QTT, each variable in each context has an associated quantity. These quantities can be plainly seen when inspecting holes in the REPL. Here's an example from the tutorial:
0 b : Type
0 a : Type
xs : List a
f : a -> b
x : a
prf : length xs = length (map f xs)
------------------------------
mll1 : S (length xs) = S (length (map f xs))
In this hole's context, the type variables a
and b
have 0
quantity, while the others have ω
quantity.
Since the context is what stores quantities, only names that appear in the context can have a quantity, including:
- Function/lambda parameters
- Pattern matching bindings
- let bindings
These do not appear in the context, and thus do NOT have quantities:
- Top-level definitions
- where definitions
- All non-variable expressions
A Change in Perspective
When writing Idris programs using holes, we tend to use a top-to-bottom approach: we start with looking at the context for the whole function, and then we look at smaller and smaller sub-expressions as we fill in the code. This means that quantities in the context tend to decrease over time - if the variable x
has quantity 1
and you use it once, the quantity will decrease to 0
.
When looking at how typechecking works, however, it's more natural to look at contexts in the other direction, from smaller sub-expressions to larger ones. This means that the quantities we're looking at will tend to increase instead. As an example, let's look at this simple function:
square : Num a => a -> a
square x = x * x
Let's first look at the context for the smallest sub-expression of this function, just the variable x
:
0 a : Type
1 x : a
------------------------------
x : a
Now let's look at the context for the larger expression x * x
:
0 a : Type
x : a
------------------------------
(x * x) : a
The quantity of the parameter x
increased from 1
to ω
, since we went from using it once to using it multiple times. When looking at expressions like this, we can think of the quantity q
as saying that the variable is "used q
times" in the expression.
Quantity Checking
With all of that background information established, we can finally see how quantity checking actually works. Let's follow what happens to a single variable x
in our context as we perform different operations.
To illustrate how quantities evolve, I will provide Idris-style context diagrams showing the various cases. In these, capital-letter names T
, E
, etc. stand for any expression, and q
, r
, etc. stand for any quantity.
Variables and Literals
1 x : T
------------------------------
x : T
In the simplest case, an expression is just a single variable. That variable will have quantity 1
in the context, while all others have quantity 0
. (Other variables may also be missing entirely, which for quantity checking is equivalent to them having 0
quantity.)
0 x : T
------------------------------
True : Bool
For literals such as 1
, or constructors such as True
, all variables in the context have quantity 0, since all variables are used 0 times in a constructor.
Function Application
qf x : T
------------------------------
F : (r _ : A) -> B
qe x : T
------------------------------
E : A
(qf + r*qe) x : T
------------------------------
(F E) : B
This is the most complicated of QTT's rules. We have a function F
whose parameter has r
quantity, and we're applying it to E
. If our variable x
is used qf
times in F
and qe
times in E
, then it is used qf + r*qe
times in the full expression.
To better understand this rule, let's look at some simpler cases. First, let's assume that x
is not used in the function F
, so that qf = 0
. Then, x
's full quantity is r * qe
. For example, let's look at these two functions:
f x = id x
g x = id 1
Here, id
has type a -> a
, where its input is unrestricted (ω
). In the first function, we can see that x
is used once in the input of id
, so the quantity of x
in the whole expression is ω * 1 = ω
. In the second function, x
is used zero times in the input of id
, so its quantity in the whole expression is ω * 0 = 0
. The function g
will typecheck if you mark its input as erased, but not f
.
As another simplified case, let's assume that F
is a linear function, meaning that r = 1
. Then x
's full quantity is qf + qe
, the simple sum of the quantities of each part. Here's a function that demonstrates this:
ldup x = (#) x x
The linear pair constructor (#)
is linear in both arguments, so to find the quantity of x
in the full expression we can just add up the quantities in each part. x
is used zero times in (#)
and one time in x
, so the total quantity is 0 + 1 + 1 = ω
. If the second x
were replaced by something else, like a literal, the quantity would only be 0 + 1 + 0 = 1
. Intuitively, you can think of these as "parallel expressions", and the addition operation tells you how quantities combine in parallel.
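As a worked instance of the rule, here is the computation for ldup written out step by step, reading (#) x x as the nested application ((#) x) x and applying qf + r*qe once per application. The subscripted names are my own shorthand for "the quantity of x in that sub-expression"; they are not notation used by Idris.

```latex
\begin{aligned}
q_{(\#)\,x}    &= q_{(\#)} + 1 \cdot q_{x}      = 0 + 1 \cdot 1 = 1 \\
q_{(\#)\,x\,x} &= q_{(\#)\,x} + 1 \cdot q_{x}   = 1 + 1 \cdot 1 = \omega
\end{aligned}
```

Replacing the second x by a literal changes only the last step, which becomes 1 + 1 * 0 = 1.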
Subusaging
q x : T
------------------------------
E : T'
(q ≤ r)
r x : T
------------------------------
E : T'
This rule is where the order relation on quantities comes in. It allows us to convert a quantity in our context to another one, given that the new quantity is greater than or equal to the old one. Type theorists call this subusaging, as it lets us use variables less often than we claim in our types.
Subusaging is why this function definition is allowed:
ignore : a -> Int
ignore x = 42
The input x
is used zero times, which would normally mean its quantity would have to be 0
; however, since 0 ≤ ω
, we can use subusaging to increase the quantity to ω
.
This also explains the mysterious fact we pointed out earlier, that 0 ≰ 1
in our quantity ordering. If it were true that 0 ≤ 1
, then we could also increase the quantity of x
from 0
to 1
:
ignoreLinear : (1 x : a) -> Int
ignoreLinear x = 42
This would mean that the quantity 1
would be for variables used at most once, rather than exactly once. Idris's designers decided that they wanted linearity to have the second meaning, not the first.
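The difference can be checked directly with a failing block. In this small sketch, ignoreArg and useOnce are names chosen for the illustration (ignoreArg is the ignore function from above, renamed to avoid any confusion with the Prelude function of the same name):

```idris
-- Accepted: `x` is used zero times, and 0 ≤ ω allows subusaging.
ignoreArg : a -> Int
ignoreArg x = 42

-- Rejected: 0 ≤ 1 does not hold, so a linear argument
-- cannot be left unused.
failing
  ignoreLinear : (1 x : a) -> Int
  ignoreLinear x = 42

-- Accepted: the linear argument is used exactly once.
useOnce : (1 x : a) -> a
useOnce x = x
```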
Lambdas and Other Bindings
q x : A
------------------------------
E : B
(\q x => E) : (q x : A) -> B
This rule is the most important, as it is the only one in which quantities actually impact typechecking. It is also one of the most straightforward: a lambda expression \q x => E
is only valid if x
is used q
times inside E
. This rule doesn't only apply to lambdas, actually - it applies to any syntax where a variable that has a quantity is bound, such as function parameters, let
, case
, with
, and so on.
let x = 1 in x + x
To see how quantity checking would work with this let-expression, we can simply desugar it into its equivalent lambda form:
(\x => x + x) 1
An explicit quantity q
isn't given for the lambda in this expression, so Idris will try to infer the quantity, then check to see if it's valid. In this case, Idris will infer that x
is unrestricted.
Pattern Matching
All of the binding constructs that this rule applies to support pattern matching, so we need to determine how quantities interact with patterns. To be more specific, if we have a function that pattern-matches like this:
func : (1 _ : LPair a b) -> c
func (x # y) = ?impl
How does the linear quantity of this function's input "descend" into the bindings x
and y
?
A simple rule is to apply the same function-application rule we looked at earlier, but to the left side of the equation. For example, here's how we compute the quantity required for x
in this function definition:
func (((#) x) y)
0 + 1 * (( 0 + 1 * 1) + 1 * 0) = 1
We start from the outside and work our way inwards, applying the qf + r*qe
rule as we go. x
is used zero times in the constant func
, and its argument is linear. We know that x
is used once inside of the linear pair (x # y)
(aside from being obvious, we can compute this fact ourselves), so the number of times x
must be used in func
's definition is 0 + 1 * 1 = 1
.
The same argument applies to y
, meaning that y
should also be used once inside of func
for this definition to pass quantity checking. And in fact, if we look at the context of the hole ?impl
, that's exactly what we see!
0 a : Type
0 b : Type
0 c : Type
1 x : a
1 y : b
------------------------------
impl : c
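For completeness, the same computation for y can be written out step by step, again applying qf + r*qe from the innermost application outwards. The subscripted names are my own shorthand for "the quantity of y in that sub-expression":

```latex
\begin{aligned}
q_{(\#)\,x}                     &= 0 + 1 \cdot 0 = 0 \\
q_{((\#)\,x)\,y}                &= 0 + 1 \cdot 1 = 1 \\
q_{\mathit{func}\,(x \mathbin{\#} y)} &= 0 + 1 \cdot 1 = 1
\end{aligned}
```

The first line is 0 because y occurs neither in (#) nor in x; the second adds the single occurrence of y as the linear second argument; the third passes the whole pair to the linear argument of func.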
As a final note, pattern matching in Idris 2 is only allowed when the value in question exists at runtime, meaning that it isn't erased. This is because in QTT, a value must be constructed before it can be pattern-matched: if you match on a variable x
, the resources required to make that variable's value are added to the total count.
1 x : T
------------------------------
x : T
q x : T
------------------------------
E : T'
(1 + q) x : T
------------------------------
(case x of ... => E) : T'
For this reason, the total uses of the variable x
when pattern-matching on it must be 1 + q
, where q
is the uses of x
after the pattern-match (x
is still possible to use with an as-pattern x@...
). This prevents the quantity from being 0
.
The Erased Fragment
Earlier I stated that only variables in the context can have quantities, which in particular means top-level definitions cannot have them. This is mostly true, but there is one slight exception: a function can be marked as erased by placing a 0
before its name.
0 erasedId : (0 x : a) -> a
erasedId x = x
This tells the type system to define this function within the erased fragment, which is a fragment of the type system wherein all quantity checks are ignored. In the erasedId
function above, we use the function's input x
once despite labeling it as erased. This would normally result in a quantity error, but this function is allowed due to being defined in the erased fragment.
This quantity freedom the erased fragment gives us comes with a big drawback, though - erased functions are banned from being used at runtime. In terms of the type theory, what this means is that an erased function can only ever be used in these two places:
- Inside of another erased-fragment function or expression;
- Inside of a function argument that's erased:
constInt : (0 _ : a) -> Int
constInt _ = 2
erased2 : Int
erased2 = constInt (erasedId 1)
This makes sure that quantities are always handled correctly at runtime, which is where it matters!
There is another important place where the erased fragment comes into play, and that's in type signatures. The type signatures of definitions are always erased, so erased functions can be used inside of them.
erasedPrf : erasedId 0 = 0
erasedPrf = Refl
For this reason, erased functions are sometimes thought of as "exclusively type-level functions", though as we've seen, that's not entirely accurate.
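The restriction on runtime use can be observed with a failing block. In the following sketch, erasedId is repeated from above so that the snippet stands on its own, while runtimeUse and erasedUse are names made up for the illustration:

```idris
0 erasedId : (0 x : a) -> a
erasedId x = x

-- Rejected: `runtimeUse` exists at runtime, so it may not call
-- an erased function to compute its result.
failing
  runtimeUse : Nat
  runtimeUse = erasedId 1

-- Accepted: `erasedUse` is itself defined in the erased fragment.
0 erasedUse : Nat
erasedUse = erasedId 1
```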
Conclusion
This concludes our thorough discussion of Quantitative Type Theory. In this section, we learned about the various operations on quantities: their addition, multiplication, and ordering. We saw how quantities were linked to the context, and how to properly think about the context when analyzing type systems (bottom-to-top instead of top-to-bottom). We then moved on to studying QTT proper, and we saw how the quantities in our context change as the expressions we write grow more complex. Finally, we looked at the erased fragment, and how we can define erased functions.
In Idris 2's current state, most of this information is still entirely unnecessary for learning the language. That may not always be the case, though: there have been some discussions to change the quantity semiring that Idris 2 uses, or even to allow the programmer to choose which set of quantities to use. Whether those discussions lead to anything or not, it can still be useful to better understand how Quantitative Type Theory functions in order to write better Idris 2 code.
A Note on Mathematical Accuracy
The information in this appendix is partially based on Robert Atkey's 2018 paper Syntax and Semantics of Quantitative Type Theory, which outlines QTT in the standard language of type theory. The QTT presented in Atkey's paper is roughly similar to Idris 2's type system except for these differences:
- Atkey's theory does not have subusaging, and so the quantity semiring in Atkey's paper is not ordered.
- In Atkey's theory, types can only be constructed in the erased fragment, which means it is impossible to construct a type at runtime. Idris 2 allows constructing types at runtime, but still uses the erased fragment when inside of type signatures.
To resolve these differences, I directly observed how Idris 2's type system behaved in practice in order to determine where to deviate from Atkey's paper.
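The second difference can be observed with a small sketch. The names `IntOrString`, `sample`, and `someTypes` are made up for illustration; the point is only that none of these definitions carries quantity 0, yet they build and store types.

```idris
module RuntimeTypes

-- An ordinary, unrestricted function that returns a type.
-- Nothing here is marked with quantity 0, so it is available at runtime.
IntOrString : Bool -> Type
IntOrString True  = Int
IntOrString False = String

-- The computed type is used inside a type signature,
-- which is itself checked in the erased fragment.
sample : (b : Bool) -> IntOrString b
sample True  = 42
sample False = "forty-two"

-- Types stored in a runtime data structure.
someTypes : List Type
someTypes = [Int, String, IntOrString True]
```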
While I tried to be as mathematically accurate as possible in this section, some accuracy had to be sacrificed for the sake of simplicity. In particular, the description of pattern matching given here is substantially oversimplified. A proper formal treatment of pattern matching would require introducing an eliminator function for each datatype; this eliminator would serve to determine how that datatype's constructors interacted with quantity checking. The details of how this would work for a few simple types (such as the boolean type Bool) are in Atkey's paper above. I did not include these details because I decided that what I was describing was complicated enough already.
src/Solutions/Functions1.idr
module Solutions.Functions1
--------------------------------------------------------------------------------
-- Exercise 1
--------------------------------------------------------------------------------
square : Integer -> Integer
square n = n * n
testSquare : (Integer -> Bool) -> Integer -> Bool
testSquare fun = fun . square
twice : (Integer -> Integer) -> Integer -> Integer
twice f = f . f
--------------------------------------------------------------------------------
-- Exercise 2
--------------------------------------------------------------------------------
isEven : Integer -> Bool
isEven n = (n `mod` 2) == 0
isOdd : Integer -> Bool
isOdd = not . isEven
--------------------------------------------------------------------------------
-- Exercise 3
--------------------------------------------------------------------------------
isSquareOf : Integer -> Integer -> Bool
isSquareOf n x = n == x * x
--------------------------------------------------------------------------------
-- Exercise 4
--------------------------------------------------------------------------------
isSmall : Integer -> Bool
isSmall n = n <= 100
--------------------------------------------------------------------------------
-- Exercise 5
--------------------------------------------------------------------------------
absIsSmall : Integer -> Bool
absIsSmall = isSmall . abs
--------------------------------------------------------------------------------
-- Exercise 6
--------------------------------------------------------------------------------
and : (Integer -> Bool) -> (Integer -> Bool) -> Integer -> Bool
and f1 f2 n = f1 n && f2 n
or : (Integer -> Bool) -> (Integer -> Bool) -> Integer -> Bool
or f1 f2 n = f1 n || f2 n
negate : (Integer -> Bool) -> Integer -> Bool
negate f = not . f
--------------------------------------------------------------------------------
-- Exercise 7
--------------------------------------------------------------------------------
(&&) : (Integer -> Bool) -> (Integer -> Bool) -> Integer -> Bool
(&&) = and
(||) : (Integer -> Bool) -> (Integer -> Bool) -> Integer -> Bool
(||) = or
not : (Integer -> Bool) -> Integer -> Bool
not = negate
src/Solutions/DataTypes.idr
module Solutions.DataTypes
-- If all or almost all functions in a module are provably
-- total, it is convenient to add the following pragma
-- at the top of the module. It is then no longer necessary
-- to annotate each function with the `total` keyword.
%default total
--------------------------------------------------------------------------------
-- Enumerations
--------------------------------------------------------------------------------
-- 1
and : Bool -> Bool -> Bool
and True b = b
and False _ = False
or : Bool -> Bool -> Bool
or True _ = True
or False b = b
--2
data UnitOfTime = Second | Minute | Hour | Day | Week
toSeconds : UnitOfTime -> Integer -> Integer
toSeconds Second y = y
toSeconds Minute y = 60 * y
toSeconds Hour y = 60 * 60 * y
toSeconds Day y = 24 * 60 * 60 * y
toSeconds Week y = 7 * 24 * 60 * 60 * y
fromSeconds : UnitOfTime -> Integer -> Integer
fromSeconds u s = s `div` toSeconds u 1
convert : UnitOfTime -> Integer -> UnitOfTime -> Integer
convert u1 n u2 = fromSeconds u2 (toSeconds u1 n)
--3
data Element = H | C | N | O | F
atomicMass : Element -> Double
atomicMass H = 1.008
atomicMass C = 12.011
atomicMass N = 14.007
atomicMass O = 15.999
atomicMass F = 18.9984
--------------------------------------------------------------------------------
-- Sum Types
--------------------------------------------------------------------------------
data Title = Mr | Mrs | Other String
eqTitle : Title -> Title -> Bool
eqTitle Mr Mr = True
eqTitle Mrs Mrs = True
eqTitle (Other x) (Other y) = x == y
eqTitle _ _ = False
isOther : Title -> Bool
isOther (Other _) = True
isOther _ = False
data LoginError = UnknownUser String | InvalidPassword | InvalidKey
showError : LoginError -> String
showError (UnknownUser x) = "Unknown user: " ++ x
showError InvalidPassword = "Invalid password"
showError InvalidKey = "Invalid key"
--------------------------------------------------------------------------------
-- Records
--------------------------------------------------------------------------------
-- 1
record TimeSpan where
  constructor MkTimeSpan
  unit  : UnitOfTime
  value : Integer
timeSpanToSeconds : TimeSpan -> Integer
timeSpanToSeconds (MkTimeSpan unit value) = toSeconds unit value
-- 2
eqTimeSpan : TimeSpan -> TimeSpan -> Bool
eqTimeSpan x y = timeSpanToSeconds x == timeSpanToSeconds y
-- alternative equality check using `on` from the Idris Prelude
eqTimeSpan' : TimeSpan -> TimeSpan -> Bool
eqTimeSpan' = (==) `on` timeSpanToSeconds
-- 3
showUnit : UnitOfTime -> String
showUnit Second = "s"
showUnit Minute = "min"
showUnit Hour = "h"
showUnit Day = "d"
showUnit Week = "w"
prettyTimeSpan : TimeSpan -> String
prettyTimeSpan (MkTimeSpan Second v) = show v ++ " s"
prettyTimeSpan (MkTimeSpan u v) =
  show v ++ " " ++ showUnit u ++ " (" ++ show (toSeconds u v) ++ " s)"
-- 4
compareUnit : UnitOfTime -> UnitOfTime -> Ordering
compareUnit = compare `on` (\x => toSeconds x 1)
minUnit : UnitOfTime -> UnitOfTime -> UnitOfTime
minUnit x y = case compareUnit x y of
  LT => x
  _  => y
addTimeSpan : TimeSpan -> TimeSpan -> TimeSpan
addTimeSpan (MkTimeSpan u1 v1) (MkTimeSpan u2 v2) =
  case minUnit u1 u2 of
    u => MkTimeSpan u (convert u1 v1 u + convert u2 v2 u)
--------------------------------------------------------------------------------
-- Generic Data Types
--------------------------------------------------------------------------------
-- 1
mapMaybe : (a -> b) -> Maybe a -> Maybe b
mapMaybe _ Nothing = Nothing
mapMaybe f (Just x) = Just (f x)
appMaybe : Maybe (a -> b) -> Maybe a -> Maybe b
appMaybe (Just f) (Just v) = Just (f v)
appMaybe _ _ = Nothing
bindMaybe : Maybe a -> (a -> Maybe b) -> Maybe b
bindMaybe Nothing _ = Nothing
bindMaybe (Just x) f = f x
filterMaybe : (a -> Bool) -> Maybe a -> Maybe a
filterMaybe f Nothing = Nothing
filterMaybe f (Just x) = if (f x) then Just x else Nothing
first : Maybe a -> Maybe a -> Maybe a
first Nothing y = y
first (Just x) _ = Just x
last : Maybe a -> Maybe a -> Maybe a
last x y = first y x
foldMaybe : (acc -> el -> acc) -> acc -> Maybe el -> acc
foldMaybe f x = maybe x (f x)
-- 2
mapEither : (a -> b) -> Either e a -> Either e b
mapEither _ (Left x) = Left x
mapEither f (Right x) = Right (f x)
appEither : Either e (a -> b) -> Either e a -> Either e b
appEither (Left x) _ = Left x
appEither (Right _) (Left x) = Left x
appEither (Right f) (Right v) = Right (f v)
bindEither : Either e a -> (a -> Either e b) -> Either e b
bindEither (Left x) _ = Left x
bindEither (Right x) f = f x
firstEither : (e -> e -> e) -> Either e a -> Either e a -> Either e a
firstEither fun (Left e1) (Left e2) = Left (fun e1 e2)
firstEither _ (Left e1) y = y
firstEither _ (Right x) _ = Right x
-- Instead of implementing this via pattern matching, we use
-- `firstEither` and swap the arguments. Since this would mean that
-- in the case of two `Left`s the errors would be in the wrong
-- order, we have to swap the arguments of `fun` as well.
-- The function `flip` from the Prelude does this for us.
lastEither : (e -> e -> e) -> Either e a -> Either e a -> Either e a
lastEither fun x y = firstEither (flip fun) y x
fromEither : (e -> c) -> (a -> c) -> Either e a -> c
fromEither f _ (Left x) = f x
fromEither _ g (Right x) = g x
-- 3
mapList : (a -> b) -> List a -> List b
mapList f Nil = Nil
mapList f (x :: xs) = f x :: mapList f xs
filterList : (a -> Bool) -> List a -> List a
filterList f Nil = Nil
filterList f (x :: xs) =
  if f x then x :: filterList f xs else filterList f xs
(++) : List a -> List a -> List a
(++) Nil ys = ys
(++) (x :: xs) ys = x :: (Solutions.DataTypes.(++) xs ys)
headMaybe : List a -> Maybe a
headMaybe Nil = Nothing
headMaybe (x :: _) = Just x
tailMaybe : List a -> Maybe (List a)
tailMaybe Nil = Nothing
tailMaybe (x :: xs) = Just xs
lastMaybe : List a -> Maybe a
lastMaybe Nil = Nothing
lastMaybe (x :: Nil) = Just x
lastMaybe (_ :: xs) = lastMaybe xs
initMaybe : List a -> Maybe (List a)
initMaybe l = case l of
  Nil     => Nothing
  x :: xs => case initMaybe xs of
    Nothing => Just Nil
    Just ys => Just (x :: ys)
foldList : (acc -> el -> acc) -> acc -> List el -> acc
foldList fun vacc Nil = vacc
foldList fun vacc (x :: xs) = foldList fun (fun vacc x) xs
-- 4
record Client where
  constructor MkClient
  name          : String
  title         : Title
  age           : Bits8
  passwordOrKey : Either Bits64 String
data Credentials = Password String Bits64 | Key String String
login1 : Client -> Credentials -> Either LoginError Client
login1 c (Password u y) =
  if c.name == u then
    if c.passwordOrKey == Left y then Right c else Left InvalidPassword
  else Left (UnknownUser u)
login1 c (Key u x) =
  if c.name == u then
    if c.passwordOrKey == Right x then Right c else Left InvalidKey
  else Left (UnknownUser u)
login : List Client -> Credentials -> Either LoginError Client
login Nil (Password u _) = Left (UnknownUser u)
login Nil (Key u _) = Left (UnknownUser u)
login (x :: xs) cs = case login1 x cs of
  Right c              => Right c
  Left InvalidPassword => Left InvalidPassword
  Left InvalidKey      => Left InvalidKey
  Left _               => login xs cs
--5
formulaMass : List (Element,Nat) -> Double
formulaMass [] = 0
formulaMass ((e, n) :: xs) = atomicMass e * cast n + formulaMass xs
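The comment on `lastEither` in the listing above argues that `fun` must be flipped once the two `Either` arguments are swapped. The following sketch makes the effect visible by accumulating errors in a list. `firstEither` and `lastEither` are repeated from the listing so that the block stands on its own; `lastEitherNoFlip`, `combine`, and `combineNoFlip` are hypothetical helpers introduced only for this comparison.

```idris
module FlipSketch

-- Repeated from the solution listing.
firstEither : (e -> e -> e) -> Either e a -> Either e a -> Either e a
firstEither fun (Left e1) (Left e2) = Left (fun e1 e2)
firstEither _   (Left e1) y         = y
firstEither _   (Right x) _         = Right x

lastEither : (e -> e -> e) -> Either e a -> Either e a -> Either e a
lastEither fun x y = firstEither (flip fun) y x

-- Hypothetical variant that swaps the arguments but forgets to flip `fun`.
lastEitherNoFlip : (e -> e -> e) -> Either e a -> Either e a -> Either e a
lastEitherNoFlip fun x y = firstEither fun y x

-- Helpers with a concrete type, so the checks below are unambiguous.
combine : List Nat -> List Nat -> Either (List Nat) ()
combine xs ys = lastEither (++) (Left xs) (Left ys)

combineNoFlip : List Nat -> List Nat -> Either (List Nat) ()
combineNoFlip xs ys = lastEitherNoFlip (++) (Left xs) (Left ys)

-- With `flip`, the errors stay in argument order.
inOrder : combine [1] [2] = Left [1, 2]
inOrder = Refl

-- Without `flip`, they come out reversed.
reversed : combineNoFlip [1] [2] = Left [2, 1]
reversed = Refl
```

Both checks are compile-time proofs by `Refl`, so the module type checks only if the stated results hold.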
src/Solutions/Interfaces.idr
module Solutions.Interfaces
%default total
--------------------------------------------------------------------------------
-- Basics
--------------------------------------------------------------------------------
interface Comp a where
  comp : a -> a -> Ordering
-- 1
anyLarger : Comp a => a -> List a -> Bool
anyLarger va [] = False
anyLarger va (x :: xs) = comp va x == GT || anyLarger va xs
-- 2
allLarger : Comp a => a -> List a -> Bool
allLarger va [] = True
allLarger va (x :: xs) = comp va x == GT && allLarger va xs
-- 3
maxElem : Comp a => List a -> Maybe a
maxElem [] = Nothing
maxElem (x :: xs) = case maxElem xs of
  Nothing => Just x
  Just v  => if comp x v == GT then Just x else Just v
minElem : Comp a => List a -> Maybe a
minElem [] = Nothing
minElem (x :: xs) = case minElem xs of
  Nothing => Just x
  Just v  => if comp x v == LT then Just x else Just v
-- 4
interface Concat a where
  concat : a -> a -> a
implementation Concat String where
  concat = (++)
implementation Concat (List a) where
  concat = (++)
-- 5
concatList : Concat a => List a -> Maybe a
concatList [] = Nothing
concatList (x :: xs) = case concatList xs of
  Nothing => Just x
  Just v  => Just (concat x v)
--------------------------------------------------------------------------------
-- More about Interfaces
--------------------------------------------------------------------------------
-- 1
interface Equals a where
  eq : a -> a -> Bool
  neq : a -> a -> Bool
  neq x y = not (eq x y)
interface Concat a => Empty a where
  empty : a
Equals a => Equals b => Equals (a,b) where
  eq (x1,y1) (x2,y2) = eq x1 x2 && eq y1 y2
Comp a => Comp b => Comp (a,b) where
  comp (x1,y1) (x2,y2) = case comp x1 x2 of
    EQ => comp y1 y2
    v  => v
Concat a => Concat b => Concat (a,b) where
  concat (x1,y1) (x2,y2) = (concat x1 x2, concat y1 y2)
Empty a => Empty b => Empty (a,b) where
  empty = (empty, empty)
-- 2
data Tree : Type -> Type where
  Leaf : a -> Tree a
  Node : Tree a -> Tree a -> Tree a
Equals a => Equals (Tree a) where
  eq (Leaf x) (Leaf y) = eq x y
  eq (Node l1 r1) (Node l2 r2) = eq l1 l2 && eq r1 r2
  eq _ _ = False
Concat (Tree a) where
  concat = Node
--------------------------------------------------------------------------------
-- Interfaces in the Prelude
--------------------------------------------------------------------------------
-- 1
record Complex where
  constructor MkComplex
  rel : Double
  img : Double
Eq Complex where
  MkComplex r1 i1 == MkComplex r2 i2 = r1 == r2 && i1 == i2
Num Complex where
  MkComplex r1 i1 + MkComplex r2 i2 = MkComplex (r1 + r2) (i1 + i2)
  MkComplex r1 i1 * MkComplex r2 i2 =
    MkComplex (r1 * r2 - i1 * i2) (r1 * i2 + r2 * i1)
  fromInteger n = MkComplex (fromInteger n) 0.0
Neg Complex where
  negate (MkComplex r i) = MkComplex (negate r) (negate i)
  MkComplex r1 i1 - MkComplex r2 i2 = MkComplex (r1 - r2) (i1 - i2)
Fractional Complex where
  MkComplex r1 i1 / MkComplex r2 i2 = case r2 * r2 + i2 * i2 of
    denom => MkComplex ((r1 * r2 + i1 * i2) / denom)
                       ((i1 * r2 - r1 * i2) / denom)
-- 2
Show Complex where
  showPrec p c = showCon p "MkComplex" (showArg c.rel ++ showArg c.img)
-- 3
record First a where
  constructor MkFirst
  value : Maybe a
pureFirst : a -> First a
pureFirst = MkFirst . Just
mapFirst : (a -> b) -> First a -> First b
mapFirst f = MkFirst . map f . value
mapFirst2 : (a -> b -> c) -> First a -> First b -> First c
mapFirst2 f (MkFirst (Just va)) (MkFirst (Just vb)) = pureFirst (f va vb)
mapFirst2 _ _ _ = MkFirst Nothing
Eq a => Eq (First a) where
  (==) = (==) `on` value
Ord a => Ord (First a) where
  compare = compare `on` value
Show a => Show (First a) where
  show = show . value
FromString a => FromString (First a) where
  fromString = pureFirst . fromString
FromDouble a => FromDouble (First a) where
  fromDouble = pureFirst . fromDouble
FromChar a => FromChar (First a) where
  fromChar = pureFirst . fromChar
Num a => Num (First a) where
  (+) = mapFirst2 (+)
  (*) = mapFirst2 (*)
  fromInteger = pureFirst . fromInteger
Neg a => Neg (First a) where
  negate = mapFirst negate
  (-) = mapFirst2 (-)
Integral a => Integral (First a) where
  mod = mapFirst2 mod
  div = mapFirst2 div
Fractional a => Fractional (First a) where
  (/) = mapFirst2 (/)
  recip = mapFirst recip
-- 4
Semigroup (First a) where
  l@(MkFirst (Just _)) <+> _ = l
  _ <+> r = r
Monoid (First a) where
  neutral = MkFirst Nothing
-- 5
record Last a where
  constructor MkLast
  value : Maybe a
pureLast : a -> Last a
pureLast = MkLast . Just
mapLast : (a -> b) -> Last a -> Last b
mapLast f = MkLast . map f . value
mapLast2 : (a -> b -> c) -> Last a -> Last b -> Last c
mapLast2 f (MkLast (Just va)) (MkLast (Just vb)) = pureLast (f va vb)
mapLast2 _ _ _ = MkLast Nothing
Eq a => Eq (Last a) where
  (==) = (==) `on` value
Ord a => Ord (Last a) where
  compare = compare `on` value
Show a => Show (Last a) where
  show = show . value
FromString a => FromString (Last a) where
  fromString = pureLast . fromString
FromDouble a => FromDouble (Last a) where
  fromDouble = pureLast . fromDouble
FromChar a => FromChar (Last a) where
  fromChar = pureLast . fromChar
Num a => Num (Last a) where
  (+) = mapLast2 (+)
  (*) = mapLast2 (*)
  fromInteger = pureLast . fromInteger
Neg a => Neg (Last a) where
  negate = mapLast negate
  (-) = mapLast2 (-)
Integral a => Integral (Last a) where
  mod = mapLast2 mod
  div = mapLast2 div
Fractional a => Fractional (Last a) where
  (/) = mapLast2 (/)
  recip = mapLast recip
Semigroup (Last a) where
  _ <+> r@(MkLast (Just _)) = r
  l <+> _ = l
Monoid (Last a) where
  neutral = MkLast Nothing
-- 6
last : List a -> Maybe a
last = value . foldMap pureLast
-- 7
record Any where
  constructor MkAny
  any : Bool
Semigroup Any where
  MkAny x <+> MkAny y = MkAny (x || y)
Monoid Any where
  neutral = MkAny False
record All where
  constructor MkAll
  all : Bool
Semigroup All where
  MkAll x <+> MkAll y = MkAll (x && y)
Monoid All where
  neutral = MkAll True
-- 8
anyElem : (a -> Bool) -> List a -> Bool
anyElem f = any . foldMap (MkAny . f)
allElems : (a -> Bool) -> List a -> Bool
allElems f = all . foldMap (MkAll . f)
-- 9
record Sum a where
  constructor MkSum
  value : a
record Product a where
  constructor MkProduct
  value : a
Num a => Semigroup (Sum a) where
  MkSum x <+> MkSum y = MkSum (x + y)
Num a => Monoid (Sum a) where
  neutral = MkSum 0
Num a => Semigroup (Product a) where
  MkProduct x <+> MkProduct y = MkProduct (x * y)
Num a => Monoid (Product a) where
  neutral = MkProduct 1
-- 10
sumList : Num a => List a -> a
sumList = value . foldMap MkSum
productList : Num a => List a -> a
productList = value . foldMap MkProduct
-- 12
data Element = H | C | N | O | F
record Mass where
  constructor MkMass
  value : Double
FromDouble Mass where
  fromDouble = MkMass
Eq Mass where
  (==) = (==) `on` value
Ord Mass where
  compare = compare `on` value
Show Mass where
  show = show . value
Semigroup Mass where
  x <+> y = MkMass $ x.value + y.value
Monoid Mass where
  neutral = 0.0
-- 13
atomicMass : Element -> Mass
atomicMass H = 1.008
atomicMass C = 12.011
atomicMass N = 14.007
atomicMass O = 15.999
atomicMass F = 18.9984
formulaMass : List (Element,Nat) -> Mass
formulaMass = foldMap pairMass
  where pairMass : (Element,Nat) -> Mass
        pairMass (e, n) = MkMass $ value (atomicMass e) * cast n
src/Solutions/Functions2.idr
module Solutions.Functions2
import Data.List
%default total
--------------------------------------------------------------------------------
-- Let Bindings and Where Blocks
--------------------------------------------------------------------------------
-- 1
record Artist where
  constructor MkArtist
  name : String
record Album where
  constructor MkAlbum
  name   : String
  artist : Artist
record Email where
  constructor MkEmail
  value : String
record Password where
  constructor MkPassword
  value : String
record User where
  constructor MkUser
  name     : String
  email    : Email
  password : Password
  albums   : List Album
Eq Artist where (==) = (==) `on` name
Eq Email where (==) = (==) `on` value
Eq Password where (==) = (==) `on` value
Eq Album where (==) = (==) `on` \a => (a.name, a.artist)
record Credentials where
  constructor MkCredentials
  email    : Email
  password : Password
record Request where
  constructor MkRequest
  credentials : Credentials
  album       : Album
data Response : Type where
  UnknownUser     : Email -> Response
  InvalidPassword : Response
  AccessDenied    : Email -> Album -> Response
  Success         : Album -> Response
DB : Type
DB = List User
handleRequest : DB -> Request -> Response
handleRequest xs (MkRequest (MkCredentials e pw) album) =
  case find ((e ==) . email) xs of
    Nothing => UnknownUser e
    Just (MkUser _ _ pw' albums) =>
      if pw' /= pw then InvalidPassword
      else if elem album albums then Success album
      else AccessDenied e album
--2
namespace Ex2
  data Failure : Type where
    UnknownUser     : Email -> Failure
    InvalidPassword : Failure
    AccessDenied    : Email -> Album -> Failure
  handleRequest : DB -> Request -> Either Failure Album
  handleRequest db req = case find ((==) req.credentials.email . email) db of
    Nothing => Left (UnknownUser req.credentials.email)
    Just u2 => case (u2.email == req.credentials.email && u2.password == req.credentials.password) of
      False => Left InvalidPassword
      True  => case elem req.album u2.albums of
        False => Left (AccessDenied req.credentials.email req.album)
        True  => Right req.album
-- 3
data Nucleobase = Adenine | Cytosine | Guanine | Thymine
readBase : Char -> Maybe Nucleobase
readBase 'A' = Just Adenine
readBase 'C' = Just Cytosine
readBase 'G' = Just Guanine
readBase 'T' = Just Thymine
readBase c = Nothing
-- 4
traverseList : (a -> Maybe b) -> List a -> Maybe (List b)
traverseList _ [] = Just []
traverseList f (x :: xs) =
  case f x of
    Just y => case traverseList f xs of
      Just ys => Just (y :: ys)
      Nothing => Nothing
    Nothing => Nothing
-- 5
DNA : Type
DNA = List Nucleobase
readDNA : String -> Maybe DNA
readDNA = traverseList readBase . unpack
-- 6
complement : DNA -> DNA
complement = map comp
  where comp : Nucleobase -> Nucleobase
        comp Adenine  = Thymine
        comp Cytosine = Guanine
        comp Guanine  = Cytosine
        comp Thymine  = Adenine
src/Solutions/Dependent.idr
module Solutions.Dependent
%default total
--------------------------------------------------------------------------------
-- Length-Indexed Lists
--------------------------------------------------------------------------------
data Vect : (len : Nat) -> Type -> Type where
  Nil  : Vect 0 a
  (::) : a -> Vect n a -> Vect (S n) a
-- 1
len : List a -> Nat
len Nil = Z
len (_ :: xs) = S (len xs)
-- 2
head : Vect (S n) a -> a
head (x :: _) = x
head Nil impossible
-- 3
tail : Vect (S n) a -> Vect n a
tail (_ :: xs) = xs
tail Nil impossible
-- 4
zipWith3 : (a -> b -> c -> d) -> Vect n a -> Vect n b -> Vect n c -> Vect n d
zipWith3 f [] [] [] = []
zipWith3 f (x :: xs) (y :: ys) (z :: zs) = f x y z :: zipWith3 f xs ys zs
-- 5
-- Since we only have a `Semigroup` constraint, we can't conjure
-- a value of type `a` out of nothing in case of an empty list.
-- We therefore have to return a `Nothing` in case of an empty list.
foldSemi : Semigroup a => List a -> Maybe a
foldSemi [] = Nothing
foldSemi (x :: xs) = Just . maybe x (x <+>) $ foldSemi xs
-- 6
-- The `Nil` case is impossible here, so unlike in Exercise 5,
-- we don't need to wrap the result in a `Maybe`.
-- However, we need to pattern match on the tail of the Vect to
-- decide whether to invoke `foldSemiVect` recursively or not.
foldSemiVect : Semigroup a => Vect (S n) a -> a
foldSemiVect (x :: []) = x
foldSemiVect (x :: t@(_ :: _)) = x <+> foldSemiVect t
-- 7
iterate : (n : Nat) -> (a -> a) -> a -> Vect n a
iterate 0 _ _ = Nil
iterate (S k) f v = v :: iterate k f (f v)
-- 8
generate : (n : Nat) -> (s -> (s,a)) -> s -> Vect n a
generate 0 _ _ = Nil
generate (S k) f v =
  let (v', va) = f v
   in va :: generate k f v'
-- 9
fromList : (as : List a) -> Vect (length as) a
fromList [] = []
fromList (x :: xs) = x :: fromList xs
-- 10
-- Look up the types and implementations of the functions `maybe` and
-- `const` and try to figure out what's going on here. An alternative
-- implementation would of course just pattern match on the argument.
maybeSize : Maybe a -> Nat
maybeSize = maybe 0 (const 1)
fromMaybe : (m : Maybe a) -> Vect (maybeSize m) a
fromMaybe Nothing = []
fromMaybe (Just x) = [x]
--------------------------------------------------------------------------------
-- Fin: Safe Indexing into Vectors
--------------------------------------------------------------------------------
data Fin : (n : Nat) -> Type where
  FZ : {0 n : Nat} -> Fin (S n)
  FS : (k : Fin n) -> Fin (S n)
(++) : Vect m a -> Vect n a -> Vect (m + n) a
(++) [] ys = ys
(++) (x :: xs) ys = x :: (xs ++ ys)
replicate : (n : Nat) -> a -> Vect n a
replicate 0 _ = []
replicate (S k) x = x :: replicate k x
zipWith : (a -> b -> c) -> Vect n a -> Vect n b -> Vect n c
zipWith _ [] [] = []
zipWith f (x :: xs) (y :: ys) = f x y :: zipWith f xs ys
-- 1
update : (a -> a) -> Fin n -> Vect n a -> Vect n a
update f FZ (x :: xs) = f x :: xs
update f (FS k) (x :: xs) = x :: update f k xs
-- 2
insert : a -> Fin (S n) -> Vect n a -> Vect (S n) a
insert v FZ xs = v :: xs
insert v (FS k) (x :: xs) = x :: insert v k xs
insert v (FS k) [] impossible
-- 3
-- The trick here is to pattern match on the tail of the
-- vector in the `FS k` case and realize that an empty
-- tail is impossible. Otherwise we won't be able to
-- convince the type checker that the vector's tail is
-- non-empty in the recursive case.
delete : Fin (S n) -> Vect (S n) a -> Vect n a
delete FZ (_ :: xs) = xs
delete (FS k) (x :: xs@(_ :: _)) = x :: delete k xs
delete (FS k) (x :: []) impossible
-- 4
safeIndexList : (xs : List a) -> Fin (length xs) -> a
safeIndexList (x :: _) FZ = x
safeIndexList (x :: xs) (FS k) = safeIndexList xs k
safeIndexList Nil _ impossible
-- 5
finToNat : Fin n -> Nat
finToNat FZ = Z
finToNat (FS k) = S $ finToNat k
take : (k : Fin (S n)) -> Vect n a -> Vect (finToNat k) a
take FZ x = []
take (FS k) (x :: xs) = x :: take k xs
-- 6
minus : (n : Nat) -> Fin (S n) -> Nat
minus n FZ = n
minus (S j) (FS k) = minus j k
minus 0 (FS k) impossible
-- 7
drop : (k : Fin (S n)) -> Vect n a -> Vect (minus n k) a
drop FZ xs = xs
drop (FS k) (_ :: xs) = drop k xs
-- 8
splitAt :  (k : Fin (S n))
        -> Vect n a
        -> (Vect (finToNat k) a, Vect (minus n k) a)
splitAt k xs = (take k xs, drop k xs)
--------------------------------------------------------------------------------
-- Compile-Time Computations
--------------------------------------------------------------------------------
-- 1
flattenList : List (List a) -> List a
flattenList [] = []
flattenList (xs :: xss) = xs ++ flattenList xss
flattenVect : Vect m (Vect n a) -> Vect (m * n) a
flattenVect [] = []
flattenVect (xs :: xss) = xs ++ flattenVect xss
-- 2
take' : (m : Nat) -> Vect (m + n) a -> Vect m a
take' 0 _ = []
take' (S k) (x :: xs) = x :: take' k xs
take' (S k) Nil impossible
drop' : (m : Nat) -> Vect (m + n) a -> Vect n a
drop' 0 xs = xs
drop' (S k) (x :: xs) = drop' k xs
drop' (S k) Nil impossible
splitAt' : (m : Nat) -> Vect (m + n) a -> (Vect m a, Vect n a)
splitAt' m xs = (take' m xs, drop' m xs)
-- 3
-- Since we must call `replicate` in the `Nil` case, `k`
-- must be a non-erased argument. I used an implicit argument here,
-- since this reflects the type of the mathematical function
-- more closely.
--
-- Empty matrices probably don't make too much sense,
-- so we could also request at the type-level that `k` and `m`
-- are non-zero, in which case both values could be derived
-- by pattern matching on the vectors.
transpose : {k : _} -> Vect m (Vect k a) -> Vect k (Vect m a)
transpose [] = replicate k []
transpose (xs :: xss) = zipWith (::) xs (transpose xss)
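The comment on `maybeSize` in the listing above mentions an alternative that pattern matches on the argument. A minimal sketch of that alternative follows; it uses `Vect` from `Data.Vect` instead of the listing's own definition, and the primed names are hypothetical, chosen to avoid clashing with the Prelude's `fromMaybe`.

```idris
module MaybeSizeSketch

import Data.Vect

-- `maybe def f m` returns `def` for `Nothing` and `f x` for `Just x`,
-- and `const 1` ignores its argument and returns 1.
-- Spelled out with pattern matching, `maybe 0 (const 1)` becomes:
maybeSize' : Maybe a -> Nat
maybeSize' Nothing  = 0
maybeSize' (Just _) = 1

-- The length in the result type is computed from the argument.
fromMaybe' : (m : Maybe a) -> Vect (maybeSize' m) a
fromMaybe' Nothing  = []
fromMaybe' (Just x) = [x]

-- Both formulations compute the same size for a `Just`.
viaPatterns : maybeSize' (Just 'x') = 1
viaPatterns = Refl

viaMaybe : maybe (the Nat 0) (const 1) (Just 'x') = 1
viaMaybe = Refl
```

In both versions the type checker has to evaluate the size function on `Nothing` and `Just x` before it can accept the two clauses of the conversion function.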
src/Solutions/IO.idr
module Solutions.IO
import Data.List1
import Data.String
import System.File
%default total
--------------------------------------------------------------------------------
-- Pure Side Effects?
--------------------------------------------------------------------------------
-- 1
rep : (String -> String) -> IO ()
rep f = do
  s <- getLine
  putStrLn (f s)
-- 2
covering
repl : (String -> String) -> IO ()
repl f = do
  _ <- rep f
  repl f
-- 3
covering
replTill : (String -> Either String String) -> IO ()
replTill f = do
  s <- getLine
  case f s of
    Left msg  => putStrLn msg
    Right msg => do
      _ <- putStrLn msg
      replTill f
-- 4
data Error : Type where
  NotAnInteger    : (value : String) -> Error
  UnknownOperator : (value : String) -> Error
  ParseError      : (input : String) -> Error
dispError : Error -> String
dispError (NotAnInteger v) = "Not an integer: " ++ v ++ "."
dispError (UnknownOperator v) = "Unknown operator: " ++ v ++ "."
dispError (ParseError v) = "Invalid expression: " ++ v ++ "."
readInteger : String -> Either Error Integer
readInteger s = maybe (Left $ NotAnInteger s) Right $ parseInteger s
readOperator : String -> Either Error (Integer -> Integer -> Integer)
readOperator "+" = Right (+)
readOperator "*" = Right (*)
readOperator s = Left (UnknownOperator s)
eval : String -> Either Error Integer
eval s =
  let [x,y,z]  := forget $ split isSpace s | _ => Left (ParseError s)
      Right v1 := readInteger x  | Left e => Left e
      Right op := readOperator y | Left e => Left e
      Right v2 := readInteger z  | Left e => Left e
   in Right $ op v1 v2
covering
exprProg : IO ()
exprProg = replTill prog
  where prog : String -> Either String String
        prog "done" = Left "Goodbye!"
        prog s      = Right . either dispError show $ eval s
-- 5
covering
replWith :  (state      : s)
         -> (next       : s -> String -> Either res s)
         -> (dispState  : s -> String)
         -> (dispResult : res -> s -> String)
         -> IO ()
replWith state next dispState dispResult = do
  _     <- putStrLn (dispState state)
  input <- getLine
  case next state input of
    Left result  => putStrLn (dispResult result state)
    Right state' => replWith state' next dispState dispResult
-- 6
data Abort : Type where
  NoNat : (input : String) -> Abort
  Done  : Abort
printSum : Nat -> String
printSum n =
  "Current sum: " ++ show n ++ "\nPlease enter a natural number:"
printRes : Abort -> Nat -> String
printRes (NoNat input) _ =
  "Not a natural number: " ++ input ++ ". Aborting..."
printRes Done k =
  "Final sum: " ++ show k ++ "\nHave a nice day."
readInput : Nat -> String -> Either Abort Nat
readInput _ "done" = Left Done
readInput n s = case parseInteger {a = Integer} s of
  Nothing => Left $ NoNat s
  Just v  => if v >= 0 then Right (cast v + n) else Left (NoNat s)
covering
sumProg : IO ()
sumProg = replWith 0 readInput printSum printRes
--------------------------------------------------------------------------------
-- Do Blocks, Desugared
--------------------------------------------------------------------------------
-- 1
ex1a : IO String
ex1a = do
  s1 <- getLine
  s2 <- getLine
  s3 <- getLine
  pure $ s1 ++ reverse s2 ++ s3
ex1aBind : IO String
ex1aBind =
  getLine >>= (\s1 =>
    getLine >>= (\s2 =>
      getLine >>= (\s3 =>
        pure $ s1 ++ reverse s2 ++ s3
      )
    )
  )
ex1aBang : IO String
ex1aBang =
  pure $ !getLine ++ reverse !getLine ++ !getLine
ex1b : Maybe Integer
ex1b = do
  n1 <- parseInteger "12"
  n2 <- parseInteger "300"
  Just $ n1 + n2 * 100
ex1bBind : Maybe Integer
ex1bBind =
  parseInteger "12" >>= (\n1 =>
    parseInteger "300" >>= (\n2 =>
      Just $ n1 + n2 * 100
    )
  )
ex1bBang : Maybe Integer
ex1bBang =
  Just $ !(parseInteger "12") + !(parseInteger "300") * 100
-- 2
data List01 : (nonEmpty : Bool) -> Type -> Type where
  Nil  : List01 False a
  (::) : a -> List01 False a -> List01 ne a
head : List01 True a -> a
head (x :: _) = x
weaken : List01 ne a -> List01 False a
weaken [] = []
weaken (h :: t) = h :: t
map01 : (a -> b) -> List01 ne a -> List01 ne b
map01 _ [] = []
map01 f (x :: y) = f x :: map01 f y
tail : List01 True a -> List01 False a
tail (_ :: t) = weaken t
(++) : List01 ne1 a -> List01 ne2 a -> List01 (ne1 || ne2) a
(++) [] [] = []
(++) [] (h :: t) = h :: t
(++) (h :: t) xs = h :: weaken (t ++ xs)
concat' : List01 ne1 (List01 ne2 a) -> List01 False a
concat' [] = []
concat' (x :: y) = weaken (x ++ concat' y)
concat :  {ne1, ne2 : _}
       -> List01 ne1 (List01 ne2 a)
       -> List01 (ne1 && ne2) a
concat {ne1 = True}  {ne2 = True}  (x :: y) = x ++ concat' y
concat {ne1 = True}  {ne2 = False} x        = concat' x
concat {ne1 = False} {ne2 = _}     x        = concat' x
namespace List01
  export
  (>>=) :  {ne1, ne2 : _}
        -> List01 ne1 a
        -> (a -> List01 ne2 b)
        -> List01 (ne1 && ne2) b
  as >>= f = concat (map01 f as)
--------------------------------------------------------------------------------
-- Working with Files
--------------------------------------------------------------------------------
-- 1
namespace IOErr
  export
  pure : a -> IO (Either e a)
  pure = pure . Right
  export
  fail : e -> IO (Either e a)
  fail = pure . Left
  export
  lift : IO a -> IO (Either e a)
  lift = map Right
  export
  catch : IO (Either e1 a) -> (e1 -> IO (Either e2 a)) -> IO (Either e2 a)
  catch io f = do
    Left err <- io | Right v => pure v
    f err
  export
  (>>=) : IO (Either e a) -> (a -> IO (Either e b)) -> IO (Either e b)
  io >>= f = Prelude.do
    Right v <- io | Left err => fail err
    f v
  export
  (>>) : IO (Either e ()) -> Lazy (IO (Either e a)) -> IO (Either e a)
  iou >> ioa = Prelude.do
    Right _ <- iou | Left err => fail err
    ioa
covering
countEmpty'' : (path : String) -> IO (Either FileError Nat)
countEmpty'' path = withFile path Read pure (go 0)
  where covering go : Nat -> File -> IO (Either FileError Nat)
        go k file = do
          False <- lift (fEOF file) | True => pure k
          "\n"  <- fGetLine file    | _    => go k file
          go (k + 1) file
-- 2
covering
countWords : (path : String) -> IO (Either FileError Nat)
countWords path = withFile path Read pure (go 0)
  where covering go : Nat -> File -> IO (Either FileError Nat)
        go k file = do
          False <- lift (fEOF file) | True => pure k
          s     <- fGetLine file
          go (k + length (words s)) file
-- 3
covering
withLines :  (path : String)
          -> (accum : s -> String -> s)
          -> (initialState : s)
          -> IO (Either FileError s)
withLines path accum ini = withFile path Read pure (go ini)
  where covering go : s -> File -> IO (Either FileError s)
        go st file = do
          False <- lift (fEOF file) | True => pure st
          line  <- fGetLine file
          go (accum st line) file
covering
countEmpty3 : (path : String) -> IO (Either FileError Nat)
countEmpty3 path = withLines path acc 0
  where acc : Nat -> String -> Nat
        acc k "\n" = k + 1
        acc k _    = k
covering
countWords2 : (path : String) -> IO (Either FileError Nat)
countWords2 path = withLines path (\n,s => n + length (words s)) 0
-- 4
covering
foldLines :  Monoid s
          => (path : String)
          -> (f    : String -> s)
          -> IO (Either FileError s)
foldLines path f = withLines path (\vs => (vs <+>) . f) neutral
-- 5
-- Instead of returning a triple of natural numbers,
-- it is better to make the semantics clear and use
-- a custom record type to store the result.
--
-- In a larger, more complex application it might be
-- even better to make things truly type safe and
-- define a single field record together with an instance
-- of monoid for each kind of count.
record WC where
constructor MkWC
lines : Nat
words : Nat
chars : Nat
Semigroup WC where
MkWC l1 w1 c1 <+> MkWC l2 w2 c2 = MkWC (l1 + l2) (w1 + w2) (c1 + c2)
Monoid WC where
neutral = MkWC 0 0 0
covering
toWC : String -> WC
toWC s = MkWC 1 (length (words s)) (length s)
covering
wordCount : (path : String) -> IO (Either FileError WC)
wordCount path = foldLines path toWC
-- this is for testing the `wordCount` example.
covering
testWC : (path : String) -> IO ()
testWC path = Prelude.do
Right (MkWC ls ws cs) <- wordCount path
| Left err => putStrLn "Error: \{show err}"
putStrLn "\{show ls} lines, \{show ws} words, \{show cs} characters"
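-- A minimal sketch of the single-field record alternative mentioned
-- in the comment on `WC` above. The names `LineCount`, `MkLineCount`,
-- `getLines`, `countLine`, and `lineCount` are hypothetical and not
-- part of the tutorial; `foldLines` is the function from exercise 4.
-- Only the line count is shown; word and character counts would
-- follow the same pattern.
record LineCount where
  constructor MkLineCount
  getLines : Nat

Semigroup LineCount where
  MkLineCount x <+> MkLineCount y = MkLineCount (x + y)

Monoid LineCount where
  neutral = MkLineCount 0

-- Every line read contributes exactly one to the count.
countLine : String -> LineCount
countLine _ = MkLineCount 1

covering
lineCount : (path : String) -> IO (Either FileError LineCount)
lineCount path = foldLines path countLine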
src/Solutions/Functor.idr
module Solutions.Functor
import Data.IORef
import Data.List
import Data.List1
import Data.String
import Data.Vect
%default total
--------------------------------------------------------------------------------
-- Code Required from the Tutorial
--------------------------------------------------------------------------------
interface Functor' (0 f : Type -> Type) where
map' : (a -> b) -> f a -> f b
interface Functor' f => Applicative' f where
app : f (a -> b) -> f a -> f b
pure' : a -> f a
record Comp (f,g : Type -> Type) (a : Type) where
constructor MkComp
unComp : f (g a)
implementation Functor f => Functor g => Functor (Comp f g) where
map f (MkComp v) = MkComp $ map f <$> v
record Product (f,g : Type -> Type) (a : Type) where
constructor MkProduct
fst : f a
snd : g a
implementation Functor f => Functor g => Functor (Product f g) where
map f (MkProduct l r) = MkProduct (map f l) (map f r)
data Gender = Male | Female | Other
record Name where
constructor MkName
value : String
record Email where
constructor MkEmail
value : String
record Password where
constructor MkPassword
value : String
record User where
constructor MkUser
firstName : Name
lastName : Name
age : Maybe Nat
email : Email
gender : Gender
password : Password
interface CSVField a where
read : String -> Maybe a
CSVField Gender where
read "m" = Just Male
read "f" = Just Female
read "o" = Just Other
read _ = Nothing
CSVField Bool where
read "t" = Just True
read "f" = Just False
read _ = Nothing
CSVField Nat where
read = parsePositive
CSVField Integer where
read = parseInteger
CSVField Double where
read = parseDouble
CSVField a => CSVField (Maybe a) where
read "" = Just Nothing
read s = Just <$> read s
readIf : (String -> Bool) -> (String -> a) -> String -> Maybe a
readIf p mk s = if p s then Just (mk s) else Nothing
isValidName : String -> Bool
isValidName s =
let len = length s
in 0 < len && len <= 100 && all isAlpha (unpack s)
CSVField Name where
read = readIf isValidName MkName
isEmailChar : Char -> Bool
isEmailChar '.' = True
isEmailChar '@' = True
isEmailChar c = isAlphaNum c
isValidEmail : String -> Bool
isValidEmail s =
let len = length s
in 0 < len && len <= 100 && all isEmailChar (unpack s)
CSVField Email where
read = readIf isValidEmail MkEmail
isPasswordChar : Char -> Bool
isPasswordChar ' ' = True
isPasswordChar c = not (isControl c) && not (isSpace c)
isValidPassword : String -> Bool
isValidPassword s =
let len = length s
in 8 < len && len <= 100 && all isPasswordChar (unpack s)
CSVField Password where
read = readIf isValidPassword MkPassword
data HList : (ts : List Type) -> Type where
Nil : HList Nil
(::) : (v : t) -> (vs : HList ts) -> HList (t :: ts)
--------------------------------------------------------------------------------
-- Functor
--------------------------------------------------------------------------------
-- 1
Functor' Maybe where
map' _ Nothing = Nothing
map' f (Just v) = Just $ f v
Functor' List where
map' _ [] = []
map' f (x :: xs) = f x :: map' f xs
Functor' List1 where
map' f (h ::: t) = f h ::: map' f t
Functor' (Vect n) where
map' _ [] = []
map' f (x :: xs) = f x :: map' f xs
Functor' (Either e) where
map' _ (Left ve) = Left ve
map' f (Right va) = Right $ f va
Functor' (Pair e) where
map' f (ve,va) = (ve, f va)
-- 2
[Prod] Functor f => Functor g => Functor (\a => (f a, g a)) where
map fun (fa, ga) = (map fun fa, map fun ga)
-- 3
record Identity a where
constructor Id
value : a
Functor Identity where
map f (Id va) = Id $ f va
-- 4
record Const (e,a : Type) where
constructor MkConst
value : e
Functor (Const e) where
map _ (MkConst v) = MkConst v
-- 5
data Crud : (i : Type) -> (a : Type) -> Type where
Create : (value : a) -> Crud i a
Update : (id : i) -> (value : a) -> Crud i a
Read : (id : i) -> Crud i a
Delete : (id : i) -> Crud i a
Functor (Crud i) where
map f (Create value) = Create $ f value
map f (Update id value) = Update id $ f value
map _ (Read id) = Read id
map _ (Delete id) = Delete id
-- 6
data Response : (e, i, a : Type) -> Type where
Created : (id : i) -> (value : a) -> Response e i a
Updated : (id : i) -> (value : a) -> Response e i a
Found : (values : List a) -> Response e i a
Deleted : (id : i) -> Response e i a
Error : (err : e) -> Response e i a
Functor (Response e i) where
map f (Created id value) = Created id $ f value
map f (Updated id value) = Updated id $ f value
map f (Found values) = Found $ map f values
map _ (Deleted id) = Deleted id
map _ (Error err) = Error err
-- 7
data Validated : (e,a : Type) -> Type where
Invalid : (err : e) -> Validated e a
Valid : (val : a) -> Validated e a
Functor (Validated e) where
map _ (Invalid err) = Invalid err
map f (Valid val) = Valid $ f val
--------------------------------------------------------------------------------
-- Applicative
--------------------------------------------------------------------------------
-- 1
Applicative' (Either e) where
pure' = Right
app (Right f) (Right v) = Right $ f v
app (Left ve) _ = Left ve
app _ (Left ve) = Left ve
Applicative Identity where
pure = Id
Id f <*> Id v = Id $ f v
-- 2
{n : _} -> Applicative' (Vect n) where
pure' = replicate n
app [] [] = []
app (f :: fs) (v :: vs) = f v :: app fs vs
-- 3
Monoid e => Applicative' (Pair e) where
pure' v = (neutral, v)
app (e1,f) (e2,v) = (e1 <+> e2, f v)
-- 4
Monoid e => Applicative (Const e) where
pure _ = MkConst neutral
MkConst e1 <*> MkConst e2 = MkConst $ e1 <+> e2
-- 5
Semigroup e => Applicative (Validated e) where
pure = Valid
Valid f <*> Valid v = Valid $ f v
Valid _ <*> Invalid ve = Invalid ve
Invalid e1 <*> Invalid e2 = Invalid $ e1 <+> e2
Invalid ve <*> Valid _ = Invalid ve
-- 6
data CSVError : Type where
FieldError : (line, column : Nat) -> (str : String) -> CSVError
UnexpectedEndOfInput : (line, column : Nat) -> CSVError
ExpectedEndOfInput : (line, column : Nat) -> CSVError
App : (fst, snd : CSVError) -> CSVError
Semigroup CSVError where
(<+>) = App
-- 7
readField : CSVField a => (line, column : Nat) -> String -> Validated CSVError a
readField line col str =
maybe (Invalid $ FieldError line col str) Valid (read str)
toVect : (n : Nat) -> (line, col : Nat) -> List a -> Validated CSVError (Vect n a)
toVect 0 line _ [] = Valid []
toVect 0 line col _ = Invalid (ExpectedEndOfInput line col)
toVect (S k) line col [] = Invalid (UnexpectedEndOfInput line col)
toVect (S k) line col (x :: xs) = (x ::) <$> toVect k line (S col) xs
-- We can't use do notation here as we don't have an implementation
-- of Monad for `Validated`
readUser' : (line : Nat) -> List String -> Validated CSVError User
readUser' line ss = case toVect 6 line 0 ss of
Valid [fn,ln,a,em,g,pw] =>
[| MkUser (readField line 1 fn)
(readField line 2 ln)
(readField line 3 a)
(readField line 4 em)
(readField line 5 g)
(readField line 6 pw) |]
Invalid err => Invalid err
readUser : (line : Nat) -> String -> Validated CSVError User
readUser line = readUser' line . forget . split (',' ==)
interface CSVLine a where
decodeAt : (line, col : Nat) -> List String -> Validated CSVError a
CSVLine (HList []) where
decodeAt _ _ [] = Valid Nil
decodeAt l c _ = Invalid (ExpectedEndOfInput l c)
CSVField t => CSVLine (HList ts) => CSVLine (HList (t :: ts)) where
decodeAt l c [] = Invalid (UnexpectedEndOfInput l c)
decodeAt l c (s :: ss) = [| readField l c s :: decodeAt l (S c) ss |]
decode : CSVLine a => (line : Nat) -> String -> Validated CSVError a
decode line = decodeAt line 1 . forget . split (',' ==)
hdecode : (0 ts : List Type)
-> CSVLine (HList ts)
=> (line : Nat)
-> String
-> Validated CSVError (HList ts)
hdecode _ = decode
-- 8
-- 8.1
head : HList (t :: ts) -> t
head (v :: _) = v
-- 8.2
tail : HList (t :: ts) -> HList ts
tail (_ :: t) = t
-- 8.3
(++) : HList xs -> HList ys -> HList (xs ++ ys)
[] ++ ws = ws
(v :: vs) ++ ws = v :: (vs ++ ws)
-- 8.4
indexList : (as : List a) -> Fin (length as) -> a
indexList (x :: _) FZ = x
indexList (_ :: xs) (FS y) = indexList xs y
indexList [] x impossible
index : (ix : Fin (length ts)) -> HList ts -> indexList ts ix
index FZ (v :: _) = v
index (FS x) (_ :: vs) = index x vs
index ix [] impossible
-- 8.5
namespace HVect
public export
data HVect : (ts : Vect n Type) -> Type where
Nil : HVect Nil
(::) : (v : t) -> (vs : HVect ts) -> HVect (t :: ts)
public export
head : HVect (t :: ts) -> t
head (v :: _) = v
public export
tail : HVect (t :: ts) -> HVect ts
tail (_ :: t) = t
public export
(++) : HVect xs -> HVect ys -> HVect (xs ++ ys)
[] ++ ws = ws
(v :: vs) ++ ws = v :: (vs ++ ws)
public export
index : {0 n : Nat}
-> {0 ts : Vect n Type}
-> (ix : Fin n)
-> HVect ts -> index ix ts
index FZ (v :: _) = v
index (FS x) (_ :: vs) = index x vs
index ix [] impossible
-- 8.6
-- Note: We are usually not allowed to pattern match
-- on an erased argument. However, in this case, the
-- shape of `ts` follows from `n`, so we can pattern
-- match on `ts` to help Idris infer the types.
--
-- Note also that we create an `HVect` holding only empty
-- `Vect`s. We therefore only need to know about the length
-- of the type-level vector to implement this.
empties : {n : Nat} -> {0 ts : Vect n Type} -> HVect (Vect 0 <$> ts)
empties {n = 0} {ts = []} = []
empties {n = S _} {ts = _ :: _} = [] :: empties
hcons : {0 ts : Vect n Type}
-> HVect ts
-> HVect (Vect m <$> ts)
-> HVect (Vect (S m) <$> ts)
hcons [] [] = []
hcons (v :: vs) (w :: ws) = (v :: w) :: hcons vs ws
htranspose : {n : Nat}
-> {0 ts : Vect n Type}
-> Vect m (HVect ts)
-> HVect (Vect m <$> ts)
htranspose [] = empties
htranspose (x :: xs) = hcons x (htranspose xs)
vects : Vect 3 (HVect [Bool, Nat, String])
vects = [[True, 100, "Hello"], [False, 0, "Idris"], [False, 2, "!"]]
vects' : HVect [Vect 3 Bool, Vect 3 Nat, Vect 3 String]
vects' = htranspose vects
-- 9
Applicative f => Applicative g => Applicative (Comp f g) where
pure = MkComp . pure . pure
MkComp ff <*> MkComp fa = MkComp [| ff <*> fa |]
-- 10
Applicative f => Applicative g => Applicative (Product f g) where
pure v = MkProduct (pure v) (pure v)
MkProduct ffl ffr <*> MkProduct fal far =
MkProduct (ffl <*> fal) (ffr <*> far)
--------------------------------------------------------------------------------
-- Monad
--------------------------------------------------------------------------------
-- 1
mapWithApp : Applicative f => (a -> b) -> f a -> f b
mapWithApp fun fa = pure fun <*> fa
-- 2
appWithBind : Monad f => f (a -> b) -> f a -> f b
appWithBind ff fa = ff >>= (\fun => fa >>= (\va => pure $ fun va))
-- or, more readable, the same thing with do notation
appWithBindDo : Monad f => f (a -> b) -> f a -> f b
appWithBindDo ff fa = do
fun <- ff
va <- fa
pure $ fun va
-- 3
bindFromJoin : Monad m => m a -> (a -> m b) -> m b
bindFromJoin ma f = join $ map f ma
-- 4
joinFromBind : Monad m => m (m a) -> m a
joinFromBind = (>>= id)
-- 5
-- The third law
-- `mf <*> ma = mf >>= (\fun => map (fun $) ma)`
-- does not hold, as the implementation of *apply* on the
-- right-hand side does not perform error accumulation.
--
-- `Validated e` therefore comes without implementation of
-- `Monad`. In order to use it in do blocks, it's best to
-- convert it to Either and back.
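-- A minimal sketch of the conversion suggested above. The names
-- `validatedToEither`, `eitherToValidated`, and `andThen` are
-- hypothetical helpers, not part of the tutorial.
validatedToEither : Validated e a -> Either e a
validatedToEither (Invalid err) = Left err
validatedToEither (Valid val)   = Right val

eitherToValidated : Either e a -> Validated e a
eitherToValidated (Left err)  = Invalid err
eitherToValidated (Right val) = Valid val

-- Sequencing two validations in a do block via `Either`.
-- Note that this stops at the first error instead of
-- accumulating errors.
andThen : Validated e a -> (a -> Validated e b) -> Validated e b
andThen va f = eitherToValidated $ do
  v <- validatedToEither va
  validatedToEither (f v)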
-- 6
DB : Type
DB = IORef (List (Nat,User))
data DBError : Type where
UserExists : Email -> Nat -> DBError
UserNotFound : Nat -> DBError
SizeLimitExceeded : DBError
record Prog a where
constructor MkProg
runProg : DB -> IO (Either DBError a)
-- 6.1
-- make sure you are able to read and understand the
-- point-free style in the implementation of `map`!
Functor Prog where
map f (MkProg run) = MkProg $ map (map f) . run
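-- For comparison, a sketch of the point-free `map` above spelled
-- out with an explicit lambda (`mapProg` is a hypothetical name).
-- The inner `map f` acts on `Either DBError`, the outer `map`
-- acts on `IO`.
mapProg : (a -> b) -> Prog a -> Prog b
mapProg f (MkProg run) = MkProg $ \db => map (map f) (run db)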
Applicative Prog where
  pure v = MkProg $ \_ => pure (Right v)
  MkProg rf <*> MkProg ra = MkProg $ \db => do
    Right fun <- rf db | Left err => pure (Left err)
    Right va  <- ra db | Left err => pure (Left err)
    pure (Right $ fun va)
Monad Prog where
  MkProg ra >>= f = MkProg $ \db => do
    Right va <- ra db | Left err => pure (Left err)
    runProg (f va) db
-- 6.2
HasIO Prog where
  liftIO act = MkProg $ \_ => map Right act
-- 6.3
throw : DBError -> Prog a
throw err = MkProg $ \_ => pure (Left err)
getUsers : Prog (List (Nat,User))
getUsers = MkProg (map Right . readIORef)
putUsers : List (Nat,User) -> Prog ()
putUsers us =
if length us > 1000 then throw SizeLimitExceeded
else MkProg $ \db => Right <$> writeIORef db us
modifyDB : (List (Nat,User) -> List (Nat,User)) -> Prog ()
modifyDB f = getUsers >>= putUsers . f
-- 6.4
lookupUser : (id : Nat) -> Prog User
lookupUser id = do
db <- getUsers
case lookup id db of
Just u => pure u
Nothing => throw (UserNotFound id)
-- 6.5
deleteUser : (id : Nat) -> Prog ()
deleteUser id =
-- In the first step, we are only interested in the potential
-- of failure, not the actual user value.
-- We can therefore use `(>>)` to chain the operations.
-- In order to do so, we must wrap `lookupUser` in a call
-- to `ignore`.
ignore (lookupUser id) >> modifyDB (filter $ (id /=) . fst)
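-- For comparison, a sketch of the same function in do notation
-- (`deleteUserDo` is a hypothetical name). Binding the result to
-- `_` discards the user, just as `ignore` does above.
deleteUserDo : (id : Nat) -> Prog ()
deleteUserDo id = do
  _ <- lookupUser id
  modifyDB (filter $ (id /=) . fst)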
-- 6.6
Eq Email where (==) = (==) `on` value
newId : List (Nat,User) -> Nat
newId = S . foldl (\n1,(n2,_) => max n1 n2) 0
addUser : (u : User) -> Prog Nat
addUser u = do
us <- getUsers
case find ((u.email ==) . email . snd) us of
Just (id,_) => throw $ UserExists u.email id
Nothing => let id = newId us in putUsers ((id, u) :: us) $> id
-- 6.7
update : Eq a => a -> b -> List (a,b) -> List (a,b)
update va vb = map (\p@(va',vb') => if va == va' then (va,vb) else p)
updateUser : (id : Nat) -> (mod : User -> User) -> Prog User
updateUser id mod = do
u <- mod <$> lookupUser id
us <- getUsers
case find ((u.email ==) . email . snd) us of
Just (id',_) => if id /= id'
then throw $ UserExists u.email id'
else putUsers (update id u us) $> u
Nothing => putUsers (update id u us) $> u
-- 6.8
record Prog' env err a where
constructor MkProg'
runProg' : env -> IO (Either err a)
Functor (Prog' env err) where
map f (MkProg' run) = MkProg' $ map (map f) . run
Applicative (Prog' env err) where
  pure v = MkProg' $ \_ => pure (Right v)
  MkProg' rf <*> MkProg' ra = MkProg' $ \db => do
    Right fun <- rf db | Left err => pure (Left err)
    Right va  <- ra db | Left err => pure (Left err)
    pure (Right $ fun va)
Monad (Prog' env err) where
  MkProg' ra >>= f = MkProg' $ \db => do
    Right va <- ra db | Left err => pure (Left err)
    runProg' (f va) db
HasIO (Prog' env err) where
  liftIO act = MkProg' $ \_ => map Right act
throw' : err -> Prog' env err a
throw' ve = MkProg' $ \_ => pure (Left ve)
src/Solutions/Folds.idr
module Solutions.Folds
import Data.Maybe
import Data.SnocList
import Data.Vect
%default total
--------------------------------------------------------------------------------
-- Recursion
--------------------------------------------------------------------------------
-- 1
anyList : (a -> Bool) -> List a -> Bool
anyList p [] = False
anyList p (x :: xs) = case p x of
False => anyList p xs
True => True
anyList' : (a -> Bool) -> List a -> Bool
anyList' p Nil = False
anyList' p (x :: xs) = p x || anyList' p xs
allList : (a -> Bool) -> List a -> Bool
allList p [] = True
allList p (x :: xs) = case p x of
True => allList p xs
False => False
allList' : (a -> Bool) -> List a -> Bool
allList' p Nil = True
allList' p (x :: xs) = p x && allList' p xs
-- 2
findList : (a -> Bool) -> List a -> Maybe a
findList f [] = Nothing
findList f (x :: xs) = if f x then Just x else findList f xs
-- 3
collectList : (a -> Maybe b) -> List a -> Maybe b
collectList f [] = Nothing
collectList f (x :: xs) = case f x of
Just vb => Just vb
Nothing => collectList f xs
-- Note the use of utility function `Data.Maybe.toMaybe` in the implementation
lookupList : Eq a => a -> List (a,b) -> Maybe b
lookupList va = collectList (\(k,v) => toMaybe (k == va) v)
-- 4
mapTR' : (a -> b) -> List a -> List b
mapTR' f = go Lin
where go : SnocList b -> List a -> List b
go sx [] = sx <>> Nil
go sx (x :: xs) = go (sx :< f x) xs
-- 5
filterTR' : (a -> Bool) -> List a -> List a
filterTR' f = go Lin
where go : SnocList a -> List a -> List a
go sx [] = sx <>> Nil
go sx (x :: xs) = if f x then go (sx :< x) xs else go sx xs
-- 6
mapMayTR : (a -> Maybe b) -> List a -> List b
mapMayTR f = go Lin
where go : SnocList b -> List a -> List b
go sx [] = sx <>> Nil
go sx (x :: xs) = case f x of
Just vb => go (sx :< vb) xs
Nothing => go sx xs
catMaybesTR : List (Maybe a) -> List a
catMaybesTR = mapMayTR id
-- 7
concatTR : List a -> List a -> List a
concatTR xs ys = (Lin <>< xs) <>> ys
-- 8
bindTR : List a -> (a -> List b) -> List b
bindTR xs f = go Lin xs
where go : SnocList b -> List a -> List b
go sx [] = sx <>> Nil
go sx (x :: xs) = go (sx <>< f x) xs
joinTR : List (List a) -> List a
joinTR = go Lin
where go : SnocList a -> List (List a) -> List a
go sx [] = sx <>> Nil
go sx (x :: xs) = go (sx <>< x) xs
-- Alternatively, use the connection between join and bind.
-- This also yields a tail-recursive implementation, as `bindTR` is tail recursive.
joinTR' : List (List a) -> List a
joinTR' xss = bindTR xss id
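-- The connection also works in the other direction: a sketch of
-- bind written as a map followed by a join, reusing `mapTR'` from
-- exercise 4 and `joinTR` from above. `bindTR'` is a hypothetical
-- name. This traverses the data twice, but both passes are tail
-- recursive.
bindTR' : List a -> (a -> List b) -> List b
bindTR' xs f = joinTR (mapTR' f xs)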
--------------------------------------------------------------------------------
-- A few Notes on Totality Checking
--------------------------------------------------------------------------------
record Tree a where
constructor Node
value : a
forest : List (Tree a)
Forest : Type -> Type
Forest = List . Tree
example : Tree Bits8
example = Node 0 [Node 1 [], Node 2 [Node 3 [], Node 4 [Node 5 []]]]
mutual
treeSize : Tree a -> Nat
treeSize (Node _ forest) = S $ forestSize forest
forestSize : Forest a -> Nat
forestSize [] = 0
forestSize (x :: xs) = treeSize x + forestSize xs
-- 1
mutual
treeDepth : Tree a -> Nat
treeDepth (Node _ forest) = S $ forestDepth forest
forestDepth : Forest a -> Nat
forestDepth [] = 0
forestDepth (x :: xs) = max (treeDepth x) (forestDepth xs)
-- 2
-- It's often easier to write complex interface implementations
-- via a utility function.
--
-- Of course, we could also use a `mutual` block as with
-- `treeSize` and `forestSize` here.
treeEq : Eq a => Tree a -> Tree a -> Bool
treeEq (Node v1 f1) (Node v2 f2) = v1 == v2 && go f1 f2
where go : Forest a -> Forest a -> Bool
go [] [] = True
go (x :: xs) (y :: ys) = treeEq x y && go xs ys
go _ _ = False
Eq a => Eq (Tree a) where (==) = treeEq
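-- A sketch of the `mutual` alternative mentioned above, following
-- the pattern of `treeSize` and `forestSize`. The names `treeEq'`
-- and `forestEq'` are hypothetical.
mutual
  treeEq' : Eq a => Tree a -> Tree a -> Bool
  treeEq' (Node v1 f1) (Node v2 f2) = v1 == v2 && forestEq' f1 f2

  forestEq' : Eq a => Forest a -> Forest a -> Bool
  forestEq' []        []        = True
  forestEq' (x :: xs) (y :: ys) = treeEq' x y && forestEq' xs ys
  forestEq' _         _         = False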
-- 3
treeMap : (a -> b) -> Tree a -> Tree b
treeMap f (Node value forest) = Node (f value) (go forest)
where go : Forest a -> Forest b
go [] = []
go (x :: xs) = treeMap f x :: go xs
Functor Tree where map = treeMap
-- 4
treeShow : Show a => Prec -> Tree a -> String
treeShow p (Node value forest) =
showCon p "Node" $ showArg value ++ case forest of
[] => " []"
x :: xs => " [" ++ treeShow Open x ++ go xs ++ "]"
where go : Forest a -> String
go [] = ""
go (y :: ys) = ", " ++ treeShow Open y ++ go ys
Show a => Show (Tree a) where showPrec = treeShow
-- 5
mutual
treeToVect : (tr : Tree a) -> Vect (treeSize tr) a
treeToVect (Node value forest) = value :: forestToVect forest
forestToVect : (f : Forest a) -> Vect (forestSize f) a
forestToVect [] = []
forestToVect (x :: xs) = treeToVect x ++ forestToVect xs
--------------------------------------------------------------------------------
-- Interface Foldable
--------------------------------------------------------------------------------
-- 1
data Crud : (i : Type) -> (a : Type) -> Type where
Create : (value : a) -> Crud i a
Update : (id : i) -> (value : a) -> Crud i a
Read : (id : i) -> Crud i a
Delete : (id : i) -> Crud i a
Foldable (Crud i) where
foldr acc st (Create value) = acc value st
foldr acc st (Update _ value) = acc value st
foldr _ st (Read _) = st
foldr _ st (Delete _) = st
foldl acc st (Create value) = acc st value
foldl acc st (Update _ value) = acc st value
foldl _ st (Read _) = st
foldl _ st (Delete _) = st
null (Create _) = False
null (Update _ _) = False
null (Read _) = True
null (Delete _) = True
foldMap f (Create value) = f value
foldMap f (Update _ value) = f value
foldMap _ (Read _) = neutral
foldMap _ (Delete _) = neutral
foldlM acc st (Create value) = acc st value
foldlM acc st (Update _ value) = acc st value
foldlM _ st (Read _) = pure st
foldlM _ st (Delete _) = pure st
toList (Create v) = [v]
toList (Update _ v) = [v]
toList (Read _) = []
toList (Delete _) = []
-- 2
data Response : (e, i, a : Type) -> Type where
Created : (id : i) -> (value : a) -> Response e i a
Updated : (id : i) -> (value : a) -> Response e i a
Found : (values : List a) -> Response e i a
Deleted : (id : i) -> Response e i a
Error : (err : e) -> Response e i a
Foldable (Response e i) where
foldr acc st (Created _ value) = acc value st
foldr acc st (Updated _ value) = acc value st
foldr acc st (Found values) = foldr acc st values
foldr _ st (Deleted _) = st
foldr _ st (Error _) = st
foldl acc st (Created _ value) = acc st value
foldl acc st (Updated _ value) = acc st value
foldl acc st (Found values) = foldl acc st values
foldl _ st (Deleted _) = st
foldl _ st (Error _) = st
null (Created _ _) = False
null (Updated _ _) = False
null (Found values) = null values
null (Deleted _) = True
null (Error _) = True
foldMap f (Created _ value) = f value
foldMap f (Updated _ value) = f value
foldMap f (Found values) = foldMap f values
foldMap f (Deleted _) = neutral
foldMap f (Error _) = neutral
toList (Created _ value) = [value]
toList (Updated _ value) = [value]
toList (Found values) = values
toList (Deleted _) = []
toList (Error _) = []
foldlM acc st (Created _ value) = acc st value
foldlM acc st (Updated _ value) = acc st value
foldlM acc st (Found values) = foldlM acc st values
foldlM _ st (Deleted _) = pure st
foldlM _ st (Error _) = pure st
-- 3
data List01 : (nonEmpty : Bool) -> Type -> Type where
Nil : List01 False a
(::) : a -> List01 False a -> List01 ne a
list01ToList : List01 ne a -> List a
list01ToList = go Lin
where go : SnocList a -> List01 ne' a -> List a
go sx [] = sx <>> Nil
go sx (x :: xs) = go (sx :< x) xs
list01FoldMap : Monoid m => (a -> m) -> List01 ne a -> m
list01FoldMap f = go neutral
where go : m -> List01 ne' a -> m
go vm [] = vm
go vm (x :: xs) = go (vm <+> f x) xs
Foldable (List01 ne) where
foldr acc st [] = st
foldr acc st (x :: xs) = acc x (foldr acc st xs)
foldl acc st [] = st
foldl acc st (x :: xs) = foldl acc (acc st x) xs
null [] = True
null (_ :: _) = False
toList = list01ToList
foldMap = list01FoldMap
foldlM _ st [] = pure st
foldlM f st (x :: xs) = f st x >>= \st' => foldlM f st' xs
-- 4
mutual
foldrTree : (el -> st -> st) -> st -> Tree el -> st
foldrTree f v (Node value forest) = f value (foldrForest f v forest)
foldrForest : (el -> st -> st) -> st -> Forest el -> st
foldrForest _ v [] = v
foldrForest f v (x :: xs) = foldrTree f (foldrForest f v xs) x
mutual
foldlTree : (st -> el -> st) -> st -> Tree el -> st
foldlTree f v (Node value forest) = foldlForest f (f v value) forest
foldlForest : (st -> el -> st) -> st -> Forest el -> st
foldlForest _ v [] = v
foldlForest f v (x :: xs) = foldlForest f (foldlTree f v x) xs
mutual
foldMapTree : Monoid m => (el -> m) -> Tree el -> m
foldMapTree f (Node value forest) = f value <+> foldMapForest f forest
foldMapForest : Monoid m => (el -> m) -> Forest el -> m
foldMapForest _ [] = neutral
foldMapForest f (x :: xs) = foldMapTree f x <+> foldMapForest f xs
mutual
toListTree : Tree el -> List el
toListTree (Node value forest) = value :: toListForest forest
toListForest : Forest el -> List el
toListForest [] = []
toListForest (x :: xs) = toListTree x ++ toListForest xs
mutual
foldlMTree : Monad m => (st -> el -> m st) -> st -> Tree el -> m st
foldlMTree f v (Node value forest) =
f v value >>= \v' => foldlMForest f v' forest
foldlMForest : Monad m => (st -> el -> m st) -> st -> Forest el -> m st
foldlMForest _ v [] = pure v
foldlMForest f v (x :: xs) =
foldlMTree f v x >>= \v' => foldlMForest f v' xs
Foldable Tree where
foldr = foldrTree
foldl = foldlTree
foldMap = foldMapTree
foldlM = foldlMTree
null _ = False
toList = toListTree
-- 5
record Comp (f,g : Type -> Type) (a : Type) where
constructor MkComp
unComp : f (g a)
Foldable f => Foldable g => Foldable (Comp f g) where
foldr f st (MkComp v) = foldr (flip $ foldr f) st v
foldl f st (MkComp v) = foldl (foldl f) st v
foldMap f (MkComp v) = foldMap (foldMap f) v
foldlM f st (MkComp v) = foldlM (foldlM f) st v
toList (MkComp v) = foldMap toList v
null (MkComp v) = all null v
record Product (f,g : Type -> Type) (a : Type) where
constructor MkProduct
fst : f a
snd : g a
Foldable f => Foldable g => Foldable (Product f g) where
foldr f st (MkProduct v w) = foldr f (foldr f st w) v
foldl f st (MkProduct v w) = foldl f (foldl f st v) w
foldMap f (MkProduct v w) = foldMap f v <+> foldMap f w
toList (MkProduct v w) = toList v ++ toList w
null (MkProduct v w) = null v && null w
foldlM f st (MkProduct v w) = foldlM f st v >>= \st' => foldlM f st' w
--------------------------------------------------------------------------------
-- Tests
--------------------------------------------------------------------------------
iterateTR : Nat -> (a -> a) -> a -> List a
iterateTR k f = go k Lin
where go : Nat -> SnocList a -> a -> List a
go 0 sx _ = sx <>> Nil
go (S k) sx x = go k (sx :< x) (f x)
values : List Integer
values = iterateTR 100000 (+1) 0
main : IO ()
main = do
printLn . length $ mapTR' (*2) values
printLn . length $ filterTR' (\n => n `mod` 2 == 0) values
printLn . length $ mapMayTR (\n => toMaybe (n `mod` 2 == 1) "foo") values
printLn . length $ concatTR values values
printLn . length $ bindTR [1..500] (\n => iterateTR n (+1) n)
src/Solutions/Traverse.idr
module Solutions.Traverse
import Control.Applicative.Const
import Control.Monad.Identity
import Data.HList
import Data.List1
import Data.Singleton
import Data.String
import Data.Validated
import Data.Vect
import Text.CSV
%default total
record State state a where
constructor ST
runST : state -> (state,a)
get : State state state
get = ST $ \s => (s,s)
put : state -> State state ()
put v = ST $ \_ => (v,())
modify : (state -> state) -> State state ()
modify f = ST $ \v => (f v,())
runState : state -> State state a -> (state, a)
runState = flip runST
evalState : state -> State state a -> a
evalState s = snd . runState s
execState : state -> State state a -> state
execState s = fst . runState s
Functor (State state) where
map f (ST run) = ST $ \s => let (s2,va) = run s in (s2, f va)
Applicative (State state) where
pure v = ST $ \s => (s,v)
ST fun <*> ST val = ST $ \s =>
let (s2, f) = fun s
(s3, va) = val s2
in (s3, f va)
Monad (State state) where
ST val >>= f = ST $ \s =>
let (s2, va) = val s
in runST (f va) s2
--------------------------------------------------------------------------------
-- Reading CSV Tables
--------------------------------------------------------------------------------
-- 1
mapFromTraverse : Traversable t => (a -> b) -> t a -> t b
mapFromTraverse f = runIdentity . traverse (Id . f)
-- 2
-- Since Idris can't infer the type of `b` in the call to `MkConst`, we have
-- to pass it explicitly (we are free to choose any type here).
foldMapFromTraverse : Traversable t => Monoid m => (a -> m) -> t a -> m
foldMapFromTraverse f = runConst . traverse (MkConst {b = ()} . f)
-- 3
interface Functor t => Foldable t => Traversable' t where
traverse' : Applicative f => (a -> f b) -> t a -> f (t b)
Traversable' List where
traverse' f Nil = pure Nil
traverse' f (x :: xs) = [| f x :: traverse' f xs |]
Traversable' List1 where
traverse' f (h ::: t) = [| f h ::: traverse' f t |]
Traversable' (Either e) where
traverse' f (Left ve) = pure $ Left ve
traverse' f (Right va) = Right <$> f va
Traversable' Maybe where
traverse' f Nothing = pure Nothing
traverse' f (Just va) = Just <$> f va
-- 4
data List01 : (nonEmpty : Bool) -> Type -> Type where
Nil : List01 False a
(::) : a -> List01 False a -> List01 ne a
Functor (List01 ne) where
map f Nil = Nil
map f (x :: xs) = f x :: map f xs
Foldable (List01 ne) where
foldr acc st [] = st
foldr acc st (x :: xs) = acc x (foldr acc st xs)
Traversable (List01 ne) where
traverse _ Nil = pure Nil
traverse f (x :: xs) = [| f x :: traverse f xs |]
-- 5
record Tree a where
constructor Node
value : a
forest : List (Tree a)
Forest : Type -> Type
Forest = List . Tree
treeMap : (a -> b) -> Tree a -> Tree b
treeMap f (Node value forest) = Node (f value) (go forest)
where go : Forest a -> Forest b
go [] = []
go (x :: xs) = treeMap f x :: go xs
Functor Tree where map = treeMap
mutual
foldrTree : (el -> st -> st) -> st -> Tree el -> st
foldrTree f v (Node value forest) = f value (foldrForest f v forest)
foldrForest : (el -> st -> st) -> st -> Forest el -> st
foldrForest _ v [] = v
foldrForest f v (x :: xs) = foldrTree f (foldrForest f v xs) x
Foldable Tree where
foldr = foldrTree
mutual
traverseTree : Applicative f => (a -> f b) -> Tree a -> f (Tree b)
traverseTree g (Node v fo) = [| Node (g v) (traverseForest g fo) |]
traverseForest : Applicative f => (a -> f b) -> Forest a -> f (Forest b)
traverseForest g [] = pure []
traverseForest g (x :: xs) = [| traverseTree g x :: traverseForest g xs |]
Traversable Tree where
traverse = traverseTree
-- 6
data Crud : (i : Type) -> (a : Type) -> Type where
Create : (value : a) -> Crud i a
Update : (id : i) -> (value : a) -> Crud i a
Read : (id : i) -> Crud i a
Delete : (id : i) -> Crud i a
Functor (Crud i) where
map f (Create value) = Create $ f value
map f (Update id value) = Update id $ f value
map f (Read id) = Read id
map f (Delete id) = Delete id
Foldable (Crud i) where
foldr acc st (Create value) = acc value st
foldr acc st (Update _ value) = acc value st
foldr _ st (Read _) = st
foldr _ st (Delete _) = st
Traversable (Crud i) where
traverse f (Create value) = Create <$> f value
traverse f (Update id value) = Update id <$> f value
traverse f (Read id) = pure $ Read id
traverse f (Delete id) = pure $ Delete id
-- 7
data Response : (e, i, a : Type) -> Type where
Created : (id : i) -> (value : a) -> Response e i a
Updated : (id : i) -> (value : a) -> Response e i a
Found : (values : List a) -> Response e i a
Deleted : (id : i) -> Response e i a
Error : (err : e) -> Response e i a
Functor (Response e i) where
map f (Created id value) = Created id $ f value
map f (Updated id value) = Updated id $ f value
map f (Found values) = Found $ map f values
map _ (Deleted id) = Deleted id
map _ (Error err) = Error err
Foldable (Response e i) where
foldr acc st (Created _ value) = acc value st
foldr acc st (Updated _ value) = acc value st
foldr acc st (Found values) = foldr acc st values
foldr _ st (Deleted _) = st
foldr _ st (Error _) = st
Traversable (Response e i) where
traverse f (Created id value) = Created id <$> f value
traverse f (Updated id value) = Updated id <$> f value
traverse f (Found values) = Found <$> traverse f values
traverse _ (Deleted id) = pure $ Deleted id
traverse _ (Error err) = pure $ Error err
-- 8
record Comp (f,g : Type -> Type) (a : Type) where
constructor MkComp
unComp : f (g a)
Functor f => Functor g => Functor (Comp f g) where
map fun = MkComp . (map . map) fun . unComp
Foldable f => Foldable g => Foldable (Comp f g) where
foldr f st (MkComp v) = foldr (flip $ foldr f) st v
Traversable f => Traversable g => Traversable (Comp f g) where
traverse fun = map MkComp . (traverse . traverse) fun . unComp
record Product (f,g : Type -> Type) (a : Type) where
constructor MkProduct
fst : f a
snd : g a
Functor f => Functor g => Functor (Product f g) where
map fun (MkProduct fa ga) = MkProduct (map fun fa) (map fun ga)
Foldable f => Foldable g => Foldable (Product f g) where
foldr f st (MkProduct v w) = foldr f (foldr f st w) v
Traversable f => Traversable g => Traversable (Product f g) where
traverse fun (MkProduct fa ga) =
[| MkProduct (traverse fun fa) (traverse fun ga) |]
--------------------------------------------------------------------------------
-- Programming with State
--------------------------------------------------------------------------------
-- 1
rnd : Bits64 -> Bits64
rnd seed = fromInteger
$ (437799614237992725 * cast seed) `mod` 2305843009213693951
Gen : Type -> Type
Gen = State Bits64
-- 1.1
bits64 : Gen Bits64
bits64 = get <* modify rnd
-- 1.2
range64 : (upper : Bits64) -> Gen Bits64
range64 18446744073709551615 = bits64
range64 n = (`mod` (n + 1)) <$> bits64
interval64 : (a,b : Bits64) -> Gen Bits64
interval64 a b =
let mi = min a b
ma = max a b
in (mi +) <$> range64 (ma - mi)
interval : Num n => Cast n Bits64 => (a,b : n) -> Gen n
interval a b = fromInteger . cast <$> interval64 (cast a) (cast b)
-- 1.3
bool : Gen Bool
bool = (== 0) <$> range64 1
-- 1.4
fin : {n : _} -> Gen (Fin $ S n)
fin = (\x => fromMaybe FZ $ natToFin x _) <$> interval 0 n
-- 1.5
element : {n : _} -> Vect (S n) a -> Gen a
element vs = (`index` vs) <$> fin
-- 1.6
vect : {n : _} -> Gen a -> Gen (Vect n a)
vect = sequence . replicate n
list : Gen Nat -> Gen a -> Gen (List a)
list gnat ga = gnat >>= \n => toList <$> vect {n} ga
testGen : Bits64 -> Gen a -> Vect 10 a
testGen seed = evalState seed . vect
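-- Usage sketch (not part of the exercise; `diceRolls` is a hypothetical name).
-- Running a generator with a fixed seed should give a reproducible vector.
-- Assuming `interval` behaves as defined above, every element lies in [1,6].
diceRolls : Vect 10 Nat
diceRolls = testGen 100 (interval 1 6)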
-- 1.7
choice : {n : _} -> Vect (S n) (Gen a) -> Gen a
choice gens = element gens >>= id
-- 1.8
either : Gen a -> Gen b -> Gen (Either a b)
either ga gb = choice [Left <$> ga, Right <$> gb]
-- 1.9
printableAscii : Gen Char
printableAscii = chr <$> interval 32 126
-- 1.10
string : Gen Nat -> Gen Char -> Gen String
string gn = map pack . list gn
-- 1.11
namespace HListF
  public export
  data HListF : (f : Type -> Type) -> (ts : List Type) -> Type where
    Nil  : HListF f []
    (::) : (x : f t) -> (xs : HListF f ts) -> HListF f (t :: ts)
hlist : HListF Gen ts -> Gen (HList ts)
hlist Nil = pure Nil
hlist (gh :: gt) = [| gh :: hlist gt |]
-- 1.12
hlistT : Applicative f => HListF f ts -> f (HList ts)
hlistT Nil = pure Nil
hlistT (fh :: ft) = [| fh :: hlistT ft |]
-- 2
-- 2.1
record IxState s t a where
constructor IxST
runIxST : s -> (t,a)
-- 2.2
Functor (IxState s t) where
map f (IxST run) = IxST $ \vs => let (vt,va) = run vs in (vt, f va)
-- 2.3
pure : a -> IxState s s a
pure va = IxST $ \vs => (vs,va)
(<*>) : IxState r s (a -> b) -> IxState s t a -> IxState r t b
IxST ff <*> IxST fa = IxST $ \vr =>
  let (vs,f)  = ff vr
      (vt,va) = fa vs
   in (vt, f va)
-- 2.4
(>>=) : IxState r s a -> (a -> IxState s t b) -> IxState r t b
IxST fa >>= f = IxST $ \vr =>
  let (vs,va) = fa vr in runIxST (f va) vs
(>>) : IxState r s () -> IxState s t a -> IxState r t a
IxST fu >> IxST fb = IxST $ fb . fst . fu
-- 2.5
namespace IxMonad
  interface Functor (m s t) =>
            IxApplicative (0 m : Type -> Type -> Type -> Type) where
    pure : a -> m s s a
    (<*>) : m r s (a -> b) -> m s t a -> m r t b
  interface IxApplicative m => IxMonad m where
    (>>=) : m r s a -> (a -> m s t b) -> m r t b
IxApplicative IxState where
  pure = Traverse.pure
  (<*>) = Traverse.(<*>)
IxMonad IxState where
  (>>=) = Traverse.(>>=)
-- 2.6
namespace IxState
  get : IxState s s s
  get = IxST $ \vs => (vs,vs)
  put : t -> IxState s t ()
  put vt = IxST $ \_ => (vt,())
  modify : (s -> t) -> IxState s t ()
  modify f = IxST $ \vs => (f vs, ())
  runState : s -> IxState s t a -> (t,a)
  runState = flip runIxST
  evalState : s -> IxState s t a -> a
  evalState vs = snd . runState vs
  execState : s -> IxState s t a -> t
  execState vs = fst . runState vs
-- 2.7
Applicative (IxState s s) where
pure = Traverse.pure
(<*>) = Traverse.(<*>)
Monad (IxState s s) where
(>>=) = Traverse.(>>=)
join = (>>= id)
--------------------------------------------------------------------------------
-- The Power of Composition
--------------------------------------------------------------------------------
-- 1
data Tagged : (tag, val : Type) -> Type where
Tag : tag -> val -> Tagged tag val
Pure : val -> Tagged tag val
Functor (Tagged tag) where
map f (Tag x y) = Tag x (f y)
map f (Pure x) = Pure (f x)
Foldable (Tagged tag) where
foldr f acc (Tag _ x) = f x acc
foldr f acc (Pure x) = f x acc
Traversable (Tagged tag) where
traverse f (Tag x y) = Tag x <$> f y
traverse f (Pure x) = Pure <$> f x
Bifunctor Tagged where
bimap f g (Tag x y) = Tag (f x) (g y)
bimap _ g (Pure x) = Pure (g x)
mapFst f (Tag x y) = Tag (f x) y
mapFst _ (Pure x) = Pure x
mapSnd g (Tag x y) = Tag x (g y)
mapSnd g (Pure x) = Pure (g x)
Bifoldable Tagged where
bifoldr f g acc (Tag x y) = f x (g y acc)
bifoldr f g acc (Pure x) = g x acc
bifoldl f g acc (Tag x y) = g (f acc x) y
bifoldl _ g acc (Pure x) = g acc x
binull _ = False
Bitraversable Tagged where
bitraverse f g (Tag x y) = [| Tag (f x) (g y) |]
bitraverse _ g (Pure x) = Pure <$> g x
-- 2
record Biff (p : Type -> Type -> Type) (f,g : Type -> Type) (a,b : Type) where
constructor MkBiff
runBiff : p (f a) (g b)
Bifunctor p => Functor f => Functor g => Bifunctor (Biff p f g) where
bimap ff fg = MkBiff . bimap (map ff) (map fg) . runBiff
Bifoldable p => Foldable f => Foldable g => Bifoldable (Biff p f g) where
bifoldr ff fg acc = bifoldr (flip $ foldr ff) (flip $ foldr fg) acc . runBiff
Bitraversable p => Traversable f => Traversable g =>
Bitraversable (Biff p f g) where
bitraverse ff fg =
map MkBiff . bitraverse (traverse ff) (traverse fg) . runBiff
-- 3
record Tannen (f : Type -> Type) (p : Type -> Type -> Type) (a,b : Type) where
constructor MkTannen
runTannen : f (p a b)
Bifunctor p => Functor f => Bifunctor (Tannen f p) where
bimap ff fg = MkTannen . map (bimap ff fg) . runTannen
Bifoldable p => Foldable f => Bifoldable (Tannen f p) where
bifoldr ff fg acc = foldr (flip $ bifoldr ff fg) acc . runTannen
Bitraversable p => Traversable f => Bitraversable (Tannen f p) where
bitraverse ff fg = map MkTannen . traverse (bitraverse ff fg) . runTannen
-- 4
data TagError : Type where
CE : CSVError -> TagError
InvalidTag : (line : Nat) -> (tag : String) -> TagError
Append : TagError -> TagError -> TagError
Semigroup TagError where (<+>) = Append
pairWithIndex : a -> State Nat (Nat,a)
pairWithIndex v = ST $ \index => (S index, (index, v))
data Color = Red | Green | Blue
readColor : String -> State Nat (Validated TagError Color)
readColor s = uncurry decodeTag . (`MkPair` s) <$> get
where decodeTag : Nat -> String -> Validated TagError Color
decodeTag k "red" = pure Red
decodeTag k "green" = pure Green
decodeTag k "blue" = pure Blue
decodeTag k s = Invalid $ InvalidTag k s
readTaggedLine : String -> Tagged String String
readTaggedLine s = case split ('#' ==) s of
h ::: [t] => Tag t h
_ => Pure s
tagAndDecodeTE : (0 ts : List Type)
-> CSVLine (HList ts)
=> String
-> State Nat (Validated TagError (HList ts))
tagAndDecodeTE ts s = mapFst CE . uncurry (hdecode ts) <$> pairWithIndex s
readTagged : (0 ts : List Type)
-> CSVLine (HList ts)
=> String
-> Validated TagError (List $ Tagged Color $ HList ts)
readTagged ts = map runTannen
. evalState 1
. bitraverse @{%search} @{Compose} readColor (tagAndDecodeTE ts)
. MkTannen {f = List} {p = Tagged}
. map readTaggedLine
. lines
validInput : String
validInput = """
f,12,-13.01#green
t,100,0.0017
t,1,100.8#blue
f,255,0.0
f,24,1.12e17
"""
invalidInput : String
invalidInput = """
o,12,-13.01#yellow
t,100,0.0017
t,1,abc
f,256,0.0
f,24,1.12e17
"""
src/Solutions/DPair.idr
module Solutions.DPair
import Control.Monad.State
import Data.DPair
import Data.Either
import Data.HList
import Data.List
import Data.List1
import Data.Singleton
import Data.String
import Data.Vect
import Text.CSV
import System.File
%default total
--------------------------------------------------------------------------------
-- Dependent Pairs
--------------------------------------------------------------------------------
-- 1
filterVect : (a -> Bool) -> Vect m a -> (n ** Vect n a)
filterVect f [] = (_ ** [])
filterVect f (x :: xs) = case f x of
True => let (_ ** ys) = filterVect f xs in (_ ** x :: ys)
False => filterVect f xs
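-- Usage sketch (hypothetical name `bigOnes`): the length of the result is not
-- known statically, so it is returned together with the vector.
-- Assuming the definition above, this evaluates to `(2 ** [3, 4])`.
bigOnes : (n ** Vect n Nat)
bigOnes = filterVect (> 2) [1, 2, 3, 4]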
-- 2
mapMaybeVect : (a -> Maybe b) -> Vect m a -> (n ** Vect n b)
mapMaybeVect f [] = (_ ** [])
mapMaybeVect f (x :: xs) = case f x of
Just v => let (_ ** vs) = mapMaybeVect f xs in (_ ** v :: vs)
Nothing => mapMaybeVect f xs
-- 3
dropWhileVect : (a -> Bool) -> Vect m a -> Exists (\n => Vect n a)
dropWhileVect f [] = Evidence _ []
dropWhileVect f (x :: xs) = case f x of
True => dropWhileVect f xs
False => Evidence _ (x :: xs)
-- 4
vectLength : Vect n a -> Singleton n
vectLength [] = Val 0
vectLength (x :: xs) = let Val k = vectLength xs in Val (S k)
dropWhileVect' : (a -> Bool) -> Vect m a -> (n ** Vect n a)
dropWhileVect' f xs =
  let Evidence _ ys = dropWhileVect f xs
      Val n         = vectLength ys
   in (n ** ys)
--------------------------------------------------------------------------------
-- Use Case: Nucleic Acids
--------------------------------------------------------------------------------
-- 1
data BaseType = DNABase | RNABase
data Nucleobase' : BaseType -> Type where
Adenine' : Nucleobase' b
Cytosine' : Nucleobase' b
Guanine' : Nucleobase' b
Thymine' : Nucleobase' DNABase
Uracile' : Nucleobase' RNABase
RNA' : Type
RNA' = List (Nucleobase' RNABase)
DNA' : Type
DNA' = List (Nucleobase' DNABase)
Acid1 : Type
Acid1 = (b ** List (Nucleobase' b))
record Acid2 where
constructor MkAcid2
baseType : BaseType
sequence : List (Nucleobase' baseType)
data Acid3 : Type where
SomeRNA : RNA' -> Acid3
SomeDNA : DNA' -> Acid3
nb12 : Acid1 -> Acid2
nb12 (fst ** snd) = MkAcid2 fst snd
nb21 : Acid2 -> Acid1
nb21 (MkAcid2 bt seq) = (bt ** seq)
nb13 : Acid1 -> Acid3
nb13 (DNABase ** snd) = SomeDNA snd
nb13 (RNABase ** snd) = SomeRNA snd
nb31 : Acid3 -> Acid1
nb31 (SomeRNA xs) = (RNABase ** xs)
nb31 (SomeDNA xs) = (DNABase ** xs)
-- 2
data Dir = Sense | Antisense
data Nucleobase : BaseType -> Dir -> Type where
Adenine : Nucleobase b d
Cytosine : Nucleobase b d
Guanine : Nucleobase b d
Thymine : Nucleobase DNABase d
Uracile : Nucleobase RNABase d
RNA : Dir -> Type
RNA d = List (Nucleobase RNABase d)
DNA : Dir -> Type
DNA d = List (Nucleobase DNABase d)
-- 3
inverse : Dir -> Dir
inverse Sense = Antisense
inverse Antisense = Sense
complementBase : (b : BaseType)
-> Nucleobase b dir
-> Nucleobase b (inverse dir)
complementBase DNABase Adenine = Thymine
complementBase RNABase Adenine = Uracile
complementBase _ Cytosine = Guanine
complementBase _ Guanine = Cytosine
complementBase _ Thymine = Adenine
complementBase _ Uracile = Adenine
complement : (b : BaseType)
-> List (Nucleobase b dir)
-> List (Nucleobase b $ inverse dir)
complement b = map (complementBase b)
transcribeBase : Nucleobase DNABase Antisense -> Nucleobase RNABase Sense
transcribeBase Adenine = Uracile
transcribeBase Cytosine = Guanine
transcribeBase Guanine = Cytosine
transcribeBase Thymine = Adenine
transcribe : DNA Antisense -> RNA Sense
transcribe = map transcribeBase
transcribeAny : (dir : Dir) -> DNA dir -> RNA Sense
transcribeAny Antisense = transcribe
transcribeAny Sense = transcribe . complement _
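-- Usage sketch (hypothetical names `senseDNA` and `transcribed`): a sense DNA
-- strand is first complemented and then transcribed, so, assuming the
-- definitions above, `ATG` should come out as `AUG`.
senseDNA : List (Nucleobase DNABase Sense)
senseDNA = [Adenine, Thymine, Guanine]
transcribed : RNA Sense
transcribed = transcribeAny Sense senseDNA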
-- 4
record NucleicAcid where
constructor MkNucleicAcid
baseType : BaseType
dir : Dir
sequence : List (Nucleobase baseType dir)
-- 5
readAnyBase : {0 dir : _} -> Char -> Maybe (Nucleobase b dir)
readAnyBase 'A' = Just Adenine
readAnyBase 'C' = Just Cytosine
readAnyBase 'G' = Just Guanine
readAnyBase _ = Nothing
readRNABase : {0 dir : _} -> Char -> Maybe (Nucleobase RNABase dir)
readRNABase 'U' = Just Uracile
readRNABase c = readAnyBase c
readDNABase : {0 dir : _} -> Char -> Maybe (Nucleobase DNABase dir)
readDNABase 'T' = Just Thymine
readDNABase c = readAnyBase c
readRNA : String -> Maybe (dir : Dir ** RNA dir)
readRNA str = case forget $ split ('-' ==) str of
["5´",s,"3´"] => MkDPair Sense <$> traverse readRNABase (unpack s)
["3´",s,"5´"] => MkDPair Antisense <$> traverse readRNABase (unpack s)
_ => Nothing
readDNA : String -> Maybe (dir : Dir ** DNA dir)
readDNA str = case forget $ split ('-' ==) str of
["5´",s,"3´"] => MkDPair Sense <$> traverse readDNABase (unpack s)
["3´",s,"5´"] => MkDPair Antisense <$> traverse readDNABase (unpack s)
_ => Nothing
-- 6
preSuf : Dir -> (String,String)
preSuf Sense = ("5´-", "-3´")
preSuf Antisense = ("3´-", "-5´")
encodeBase : Nucleobase c d -> Char
encodeBase Adenine = 'A'
encodeBase Cytosine = 'C'
encodeBase Guanine = 'G'
encodeBase Thymine = 'T'
encodeBase Uracile = 'U'
encode : (dir : Dir) -> List (Nucleobase b dir) -> String
encode dir seq =
let (pre,suf) = preSuf dir
in pre ++ pack (map encodeBase seq) ++ suf
-- 7
public export
data InputError : Type where
UnknownBaseType : String -> InputError
InvalidSequence : String -> InputError
readAcid : (b : BaseType)
-> String
-> Either InputError (d ** List $ Nucleobase b d)
readAcid b str =
let err = InvalidSequence str
in case b of
DNABase => maybeToEither err $ readDNA str
RNABase => maybeToEither err $ readRNA str
toAcid : (b : BaseType) -> (d ** List $ Nucleobase b d) -> NucleicAcid
toAcid b (d ** seq) = MkNucleicAcid b d seq
getNucleicAcid : IO (Either InputError NucleicAcid)
getNucleicAcid = do
  baseString <- getLine
  case baseString of
    "DNA" => map (toAcid _) . readAcid DNABase <$> getLine
    "RNA" => map (toAcid _) . readAcid RNABase <$> getLine
    _     => pure $ Left (UnknownBaseType baseString)
printRNA : RNA Sense -> IO ()
printRNA = putStrLn . encode _
transcribeProg : IO ()
transcribeProg = do
  Right (MkNucleicAcid b d seq) <- getNucleicAcid
    | Left (InvalidSequence str) => putStrLn $ "Invalid sequence: " ++ str
    | Left (UnknownBaseType str) => putStrLn $ "Unknown base type: " ++ str
  case b of
    DNABase => printRNA $ transcribeAny d seq
    RNABase => case d of
      Sense     => printRNA seq
      Antisense => printRNA $ complement _ seq
--------------------------------------------------------------------------------
-- Use Case: CSV Files with a Schema
--------------------------------------------------------------------------------
-- A lot of code was copy-pasted from the chapter's text and is therefore
-- not very interesting. I tried to annotate the new parts with some hints
-- for better understanding. Also, instead of grouping code by exercise number,
-- I organized it thematically.
-- *** Types ***
-- I used an indexed type here to make sure the data
-- constructor `Optional` takes only non-nullary types
-- as arguments. As noted in exercise 3, having a nesting
-- of nullary types does not make sense without a way to
-- distinguish between a `Nothing` and a `Just Nothing`,
-- both of which would be encoded as the empty string.
-- For `Finite`, we have to add `n` as an argument to the
-- data constructor, so we can use it to decode values
-- of type `Fin n`.
data ColType0 : (nullary : Bool) -> Type where
I64 : ColType0 b
Str : ColType0 b
Boolean : ColType0 b
Float : ColType0 b
Natural : ColType0 b
BigInt : ColType0 b
Finite : Nat -> ColType0 b
Optional : ColType0 False -> ColType0 True
-- This is the type used in schemata, where nullary types
-- are explicitly allowed.
ColType : Type
ColType = ColType0 True
Schema : Type
Schema = List ColType
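-- Illustration (hypothetical names; assumes a compiler version that supports
-- `failing` blocks): one level of `Optional` is accepted, a nested one is
-- rejected, since `Optional` expects a `ColType0 False` but returns a
-- `ColType0 True`.
optionalStr : ColType
optionalStr = Optional Str
failing
  nestedOptional : ColType
  nestedOptional = Optional (Optional Str)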
-- The only interesting new parts are the last two
-- lines. They should be pretty self-explanatory.
IdrisType : ColType0 b -> Type
IdrisType I64 = Int64
IdrisType Str = String
IdrisType Boolean = Bool
IdrisType Float = Double
IdrisType Natural = Nat
IdrisType BigInt = Integer
IdrisType (Finite n) = Fin n
IdrisType (Optional t) = Maybe $ IdrisType t
Row : Schema -> Type
Row = HList . map IdrisType
record Table where
constructor MkTable
schema : Schema
size : Nat
rows : Vect size (Row schema)
data Error : Type where
ExpectedEOI : (pos : Nat) -> String -> Error
ExpectedLine : Error
InvalidCell : (row, col : Nat) -> ColType0 b -> String -> Error
NoNat : String -> Error
OutOfBounds : (size : Nat) -> (index : Nat) -> Error
ReadError : (path : String) -> FileError -> Error
UnexpectedEOI : (pos : Nat) -> String -> Error
UnknownCommand : String -> Error
UnknownType : (pos : Nat) -> String -> Error
WriteError : (path : String) -> FileError -> Error
-- Oh, the type of `Query` is a nice one. :-)
-- `PrintTable`, on the other hand, is trivial.
-- The save and load commands are special: They will
-- already have carried out their tasks after parsing.
-- This allows us to keep `applyCommand` pure.
data Command : (t : Table) -> Type where
PrintSchema : Command t
PrintSize : Command t
PrintTable : Command t
Load : Table -> Command t
Save : Command t
New : (newSchema : Schema) -> Command t
Prepend : Row (schema t) -> Command t
Get : Fin (size t) -> Command t
Delete : Fin (size t) -> Command t
Quit : Command t
Query : (ix : Fin (length $ schema t))
-> (val : IdrisType $ indexList (schema t) ix)
-> Command t
-- *** Core Functionality ***
-- Compares two values for equality.
eq : (c : ColType0 b) -> IdrisType c -> IdrisType c -> Bool
eq I64 x y = x == y
eq Str x y = x == y
eq Boolean x y = x == y
eq Float x y = x == y
eq Natural x y = x == y
eq BigInt x y = x == y
eq (Finite k) x y = x == y
eq (Optional z) (Just x) (Just y) = eq z x y
eq (Optional z) Nothing Nothing = True
eq (Optional z) _ _ = False
-- Note: It would have been quite a bit easier to type and
-- implement this, had we used a heterogeneous vector instead
-- of a heterogeneous list for encoding table rows. However,
-- I still think it's pretty cool that this type checks!
eqAt : (ts : Schema)
-> (ix : Fin $ length ts)
-> (val : IdrisType $ indexList ts ix)
-> (row : Row ts)
-> Bool
eqAt (x :: _) FZ val (v :: _) = eq x val v
eqAt (_ :: xs) (FS y) val (_ :: vs) = eqAt xs y val vs
eqAt [] _ _ _ impossible
-- Most new commands don't change the table,
-- so their cases are trivial. The exception is
-- `Load`, which replaces the table completely.
applyCommand : (t : Table) -> Command t -> Table
applyCommand t PrintSchema = t
applyCommand t PrintSize = t
applyCommand t PrintTable = t
applyCommand t Save = t
applyCommand _ (Load t') = t'
applyCommand _ (New ts) = MkTable ts _ []
applyCommand (MkTable ts n rs) (Prepend r) = MkTable ts _ $ r :: rs
applyCommand t (Get x) = t
applyCommand t Quit = t
applyCommand t (Query ix val) = t
applyCommand (MkTable ts n rs) (Delete x) = case n of
S k => MkTable ts k (deleteAt x rs)
Z => absurd x
-- *** Parsers ***
zipWithIndex : Traversable t => t a -> t (Nat, a)
zipWithIndex = evalState 1 . traverse pairWithIndex
where pairWithIndex : a -> State Nat (Nat,a)
pairWithIndex v = (,v) <$> get <* modify S
fromCSV : String -> List String
fromCSV = forget . split (',' ==)
-- Reads a primitive (non-nullary) type. This is therefore
-- universally quantified over parameter `b`.
-- The only interesting part is the parsing of `finXYZ`,
-- where we `break` the string at the occurrence of
-- the first digit.
readPrim : Nat -> String -> Either Error (ColType0 b)
readPrim _ "i64" = Right I64
readPrim _ "str" = Right Str
readPrim _ "boolean" = Right Boolean
readPrim _ "float" = Right Float
readPrim _ "natural" = Right Natural
readPrim _ "bigint" = Right BigInt
readPrim n s =
  let err = Left $ UnknownType n s
   in case break isDigit s of
        ("fin",r) => maybe err (Right . Finite) $ parsePositive r
        _         => err
-- This is the parser for (possibly nullary) column types.
-- A nullary type is encoded as the corresponding non-nullary
-- type with a question mark appended. We therefore first check
-- for the presence of said question mark at the end of the string.
readColType : Nat -> String -> Either Error ColType
readColType n s = case reverse (unpack s) of
'?' :: t => Optional <$> readPrim n (pack $ reverse t)
_ => readPrim n s
readSchema : String -> Either Error Schema
readSchema = traverse (uncurry readColType) . zipWithIndex . fromCSV
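-- Usage sketch (hypothetical name `exampleSchema`): a trailing question mark
-- marks a column as optional, and `fin10` is split into the prefix "fin" and
-- the bound 10. Assuming the parsers above, this should yield
-- `Right [I64, Optional Str, Finite 10]`.
exampleSchema : Either Error Schema
exampleSchema = readSchema "i64,str?,fin10"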
readSchemaList : List String -> Either Error Schema
readSchemaList [s] = readSchema s
readSchemaList _ = Left ExpectedLine
-- For all except nullary types we can just use the `CSVField`
-- implementation for reading values.
-- For values of nullary types, we treat the empty string specially.
decodeF : (c : ColType0 b) -> String -> Maybe (IdrisType c)
decodeF I64 s = read s
decodeF Str s = read s
decodeF Boolean s = read s
decodeF Float s = read s
decodeF Natural s = read s
decodeF BigInt s = read s
decodeF (Finite k) s = read s
decodeF (Optional y) "" = Just Nothing
decodeF (Optional y) s = Just <$> decodeF y s
decodeField : (row,col : Nat) -> (c : ColType0 b) -> String -> Either Error (IdrisType c)
decodeField row k c s = maybeToEither (InvalidCell row k c s) $ decodeF c s
decodeRow : {ts : _} -> (row : Nat) -> String -> Either Error (Row ts)
decodeRow row s = go 1 ts $ fromCSV s
where go : Nat -> (cs : Schema) -> List String -> Either Error (Row cs)
go k [] [] = Right []
go k [] (_ :: _) = Left $ ExpectedEOI k s
go k (_ :: _) [] = Left $ UnexpectedEOI k s
go k (c :: cs) (s :: ss) = [| decodeField row k c s :: go (S k) cs ss |]
decodeRows : {ts : _} -> List String -> Either Error (List $ Row ts)
decodeRows = traverse (uncurry decodeRow) . zipWithIndex
readFin : {n : _} -> String -> Either Error (Fin n)
readFin s = do
  S k <- maybeToEither (NoNat s) $ parsePositive {a = Nat} s
    | Z => Left $ OutOfBounds n Z
  maybeToEither (OutOfBounds n $ S k) $ natToFin k n
readCommand : (t : Table) -> String -> Either Error (Command t)
readCommand _ "schema" = Right PrintSchema
readCommand _ "size" = Right PrintSize
readCommand _ "table" = Right PrintTable
readCommand _ "quit" = Right Quit
readCommand (MkTable ts n _) s = case words s of
  ["new",    str]    => New     <$> readSchema str
  "add" :: ss        => Prepend <$> decodeRow 1 (unwords ss)
  ["get",    str]    => Get     <$> readFin str
  ["delete", str]    => Delete  <$> readFin str
  "query" :: n :: ss => do
    ix  <- readFin n
    val <- decodeField 1 1 (indexList ts ix) (unwords ss)
    pure $ Query ix val
  _                  => Left $ UnknownCommand s
-- *** Printers ***
toCSV : List String -> String
toCSV = concat . intersperse ","
-- We mark optional types by appending a question
-- mark after the corresponding non-nullary type.
showColType : ColType0 b -> String
showColType I64 = "i64"
showColType Str = "str"
showColType Boolean = "boolean"
showColType Float = "float"
showColType Natural = "natural"
showColType BigInt = "bigint"
showColType (Finite n) = "fin\{show n}"
showColType (Optional t) = showColType t ++ "?"
-- Again, only nullary values are treated specially. This
-- is another case of a dependent pattern match: We use
-- explicit pattern matches on the value to encode based
-- on the type calculated from the `ColType0 b` parameter.
-- There are few languages capable of expressing this as
-- cleanly as Idris does.
encodeField : (t : ColType0 b) -> IdrisType t -> String
encodeField I64 x = show x
encodeField Str x = x
encodeField Boolean True = "t"
encodeField Boolean False = "f"
encodeField Float x = show x
encodeField Natural x = show x
encodeField BigInt x = show x
encodeField (Finite k) x = show x
encodeField (Optional y) (Just v) = encodeField y v
encodeField (Optional y) Nothing = ""
encodeFields : (ts : Schema) -> Row ts -> Vect (length ts) String
encodeFields [] [] = []
encodeFields (c :: cs) (v :: vs) = encodeField c v :: encodeFields cs vs
encodeTable : Table -> String
encodeTable (MkTable ts _ rows) =
unlines . toList $ map (toCSV . toList . encodeFields ts) rows
encodeSchema : Schema -> String
encodeSchema = toCSV . map showColType
-- Pretty printing a table plus header. All cells are right-padded
-- with spaces to adjust their size to the cell with the longest
-- entry for each column.
-- Value `ls` is a `Vect n Nat` holding these lengths.
-- Here is an example of what the output looks like:
--
-- fin100 | boolean | natural | str | bigint?
-- --------------------------------------------------
-- 88 | f | 10 | stefan |
-- 13 | f | 10 | hock | -100
-- 58 | t | 1000 | hello world | -1234
--
-- Ideally, numeric values would be right-aligned, but since this
-- whole exercise is already quite long and complex, I refrained
-- from adding this luxury.
prettyTable : {n : _}
-> (header : Vect n String)
-> (table : Vect m (Vect n String))
-> String
prettyTable h t =
  let -- vector holding the maximal length of each column
      ls  = foldl (zipWith $ \k => max k . length) (replicate n Z) (h::t)
      -- horizontal bar used to separate the header from the rows
      bar = concat . intersperse "---" $ map (`replicate` '-') ls
   in unlines . toList $ line ls h :: bar :: map (line ls) t
  where pad : Nat -> String -> String
        pad v = padRight v ' '
        -- given a vector of lengths, pads each string to the
        -- desired length, separating cells with a vertical bar.
        line : Vect n Nat -> Vect n String -> String
        line lengths = concat . intersperse " | " . zipWith pad lengths
printTable : (cs : List ColType)
-> (rows : Vect n (Row cs))
-> String
printTable cs rows =
let header = map showColType $ fromList cs
table = map (encodeFields cs) rows
in prettyTable header table
allTypes : String
allTypes = concat
. List.intersperse ", "
. map (showColType {b = True})
$ [I64,Str,Boolean,Float]
showError : Error -> String
showError ExpectedLine = """
Error when reading schema.
Expected a single line of content.
"""
showError (UnknownCommand x) = """
Unknown command: \{x}.
Known commands are: schema, size, table, new, add, get, delete, query, save, load, quit.
"""
showError (UnknownType pos x) = """
Unknown type at position \{show pos}: \{x}.
Known types are: \{allTypes}.
"""
showError (InvalidCell row col tpe x) = """
Invalid value at row \{show row}, column \{show col}.
Expected type: \{showColType tpe}.
Value found: \{x}.
"""
showError (ExpectedEOI k x) = """
Expected end of input.
Position: \{show k}
Input: \{x}
"""
showError (UnexpectedEOI k x) = """
Unexpected end of input.
Position: \{show k}
Input: \{x}
"""
showError (OutOfBounds size index) = """
Index out of bounds.
Size of table: \{show size}
Index: \{show index}
Note: Indices start at one.
"""
showError (WriteError path err) = """
Error when writing file \{path}.
Message: \{show err}
"""
showError (ReadError path err) = """
Error when reading file \{path}.
Message: \{show err}
"""
showError (NoNat x) = "Not a natural number: \{x}"
result : (t : Table) -> Command t -> String
result t PrintSchema = "Current schema: \{encodeSchema t.schema}"
result t PrintSize = "Current size: \{show t.size}"
result t PrintTable = "Table:\n\n\{printTable t.schema t.rows}"
result _ Save = "Table written to disk."
result _ (Load t) = "Table loaded. Schema: \{encodeSchema t.schema}"
result _ (New ts) = "Created table. Schema: \{encodeSchema ts}"
result t (Prepend r) = "Row prepended:\n\n\{printTable t.schema [r]}"
result _ (Delete x) = "Deleted row: \{show $ FS x}."
result _ Quit = "Goodbye."
result t (Query ix val) =
let (_ ** rs) = filter (eqAt t.schema ix val) t.rows
in "Result:\n\n\{printTable t.schema rs}"
result t (Get x) =
"Row \{show $ FS x}:\n\n\{printTable t.schema [index x t.rows]}"
-- *** File IO ***
-- We use the partial function `readFile` here for simplicity.
partial
load : (path : String)
-> (decode : List String -> Either Error a)
-> IO (Either Error a)
load path decode = do
  Right ls <- readFile path
    | Left err => pure $ Left (ReadError path err)
  pure $ decode (filter (not . null) $ lines ls)
write : (path : String) -> (content : String) -> IO (Either Error ())
write path content = mapFst (WriteError path) <$> writeFile path content
namespace IOEither
  export
  (>>=) : IO (Either err a) -> (a -> IO (Either err b)) -> IO (Either err b)
  ioa >>= f = Prelude.(>>=) ioa (either (pure . Left) f)
  export
  (>>) : IO (Either err ()) -> IO (Either err a) -> IO (Either err a)
  (>>) x y = x >>= const y
  export
  pure : a -> IO (Either err a)
  pure = Prelude.pure . Right
partial
readCommandIO : (t : Table) -> String -> IO (Either Error (Command t))
readCommandIO t s = case words s of
  ["save", pth] => IOEither.do
    write (pth ++ ".schema") (encodeSchema t.schema)
    write (pth ++ ".csv") (encodeTable t)
    pure Save
  ["load", pth] => IOEither.do
    schema <- load (pth ++ ".schema") readSchemaList
    rows   <- load (pth ++ ".csv") (decodeRows {ts = schema})
    pure . Load $ MkTable schema (length rows) (fromList rows)
  _ => Prelude.pure $ readCommand t s
-- *** Main Loop ***
partial
runProg : Table -> IO ()
runProg t = do
  putStr "Enter a command: "
  str <- getLine
  cmd <- readCommandIO t str
  case cmd of
    Left err   => putStrLn (showError err) >> runProg t
    Right Quit => putStrLn (result t Quit)
    Right cmd  => putStrLn (result t cmd) >>
                  runProg (applyCommand t cmd)
partial
main : IO ()
main = runProg $ MkTable [] _ []
src/Solutions/Eq.idr
module Solutions.Eq
import Data.HList
import Data.Vect
import Decidable.Equality
%default total
data ColType = I64 | Str | Boolean | Float
Schema : Type
Schema = List ColType
IdrisType : ColType -> Type
IdrisType I64 = Int64
IdrisType Str = String
IdrisType Boolean = Bool
IdrisType Float = Double
Row : Schema -> Type
Row = HList . map IdrisType
record Table where
constructor MkTable
schema : Schema
size : Nat
rows : Vect size (Row schema)
data SameColType : (c1, c2 : ColType) -> Type where
SameCT : SameColType c1 c1
--------------------------------------------------------------------------------
-- Equality as a Type
--------------------------------------------------------------------------------
-- 1
sctReflexive : SameColType c1 c1
sctReflexive = SameCT
-- 2
sctSymmetric : SameColType c1 c2 -> SameColType c2 c1
sctSymmetric SameCT = SameCT
-- 3
sctTransitive : SameColType c1 c2 -> SameColType c2 c3 -> SameColType c1 c3
sctTransitive SameCT SameCT = SameCT
-- 4
sctCong : (f : ColType -> a) -> SameColType c1 c2 -> f c1 = f c2
sctCong f SameCT = Refl
-- 5
natEq : (n1,n2 : Nat) -> Maybe (n1 = n2)
natEq 0 0 = Just Refl
natEq (S k) (S j) = (\x => cong S x) <$> natEq k j
natEq (S k) 0 = Nothing
natEq 0 (S _) = Nothing
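-- Usage sketch (hypothetical helper `sameLength`, not part of the exercises):
-- matching on the `Refl` returned by `natEq` lets the type checker treat `n`
-- as `m`, so both vectors can be returned at the same length index.
sameLength : {m, n : Nat} -> Vect m a -> Vect n a -> Maybe (Vect m a, Vect m a)
sameLength xs ys = case natEq m n of
  Just Refl => Just (xs, ys)
  Nothing   => Nothing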
-- 6
appRows : {ts1 : _} -> Row ts1 -> Row ts2 -> Row (ts1 ++ ts2)
appRows {ts1 = []} Nil y = y
appRows {ts1 = _ :: _} (h :: t) y = h :: appRows t y
zip : Table -> Table -> Maybe Table
zip (MkTable s1 m rs1) (MkTable s2 n rs2) = case natEq m n of
Just Refl => Just $ MkTable _ _ (zipWith appRows rs1 rs2)
Nothing => Nothing
--------------------------------------------------------------------------------
-- Programs as Proofs
--------------------------------------------------------------------------------
-- 1
mapIdEither : (ea : Either e a) -> map Prelude.id ea = ea
mapIdEither (Left ve) = Refl
mapIdEither (Right va) = Refl
-- 2
mapIdList : (as : List a) -> map Prelude.id as = as
mapIdList [] = Refl
mapIdList (x :: xs) = cong (x ::) $ mapIdList xs
-- 3
data BaseType = DNABase | RNABase
data Nucleobase : BaseType -> Type where
Adenine : Nucleobase b
Cytosine : Nucleobase b
Guanine : Nucleobase b
Thymine : Nucleobase DNABase
Uracile : Nucleobase RNABase
NucleicAcid : BaseType -> Type
NucleicAcid = List . Nucleobase
complementBase : (b : BaseType) -> Nucleobase b -> Nucleobase b
complementBase DNABase Adenine = Thymine
complementBase RNABase Adenine = Uracile
complementBase _ Cytosine = Guanine
complementBase _ Guanine = Cytosine
complementBase _ Thymine = Adenine
complementBase _ Uracile = Adenine
complement : (b : BaseType) -> NucleicAcid b -> NucleicAcid b
complement b = map (complementBase b)
complementBaseId :  (b  : BaseType)
                 -> (nb : Nucleobase b)
                 -> complementBase b (complementBase b nb) = nb
complementBaseId DNABase Adenine = Refl
complementBaseId RNABase Adenine = Refl
complementBaseId DNABase Cytosine = Refl
complementBaseId RNABase Cytosine = Refl
complementBaseId DNABase Guanine = Refl
complementBaseId RNABase Guanine = Refl
complementBaseId DNABase Thymine = Refl
complementBaseId RNABase Uracile = Refl
complementId :  (b  : BaseType)
             -> (na : NucleicAcid b)
             -> complement b (complement b na) = na
complementId b []        = Refl
complementId b (x :: xs) =
  cong2 (::) (complementBaseId b x) (complementId b xs)
-- 4
replaceVect : Fin n -> a -> Vect n a -> Vect n a
replaceVect FZ v (x :: xs) = v :: xs
replaceVect (FS k) v (x :: xs) = x :: replaceVect k v xs
indexReplace :  (ix : Fin n)
             -> (v : a)
             -> (as : Vect n a)
             -> index ix (replaceVect ix v as) = v
indexReplace FZ v (x :: xs) = Refl
indexReplace (FS k) v (x :: xs) = indexReplace k v xs
-- 5
insertVect : (ix : Fin (S n)) -> a -> Vect n a -> Vect (S n) a
insertVect FZ v xs = v :: xs
insertVect (FS k) v (x :: xs) = x :: insertVect k v xs
indexInsert :  (ix : Fin (S n))
            -> (v : a)
            -> (as : Vect n a)
            -> index ix (insertVect ix v as) = v
indexInsert FZ v xs = Refl
indexInsert (FS k) v (x :: xs) = indexInsert k v xs
--------------------------------------------------------------------------------
-- Into the Void
--------------------------------------------------------------------------------
-- 1
Uninhabited (Vect (S n) Void) where
  uninhabited (_ :: _) impossible
-- 2
Uninhabited a => Uninhabited (Vect (S n) a) where
  uninhabited = uninhabited . head
-- 3
notSym : Not (a = b) -> Not (b = a)
notSym f prf = f $ sym prf
-- 4
notTrans : a = b -> Not (b = c) -> Not (a = c)
notTrans ab f ac = f $ trans (sym ab) ac
-- 5
data Crud : (i : Type) -> (a : Type) -> Type where
  Create : (value : a) -> Crud i a
  Update : (id : i) -> (value : a) -> Crud i a
  Read   : (id : i) -> Crud i a
  Delete : (id : i) -> Crud i a
Uninhabited a => Uninhabited i => Uninhabited (Crud i a) where
  uninhabited (Create value)    = uninhabited value
  uninhabited (Update id value) = uninhabited value
  uninhabited (Read id)         = uninhabited id
  uninhabited (Delete id)       = uninhabited id
-- 6
namespace DecEq
  DecEq ColType where
    decEq I64 I64         = Yes Refl
    decEq I64 Str         = No $ \case Refl impossible
    decEq I64 Boolean     = No $ \case Refl impossible
    decEq I64 Float       = No $ \case Refl impossible
    decEq Str I64         = No $ \case Refl impossible
    decEq Str Str         = Yes Refl
    decEq Str Boolean     = No $ \case Refl impossible
    decEq Str Float       = No $ \case Refl impossible
    decEq Boolean I64     = No $ \case Refl impossible
    decEq Boolean Str     = No $ \case Refl impossible
    decEq Boolean Boolean = Yes Refl
    decEq Boolean Float   = No $ \case Refl impossible
    decEq Float I64       = No $ \case Refl impossible
    decEq Float Str       = No $ \case Refl impossible
    decEq Float Boolean   = No $ \case Refl impossible
    decEq Float Float     = Yes Refl
-- 7
ctNat : ColType -> Nat
ctNat I64 = 0
ctNat Str = 1
ctNat Boolean = 2
ctNat Float = 3
ctNatInjective : (c1,c2 : ColType) -> ctNat c1 = ctNat c2 -> c1 = c2
ctNatInjective I64 I64 Refl = Refl
ctNatInjective Str Str Refl = Refl
ctNatInjective Boolean Boolean Refl = Refl
ctNatInjective Float Float Refl = Refl
DecEq ColType where
  decEq c1 c2 = case decEq (ctNat c1) (ctNat c2) of
    Yes prf    => Yes $ ctNatInjective c1 c2 prf
    No  contra => No $ \x => contra $ cong ctNat x
--------------------------------------------------------------------------------
-- Rewrite Rules
--------------------------------------------------------------------------------
-- 1
psuccRightSucc : (m,n : Nat) -> S (m + n) = m + S n
psuccRightSucc 0 n = Refl
psuccRightSucc (S k) n = cong S $ psuccRightSucc k n
-- 2
minusSelfZero : (n : Nat) -> minus n n = 0
minusSelfZero 0 = Refl
minusSelfZero (S k) = minusSelfZero k
-- 3
minusZero : (n : Nat) -> minus n 0 = n
minusZero 0 = Refl
minusZero (S k) = Refl
-- 4
timesOneLeft : (n : Nat) -> 1 * n = n
timesOneLeft 0 = Refl
timesOneLeft (S k) = cong S $ timesOneLeft k
timesOneRight : (n : Nat) -> n * 1 = n
timesOneRight 0 = Refl
timesOneRight (S k) = cong S $ timesOneRight k
-- 5
plusCommutes : (m,n : Nat) -> m + n = n + m
plusCommutes 0 n = rewrite plusZeroRightNeutral n in Refl
plusCommutes (S k) n =
  rewrite sym (psuccRightSucc n k)
  in cong S (plusCommutes k n)
-- 6
mapOnto : (a -> b) -> Vect k b -> Vect m a -> Vect (k + m) b
mapOnto _ xs [] =
  rewrite plusZeroRightNeutral k in reverse xs
mapOnto {m = S m'} f xs (y :: ys) =
  rewrite sym (plusSuccRightSucc k m') in mapOnto f (f y :: xs) ys
mapTR : (a -> b) -> Vect n a -> Vect n b
mapTR f = mapOnto f Nil
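-- Usage sketch (an addition, not part of the original solution; the
-- name `mapTRExample` is hypothetical). `mapTR` should behave like `map`
-- on vectors: `mapOnto` builds its accumulator in reverse order and
-- reverses it once at the end, so this evaluates to [2,4,6].
mapTRExample : Vect 3 Nat
mapTRExample = mapTR (* 2) [1,2,3]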
-- 7
mapAppend :  (f : a -> b)
          -> (xs : List a)
          -> (ys : List a)
          -> map f (xs ++ ys) = map f xs ++ map f ys
mapAppend f [] ys = Refl
mapAppend f (x :: xs) ys = cong (f x ::) $ mapAppend f xs ys
-- 8
zip2 : Table -> Table -> Maybe Table
zip2 (MkTable s1 m rs1) (MkTable s2 n rs2) = case decEq m n of
  Yes Refl =>
    let rs2 = zipWith (++) rs1 rs2
     in Just $ MkTable (s1 ++ s2) _ (rewrite mapAppend IdrisType s1 s2 in rs2)
  No  _    => Nothing
src/Solutions/Predicates.idr
module Solutions.Predicates
import Data.Vect
import Decidable.Equality
%default total
--------------------------------------------------------------------------------
-- Preconditions
--------------------------------------------------------------------------------
data NonEmpty : (as : List a) -> Type where
  IsNonEmpty : NonEmpty (h :: t)
-- 1
tail : (as : List a) -> (0 _ : NonEmpty as) => List a
tail (_ :: xs) = xs
tail [] impossible
-- 2
concat1 : Semigroup a => (as : List a) -> (0 _ : NonEmpty as) => a
concat1 (h :: t) = foldl (<+>) h t
foldMap1 : Semigroup m => (a -> m) -> (as : List a) -> (0 _ : NonEmpty as) => m
foldMap1 f (h :: t) = foldl (\x,y => x <+> f y) (f h) t
-- 3
maximum : Ord a => (as : List a) -> (0 _ : NonEmpty as) => a
maximum (x :: xs) = foldl max x xs
minimum : Ord a => (as : List a) -> (0 _ : NonEmpty as) => a
minimum (x :: xs) = foldl min x xs
-- 4
data Positive : Nat -> Type where
  IsPositive : Positive (S n)
safeDiv : (m,n : Nat) -> (0 _ : Positive n) => Nat
safeDiv m (S k) = go 0 m k
  where go : (res, rem, sub : Nat) -> Nat
        go res 0       _     = res
        go res (S rem) 0     = go (res + 1) rem k
        go res (S rem) (S x) = go res rem x
-- 5
data IJust : Maybe a -> Type where
  ItIsJust : IJust (Just v)
Uninhabited (IJust Nothing) where
  uninhabited ItIsJust impossible
isJust : (m : Maybe a) -> Dec (IJust m)
isJust Nothing = No uninhabited
isJust (Just x) = Yes ItIsJust
fromJust : (m : Maybe a) -> (0 _ : IJust m) => a
fromJust (Just x) = x
fromJust Nothing impossible
-- 6
data IsLeft : Either e a -> Type where
  ItIsLeft : IsLeft (Left v)
Uninhabited (IsLeft $ Right w) where
  uninhabited ItIsLeft impossible
isLeft : (v : Either e a) -> Dec (IsLeft v)
isLeft (Right _) = No uninhabited
isLeft (Left x) = Yes ItIsLeft
data IsRight : Either e a -> Type where
  ItIsRight : IsRight (Right v)
Uninhabited (IsRight $ Left w) where
  uninhabited ItIsRight impossible
isRight : (v : Either e a) -> Dec (IsRight v)
isRight (Left _) = No uninhabited
isRight (Right x) = Yes ItIsRight
fromLeft : (v : Either e a) -> (0 _ : IsLeft v) => e
fromLeft (Left x) = x
fromLeft (Right x) impossible
fromRight : (v : Either e a) -> (0 _ : IsRight v) => a
fromRight (Right x) = x
fromRight (Left x) impossible
--------------------------------------------------------------------------------
-- Contracts between Values
--------------------------------------------------------------------------------
data ColType = I64 | Str | Boolean | Float
IdrisType : ColType -> Type
IdrisType I64 = Int64
IdrisType Str = String
IdrisType Boolean = Bool
IdrisType Float = Double
record Column where
  constructor MkColumn
  name : String
  type : ColType
infixr 8 :>
(:>) : String -> ColType -> Column
(:>) = MkColumn
Schema : Type
Schema = List Column
data Row : Schema -> Type where
  Nil  : Row []
  (::) :  {0 name : String}
       -> {0 type : ColType}
       -> (v : IdrisType type)
       -> Row ss
       -> Row (name :> type :: ss)
data InSchema :  (name    : String)
              -> (schema  : Schema)
              -> (colType : ColType)
              -> Type where
  [search name schema]
  IsHere  : InSchema n (n :> t :: ss) t
  IsThere : InSchema n ss t -> InSchema n (fld :: ss) t
getAt :  {0 ss : Schema}
      -> (name : String)
      -> Row ss
      -> (prf : InSchema name ss c)
      => IdrisType c
getAt name (v :: vs) {prf = IsHere}    = v
getAt name (_ :: vs) {prf = IsThere p} = getAt name vs
-- 1
Uninhabited (InSchema n [] c) where
  uninhabited IsHere impossible
  uninhabited (IsThere _) impossible
inSchema : (ss : Schema) -> (n : String) -> Dec (c ** InSchema n ss c)
inSchema []                    _ = No $ \(_ ** prf) => uninhabited prf
inSchema (MkColumn cn t :: xs) n = case decEq cn n of
  Yes Refl   => Yes (t ** IsHere)
  No  contra => case inSchema xs n of
    Yes (t ** prf) => Yes (t ** IsThere prf)
    No  contra2    => No $ \case
      (_ ** IsHere)    => contra Refl
      (t ** IsThere p) => contra2 (t ** p)
-- 2
updateAt :  (name : String)
         -> Row ss
         -> (prf : InSchema name ss c)
         => (f : IdrisType c -> IdrisType c)
         -> Row ss
updateAt name (v :: vs) {prf = IsHere}    f = f v :: vs
updateAt name (v :: vs) {prf = IsThere p} f = v :: updateAt name vs f
-- 3
public export
data Elems : (xs,ys : List a) -> Type where
  ENil   : Elems [] ys
  EHere  : Elems xs ys -> Elems (x :: xs) (x :: ys)
  EThere : Elems xs ys -> Elems xs (y :: ys)
extract :  (0 s1 : Schema)
        -> (row : Row s2)
        -> (prf : Elems s1 s2)
        => Row s1
extract []       _         {prf = ENil}     = []
extract (_ :: t) (v :: vs) {prf = EHere x}  = v :: extract t vs
extract s1       (v :: vs) {prf = EThere x} = extract s1 vs
-- 4
namespace AllInSchema
  public export
  data AllInSchema :  (names  : List String)
                   -> (schema : Schema)
                   -> (result : Schema)
                   -> Type where
    [search names schema]
    Nil  : AllInSchema [] s []
    (::) :  InSchema n s c
         -> AllInSchema ns s res
         -> AllInSchema (n :: ns) s (n :> c :: res)
getAll :  {0 ss : Schema}
       -> (names : List String)
       -> Row ss
       -> (prf : AllInSchema names ss res)
       => Row res
getAll []        _   {prf = []}     = []
getAll (n :: ns) row {prf = _ :: _} = getAt n row :: getAll ns row
--------------------------------------------------------------------------------
-- Use Case: Flexible Error Handling
--------------------------------------------------------------------------------
data Has : (v : a) -> (vs : Vect n a) -> Type where
  Z : Has v (v :: vs)
  S : Has v vs -> Has v (w :: vs)
Uninhabited (Has v []) where
  uninhabited Z impossible
  uninhabited (S _) impossible
data Union : Vect n Type -> Type where
  U : {0 ts : _} -> (ix : Has t ts) -> (val : t) -> Union ts
Uninhabited (Union []) where
  uninhabited (U ix _) = absurd ix
0 Err : Vect n Type -> Type -> Type
Err ts t = Either (Union ts) t
-- 1
project : (0 t : Type) -> (prf : Has t ts) => Union ts -> Maybe t
project t {prf = Z} (U Z val) = Just val
project t {prf = S p} (U (S x) val) = project t (U x val)
project t {prf = Z} (U (S x) val) = Nothing
project t {prf = S p} (U Z val) = Nothing
project1 : Union [t] -> t
project1 (U Z val) = val
project1 (U (S x) val) impossible
safe : Err [] a -> a
safe (Right x) = x
safe (Left x) = absurd x
-- 2
weakenHas : Has t ts -> Has t (ts ++ ss)
weakenHas Z = Z
weakenHas (S x) = S (weakenHas x)
weaken : Union ts -> Union (ts ++ ss)
weaken (U ix val) = U (weakenHas ix) val
extendHas : {m : _} -> {0 pre : Vect m a} -> Has t ts -> Has t (pre ++ ts)
extendHas {m = Z} {pre = []} x = x
extendHas {m = S p} {pre = _ :: _} x = S (extendHas x)
extend : {m : _} -> {0 pre : Vect m _} -> Union ts -> Union (pre ++ ts)
extend (U ix val) = U (extendHas ix) val
-- 3
0 Errs : Vect m Type -> Vect n Type -> Type
Errs [] _ = ()
Errs (x :: xs) ts = (Has x ts, Errs xs ts)
inject : Has t ts => (v : t) -> Union ts
inject v = U %search v
embed : (prf : Errs ts ss) => Union ts -> Union ss
embed (U Z val) = inject val
embed (U (S x) val) = embed (U x val)
-- 4
data Rem : (v : a) -> (vs : Vect (S n) a) -> (rem : Vect n a) -> Type where
  [search v vs]
  RZ : Rem v (v :: rem) rem
  RS : Rem v vs rem -> Rem v (w :: vs) (w :: rem)
split : (prf : Rem t ts rem) => Union ts -> Either t (Union rem)
split {prf = RZ}   (U Z     val) = Left val
split {prf = RZ}   (U (S x) val) = Right (U x val)
split {prf = RS p} (U Z     val) = Right (U Z val)
split {prf = RS p} (U (S x) val) = case split {prf = p} (U x val) of
  Left vt        => Left vt
  Right (U ix y) => Right $ U (S ix) y
handle :  Applicative f
       => Rem t ts rem
       => (h : t -> f (Err rem a))
       -> Err ts a
       -> f (Err rem a)
handle h (Left x)  = case split x of
  Left v    => h v
  Right err => pure $ Left err
handle _ (Right x) = pure $ Right x
--------------------------------------------------------------------------------
-- Tests
--------------------------------------------------------------------------------
EmployeeSchema : Schema
EmployeeSchema = [ "firstName"  :> Str
                 , "lastName"   :> Str
                 , "email"      :> Str
                 , "age"        :> I64
                 , "salary"     :> Float
                 , "management" :> Boolean
                 ]
0 Employee : Type
Employee = Row EmployeeSchema
hock : Employee
hock = [ "Stefan", "Höck", "hock@foo.com", 46, 5443.2, False ]
shoeck : String
shoeck = getAt "firstName" hock ++ " " ++ getAt "lastName" hock
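-- Usage sketch (an addition, not part of the original tests; the name
-- `olderHock` is hypothetical). `updateAt` from exercise 2 modifies a
-- single column. The `InSchema` proof for "age" is expected to be found
-- by proof search, which fixes the column type to `I64` and hence the
-- type of the update function to `Int64 -> Int64`.
olderHock : Employee
olderHock = updateAt "age" hock (+ 1)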
shoeck2 : String
shoeck2 = case getAll ["firstName", "lastName", "age"] hock of
  [fn,ln,a] => "\{fn} \{ln}: \{show a} years old."
embedTest :  Err [Nat,Bits8] a
          -> Err [String, Bits8, Int32, Nat] a
embedTest = mapFst embed
src/Solutions/Prim.idr
module Solutions.Prim
import Data.Bits
import Data.DPair
import Data.List
import Data.Maybe
import Data.SnocList
import Decidable.Equality
%default total
--------------------------------------------------------------------------------
-- Working with Strings
--------------------------------------------------------------------------------
-- 1
map : (Char -> Char) -> String -> String
map f = pack . map f . unpack
filter : (Char -> Bool) -> String -> String
filter f = pack . filter f . unpack
mapMaybe : (Char -> Maybe Char) -> String -> String
mapMaybe f = pack . mapMaybe f . unpack
-- 2
foldl : (a -> Char -> a) -> a -> String -> a
foldl f v = foldl f v . unpack
foldMap : Monoid m => (Char -> m) -> String -> m
foldMap f = foldMap f . unpack
-- 3
traverse : Applicative f => (Char -> f Char) -> String -> f String
traverse fun = map pack . traverse fun . unpack
-- 4
(>>=) : String -> (Char -> String) -> String
str >>= f = foldMap f $ unpack str
--------------------------------------------------------------------------------
-- Integers
--------------------------------------------------------------------------------
-- 1
record And a where
  constructor MkAnd
  value : a
Bits a => Semigroup (And a) where
  MkAnd x <+> MkAnd y = MkAnd $ x .&. y
Bits a => Monoid (And a) where
  neutral = MkAnd oneBits
-- 2
record Or a where
  constructor MkOr
  value : a
Bits a => Semigroup (Or a) where
  MkOr x <+> MkOr y = MkOr $ x .|. y
Bits a => Monoid (Or a) where
  neutral = MkOr zeroBits
-- 3
even : Bits64 -> Bool
even x = not $ testBit x 0
-- 4
binChar : Bits64 -> Char
binChar x = if testBit x 0 then '1' else '0'
toBin : Bits64 -> String
toBin 0 = "0"
toBin v = go [] v
  where go : List Char -> Bits64 -> String
        go cs 0 = pack cs
        go cs v = go (binChar v :: cs) (assert_smaller v $ v `shiftR` 1)
-- 5
-- Note: We know that `x .&. 15` must be a value in the range
-- [0,15] (unless there is a bug in the backend we use), but since
-- `Bits64` is a primitive, Idris can't know this. We therefore
-- fail with a runtime crash in the impossible case, but annotate the
-- call to `idris_crash` with `assert_total` (otherwise, `hexChar` would
-- be a partial function).
hexChar : Bits64 -> Char
hexChar x = case x .&. 15 of
  0  => '0'
  1  => '1'
  2  => '2'
  3  => '3'
  4  => '4'
  5  => '5'
  6  => '6'
  7  => '7'
  8  => '8'
  9  => '9'
  10 => 'a'
  11 => 'b'
  12 => 'c'
  13 => 'd'
  14 => 'e'
  15 => 'f'
  x  => assert_total $ idris_crash "IMPOSSIBLE: Invalid hex digit (\{show x})"
toHex : Bits64 -> String
toHex 0 = "0"
toHex v = go [] v
  where go : List Char -> Bits64 -> String
        go cs 0 = pack cs
        go cs v = go (hexChar v :: cs) (assert_smaller v $ v `shiftR` 4)
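-- Usage sketch (an addition, not part of the original solution; the
-- name `toHexExample` is hypothetical). Each step takes the lowest four
-- bits as one hex digit and shifts right by four. 255 is 0xff, so this
-- evaluates to "ff".
toHexExample : String
toHexExample = toHex 255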
--------------------------------------------------------------------------------
-- Refined Primitives
--------------------------------------------------------------------------------
data Dec0 : (prop : Type) -> Type where
  Yes0 : (0 prf : prop) -> Dec0 prop
  No0  : (0 contra : prop -> Void) -> Dec0 prop
data IsYes0 : (d : Dec0 prop) -> Type where
  ItIsYes0 : IsYes0 (Yes0 prf)
0 fromYes0 : (d : Dec0 prop) -> (0 prf : IsYes0 d) => prop
fromYes0 (Yes0 x) = x
fromYes0 (No0 contra) impossible
interface Decidable (0 a : Type) (0 p : a -> Type) | p where
  decide : (v : a) -> Dec0 (p v)
decideOn : (0 p : a -> Type) -> Decidable a p => (v : a) -> Dec0 (p v)
decideOn _ = decide
test0 : (b : Bool) -> Dec0 (b === True)
test0 True = Yes0 Refl
test0 False = No0 absurd
0 unsafeDecideOn : (0 p : a -> Type) -> Decidable a p => (v : a) -> p v
unsafeDecideOn p v = case decideOn p v of
  Yes0 prf => prf
  No0  _   =>
    assert_total $ idris_crash "Unexpected refinement failure in `unsafeDecideOn`"
0 safeDecideOn :  (0 p : a -> Type)
               -> Decidable a p
               => (v : a)
               -> (0 prf : IsYes0 (decideOn p v))
               => p v
safeDecideOn p v = fromYes0 $ decideOn p v
-- 1
{x : a} -> DecEq a => Decidable a (Equal x) where
  decide v = case decEq x v of
    Yes prf    => Yes0 prf
    No  contra => No0 contra
-- 2
data Neg : (p : a -> Type) -> a -> Type where
  IsNot : {0 p : a -> Type} -> (contra : p v -> Void) -> Neg p v
Decidable a p => Decidable a (Neg p) where
  decide v = case decideOn p v of
    Yes0 prf    => No0 $ \(IsNot contra) => contra prf
    No0  contra => Yes0 $ IsNot contra
-- 3
data (&&) : (p,q : a -> Type) -> a -> Type where
  Both : {0 p,q : a -> Type} -> (prf1 : p v) -> (prf2 : q v) -> (&&) p q v
Decidable a p => Decidable a q => Decidable a (p && q) where
  decide v = case decideOn p v of
    Yes0 prf1 => case decideOn q v of
      Yes0 prf2   => Yes0 $ Both prf1 prf2
      No0  contra => No0 $ \(Both _ prf2) => contra prf2
    No0 contra => No0 $ \(Both prf1 _) => contra prf1
-- 4
data (||) : (p,q : a -> Type) -> a -> Type where
  L : {0 p,q : a -> Type} -> (prf : p v) -> (p || q) v
  R : {0 p,q : a -> Type} -> (prf : q v) -> (p || q) v
Decidable a p => Decidable a q => Decidable a (p || q) where
  decide v = case decideOn p v of
    Yes0 prf1   => Yes0 $ L prf1
    No0 contra1 => case decideOn q v of
      Yes0 prf2   => Yes0 $ R prf2
      No0 contra2 => No0 $ \case
        L prf => contra1 prf
        R prf => contra2 prf
-- 5
negOr : Neg (p || q) v -> (Neg p && Neg q) v
negOr (IsNot contra) = Both (IsNot $ contra . L) (IsNot $ contra . R)
andNeg : (Neg p && Neg q) v -> Neg (p || q) v
andNeg (Both (IsNot c1) (IsNot c2)) =
  IsNot $ \case
    L p1 => c1 p1
    R p2 => c2 p2
orNeg : (Neg p || Neg q) v -> Neg (p && q) v
orNeg (L (IsNot contra)) = IsNot $ \(Both p1 _) => contra p1
orNeg (R (IsNot contra)) = IsNot $ \(Both _ p2) => contra p2
0 negAnd :  Decidable a p
         => Decidable a q
         => Neg (p && q) v
         -> (Neg p || Neg q) v
negAnd (IsNot contra) = case decideOn p v of
  Yes0 p1 => case decideOn q v of
    Yes0 p2 => void (contra $ Both p1 p2)
    No0  c  => R $ IsNot c
  No0 c => L $ IsNot c
-- 6
data (<=) : (m,n : Nat) -> Type where
  ZLTE : 0 <= n
  SLTE : m <= n -> S m <= S n
(>=) : (m,n : Nat) -> Type
m >= n = n <= m
(<) : (m,n : Nat) -> Type
m < n = S m <= n
(>) : (m,n : Nat) -> Type
m > n = n < m
LessThan : (m,n : Nat) -> Type
LessThan m = (< m)
To : (m,n : Nat) -> Type
To m = (<= m)
GreaterThan : (m,n : Nat) -> Type
GreaterThan m = (> m)
From : (m,n : Nat) -> Type
From m = (>= m)
FromTo : (lower,upper : Nat) -> Nat -> Type
FromTo l u = From l && To u
Between : (lower,upper : Nat) -> Nat -> Type
Between l u = GreaterThan l && LessThan u
Uninhabited (S n <= 0) where
  uninhabited ZLTE impossible
  uninhabited (SLTE _) impossible
0 fromLTE : (n1,n2 : Nat) -> (n1 <= n2) === True -> n1 <= n2
fromLTE 0 n2 prf = ZLTE
fromLTE (S k) (S j) prf = SLTE $ fromLTE k j prf
fromLTE (S k) 0 prf = absurd prf
0 toLTE : (n1,n2 : Nat) -> n1 <= n2 -> (n1 <= n2) === True
toLTE 0 0 _ = Refl
toLTE 0 (S k) _ = Refl
toLTE (S k) (S j) (SLTE x) = toLTE k j x
toLTE (S k) 0 x = absurd x
{n : Nat} -> Decidable Nat (<= n) where
  decide m = case test0 (m <= n) of
    Yes0 prf    => Yes0 $ fromLTE m n prf
    No0  contra => No0 $ contra . toLTE m n
{m : Nat} -> Decidable Nat (m <=) where
  decide n = case test0 (m <= n) of
    Yes0 prf    => Yes0 $ fromLTE m n prf
    No0  contra => No0 $ contra . toLTE m n
-- 7
0 refl : {n : Nat} -> n <= n
refl {n = 0} = ZLTE
refl {n = S _} = SLTE refl
0 trans : {l,m,n : Nat} -> l <= m -> m <= n -> l <= n
trans {l = 0} _ _ = ZLTE
trans {l = S _} (SLTE x) (SLTE y) = SLTE $ trans x y
0 (>>) : {l,m,n : Nat} -> l <= m -> m <= n -> l <= n
(>>) = trans
-- 8
0 toIsSucc : {n : Nat} -> n > 0 -> IsSucc n
toIsSucc {n = S _} (SLTE _) = ItIsSucc
0 fromIsSucc : {n : Nat} -> IsSucc n -> n > 0
fromIsSucc {n = S _} ItIsSucc = SLTE ZLTE
-- 9
safeDiv : (x,y : Bits64) -> (0 prf : cast y > 0) => Bits64
safeDiv x y = x `div` y
safeMod :  (x,y : Bits64)
        -> (0 prf : cast y > 0)
        => Subset Bits64 (\v => cast v < cast y)
safeMod x y = Element (x `mod` y) (unsafeDecideOn (<= cast y) _)
-- 10
digit : (v : Bits64) -> (0 prf : cast v < 16) => Char
digit 0 = '0'
digit 1 = '1'
digit 2 = '2'
digit 3 = '3'
digit 4 = '4'
digit 5 = '5'
digit 6 = '6'
digit 7 = '7'
digit 8 = '8'
digit 9 = '9'
digit 10 = 'a'
digit 11 = 'b'
digit 12 = 'c'
digit 13 = 'd'
digit 14 = 'e'
digit 15 = 'f'
digit x = assert_total $ idris_crash "IMPOSSIBLE: Invalid digit (\{show x})"
record Base where
  constructor MkBase
  value : Bits64
  0 prf : FromTo 2 16 (cast value)
base : Bits64 -> Maybe Base
base v = case decideOn (FromTo 2 16) (cast v) of
  Yes0 prf => Just $ MkBase v prf
  No0  _   => Nothing
namespace Base
  public export
  fromInteger : (v : Integer) -> {auto 0 _ : IsJust (base $ cast v)} -> Base
  fromInteger v = fromJust $ base (cast v)
digits : Bits64 -> Base -> String
digits 0 _ = "0"
digits x (MkBase b $ Both p1 p2) = go [] x
  where go : List Char -> Bits64 -> String
        go cs 0 = pack cs
        go cs v =
          let Element d p = (v `safeMod` b) {prf = %search >> p1}
              v2          = (v `safeDiv` b) {prf = %search >> p1}
           in go (digit d {prf = p >> p2} :: cs) (assert_smaller v v2)
-- 11
data CharOrd : (p : Nat -> Type) -> Char -> Type where
  IsCharOrd : {0 p : Nat -> Type} -> (prf : p (cast c)) -> CharOrd p c
Decidable Nat p => Decidable Char (CharOrd p) where
  decide c = case decideOn p (cast c) of
    Yes0 prf    => Yes0 $ IsCharOrd prf
    No0  contra => No0 $ \(IsCharOrd prf) => contra prf
-- 12
IsAscii : Char -> Type
IsAscii = CharOrd (< 128)
IsLatin : Char -> Type
IsLatin = CharOrd (< 255)
IsUpper : Char -> Type
IsUpper = CharOrd (FromTo (cast 'A') (cast 'Z'))
IsLower : Char -> Type
IsLower = CharOrd (FromTo (cast 'a') (cast 'z'))
IsAlpha : Char -> Type
IsAlpha = IsUpper || IsLower
IsDigit : Char -> Type
IsDigit = CharOrd (FromTo (cast '0') (cast '9'))
IsAlphaNum : Char -> Type
IsAlphaNum = IsAlpha || IsDigit
IsControl : Char -> Type
IsControl = CharOrd (FromTo 0 31 || FromTo 127 159)
IsPlainAscii : Char -> Type
IsPlainAscii = IsAscii && Neg IsControl
IsPlainLatin : Char -> Type
IsPlainLatin = IsLatin && Neg IsControl
-- 12
0 plainToAscii : IsPlainAscii c -> IsAscii c
plainToAscii (Both prf1 _) = prf1
0 digitToAlphaNum : IsDigit c -> IsAlphaNum c
digitToAlphaNum = R
0 alphaToAlphaNum : IsAlpha c -> IsAlphaNum c
alphaToAlphaNum = L
0 lowerToAlpha : IsLower c -> IsAlpha c
lowerToAlpha = R
0 upperToAlpha : IsUpper c -> IsAlpha c
upperToAlpha = L
0 lowerToAlphaNum : IsLower c -> IsAlphaNum c
lowerToAlphaNum = L . R
0 upperToAlphaNum : IsUpper c -> IsAlphaNum c
upperToAlphaNum = L . L
0 asciiToLatin : IsAscii c -> IsLatin c
asciiToLatin (IsCharOrd x) = IsCharOrd (trans x $ safeDecideOn _ _)
0 plainAsciiToPlainLatin : IsPlainAscii c -> IsPlainLatin c
plainAsciiToPlainLatin (Both x y) = Both (asciiToLatin x) y
-- 13
data Head : (p : a -> Type) -> List a -> Type where
  AtHead : {0 p : a -> Type} -> (0 prf : p v) -> Head p (v :: vs)
Uninhabited (Head p []) where
  uninhabited (AtHead _) impossible
Decidable a p => Decidable (List a) (Head p) where
  decide []        = No0 $ \prf => absurd prf
  decide (x :: xs) = case decide {p} x of
    Yes0 prf    => Yes0 $ AtHead prf
    No0  contra => No0 $ \(AtHead prf) => contra prf
-- 14
data Length : (p : Nat -> Type) -> List a -> Type where
  HasLength :  {0 p : Nat -> Type}
            -> (0 prf : p (List.length vs))
            -> Length p vs
Decidable Nat p => Decidable (List a) (Length p) where
  decide vs = case decideOn p (length vs) of
    Yes0 prf    => Yes0 $ HasLength prf
    No0  contra => No0 $ \(HasLength prf) => contra prf
-- 15
data All : (p : a -> Type) -> (as : List a) -> Type where
  Nil  : All p []
  (::) :  {0 p : a -> Type}
       -> (0 h : p v)
       -> (0 t : All p vs)
       -> All p (v :: vs)
data AllSnoc : (p : a -> Type) -> (as : SnocList a) -> Type where
  Lin  : AllSnoc p [<]
  (:<) :  {0 p : a -> Type}
       -> (0 i : AllSnoc p vs)
       -> (0 l : p v)
       -> AllSnoc p (vs :< v)
0 head : All p (x :: xs) -> p x
head (h :: _) = h
0 (<>>) : AllSnoc p sx -> All p xs -> All p (sx <>> xs)
(<>>) [<] y = y
(<>>) (i :< l) y = i <>> l :: y
0 suffix : (sx : SnocList a) -> All p (sx <>> xs) -> All p xs
suffix [<] x = x
suffix (sx :< y) x = let (_ :: t) = suffix {xs = y :: xs} sx x in t
0 notInner :  {0 p : a -> Type}
           -> (sx : SnocList a)
           -> (0 contra : (prf : p x) -> Void)
           -> (0 prfs : All p (sx <>> x :: xs))
           -> Void
notInner sx contra prfs = let prfs2 = suffix sx prfs in contra (head prfs2)
allTR : {0 p : a -> Type} -> Decidable a p => (as : List a) -> Dec0 (All p as)
allTR as = go Lin as
  where go : (0 sp : AllSnoc p sx) -> (xs : List a) -> Dec0 (All p (sx <>> xs))
        go sp []        = Yes0 $ sp <>> Nil
        go sp (x :: xs) = case decide {p} x of
          Yes0 prf    => go (sp :< prf) xs
          No0  contra => No0 $ \prf => notInner sx contra prf
Decidable a p => Decidable (List a) (All p) where decide = allTR
-- 16
0 IsIdentChar : Char -> Type
IsIdentChar = IsAlphaNum || Equal '_'
0 IdentChars : List Char -> Type
IdentChars = Length (<= 100) && Head IsAlpha && All IsIdentChar
record Identifier where
  constructor MkIdentifier
  value : String
  0 prf : IdentChars (unpack value)
identifier : String -> Maybe Identifier
identifier s = case decideOn IdentChars (unpack s) of
  Yes0 prf => Just $ MkIdentifier s prf
  No0  _   => Nothing
namespace Identifier
  public export
  fromString :  (s : String)
             -> (0 _ : IsYes0 (decideOn IdentChars (unpack s)))
             => Identifier
  fromString s = MkIdentifier s (fromYes0 $ decide (unpack s))
testIdent : Identifier
testIdent = "fooBar_123"
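-- Usage sketch (an addition, not part of the original solution; the
-- name `notAnIdent` is hypothetical). For strings that are only known
-- at runtime, the smart constructor `identifier` runs the same check
-- and returns a `Maybe`. Here the `Head IsAlpha` condition fails
-- because '1' is not a letter, so this evaluates to `Nothing`.
notAnIdent : Maybe Identifier
notAnIdent = identifier "1abc"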