The Idris 2 Programming Language

by Stefan Höck, Nathan McCarty, and others

Welcome to the community Idris 2 tutorial! This book aims to be a comprehensive resource for learning the Idris 2 programming language.

This book is rendered from a collection of Idris source files structured as a normal Idris project, which you can download and play around with.

Introduction

Many of the Markdown files making up this book (those with a .md file extension) are literate Idris files, consisting of a mixture of Markdown and Idris code, and can be type checked and built just like regular code by the Idris compiler. You can identify a document as a literate Idris document if it contains a module declaration, like so:

module Tutorial.Intro

Even though this file (src/Tutorial/Intro.md) has no actual code in it, by including that module declaration, it qualifies as a literate Idris file. A module name consists of a list of identifiers separated by dots and must reflect the folder structure plus the module file's name, starting from the source directory. For instance, since this file's path from the root of the src directory is Tutorial/Intro.md, its module name must be Tutorial.Intro.

Before starting this book, make sure you have the Idris compiler installed on your computer. While it is technically possible to work through this book without it, we recommend that you have the pack package manager installed and have a skeleton package set up as described in the Getting Started with pack and Idris2 appendix, as such a setup is assumed.

Later in the book, you will encounter various exercises. The solutions to these exercises can be found as regular Idris files in the src/Solutions directory of the git repository, or in syntax-highlighted form in the "Exercise Solutions" section at the bottom of the navigation sidebar.

About the Idris Programming Language

Idris is a pure, dependently typed, total functional programming language.

Let's break that down and explore what each of those terms means on its own.

Functional Programming

In functional programming languages, functions are first-class constructs, meaning that they can be assigned to variables, passed as arguments to other functions, and returned as results from functions, just like any other value in the language. Unlike in, for instance, object-oriented languages, functions are the main form of abstraction in functional programming.

Whenever we find a common pattern or (almost) identical code in several parts of a project, we try to implement an abstraction over it to avoid writing the same code multiple times. In functional programming, we do this by introducing one or more new functions implementing the required behavior, often trying to be as general as possible to maximize the versatility and re-usability of our functions.

Functional programming languages are concerned with the evaluation of functions, unlike imperative languages, which are concerned with the execution of statements.

Pure Functional Programming

Pure functional programming languages come with an additional important guarantee:

Functions don't have side effects, like writing to a file or mutating global state. They can only compute a result from their arguments, possibly by invoking other pure functions, and nothing else. Given the same input, a pure function will always generate the same output. This property is known as referential transparency.

Pure functions have several advantages:

  • They are easy to test by specifying (possibly randomly generated) sets of input arguments alongside the expected results.

  • They are thread-safe. Since they don't mutate global state, they can be used in several computations running in parallel without interfering with each other.

There are, of course, also some disadvantages:

  • Some algorithms are hard to implement efficiently with only pure functions.

  • Writing programs that actually do something (have some observable effect) is a bit tricky, but certainly possible.

Dependent Types

Idris is a strongly, statically typed programming language. Every expression is given a type (for instance: integer, list of strings, boolean, function from integer to boolean, etc.), and types are verified at compile time to rule out certain common programming errors.

For instance, if a function expects an argument of type String (a sequence of unicode characters, such as "Hello123"), it is a type error to invoke this function with an argument of type Integer, and Idris will refuse to compile such an ill-typed program.

Being statically typed means that Idris will catch type errors at compile time, before it generates an executable program that can be run. This stands in contrast with dynamically typed languages such as Python, which check for type errors at runtime, while a program is already being executed. It is the goal of statically typed languages to catch as many type errors as possible before there even is a program that can be run.

Furthermore, Idris is dependently typed, which is one of its most characteristic properties in comparison to other programming languages. In Idris, types are first class: Types can be passed as arguments to functions, and functions can return types as their results. Types can also depend on other values. As one example, the return type of a function can depend on the value of one of its arguments. This is a quite abstract statement that may be difficult to grasp at first, but we will be exploring its meaning and the profound impact it has on programming through example as we move through this book.
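
To make the idea of a return type depending on an argument slightly more concrete, here is a minimal sketch. The names ResultType and describe are hypothetical and not part of this book's source code, and the syntax used will be explained in later chapters:

```idris
-- A function computing a type from a value (hypothetical example).
ResultType : Bool -> Type
ResultType True  = Integer
ResultType False = String

-- The return type of `describe` depends on the value of its argument `b`.
describe : (b : Bool) -> ResultType b
describe True  = 42
describe False = "forty-two"
```

Here, describe True is an Integer, while describe False is a String, and Idris checks both cases at compile time.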

Total Functions

A total function is a pure function which is guaranteed to return a value of its return type for every possible set of inputs in a finite number of computational steps. A total function will never fail with an exception or loop infinitely, although it can still take arbitrarily long to compute its result.

Idris comes with a totality checker built-in, which allows us to verify that the functions we write are provably total. Totality in Idris is opt-in, as checking the totality of an arbitrary computer program is undecidable in the general case (a dilemma you may recognize as the halting problem). However, if we annotate a function with the total keyword, and the totality checker is unable to verify that the function is, indeed, total, Idris will fail with a type error. Notably, failing to determine a function is total is not the same as judging the function to be non-total.
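
As a rough sketch of what this looks like in practice (the functions double and forever are hypothetical examples, not taken from this book's source code), consider the following two definitions:

```idris
-- Accepted as total: every recursive call is on a structurally
-- smaller argument, so the function must terminate.
total
double : Nat -> Nat
double Z     = Z
double (S k) = S (S (double k))

-- Covers all inputs but never terminates. Replacing `covering`
-- with `total` here should make Idris reject the definition.
covering
forever : Nat -> Nat
forever n = forever (S n)
```

The covering annotation only asserts that all possible inputs are handled, which is a weaker guarantee than totality.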

Using the REPL

Idris comes with a REPL (Read Evaluate Print Loop), which is useful for tinkering with small ideas, and for quickly experimenting with the code we just wrote. To start a REPL session, run the following command in a terminal:

pack repl

Idris should now be ready to accept your commands:

     ____    __     _         ___
    /  _/___/ /____(_)____   |__ \
    / // __  / ___/ / ___/   __/ /     Version 0.5.1-3c532ea35
  _/ // /_/ / /  / (__  )   / __/      https://www.idris-lang.org
 /___/\__,_/_/  /_/____/   /____/      Type :? for help

Welcome to Idris 2.  Enjoy yourself!
Main>

We can go ahead and enter some simple arithmetic expressions. Idris will evaluate them and print the result:

Main> 2 * 4
8
Main> 3 * (7 + 100)
321

Since every expression in Idris has a type, we might want to inspect those as well:

Main> :t 2
2 : Integer

:t is a command specific to the Idris REPL (it is not part of the Idris programming language), and it is used to inspect the type of an expression:

Main> :t 2 * 4
2 * 4 : Integer

Whenever we perform calculations involving integer literals without explicitly specifying the types involved, Idris will assume the Integer type by default. Integer is an arbitrary precision (there is no hard-coded maximum value) signed integer type. It is one of the primitive types built into the language. Other primitives include fixed precision signed and unsigned integral types (Bits8, Bits16, Bits32, Bits64, Int8, Int16, Int32, and Int64), double precision (64 bit) floating point numbers (Double), Unicode characters (Char) and strings of Unicode characters (String).
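
The default only applies when nothing else determines the type. As a small sketch (with hypothetical names, using definition syntax that is introduced later in this chapter), the same literal expression can be given different primitive types by annotating it:

```idris
-- The same arithmetic expression at three different primitive types.
asInteger : Integer
asInteger = 2 * 4

asBits8 : Bits8
asBits8 = 2 * 4

asDouble : Double
asDouble = 2 * 4
```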

A First Idris Program

module Tutorial.Intro.FirstIdrisProgram

While we will often start with the REPL for tinkering with small parts of the Idris language, for reading some documentation, or for inspecting the content of an Idris module, let's go ahead and write a minimal Idris program to get started with the language.

Here comes the mandatory Hello World:

main : IO ()
main = putStrLn "Hello World!"

We will inspect the code above in some detail in a moment, but first we'd like to compile and run it. If you have checked out this book's source code, you can run the following from the root directory:

pack -o hello exec src/Tutorial/Intro/FirstIdrisProgram.md

This will create an executable called hello in the build/exec directory, which can be invoked from the command-line like so (without the dollar prefix; this is used here to distinguish the terminal command from its output):

$ build/exec/hello
Hello World!

The pack program requires an .ipkg file to be in scope (in the current directory or one of its parent directories), which provides other settings like the source directory to use (src in our case). The optional -o option provides a name to use for the executable to be generated. Pack comes up with a name of its own if this is not provided. Type pack help for a list of available command-line options and commands, and pack help <cmd> for help with a specific command.

You can also load this source file in a REPL session and invoke function main from there:

pack repl src/Tutorial/Intro/FirstIdrisProgram.md
Tutorial.Intro> :exec main
Hello World!

Go ahead and try both ways of building and running main on your system!

The Shape of an Idris Definition

module Tutorial.Intro.ShapeOfADef

Now that we have executed our first Idris program, let's talk a bit more about the code we had to write to define it.

A typical top level function in Idris consists of three things:

  1. The function's name (main in our case)
  2. Its type (IO ())
  3. Its implementation (putStrLn "Hello World!")

Let's explore these parts through a couple of examples, starting out by defining a constant for the largest unsigned 8 bit integer:

maxBits8 : Bits8
maxBits8 = 255

The first line can be read as:

We'd like to declare a (nullary, or zero argument) function maxBits8. It is of type Bits8.

This is called the function declaration: we declare that there shall be a function of the given name and type.

The second line reads:

The result of invoking maxBits8 should be 255. (As you can see, we can use integer literals for other integral types and not just Integer.)

This is called the function definition: the function maxBits8 should behave as described here when being evaluated.

We can inspect this at the REPL. Load this source file into an Idris REPL (as described in the previous section, this time using src/Tutorial/Intro/ShapeOfADef.md as the source file), and try running the following tests:

Tutorial.Intro> maxBits8
255
Tutorial.Intro> :t maxBits8
Tutorial.Intro.maxBits8 : Bits8

We can also use maxBits8 as part of another expression:

Tutorial.Intro> maxBits8 - 100
155

We previously described maxBits8 as a nullary function, which is just a fancy word for a constant. Let's write and test our first real function:

distanceToMax : Bits8 -> Bits8
distanceToMax n = maxBits8 - n

This introduces some new syntax and a new kind of type: Function types.

distanceToMax : Bits8 -> Bits8 can be read as:

distanceToMax is a function of one argument, with type Bits8, which returns a result of type Bits8.

In the implementation, the argument is given a local identifier (a fancy term for "name") n, which is then used in the calculation on the right hand side. Go ahead and try this function at the REPL:

Tutorial.Intro> distanceToMax 12
243
Tutorial.Intro> :t distanceToMax
Tutorial.Intro.distanceToMax : Bits8 -> Bits8
Tutorial.Intro> :t distanceToMax 12
distanceToMax 12 : Bits8

As a final example, let's implement a function that calculates the square of an integer:

square : Integer -> Integer
square n = n * n

We now learn a very important aspect of programming in Idris: Idris is a statically typed programming language. We are not allowed to freely mix types as we please. Doing so will result in an error message from the type checker (which is part of Idris's compilation process). For instance, if we try the following at the REPL, we will get a type error:

Tutorial.Intro> square maxBits8
Error: ...

This is because square expects an argument of type Integer, but maxBits8 is of type Bits8. Many primitive types can be converted back and forth between each other (sometimes with the risk of loss of precision) using function cast (we will cover cast in further detail in the section on Interfaces in the Prelude):

Tutorial.Intro> square (cast maxBits8)
65025

Notice that the above result is much larger than maxBits8. This is because maxBits8 is first converted to an Integer of the same value, which is then squared. If we instead squared maxBits8 directly, the result would be truncated to still fit in the range of valid Bits8s:

Tutorial.Intro> maxBits8 * maxBits8
1
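
The two directions of conversion can be sketched as follows. The names widened, tooBig, and narrowed are hypothetical, and maxBits8 is repeated so the snippet stands on its own:

```idris
maxBits8 : Bits8
maxBits8 = 255

-- Converting to the larger type keeps the value: every Bits8 fits
-- into an Integer.
widened : Integer
widened = cast maxBits8

tooBig : Integer
tooBig = 300

-- Converting to the smaller type compiles fine, but 300 is outside
-- the range of Bits8, so the result can no longer be 300.
narrowed : Bits8
narrowed = cast tooBig
```

Try evaluating narrowed at the REPL to see what happens to values that do not fit in the target type.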

Where to get Help

There are several resources available online and in print, where you can find help and documentation about the Idris programming language. Here is a non-comprehensive list:

  • Type-Driven Development with Idris

    The Idris book! This describes in great detail the core concepts for using Idris and dependent types to write robust and concise code. It uses Idris 1 in its examples, so parts of it have to be slightly adjusted when using Idris 2. There is also a list of required updates.

  • A Crash Course in Idris 2

    The official Idris 2 tutorial. A comprehensive but dense explanation of all features of Idris 2. I find this to be useful as a reference, and as such it is highly accessible. However, it is not an introduction to functional programming or type-driven development in general.

  • The Idris 2 GitHub Repository

    Look here for detailed installation instructions and some introductory material. There is also a wiki, where you can find a list of editor plugins, a list of external backends, and other useful information.

  • The pack Database

    This is the listing of all the libraries included in pack's collection, which is currently the most comprehensive source of community contributed libraries for Idris 2.

  • The Idris 2 Discord Channel

    If you get stuck with a piece of code, want to ask about some obscure language feature, want to promote your new library, or want to just hang out with other Idris programmers, this is the place to go. The discord channel is pretty active and very friendly towards newcomers.

  • The Idris REPL

    Finally, a lot of useful information can be provided by Idris itself. Many users tend to keep at least one REPL open while working on an Idris project. Text editors can be set up to use the language server for Idris 2, which is incredibly useful. In the REPL,

    • use :t to inspect the type of an expression or meta variable (hole): :t foldl,
    • use :ti to inspect the type of a function including implicit arguments: :ti foldl,
    • use :m to list all meta variables (holes) in scope,
    • use :doc to access the documentation of a top level function (:doc the), a data type plus all its constructors and available hints (:doc Bool), a language feature (:doc case, :doc let, :doc interface, :doc record, or even :doc ?), or an interface (:doc Uninhabited),
    • use :module to import a module from one of the available packages: :module Data.Vect,
    • use :browse to list the names and types of all functions exported by a loaded module: :browse Data.Vect,
    • use :help to get a list of other commands plus a short description for each.

Conclusion

In this introduction we learned about the most basic features of the Idris programming language. We used the REPL to tinker with our ideas and inspect the types of things in our code, and we used the Idris compiler to compile an Idris source file to an executable.

We also learned about the basic shape of a top level definition in Idris, which always consists of an identifier (its name), a type, and an implementation.

What's next?

In the next chapter, we start programming in Idris for real. We learn how to write our own pure functions, how functions compose, and how we can treat functions just like other values and pass them around as arguments to other functions.

Introduction To Functions

Idris is a functional programming language: functions are its main form of abstraction (unlike, for instance, in an object-oriented language like Java, where objects and classes are the main form of abstraction). Thus, we expect Idris to make it very easy for us to compose and combine functions to create new functions. In fact, in Idris functions are first class: functions can take other functions as arguments and can return functions as their results.

This chapter will explore some of the basic tools Idris provides for combining and producing functions.

Functions with more than one Argument

module Tutorial.Functions1.FunctionsWithMultipleArguments

Let's implement a function which checks if its three Integer arguments form a Pythagorean triple. We'll need to use a new operator for this: ==, the equality operator.

export
isTriple : Integer -> Integer -> Integer -> Bool
isTriple x y z = x * x + y * y == z * z

Let's give this a spin at the REPL before we talk about the types:

Tutorial.Functions1> isTriple 1 2 3
False
Tutorial.Functions1> isTriple 3 4 5
True

As this example demonstrates, the type of a function of several arguments consists of a sequence of argument types (also called input types) chained by function arrows (->), terminated by an output type (Bool in this case).

The implementation looks like a mathematical equation: The arguments are listed on the left hand side of the =, and the computation(s) to perform with them are described on the right hand side.

Function implementations in functional programming languages often have a more mathematical look compared to implementations in imperative languages, which often describe not what to compute, but instead how to compute it by describing an algorithm as a sequence of imperative statements. This imperative style is also available in Idris, and we will explore it in later chapters, but we prefer the declarative style whenever possible.

As shown in the above example, functions can be invoked by passing the arguments separated by whitespace. No parentheses are necessary, unless one of the expressions we pass as the function's arguments contains its own additional whitespace. This syntax provides for particularly ergonomic partial function application, a concept we will cover in a later section.
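
The rule about parentheses can be illustrated with a small sketch. The names check1 and check2 are hypothetical, and isTriple is repeated so that the snippet stands on its own:

```idris
isTriple : Integer -> Integer -> Integer -> Bool
isTriple x y z = x * x + y * y == z * z

-- Plain arguments are simply separated by whitespace.
check1 : Bool
check1 = isTriple 3 4 5

-- Arguments that are themselves compound expressions need parentheses.
check2 : Bool
check2 = isTriple (1 + 2) 4 (10 - 5)
```

Both definitions evaluate to True, since the second one also ends up invoking isTriple with 3, 4, and 5.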

Note that, unlike Integer or Bits8, Bool is not a primitive data type built into the Idris language but just a normal data type that you could have written yourself. We will cover data type definitions in the next chapter.

Function Composition

module Tutorial.Functions1.FunctionComposition

Functions can be combined in several ways, the most direct probably being the dot (.) operator:

export
square : Integer -> Integer
square n = n * n

times2 : Integer -> Integer
times2 n = 2 * n

squareTimes2 : Integer -> Integer
squareTimes2 = times2 . square

Give this a try at the REPL! Does it do what you'd expect?

We could have implemented squareTimes2 without using the dot operator as follows:

squareTimes2' : Integer -> Integer
squareTimes2' n = times2 (square n)

To get a better insight into how the dot operator works, let's implement our own version of it, instead called <.> to avoid name collision with the built-in dot operator:

private infixr 9 <.>
(<.>) : (b -> c) -> (a -> b) -> a -> c
f <.> g = \x => (f (g x))

We'll cover more about functions that take other functions as arguments in the next section, but for now, it suffices to know that our <.> is identical to the built-in ., and can be used the same way:

squareTimes2'' : Integer -> Integer
squareTimes2'' = times2 <.> square

It is important to note that functions chained by the dot operator are invoked from right to left: times2 . square is the same as \n => times2 (square n) and not \n => square (times2 n). This can be seen in our definition of <.>.

We can conveniently chain several functions using the dot operator to write more complex functions:

dotChain : Integer -> String
dotChain = reverse . show . square . square . times2 . times2

This will first multiply the argument by four, then square it twice before converting it to a string (show) and reversing the resulting String (functions show and reverse are part of the Idris Prelude and as such are available in every Idris program).
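
To see the right-to-left order spelled out, here is a sketch of the same chain written with explicit function application. The name dotChainExplicit is hypothetical, and square and times2 are repeated so that the snippet stands on its own:

```idris
square : Integer -> Integer
square n = n * n

times2 : Integer -> Integer
times2 n = 2 * n

-- The innermost function is applied first, matching the
-- rightmost function in the dot chain.
dotChainExplicit : Integer -> String
dotChainExplicit n = reverse (show (square (square (times2 (times2 n)))))
```

For the argument 1, the intermediate values are 2, 4, 16, and 256, so the result is the string "652".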

Higher-order Functions

module Tutorial.Functions1.HigherOrder

import Tutorial.Functions1.FunctionComposition

Functions can take other functions as arguments. This is an incredibly powerful concept which can be taken to an extreme very easily, but to keep things simple, we'll start slowly:

isEven : Integer -> Bool
isEven n = mod n 2 == 0

testSquare : (Integer -> Bool) -> Integer -> Bool
testSquare fun n = fun (square n)

In the above definition, isEven uses the mod function to check if an integer is divisible by two, and is defined in the same straightforward manner as the other functions we have defined so far.

testSquare, however, is more interesting. It takes two arguments, the first argument having the type of a function from Integer to Bool, and the second having type Integer. The second argument is squared before being passed to the first argument.

Let's give this a go at the REPL:

Tutorial.Functions1> testSquare isEven 12
True

Take your time to understand what's going on here. We pass the function isEven as the first argument to testSquare. The second argument is an integer, which will first be squared and then passed to isEven. While this particular example is not very interesting, we will cover lots of use cases for passing functions as arguments to other functions as we continue.

As noted earlier, things can go to an extreme pretty easily. Consider the following example:

twice : (Integer -> Integer) -> Integer -> Integer
twice f n = f (f n)

And at the REPL:

Tutorial.Functions1> twice square 2
16
Tutorial.Functions1> (twice . twice) square 2
65536
Tutorial.Functions1> (twice . twice . twice . twice) square 2
*** huge number ***

You might be surprised about this behavior, so let's break it down. The following two expressions are identical in their behavior:

expr1 : Integer -> Integer
expr1 = (twice . twice . twice . twice) square

expr2 : Integer -> Integer
expr2 = twice (twice (twice (twice square)))

Let's walk through this:

  • square raises its argument to the 2nd power
  • twice square applies square twice, raising its argument to the 4th power
  • twice (twice square) raises it to the 16th power, by invoking twice square twice
  • And so on until twice (twice (twice (twice square))), which raises its argument to the 65536th power, giving an impressively huge result
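
The first steps of this walk-through can be checked with a small sketch. The names fourthPower and sixteenthPower are hypothetical, and square and twice are repeated so that the snippet stands on its own:

```idris
square : Integer -> Integer
square n = n * n

twice : (Integer -> Integer) -> Integer -> Integer
twice f n = f (f n)

-- Applying square twice raises the argument to the 4th power.
fourthPower : Integer -> Integer
fourthPower = twice square

-- Applying fourthPower twice raises the argument to the 16th power.
sixteenthPower : Integer -> Integer
sixteenthPower = twice (twice square)
```

At the REPL, fourthPower 3 gives 81 and sixteenthPower 3 gives 43046721, which is 81 raised to the 4th power.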

Currying

module Tutorial.Functions1.Currying

import Tutorial.Functions1.FunctionsWithMultipleArguments

Once we start using higher-order functions, the concept of partial function application (also called currying after mathematician and logician Haskell Curry) becomes very important.

Load this file in a REPL session and try the following:

Tutorial.Functions1.Currying> :t testSquare isEven
testSquare isEven : Integer -> Bool
Tutorial.Functions1.Currying> :t isTriple 1
isTriple 1 : Integer -> Integer -> Bool
Tutorial.Functions1.Currying> :t isTriple 1 2
isTriple 1 2 : Integer -> Bool

Notice how in Idris we can partially apply a function with more than one argument and, as a result, get a new function back. For instance, isTriple 1 applies function isTriple to the argument 1 and returns a new function of type Integer -> Integer -> Bool. We can even use the result of such a partially applied function in a new top level definition:

partialExample : Integer -> Bool
partialExample = isTriple 3 4

And at the REPL:

Tutorial.Functions1.Currying> partialExample 5
True

We already used partial function application in our twice examples above to get some impressive results with very little code.
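
Partial application is most useful when the resulting function is passed on to a higher-order function. A minimal sketch, assuming a hypothetical two-argument function add (twice is repeated so that the snippet stands on its own):

```idris
add : Integer -> Integer -> Integer
add x y = x + y

twice : (Integer -> Integer) -> Integer -> Integer
twice f n = f (f n)

-- `add 5` is a function of type Integer -> Integer.
addFive : Integer -> Integer
addFive = add 5

-- The partially applied function can be passed directly to `twice`.
eleven : Integer
eleven = twice (add 5) 1
```

Here, twice (add 5) 1 adds five to one twice, yielding 11.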

Anonymous Functions

module Tutorial.Functions1.Lambdas

import Tutorial.Functions1.FunctionsWithMultipleArguments

Sometimes we'd like to pass a small custom function to a higher-order function, but without the hassle of writing a top level definition. For instance, in the following example, function someTest is very specific and probably not very useful in general, but we'd still like to pass it to higher-order function testSquare:

someTest : Integer -> Bool
someTest n = n >= 3 || n <= 10

Here's how to pass it to testSquare:

Tutorial.Functions1> testSquare someTest 100
True

Instead of defining and using someTest, we can use an anonymous function:

Tutorial.Functions1> testSquare (\n => n >= 3 || n <= 10) 100
True

For clarity, let's use an anonymous function to reproduce the above definition:

someTest' : Integer -> Bool
someTest' = \n => n >= 3 || n <= 10

Anonymous functions are sometimes also called lambdas (from lambda calculus), and the backslash is chosen since it resembles the Greek letter lambda.

The \n => syntax introduces a new anonymous function of one argument called n, the implementation of which is given on the right hand side of the function arrow. Like top level functions, lambdas can have more than one argument, separated by commas: \x,y => x * x + y. When we pass lambdas as arguments to higher-order functions, they typically need to be wrapped in parentheses or separated by the dollar operator ($) (see the next section about this).

Note that, in a lambda, arguments are not annotated with types, so Idris has to be able to infer them from the current context.
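
As a sketch of a lambda with two arguments whose types are inferred from the context, consider the following hypothetical higher-order function applyBoth (neither name below is part of this book's source code):

```idris
applyBoth : (Integer -> Integer -> Integer) -> Integer -> Integer -> Integer
applyBoth f x y = f x y

-- Idris infers that x and y are Integers from the type of applyBoth.
lambdaExample : Integer
lambdaExample = applyBoth (\x,y => x * x + y) 3 4
```

Here, lambdaExample evaluates to 13, since 3 * 3 + 4 is 13.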

Operators

module Tutorial.Functions1.Operators

In Idris, infix operators like ., * or + are not built into the language. Instead, they are just regular Idris functions with some special support for using them in infix notation. When we use operators outside of infix notation, we have to wrap them in parentheses.

As an example, let us define a custom operator for sequencing functions of type Bits8 -> Bits8:

infixr 4 >>>

(>>>) : (Bits8 -> Bits8) -> (Bits8 -> Bits8) -> Bits8 -> Bits8
f1 >>> f2 = f2 . f1

foo : Bits8 -> Bits8
foo n = 2 * n + 3

test : Bits8 -> Bits8
test = foo >>> foo >>> foo >>> foo

In addition to declaring and defining the operator itself, we also have to specify its fixity: infixr 4 >>> means that (>>>) associates to the right (meaning that f >>> g >>> h is to be interpreted as f >>> (g >>> h)) with a priority of 4. You can also have a look at the fixity of operators exported by the Prelude in the REPL:

Tutorial.Functions1> :doc (.)
Prelude.. : (b -> c) -> (a -> b) -> a -> c
  Function composition.
  Totality: total
  Fixity Declaration: infixr operator, level 9

When you mix infix operators in an expression, those with a higher priority bind more tightly. For instance, (+) is left associative with a priority of 8, while (*) is left associative with a priority of 9. Hence, a * b + c is the same as (a * b) + c instead of a * (b + c).
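
Associativity matters whenever the underlying function is not associative itself. The following sketch defines two hypothetical operators, |-| and |~|, which both perform subtraction but differ in their fixity declarations:

```idris
infixl 8 |-|
infixr 8 |~|

-- Subtraction, associating to the left.
(|-|) : Integer -> Integer -> Integer
x |-| y = x - y

-- Subtraction, associating to the right.
(|~|) : Integer -> Integer -> Integer
x |~| y = x - y

-- Parsed as (10 |-| 3) |-| 2
leftAssoc : Integer
leftAssoc = 10 |-| 3 |-| 2

-- Parsed as 10 |~| (3 |~| 2)
rightAssoc : Integer
rightAssoc = 10 |~| 3 |~| 2
```

With these declarations, leftAssoc evaluates to 5, while rightAssoc evaluates to 9.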

Operator Sections

Operators can be partially applied just like regular functions. In this case, the whole expression has to be wrapped in parentheses and is called an operator section. Here are two examples:

Tutorial.Functions1> testSquare (< 10) 5
False
Tutorial.Functions1> testSquare (10 <) 5
True

As you can see, there is a difference between (< 10) and (10 <). The first tests whether its argument is less than 10, and the second tests whether 10 is less than its argument.

One exception where operator sections will not work is with the minus operator (-). Here is an example to demonstrate this:

applyToTen : (Integer -> Integer) -> Integer
applyToTen f = f 10

This is just a higher-order function applying its function argument to the number ten. This works very well in the following example:

Tutorial.Functions1> applyToTen (* 2)
20

However, if we want to subtract five from ten, the following will fail:

Tutorial.Functions1> applyToTen (- 5)
Error: Can't find an implementation for Num (Integer -> Integer).

(Interactive):1:12--1:17
 1 | applyToTen (- 5)

The problem here is that Idris treats - 5 as an integer literal instead of an operator section. In this special case, we have to use an anonymous function instead:

Tutorial.Functions1> applyToTen (\x => x - 5)
5

Infix Notation for Non-Operators

In Idris, it is possible to use infix notation for regular binary functions by wrapping them in backticks. It is even possible to define a precedence (fixity) for these and use them in operator sections, just like regular operators:

infixl 8 `plus`

infixl 9 `mult`

plus : Integer -> Integer -> Integer
plus = (+)

mult : Integer -> Integer -> Integer
mult = (*)

arithTest : Integer
arithTest = 5 `plus` 10 `mult` 12

arithTest' : Integer
arithTest' = 5 + 10 * 12

Operators exported by the Prelude

Here is a list of important operators exported by the Prelude:

  • (.): Function composition
  • (+): Addition
  • (*): Multiplication
  • (-): Subtraction
  • (/): Division
  • (==) : True, if two values are equal
  • (/=) : True, if two values are not equal
  • (<=), (>=), (<), and (>) : Comparison operators
  • ($): Function application

Most of these are constrained, that is, they work only for types implementing a certain interface. Don't worry about this right now. We will learn about interfaces in their own chapter later, and the operators behave as they intuitively should. For instance, addition and multiplication work for all numeric types, and comparison operators work for almost all types in the Prelude with the exception of functions.

The most special of the above is the last one, ($). It has a priority of 0, so all other operators bind more tightly. In addition, function application binds more tightly, so this can be used to reduce the number of parentheses required in an expression. For instance, instead of writing isTriple 3 4 (2 + 3 * 1) we can write isTriple 3 4 $ 2 + 3 * 1, with exactly the same meaning. Sometimes this helps readability, other times it doesn't. With experience, especially from rereading your own code after some time has passed, you will build an intuition for which form of a given expression is more readable. The important thing to remember is that fun $ x y is just the same as fun (x y).
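
The two forms can be compared side by side in a short sketch. The names viaParens and viaDollar are hypothetical, and isTriple is repeated so that the snippet stands on its own:

```idris
isTriple : Integer -> Integer -> Integer -> Bool
isTriple x y z = x * x + y * y == z * z

-- The last argument is wrapped in parentheses.
viaParens : Bool
viaParens = isTriple 3 4 (2 + 3 * 1)

-- Everything to the right of ($) is treated as the last argument.
viaDollar : Bool
viaDollar = isTriple 3 4 $ 2 + 3 * 1
```

Both definitions evaluate to True, since 2 + 3 * 1 is 5.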

Introductory Function Exercises

module Tutorial.Functions1.Exercises

The solutions to these exercises can be found in src/Solutions/Functions1.idr.

Exercise 1

Reimplement functions testSquare and twice by using the dot operator and dropping the second arguments (have a look at the implementation of squareTimes2 to get an idea where this should lead you). This highly concise way of writing function implementations is sometimes called point-free style and is often the preferred way of writing small utility functions.

Exercise 2

Declare and implement function isOdd by combining functions isEven from above and not (from the Idris Prelude). Use point-free style.

Exercise 3

Declare and implement function isSquareOf, which checks whether its first Integer argument is the square of the second argument.

Exercise 4

Declare and implement function isSmall, which checks whether its Integer argument is less than or equal to 100. Use one of the comparison operators <= or >= in your implementation.

Exercise 5

Declare and implement function absIsSmall, which checks whether the absolute value of its Integer argument is less than or equal to 100. Use functions isSmall and abs (from the Idris Prelude) in your implementation, which should be in point-free style.

Exercise 6

In this slightly extended exercise we are going to implement some utilities for working with Integer predicates (functions from Integer to Bool). Implement the following higher-order functions (use boolean operators &&, ||, and function not in your implementations):

-- return true, if and only if both predicates hold
and : (Integer -> Bool) -> (Integer -> Bool) -> Integer -> Bool

-- return true, if and only if at least one predicate holds
or : (Integer -> Bool) -> (Integer -> Bool) -> Integer -> Bool

-- return true, if the predicate does not hold
negate : (Integer -> Bool) -> Integer -> Bool

After solving this exercise, give it a go in the REPL. In the example below, we use binary function and in infix notation by wrapping it in backticks. This is just a syntactic convenience to make certain function applications more readable:

Tutorial.Functions1> negate (isSmall `and` isOdd) 73
False

Exercise 7

As explained above, Idris allows us to define our own infix operators. Even better, Idris supports overloading of function names, that is, two functions or operators can have the same name, but different types and implementations. Idris will make use of the types to distinguish between equally named operators and functions.

This allows us to reimplement functions and, or, and negate from Exercise 6 by using the existing operator and function names from boolean algebra:

-- return true, if and only if both predicates hold
(&&) : (Integer -> Bool) -> (Integer -> Bool) -> Integer -> Bool
x && y = and x y

-- return true, if and only if at least one predicate holds
(||) : (Integer -> Bool) -> (Integer -> Bool) -> Integer -> Bool

-- return true, if the predicate does not hold
not : (Integer -> Bool) -> Integer -> Bool

Implement the other two functions and test them at the REPL:

Tutorial.Functions1> not (isSmall && isOdd) 73
False

Conclusion

In this chapter, we learned:

  • A function in Idris can take an arbitrary number of arguments, separated by -> in the function's type.

  • Functions can be combined sequentially using the dot operator (.), which leads to concise code.

  • Functions can be partially applied by passing them fewer arguments than they expect, resulting in a new function expecting the remaining arguments. This technique is called partial application, and it works because functions in Idris are curried.

  • Functions can be passed as arguments to other functions, which allows us to easily combine small units of code to create more complex behavior.

  • If writing a corresponding top level function would be too cumbersome, we can pass anonymous functions (lambdas) to higher-order functions.

  • Idris allows us to define our own infix operators. These have to be written in parentheses unless they are being used in infix notation.

  • Infix operators can also be partially applied. These operator sections have to be wrapped in parentheses, and the position of the argument determines whether it is used as the operator's first or second argument.

  • Idris supports name overloading: functions can have the same name but different implementations. Idris will decide which function to use based on the types involved.

Please note that function and operator names within an individual module must be unique. In order to define two functions with the same name, they have to be declared in distinct modules. If Idris is not able to decide which of the two functions to use, we can help name resolution by prefixing a function with (a part of) its namespace:

Tutorial.Functions1> :t Prelude.not
Prelude.not : Bool -> Bool
Tutorial.Functions1> :t Functions1.not
Tutorial.Functions1.not : (Integer -> Bool) -> Integer -> Bool

What's next

In the next section, we will learn how to define our own data types and how to construct and deconstruct values of these new types. We will also learn about generic types and functions.

Algebraic Data Types

In the previous chapter, we learned how to write our own functions and combine them to create more complex functionality. Of equal importance is the ability to define our own data types and use them as arguments and results in functions.

This is a lengthy chapter, densely packed with information. If you are new to Idris and functional programming, make sure to follow along slowly, experimenting with the examples, and possibly coming up with your own. Make sure to try and solve all exercises.

module Tutorial.DataTypes

Enumerations

module Tutorial.DataTypes.Enumerations

Let's start with a data type for the days of the week as an example.

public export
data Weekday = Monday
             | Tuesday
             | Wednesday
             | Thursday
             | Friday
             | Saturday
             | Sunday

The declaration above defines a new type (Weekday) and several new values (Monday to Sunday) of the given type. Go ahead, and verify this at the REPL:

Tutorial.DataTypes> :t Monday
Tutorial.DataTypes.Monday : Weekday
Tutorial.DataTypes> :t Weekday
Tutorial.DataTypes.Weekday : Type

So, Monday is of type Weekday, while Weekday itself is of type Type.

It is important to note that a value of type Weekday can only ever be one of the values listed above. It is a type error to use anything else where a Weekday is expected.

Pattern Matching

In order to use our new data type as a function argument, we need to learn about an important concept in functional programming languages: Pattern matching. Let's implement a function which calculates the successor of a weekday:

total
next : Weekday -> Weekday
next Monday    = Tuesday
next Tuesday   = Wednesday
next Wednesday = Thursday
next Thursday  = Friday
next Friday    = Saturday
next Saturday  = Sunday
next Sunday    = Monday

In order to inspect a Weekday argument, we match on the different possible values and return a result for each of them. This is a very powerful concept, as it allows us to match on and extract values from deeply nested data structures. The different cases in a pattern match are inspected from top to bottom, each being compared against the current function argument. Once a matching pattern is found, the computation on the right hand side of this pattern is evaluated. Later patterns are then ignored.

For instance, if we invoke next with argument Thursday, the first three patterns (Monday, Tuesday, and Wednesday) will be checked against the argument, but they do not match. The fourth pattern is a match, and the result Friday is returned. Later patterns are then ignored, even if they would also match the input (this becomes relevant with catch-all patterns, which we will talk about in a moment).

The function above is provably total. Idris knows about the possible values of type Weekday, and can therefore figure out that our pattern match covers all possible cases. We can therefore annotate the function with the total keyword, and Idris will answer with a type error if it can't verify the function's totality. (Go ahead, and try removing one of the clauses in next to get an idea of what an error message from the coverage checker looks like.)

Please remember that these are very strong guarantees from the type checker: Given enough resources, a provably total function will always return a result of the given type in a finite amount of time (resources here meaning computational resources like memory or, in case of recursive functions, stack space).

Catch-all Patterns

Sometimes, it is convenient to only match on a subset of the possible values and collect the remaining possibilities in a catch-all clause:

export
total
isWeekend : Weekday -> Bool
isWeekend Saturday = True
isWeekend Sunday   = True
isWeekend _        = False

The final line with the catch-all pattern is only invoked if the argument is not equal to Saturday or Sunday. Remember: Patterns in a pattern match are matched against the input from top to bottom, and the first match decides which path on the right hand side will be taken.

We can use catch-all patterns to implement an equality test for Weekday (we will not yet use the == operator for this; this will have to wait until we learn about interfaces):

total
eqWeekday : Weekday -> Weekday -> Bool
eqWeekday Monday Monday        = True
eqWeekday Tuesday Tuesday      = True
eqWeekday Wednesday Wednesday  = True
eqWeekday Thursday Thursday    = True
eqWeekday Friday Friday        = True
eqWeekday Saturday Saturday    = True
eqWeekday Sunday Sunday        = True
eqWeekday _ _                  = False

Enumeration Types in the Prelude

Data types like Weekday consisting of a finite set of values are sometimes called enumerations. The Idris Prelude defines some common enumerations for us: for instance, Bool and Ordering. As with Weekday, we can use pattern matching when implementing functions on these types:

-- this is how `not` is implemented in the *Prelude*
total
negate : Bool -> Bool
negate False = True
negate True  = False

The Ordering data type describes an ordering relation between two values. For instance:

total
compareBool : Bool -> Bool -> Ordering
compareBool False False = EQ
compareBool False True  = LT
compareBool True True   = EQ
compareBool True False  = GT

Here, LT means that the first argument is less than the second, EQ means that the two arguments are equal, and GT means that the first argument is greater than the second.
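Since Ordering is an ordinary enumeration, we can pattern match on it like we did with Weekday. The following sketch uses a hypothetical function flipOrdering (not part of the tutorial's modules), which returns the ordering we would get if the two compared arguments were swapped:

```idris
-- Hypothetical example: swapping the arguments of a comparison
-- turns "less than" into "greater than" and vice versa.
total
flipOrdering : Ordering -> Ordering
flipOrdering LT = GT
flipOrdering EQ = EQ
flipOrdering GT = LT
```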

Case Expressions

Sometimes we need to perform a computation with one of the arguments and want to pattern match on the result of this computation. We can use case expressions in this situation:

-- returns the larger of the two arguments
total
maxBits8 : Bits8 -> Bits8 -> Bits8
maxBits8 x y =
  case compare x y of
    LT => y
    _  => x

The first line of the case expression (case compare x y of) will invoke function compare with arguments x and y. On the following (indented) lines, we pattern match on the result of this computation. This is of type Ordering, so we expect one of the three constructors LT, EQ, or GT as the result. On the first line, we handle the LT case explicitly, while the other two cases are handled with an underscore as a catch-all pattern.

Note that indentation matters here: The case block as a whole must be indented (if it starts on a new line), and the different cases must also be indented by the same amount of whitespace.

Function compare is overloaded for many data types. We will learn how this works when we talk about interfaces.
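Here is a second small sketch of a case expression, this time handling all three constructors of Ordering explicitly instead of using a catch-all pattern. The function name sign is hypothetical and chosen just for this example:

```idris
-- Hypothetical example: return -1, 0, or 1 depending on whether
-- the argument is negative, zero, or positive.
total
sign : Integer -> Integer
sign n =
  case compare n 0 of
    LT => -1
    EQ => 0
    GT => 1
```

Because all three constructors are listed, Idris can still verify that the function is total.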

If Then Else

When working with Bool, there is an alternative to pattern matching common to most programming languages:

total
maxBits8' : Bits8 -> Bits8 -> Bits8
maxBits8' x y = if compare x y == LT then y else x

Note that the if then else expression always returns a value and, therefore, the else branch cannot be dropped. This is different from the behavior in typical imperative languages, where if is a statement with possible side effects.
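Because if then else is an expression, it can be nested inside another if then else. The following sketch (with the hypothetical name clampPercent) restricts an Integer to the range from 0 to 100:

```idris
-- Hypothetical example: every branch returns an Integer, so the
-- nested expression as a whole is of type Integer, too.
total
clampPercent : Integer -> Integer
clampPercent n = if n < 0 then 0 else if n > 100 then 100 else n
```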

Naming Conventions: Identifiers

While we are free to use lower-case and upper-case identifiers for function names, type- and data constructors must be given upper-case identifiers in order not to confuse Idris (operators are also fine). For instance, the following data definition is not valid, and Idris will complain that it expected upper-case identifiers:

data foo = bar | baz

The same goes for similar data definitions like records and sum types (both will be explained below):

-- not valid Idris
record Foo where
  constructor mkfoo

On the other hand, we typically use lower-case identifiers for function names, unless we plan to use them mostly during type checking (more on this later). This is not enforced by Idris, however, so if you are working in a domain where upper-case identifiers are preferable, feel free to use those:

foo : Bits32 -> Bits32
foo = (* 2)

Bar : Bits32 -> Bits32
Bar = foo

Exercises part 1

module Tutorial.DataTypes.Exercises1
  1. Use pattern matching to implement your own versions of boolean operators (&&) and (||) calling them and and or respectively.

    Note: One way to go about this is to enumerate all four possible combinations of two boolean values and give the result for each. However, there is a shorter, more clever way, requiring only two pattern matches for each of the two functions.

  2. Define your own data type representing different units of time (seconds, minutes, hours, days, weeks), and implement the following functions for converting between time spans using different units. Hint: Use integer division (div) when going from seconds to some larger unit like hours.

    data UnitOfTime = Second -- add additional values
    
    -- calculate the number of seconds from a
    -- number of steps in the given unit of time
    total
    toSeconds : UnitOfTime -> Integer -> Integer
    
    -- Given a number of seconds, calculate the
    -- number of steps in the given unit of time
    total
    fromSeconds : UnitOfTime -> Integer -> Integer
    
    -- convert the number of steps in a given unit of time
    -- to the number of steps in another unit of time.
    -- use `fromSeconds` and `toSeconds` in your implementation
    total
    convert : UnitOfTime -> Integer -> UnitOfTime -> Integer
    
  3. Define a data type for representing a subset of the chemical elements: Hydrogen (H), Carbon (C), Nitrogen (N), Oxygen (O), and Fluorine (F).

    Declare and implement function atomicMass, which for each element returns its atomic mass in dalton:

    Hydrogen : 1.008
    Carbon : 12.011
    Nitrogen : 14.007
    Oxygen : 15.999
    Fluorine : 18.9984
    

Sum Types

module Tutorial.DataTypes.SumTypes

Assume we'd like to write some web form, where users of our web application can decide how they like to be addressed. We give them a choice between two common predefined forms of address (Mr and Mrs), but also allow them to decide on a customized form. The possible choices can be encapsulated in an Idris data type:

public export
data Title = Mr | Mrs | Other String

This looks almost like an enumeration type, with the exception that there is a new thing, called a data constructor, which accepts a String argument (actually, the values in an enumeration are also called (nullary) data constructors). If we inspect the types at the REPL, we learn the following:

Tutorial.DataTypes> :t Mr
Tutorial.DataTypes.Mr : Title
Tutorial.DataTypes> :t Other
Tutorial.DataTypes.Other : String -> Title

So, Other is a function from String to Title. This means that we can pass Other a String argument and get a Title as the result:

public export
total
dr : Title
dr = Other "Dr."

Again, a value of type Title can only consist of one of the three choices listed above, and again, we can use pattern matching to implement functions on the Title data type in a provably total way:

export
total
showTitle : Title -> String
showTitle Mr        = "Mr."
showTitle Mrs       = "Mrs."
showTitle (Other x) = x

Note how, in the last pattern match, the string value stored in the Other data constructor is bound to local variable x. Also, the Other x pattern has to be wrapped in parentheses, as otherwise Idris would think Other and x were two distinct function arguments.

This is a very common way to extract the values from data constructors. We can use showTitle to implement a function for creating a courteous greeting:

export
total
greet : Title -> String -> String
greet t name = "Hello, " ++ showTitle t ++ " " ++ name ++ "!"

In the implementation of greet, we use string literals and the string concatenation operator (++) to assemble the greeting from its parts.

At the REPL:

Tutorial.DataTypes> greet dr "Höck"
"Hello, Dr. Höck!"
Tutorial.DataTypes> greet Mrs "Smith"
"Hello, Mrs. Smith!"

Data types like Title are called sum types as they consist of the sum of their different parts: A value of type Title is either a Mr, a Mrs, or a String wrapped up in Other.
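To illustrate the same idea with a different type, here is a minimal sketch of a sum type with three data constructors, two of which carry a value. The type Contact and the function showContact are hypothetical and not part of the tutorial's modules:

```idris
-- Hypothetical sum type: a contact is either an e-mail address,
-- a phone number, or not given at all.
data Contact = Email String | Phone String | NoContact

-- One clause per data constructor makes the function provably total.
total
showContact : Contact -> String
showContact (Email addr) = "mail: " ++ addr
showContact (Phone nr)   = "phone: " ++ nr
showContact NoContact    = "no contact given"
```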

Here's another (drastically simplified) example of a sum type. Assume we allow two forms of authentication in our web application: Either by entering a username plus a password (for which we'll use an unsigned 64 bit integer here), or by providing username plus a (very complex) secret key. Here's a data type to encapsulate this use case:

data Credentials = Password String Bits64 | Key String String

As an example of a very primitive login function, we can hard-code some known credentials:

total
login : Credentials -> String
login (Password "Anderson" 6665443) = greet Mr "Anderson"
login (Key "Y" "xyz")               = greet (Other "Agent") "Y"
login _                             = "Access denied!"

As can be seen in the example above, we can also pattern match against primitive values by using integer and string literals. Give login a go at the REPL:

Tutorial.DataTypes> login (Password "Anderson" 6665443)
"Hello, Mr. Anderson!"
Tutorial.DataTypes> login (Key "Y" "xyz")
"Hello, Agent Y!"
Tutorial.DataTypes> login (Key "Y" "foo")
"Access denied!"

Exercises part 2

module Tutorial.DataTypes.Exercises2
  1. Implement an equality test for Title (you can use the equality operator (==) for comparing two Strings):

    total
    eqTitle : Title -> Title -> Bool
    
  2. For Title, implement a simple test to check whether a custom title is being used:

    total
    isOther : Title -> Bool
    
  3. Given our simple Credentials type, there are three ways for authentication to fail:

    • An unknown username was used.
    • The password given does not match the one associated with the username.
    • An invalid key was used.

    Encapsulate these three possibilities in a sum type called LoginError, but make sure not to disclose any confidential information: An invalid username should be stored in the corresponding error value, but an invalid password or key should not.

  4. Implement function showError : LoginError -> String, which can be used to display an error message to the user who unsuccessfully tried to log in to our web application.

Records

module Tutorial.DataTypes.Records

import Tutorial.DataTypes.Enumerations
import Tutorial.DataTypes.SumTypes

It is often useful to group together several values as a logical unit. For instance, in our web application we might want to group information about a user in a single data type. Such data types are often called product types (see below for an explanation). The most common and convenient way to define them is the record construct:

record User where
  constructor MkUser
  name  : String
  title : Title
  age   : Bits8

The declaration above creates a new type called User, and a new data constructor called MkUser. As usual, have a look at their types in the REPL:

Tutorial.DataTypes> :t User
Tutorial.DataTypes.User : Type
Tutorial.DataTypes> :t MkUser
Tutorial.DataTypes.MkUser : String -> Title -> Bits8 -> User

We can use MkUser (which is a function from String to Title to Bits8 to User) to create values of type User:

total
agentY : User
agentY = MkUser "Y" (Other "Agent") 51

total
drNo : User
drNo = MkUser "No" dr 73

We can also use pattern matching to extract the fields from a User value (they can again be bound to local variables):

total
greetUser : User -> String
greetUser (MkUser n t _) = greet t n

In the example above, the name and title fields are bound to two new local variables (n and t respectively), which can then be used on the right hand side of greetUser's implementation. For the age field, which is not used on the right hand side, we can use an underscore as a catch-all pattern.

Note how Idris will prevent us from making a common mistake: If we confuse the order of arguments, the implementation will no longer type check. We can verify this by putting the erroneous code in a failing block. This is an indented code block that is expected to produce an error during elaboration (type checking). We can give part of the expected error message as an optional string argument to a failing block. If this does not match part of the error message (or the whole code block does not fail to type check), the failing block itself fails to type check. This is a useful tool to demonstrate that type safety works in two directions: We can show that valid code type checks but also that invalid code is rejected by the Idris elaborator:

failing "Mismatch between: String and Title"
  greetUser' : User -> String
  greetUser' (MkUser n t _) = greet n t
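A failing block can also be used without the optional error message. The following sketch, using the hypothetical names double and notANumber, places a valid definition next to an invalid one:

```idris
-- Valid: argument and result are both of type Integer.
double : Integer -> Integer
double n = n * 2

-- Invalid: a string literal is not an Integer, so this definition
-- is rejected. The `failing` block as a whole therefore type checks.
failing
  notANumber : Integer
  notANumber = "forty-two"
```

If we replaced the string literal with an integer literal, the definition would become valid and the failing block itself would be reported as an error.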

In addition, for every record field, Idris creates an extractor function of the same name. This can either be used as a regular function, or it can be used in postfix notation by appending it to a variable of the record type separated by a dot. Here are two examples for extracting the age from a user:

getAgeFunction : User -> Bits8
getAgeFunction u = age u

getAgePostfix : User -> Bits8
getAgePostfix u = u.age
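The two notations can be mixed freely. As a small self-contained sketch, here is a hypothetical record Point (not part of the tutorial's modules) with two functions that access its fields in both styles:

```idris
-- Hypothetical record, used for illustration only.
record Point where
  constructor MkPoint
  xCoord : Integer
  yCoord : Integer

-- Field names used as regular (prefix) functions.
sumPrefix : Point -> Integer
sumPrefix p = xCoord p + yCoord p

-- Field names used in postfix (dot) notation.
sumPostfix : Point -> Integer
sumPostfix p = p.xCoord + p.yCoord
```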

Syntactic Sugar for Records

As was already mentioned in the introduction, Idris is a pure functional programming language. In pure functions, we are not allowed to modify global mutable state. As such, if we want to modify a record value, we will always create a new value with the original value remaining unchanged: Records and other Idris values are immutable. While this can have a slight impact on performance, it has the benefit that we can freely pass a record value to different functions, without fear of the functions modifying the value by in-place mutation. These are, again, very strong guarantees, which make it drastically easier to reason about our code.

There are several ways to modify a record, the most general being to pattern match on the record and adjust each field as desired. If, for instance, we'd like to increase the age of a User by one, we could do the following:

total
incAge : User -> User
incAge (MkUser name title age) = MkUser name title (age + 1)

That's a lot of code for such a simple thing, so Idris offers several syntactic conveniences for this. For instance, using record syntax, we can just access and update the age field of a value:

total
incAge2 : User -> User
incAge2 u = { age := u.age + 1 } u

Assignment operator := assigns a new value to the age field in u. Remember that this will create a new User value. The original value u remains unaffected by this.

We can access a record field, either by using the field name as a projection function (age u; also have a look at :t age in the REPL), or by using dot syntax: u.age. This is special syntax and not related to the dot operator for function composition ((.)).

The use case of modifying a record field is so common that Idris provides special syntax for this as well:

total
incAge3 : User -> User
incAge3 u = { age $= (+ 1) } u

Here, I used an operator section ((+ 1)) to make the code more concise. As an alternative to an operator section, we could have used an anonymous function like so:

total
incAge4 : User -> User
incAge4 u = { age $= \x => x + 1 } u

Finally, since our function's argument u is only used once at the very end, we can drop it altogether, to get the following, highly concise version:

total
incAge5 : User -> User
incAge5 = { age $= (+ 1) }

As usual, we should have a look at the result at the REPL:

Tutorial.DataTypes> incAge5 drNo
MkUser "No" (Other "Dr.") 74

It is possible to use this syntax to set and/or update several record fields at once:

total
drNoJunior : User
drNoJunior = { name $= (++ " Jr."), title := Mr, age := 17 } drNo
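To contrast the two update operators once more, here is a self-contained sketch with a hypothetical record Account. Function reset uses := to set a field to a fixed value, while deposit uses $= to apply a function to the current value of the field:

```idris
-- Hypothetical record, used for illustration only.
record Account where
  constructor MkAccount
  owner   : String
  balance : Integer

-- Set the balance to a fixed value with `:=`.
reset : Account -> Account
reset = { balance := 0 }

-- Modify the current balance by applying a function with `$=`.
deposit : Integer -> Account -> Account
deposit amount = { balance $= (+ amount) }
```

In both cases, the result is a new Account; the value passed as an argument is left unchanged.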

Tuples

I wrote above that a record is also called a product type. This is quite obvious when we consider the number of possible values inhabiting a given type. For instance, consider the following custom record:

record Foo where
  constructor MkFoo
  wd   : Weekday
  bool : Bool

How many possible values of type Foo are there? The answer is 7 * 2 = 14, as we can pair every possible Weekday (seven in total) with every possible Bool (two in total). So, the number of possible values of a record type is the product of the number of possible values for each field.

The canonical product type is the Pair, which is available from the Prelude:

total
weekdayAndBool : Weekday -> Bool -> Pair Weekday Bool
weekdayAndBool wd b = MkPair wd b

Since it is quite common to return several values from a function wrapped in a Pair or larger tuple, Idris provides some syntactic sugar for working with these. Instead of Pair Weekday Bool, we can just write (Weekday, Bool). Likewise, instead of MkPair wd b, we can just write (wd, b) (the space is optional):

total
weekdayAndBool2 : Weekday -> Bool -> (Weekday, Bool)
weekdayAndBool2 wd b = (wd, b)

This works also for nested tuples:

total
triple : Pair Bool (Pair Weekday String)
triple = MkPair False (Friday, "foo")

total
triple2 : (Bool, Weekday, String)
triple2 = (False, Friday, "foo")

In the example above, triple2 is converted to the form used in triple by the Idris compiler.
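One consequence of this nesting is that the Prelude functions fst and snd, which extract the first and second component of a Pair, also work on larger tuples. The following sketch uses the hypothetical names middleOf and example:

```idris
-- A triple is a pair whose second component is itself a pair,
-- so `snd` returns the last two values and `fst` picks the
-- first of those.
total
middleOf : (String, Integer, Bool) -> Integer
middleOf t = fst (snd t)

-- The argument is built with explicit `MkPair` constructors, but it
-- has the same type as a triple written with tuple syntax.
-- Evaluates to 42.
total
example : Integer
example = middleOf (MkPair "answer" (MkPair 42 True))
```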

We can even use tuple syntax in pattern matches:

total
bar : Bool
bar = case triple of
  (b,wd,_) => b && isWeekend wd

As Patterns

Sometimes, we'd like to take apart a value by pattern matching on it but still retain the value as a whole for using it in further computations:

total
baz : (Bool,Weekday,String) -> (Nat,Bool,Weekday,String)
baz t@(_,_,s) = (length s, t)

In baz, variable t is bound to the triple as a whole, which is then reused to construct the resulting quadruple. Remember that (Nat,Bool,Weekday,String) is just sugar for Pair Nat (Bool,Weekday,String), and (length s, t) is just sugar for MkPair (length s) t. Hence, the implementation above is correct, as is confirmed by the type checker.
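As patterns are also handy when a function returns its argument unchanged in some cases. Here is a minimal sketch with the hypothetical function ordered, which returns a pair of integers in ascending order:

```idris
-- Hypothetical example: `p` is bound to the whole pair, while `x`
-- and `y` are bound to its two components.
total
ordered : (Integer, Integer) -> (Integer, Integer)
ordered p@(x, y) = if x <= y then p else (y, x)
```

Without the as pattern, we would have to rebuild the pair as (x, y) in the first branch.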

Exercises part 3

  1. Define a record type for time spans by pairing a UnitOfTime with an integer representing the duration of the time span in the given unit of time. Define also a function for converting a time span to an Integer representing the duration in seconds.

  2. Implement an equality check for time spans: Two time spans should be considered equal, if and only if they correspond to the same number of seconds.

  3. Implement a function for pretty printing time spans: The resulting string should display the time span in its given unit, plus show the number of seconds in parentheses, if the unit is not already seconds.

  4. Implement a function for adding two time spans. If the two time spans use different units of time, use the smaller unit of time to ensure a lossless conversion.

Generic Data Types

module Tutorial.DataTypes.GenericDataTypes

import Tutorial.DataTypes.Enumerations

Sometimes, a concept is general enough that we'd like to apply it not only to a single type, but to all kinds of types. For instance, we might not want to define data types for lists of integers, lists of strings, and lists of booleans, as this would lead to a lot of code duplication. Instead, we'd like to have a single generic list type parameterized by the type of values it stores. This section explains how to define and use generic types.

Maybe

Consider the case of parsing a Weekday from user input. Surely, such a function should return Saturday, if the string input was "Saturday", but what if the input was "sdfkl332"? We have several options here. For instance, we could just return a default result (Sunday perhaps?). But is this the behavior programmers expect when using our library? Maybe not. To silently continue with a default value in the face of invalid user input is hardly ever the best choice and may lead to a lot of confusion.

In an imperative language, our function would probably throw an exception. We could do this in Idris as well (there is function idris_crash in the Prelude for this), but doing so, we would abandon totality! A high price to pay for such a common thing as a parsing error.

In languages like Java, our function might also return some kind of null value (leading to the dreaded NullPointerExceptions if not handled properly in client code). Our solution will be similar, but instead of silently returning null, we will make the possibility of failure visible in the types! We define a custom data type, which encapsulates the possibility of failure. Defining new data types in Idris is very cheap (in terms of the amount of code needed), therefore this is often the way to go in order to increase type safety. Here's an example how to do this:

data MaybeWeekday = WD Weekday | NoWeekday

total
readWeekday : String -> MaybeWeekday
readWeekday "Monday"    = WD Monday
readWeekday "Tuesday"   = WD Tuesday
readWeekday "Wednesday" = WD Wednesday
readWeekday "Thursday"  = WD Thursday
readWeekday "Friday"    = WD Friday
readWeekday "Saturday"  = WD Saturday
readWeekday "Sunday"    = WD Sunday
readWeekday _           = NoWeekday

But assume now that we'd also like to read Bool values from user input. We'd have to write a custom data type MaybeBool, and so on for every type we'd like to read from a String and whose conversion might fail.

Idris, like many other programming languages, allows us to generalize this behavior by using generic data types. Here's an example:

data Option a = Some a | None

total
readBool : String -> Option Bool
readBool "True"    = Some True
readBool "False"   = Some False
readBool _         = None

It is important to go to the REPL and look at the types:

Tutorial.DataTypes> :t Some
Tutorial.DataTypes.Some : a -> Option a
Tutorial.DataTypes> :t None
Tutorial.DataTypes.None : Option a
Tutorial.DataTypes> :t Option
Tutorial.DataTypes.Option : Type -> Type

We need to introduce some jargon here. Option is what we call a type constructor. It is not yet a saturated type: It is a function from Type to Type. However, Option Bool is a type, as is Option Weekday. Even Option (Option Bool) is a valid type. Option is a type constructor parameterized over a parameter of type Type. Some and None are Option's data constructors: The functions used to create values of type Option a for a type a.
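Functions can be generic in the same way. The following sketch repeats the definition of Option so that it is self-contained, and adds a hypothetical function orDefault, which works for every type a:

```idris
-- Repeated here so that the sketch is self-contained.
data Option a = Some a | None

-- Hypothetical example: return the wrapped value if there is one,
-- and the given default value otherwise.
total
orDefault : a -> Option a -> a
orDefault _   (Some x) = x
orDefault def None     = def
```

The same function can be used with Option Bool, Option String, and any other instantiation of Option.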

Let's see some other use cases for Option. Below is a safe division operation:

total
safeDiv : Integer -> Integer -> Option Integer
safeDiv n 0 = None
safeDiv n k = Some (n `div` k)

The possibility of returning some kind of null value in the face of invalid input is so common, that there is a data type like Option already in the Prelude: Maybe, with data constructors Just and Nothing.

It is important to understand the difference between returning Maybe Integer in a function, which might fail, and returning null in languages like Java: In the former case, the possibility of failure is visible in the types. The type checker will force us to treat Maybe Integer differently than Integer: Idris will not allow us to forget to eventually handle the failure case. Not so, if null is silently returned without adjusting the types. Programmers may (and often will) forget to handle the null case, leading to unexpected and sometimes hard to debug runtime exceptions.
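The following sketch shows what this looks like in practice. Function safeDivM is a hypothetical variant of safeDiv using Maybe from the Prelude, and divOrZero is a hypothetical caller, which has to handle both constructors before it gets at the Integer:

```idris
-- Hypothetical variant of `safeDiv` returning a `Maybe`.
total
safeDivM : Integer -> Integer -> Maybe Integer
safeDivM _ 0 = Nothing
safeDivM n k = Just (n `div` k)

-- The result of `safeDivM` is not an `Integer`, so the caller
-- must decide what to do in the `Nothing` case.
total
divOrZero : Integer -> Integer -> Integer
divOrZero n k =
  case safeDivM n k of
    Just q  => q
    Nothing => 0
```

Falling back to 0 is only one possible choice here; the point is that the choice has to be made explicitly.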

Either

While Maybe is very useful to quickly provide a default value to signal some kind of failure, this value (Nothing) is not very informative. It will not tell us what exactly went wrong. For instance, in case of our Weekday reading function, it might be interesting later on to know the value of the invalid input string. And just like with Maybe and Option above, this concept is general enough that we might encounter other types of invalid values. Here's a data type to encapsulate this:

data Validated e a = Invalid e | Valid a

Validated is a type constructor parameterized over two type parameters e and a. Its data constructors are Invalid and Valid, the former holding a value describing some error condition, the latter the result in case of a successful computation. Let's see this in action:

total
readWeekdayV : String -> Validated String Weekday
readWeekdayV "Monday"    = Valid Monday
readWeekdayV "Tuesday"   = Valid Tuesday
readWeekdayV "Wednesday" = Valid Wednesday
readWeekdayV "Thursday"  = Valid Thursday
readWeekdayV "Friday"    = Valid Friday
readWeekdayV "Saturday"  = Valid Saturday
readWeekdayV "Sunday"    = Valid Sunday
readWeekdayV s           = Invalid ("Not a weekday: " ++ s)

Again, this is such a general concept that a data type similar to Validated is already available from the Prelude: Either with data constructors Left and Right. It is very common for functions to encapsulate the possibility of failure by returning an Either err val, where err is the error type and val is the desired return type. This is the type safe (and total!) alternative to throwing a catchable exception in an imperative language.

Note, however, that the semantics of Either are not always "Left is an error and Right a success". A function returning an Either just means that it can have two different types of results, each of which is tagged with the corresponding data constructor.
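As a small illustration of the error-carrying use of Either, here is a sketch of a parser for boolean values. The name parseBool and the error message are made up for this example and are not part of the Prelude:

```idris
-- Hypothetical example: `parseBool` is not a Prelude function.
-- `Left` carries a description of the problem, `Right` the result.
total
parseBool : String -> Either String Bool
parseBool "True"  = Right True
parseBool "False" = Right False
parseBool s       = Left ("Not a boolean: " ++ s)
```

A caller has to pattern match on Left and Right to get at the Bool, so the error case cannot be forgotten.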

List

One of the most important data structures in pure functional programming is the singly linked list. Here is its definition (called Seq in order for it not to collide with List, which is of course already available from the Prelude):

data Seq a = Nil | (::) a (Seq a)

This calls for some explanations. Seq consists of two data constructors: Nil (representing an empty sequence of values) and (::) (also called the cons operator), which prepends a new value of type a to an already existing list of values of the same type. As you can see, we can also use operators as data constructors, but please do not overuse this. Use clear names for your functions and data constructors and only introduce new operators when it truly helps readability!

Here is an example of how to use the List constructors (I use List here, as this is what you should use in your own code):

total
ints : List Int64
ints = 1 :: 2 :: -3 :: Nil

However, there is a more concise way of writing the above. Idris accepts special syntax for constructing data types consisting exactly of the two constructors Nil and (::):

total
ints2 : List Int64
ints2 = [1, 2, -3]

total
ints3 : List Int64
ints3 = []

The two definitions ints and ints2 are treated identically by the compiler. Note, that list syntax can also be used in pattern matches.
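Here is a minimal sketch of list syntax in pattern matches. Function describe is a hypothetical helper introduced only for this illustration:

```idris
-- Hypothetical helper: the patterns use the same bracket
-- syntax we used for constructing lists.
total
describe : List Int64 -> String
describe []     = "empty"
describe [_]    = "one element"
describe [_, _] = "two elements"
describe _      = "three or more elements"
```

The pattern [_, _] is just a more readable way of writing _ :: _ :: Nil.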

There is another thing that's special about Seq and List: Each of them is defined in terms of itself (the cons operator accepts a value and another Seq as arguments). We call such data types recursive data types, and their recursive nature means, that in order to decompose or consume them, we typically require recursive functions. In an imperative language, we might use a for loop or similar construct to iterate over the values of a List or a Seq, but these things do not exist in a language without in-place mutation. Here's how to sum a list of integers:

total
intSum : List Integer -> Integer
intSum Nil       = 0
intSum (n :: ns) = n + intSum ns

Recursive functions can be hard to grasp at first, so I'll break this down a bit. If we invoke intSum with the empty list, the first pattern matches and the function returns zero immediately. If, however, we invoke intSum with a non-empty list - [7,5,9] for instance - the following happens:

  1. The second pattern matches and splits the list into two parts: Its head (7) is bound to variable n and its tail ([5,9]) is bound to ns:

    7 + intSum [5,9]
    
  2. In a second invocation, intSum is called with a new list: [5,9]. The second pattern matches and n is bound to 5 and ns is bound to [9]:

    7 + (5 + intSum [9])
    
  3. In a third invocation intSum is called with list [9]. The second pattern matches and n is bound to 9 and ns is bound to []:

    7 + (5 + (9 + intSum []))
    
  4. In a fourth invocation, intSum is called with list [] and returns 0 immediately:

    7 + (5 + (9 + 0))
    
  5. In the third invocation, 9 and 0 are added and 9 is returned:

    7 + (5 + 9)
    
  6. In the second invocation, 5 and 9 are added and 14 is returned:

    7 + 14
    
  7. Finally, our initial invocation of intSum adds 7 and 14 and returns 21.

Thus, the recursive implementation of intSum leads to a sequence of nested calls to intSum, which terminates once the argument is the empty list.
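The same recursion scheme works for other ways of consuming a list. As a sketch, here is a function multiplying the values in a list (intProduct is a name made up for this example). Only the result for the empty list and the operator change:

```idris
-- Hypothetical example following the structure of `intSum`:
-- the empty list yields 1, the neutral element of multiplication.
total
intProduct : List Integer -> Integer
intProduct Nil       = 1
intProduct (n :: ns) = n * intProduct ns
```

Invoking intProduct with [2,3,4] unfolds to 2 * (3 * (4 * 1)), just like in the steps traced above.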

Generic Functions

In order to fully appreciate the versatility that comes with generic data types, we also need to talk about generic functions. Like generic types, these are parameterized over one or more type parameters.

Consider for instance the case of breaking out of the Option data type. In case of a Some, we'd like to return the stored value, while for the None case we provide a default value. Here's how to do this, specialized to Integers:

total
integerFromOption : Integer -> Option Integer -> Integer
integerFromOption _ (Some y) = y
integerFromOption x None     = x

It's pretty obvious that this, again, is not general enough. Surely, we'd also like to break out of Option Bool or Option String in a similar fashion. That's exactly what the generic function fromOption does:

total
fromOption : a -> Option a -> a
fromOption _ (Some y) = y
fromOption x None     = x

The lower-case a is again a type parameter. You can read the type signature as follows: "For any type a, given a value of type a, and an Option a, we can return a value of type a." Note, that fromOption knows nothing else about a, other than it being a type. It is therefore not possible, to conjure a value of type a out of thin air. We must have a value available to deal with the None case.

The counterpart to fromOption for Maybe is called fromMaybe and is available from module Data.Maybe from the base library.
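A short sketch of fromMaybe in use. The helper orZero is hypothetical and only serves as an illustration:

```idris
import Data.Maybe

-- Hypothetical helper: fall back to zero if no value is present.
total
orZero : Maybe Integer -> Integer
orZero = fromMaybe 0
```

Just like with fromOption, the default value comes first, followed by the optional value.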

Sometimes, fromOption is not general enough. Assume we'd like to print the value of a freshly parsed Bool, giving some generic error message in case of a None. We can't use fromOption for this, as we have an Option Bool and we'd like to return a String. Here's how to do this:

total
option : b -> (a -> b) -> Option a -> b
option _ f (Some y) = f y
option x _ None     = x

total
handleBool : Option Bool -> String
handleBool = option "Not a boolean value." show

Function option is parameterized over two type parameters: a represents the type of values stored in the Option, while b is the return type. In case of a Some, we need a way to convert the stored a to a b, and that's done using the function argument of type a -> b.

In Idris, lower-case identifiers in function types are treated as type parameters, while upper-case identifiers are treated as types or type constructors that must be in scope.

Exercises part 4

If this is your first time programming in a purely functional language, the exercises below are very important. Do not skip any of them! Take your time and work through them all. In most cases, the types should be enough to explain what's going on, even though they might appear cryptic in the beginning. Otherwise, have a look at the comments (if any) of each exercise.

Remember, that lower-case identifiers in a function signature are treated as type parameters.

  1. Implement the following generic functions for Maybe:

    -- make sure to map a `Just` to a `Just`.
    total
    mapMaybe : (a -> b) -> Maybe a -> Maybe b
    
    -- Example: `appMaybe (Just (+2)) (Just 20) = Just 22`
    total
    appMaybe : Maybe (a -> b) -> Maybe a -> Maybe b
    
    -- Example: `bindMaybe (Just 12) Just = Just 12`
    total
    bindMaybe : Maybe a -> (a -> Maybe b) -> Maybe b
    
    -- keep the value in a `Just` only if the given predicate holds
    total
    filterMaybe : (a -> Bool) -> Maybe a -> Maybe a
    
    -- keep the first value that is not a `Nothing` (if any)
    total
    first : Maybe a -> Maybe a -> Maybe a
    
    -- keep the last value that is not a `Nothing` (if any)
    total
    last : Maybe a -> Maybe a -> Maybe a
    
    -- this is another general way to extract a value from a `Maybe`.
    -- Make sure the following holds:
    -- `foldMaybe (+) 5 Nothing = 5`
    -- `foldMaybe (+) 5 (Just 12) = 17`
    total
    foldMaybe : (acc -> el -> acc) -> acc -> Maybe el -> acc
    
  2. Implement the following generic functions for Either:

    total
    mapEither : (a -> b) -> Either e a -> Either e b
    
    -- In case of both `Either`s being `Left`s, keep the
    -- value stored in the first `Left`.
    total
    appEither : Either e (a -> b) -> Either e a -> Either e b
    
    total
    bindEither : Either e a -> (a -> Either e b) -> Either e b
    
    -- Keep the first value that is not a `Left`
    -- If both `Either`s are `Left`s, use the given accumulator
    -- for the error values
    total
    firstEither : (e -> e -> e) -> Either e a -> Either e a -> Either e a
    
    -- Keep the last value that is not a `Left`
    -- If both `Either`s are `Left`s, use the given accumulator
    -- for the error values
    total
    lastEither : (e -> e -> e) -> Either e a -> Either e a -> Either e a
    
    total
    fromEither : (e -> c) -> (a -> c) -> Either e a -> c
    
  3. Implement the following generic functions for List:

    total
    mapList : (a -> b) -> List a -> List b
    
    total
    filterList : (a -> Bool) -> List a -> List a
    
    -- re-implement list concatenation (++) such that e.g. (++) [1, 2] [3, 4] = [1, 2, 3, 4]
    -- note that because this function conflicts with the standard
    -- Prelude.List.(++), if you use it then you will need to prefix it with
    -- the name of your module, like DataTypes.(++) or Ch3.(++). alternatively
    -- you could simply call the function something unique like myListConcat or concat'
    total
    (++) : List a -> List a -> List a
    
    -- return the first value of a list, if it is non-empty
    total
    headMaybe : List a -> Maybe a
    
    -- return everything but the first value of a list, if it is non-empty
    total
    tailMaybe : List a -> Maybe (List a)
    
    -- return the last value of a list, if it is non-empty
    total
    lastMaybe : List a -> Maybe a
    
    -- return everything but the last value of a list,
    -- if it is non-empty
    total
    initMaybe : List a -> Maybe (List a)
    
    -- accumulate the values in a list using the given
    -- accumulator function and initial value
    --
    -- Examples:
    -- `foldList (+) 10 [1,2,7] = 20`
    -- `foldList String.(++) "" ["Hello","World"] = "HelloWorld"`
    -- `foldList last Nothing (mapList Just [1,2,3]) = Just 3`
    total
    foldList : (acc -> el -> acc) -> acc -> List el -> acc
    
  4. Assume we store user data for our web application in the following record:

    record Client where
      constructor MkClient
      name          : String
      title         : Title
      age           : Bits8
      passwordOrKey : Either Bits64 String
    

    Using LoginError from an earlier exercise, implement function login, which, given a list of Clients plus a value of type Credentials will return either a LoginError in case no valid credentials were provided, or the first Client for whom the credentials match.

  5. Using your data type for chemical elements from an earlier exercise, implement a function for calculating the molar mass of a molecular formula.

    Use a list of elements each paired with its count (a natural number) for representing formulae. For instance:

    ethanol : List (Element,Nat)
    ethanol = [(C,2),(H,6),(O,1)]
    

    Hint: You can use function cast to convert a natural number to a Double.

Alternative Syntax for Data Definitions

module Tutorial.DataTypes.AltSyntax

While the examples in the section about parameterized data types are short and concise, there is a slightly more verbose but much more general form for writing such definitions, which makes it much clearer what's going on. In my opinion, this more general form should be preferred in all but the most simple data definitions.

Here are the definitions of Option, Validated, and Seq again, using this more general form (I put them in their own namespace, so Idris will not complain about identical names in the same source file):

-- GADT is an acronym for "generalized algebraic data type"
namespace GADT
  data Option : Type -> Type where
    Some : a -> Option a
    None : Option a

  data Validated : Type -> Type -> Type where
    Invalid : e -> Validated e a
    Valid   : a -> Validated e a

  data Seq : Type -> Type where
    Nil  : Seq a
    (::) : a -> GADT.Seq a -> Seq a

Here, Option is clearly declared as a type constructor (a function of type Type -> Type), while Some is a generic function of type a -> Option a (where a is a type parameter) and None is a nullary generic function of type Option a (a again being a type parameter). Likewise for Validated and Seq. Note, that in case of Seq we had to disambiguate between the different Seq definitions in the recursive case. Since we will usually not define several data types with the same name in a source file, this is not necessary most of the time.

Conclusion

We covered a lot of ground in this chapter, so I'll summarize the most important points below:

  • Enumerations are data types consisting of a finite number of possible values.

  • Sum types are data types with more than one data constructor, where each constructor describes a choice that can be made.

  • Product types are data types with a single constructor used to group several values of possibly different types.

  • We use pattern matching to deconstruct immutable values in Idris. The possible patterns correspond to a data type's data constructors.

  • We can bind variables to values in a pattern or use an underscore as a placeholder for a value that's not needed on the right hand side of an implementation.

  • We can pattern match on an intermediary result by introducing a case block.

  • The preferred way to define new product types is to define them as records, since these come with additional syntactic conveniences for setting and modifying individual record fields.

  • Generic types and functions allow us to generalize certain concepts and make them available for many types by using type parameters instead of concrete types in function and type signatures.

  • Common concepts like nullable values (Maybe), computations that might fail with some error condition (Either), and handling collections of values of the same type at once (List) are example use cases of generic types and functions already provided by the Prelude.

What's next

In the next section, we will introduce interfaces, another approach to function overloading.

Interfaces

Function overloading - the definition of functions with the same name but different implementations - is a concept found in many programming languages. Idris natively supports overloading of functions: Two functions with the same name can be defined in different modules or namespaces, and Idris will try to disambiguate between these based on the types involved. Here is an example:

module Tutorial.Interfaces

%default total

namespace Bool
  export
  size : Bool -> Integer
  size True  = 1
  size False = 0

namespace Integer
  export
  size : Integer -> Integer
  size = id

namespace List
  export
  size : List a -> Integer
  size = cast . length

Here, we defined three different functions called size, each in its own namespace. We can disambiguate between these by prefixing them with their namespace:

Tutorial.Interfaces> :t Bool.size
Tutorial.Interfaces.Bool.size : Bool -> Integer

However, this is usually not necessary:

mean : List Integer -> Integer
mean xs = sum xs `div` size xs

As you can see, Idris can disambiguate between the different size functions, since xs is of type List Integer, which unifies only with List a, the argument type of List.size.

Interface Basics

module Tutorial.Interfaces.Basics

While function overloading as described above works well, there are use cases, where this form of overloaded functions leads to a lot of code duplication.

As an example, consider a function cmp (short for compare, which is already exported by the Prelude), for describing an ordering for the values of type String:

cmp : String -> String -> Ordering

We'd also like to have similar functions for many other data types. Function overloading allows us to do just that, but cmp is not an isolated piece of functionality. From it, we can derive functions like greaterThan', lessThan', minimum', maximum', and many others:

lessThan' : String -> String -> Bool
lessThan' s1 s2 = LT == cmp s1 s2

greaterThan' : String -> String -> Bool
greaterThan' s1 s2 = GT == cmp s1 s2

minimum' : String -> String -> String
minimum' s1 s2 =
  case cmp s1 s2 of
    LT => s1
    _  => s2

maximum' : String -> String -> String
maximum' s1 s2 =
  case cmp s1 s2 of
    GT => s1
    _  => s2

We'd need to implement all of these again for the other types with a cmp function, and most if not all of these implementations would be identical to the ones written above. That's a lot of code repetition.

One way to solve this is to use higher-order functions. For instance, we could define function minimumBy, which takes a comparison function as its first argument and returns the smaller of the two remaining arguments:

minimumBy : (a -> a -> Ordering) -> a -> a -> a
minimumBy f a1 a2 =
  case f a1 a2 of
    LT => a1
    _  => a2

This solution is another proof of how higher-order functions allow us to reduce code duplication. However, the need to explicitly pass around the comparison function all the time can get tedious as well. It would be nice, if we could teach Idris to come up with such a function on its own.
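To make the cost of passing the comparison function around more concrete, here is a sketch of minimumBy in use. The helpers minString and minBits8 are hypothetical. Both pass function compare from the Prelude explicitly:

```idris
-- Repeated from above so that this snippet type checks on its own.
total
minimumBy : (a -> a -> Ordering) -> a -> a -> a
minimumBy f a1 a2 =
  case f a1 a2 of
    LT => a1
    _  => a2

-- Hypothetical helpers: the comparison function has to be
-- supplied at every use site.
total
minString : String -> String -> String
minString = minimumBy compare

total
minBits8 : Bits8 -> Bits8 -> Bits8
minBits8 = minimumBy compare
```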

Interfaces solve exactly this issue. Here's an example:

public export
interface Comp a where
  comp : a -> a -> Ordering

export
implementation Comp Bits8 where
  comp = compare

export
implementation Comp Bits16 where
  comp = compare

The code above defines interface Comp providing function comp for calculating the ordering for two values of a type a, followed by two implementations of this interface for types Bits8 and Bits16. Note, that the implementation keyword is optional.

The comp implementations for Bits8 and Bits16 both use function compare, which is part of a similar interface from the Prelude called Ord.

The next step is to look at the type of comp at the REPL:

Tutorial.Interfaces> :t comp
Tutorial.Interfaces.comp : Comp a => a -> a -> Ordering

The interesting part in the type signature of comp is the initial Comp a => argument. Here, Comp is a constraint on type parameter a. This signature can be read as: "For any type a, given an implementation of interface Comp for a, we can compare two values of type a and return an Ordering for these." Whenever we invoke comp, we expect Idris to come up with a value of type Comp a on its own, hence the new => arrow. If Idris fails to do so, it will answer with a type error.

We can now use comp in the implementations of related functions. All we have to do is to also prefix these derived functions with a Comp constraint:

lessThan : Comp a => a -> a -> Bool
lessThan s1 s2 = LT == comp s1 s2

greaterThan : Comp a => a -> a -> Bool
greaterThan s1 s2 = GT == comp s1 s2

minimum : Comp a => a -> a -> a
minimum s1 s2 =
  case comp s1 s2 of
    LT => s1
    _  => s2

maximum : Comp a => a -> a -> a
maximum s1 s2 =
  case comp s1 s2 of
    GT => s1
    _  => s2

Note, how the definition of minimum is almost identical to minimumBy. The only difference is that in case of minimumBy we had to pass the comparison function as an explicit argument, while for minimum it is provided as part of the Comp implementation, which is passed around by Idris for us.

Thus, we have defined all these utility functions once and for all for every type with an implementation of interface Comp.
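As a sketch of what this buys us, assume we add one more implementation, this time for String. This implementation and function isBefore are additions made for this illustration only:

```idris
-- The interface and `lessThan` are repeated from above so that
-- this snippet type checks on its own.
interface Comp a where
  comp : a -> a -> Ordering

lessThan : Comp a => a -> a -> Bool
lessThan s1 s2 = LT == comp s1 s2

-- An additional implementation (an assumption of this example),
-- again delegating to `compare` from the Prelude.
Comp String where
  comp = compare

-- Hypothetical helper: `lessThan` works for `String`
-- without any further code.
total
isBefore : String -> String -> Bool
isBefore = lessThan
```

The same would hold for greaterThan, minimum, and maximum: one implementation of Comp gives access to all of them.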

Exercises Part 1

  1. Implement function anyLarger, which should return True, if and only if a list of values contains at least one element larger than a given reference value. Use interface Comp in your implementation.

  2. Implement function allLarger, which should return True, if and only if a list of values contains only elements larger than a given reference value. Note, that this is trivially true for the empty list. Use interface Comp in your implementation.

  3. Implement function maxElem, which tries to extract the largest element from a list of values with a Comp implementation. Likewise for minElem, which tries to extract the smallest element. Note, that the possibility of the list being empty must be considered when deciding on the output type.

  4. Define an interface Concat for values like lists or strings, which can be concatenated. Provide implementations for lists and strings.

  5. Implement function concatList for concatenating the values in a list holding values with a Concat implementation. Make sure to reflect the possibility of the list being empty in your output type.

More About Interfaces

module Tutorial.Interfaces.More

import Tutorial.Interfaces.Basics

In the last section, we learned about the very basics of interfaces: Why they are useful and how to define and implement them. In this section, we will learn about some slightly advanced concepts: Extending interfaces, interfaces with constraints, and default implementations.

Extending Interfaces

Some interfaces form a kind of hierarchy. For instance, for the Concat interface used in exercise 4, there might be a child interface called Empty, for those types, which have a neutral element with relation to concatenation. In such a case, we make an implementation of Concat a prerequisite for implementing Empty:

interface Concat a where
  concat : a -> a -> a

implementation Concat String where
  concat = (++)

interface Concat a => Empty a where
  empty : a

implementation Empty String where
  empty = ""

Concat a => Empty a should be read as: "An implementation of Concat for type a is a prerequisite for there being an implementation of Empty for a." But this also means that, whenever we have an implementation of interface Empty, we must also have an implementation of Concat and can invoke the corresponding functions:

concatListE : Empty a => List a -> a
concatListE []        = empty
concatListE (x :: xs) = concat x (concatListE xs)

Note, how in the type of concatListE we only used an Empty constraint, and how in the implementation we were still able to invoke both empty and concat.

Constrained Implementations

Sometimes, it is only possible to implement an interface for a generic type, if its type parameters implement this interface as well. For instance, implementing interface Comp for Maybe a makes sense only if type a itself implements Comp. We can constrain interface implementations with the same syntax we use for constrained functions:

implementation Comp a => Comp (Maybe a) where
  comp Nothing  Nothing  = EQ
  comp (Just _) Nothing  = GT
  comp Nothing  (Just _) = LT
  comp (Just x) (Just y) = comp x y

This is not the same as extending an interface, although the syntax looks very similar. Here, the constraint lies on a type parameter instead of the full type. The last line in the implementation of Comp (Maybe a) compares the values stored in the two Justs. This is only possible, if there is a Comp implementation for these values as well. Go ahead, and remove the Comp a constraint from the above implementation, then have a look at the error message Idris produces. Learning to read and understand Idris' type errors is important for fixing them.

The good thing is, that Idris will solve all these constraints for us:

maxTest : Maybe Bits8 -> Ordering
maxTest = comp (Just 12)

Here, Idris tries to find an implementation for Comp (Maybe Bits8). In order to do so, it needs an implementation for Comp Bits8. Go ahead, and replace Bits8 in the type of maxTest with Bits64, and have a look at the error message Idris produces.

Default Implementations

Sometimes, we'd like to pack several related functions in an interface to allow programmers to implement each in the most efficient way, although they could be implemented in terms of each other. For instance, consider an interface Equals for comparing two values for equality, with functions eq returning True if two values are equal and neq returning True if they are not. Surely, we can implement neq in terms of eq, so most of the time when implementing Equals, we will only implement eq. In this case, we can give an implementation for neq already in the definition of Equals:

interface Equals a where
  eq : a -> a -> Bool

  neq : a -> a -> Bool
  neq a1 a2 = not (eq a1 a2)

If in an implementation of Equals we only implement eq, Idris will use the default implementation for neq as shown above:

Equals String where
  eq = (==)

If on the other hand we'd like to provide explicit implementations for both functions, we can do so as well:

Equals Bool where
  eq True True   = True
  eq False False = True
  eq _ _         = False

  neq True  False = True
  neq False True  = True
  neq _ _         = False

Exercises part 2

  1. Implement interfaces Equals, Comp, Concat, and Empty for pairs, constraining your implementations as necessary. (Note, that multiple constraints can be given sequentially like other function arguments: Comp a => Comp b => Comp (a,b).)

  2. Below is an implementation of a binary tree. Implement interfaces Equals and Concat for this type.

    data Tree : Type -> Type where
      Leaf : a -> Tree a
      Node : Tree a -> Tree a -> Tree a
    

Interfaces in the Prelude

module Tutorial.Interfaces.Prelude

The Idris Prelude provides several interfaces plus implementations that are useful in almost every non-trivial program. I'll introduce the basic ones here. The more advanced ones will be discussed in later chapters.

Most of these interfaces come with associated mathematical laws, and implementations are assumed to adhere to these laws. These laws will be given here as well.

Eq

Probably the most often used interface, Eq corresponds to interface Equals we used above as an example. Instead of eq and neq, Eq provides two operators (==) and (/=) for comparing two values of the same type for being equal or not. Most of the data types defined in the Prelude come with an implementation of Eq, and whenever programmers define their own data types, Eq is typically one of the first interfaces they implement.
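Here is a minimal sketch of such an implementation for a small enumeration. The data type Color is hypothetical and only used for this illustration:

```idris
-- Hypothetical data type, used only for this illustration.
data Color = Red | Green | Blue

-- We only implement `(==)`. `(/=)` comes with a default
-- implementation, just like `neq` in our `Equals` interface.
Eq Color where
  Red   == Red   = True
  Green == Green = True
  Blue  == Blue  = True
  _     == _     = False
```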

Eq Laws

We expect the following laws to hold for all implementations of Eq:

  • (==) is reflexive: x == x = True for all x. This means, that every value is equal to itself.

  • (==) is symmetric: x == y = y == x for all x and y. This means, that the order of arguments passed to (==) does not matter.

  • (==) is transitive: From x == y = True and y == z = True follows x == z = True.

  • (/=) is the negation of (==): x == y = not (x /= y) for all x and y.

In theory, Idris has the power to verify these laws at compile time for many non-primitive types. However, out of pragmatism this is not required when implementing Eq, since writing such proofs can be quite involved.

Ord

The counterpart to Comp in the Prelude is interface Ord. In addition to compare, which is identical to our own comp, it provides comparison operators (>=), (>), (<=), and (<), as well as utility functions max and min. Unlike Comp, Ord extends Eq, so whenever there is an Ord constraint, we also have access to operators (==) and (/=) and related functions.

Ord Laws

We expect the following laws to hold for all implementations of Ord:

  • (<=) is reflexive and transitive.
  • (<=) is antisymmetric: From x <= y = True and y <= x = True follows x == y = True.
  • x <= y = y >= x.
  • x < y = not (y <= x)
  • x > y = not (y >= x)
  • compare x y = EQ => x == y = True
  • compare x y == GT = x > y
  • compare x y == LT = x < y
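One convenient way to end up with a lawful implementation is to delegate to a type whose implementation is already trusted to be lawful. Below is a sketch of this approach. Data type Size and helper function rank are hypothetical and only used for this illustration:

```idris
-- Hypothetical data type and helper, used only for this illustration.
data Size = Small | Medium | Large

total
rank : Size -> Nat
rank Small  = 0
rank Medium = 1
rank Large  = 2

-- `Ord` extends `Eq`, so we need this implementation first.
Eq Size where
  x == y = rank x == rank y

-- Implementing `compare` is enough: the comparison operators
-- as well as `max` and `min` come with default implementations.
Ord Size where
  compare x y = compare (rank x) (rank y)

total
largest : Size
largest = max Small Large
```

Since both (==) and compare are defined via rank, they agree with each other as long as rank maps different values to different natural numbers.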

Semigroup and Monoid

Semigroup is the counterpart to our example interface Concat, with operator (<+>) (also called append) corresponding to function concat.

Likewise, Monoid corresponds to Empty, with neutral corresponding to empty.

These are incredibly important interfaces, which can be used to combine two or more values of a data type into a single value of the same type. Examples include but are not limited to addition or multiplication of numeric types, concatenation of sequences of data, or sequencing of computations.

As an example, consider a data type for representing distances in a geometric application. We could just use Double for this, but that's not very type safe. It would be better to use a single field record wrapping values of type Double, to give such values clear semantics:

record Distance where
  constructor MkDistance
  meters : Double

There is a natural way for combining two distances: We sum up the values they hold. This immediately leads to an implementation of Semigroup:

Semigroup Distance where
  x <+> y = MkDistance $ x.meters + y.meters

It is also immediately clear, that zero is the neutral element of this operation: Adding zero to any value does not affect the value at all. This allows us to implement Monoid as well:

Monoid Distance where
  neutral = MkDistance 0
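With both implementations in place, we can for instance sum up a whole list of distances. The following is a sketch in the style of concatListE from the last section. Function totalDistance is a hypothetical helper:

```idris
-- `Distance` and its implementations are repeated from above
-- so that this snippet type checks on its own.
record Distance where
  constructor MkDistance
  meters : Double

Semigroup Distance where
  x <+> y = MkDistance $ x.meters + y.meters

Monoid Distance where
  neutral = MkDistance 0

-- Hypothetical helper: the empty list yields the neutral element,
-- all other values are combined using `(<+>)`.
total
totalDistance : List Distance -> Distance
totalDistance []        = neutral
totalDistance (d :: ds) = d <+> totalDistance ds
```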

Semigroup and Monoid Laws

We expect the following laws to hold for all implementations of Semigroup and Monoid:

  • (<+>) is associative: x <+> (y <+> z) = (x <+> y) <+> z, for all values x, y, and z.
  • neutral is the neutral element with relation to (<+>): neutral <+> x = x <+> neutral = x, for all x.

Show

The Show interface is mainly used for debugging purposes, and is supposed to display values of a given type as a string, typically closely resembling the Idris code used to create the value. This includes the proper wrapping of arguments in parentheses where necessary. For instance, experiment with the output of the following function at the REPL:

showExample : Maybe (Either String (List (Maybe Integer))) -> String
showExample = show

And at the REPL:

Tutorial.Interfaces> showExample (Just (Right [Just 12, Nothing]))
"Just (Right [Just 12, Nothing])"

We will learn how to implement instances of Show in an exercise.

Overloaded Literals

Literal values in Idris, such as integer literals (12001), string literals ("foo bar"), floating point literals (12.112), and character literals ('$') can be overloaded. This means, that we can create values of types other than String from just a string literal. The exact workings of this have to wait for another section, but for many common cases, it is sufficient for a value to implement interfaces FromString (for using string literals), FromChar (for using character literals), or FromDouble (for using floating point literals). The case of integer literals is special, and will be discussed in the next section.

Here is an example of using FromString. Assume, we write an application where users can identify themselves with a username and password. Both consist of strings of characters, so it is pretty easy to confuse and mix up the two things, although they clearly have very different semantics. In these cases, it is advisable to come up with new types for the two, especially since getting these things wrong is a security concern.

Here are three example record types to do this:

record UserName where
  constructor MkUserName
  name : String

record Password where
  constructor MkPassword
  value : String

record User where
  constructor MkUser
  name     : UserName
  password : Password

In order to create a value of type User, even for testing, we'd have to wrap all strings using the given constructors:

hock : User
hock = MkUser (MkUserName "hock") (MkPassword "not telling")

This is rather cumbersome, and some people might think this to be too high a price to pay just for an increase in type safety (I'd tend to disagree). Luckily, we can get the convenience of string literals back very easily:

FromString UserName where
  fromString = MkUserName

FromString Password where
  fromString = MkPassword

hock2 : User
hock2 = MkUser "hock" "not telling"

Numeric Interfaces

The Prelude also exports several interfaces providing the usual arithmetic operations. Below is a comprehensive list of the interfaces and the functions each provides:

  • Num

    • (+) : Addition
    • (*) : Multiplication
    • fromInteger : Overloaded integer literals
  • Neg

    • negate : Negation
    • (-) : Subtraction
  • Integral

    • div : Integer division
    • mod : Modulo operation
  • Fractional

    • (/) : Division
    • recip : Calculates the reciprocal of a value

As you can see: We need to implement interface Num to use integer literals for a given type. In order to use negative integer literals like -12, we also have to implement interface Neg.
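Here is a sketch of what this looks like in practice. The record type V2, a pair of integers with component-wise arithmetic, is hypothetical and only used for this illustration:

```idris
-- Hypothetical record type, used only for this illustration.
record V2 where
  constructor MkV2
  dx : Integer
  dy : Integer

Num V2 where
  v + w         = MkV2 (v.dx + w.dx) (v.dy + w.dy)
  v * w         = MkV2 (v.dx * w.dx) (v.dy * w.dy)
  fromInteger n = MkV2 n n

Neg V2 where
  negate (MkV2 x y) = MkV2 (negate x) (negate y)
  v - w             = MkV2 (v.dx - w.dx) (v.dy - w.dy)

-- Thanks to `Num`, integer literals can be used for `V2` ...
ones : V2
ones = 1

-- ... and thanks to `Neg`, so can negative integer literals.
minusOnes : V2
minusOnes = -1
```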

Cast

The last interface we will quickly discuss in this section is Cast. It is used to convert values of one type to values of another via function cast. Cast is special in that it is parameterized over two type parameters, whereas all the other interfaces we have looked at so far have only one.

So far, Cast is mainly used for interconversion between primitive types in the standard libraries, especially numeric types. When you look at the implementations exported from the Prelude (for instance, by invoking :doc Cast at the REPL), you'll see that there are dozens of implementations for most pairings of primitive types.

Although Cast would also be useful for other conversions (for going from Maybe to List or for going from Either e to Maybe, for instance), the Prelude and base seem not to introduce these consistently. For instance, there are Cast implementations for going from SnocList to List and vice versa, but not for going from Vect n to List, or for going from List1 to List, although these would be just as feasible.
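
Nothing stops us from implementing Cast for our own types. The following is a small sketch under that assumption; the record Percent and the value ratio are hypothetical names introduced only for this example:

```idris
-- Hypothetical wrapper for a percentage stored as an 8-bit number.
record Percent where
  constructor MkPercent
  value : Bits8

-- Cast takes two type parameters: the source and the target type.
-- On the right-hand side, cast converts the Bits8 field to a Double
-- using an implementation exported by the Prelude.
Cast Percent Double where
  cast p = cast p.value / 100

-- The target type is taken from the type signature.
ratio : Double
ratio = cast (MkPercent 25)
```

Note that the target type of cast has to be known from the context, which is why ratio comes with an explicit type signature.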

Exercises part 3

These exercises are meant to make you comfortable with implementing interfaces for your own data types, as you will have to do so regularly when writing Idris code.

While it is immediately clear why interfaces like Eq, Ord, or Num are useful, the usability of Semigroup and Monoid may be harder to appreciate at first. Therefore, there are several exercises where you'll implement different instances for these.

  1. Define a record type Complex for complex numbers, by pairing two values of type Double. Implement interfaces Eq, Num, Neg, and Fractional for Complex.

  2. Implement interface Show for Complex. Have a look at data type Prec and function showPrec and how these are used in the Prelude to implement instances for Either and Maybe.

    Verify the correct behavior of your implementation by wrapping a value of type Complex in a Just and show the result at the REPL.

  3. Consider the following wrapper for optional values:

    record First a where
      constructor MkFirst
      value : Maybe a
    

    Implement interfaces Eq, Ord, Show, FromString, FromChar, FromDouble, Num, Neg, Integral, and Fractional for First a. All of these will require corresponding constraints on type parameter a. Consider implementing and using the following utility functions where they make sense:

    pureFirst : a -> First a
    
    mapFirst : (a -> b) -> First a -> First b
    
    mapFirst2 : (a -> b -> c) -> First a -> First b -> First c
    
  4. Implement interfaces Semigroup and Monoid for First a in such a way, that (<+>) will return the first non-nothing argument and neutral is the corresponding neutral element. There must be no constraints on type parameter a in these implementations.

  5. Repeat exercises 3 and 4 for record Last. The Semigroup implementation should return the last non-nothing value.

    record Last a where
      constructor MkLast
      value : Maybe a
    
  6. Function foldMap allows us to map a function returning a Monoid over a list of values and accumulate the result using (<+>) at the same time. This is a very powerful way to accumulate the values stored in a list. Use foldMap and Last to extract the last element (if any) from a list.

    Note, that the type of foldMap is more general and not specialized to lists only. It works also for Maybe, Either and other container types we haven't looked at so far. We will learn about interface Foldable in a later section.

  7. Consider record wrappers Any and All for boolean values:

    record Any where
      constructor MkAny
      any : Bool
    
    record All where
      constructor MkAll
      all : Bool
    

    Implement Semigroup and Monoid for Any, so that the result of (<+>) is True, if and only if at least one of the arguments is True. Make sure that neutral is indeed the neutral element for this operation.

    Likewise, implement Semigroup and Monoid for All, so that the result of (<+>) is True, if and only if both of the arguments are True. Make sure that neutral is indeed the neutral element for this operation.

  8. Implement functions anyElem and allElems using foldMap and Any or All, respectively:

    -- True, if the predicate holds for at least one element
    anyElem : (a -> Bool) -> List a -> Bool
    
    -- True, if the predicate holds for all elements
    allElems : (a -> Bool) -> List a -> Bool
    
  9. Record wrappers Sum and Product are mainly used to hold numeric types.

    record Sum a where
      constructor MkSum
      value : a
    
    record Product a where
      constructor MkProduct
      value : a
    

    Given an implementation of Num a, implement Semigroup (Sum a) and Monoid (Sum a), so that (<+>) corresponds to addition.

    Likewise, implement Semigroup (Product a) and Monoid (Product a), so that (<+>) corresponds to multiplication.

    When implementing neutral, remember that you can use integer literals when working with numeric types.

  10. Implement sumList and productList by using foldMap together with the wrappers from Exercise 9:

    sumList : Num a => List a -> a
    
    productList : Num a => List a -> a
    
  11. To appreciate the power and versatility of foldMap, after solving exercises 6 to 10 (or by loading Solutions.Interfaces in a REPL session), run the following at the REPL, which will - in a single list traversal! - calculate the first and last element of the list as well as the sum and product of all values.

    > foldMap (\x => (pureFirst x, pureLast x, MkSum x, MkProduct x)) [3,7,4,12]
    (MkFirst (Just 3), (MkLast (Just 12), (MkSum 26, MkProduct 1008)))
    

    Note, that there are also Semigroup implementations for types with an Ord implementation, which will return the smaller or larger of two values. In case of types with an absolute minimum or maximum (for instance, 0 for natural numbers, or 0 and 255 for Bits8), these can even be extended to Monoid.

  12. In an earlier exercise, you implemented a data type representing chemical elements and wrote a function for calculating their atomic masses. Define a new single field record type for representing atomic masses, and implement interfaces Eq, Ord, Show, FromDouble, Semigroup, and Monoid for this.

  13. Use the new data type from exercise 12 to calculate the atomic mass of an element and compute the molecular mass of a molecule given by its formula.

    Hint: With a suitable utility function, you can use foldMap once again for this.

Final notes: If you are new to functional programming, make sure to give your implementations of exercises 6 to 10 a try at the REPL. Note, how we can implement all of these functions with a minimal amount of code and how, as shown in exercise 11, these behaviors can be combined in a single list traversal.

Conclusion

  • Interfaces allow us to implement the same function with different behavior for different types.
  • Functions taking one or more interface implementations as arguments are called constrained functions.
  • Interfaces can be organized hierarchically by extending other interfaces.
  • Interface implementations can themselves be constrained, requiring other implementations to be available.
  • Interface functions can be given a default implementation, which can be overridden by implementers, for instance for reasons of efficiency.
  • Certain interfaces allow us to use literal values such as string or integer literals for our own data types.

Note, that I did not yet tell the whole story about literal values in this section. More details for using literals with types that accept only a restricted set of values can be found in the chapter about primitives.

What's next

In the next chapter, we have a closer look at functions and their types. We will learn about named arguments, implicit arguments, and erased arguments as well as some constructs for implementing more complex functions.

Functions Part 2

So far, we learned about the core features of the Idris language, which it has in common with several other pure, strongly typed programming languages like Haskell: (Higher-order) Functions, algebraic data types, pattern matching, parametric polymorphism (generic types and functions), and ad hoc polymorphism (interfaces and constrained functions).

In this chapter, we start to dissect Idris functions and their types for real. We learn about implicit arguments, named arguments, as well as erasure and quantities. But first, we'll look at let bindings and where blocks, which help us implement functions too complex to fit on a single line of code. Let's get started!

module Tutorial.Functions2

%default total

Let Bindings and Local Definitions

module Tutorial.Functions2.LetBindings

%default total

The functions we looked at so far were simple enough to be implemented directly via pattern matching without the need of additional auxiliary functions or variables. This is not always the case, and there are two important language constructs for introducing and reusing new local variables and functions. We'll look at these in two case studies.

Use Case 1: Arithmetic Mean and Standard Deviation

In this example, we'd like to calculate the arithmetic mean and the standard deviation of a list of floating point values. There are several things we need to consider.

First, we need a function for calculating the sum of a list of numeric values. The Prelude exports function sum for this:

Main> :t sum
Prelude.sum : Num a => Foldable t => t a -> a

This is - of course - similar to sumList from Exercise 10 of the last section, but generalized to all container types with a Foldable implementation. We will learn about interface Foldable in a later section.

In order to also calculate the variance, we need to convert every value in the list to a new value, as we have to subtract the mean from every value in the list and square the result. In the previous section's exercises, we defined function mapList for this. The Prelude - of course - already exports a similar function called map, which is again more general and works also like our mapMaybe for Maybe and mapEither for Either e. Here's its type:

Main> :t map
Prelude.map : Functor f => (a -> b) -> f a -> f b

Interface Functor is another one we'll talk about in a later section.

Finally, we need a way to calculate the length of a list of values. We use function length for this:

Main> :t List.length
Prelude.List.length : List a -> Nat

Here, Nat is the type of natural numbers (unbounded, unsigned integers). Nat is actually not a primitive data type but a sum type defined in the Prelude with data constructors Z : Nat (for zero) and S : Nat -> Nat (for successor). It might seem highly inefficient to define natural numbers this way, but the Idris compiler treats these and several other number-like types specially, and replaces them with primitive integers during code generation.

We are now ready to give the implementation a go. Since this is Idris, and we care about clear semantics, we will quickly define a custom record type instead of just returning a tuple of Doubles. This makes it clearer which floating point number corresponds to which statistical quantity:

square : Double -> Double
square n = n * n

record Stats where
  constructor MkStats
  mean      : Double
  variance  : Double
  deviation : Double

stats : List Double -> Stats
stats xs =
  let len      := cast (length xs)
      mean     := sum xs / len
      variance := sum (map (\x => square (x - mean)) xs) / len
   in MkStats mean variance (sqrt variance)

As usual, we first try this at the REPL:

Tutorial.Functions2> stats [2,4,4,4,5,5,7,9]
MkStats 5.0 4.0 2.0

Seems to work, so let's digest this step by step. We introduce several new local variables (len, mean, and variance), all of which are used more than once in the remainder of the implementation. To do so, we use a let binding. This consists of the let keyword, followed by one or more variable assignments, followed by the final expression, which has to be prefixed by in. Note that whitespace is significant again: We need to properly align the three variable names. Go ahead and try out what happens if you remove a space in front of mean or variance. Note also that the alignment of the assignment operators := is optional. I do this since I think it helps readability.

Let's also quickly look at the different variables and their types. len is the length of the list cast to a Double, since this is what's needed later on, where we divide other values of type Double by the length. Idris is very strict about this: We are not allowed to mix up numeric types without explicit casts. Please note that in this case Idris is able to infer the type of len from the surrounding context. mean is straightforward: We sum up the values stored in the list and divide by the list's length. variance is the most involved of the three: We map each item in the list to a new value using an anonymous function to subtract the mean and square the result. We then sum up the new terms and divide again by the number of values.
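
Here is a second, smaller sketch of the same pattern. Function relativeDeviations is a hypothetical helper that is not used elsewhere in the book; it reuses two let-bound variables to express every value's deviation from the mean as a fraction of the mean:

```idris
-- Hypothetical helper: deviation of each value from the mean,
-- relative to the mean itself.
relativeDeviations : List Double -> List Double
relativeDeviations xs =
  let len  := cast (length xs)  -- inferred to be a Double from its use below
      mean := sum xs / len      -- later bindings can refer to earlier ones
   in map (\x => (x - mean) / mean) xs
```

As in stats, the variable names must be aligned, and the final expression follows the in keyword.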

Use Case 2: Simulating a Simple Web Server

In the second use case, we are going to write a slightly larger application. This should give you an idea about how to design data types and functions around some business logic you'd like to implement.

Assume we run a music streaming web server, where users can buy whole albums and listen to them online. We'd like to simulate a user connecting to the server and getting access to one of the albums they bought.

We first define a bunch of record types:

record Artist where
  constructor MkArtist
  name : String

record Album where
  constructor MkAlbum
  name   : String
  artist : Artist

record Email where
  constructor MkEmail
  value : String

record Password where
  constructor MkPassword
  value : String

record User where
  constructor MkUser
  name     : String
  email    : Email
  password : Password
  albums   : List Album

Most of these should be self-explanatory. Note, however, that in several cases (Email, Artist, Password) we wrap a single value in a new record type. Of course, we could have used the unwrapped String type instead, but we'd have ended up with many String fields, which can be hard to disambiguate. In order not to confuse an email string with a password string, it can therefore be helpful to wrap both of them in a new record type to drastically increase type safety at the cost of having to reimplement some interfaces. Utility function on from the Prelude is very useful for this. Don't forget to inspect its type at the REPL, and try to understand what's going on here.

Eq Artist where (==) = (==) `on` name

Eq Email where (==) = (==) `on` value

Eq Password where (==) = (==) `on` value

Eq Album where (==) = (==) `on` \a => (a.name, a.artist)

In case of Album, we wrap the two fields of the record in a Pair, which already comes with an implementation of Eq. This allows us to again use function on, which is very convenient.
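
If the behavior of on is still unclear, it may help to compare it with the comparison written out by hand. The sketch below uses a hypothetical record Tag and assumes only that on is in scope from the Prelude, as stated above:

```idris
-- Hypothetical single-field record, used only for this illustration.
record Tag where
  constructor MkTag
  value : String

-- Using on: apply the projection value to both arguments,
-- then compare the two results with (==).
sameTag : Tag -> Tag -> Bool
sameTag = (==) `on` value

-- The same comparison without on.
sameTag' : Tag -> Tag -> Bool
sameTag' x y = x.value == y.value
```

Both functions should behave identically; on merely saves us from naming the two arguments.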

Next, we have to define the data types representing server requests and responses:

record Credentials where
  constructor MkCredentials
  email    : Email
  password : Password

record Request where
  constructor MkRequest
  credentials : Credentials
  album       : Album

data Response : Type where
  UnknownUser     : Email -> Response
  InvalidPassword : Response
  AccessDenied    : Email -> Album -> Response
  Success         : Album -> Response

For server responses, we use a custom sum type encoding the possible outcomes of a client request. In practice, the Success case would return some kind of connection to start the actual album stream, but we just wrap up the album we found to simulate this behavior.

We can now go ahead and simulate the handling of a request at the server. To emulate our user data base, a simple list of users will do. Here's the type of the function we'd like to implement:

DB : Type
DB = List User

handleRequest : DB -> Request -> Response

Note, how we defined a short alias for List User called DB. This is often useful to make lengthy type signatures more readable and communicate the meaning of a type in the given context. However, this will not introduce a new type, nor will it increase type safety: DB is identical to List User, and as such, a value of type DB can be used wherever a List User is expected and vice versa. In more complex programs it is therefore usually preferable to define new types by wrapping values in single-field records.
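
The difference between an alias and a wrapper can be sketched as follows. All names in this example (Title, SafeTitle, shout, and so on) are hypothetical and serve only to illustrate the point:

```idris
-- A type alias: Title and String are the very same type.
Title : Type
Title = String

shout : Title -> String
shout t = t ++ "!"

plain : String
plain = "Blue Train"

-- Accepted: a plain String can be used where a Title is expected.
aliasResult : String
aliasResult = shout plain

-- A single-field record: SafeTitle is a new type distinct from String.
record SafeTitle where
  constructor MkSafeTitle
  value : String

shoutSafe : SafeTitle -> String
shoutSafe t = t.value ++ "!"

-- Here, the String has to be wrapped explicitly.
wrappedResult : String
wrappedResult = shoutSafe (MkSafeTitle plain)
```

Writing shoutSafe plain instead should be rejected by the type checker, which is exactly the additional safety the wrapper buys us.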

The implementation will proceed as follows: It will first try to look up a User by their email address in the data base. If this is successful, it will compare the provided password with the user's actual password. If the two match, it will look up the requested album in the user's list of albums. If all of these steps succeed, the result will be an Album wrapped in a Success. If any of the steps fails, the result will describe exactly what went wrong.

Here's a possible implementation:

handleRequest db (MkRequest (MkCredentials email pw) album) =
  case lookupUser db of
    Just (MkUser _ _ password albums)  =>
      if password == pw then lookupAlbum albums else InvalidPassword

    Nothing => UnknownUser email

  where lookupUser : List User -> Maybe User
        lookupUser []        = Nothing
        lookupUser (x :: xs) =
          if x.email == email then Just x else lookupUser xs

        lookupAlbum : List Album -> Response
        lookupAlbum []        = AccessDenied email album
        lookupAlbum (x :: xs) =
          if x == album then Success album else lookupAlbum xs

I'd like to point out several things in this example. First, note how we can extract values from nested records in a single pattern match. Second, we defined two local functions in a where block: lookupUser, and lookupAlbum. Both of these have access to all variables in the surrounding scope. For instance, lookupUser uses the email variable from the pattern match in the implementation's first line. Likewise, lookupAlbum makes use of the album variable.

A where block introduces new local definitions, accessible only from the surrounding scope and from other functions defined later in the same where block. These need to be explicitly typed and indented by the same amount of whitespace.
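
As a smaller, self-contained sketch of these rules, consider the following hypothetical function countAbove, which counts the values in a list that exceed a given limit:

```idris
-- Hypothetical example: count the values larger than limit.
countAbove : Double -> List Double -> Nat
countAbove limit xs = go xs
  where
    -- Local definitions need their own type signature.
    go : List Double -> Nat
    go []        = 0
    -- limit comes from the surrounding scope; it is not an argument of go.
    go (y :: ys) = if y > limit then S (go ys) else go ys
```

The local function go cannot be called from anywhere outside of countAbove.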

Local definitions can also be introduced before a function's implementation by using the let keyword. This usage of let is not to be confused with let bindings described above, which are used to bind and reuse the results of intermediate computations. Below is how we could have implemented handleRequest with local definitions introduced by the let keyword. Again, all definitions have to be properly typed and indented:

handleRequest' : DB -> Request -> Response
handleRequest' db (MkRequest (MkCredentials email pw) album) =
  let lookupUser : List User -> Maybe User
      lookupUser []        = Nothing
      lookupUser (x :: xs) =
        if x.email == email then Just x else lookupUser xs

      lookupAlbum : List Album -> Response
      lookupAlbum []        = AccessDenied email album
      lookupAlbum (x :: xs) =
        if x == album then Success album else lookupAlbum xs

   in case lookupUser db of
        Just (MkUser _ _ password albums)  =>
          if password == pw then lookupAlbum albums else InvalidPassword

        Nothing => UnknownUser email

Exercises

The exercises in this section are supposed to increase your experience in writing purely functional code. In some cases it might be useful to use let expressions or where blocks, but this will not always be required.

Exercise 4 is again of utmost importance. traverseList is a specialized version of the more general traverse, one of the most powerful and versatile functions available in the Prelude (check out its type!).

  1. Module Data.List in base exports functions find and elem. Inspect their types and use these in the implementation of handleRequest. This should allow you to completely get rid of the where block.

  2. Refactor handleRequest to use Either, such that handleRequest : DB -> Request -> Either Failure Album, where

    data Failure : Type where
      UnknownUser : Email -> Failure
      InvalidPassword : Failure
      AccessDenied : Email -> Album -> Failure
    

    Hint: You may find nested case statements helpful.

  3. Define an enumeration type listing the four nucleobases occurring in DNA strands. Define also a type alias DNA for lists of nucleobases. Declare and implement function readBase for converting a single character (type Char) to a nucleobase. You can use character literals in your implementation like so: 'A', 'a'. Note, that this function might fail, so adjust the result type accordingly.

  4. Implement the following function, which tries to convert all values in a list with a function, which might fail. The result should be a Just holding the list of converted values in unmodified order, if and only if every single conversion was successful.

    traverseList : (a -> Maybe b) -> List a -> Maybe (List b)
    

    You can verify, that the function behaves correctly with the following test: traverseList Just [1,2,3] = Just [1,2,3].

  5. Implement function readDNA : String -> Maybe DNA using the functions and types defined in exercises 3 and 4. You will also need function unpack from the Prelude.

  6. Implement function complement : DNA -> DNA to calculate the complement of a strand of DNA.

The Truth about Function Arguments

module Tutorial.Functions2.TheTruth

%default total

So far, when we defined a top level function, it looked something like the following:

zipEitherWith : (a -> b -> c) -> Either e a -> Either e b -> Either e c
zipEitherWith f (Right va) (Right vb) = Right (f va vb)
zipEitherWith f (Left e)   _          = Left e
zipEitherWith f _          (Left e)   = Left e

Function zipEitherWith is a generic higher-order function combining the values stored in two Eithers via a binary function. If either of the Either arguments is a Left, the result is also a Left.

This is a generic function with type parameters a, b, c, and e. However, there is a more verbose type for zipEitherWith, which is visible in the REPL when entering :ti zipEitherWith (the i here tells Idris to include implicit arguments). You will get a type similar to this:

zipEitherWith' :  {0 a : Type}
               -> {0 b : Type}
               -> {0 c : Type}
               -> {0 e : Type}
               -> (a -> b -> c)
               -> Either e a
               -> Either e b
               -> Either e c

In order to understand what's going on here, we will have to talk about named arguments, implicit arguments, and quantities.

Named Arguments

In a function type, we can give each argument a name. Like so:

fromMaybe : (deflt : a) -> (ma : Maybe a) -> a
fromMaybe deflt Nothing = deflt
fromMaybe _    (Just x) = x

Here, the first argument is given name deflt, the second ma. These names can be reused in a function's implementation, as was done for deflt, but this is not mandatory: We are free to use different names in the implementation. There are several reasons, why we'd choose to name our arguments: It can serve as documentation, but it also allows us to pass the arguments to a function in arbitrary order when using the following syntax:

extractBool : Maybe Bool -> Bool
extractBool v = fromMaybe { ma = v, deflt = False }

Or even:

extractBool2 : Maybe Bool -> Bool
extractBool2 = fromMaybe { deflt = False }

The arguments in a record's constructor are automatically named in accordance with the field names:

record Dragon where
  constructor MkDragon
  name      : String
  strength  : Nat
  hitPoints : Int16

gorgar : Dragon
gorgar = MkDragon { strength = 150, name = "Gorgar", hitPoints = 10000 }

For the use cases described above, named arguments are merely a convenience and completely optional. However, Idris is a dependently typed programming language: Types can be calculated from and depend on values. For instance, the result type of a function can depend on the value of one of its arguments. Here's a contrived example:

IntOrString : Bool -> Type
IntOrString True  = Integer
IntOrString False = String

intOrString : (v : Bool) -> IntOrString v
intOrString False = "I'm a String"
intOrString True  = 1000

If you see such a thing for the first time, it can be hard to understand what's going on here. First, function IntOrString computes a Type from a Bool value: If the argument is True, it returns type Integer, if the argument is False it returns String. We use this to calculate the return type of function intOrString based on its boolean argument v: If v is True, the return type is (in accordance with IntOrString True = Integer) Integer, otherwise it is String.

Note, how in the type signature of intOrString, we must give the argument of type Bool a name (v) in order to reference it in the result type IntOrString v.

You might wonder at this moment, why this is useful and why we would ever want to define a function with such a strange type. We will see lots of very useful examples in due time! For now, suffice to say that in order to express dependent function types, we need to name at least some of the function's arguments and refer to them by name in the types of other arguments.

Implicit Arguments

Implicit arguments are arguments, the values of which the compiler should infer and fill in for us automatically. For instance, in the following function signature, we expect the compiler to infer the value of type parameter a automatically from the types of the other arguments (ignore the 0 quantity for the moment; I'll explain it in the next subsection):

maybeToEither : {0 a : Type} -> Maybe a -> Either String a
maybeToEither Nothing  = Left "Nope"
maybeToEither (Just x) = Right x

-- Please remember, that the above is
-- equivalent to the following:
maybeToEither' : Maybe a -> Either String a
maybeToEither' Nothing  = Left "Nope"
maybeToEither' (Just x) = Right x

As you can see, implicit arguments are wrapped in curly braces, unlike explicit named arguments, which are wrapped in parentheses. Inferring the value of an implicit argument is not always possible. For instance, if we enter the following at the REPL, Idris will fail with an error:

Tutorial.Functions2> show (maybeToEither Nothing)
Error: Can't find an implementation for Show (Either String ?a).

Idris is unable to find an implementation of Show (Either String a) without knowing what a actually is. Note the question mark in front of the type parameter: ?a. If this happens, there are several ways to help the type checker. We could, for instance, pass a value for the implicit argument explicitly. Here's the syntax to do this:

Tutorial.Functions2> show (maybeToEither {a = Int8} Nothing)
"Left \"Nope\""

As you can see, we use the same syntax as shown above for explicit named arguments and the two forms of argument passing can be mixed.

We could also specify the type of the whole expression using utility function the from the Prelude:

Tutorial.Functions2> show (the (Either String Int8) (maybeToEither Nothing))
"Left \"Nope\""

It is instructive to have a look at the type of the:

Tutorial.Functions2> :ti the
Prelude.the : (0 a : Type) -> a -> a

Compare this with the identity function id:

Tutorial.Functions2> :ti id
Prelude.id : {0 a : Type} -> a -> a

The only difference between the two: In case of the, the type parameter a is an explicit argument, while in case of id, it is an implicit argument. Although the two functions have almost identical types (and implementations!), they serve quite different purposes: the is used to help type inference, while id is used whenever we'd like to return an argument without modifying it at all (which, in the presence of higher-order functions, happens surprisingly often).

Both ways to improve type inference shown above are used quite often, and must be understood by Idris programmers.
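
Both techniques work in source files just as they do at the REPL. The following sketch uses a hypothetical function noItems, whose type parameter cannot be inferred from an argument, and resolves it once with each technique:

```idris
-- Hypothetical function: nothing in its arguments determines a.
noItems : {0 a : Type} -> List a
noItems = []

-- Variant 1: pass the implicit argument explicitly.
viaImplicit : String
viaImplicit = show (noItems {a = Nat})

-- Variant 2: fix the type of the whole expression with the.
viaThe : String
viaThe = show (the (List Nat) noItems)
```

Without one of the two hints, show noItems should fail for the same reason as the REPL example above: Idris cannot pick a Show implementation without knowing a.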

Multiplicities

Finally, we need to talk about the zero multiplicity, which appeared in several of the type signatures in this section. Idris 2, unlike its predecessor Idris 1, is based on a core language called quantitative type theory (QTT): Every variable in Idris 2 is associated with one of three possible multiplicities:

  • 0, meaning that the variable is erased at runtime.
  • 1, meaning that the variable is used exactly once at runtime.
  • Unrestricted (the default), meaning that the variable is used an arbitrary number of times at runtime.

We will not talk about the most complex of the three, multiplicity 1, here. We are, however, often interested in multiplicity 0: A variable with multiplicity 0 is only relevant at compile time. It will not make any appearance at runtime, and the computation of such a variable will never affect a program's runtime performance.

In the type signature of maybeToEither we see that type parameter a has multiplicity 0, and will therefore be erased and is only relevant at compile time, while the Maybe a argument has unrestricted multiplicity.

It is also possible to annotate explicit arguments with multiplicities, in which case the argument must again be put in parentheses. For an example, look again at the type signature of the.
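
As a brief sketch, here is a hypothetical function idOf, which has the same shape as the but is defined from scratch so the explicit erased argument is visible in the implementation:

```idris
-- Hypothetical function with an explicit argument of multiplicity 0.
-- The erased argument is matched with an underscore, since it
-- is not available at runtime.
idOf : (0 a : Type) -> a -> a
idOf _ x = x

-- The type is passed explicitly, just like with the.
five : Nat
five = idOf Nat 5
```

Since a is erased, only the second argument exists at runtime.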

Underscores

It is often desirable to write only as little code as necessary and let Idris figure out the rest. We have already learned about one such occasion: Catch-all patterns. If a variable in a pattern match is not used on the right hand side, we can't just drop it, as this would make it impossible for Idris to know which of several arguments we were planning to drop, but we can use an underscore as a placeholder instead:

isRight : Either a b -> Bool
isRight (Right _) = True
isRight _         = False

But when we look at the type signature of isRight, we will note that type parameters a and b are also only used once, and are therefore of no importance. Let's get rid of them:

isRight' : Either _ _ -> Bool
isRight' (Right _) = True
isRight' _         = False

In the detailed type signature of zipEitherWith, it should be obvious for Idris that the implicit arguments are of type Type. After all, all of them are later on applied to the Either type constructor, which is of type Type -> Type -> Type. Let's get rid of them:

zipEitherWith'' :  {0 a : _}
                -> {0 b : _}
                -> {0 c : _}
                -> {0 e : _}
                -> (a -> b -> c)
                -> Either e a
                -> Either e b
                -> Either e c

Consider the following contrived example:

foo : Integer -> String
foo n = show (the (Either String Integer) (Right n))

Since we wrap an Integer in a Right, it is obvious that the second argument in Either String Integer is Integer. Only the String argument can't be inferred by Idris. Even better, the Either itself is obvious! Let's get rid of the unnecessary noise:

foo' : Integer -> String
foo' n = show (the (_ String _) (Right n))

Please note, that using underscores as in foo' is not always desirable, as it can quite drastically obfuscate the written code. Always use a syntactic convenience to make code more readable, and not to show people how clever you are.

Programming with Holes

module Tutorial.Functions2.Holes

%default total

Solved all the exercises so far? Got angry at the type checker for always complaining and never being really helpful? It's time to change that. Idris comes with several highly useful interactive editing features. Sometimes, the compiler is able to implement complete functions for us (if the types are specific enough). Even if that's not possible, there's an incredibly useful and important feature, which can help us when the types are getting too complicated: Holes. Holes are variables, the names of which are prefixed with a question mark. We can use them as placeholders whenever we plan to implement a piece of functionality at a later time. In addition, their types and the types and quantities of all other variables in scope can be inspected at the REPL (or in your editor, if you set up the necessary plugin). Let's see them holes in action.
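
Before we tackle a larger example, here is a minimal sketch of what a hole looks like. Function safeDiv and the hole's name safeDiv_rhs are made up for this illustration:

```idris
-- Hypothetical function with an unfinished clause.
safeDiv : Integer -> Integer -> Maybe Integer
safeDiv n 0 = Nothing
-- The hole stands in for code we have not written yet.
-- The file still type checks with the hole in place.
safeDiv n d = ?safeDiv_rhs
```

After loading this into a REPL session, entering :t safeDiv_rhs should list the variables n and d together with the type expected on the right hand side.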

Remember the traverseList example from an Exercise earlier in this section? If this was your first encounter with applicative list traversals, this might have been a nasty bit of work. Well, let's just make it a wee bit harder still. We'd like to implement the same piece of functionality for functions returning Either e, where e is a type with a Semigroup implementation, and we'd like to accumulate the values in all Lefts we meet along the way.

Here's the type of the function:

traverseEither :  Semigroup e
               => (a -> Either e b)
               -> List a
               -> Either e (List b)

As an optional exercise, you may wish to attempt this yourself first. You've seen everything you need. Consider:

  • semigroups have an append operation <+> : e -> e -> e that combines two values into one
  • the empty list will succeed vacuously
  • if any of the function applications fail, you'll return a consolidation of all of the errors e
  • if all of the function applications succeed, you'll return a list with all of the results b
  • if you get it to compile, there are some test functions and variables at the bottom of this section for you to confirm that it's working as intended

Now, in order to follow along, you might want to start your own Idris source file, load it into a REPL session and adjust the code as described here. The first thing we'll do, is write a skeleton implementation with a hole on the right hand side:

traverseEither fun as = ?impl

When you now go to the REPL and reload the file using command :r, you can enter :m to list all the metavariables:

Tutorial.Functions2> :m
1 hole:
  Tutorial.Functions2.impl : Either e (List b)

Next, we'd like to display the hole's type (including all variables in the surrounding context plus their types):

Tutorial.Functions2> :t impl
 0 b : Type
 0 a : Type
 0 e : Type
   as : List a
   fun : a -> Either e b
------------------------------
impl : Either e (List b)

So, we have some erased type parameters (a, b, and e), a value of type List a called as, and a function from a to Either e b called fun. Our goal is to come up with a value of type Either e (List b).

We could just return a Right [], but that only makes sense if our input list is indeed the empty list. We therefore should start with a pattern match on the list:

traverseEither fun []        = ?impl_0
traverseEither fun (x :: xs) = ?impl_1

The result is two holes, which must be given distinct names. When inspecting impl_0, we get the following result:

Tutorial.Functions2> :t impl_0
 0 b : Type
 0 a : Type
 0 e : Type
   fun : a -> Either e b
------------------------------
impl_0 : Either e (List b)

Now, this is an interesting situation. We are supposed to come up with a value of type Either e (List b) with nothing to work with. We know nothing about a, so we can't provide an argument with which to invoke fun. Likewise, we know nothing about e or b either, so we can't produce any values of these either. The only option we have is to replace impl_0 with an empty list wrapped in a Right:

traverseEither fun []        = Right []

The non-empty case is of course slightly more involved. Here's the context of ?impl_1:

Tutorial.Functions2> :t impl_1
 0 b : Type
 0 a : Type
 0 e : Type
   x : a
   xs : List a
   fun : a -> Either e b
------------------------------
impl_1 : Either e (List b)

Since x is of type a, we can either use it as an argument to fun or drop and ignore it. xs, on the other hand, is the remainder of the list of type List a. We could again drop it or process it further by invoking traverseEither recursively. Since the goal is to try and convert all values, we should drop neither. Since in case of two Lefts we are supposed to accumulate the values, we eventually need to run both computations anyway (invoking fun, and recursively calling traverseEither). We therefore can do both at the same time and analyze the results in a single pattern match by wrapping both in a Pair:

traverseEither fun (x :: xs) =
  case (fun x, traverseEither fun xs) of
   p => ?impl_2

Once again, we inspect the context:

Tutorial.Functions2> :t impl_2
 0 b : Type
 0 a : Type
 0 e : Type
   xs : List a
   fun : a -> Either e b
   x : a
   p : (Either e b, Either e (List b))
------------------------------
impl_2 : Either e (List b)

We'll definitely need to pattern match on pair p next to figure out which of the two computations succeeded:

traverseEither fun (x :: xs) =
  case (fun x, traverseEither fun xs) of
    (Left y, Left z)   => ?impl_6
    (Left y, Right _)  => ?impl_7
    (Right _, Left z)  => ?impl_8
    (Right y, Right z) => ?impl_9

At this point we might have forgotten what we actually wanted to do (at least to me, this happens annoyingly often), so we'll just quickly check what our goal is:

Tutorial.Functions2> :t impl_6
 0 b : Type
 0 a : Type
 0 e : Type
   xs : List a
   fun : a -> Either e b
   x : a
   y : e
   z : e
------------------------------
impl_6 : Either e (List b)

So, we are still looking for a value of type Either e (List b), and we have two values of type e in scope. According to the spec we want to accumulate these using e's Semigroup implementation. We can proceed for the other cases in a similar manner, remembering that we should return a Right, if and only if all conversions were successful:

traverseEither fun (x :: xs) =
  case (fun x, traverseEither fun xs) of
    (Left y, Left z)   => Left (y <+> z)
    (Left y, Right _)  => Left y
    (Right _, Left z)  => Left z
    (Right y, Right z) => Right (y :: z)

To reap the fruits of our labour, let's show off with a small example:

data Nucleobase = Adenine | Cytosine | Guanine | Thymine

readNucleobase : Char -> Either (List String) Nucleobase
readNucleobase 'A' = Right Adenine
readNucleobase 'C' = Right Cytosine
readNucleobase 'G' = Right Guanine
readNucleobase 'T' = Right Thymine
readNucleobase c   = Left ["Unknown nucleobase: " ++ show c]

DNA : Type
DNA = List Nucleobase

readDNA : String -> Either (List String) DNA
readDNA = traverseEither readNucleobase . unpack

Let's try this at the REPL:

Tutorial.Functions2> readDNA "CGTTA"
Right [Cytosine, Guanine, Thymine, Thymine, Adenine]
Tutorial.Functions2> readDNA "CGFTAQ"
Left ["Unknown nucleobase: 'F'", "Unknown nucleobase: 'Q'"]
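Nothing in traverseEither is specific to nucleobases: it works for any function returning an Either e, as long as e comes with a Semigroup implementation. The following sketch repeats the definition from above, so that it can be loaded on its own, and pairs it with a validator called positive. That validator, its error messages, and the two test values are made up for illustration and are not part of the tutorial's sources:

```idris
%default total

traverseEither :  Semigroup e
               => (a -> Either e b)
               -> List a
               -> Either e (List b)
traverseEither fun []        = Right []
traverseEither fun (x :: xs) =
  case (fun x, traverseEither fun xs) of
    (Left y, Left z)   => Left (y <+> z)
    (Left y, Right _)  => Left y
    (Right _, Left z)  => Left z
    (Right y, Right z) => Right (y :: z)

-- Hypothetical validator (an assumption for this sketch): accepts only
-- strictly positive integers and reports everything else as an error.
positive : Integer -> Either (List String) Integer
positive n = if n > 0 then Right n else Left ["Not positive: " ++ show n]

-- All inputs are valid, so all results end up in a single Right.
allGood : Either (List String) (List Integer)
allGood = traverseEither positive [1, 2, 3]

-- Two inputs are invalid, so both error messages are accumulated.
twoBad : Either (List String) (List Integer)
twoBad = traverseEither positive [1, 0, 3, -4]
```

Evaluating allGood at the REPL should give Right [1, 2, 3], while twoBad should give a Left holding one message for 0 and one for -4, in the order in which the values appear in the list.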

Interactive Editing

There are plugins available for several editors and programming environments, which facilitate interacting with the Idris compiler when implementing your functions. One editor, which is well supported in the Idris community, is Neovim. Since I am a Neovim user myself, I added some examples of what's possible to the appendix. Now would be a good time to start using the utilities discussed there.

If you use a different editor, probably with less support for the Idris programming language, you should at the very least have a REPL session open all the time, where the source file you are currently working on is loaded. This allows you to introduce new metavariables and inspect their types and context as you develop your code.

Conclusion

We again covered a lot of ground in this section. I can't stress enough that you should get yourselves accustomed to programming with holes and let the type checker help you figure out what to do next.

  • When in need of local utility functions, consider defining them as local definitions in a where block.

  • Use let expressions to define and reuse local variables.

  • Function arguments can be given a name, which can serve as documentation, can be used to pass arguments in any order, and is used to refer to them in dependent types.

  • Implicit arguments are wrapped in curly braces. The compiler is supposed to infer them from the context. If that's not possible, they can be passed explicitly as other named arguments.

  • Whenever possible, Idris adds implicit erased arguments for all type parameters automatically.

  • Quantities allow us to track how often a function argument is used. Quantity 0 means, the argument is erased at runtime.

  • Use holes as placeholders for pieces of code you plan to fill in at a later time. Use the REPL (or your editor) to inspect the types of holes together with the names, types, and quantities of all variables in their context.

What's next

In the next chapter we'll start using dependent types to help us write provably correct code. Having a good understanding of how to read Idris' type signatures will be of paramount importance there. Whenever you feel lost, add one or more holes and inspect their context to decide what to do next.

Dependent Types

The ability to calculate types from values, pass them as arguments to functions, and return them as results from functions - in short, being a dependently typed language - is one of the most distinguishing features of Idris. Many of the more advanced type level extensions of languages like Haskell (and quite a bit more) can be treated in one fell swoop with dependent types.

module Tutorial.Dependent

%default total

Consider the following functions:

bogusMapList : (a -> b) -> List a -> List b
bogusMapList _ _ = []

bogusZipList : (a -> b -> c) -> List a -> List b -> List c
bogusZipList _ _ _ = []

The implementations type check, and still, they are obviously not what users of our library would expect. In the first example, we'd expect the implementation to apply the function argument to all values stored in the list, without dropping any of them or changing their order. The second is trickier: The two list arguments might be of different length. What are we supposed to do when that's the case? Return a list of the same length as the smaller of the two? Return an empty list? Or shouldn't we in most use cases expect the two lists to be of the same length? How could we even describe such a precondition?

Length-Indexed Lists

module Tutorial.Dependent.LengthIndexedLists

%default total

The answer to the issues described above is of course: Dependent types. Before we proceed to our example, first consider how Idris recursively defines the natural numbers (here affixed with apostrophes to avoid introducing a conflict with the actual definition of Nat, which you can find here for reference):

data Nat' : Type where
  Z' : Nat'
  S' : Nat' -> Nat'

In this scheme, 0 is represented by Z, 1 is represented by S Z, 2 is represented by S (S Z), and so on. Idris does this automatically so if you enter Z or S Z into the REPL, it will return 0 or 1. Note that the only function inherently available to act on a value of type Nat is our data constructor S, which represents the successor function, i.e. adding 1.

Also note that in Idris, every Nat can be represented as either a Z or an S n where n is another Nat. Much as every List a can be represented as either a Nil or an x :: xs (where x is an a and xs is a List a), this informs our pattern matching when solving problems.
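To see how the two shapes of a Nat guide a pattern match, here is a minimal sketch of a function defined by recursion on its argument. The name double is just an example chosen for this illustration. It uses the actual Nat from the Prelude with constructors Z and S:

```idris
%default total

-- Hypothetical example function: doubles a natural number.
double : Nat -> Nat
-- Zero has no predecessor, so there is nothing left to do.
double Z     = Z
-- Every successor in the input contributes two successors to the output.
double (S k) = S (S (double k))
```

There is one clause per constructor, and the recursive call is made on k, which is structurally smaller than S k. This is what allows the totality checker to accept the definition.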

Now we can consider the textbook introductory example of dependent types, the vector, which is a list indexed by its length:

public export
data Vect : (len : Nat) -> (a : Type) -> Type where
  Nil  : Vect 0 a
  (::) : (x : a) -> (xs : Vect n a) -> Vect (S n) a

Before we move on, please compare this with the implementation of Seq in the section about algebraic data types. The constructors are exactly the same: Nil and (::). But there is an important difference: Vect, unlike Seq or List, is not a function from Type to Type, it is a function from Nat to Type to Type. Go ahead! Open the REPL and verify this! The Nat argument (also called an index) represents the length of the vector here. Nil has type Vect 0 a: A vector of length zero. Cons has type a -> Vect n a -> Vect (S n) a: It is exactly one element longer (S n) than its second argument, which is of length n.

Let's experiment with this idea to gain a better understanding. There is only one way to come up with a vector of length zero:

ex1 : Vect 0 Integer
ex1 = Nil

The following, on the other hand, leads to a type error (a pretty complicated one, actually):

failing "Mismatch between: S ?n and 0."
  ex2 : Vect 0 Integer
  ex2 = [12]

The problem: [12] gets desugared to 12 :: Nil, but this has the wrong type! Since Nil has type Vect 0 Integer here, 12 :: Nil has type Vect (S 0) Integer, which is identical to Vect 1 Integer. Idris verifies, at compile time, that our vector is of the correct length!

ex3 : Vect 1 Integer
ex3 = [12]

So, we found a way to encode the length of a list-like data structure in its type, and it is a type error if the number of elements in a vector does not agree with the length given in its type. We will shortly see several use cases, where this additional piece of information allows us to be more precise in the types and rule out additional programming mistakes. But first, we need to quickly clarify some terminology.

Type Indices versus Type Parameters

Vect is not only a generic type, parameterized over the type of elements it holds, it is actually a family of types, each of them associated with a natural number representing its length. We also say, the type family Vect is indexed by its length.

The difference between a type parameter and an index is, that the latter can and does change across data constructors, while the former is the same for all data constructors. Or, put differently, we can learn about the value of an index by pattern matching on a value of the type family, while this is not possible with a type parameter.

Let's demonstrate this with a contrived example:

data Indexed : Nat -> Type where
  I0 : Indexed 0
  I3 : Indexed 3
  I4 : String -> Indexed 4

Here, Indexed is indexed over its Nat argument, as the value of the index changes across constructors (I chose some arbitrary value for each constructor), and we can learn about these values by pattern matching on Indexed values. We can use this, for instance, to create a Vect of the same length as the index of Indexed:

fromIndexed : Indexed n -> a -> Vect n a

Go ahead, and try implementing this yourself! Work with holes, pattern match on the Indexed argument, and learn about the expected output type in each case by inspecting the holes and their context.

Here is my implementation:

fromIndexed I0     va = []
fromIndexed I3     va = [va, va, va]
fromIndexed (I4 _) va = [va, va, va, va]

As you can see, by pattern matching on the value of the Indexed n argument, we learned about the value of the n index itself, which was necessary to return a Vect of the correct length.
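The same idea can be shown without Vect. In the following sketch, the helper indexValue (a name invented for this illustration) recovers the index as a regular Nat by pattern matching, while the attempt in the failing block tries to use the erased n directly and is therefore expected to be rejected by Idris:

```idris
%default total

data Indexed : Nat -> Type where
  I0 : Indexed 0
  I3 : Indexed 3
  I4 : String -> Indexed 4

-- Hypothetical helper: each constructor fixes the index, so a pattern
-- match tells us which natural number to return.
indexValue : Indexed n -> Nat
indexValue I0     = 0
indexValue I3     = 3
indexValue (I4 _) = 4

failing
  -- n is an implicit argument of quantity 0. It is erased at runtime
  -- and must not be used as the result of the function.
  cheat : Indexed n -> Nat
  cheat _ = n
```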

Length-Preserving map

Function bogusMapList behaved unexpectedly, because it always returned the empty list. With Vect, we need to be true to the types here. If we map over a Vect, the argument and output type contain a length index, and these length indices will tell us exactly, if and how the lengths of our vectors are modified:

map3_1 : (a -> b) -> Vect 3 a -> Vect 1 b
map3_1 f [_,y,_] = [f y]

map5_0 : (a -> b) -> Vect 5 a -> Vect 0 b
map5_0 f _ = []

map5_10 : (a -> b) -> Vect 5 a -> Vect 10 b
map5_10 f [u,v,w,x,y] = [f u, f u, f v, f v, f w, f w, f x, f x, f y, f y]

While these examples are quite interesting, they are not really useful, are they? That's because they are too specialized. We'd like to have a general function for mapping vectors of any length. Instead of using concrete lengths in type signatures, we can also use variables as already seen in the definition of Vect. This allows us to declare the general case:

mapVect' : (a -> b) -> Vect n a -> Vect n b

This type describes a length-preserving map. It is actually more instructive (but not necessary) to include the implicit arguments as well:

mapVect : {0 a,b : _} -> {0 n : Nat} -> (a -> b) -> Vect n a -> Vect n b

We ignore the two type parameters a, and b, as these just describe a generic function (note, however, that we can group arguments of the same type and quantity in a single pair of curly braces; this is optional, but it sometimes helps making type signatures a bit shorter). The implicit argument of type Nat, however, tells us that the input and output Vect are of the same length. It is a type error to not uphold this contract. When implementing mapVect, it is very instructive to follow along and use some holes. In order to get any information about the length of the Vect argument, we need to pattern match on it:

mapVect _ Nil       = ?impl_0
mapVect f (x :: xs) = ?impl_1

At the REPL, we learn the following:

Tutorial.Dependent> :t impl_0
 0 a : Type
 0 b : Type
 0 n : Nat
------------------------------
impl_0 : Vect 0 b


Tutorial.Dependent> :t impl_1
 0 a : Type
 0 b : Type
   x : a
   xs : Vect n a
   f : a -> b
 0 n : Nat
------------------------------
impl_1 : Vect (S n) b

The first hole, impl_0 is of type Vect 0 b. There is only one such value, as discussed above:

mapVect _ Nil       = Nil

The second case is again more interesting. We note, that xs is of type Vect n a, for an arbitrary length n (given as an erased argument), while the result is of type Vect (S n) b. So, the result has to be one element longer than xs. Luckily, we already have a value of type a (bound to variable x) and a function from a to b (bound to variable f), so we can apply f to x and prepend the result to a yet unknown remainder:

mapVect f (x :: xs) = f x :: ?rest

Let's inspect the new hole at the REPL:

Tutorial.Dependent> :t rest
 0 a : Type
 0 b : Type
   x : a
   xs : Vect n a
   f : a -> b
 0 n : Nat
------------------------------
rest : Vect n b

Now, we have a Vect n a and need a Vect n b, without knowing anything else about n. We could learn more about n by pattern matching further on xs, but this would quickly lead us down a rabbit hole, since after such a pattern match, we'd end up with another Nil case and another cons case, with a new tail of unknown length. Instead, we can invoke mapVect recursively to convert the remainder (xs) to a Vect n b. The type checker guarantees, that the lengths of xs and mapVect f xs are the same, so the whole expression type checks and we are done:

mapVect f (x :: xs) = f x :: mapVect f xs
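Here is a small, self-contained sketch of the finished function in use. The definitions of Vect and mapVect are repeated from above. The values numbers, shown, and tooShort are made-up names for this example:

```idris
%default total

data Vect : (len : Nat) -> (a : Type) -> Type where
  Nil  : Vect 0 a
  (::) : (x : a) -> (xs : Vect n a) -> Vect (S n) a

mapVect : (a -> b) -> Vect n a -> Vect n b
mapVect _ Nil       = Nil
mapVect f (x :: xs) = f x :: mapVect f xs

numbers : Vect 3 Integer
numbers = [1, 2, 3]

-- The element type changes from Integer to String,
-- but the length index stays the same.
shown : Vect 3 String
shown = mapVect show numbers

failing
  -- Claiming a different length for the result is a type error.
  tooShort : Vect 2 String
  tooShort = mapVect show numbers
```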

Zipping Vectors

Let us now have a look at bogusZipList: We'd like to pairwise merge two lists holding elements of (possibly) distinct types through a given binary function. As discussed above, the most reasonable thing to do is to expect the two lists as well as the result to be of equal length. With Vect, this can be expressed and implemented as follows:

export
zipWith : (a -> b -> c) -> Vect n a -> Vect n b -> Vect n c
zipWith f []        []         = Nil
zipWith f (x :: xs) (y :: ys)  = f x y :: zipWith f xs ys

Now, here is an interesting thing: The totality checker (activated throughout this source file due to the initial %default total pragma) accepts the above implementation as being total, although it is missing two more cases. This works, because Idris can figure out on its own, that the other two cases are impossible. From the pattern match on the first Vect argument, Idris learns whether n is zero or the successor of another natural number. But from this it can derive, whether the second vector, being also of length n, is a Nil or a cons. Still, it can be informative to add the impossible cases explicitly. We can use keyword impossible to do so:

zipWith _ [] (_ :: _) impossible
zipWith _ (_ :: _) [] impossible

It is - of course - a type error to annotate a case in a pattern match with impossible, if Idris cannot verify that this case is indeed impossible. We will learn in a later section what to do, when we think we are right about an impossible case and Idris is not.

Let's give zipWith a spin at the REPL:

Tutorial.Dependent> zipWith (*) [1,2,3] [10,20,30]
[10, 40, 90]
Tutorial.Dependent> zipWith (\x,y => x ++ ": " ++ show y) ["The answer"] [42]
["The answer: 42"]
Tutorial.Dependent> zipWith (*) [1,2,3] [10,20]
... Nasty type error ...
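The same experiment can be written down in a source file. The sketch below repeats Vect and zipWith, and adds a helper zipPairs plus some example values. These additional names are assumptions made for this illustration. The failing block documents that vectors of different lengths are rejected at compile time:

```idris
%default total

data Vect : (len : Nat) -> (a : Type) -> Type where
  Nil  : Vect 0 a
  (::) : (x : a) -> (xs : Vect n a) -> Vect (S n) a

zipWith : (a -> b -> c) -> Vect n a -> Vect n b -> Vect n c
zipWith f []        []         = Nil
zipWith f (x :: xs) (y :: ys)  = f x y :: zipWith f xs ys

-- Hypothetical helper: pairs up the values at identical positions.
zipPairs : Vect n a -> Vect n b -> Vect n (a, b)
zipPairs xs ys = zipWith MkPair xs ys

names : Vect 2 String
names = ["x", "y"]

coords : Vect 2 Integer
coords = [3, 4]

labelled : Vect 2 (String, Integer)
labelled = zipPairs names coords

failing
  -- The second vector holds three values, the first only two.
  mismatch : Vect 2 (String, Integer)
  mismatch = zipPairs names [3, 4, 5]
```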

Simplifying Type Errors

It is amazing to experience the amount of work Idris can do for us and the amount of things it can infer on its own when things go well. When things don't go well, however, the error messages we get from Idris can be quite long and hard to understand, especially for programmers new to the language. For instance, the error message in the last REPL example above was pretty long, listing different things Idris tried to do together with the reason why each of them failed.

If this happens, it often means that a combination of a type error and an ambiguity resulting from overloaded function names is at work. In the example above, the two vectors are of distinct length, which leads to a type error if we interpret the list literals as vectors. However, list literals are overloaded to work with all data types with constructors Nil and (::), so Idris will now try other data constructors than those of Vect (the ones of List and Stream from the Prelude in this case), each of which will again fail with a type error since zipWith expects arguments of type Vect, and neither List nor Stream will work.

If this happens, prefixing overloaded function names with their namespaces can often simplify things, as Idris no longer needs to disambiguate these functions:

Tutorial.Dependent> zipWith (*) (Dependent.(::) 1 Dependent.Nil) Dependent.Nil
Error: When unifying:
    Vect 0 ?c
and:
    Vect 1 ?c
Mismatch between: 0 and 1.

Here, the message is much clearer: Idris can't unify the lengths of the two vectors. Unification means that Idris tries, at compile time, to convert two expressions to the same normal form. If this succeeds, the two expressions are considered to be equivalent; if it doesn't, Idris fails with a unification error.

As an alternative to prefixing overloaded functions with their namespace, we can use function the to help with type inference:

Tutorial.Dependent> zipWith (*) (the (Vect 3 _) [1,2,3]) (the (Vect 2 _) [10,20])
Error: When unifying:
    Vect 2 ?c
and:
    Vect 3 ?c
Mismatch between: 0 and 1.

It is interesting to note, that the error above is not "Mismatch between: 2 and 3" but "Mismatch between: 0 and 1" instead. Here's what's going on: Idris tries to unify integer literals 2 and 3, which are first converted to the corresponding Nat values S (S Z) and S (S (S Z)), respectively. The two patterns match until we arrive at Z vs S Z, corresponding to values 0 and 1, which is the discrepancy reported in the error message.
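The peeling-off of matching S constructors can be imitated with an ordinary function. The following is only a toy model of that single step, not how Idris actually implements unification. The name firstMismatch is invented for this sketch:

```idris
%default total

-- Toy model (an assumption for illustration): strips matching S
-- constructors from both numbers and returns the first pair of
-- values that no longer have the same shape.
firstMismatch : Nat -> Nat -> Maybe (Nat, Nat)
firstMismatch Z     Z     = Nothing
firstMismatch (S k) (S j) = firstMismatch k j
firstMismatch m     n     = Just (m, n)
```

Evaluating firstMismatch 2 3 proceeds via firstMismatch 1 2 to firstMismatch 0 1, at which point the shapes differ and the result is Just (0, 1): the same two numbers that appear in the error message.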

Creating Vectors

So far, we were able to learn something about the lengths of vectors by pattern matching on them. In the Nil case, it was clear that the length is 0, while in the cons case the length was the successor of another natural number. This is not possible when we want to create a new vector:

failing "Mismatch between: S ?n and n."
  fill : a -> Vect n a

You will have a hard time implementing fill. The following, for instance, leads to a type error:

  fill va = [va,va]

The problem is, that the callers of our function decide about the length of the resulting vector. The full type of fill is actually the following:

fill' : {0 a : Type} -> {0 n : Nat} -> a -> Vect n a

You can read this type as follows: For every type a and for every natural number n (about which I know nothing at runtime, since it has quantity zero), given a value of type a, I'll give you a vector holding exactly n elements of type a. This is like saying: "Think about a natural number n, and I'll give you n apples without you telling me the value of n". Idris is powerful, but it is not a clairvoyant.

In order to implement fill, we need to know what n actually is: We need to pass n as an explicit, unerased argument, which will allow us to pattern match on it and decide - based on this pattern match - which constructors of Vect to use:

export
replicate : (n : Nat) -> a -> Vect n a

Now, replicate has a dependent function type: The output type depends on the value of one of the arguments. It is straightforward to implement replicate by pattern matching on n:

replicate 0     _  = []
replicate (S k) va = va :: replicate k va

This is a pattern that comes up often when working with indexed types: We can learn about the values of the indices by pattern matching on the values of the type family. However, in order to return a value of the type family from a function, we need to either know the values of the indices at compile time (see constants ex1 or ex3, for instance), or we need to have access to the values of the indices at runtime, in which case we can pattern match on them and learn from this, which constructor(s) of the type family to use.
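The following sketch contrasts the two situations. It repeats Vect and replicate from above. The values threeZeros and zeros are names made up for this example. In the failing block, the length is passed as an erased implicit argument, which Idris is expected to reject because an erased value can't be used at runtime:

```idris
%default total

data Vect : (len : Nat) -> (a : Type) -> Type where
  Nil  : Vect 0 a
  (::) : (x : a) -> (xs : Vect n a) -> Vect (S n) a

replicate : (n : Nat) -> a -> Vect n a
replicate 0     _  = []
replicate (S k) va = va :: replicate k va

-- The length is a literal known at compile time.
threeZeros : Vect 3 Integer
threeZeros = replicate 3 0

-- The length is only known at runtime, but it is passed as an
-- explicit, unerased argument, so replicate can pattern match on it.
zeros : (n : Nat) -> Vect n Integer
zeros n = replicate n 0

failing
  -- Here, n has quantity 0 and must not be passed to replicate.
  fill : {0 n : Nat} -> a -> Vect n a
  fill va = replicate n va
```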

Exercises part 1

  1. Implement a function len : List a -> Nat for calculating the length of a List. For example, len [1, 1, 1] produces 3.

  2. Implement function head for non-empty vectors:

    head : Vect (S n) a -> a
    

    Note, how we can describe non-emptiness by using a pattern in the length of Vect. This rules out the Nil case, and we can return a value of type a, without having to wrap it in a Maybe! Make sure to add an impossible clause for the Nil case (although this is not strictly necessary here).

  3. Using head as a reference, declare and implement function tail for non-empty vectors. The types should reflect that the output is exactly one element shorter than the input.

  4. Implement zipWith3. If possible, try doing so without looking at the implementation of zipWith:

    zipWith3 : (a -> b -> c -> d) -> Vect n a -> Vect n b -> Vect n c -> Vect n d
    
  5. Declare and implement a function foldSemi for accumulating the values stored in a List through Semigroup's append operator ((<+>)). (Make sure to only use a Semigroup constraint, as opposed to a Monoid constraint.)

  6. Do the same as in Exercise 5, but for non-empty vectors. How does a vector's non-emptiness affect the output type?

  7. Given an initial value of type a and a function a -> a, we'd like to generate Vects of as, the first value of which is a, the second value being f a, the third being f (f a) and so on.

    For instance, if a is 1 and f is (* 2), we'd like to get results similar to the following: [1,2,4,8,16,...].

    Declare and implement function iterate, which should encapsulate this behavior. Get some inspiration from replicate if you don't know where to start.

  8. Given an initial value of a state type s and a function fun : s -> (s,a), we'd like to generate Vects of as. Declare and implement function generate, which should encapsulate this behavior. Make sure to use the updated state in every new invocation of fun.

    Here's an example how this can be used to generate the first n Fibonacci numbers:

    generate 10 (\(x,y) => let z = x + y in ((y,z),z)) (0,1)
    [1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
    
  9. Implement function fromList, which converts a list of values to a Vect of the same length. Use holes if you get stuck:

    fromList : (as : List a) -> Vect (length as) a
    

    Note how, in the type of fromList, we can calculate the length of the resulting vector by passing the list argument to function length.

  10. Consider the following declarations:

    maybeSize : Maybe a -> Nat

    fromMaybe : (m : Maybe a) -> Vect (maybeSize m) a

    Choose a reasonable implementation for maybeSize and implement fromMaybe afterwards.

Fin: Safe Indexing into Vectors

module Tutorial.Dependent.Fin

import Tutorial.Dependent.LengthIndexedLists

%default total

Consider function indexList, which tries to extract a value from a List at the given position:

indexList : (pos : Nat) -> List a -> Maybe a
indexList _     []        = Nothing
indexList 0     (x :: _)  = Just x
indexList (S k) (_ :: xs) = indexList k xs

Now, here is a thing to consider when writing functions like indexList: Do we want to express the possibility of failure in the output type, or do we want to restrict the accepted arguments, so the function can no longer fail? These are important design decisions, especially in larger applications. Returning a Maybe or Either from a function forces client code to eventually deal with the Nothing or Left case, and until this happens, all intermediary results will carry the Maybe or Either stain, which will make it more cumbersome to run calculations with these intermediary results. On the other hand, restricting the values accepted as input will complicate the argument types and will put the burden of input validation on our functions' callers (although at compile time we can get help from Idris, as we will see when we talk about auto implicits), while keeping the output pure and clean.

Languages without dependent types (like Haskell), can often only take the route described above: To wrap the result in a Maybe or Either. However, in Idris we can often refine the input types to restrict the set of accepted values, thus ruling out the possibility of failure.

Assume, as an example, we'd like to extract a value from a Vect n a at (zero-based) index k. Surely, this can succeed if and only if k is a natural number strictly smaller than the length n of the vector. Luckily, we can express this precondition in an indexed type:

data Fin : (n : Nat) -> Type where
  FZ : {0 n : Nat} -> Fin (S n)
  FS : (k : Fin n) -> Fin (S n)

Fin n is the type of natural numbers strictly smaller than n. It is defined inductively: FZ corresponds to natural number zero, which, as can be seen in its type, is strictly smaller than S n for any natural number n. FS is the inductive case: If k is strictly smaller than n (k being of type Fin n), then FS k is strictly smaller than S n.

Let's come up with some values of type Fin:

fin0_5 : Fin 5
fin0_5 = FZ

fin0_7 : Fin 7
fin0_7 = FZ

fin1_3 : Fin 3
fin1_3 = FS FZ

fin4_5 : Fin 5
fin4_5 = FS (FS (FS (FS FZ)))

Note, that there is no value of type Fin 0. We will learn in a later section, how to express "there is no value of type x" in a type.
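We can let Idris confirm these bounds. The sketch below repeats the definition of Fin and adds three made-up examples: the largest value of type Fin 3, an attempt to go one step further, and an attempt to come up with a value of type Fin 0. The last two are wrapped in failing blocks, since they are expected to be rejected:

```idris
%default total

data Fin : (n : Nat) -> Type where
  FZ : {0 n : Nat} -> Fin (S n)
  FS : (k : Fin n) -> Fin (S n)

-- The largest value of type Fin 3 corresponds to natural number 2.
fin2_3 : Fin 3
fin2_3 = FS (FS FZ)

failing
  -- A third application of FS would require the innermost FZ
  -- to be of type Fin 0.
  fin3_3 : Fin 3
  fin3_3 = FS (FS (FS FZ))

failing
  -- Both constructors return a Fin (S n), never a Fin 0.
  fin0_0 : Fin 0
  fin0_0 = FZ
```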

Let us now check, whether we can use Fin to safely index into a Vect:

index : Fin n -> Vect n a -> a

Before you continue, try to implement index yourself, making use of holes if you get stuck.

index FZ     (x :: _) = x
index (FS k) (_ :: xs) = index k xs

Note, how there is no Nil case and the totality checker is still happy. That's because Nil is of type Vect 0 a, but there is no value of type Fin 0! We can verify this by adding the missing impossible clauses:

index FZ     Nil impossible
index (FS _) Nil impossible
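Here is a short, self-contained sketch of index in action. Vect, Fin, and index are repeated from above. The vector letters and the two lookups are made-up examples:

```idris
%default total

data Vect : (len : Nat) -> (a : Type) -> Type where
  Nil  : Vect 0 a
  (::) : (x : a) -> (xs : Vect n a) -> Vect (S n) a

data Fin : (n : Nat) -> Type where
  FZ : {0 n : Nat} -> Fin (S n)
  FS : (k : Fin n) -> Fin (S n)

index : Fin n -> Vect n a -> a
index FZ     (x :: _) = x
index (FS k) (_ :: xs) = index k xs

letters : Vect 3 Char
letters = ['a', 'b', 'c']

-- FS FZ corresponds to position 1 (zero-based).
second : Char
second = index (FS FZ) letters

failing
  -- Position 3 is out of bounds for a vector of length 3:
  -- FS (FS (FS FZ)) is not a value of type Fin 3.
  outOfBounds : Char
  outOfBounds = index (FS (FS (FS FZ))) letters
```

Note, how second is a plain Char and not a Maybe Char: The out-of-bounds case has been ruled out by the type of the index.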

Exercises part 2

  1. Implement function update, which, given a function of type a -> a, updates the value in a Vect n a at position k < n.

  2. Implement function insert, which inserts a value of type a at position k <= n in a Vect n a. Note, that k is the index of the freshly inserted value, so that the following holds:

    index k (insert k v vs) = v
    
  3. Implement function delete, which deletes a value from a vector at the given index.

    This is trickier than Exercises 1 and 2, as we have to properly encode in the types that the vector is getting one element shorter.

  4. We can use Fin to implement safe indexing into Lists as well. Try to come up with a type and implementation for safeIndexList.

    Note: If you don't know how to start, look at the type of fromList for some inspiration. You might also need to give the arguments in a different order than for index.

  5. Implement function finToNat, which converts a Fin n to the corresponding natural number, and use this to declare and implement function take for splitting off the first k elements of a Vect n a with k <= n.

  6. Implement function minus for subtracting a value k from a natural number n with k <= n.

  7. Use minus from Exercise 6 to declare and implement function drop, for dropping the first k values from a Vect n a, with k <= n.

  8. Implement function splitAt for splitting a Vect n a at position k <= n, returning the prefix and suffix of the vector wrapped in a pair.

    Hint: Use take and drop in your implementation.

Hint: Since Fin n consists of the values strictly smaller than n, Fin (S n) consists of the values smaller than or equal to n.

Note: Functions take, drop, and splitAt, while correct and provably total, are rather cumbersome to type. There is an alternative way to declare their types, as we will see in the next section.

Compile-Time Computations

module Tutorial.Dependent.Comptime

import Tutorial.Dependent.LengthIndexedLists

%default total

In the last section - especially in some of the exercises - we increasingly used compile-time computations to describe the types of our functions and values. This is a very powerful concept, as it allows us to compute output types from input types. Here's an example:

It is possible to concatenate two Lists with the (++) operator. Surely, this should also be possible for Vect. But Vect is indexed by its length, so we have to reflect in the types exactly how the lengths of the inputs affect the length of the output. Here's how to do this:

(++) : Vect m a -> Vect n a -> Vect (m + n) a
(++) []        ys = ys
(++) (x :: xs) ys = x :: (xs ++ ys)

Note, how we keep track of the lengths at the type-level, again ruling out certain common programming errors like inadvertently dropping some values.
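As a small usage sketch, here is the same idea with Vect and (++) from Data.Vect in base (assuming that module is available, as noted in the conclusion of this chapter). The names joined and tooShort are made up for this illustration. The failing block is expected to be rejected, because 2 + 3 reduces to 5 and not to 4:

```idris
import Data.Vect

-- (++) from Data.Vect has the same type as shown above:
-- the result length is the sum of the input lengths.
joined : Vect 5 Nat
joined = [1, 2] ++ [3, 4, 5]

failing
  -- Claiming a length of 4 does not type check.
  tooShort : Vect 4 Nat
  tooShort = [1, 2] ++ [3, 4, 5]
```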

We can also use type-level computations as patterns on the input types. Here is an alternative type and implementation for drop, which you implemented in the exercises by using a Fin n argument:

drop' : (m : Nat) -> Vect (m + n) a -> Vect n a
drop' 0     xs        = xs
drop' (S k) (_ :: xs) = drop' k xs

Note that changing the order from (m + n) to (n + m) in the second parameter will cause an error at the second xs:

While processing right hand side of drop'. Can't solve constraint between: plus n 0 and n.

You will learn why in the next section.
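To see the type-level pattern at work, here is a short sketch using Vect from Data.Vect. The function is the same as drop' above, but it is called dropFront here (a hypothetical name) to stay clear of similarly named functions in base. At the call site, Idris has to unify 2 + n with 5 and should solve n as 3:

```idris
import Data.Vect

-- Same definition as drop' above, under a different name.
dropFront : (m : Nat) -> Vect (m + n) a -> Vect n a
dropFront 0     xs        = xs
dropFront (S k) (_ :: xs) = dropFront k xs

-- The length of the result is inferred from the length of the input.
lastThree : Vect 3 Nat
lastThree = dropFront 2 [1, 2, 3, 4, 5]
```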

Limitations

After all the examples and exercises in this section you might have come to the conclusion that we can use arbitrary expressions in the types and Idris will happily evaluate and unify all of them for us.

I'm afraid that's not even close to the truth. The examples in this section were hand-picked because they are known to just work. They work because there was always a direct link between our own pattern matches and the implementations of the functions we used at compile time.

For instance, here is the implementation of addition of natural numbers:

add : Nat -> Nat -> Nat
add Z     n = n
add (S k) n = S $ add k n

As you can see, add is implemented via a pattern match on its first argument, while the second argument is never inspected. Note, how this is exactly how (++) for Vect is implemented: There, we also pattern match on the first argument, returning the second unmodified in the Nil case, and prepending the head to the result of appending the tail in the cons case. Since there is a direct correspondence between the two pattern matches, it is possible for Idris to unify 0 + n with n in the Nil case, and (S k) + n with S (k + n) in the cons case.
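The following sketch illustrates which expressions reduce on their own and which get stuck. It uses Vect from Data.Vect, and the function names are invented for this example. The first two functions should type check because the first argument of the addition starts with a known constructor. The one in the failing block should not, because n is neither known to be Z nor of the form S k:

```idris
import Data.Vect

-- 0 + n reduces to n by the first clause of addition.
leftZero : Vect (0 + n) a -> Vect n a
leftZero xs = xs

-- S k + n reduces to S (k + n) by the second clause.
leftSucc : Vect (S k + n) a -> Vect (S (k + n)) a
leftSucc xs = xs

failing
  -- n + 0 is stuck: addition pattern matches on its first argument.
  rightZero : Vect (n + 0) a -> Vect n a
  rightZero xs = xs
```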

Here is a simple example, where Idris will no longer be convinced without some help from us:

failing "Can't solve constraint"
  reverse : Vect n a -> Vect n a
  reverse []        = []
  reverse (x :: xs) = reverse xs ++ [x]

When we type-check the above, Idris will fail with the following error message: "Can't solve constraint between: plus n 1 and S n." Here's what's going on: From the pattern match on the left hand side, Idris knows that the length of the vector is S n, for some natural number n corresponding to the length of xs. The length of the vector on the right hand side is n + 1, according to the type of (++) and the lengths of xs and [x]. Overloaded operator (+) is implemented via function Prelude.plus, that's why Idris replaces (+) with plus in the error message.

As you can see from the above, Idris can't verify on its own that 1 + n is the same thing as n + 1. It can accept some help from us, though. If we come up with a proof that the above equality holds (or - more generally - that our implementation of addition for natural numbers is commutative), we can use this proof to rewrite the types on the right hand side of reverse. Writing proofs and using rewrite will require some in-depth explanations and examples. Therefore, these things will have to wait until another chapter.
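Until then, one possible way around the problem is to avoid n + 1 in the types altogether. The sketch below is only an illustration and not how the rewrite-based solution will look. It assumes Vect from Data.Vect, and addLast and rev are hypothetical names. Since the result type of addLast mentions S n directly, the lengths line up without a proof:

```idris
import Data.Vect

-- Append a single value at the end of a vector.
-- Pattern matching on the vector mirrors the shape of S n.
addLast : Vect n a -> a -> Vect (S n) a
addLast []        v = [v]
addLast (x :: xs) v = x :: addLast xs v

-- Reverse a vector by moving each head to the end of the reversed tail.
rev : Vect n a -> Vect n a
rev []        = []
rev (x :: xs) = addLast (rev xs) x
```

Note that this version takes quadratic time, so it is meant as a demonstration of the typing issue rather than as an efficient implementation.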

Unrestricted Implicits

In functions like replicate, we pass a natural number n as an explicit, unrestricted argument from which we infer the length of the vector to return. In some circumstances, n can be inferred from the context. For instance, in the following example it is tedious to pass n explicitly:

ex4 : Vect 3 Integer
ex4 = zipWith (*) (replicate 3 10) (replicate 3 11)

The value n is clearly derivable from the context, which can be confirmed by replacing it with underscores:

ex5 : Vect 3 Integer
ex5 = zipWith (*) (replicate _ 10) (replicate _ 11)

We therefore can implement an alternative version of replicate, where we pass n as an implicit argument of unrestricted quantity:

replicate' : {n : _} -> a -> Vect n a
replicate' = replicate n

Note how, in the implementation of replicate', we can refer to n and pass it as an explicit argument to replicate.
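Here is a short usage sketch, assuming Vect and replicate from Data.Vect. The function is called replicateImp here (a made-up name) to keep the example self-contained. In threeTens the length comes from the type signature, while in twoTrues it is passed explicitly by name:

```idris
import Data.Vect

-- Same idea as replicate' above: n is an unrestricted implicit.
replicateImp : {n : _} -> a -> Vect n a
replicateImp = replicate n

-- n is inferred to be 3 from the expected type.
threeTens : Vect 3 Integer
threeTens = replicateImp 10

-- n can still be given explicitly.
twoTrues : Vect 2 Bool
twoTrues = replicateImp {n = 2} True
```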

Deciding whether to pass potentially inferable arguments to a function implicitly or explicitly is a question of how often the arguments actually are inferable by Idris. Sometimes it might even be useful to have both versions of a function. Remember, however, that even in case of an implicit argument we can still pass the value explicitly:

ex6 : Vect ? Bool
ex6 = replicate' {n = 2} True

In the type signature above, the question mark (?) means, that Idris should try and figure out the value on its own by unification. This forces us to specify n explicitly on the right hand side of ex6.

Pattern Matching on Implicits

The implementation of replicate' makes use of function replicate, where we could pattern match on the explicit argument n. However, it is also possible to pattern match on implicit, named arguments of non-zero quantity:

replicate'' : {n : _} -> a -> Vect n a
replicate'' {n = Z}   _ = Nil
replicate'' {n = S _} v = v :: replicate'' v
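The quantity of the implicit argument matters here. The following sketch (with Vect from Data.Vect and hypothetical function names) returns the length index as a runtime value. This should work for an unrestricted implicit, but is expected to be rejected when n is an ordinary, erased implicit:

```idris
import Data.Vect

-- n is bound with braces and unrestricted quantity,
-- so it may be used on the right hand side.
vectLen : {n : Nat} -> Vect n a -> Nat
vectLen _ = n

failing
  -- Here n is erased (quantity 0) and not available at runtime.
  vectLenErased : Vect n a -> Nat
  vectLenErased _ = n
```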

Exercises part 3

  1. Here is a function declaration for flattening a List of Lists:

    flattenList : List (List a) -> List a
    

    Implement flattenList and declare and implement a similar function flattenVect for flattening vectors of vectors.

  2. Implement functions take' and splitAt' like in the exercises of the previous section but using the technique shown for drop'.

  3. Implement function transpose for converting an m x n-matrix (represented as a Vect m (Vect n a)) to an n x m-matrix.

    Note: This might be a challenging exercise, but make sure to give it a try. As usual, make use of holes if you get stuck!

    Here is an example of how this should work in action:

    Solutions.Dependent> transpose [[1,2,3],[4,5,6]]
    [[1, 4], [2, 5], [3, 6]]
    

Conclusion

  • Dependent types allow us to calculate types from values. This makes it possible to encode properties of values at the type-level and verify these properties at compile time.

  • Length-indexed lists (vectors) let us rule out certain implementation errors, by forcing us to be precise about the lengths of input and output vectors.

  • We can use patterns in type signatures, for instance to express that the length of a vector is non-zero and therefore, the vector is non-empty.

  • When creating values of a type family, the values of the indices need to be known at compile time, or they need to be passed as arguments to the function creating the values, where we can pattern match on them to figure out, which constructors to use.

  • We can use Fin n, the type of natural numbers strictly smaller than n, to safely index into a vector of length n.

  • Sometimes, it is convenient to pass inferable arguments as non-erased implicits, in which case we can still inspect them by pattern matching or pass them to other functions, while Idris will try and fill in the values for us.

Note, that data type Vect together with many of the functions we implemented here is available from module Data.Vect from the base library. Likewise, Fin is available from Data.Fin from base.

What's next

In the next section, it is time to learn how to write effectful programs and how to do this while still staying pure.

IO: Programming with Side Effects

So far, all our examples and exercises dealt with pure, total functions. We didn't read or write content from or to files, nor did we write any messages to the standard output. It is time to change that and learn, how we can write effectful programs in Idris.

Pure Side Effects?

module Tutorial.IO.PureSideEffects

import Data.List1
import Data.String
import Data.Vect

%default total

If we once again look at the hello world example from the introduction, it had the following type and implementation:

hello : IO ()
hello = putStrLn "Hello World!"

If you load this module in a REPL session and evaluate hello, you'll get the following:

Tutorial.IO> hello
MkIO (prim__putStr "Hello World!")

This might not be what you expected, given that we'd actually wanted the program to just print "Hello World!". In order to explain what's going on here, we need to quickly look at how evaluation at the REPL works.

When we evaluate some expression at the REPL, Idris tries to reduce it to a value until it gets stuck somewhere. In the above case, Idris gets stuck at function prim__putStr. This is a foreign function defined in the Prelude, which has to be implemented by each backend in order to be available there. At compile time (and at the REPL), Idris knows nothing about the implementations of foreign functions and therefore can't reduce foreign function calls, unless they are built into the compiler itself. But even then, values of type IO a (a being a type parameter) are typically not reduced.

It is important to understand that values of type IO a describe a program, which, when being executed, will return a value of type a, after performing arbitrary side effects along the way. For instance, putStrLn has type String -> IO (). Read this as: "putStrLn is a function, which, when given a String argument, will return a description of an effectful program with an output type of ()". (() is syntactic sugar for type Unit, the empty tuple defined in the Prelude, which has only one value called MkUnit, for which we can also use () in our code.)

Since values of type IO a are mere descriptions of effectful computations, functions returning such values or taking such values as arguments are still pure and thus referentially transparent. It is, however, not possible to extract a value of type a from a value of type IO a, that is, there is no generic function IO a -> a, as such a function would inadvertently execute the side effects when extracting the result from its argument, thus breaking referential transparency. (Actually, there is such a function called unsafePerformIO. Do not ever use it in your code unless you know what you are doing.)

Do Blocks

If you are new to pure functional programming, you might now - rightfully - mumble something about how useless it is to have descriptions of effectful programs without being able to run them. So please, hear me out. While we are not able to run values of type IO a when writing programs, that is, there is no function of type IO a -> a, we are able to chain such computations and describe more complex programs. Idris provides special syntax for this: Do blocks. Here's an example:

export
readHello : IO ()
readHello = do
  name <- getLine
  putStrLn $ "Hello " ++ name ++ "!"

Before we talk about what's going on here, let's give this a go at the REPL:

Tutorial.IO> :exec readHello
Stefan
Hello Stefan!

This is an interactive program, which will read a line from standard input (getLine), assign the result to variable name, and then use name to create a friendly greeting and write it to standard output.

Note the do keyword at the beginning of the implementation of readHello: It starts a do block, where we can chain IO computations and bind intermediary results to variables using arrows pointing to the left (<-), which can then be used in later IO actions. This concept is powerful enough to let us encapsulate arbitrary programs with side effects in a single value of type IO. Such a description can then be returned by function main, the main entry point to an Idris program, which is being executed when we run a compiled Idris binary.
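As a minimal sketch of what such a program could look like, here is a complete module with a main function. The helper greeting is a name invented for this example. It is an ordinary pure function, while main is the description of the whole effectful program:

```idris
module Main

-- Pure helper: builds the greeting without performing any effects.
greeting : String -> String
greeting name = "Hello " ++ name ++ "!"

-- The description of the whole program, run when the binary is executed.
main : IO ()
main = do
  name <- getLine
  putStrLn (greeting name)
```

Loading this module at the REPL and entering :exec main should behave like readHello above.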

The Difference between Program Description and Execution

In order to better understand the difference between describing an effectful computation and executing or running it, here is a small program:

launchMissiles : IO ()
launchMissiles = putStrLn "Boom! You're dead."

export
friendlyReadHello : IO ()
friendlyReadHello = do
  _ <- putStrLn "Please enter your name."
  readHello

actions : Vect 3 (IO ())
actions = [launchMissiles, friendlyReadHello, friendlyReadHello]

runActions : Vect (S n) (IO ()) -> IO ()
runActions (_ :: xs) = go xs
  where go : Vect k (IO ()) -> IO ()
        go []        = pure ()
        go (y :: ys) = do
          _ <- y
          go ys

readHellos : IO ()
readHellos = runActions actions

Before I explain what the code above does, please note function pure used in the implementation of runActions. It is a constrained function, about which we will learn in the next chapter. Specialized to IO, it has generic type a -> IO a: It allows us to wrap a value in an IO action. The resulting IO program will just return the wrapped value without performing any side effects. We can now look at the big picture of what's going on in readHellos.

First, we define a friendlier version of readHello: When executed, this will ask about our name explicitly. Since we will not use the result of putStrLn any further, we can use an underscore as a catch-all pattern here. Afterwards, readHello is invoked. We also define launchMissiles, which, when being executed, will lead to the destruction of planet earth.

Now, runActions is the function we use to demonstrate that describing an IO action is not the same as running it. It will drop the first action from the non-empty vector it takes as its argument and return a new IO action, which describes the execution of the remaining IO actions in sequence. If this behaves as expected, the first IO action passed to runActions should be silently dropped together with all its potential side effects.

When we execute readHellos at the REPL, we will be asked for our name twice, although actions also contains launchMissiles at the beginning. Luckily, although we described how to destroy the planet, the action was not executed, and we are (probably) still here.

From this example we learn several things:

  • Values of type IO a are pure descriptions of programs, which, when being executed, perform arbitrary side effects before returning a value of type a.

  • Values of type IO a can be safely returned from functions and passed around as arguments or in data structures, without the risk of them being executed.

  • Values of type IO a can be safely combined in do blocks to describe new IO actions.

  • An IO action will only ever get executed when it's passed to :exec at the REPL, or when it is the main function of a compiled Idris program that is being executed.

  • It is not possible to ever break out of the IO context: There is no function of type IO a -> a, as such a function would need to execute its argument in order to extract the final result, and this would break referential transparency.

Combining Pure Code with IO Actions

The title of this subsection is somewhat misleading. IO actions are pure values, but what is typically meant here, is that we combine non-IO functions with effectful computations.

As a demonstration, in this section we are going to write a small program for evaluating arithmetic expressions. We are going to keep things simple and allow only expressions with a single operator and two arguments, both of which must be integers, for instance 12 + 13.

We are going to use function split from Data.String in base to tokenize arithmetic expressions. We then try to parse the two integer values and the operator. These operations might fail, since user input can be invalid, so we also need an error type. We could actually just use String, but I consider it to be good practice to use custom sum types for erroneous conditions.

public export
data Error : Type where
  NotAnInteger    : (value : String) -> Error
  UnknownOperator : (value : String) -> Error
  ParseError      : (input : String) -> Error

dispError : Error -> String
dispError (NotAnInteger v)    = "Not an integer: " ++ v ++ "."
dispError (UnknownOperator v) = "Unknown operator: " ++ v ++ "."
dispError (ParseError v)      = "Invalid expression: " ++ v ++ "."

In order to parse integer literals, we use function parseInteger from Data.String:

export
readInteger : String -> Either Error Integer
readInteger s = maybe (Left $ NotAnInteger s) Right $ parseInteger s

Likewise, we declare and implement a function for parsing arithmetic operators:

export
readOperator : String -> Either Error (Integer -> Integer -> Integer)
readOperator "+" = Right (+)
readOperator "*" = Right (*)
readOperator s   = Left (UnknownOperator s)

We are now ready to parse and evaluate simple arithmetic expressions. This consists of several steps (splitting the input string, parsing each literal), each of which can fail. Later, when we learn about monads, we will see that do blocks can be used in such occasions just as well. However, in this case we can use an alternative syntactic convenience: Pattern matching in let bindings. Here is the code:

eval : String -> Either Error Integer
eval s =
  let [x,y,z]  := forget $ split isSpace s | _ => Left (ParseError s)
      Right v1 := readInteger x  | Left e => Left e
      Right op := readOperator y | Left e => Left e
      Right v2 := readInteger z  | Left e => Left e
   in Right $ op v1 v2

Let's break this down a bit. On the first line, we split the input string at all whitespace occurrences. Since split returns a List1 (a type for non-empty lists exported from Data.List1 in base) but pattern matching on List is more convenient, we convert the result using Data.List1.forget. Note, how we use a pattern match on the left hand side of the assignment operator :=. This is a partial pattern match (partial meaning, that it doesn't cover all possible cases), therefore we have to deal with the other possibilities as well, which is done after the vertical line. This can be read as follows: "If the pattern match on the left hand side is successful, and we get a list of exactly three tokens, continue with the let expression, otherwise return a ParseError in a Left immediately".

The other three lines behave exactly the same: Each has a partial pattern match on the left hand side with instructions what to return in case of invalid input after the vertical bar. We will later see, that this syntax is also available in do blocks.
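To look at this syntax in isolation, here is a smaller sketch that is independent of eval. The function firstTwoSum is a hypothetical example. It uses a single partial pattern match in a let binding, with the alternative after the vertical bar handling all lists with fewer than two values:

```idris
-- Add the first two values of a list, if there are at least two.
firstTwoSum : List Integer -> Either String Integer
firstTwoSum xs =
  let (x :: y :: _) := xs | _ => Left "need at least two values"
   in Right (x + y)
```

With this definition, firstTwoSum [1, 2, 3] should evaluate to Right 3, while a list with a single element ends up in the alternative branch.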

Note, how all of the functionality implemented so far is pure, that is, it does not describe computations with side effects. (One could argue that already the possibility of failure is an observable effect, but even then, the code above is still referentially transparent, can be easily tested at the REPL, and evaluated at compile time, which is the important thing here.)

Finally, we can wrap this functionality in an IO action, which reads a string from standard input and tries to evaluate the arithmetic expression:

exprProg : IO ()
exprProg = do
  s <- getLine
  case eval s of
    Left err  => do
      putStrLn "An error occurred:"
      putStrLn (dispError err)
    Right res => putStrLn (s ++ " = " ++ show res)

Note, how in exprProg we were forced to deal with the possibility of failure and handle both constructors of Either differently in order to print a result. Note also, that do blocks are ordinary expressions, and we can, for instance, start a new do block on the right hand side of a case expression.

Exercises part 1

In these exercises, you are going to implement some small command-line applications. Some of these will potentially run forever, as they will only stop when the user enters a keyword for quitting the application. Such programs are no longer provably total. If you added the %default total pragma at the top of your source file, you'll need to annotate these functions with covering, meaning that you covered all cases in all pattern matches but your program might still loop due to unrestricted recursion.

  1. Implement function rep, which will read a line of input from the terminal, evaluate it using the given function, and print the result to standard output:

    rep : (String -> String) -> IO ()
    
  2. Implement function repl, which behaves just like rep but will repeat itself forever (or until being forcefully terminated):

    covering
    repl : (String -> String) -> IO ()
    
  3. Implement function replTill, which behaves just like repl but will only continue looping if the given function returns a Right. If it returns a Left, replTill should print the final message wrapped in the Left and then stop.

    covering
    replTill : (String -> Either String String) -> IO ()
    
  4. Write a program, which reads arithmetic expressions from standard input, evaluates them using eval, and prints the result to standard output. The program should loop until the user stops it by entering "done", in which case the program should terminate with a friendly greeting. Use replTill in your implementation.

  5. Implement function replWith, which behaves just like repl but uses some internal state to accumulate values. At each iteration (including the very first one!), the current state should be printed to standard output using function dispState, and the next state should be computed using function next. The loop should terminate in case of a Left and print a final message using dispResult:

    covering
    replWith :  (state      : s)
             -> (next       : s -> String -> Either res s)
             -> (dispState  : s -> String)
             -> (dispResult : res -> s -> String)
             -> IO ()
    
  6. Use replWith from Exercise 5 to write a program for reading natural numbers from standard input and printing the accumulated sum of these numbers. The program should terminate in case of invalid input or if the user enters "done".

Do Blocks, Desugared

module Tutorial.IO.DoUnsugared

import Data.List1
import Data.String
import Data.Vect

import Tutorial.IO.PureSideEffects

%default total

Here's an important piece of information: There is nothing special about do blocks. They are just syntactic sugar, which is converted to a sequence of operator applications. By syntactic sugar, we mean syntax in a programming language that makes it easier to express certain things in that language without making the language itself any more powerful or expressive. Here, it means you could write all the IO programs without using do notation, but the code you'll write will sometimes be harder to read, so do blocks provide nicer syntax for these occasions.

Consider the following example program:

sugared1 : IO ()
sugared1 = do
  str1 <- getLine
  str2 <- getLine
  str3 <- getLine
  putStrLn (str1 ++ str2 ++ str3)

The compiler will convert this to the following program before disambiguating function names and type checking:

desugared1 : IO ()
desugared1 =
  getLine >>= (\str1 =>
    getLine >>= (\str2 =>
      getLine >>= (\str3 =>
        putStrLn (str1 ++ str2 ++ str3)
      )
    )
  )

There is a new operator ((>>=)) called bind in the implementation of desugared1. If you look at its type at the REPL, you'll see the following:

Main> :t (>>=)
Prelude.>>= : Monad m => m a -> (a -> m b) -> m b

This is a constrained function requiring an interface called Monad. We will talk about Monad and some of its friends in the next chapter. Specialized to IO, bind has the following type:

Main> :t (>>=) {m = IO}
>>= : IO a -> (a -> IO b) -> IO b

This describes a sequencing of IO actions. Upon execution, the first IO action is being run and its result is being passed as an argument to the function generating the second IO action, which is then also being executed.

You might remember, that you already implemented something similar in an earlier exercise: In Algebraic Data Types, you implemented bind for Maybe and Either e. We will learn in the next chapter, that Maybe and Either e too come with an implementation of Monad. For now, suffice to say that Monad allows us to run computations with some kind of effect in sequence by passing the result of the first computation to the function returning the second computation. In desugared1 you can see, how we first perform an IO action and use its result to compute the next IO action and so on. The code is somewhat hard to read, since we use several layers of nested anonymous functions. That is why in such cases, do blocks are a nice alternative to express the same functionality.

Since do blocks are always desugared to sequences of applied bind operators, we can use them to chain any monadic computation. For instance, we can rewrite function eval by using a do block like so:

evalDo : String -> Either Error Integer
evalDo s = case forget $ split isSpace s of
  [x,y,z] => do
    v1 <- readInteger x
    op <- readOperator y
    v2 <- readInteger z
    Right $ op v1 v2
  _       => Left (ParseError s)

Don't worry, if this doesn't make too much sense yet. We will see many more examples, and you'll get the hang of this soon enough. The important thing to remember is how do blocks are always converted to sequences of bind operators as shown in desugared1.
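If evalDo feels like too much at once, the following smaller sketch may help. It uses Maybe instead of Either, and the names safePred, predTwice, and predTwiceBind are made up for this example. Both versions should behave identically, since the do block is converted to the bind operator:

```idris
-- The predecessor of a natural number, if there is one.
safePred : Nat -> Maybe Nat
safePred Z     = Nothing
safePred (S k) = Just k

-- Written with a do block.
predTwice : Nat -> Maybe Nat
predTwice n = do
  m <- safePred n
  safePred m

-- The same computation with an explicit bind.
predTwiceBind : Nat -> Maybe Nat
predTwiceBind n = safePred n >>= (\m => safePred m)
```

As soon as one step yields Nothing, the remaining steps are skipped and the overall result is Nothing.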

Binding Unit

Remember our implementation of friendlyReadHello? Here it is again:

friendlyReadHello' : IO ()
friendlyReadHello' = do
  _ <- putStrLn "Please enter your name."
  readHello

The underscore in there is a bit ugly and unnecessary. In fact, a common use case is to just chain effectful computations with result type Unit (()), merely for the side effects they perform. For instance, we could repeat friendlyReadHello three times, like so:

friendly3 : IO ()
friendly3 = do
  _ <- friendlyReadHello
  _ <- friendlyReadHello
  friendlyReadHello

This is such a common thing to do, that Idris allows us to drop the bound underscores altogether:

friendly4 : IO ()
friendly4 = do
  friendlyReadHello
  friendlyReadHello
  friendlyReadHello
  friendlyReadHello

Note, however, that the above gets desugared slightly differently:

friendly4Desugared : IO ()
friendly4Desugared =
  friendlyReadHello >>
  friendlyReadHello >>
  friendlyReadHello >>
  friendlyReadHello

Operator (>>) has the following type:

Main> :t (>>)
Prelude.>> : Monad m => m () -> Lazy (m b) -> m b

Note the Lazy keyword in the type signature. This means, that the wrapped argument will be lazily evaluated. This makes sense in many occasions. For instance, if the Monad in question is Maybe the result will be Nothing if the first argument is Nothing, in which case there is no need to even evaluate the second argument.
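Here is a small sketch of (>>) used with Maybe. The functions nonZero and doubleNonZero are hypothetical names chosen for this illustration. If nonZero returns Nothing, the second argument of (>>) is not needed to determine the result:

```idris
-- Succeeds with the unit value only for non-zero numbers.
nonZero : Nat -> Maybe ()
nonZero Z     = Nothing
nonZero (S _) = Just ()

-- Runs the check first, then returns the doubled value.
doubleNonZero : Nat -> Maybe Nat
doubleNonZero n = nonZero n >> Just (n * 2)
```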

Do, Overloaded

Because Idris supports function and operator overloading, we can write custom bind operators, which allows us to use do notation for types without an implementation of Monad. For instance, here is a custom implementation of (>>=) for sequencing computations returning vectors. Every value in the first vector (of length m) will be converted to a vector of length n, and the results will be concatenated leading to a vector of length m * n:

flatten : Vect m (Vect n a) -> Vect (m * n) a
flatten []        = []
flatten (x :: xs) = x ++ flatten xs

(>>=) : Vect m a -> (a -> Vect n b) -> Vect (m * n) b
as >>= f = flatten (map f as)

It is not possible to write an implementation of Monad, which encapsulates this behavior, as the types wouldn't match: Monadic bind specialized to Vect has type Vect k a -> (a -> Vect k b) -> Vect k b. As you see, the sizes of all three occurrences of Vect have to be the same, which is not what we expressed in our custom version of bind. Here is an example to see this in action:

modString : String -> Vect 4 String
modString s = [s, reverse s, toUpper s, toLower s]

testDo : Vect 24 String
testDo = DoUnsugared.do
  s1 <- ["Hello", "World"]
  s2 <- [1, 2, 3]
  modString (s1 ++ show s2)

Try to figure out how testDo works by desugaring it manually and then comparing its result with what you expected at the REPL. Note, how we helped Idris disambiguate, which version of the bind operator to use by prefixing the do keyword with part of the operator's namespace. In this case, this wasn't strictly necessary, although Vect k does have an implementation of Monad, but it is still good to know that it is possible to help the compiler with disambiguating do blocks.

Of course, we can (and should!) overload (>>) in the same manner as (>>=), if we want to overload the behavior of do blocks.
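A possible way to do this is sketched below. It assumes Vect from Data.Vect and puts the custom operators into their own namespace VectDo (a name invented for this example), so the do block can be qualified with it. The definitions are marked public export so that they can also be evaluated at compile time outside of the namespace:

```idris
import Data.Vect

namespace VectDo
  public export
  flatten : Vect m (Vect n a) -> Vect (m * n) a
  flatten []        = []
  flatten (x :: xs) = x ++ flatten xs

  public export
  (>>=) : Vect m a -> (a -> Vect n b) -> Vect (m * n) b
  as >>= f = flatten (map f as)

  -- Sequencing without binding: the unit values of the first
  -- vector are ignored, only its length matters.
  public export
  (>>) : Vect m () -> Vect n b -> Vect (m * n) b
  as >> bs = flatten (map (\_ => bs) as)

-- The first line has no bound result, so it is desugared with (>>).
repeated : Vect 4 Nat
repeated = VectDo.do
  [(), ()]
  [1, 2]
```

In this sketch, the second vector is repeated once for every unit value in the first one.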

Modules and Namespaces

Every data type, function, or operator can be unambiguously identified by prefixing it with its namespace. A function's namespace typically is the same as the module where it was defined. For instance, the fully qualified name of function eval would be Tutorial.IO.eval. Function and operator names must be unique in their namespace.

As we already learned, Idris can often disambiguate between functions with the same name but defined in different namespaces based on the types involved. If this is not possible, we can help the compiler by prefixing the function or operator name with a suffix of the full namespace. Let's demonstrate this at the REPL:

Tutorial.IO> :t (>>=)
Prelude.>>= : Monad m => m a -> (a -> m b) -> m b
Tutorial.IO.>>= : Vect m a -> (a -> Vect n b) -> Vect (m * n) b

As you can see, if we load this module in a REPL session and inspect the type of (>>=), we get two results as two operators with this name are in scope. If we only want the REPL to print the type of our custom bind operator, it is sufficient to prefix it with IO, although we could also prefix it with its full namespace:

Tutorial.IO> :t IO.(>>=)
Tutorial.IO.>>= : Vect m a -> (a -> Vect n b) -> Vect (m * n) b
Tutorial.IO> :t Tutorial.IO.(>>=)
Tutorial.IO.>>= : Vect m a -> (a -> Vect n b) -> Vect (m * n) b

Since function names must be unique in their namespace and we still may want to define two overloaded versions of a function in an Idris module, Idris makes it possible to add additional namespaces to modules. For instance, in order to define another function called eval, we need to add it to its own namespace (note, that all definitions in a namespace must be indented by the same amount of whitespace):

namespace Foo
  export
  eval : Nat -> Nat -> Nat
  eval = (*)

-- prefixing `eval` with its namespace is not strictly necessary here
testFooEval : Nat
testFooEval = Foo.eval 12 100

Now, here is an important thing: For functions and data types to be accessible from outside their namespace or module, they need to be exported by annotating them with the export or public export keywords.

The difference between export and public export is the following: A function annotated with export exports its type and can be called from other namespaces. A data type annotated with export exports its type constructor but not its data constructors. A function annotated with public export also exports its implementation. This is necessary to use the function in compile-time computations. A data type annotated with public export exports its data constructors as well.

In general, consider annotating data types with public export, since otherwise you will not be able to create values of these types or deconstruct them in pattern matches. Likewise, unless you plan to use your functions in compile-time computations, annotate them with export.
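The following sketch tries to make the difference visible for functions. It assumes Vect from Data.Vect, and the namespace Sizes as well as all function names are made up for this example. Both functions have the same implementation, but only the one annotated with public export is expected to reduce when it is used in a type outside of its namespace:

```idris
import Data.Vect

namespace Sizes
  -- The implementation is exported: usable in compile-time computations.
  public export
  double : Nat -> Nat
  double n = n + n

  -- Only the type is exported: callable from outside,
  -- but opaque to the type checker there.
  export
  opaqueDouble : Nat -> Nat
  opaqueDouble n = n + n

-- double n reduces to n + n, which matches the type of (++).
twice : Vect n a -> Vect (double n) a
twice xs = xs ++ xs

failing
  -- opaqueDouble n does not reduce here, so the lengths do not unify.
  twiceOpaque : Vect n a -> Vect (opaqueDouble n) a
  twiceOpaque xs = xs ++ xs
```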

Bind, with a Bang

Sometimes, even do blocks are too noisy to express a combination of effectful computations. In this case, we can prefix the effectful parts with an exclamation mark (wrapping them in parentheses if they contain additional whitespace), while leaving pure expressions unmodified:

getHello : IO ()
getHello = putStrLn $ "Hello " ++ !getLine ++ "!"

The above gets desugared to the following do block:

getHello' : IO ()
getHello' = do
  s <- getLine
  putStrLn $ "Hello " ++ s ++ "!"

Here is another example:

bangExpr : String -> String -> String -> Maybe Integer
bangExpr s1 s2 s3 =
  Just $ !(parseInteger s1) + !(parseInteger s2) * !(parseInteger s3)

And here is the desugared do block:

bangExpr' : String -> String -> String -> Maybe Integer
bangExpr' s1 s2 s3 = do
  x1 <- parseInteger s1
  x2 <- parseInteger s2
  x3 <- parseInteger s3
  Just $ x1 + x2 * x3

Please remember the following: Syntactic sugar has been introduced to make code more readable or more convenient to write. If it is abused just to show how clever you are, you make things harder for other people (including your future self!) reading and trying to understand your code.

Exercises part 2

  1. Reimplement the following do blocks, once by using bang notation, and once by writing them in their desugared form with nested binds:

    ex1a : IO String
    ex1a = do
      s1 <- getLine
      s2 <- getLine
      s3 <- getLine
      pure $ s1 ++ reverse s2 ++ s3
    
    ex1b : Maybe Integer
    ex1b = do
      n1 <- parseInteger "12"
      n2 <- parseInteger "300"
      Just $ n1 + n2 * 100
    
  2. Below is the definition of an indexed family of types, the index of which keeps track of whether the value in question is possibly empty or provably non-empty:

    data List01 : (nonEmpty : Bool) -> Type -> Type where
      Nil  : List01 False a
      (::) : a -> List01 False a -> List01 ne a
    

    Please note, that the Nil case must have the nonEmpty tag set to False, while with the cons case, this is optional. So, a List01 False a can be empty or non-empty, and we'll only find out, which is the case, by pattern matching on it. A List01 True a on the other hand must be a cons, as for the Nil case the nonEmpty tag is always set to False.

    1. Declare and implement function head for non-empty lists:

      head : List01 True a -> a
      
    2. Declare and implement function weaken for converting any List01 ne a to a List01 False a of the same length and order of values.

    3. Declare and implement function tail for extracting the possibly empty tail from a non-empty list.

    4. Implement function (++) for concatenating two values of type List01. Note, how we use a type-level computation to make sure the result is non-empty if and only if at least one of the two arguments is non-empty:

      (++) : List01 b1 a -> List01 b2 a -> List01 (b1 || b2) a
      
    5. Implement utility function concat' and use it in the implementation of concat. Note, that in concat the two boolean tags are passed as unrestricted implicits, since you will need to pattern match on these to determine whether the result is provably non-empty or not:

      concat' : List01 ne1 (List01 ne2 a) -> List01 False a
      
      concat :  {ne1, ne2 : _}
             -> List01 ne1 (List01 ne2 a)
             -> List01 (ne1 && ne2) a
      
    6. Implement map01:

      map01 : (a -> b) -> List01 ne a -> List01 ne b
      
    7. Implement a custom bind operator in namespace List01 for sequencing computations returning List01s.

      Hint: Use map01 and concat in your implementation and make sure to use unrestricted implicits where necessary.

      You can use the following examples to test your custom bind operator:

      -- this and lf are necessary to make sure, which tag to use
      -- when using list literals
      lt : List01 True a -> List01 True a
      lt = id
      
      lf : List01 False a -> List01 False a
      lf = id
      
      test : List01 True Integer
      test = List01.do
        x  <- lt [1,2,3]
        y  <- lt [4,5,6,7]
        op <- lt [(*), (+), (-)]
        [op x y]
      
      test2 : List01 False Integer
      test2 = List01.do
        x  <- lt [1,2,3]
        y  <- Nil {a = Integer}
        op <- lt [(*), (+), (-)]
        lt [op x y]
      

Some notes on Exercise 2: Here, we combined the capabilities of List and Data.List1 in a single indexed type family. This allowed us to treat list concatenation correctly: If at least one of the arguments is provably non-empty, the result is also non-empty. To tackle this correctly with List and List1, a total of four concatenation functions would have to be written. So, while it is often possible to define distinct data types instead of indexed families, the latter allow us to perform type-level computations to be more precise about the pre- and postconditions of the functions we write, at the cost of more-complex type signatures. In addition, sometimes it's not possible to derive the values of the indices from pattern matching on the data values alone, so they have to be passed as unerased (possibly implicit) arguments.

Please remember, that do blocks are first desugared, before type-checking, disambiguating which bind operator to use, and filling in implicit arguments. It is therefore perfectly fine to define bind operators with arbitrary constraints or implicit arguments as was shown above. Idris will handle all the details, after desugaring the do blocks.
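As a further illustration, here is a minimal sketch of a custom bind operator that is unrelated to the exercises. The namespace Logged and the function logged are hypothetical names chosen for this example. The bind operator works on pairs of log messages and values and concatenates the messages of both steps:

```idris
namespace Logged
  -- Sequences two computations, each returning a list of log messages
  -- together with a result, and appends the messages in order.
  export
  (>>=) : (List String, a) -> (a -> (List String, b)) -> (List String, b)
  (>>=) (l1, x) f =
    let (l2, y) = f x
     in (l1 ++ l2, y)

-- `Logged.do` tells Idris to desugar this block with `Logged.(>>=)`.
logged : (List String, Nat)
logged = Logged.do
  x <- (["got 1"], the Nat 1)
  y <- (["got 2"], the Nat 2)
  ([], x + y)
```

Since the do block is desugared first, Idris only has to find a function called (>>=) in namespace Logged with a type that fits. No implementation of Monad is required.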

Working with Files

module Tutorial.IO.Files

import Data.List1
import Data.String
import Data.Vect

import System.File

%default total

Module System.File from the base library exports the utilities necessary to work with file handles and to read from and write to files. When you have a file path (for instance "/home/hock/idris/tutorial/tutorial.ipkg"), the first thing you will typically do is try to create a file handle (of type System.File.File) by calling openFile.

Here is a program for counting all empty lines in a Unix/Linux-file:

covering
countEmpty : (path : String) -> IO (Either FileError Nat)
countEmpty path = openFile path Read >>= either (pure . Left) (go 0)
  where covering go : Nat -> File -> IO (Either FileError Nat)
        go k file = do
          False <- fEOF file | True => closeFile file $> Right k
          Right "\n" <- fGetLine file
            | Right _  => go k file
            | Left err => closeFile file $> Left err
          go (k + 1) file

In the example above, I invoked (>>=) without starting a do block. Make sure you understand what's going on here. Being able to read concise functional code is important for understanding other people's code. Have a look at function either at the REPL, try figuring out what (pure . Left) does, and note how we use a partially applied version of go as the second argument to either.
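If either is new to you, the following pure example might help. It is only a sketch, and describe is a name made up for this illustration. Function either takes one function for each of the two cases of an Either and applies the one that fits:

```idris
-- The first function handles the `Left` case, the second one
-- handles the `Right` case.
describe : Either String Nat -> String
describe = either ("error: " ++) (\n => "value: " ++ show n)
```

In countEmpty, the same pattern is used with effectful functions: (pure . Left) wraps the error in a Left and lifts it into IO, while go 0 continues with the file handle.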

Function go calls for some additional explanations. First, note how we used the same syntax for pattern matching intermediary results as we also saw for let bindings. As you can see, we can use several vertical bars to handle more than one additional pattern. In order to read a single line from a file, we use function fGetLine. As with most operations working with the file system, this function might fail with a FileError, which we have to handle correctly. Note also, that fGetLine will return the line including its trailing newline character '\n', so in order to check for empty lines, we have to match against "\n" instead of the empty string "".

Finally, go is not provably total and rightfully so. Files like /dev/urandom or /dev/zero provide infinite streams of data, so countEmpty will never terminate when invoked with such a file path.

Safe Resource Handling

Note, how we had to manually open and close the file handle in countEmpty. This is error-prone and tedious. Resource handling is a big topic, and we definitely won't be going into the details here, but there is a convenient function exported from System.File: withFile, which handles the opening, closing and handling of file errors for us.

covering
countEmpty' : (path : String) -> IO (Either FileError Nat)
countEmpty' path = withFile path Read pure (go 0)
  where covering go : Nat -> File -> IO (Either FileError Nat)
        go k file = do
          False <- fEOF file | True => pure (Right k)
          Right "\n" <- fGetLine file
            | Right _  => go k file
            | Left err => pure (Left err)
          go (k + 1) file

Go ahead and have a look at the type of withFile, then have a look at how we use it to simplify the implementation of countEmpty'. Reading and understanding slightly more complex function types is important when learning to program in Idris.
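Here is another small sketch of withFile in action. The function firstLine is a hypothetical helper and not part of System.File. The type given in the comment is what I'd expect :t withFile to show, but please verify this at the REPL for your version of base:

```idris
import System.File

-- Expected shape of `withFile` (check with `:t withFile`):
--
--   withFile :  HasIO io
--            => (filename : String)
--            -> Mode
--            -> (onError : FileError -> io a)
--            -> (onOpen  : File -> io (Either a b))
--            -> io (Either a b)

-- Reads the first line of a file, including its trailing newline.
-- `pure` passes a `FileError` from opening the file on unchanged,
-- and `fGetLine` is run on the handle if opening succeeded.
firstLine : (path : String) -> IO (Either FileError String)
firstLine path = withFile path Read pure fGetLine
```

Note that there is no call to closeFile here: Closing the handle is taken care of by withFile.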

Interface HasIO

When you look at the IO functions we used so far, you'll notice that most if not all of them actually don't work with IO itself but with a type parameter io with a constraint of HasIO. This interface allows us to lift a value of type IO a into another context. We will see use cases for this in later chapters, especially when we talk about monad transformers. For now, you can treat these io parameters as being specialized to IO.
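The following sketch shows what this looks like in our own code. The names greet, greetLifted, and demo are made up for this example, while putStrLn and liftIO come from the Prelude:

```idris
-- Works in any context `io` into which `IO` actions can be lifted.
greet : HasIO io => String -> io ()
greet name = putStrLn ("Hello " ++ name ++ "!")

-- `liftIO` is the function provided by `HasIO`: It lifts a plain
-- `IO` action into the more general context.
greetLifted : HasIO io => io ()
greetLifted = liftIO (greet {io = IO} "lifted")

-- Here, `io` is specialized to `IO` in both calls.
demo : IO ()
demo = greet "Idris" >> greetLifted
```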

Exercises part 3

  1. As we have seen in the examples above, IO actions working with file handles often come with the risk of failure. We can therefore simplify things by writing some utility functions and a custom bind operator to work with these nested effects. In a new namespace IOErr, implement the following utility functions and use these to further clean up the implementation of countEmpty':

    pure : a -> IO (Either e a)
    
    fail : e -> IO (Either e a)
    
    lift : IO a -> IO (Either e a)
    
    catch : IO (Either e1 a) -> (e1 -> IO (Either e2 a)) -> IO (Either e2 a)
    
    (>>=) : IO (Either e a) -> (a -> IO (Either e b)) -> IO (Either e b)
    
    (>>) : IO (Either e ()) -> Lazy (IO (Either e a)) -> IO (Either e a)
    
  2. Write a function countWords for counting the words in a file. Consider using Data.String.words and the utilities from exercise 1 in your implementation.

  3. We can generalize the functionality used in countEmpty and countWords, by implementing a helper function for iterating over the lines in a file and accumulating some state along the way. Implement withLines and use it to reimplement countEmpty and countWords:

    covering
    withLines :  (path : String)
              -> (accum : s -> String -> s)
              -> (initialState : s)
              -> IO (Either FileError s)
    
  4. We often use a Monoid for accumulating values. It is therefore convenient to specialize withLines for this case. Use withLines to implement foldLines according to the type given below:

    covering
    foldLines :  Monoid s
              => (path : String)
              -> (f    : String -> s)
              -> IO (Either FileError s)
    
  5. Implement function wordCount for counting the number of lines, words, and characters in a text document. Define a custom record type together with an implementation of Monoid for storing and accumulating these values and use foldLines in your implementation of wordCount.

How IO is Implemented

In this final section of an already lengthy chapter, we will risk a glance at how IO is implemented in Idris. It is interesting to note, that IO is not a built-in type but a regular data type with only one minor speciality. Let's learn about it at the REPL:

Tutorial.IO> :doc IO
data PrimIO.IO : Type -> Type
  Totality: total
  Constructor: MkIO : (1 _ : PrimIO a) -> IO a
  Hints:
    Applicative IO
    Functor IO
    HasLinearIO IO
    Monad IO

Here, we learn that IO has a single data constructor called MkIO, which takes a single argument of type PrimIO a with quantity 1. We are not going to talk about the quantities here, as in fact they are not important to understand how IO works.

Now, PrimIO a is a type alias for the following function:

Tutorial.IO> :printdef PrimIO
PrimIO.PrimIO : Type -> Type
PrimIO a = (1 _ : %World) -> IORes a

Again, don't mind the quantities. There is only one piece of the puzzle missing: IORes a, which is a publicly exported record type:

Solutions.IO> :doc IORes
data PrimIO.IORes : Type -> Type
  Totality: total
  Constructor: MkIORes : a -> (1 _ : %World) -> IORes a

So, to put this all together, IO is a wrapper around something similar to the following function type:

%World -> (a, %World)

You can think of type %World as a placeholder for the state of the outside world of a program (file system, memory, network connections, and so on). Conceptually, to execute an IO a action, we pass it the current state of the world, and in return get an updated world state plus a result of type a. The world state being updated represents all the side effects describable in a computer program.

Now, it is important to understand that there is no such thing as the state of the world. The %World type is just a placeholder, which is converted to some kind of constant that's passed around and never inspected at runtime. So, if we had a value of type %World, we could pass it to an IO a action and execute it, and this is exactly what happens at runtime: A single value of type %World (an uninteresting placeholder like null, 0, or - in case of the JavaScript backends - undefined) is passed to the main function, thus setting the whole program in motion. However, it is impossible to programmatically create a value of type %World (it is an abstract, primitive type), and therefore we cannot ever extract a value of type a from an IO a action (modulo unsafePerformIO).

Once we talk about monad transformers and the state monad, you will see that IO is nothing but a state monad in disguise, with an abstract state type that makes it impossible for us to run the stateful computation ourselves.
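To make the idea of passing around a world state more tangible, here is a toy model. This is only a sketch and not how IO is defined: All names are made up, and the "world" is a simple counter of type Nat that, unlike %World, we are free to create and inspect:

```idris
-- A stateful computation: Takes the current state and returns
-- a result together with the updated state.
record Counter a where
  constructor MkCounter
  runCounter : Nat -> (a, Nat)

-- Returns the current counter value and increases the state by one.
next : Counter Nat
next = MkCounter $ \w => (w, S w)

-- Sequencing: The state returned by the first computation is
-- passed on to the second one.
andThen : Counter a -> (a -> Counter b) -> Counter b
andThen ca f = MkCounter $ \w =>
  let (va, w2) = runCounter ca w
   in runCounter (f va) w2

-- Runs `next` twice and returns both results without touching
-- the state any further.
twoTicks : Counter (Nat, Nat)
twoTicks =
  andThen next (\x =>
    andThen next (\y =>
      MkCounter (\w => ((x, y), w))))
```

Because we can come up with an initial state ourselves, we can run this computation: runCounter twoTicks 5 evaluates to ((5, 6), 7). With IO, this last step is not available to us, as we cannot create a value of type %World.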

Conclusion

  • Values of type IO a describe programs with side effects, which will eventually result in a value of type a.

  • While we cannot safely extract a value of type a from an IO a, we can use several combinators and syntactic constructs to combine IO actions and build more-complex programs.

  • Do blocks offer a convenient way to run and combine IO actions sequentially.

  • Do blocks are desugared to nested applications of bind operators ((>>=)).

  • Bind operators, and thus do blocks, can be overloaded to achieve custom behavior instead of the default (monadic) bind.

  • Under the hood, IO actions are stateful computations operating on a symbolic %World state.

What's next

Now that we have had a glimpse at monads and the bind operator, it is time to introduce Monad and some related interfaces for real in the next chapter.

Functor and Friends

Programming, like mathematics, is about abstraction. We try to model parts of the real world, reusing recurring patterns by abstracting over them.

In this chapter, we will learn about several related interfaces, which are all about abstraction and therefore can be hard to understand at the beginning. Especially figuring out why they are useful and when to use them will take time and experience. This chapter therefore comes with tons of exercises, most of which can be solved with only a few short lines of code. Don't skip them. Come back to them several times until these things start feeling natural to you. You will then realize that their initial complexity has vanished.

Functor

module Tutorial.Functor.Functor

import Data.List1
import Data.String
import Data.Vect

%default total

What do type constructors like List, List1, Maybe, or IO have in common? First, all of them are of type Type -> Type. Second, they all put values of a given type in a certain context. With List, the context is non-determinism: We know there to be zero or more values, but we don't know the exact number until we start taking the list apart by pattern matching on it. Likewise for List1, though we know for sure that there is at least one value. For Maybe, we are still not sure about how many values there are, but the possibilities are much smaller: Zero or one. With IO, the context is a different one: Arbitrary side effects.

Although the type constructors discussed above are quite different in how they behave and when they are useful, there are certain operations that keep coming up when working with them. The first such operation is mapping a pure function over the data type, without affecting its underlying structure.

For instance, given a list of numbers, we'd like to multiply each number by two, without changing their order or removing any values:

multBy2List : Num a => List a -> List a
multBy2List []        = []
multBy2List (x :: xs) = 2 * x :: multBy2List xs

But we might just as well convert every string in a list of strings to upper case characters:

toUpperList : List String -> List String
toUpperList []        = []
toUpperList (x :: xs) = toUpper x :: toUpperList xs

Sometimes, the type of the stored value changes. In the next example, we calculate the lengths of the strings stored in a list:

toLengthList : List String -> List Nat
toLengthList []        = []
toLengthList (x :: xs) = length x :: toLengthList xs

I'd like you to appreciate, just how boring these functions are. They are almost identical, with the only interesting part being the function we apply to each element. Surely, there must be a pattern to abstract over:

mapList : (a -> b) -> List a -> List b
mapList f []        = []
mapList f (x :: xs) = f x :: mapList f xs

This is often the first step of abstraction in functional programming: Write a (possibly generic) higher-order function. We can now concisely implement all examples shown above in terms of mapList:

multBy2List' : Num a => List a -> List a
multBy2List' = mapList (2 *)

toUpperList' : List String -> List String
toUpperList' = mapList toUpper

toLengthList' : List String -> List Nat
toLengthList' = mapList length

But surely we'd like to do the same kind of thing with List1 and Maybe! After all, they are just container types like List, the only difference being some detail about the number of values they can or can't hold:

mapMaybe : (a -> b) -> Maybe a -> Maybe b
mapMaybe f Nothing  = Nothing
mapMaybe f (Just v) = Just (f v)

Even with IO, we'd like to be able to map pure functions over effectful computations. The implementation is a bit more involved, due to the nested layers of data constructors, but if in doubt, the types will surely guide us. Note, however, that IO is not publicly exported, so its data constructor is unavailable to us. We can use functions toPrim and fromPrim, however, for converting IO from and to PrimIO, which we can freely dissect:

mapIO : (a -> b) -> IO a -> IO b
mapIO f io = fromPrim $ mapPrimIO (toPrim io)
  where mapPrimIO : PrimIO a -> PrimIO b
        mapPrimIO prim w =
          let MkIORes va w2 = prim w
           in MkIORes (f va) w2

From the concept of mapping a pure function over values in a context follow some derived functions, which are often useful. Here are some of them for IO:

mapConstIO : b -> IO a -> IO b
mapConstIO = mapIO . const

forgetIO : IO a -> IO ()
forgetIO = mapConstIO ()

Of course, we'd want to implement mapConst and forget as well for List, List1, and Maybe (and dozens of other type constructors with some kind of mapping function), and they'd all look the same and be equally boring.

When we come upon a recurring class of functions with several useful derived functions, we should consider defining an interface. But how should we go about this here? When you look at the types of mapList, mapMaybe, and mapIO, you'll see that it's the List, Maybe, and IO types we need to get rid of. These are not of type Type but of type Type -> Type. Luckily, there is nothing preventing us from parametrizing an interface over something other than a Type.

The interface we are looking for is called Functor. Here is its definition and an example implementation (I appended a tick at the end of the names for them not to overlap with the interface and functions exported by the Prelude):

public export
interface Functor' (0 f : Type -> Type) where
  map' : (a -> b) -> f a -> f b

export
implementation Functor' Maybe where
  map' _ Nothing  = Nothing
  map' f (Just v) = Just $ f v

Note, that we had to give the type of parameter f explicitly, and in that case it needs to be annotated with quantity zero if you want it to be erased at runtime (which you almost always want).

Now, reading type signatures consisting only of type parameters like the one of map' can take some time to get used to, especially when some type parameters are applied to other parameters as in f a. It can be very helpful to inspect these signatures together with all implicit arguments at the REPL (I formatted the output to make it more readable):

Tutorial.Functor> :ti map'
Tutorial.Functor.map' :  {0 b : Type}
                      -> {0 a : Type}
                      -> {0 f : Type -> Type}
                      -> Functor' f
                      => (a -> b)
                      -> f a
                      -> f b

It can also be helpful to replace type parameter f with a concrete value of the same type:

Tutorial.Functor> :t map' {f = Maybe}
map' : (?a -> ?b) -> Maybe ?a -> Maybe ?b

Remember, being able to interpret type signatures is paramount to understanding what's going on in an Idris declaration. You must practice this and make use of the tools and utilities given to you.

Derived Functions

There are several functions and operators directly derivable from interface Functor. Eventually, you should know and remember all of them as they are highly useful. Here they are together with their types:

Tutorial.Functor> :t (<$>)
Prelude.<$> : Functor f => (a -> b) -> f a -> f b

Tutorial.Functor> :t (<&>)
Prelude.<&> : Functor f => f a -> (a -> b) -> f b

Tutorial.Functor> :t ($>)
Prelude.$> : Functor f => f a -> b -> f b

Tutorial.Functor> :t (<$)
Prelude.<$ : Functor f => b -> f a -> f b

Tutorial.Functor> :t ignore
Prelude.ignore : Functor f => f a -> f ()

(<$>) is an operator alias for map and allows you to sometimes drop some parentheses. For instance:

tailShowReversNoOp : Show a => List1 a -> List String
tailShowReversNoOp xs = map (reverse . show) (tail xs)

tailShowReverse : Show a => List1 a -> List String
tailShowReverse xs = reverse . show <$> tail xs

(<&>) is an alias for (<$>) with the arguments flipped. The other three (ignore, ($>), and (<$)) are all used to replace the values in a context with a constant. They are often useful when you don't care about the values themselves but want to keep the underlying structure.
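Here is a short sketch of these operators in action. The function names are made up for this example:

```idris
-- Like `map length xs`, but with the container coming first.
lengths : List String -> List Nat
lengths xs = xs <&> length

-- Replaces the wrapped value (if any) with zero.
zeroIfPresent : Maybe String -> Maybe Nat
zeroIfPresent m = m $> 0

-- Same as `($>)`, but with the arguments flipped.
ones : List a -> List Nat
ones xs = 1 <$ xs

-- Keeps the number of values but forgets what they were.
blank : List a -> List ()
blank xs = ignore xs
```

In all four cases, the structure stays the same: The lists keep their length, and a Nothing stays a Nothing.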

Functors with more than one Type Parameter

The type constructors we looked at so far were all of type Type -> Type. However, we can also implement Functor for other type constructors. The only prerequisite is that the type parameter we'd like to change with function map must be the last in the argument list. For instance, here is the Functor implementation for Either e (note, that Either e has of course type Type -> Type as required):

implementation Functor' (Either e) where
  map' _ (Left ve)  = Left ve
  map' f (Right va) = Right $ f va

Here is another example, this time for a type constructor of type Bool -> Type -> Type (you might remember this from the exercises in the last chapter):

data List01 : (nonEmpty : Bool) -> Type -> Type where
  Nil  : List01 False a
  (::) : a -> List01 False a -> List01 ne a

implementation Functor (List01 ne) where
  map _ []        = []
  map f (x :: xs) = f x :: map f xs

Functor Composition

The nice thing about functors is how they can be paired and nested with other functors and the results are functors again:

record Product (f,g : Type -> Type) (a : Type) where
  constructor MkProduct
  fst : f a
  snd : g a

implementation Functor f => Functor g => Functor (Product f g) where
  map f (MkProduct l r) = MkProduct (map f l) (map f r)

The above allows us to conveniently map over a pair of functors. Note, however, that Idris needs some help with inferring the types involved:

toPair : Product f g a -> (f a, g a)
toPair (MkProduct fst snd) = (fst, snd)

fromPair : (f a, g a) -> Product f g a
fromPair (x,y) = MkProduct x y

productExample :  Show a
               => (Either e a, List a)
               -> (Either e String, List String)
productExample = toPair . map show . fromPair {f = Either e, g = List}

More often, we'd like to map over several layers of nested functors at once. Here's how to do this with an example:

record Comp (f,g : Type -> Type) (a : Type) where
  constructor MkComp
  unComp  : f (g a)

implementation Functor f => Functor g => Functor (Comp f g) where
  map f (MkComp v) = MkComp $ map f <$> v

compExample :  Show a => List (Either e a) -> List (Either e String)
compExample = unComp . map show . MkComp {f = List, g = Either e}

Named Implementations

Sometimes, there are more ways to implement an interface for a given type. For instance, for numeric types we can have a Monoid representing addition and one representing multiplication. Likewise, for nested functors, map can be interpreted as a mapping over only the first layer of values, or a mapping over several layers of values.

One way to go about this is to define single-field wrappers as shown with data type Comp above. However, Idris also allows us to define additional interface implementations, which must then be given a name. For instance:

[Compose'] Functor f => Functor g => Functor (f . g) where
  map f = (map . map) f

Note, that this defines a new implementation of Functor, which will not be considered during implicit resolution in order to avoid ambiguities. However, it is possible to explicitly choose to use this implementation by passing it as an explicit argument to map, prefixed with an @:

compExample2 :  Show a => List (Either e a) -> List (Either e String)
compExample2 = map @{Compose} show

In the example above, we used Compose instead of Compose', since the former is already exported by the Prelude.
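The Monoid example mentioned at the beginning of this section can be sketched in the same way. The implementation names MultNatSemi and MultNatMonoid as well as function productOf are made up for this illustration. Note how the using clause tells Idris which named Semigroup implementation the Monoid is built upon:

```idris
-- A named semigroup for natural numbers under multiplication.
[MultNatSemi] Semigroup Nat where
  (<+>) x y = x * y

-- The corresponding monoid, based on the named semigroup above.
[MultNatMonoid] Monoid Nat using MultNatSemi where
  neutral = 1

-- `concat` accumulates all values in a list using a `Monoid`.
-- Here, we explicitly select the multiplicative one.
productOf : List Nat -> Nat
productOf xs = concat @{MultNatMonoid} xs
```

With this, productOf [2,3,4] evaluates to 24.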

Functor Laws

Implementations of Functor are supposed to adhere to certain laws, just like implementations of Eq or Ord. Again, these laws are not verified by Idris, although it would be possible (and often cumbersome) to do so.

  1. map id = id: Mapping the identity function over a functor must not have any visible effect such as changing a container's structure or affecting the side effects performed when running an IO action.

  2. map (f . g) = map f . map g: Sequencing two mappings must be identical to a single mapping using the composition of the two functions.

Both of these laws require that map preserves the structure of values. This is easier to understand with container types like List, Maybe, or Either e, where map is not allowed to add or remove any wrapped value, nor - in case of List - change their order. With IO, this can best be described as map not performing additional side effects.
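To see what a violation of these laws looks like, consider the following sketch. Function reverseMap is a made-up counterexample. Its type is the same as the one of map specialized to List, so Idris will happily accept it, but it changes the order of values:

```idris
-- Has the right type for a mapping function over lists, but
-- reverses the order of the values: It breaks `map id = id`.
reverseMap : (a -> b) -> List a -> List b
reverseMap f xs = reverse (map f xs)
```

For instance, reverseMap id [1,2,3] evaluates to [3,2,1] instead of [1,2,3], so the first law does not hold.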

Exercises part 1

  1. Write your own implementations of Functor' for Maybe, List, List1, Vect n, Either e, and Pair a.

  2. Write a named implementation of Functor for pairs of functors (similar to the one implemented for Product).

  3. Implement Functor for data type Identity (which is available from Control.Monad.Identity in base):

    record Identity a where
      constructor Id
      value : a
    
  4. Here is a curious one: Implement Functor for Const e (which is also available from Control.Applicative.Const in base). You might be confused about the fact that the second type parameter has absolutely no relevance at runtime, as there is no value of that type. Such types are sometimes called phantom types. They can be quite useful for tagging values with additional typing information.

    Don't let the above confuse you: There is only one possible implementation. As usual, use holes and let the compiler guide you if you get lost.

    record Const (e,a : Type) where
      constructor MkConst
      value : e
    
  5. Here is a sum type for describing CRUD operations (Create, Read, Update, and Delete) in a data store:

    data Crud : (i : Type) -> (a : Type) -> Type where
      Create : (value : a) -> Crud i a
      Update : (id : i) -> (value : a) -> Crud i a
      Read   : (id : i) -> Crud i a
      Delete : (id : i) -> Crud i a
    

    Implement Functor for Crud i.

  6. Here is a sum type for describing responses from a data server:

    data Response : (e, i, a : Type) -> Type where
      Created : (id : i) -> (value : a) -> Response e i a
      Updated : (id : i) -> (value : a) -> Response e i a
      Found   : (values : List a) -> Response e i a
      Deleted : (id : i) -> Response e i a
      Error   : (err : e) -> Response e i a
    

    Implement Functor for Response e i.

  7. Implement Functor for Validated e:

    data Validated : (e,a : Type) -> Type where
      Invalid : (err : e) -> Validated e a
      Valid   : (val : a) -> Validated e a
    

Applicative

module Tutorial.Functor.Applicative

import Tutorial.Functor.Functor

import Data.List1
import Data.String
import Data.Vect

%default total

While Functor allows us to map a pure, unary function over a value in a context, it doesn't allow us to combine n such values under an n-ary function.

For instance, consider the following functions:

liftMaybe2 : (a -> b -> c) -> Maybe a -> Maybe b -> Maybe c
liftMaybe2 f (Just va) (Just vb) = Just $ f va vb
liftMaybe2 _ _         _         = Nothing

liftVect2 : (a -> b -> c) -> Vect n a -> Vect n b -> Vect n c
liftVect2 _ []        []        = []
liftVect2 f (x :: xs) (y :: ys) = f x y :: liftVect2 f xs ys

liftIO2 : (a -> b -> c) -> IO a -> IO b -> IO c
liftIO2 f ioa iob = fromPrim $ go (toPrim ioa) (toPrim iob)
  where go : PrimIO a -> PrimIO b -> PrimIO c
        go pa pb w =
          let MkIORes va w2 = pa w
              MkIORes vb w3 = pb w2
           in MkIORes (f va vb) w3

This behavior is not covered by Functor, yet it is a very common thing to do. For instance, we might want to read two numbers from standard input (both operations might fail), calculating the product of the two. Here's the code:

multNumbers : Num a => Neg a => IO (Maybe a)
multNumbers = do
  s1 <- getLine
  s2 <- getLine
  pure $ liftMaybe2 (*) (parseInteger s1) (parseInteger s2)

And it won't stop here. We might just as well want to have liftMaybe3 for ternary functions and three Maybe arguments and so on, for arbitrary numbers of arguments.

But there is more: We'd also like to lift pure values into the context in question. With this, we could do the following:

liftMaybe3 : (a -> b -> c -> d) -> Maybe a -> Maybe b -> Maybe c -> Maybe d
liftMaybe3 f (Just va) (Just vb) (Just vc) = Just $ f va vb vc
liftMaybe3 _ _         _         _         = Nothing

pureMaybe : a -> Maybe a
pureMaybe = Just

multAdd100 : Num a => Neg a => String -> String -> Maybe a
multAdd100 s t = liftMaybe3 calc (parseInteger s) (parseInteger t) (pure 100)
  where calc : a -> a -> a -> a
        calc x y z = x * y + z

As you'll of course already know, I am now going to present a new interface to encapsulate this behavior. It's called Applicative. Here is its definition and an example implementation:

public export
interface Functor' f => Applicative' f where
  app   : f (a -> b) -> f a -> f b
  pure' : a -> f a

export
implementation Applicative' Maybe where
  app (Just fun) (Just val) = Just $ fun val
  app _          _          = Nothing

  pure' = Just

Interface Applicative is of course already exported by the Prelude. There, function app is an operator sometimes called app or apply: (<*>).

You may wonder how functions like liftMaybe2 or liftIO2 are related to operator apply. Let me demonstrate this:

liftA2 : Applicative f => (a -> b -> c) -> f a -> f b -> f c
liftA2 fun fa fb = pure fun <*> fa <*> fb

liftA3 : Applicative f => (a -> b -> c -> d) -> f a -> f b -> f c -> f d
liftA3 fun fa fb fc = pure fun <*> fa <*> fb <*> fc

It is really important for you to understand what's going on here, so let's break these down. If we specialize liftA2 to use Maybe for f, pure fun is of type Maybe (a -> b -> c). Likewise, pure fun <*> fa is of type Maybe (b -> c), as (<*>) will apply the function stored in pure fun to the value stored in fa (currying!).
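It might help to write down the intermediary steps with their types. Here is a sketch using Maybe and addition on natural numbers (the names step0 to step2 are made up):

```idris
-- A binary function lifted into `Maybe`.
step0 : Maybe (Nat -> Nat -> Nat)
step0 = pure (+)

-- After applying it to the first argument, a unary function remains.
step1 : Maybe (Nat -> Nat)
step1 = step0 <*> Just 2

-- Applying it to the second argument yields the final result.
step2 : Maybe Nat
step2 = step1 <*> Just 3
```

Here, step2 evaluates to Just 5. If any of the arguments were Nothing, the final result would be Nothing as well.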

You'll often see such chains of applications of apply, the number of applies corresponding to the arity of the function we lift. You'll sometimes also see the following, which allows us to drop the initial call to pure, and use the operator version of map instead:

liftA2' : Applicative f => (a -> b -> c) -> f a -> f b -> f c
liftA2' fun fa fb = fun <$> fa <*> fb

liftA3' : Applicative f => (a -> b -> c -> d) -> f a -> f b -> f c -> f d
liftA3' fun fa fb fc = fun <$> fa <*> fb <*> fc

So, interface Applicative allows us to lift values (and functions!) into computational contexts and apply them to values in the same contexts. Before we look at an extended example of why this is useful, I'll quickly introduce some syntactic sugar for working with applicative functors.

Idiom Brackets

The programming style used for implementing liftA2' and liftA3' is also referred to as applicative style and is used a lot in Haskell for combining several effectful computations with a single pure function.

In Idris, there is an alternative to using such chains of operator applications: Idiom brackets. Here's another reimplementation of liftA2 and liftA3:

liftA2'' : Applicative f => (a -> b -> c) -> f a -> f b -> f c
liftA2'' fun fa fb = [| fun fa fb |]

liftA3'' : Applicative f => (a -> b -> c -> d) -> f a -> f b -> f c -> f d
liftA3'' fun fa fb fc = [| fun fa fb fc |]

The above implementations will be desugared to the ones given for liftA2 and liftA3, again before disambiguating, type checking, and filling in of implicit values. Like with the bind operator, we can therefore write custom implementations for pure and (<*>), and Idris will use these if it can disambiguate between the overloaded function names.
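As a concrete illustration, here is a small sketch using idiom brackets with Maybe. The record Point and the functions mkPoint and addMaybe are hypothetical names invented for this example:

```idris
%default total

record Point where
  constructor MkPoint
  px : Nat
  py : Nat

-- Desugars to `pure MkPoint <*> mx <*> my`.
mkPoint : Maybe Nat -> Maybe Nat -> Maybe Point
mkPoint mx my = [| MkPoint mx my |]

-- Infix operators can be used as well:
-- this desugars to `pure (+) <*> mx <*> my`.
addMaybe : Maybe Nat -> Maybe Nat -> Maybe Nat
addMaybe mx my = [| mx + my |]
```

Both functions return Nothing as soon as one of their arguments is Nothing, and a Just holding the combined result otherwise.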

Use Case: CSV Reader

In order to understand the power and versatility that comes with applicative functors, we will look at a slightly extended example. We are going to write some utilities for parsing and decoding content from CSV files. These are files where each line holds a list of values separated by commas (or some other delimiter). Typically, they are used to store tabular data, for instance from spreadsheet applications. What we would like to do is convert lines in a CSV file and store the result in custom records, where each record field corresponds to a column in the table.

For instance, here is a simple example file, containing tabular user information from a web store: First name, last name, age (optional), email address, gender, and password.

Jon,Doe,42,jon@doe.ch,m,weijr332sdk
Jane,Doe,,jane@doe.ch,f,aa433sd112
Stefan,Hoeck,,nope@goaway.ch,m,password123

And here are the Idris data types necessary to hold this information at runtime. We again use custom string wrappers for increased type safety and because it will allow us to define for each data type what we consider to be valid input:

data Gender = Male | Female | Other

public export
record Name where
  constructor MkName
  value : String

record Email where
  constructor MkEmail
  value : String

record Password where
  constructor MkPassword
  value : String

record User where
  constructor MkUser
  firstName : Name
  lastName  : Name
  age       : Maybe Nat
  email     : Email
  gender    : Gender
  password  : Password

We start by defining an interface for reading fields in a CSV file and writing implementations for the data types we'd like to read:

public export
interface CSVField a where
  read : String -> Maybe a

Below are implementations for Gender and Bool. In these cases, I decided to encode each value with a single lowercase character:

export
CSVField Gender where
  read "m" = Just Male
  read "f" = Just Female
  read "o" = Just Other
  read _   = Nothing

export
CSVField Bool where
  read "t" = Just True
  read "f" = Just False
  read _   = Nothing

For numeric types, we can use the parsing functions from Data.String:

export
CSVField Nat where
  read = parsePositive

export
CSVField Integer where
  read = parseInteger

export
CSVField Double where
  read = parseDouble

For optional values, the stored type must itself come with an instance of CSVField. We can then treat the empty string "" as Nothing, while a non-empty string will be passed to the encapsulated type's field reader. (Remember that (<$>) is an alias for map.)

export
CSVField a => CSVField (Maybe a) where
  read "" = Just Nothing
  read s  = Just <$> read s

Finally, for our string wrappers, we need to decide what we consider to be valid values. For simplicity, I decided to limit the length of allowed strings and the set of valid characters.

readIf : (String -> Bool) -> (String -> a) -> String -> Maybe a
readIf p mk s = if p s then Just (mk s) else Nothing

isValidName : String -> Bool
isValidName s =
  let len = length s
   in 0 < len && len <= 100 && all isAlpha (unpack s)

export
CSVField Name where
  read = readIf isValidName MkName

isEmailChar : Char -> Bool
isEmailChar '.' = True
isEmailChar '@' = True
isEmailChar c   = isAlphaNum c

isValidEmail : String -> Bool
isValidEmail s =
  let len = length s
   in 0 < len && len <= 100 && all isEmailChar (unpack s)

CSVField Email where
  read = readIf isValidEmail MkEmail

isPasswordChar : Char -> Bool
isPasswordChar ' ' = True
-- please note that isSpace also holds for characters other than ' ',
-- e.g. for the non-breaking space: isSpace '\160' = True,
-- but only ' ' shall be allowed in passwords
isPasswordChar c   = not (isControl c) && not (isSpace c)

isValidPassword : String -> Bool
isValidPassword s =
  let len = length s
   in 8 < len && len <= 100 && all isPasswordChar (unpack s)

CSVField Password where
  read = readIf isValidPassword MkPassword

In a later chapter, we will learn about refinement types and how to store an erased proof of validity together with a validated value.

We can now start to decode whole lines in a CSV file. In order to do so, we first introduce a custom error type encapsulating how things can go wrong:

public export
data CSVError : Type where
  FieldError           : (line, column : Nat) -> (str : String) -> CSVError
  UnexpectedEndOfInput : (line, column : Nat) -> CSVError
  ExpectedEndOfInput   : (line, column : Nat) -> CSVError

We can now use CSVField to read a single field at a given line and position in a CSV file, and return a FieldError in case of a failure.

export
readField : CSVField a => (line, column : Nat) -> String -> Either CSVError a
readField line col str =
  maybe (Left $ FieldError line col str) Right (read str)

If we know in advance the number of fields we need to read, we can try and convert a list of strings to a Vect of the given length. This facilitates reading record values of a known number of fields, as we get the correct number of string variables when pattern matching on the vector:

toVect : (n : Nat) -> (line, col : Nat) -> List a -> Either CSVError (Vect n a)
toVect 0     line _   []        = Right []
toVect 0     line col _         = Left (ExpectedEndOfInput line col)
toVect (S k) line col []        = Left (UnexpectedEndOfInput line col)
toVect (S k) line col (x :: xs) = (x ::) <$> toVect k line (S col) xs
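To see the pattern matching on the resulting vector in isolation, here is a simplified sketch that uses Maybe instead of Either CSVError and drops the line and column bookkeeping. The names toVectMaybe and sum3 are hypothetical and only serve this illustration:

```idris
import Data.Vect

%default total

-- Same idea as `toVect` above, but without error information.
toVectMaybe : (n : Nat) -> List a -> Maybe (Vect n a)
toVectMaybe 0     []        = Just []
toVectMaybe 0     _         = Nothing
toVectMaybe (S k) []        = Nothing
toVectMaybe (S k) (x :: xs) = (x ::) <$> toVectMaybe k xs

-- Since the vector is known to have exactly three elements,
-- the pattern `[a, b, c]` covers all cases.
sum3 : List Nat -> Maybe Nat
sum3 ns = do
  [a, b, c] <- toVectMaybe 3 ns
  pure (a + b + c)
```

Here, sum3 succeeds only for lists of exactly three elements; shorter and longer lists both result in Nothing.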

Finally, we can implement function readUser to try and convert a single line in a CSV-file to a value of type User:

readUser' : (line : Nat) -> List String -> Either CSVError User
readUser' line ss = do
  [fn,ln,a,em,g,pw] <- toVect 6 line 0 ss
  [| MkUser (readField line 1 fn)
            (readField line 2 ln)
            (readField line 3 a)
            (readField line 4 em)
            (readField line 5 g)
            (readField line 6 pw) |]

readUser : (line : Nat) -> String -> Either CSVError User
readUser line = readUser' line . forget . split (',' ==)

Let's give this a go at the REPL:

Tutorial.Functor> readUser 1 "Joe,Foo,46,j@f.ch,m,pw1234567"
Right (MkUser (MkName "Joe") (MkName "Foo")
  (Just 46) (MkEmail "j@f.ch") Male (MkPassword "pw1234567"))
Tutorial.Functor> readUser 7 "Joe,Foo,46,j@f.ch,m,shortPW"
Left (FieldError 7 6 "shortPW")

Note how, in the implementation of readUser', we used an idiom bracket to map a function of six arguments (MkUser) over six values of type Either CSVError. This will succeed if and only if all of the parsings have succeeded. Implementing readUser' with a succession of six nested pattern matches would have been cumbersome and would have resulted in much less readable code.

However, the idiom bracket above still looks quite repetitive. Surely, we can do better?

A Case for Heterogeneous Lists

It is time to learn about a family of types, which can be used as a generic representation for record types, and which will allow us to represent and read rows in heterogeneous tables with a minimal amount of code: Heterogeneous lists.

namespace HList
  public export
  data HList : (ts : List Type) -> Type where
    Nil  : HList Nil
    (::) : (v : t) -> (vs : HList ts) -> HList (t :: ts)

A heterogeneous list is a list type indexed over a list of types. This allows us to store, at each position, a value of the type found at the same position in the list index. For instance, here is an example, which stores three values of types Bool, Nat, and Maybe String (in that order):

hlist1 : HList [Bool, Nat, Maybe String]
hlist1 = [True, 12, Nothing]

You could argue that heterogeneous lists are just tuples storing values of the given types. That's right, of course. However, as you'll learn the hard way in the exercises, we can use the list index to perform compile-time computations on HList, for instance when concatenating two such lists to keep track of the types stored in the result at the same time.

But first, we'll make use of HList as a means to concisely parse CSV-lines. In order to do that, we need to introduce a new interface for types corresponding to whole lines in a CSV-file:

public export
interface CSVLine a where
  decodeAt : (line, col : Nat) -> List String -> Either CSVError a

We'll now write two implementations of CSVLine for HList: One for the Nil case, which will succeed if and only if the current list of strings is empty. The other for the cons case, which will try and read a single field from the head of the list and the remainder from its tail. We use again an idiom bracket to concatenate the results:

export
CSVLine (HList []) where
  decodeAt _ _ [] = Right Nil
  decodeAt l c _  = Left (ExpectedEndOfInput l c)

export
CSVField t => CSVLine (HList ts) => CSVLine (HList (t :: ts)) where
  decodeAt l c []        = Left (UnexpectedEndOfInput l c)
  decodeAt l c (s :: ss) = [| readField l c s :: decodeAt l (S c) ss |]

And that's it! All we need to add is two utility functions for decoding whole lines before they have been split into tokens, one of which is specialized to HList and takes an erased list of types as argument to make it more convenient to use at the REPL:

decode : CSVLine a => (line : Nat) -> String -> Either CSVError a
decode line = decodeAt line 1 . forget . split (',' ==)

hdecode :  (0 ts : List Type)
        -> CSVLine (HList ts)
        => (line : Nat)
        -> String
        -> Either CSVError (HList ts)
hdecode _ = decode

It's time to reap the fruits of our labour and give this a go at the REPL:

Tutorial.Functor> hdecode [Bool,Nat,Double] 1 "f,100,12.123"
Right [False, 100, 12.123]
Tutorial.Functor> hdecode [Name,Name,Gender] 3 "Idris,,f"
Left (FieldError 3 2 "")

Applicative Laws

Again, Applicative implementations must follow certain laws. Here they are:

  • pure id <*> fa = fa: Lifting and applying the identity function has no visible effect.

  • [| f . g |] <*> v = f <*> (g <*> v): It must not matter, whether we compose our functions first and then apply them, or whether we apply our functions first and then compose them.

    The above might be hard to understand, so here they are again with explicit types and implementations:

    compL : Maybe (b -> c) -> Maybe (a -> b) -> Maybe a -> Maybe c
    compL f g v = [| f . g |] <*> v
    
    compR : Maybe (b -> c) -> Maybe (a -> b) -> Maybe a -> Maybe c
    compR f g v = f <*> (g <*> v)
    

    The second applicative law states, that the two implementations compL and compR should behave identically.

  • pure f <*> pure x = pure (f x). This is also called the homomorphism law. It should be pretty self-explanatory.

  • f <*> pure v = pure ($ v) <*> f. This is called the law of interchange.

    This should again be explained with a concrete example:

    interL : Maybe (a -> b) -> a -> Maybe b
    interL f v = f <*> pure v
    
    interR : Maybe (a -> b) -> a -> Maybe b
    interR f v = pure ($ v) <*> f
    

    Note, that ($ v) has type (a -> b) -> b, so this is a function being applied to f, which has a function of type a -> b wrapped in a Maybe context.

    The law of interchange states that it must not matter whether we apply a pure value from the left or right of the apply operator.
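To get a feeling for these laws, we can check some of them for concrete values in Maybe. The following is only a sketch for specific inputs, not a general proof, and the names identityLhs, homLhs, homRhs, and homomorphism are invented for this example:

```idris
%default total

-- Identity: lifting and applying `id` should change nothing.
identityLhs : Maybe Nat -> Maybe Nat
identityLhs fa = pure id <*> fa

-- Homomorphism: applying a lifted function to a lifted value ...
homLhs : Maybe Nat
homLhs = pure S <*> pure 2

-- ... should equal applying first and lifting afterwards.
homRhs : Maybe Nat
homRhs = pure (S 2)

-- Both sides reduce to the same value during type checking,
-- so Idris accepts `Refl` as evidence that they agree.
homomorphism : homLhs = homRhs
homomorphism = Refl
```

At the REPL, identityLhs (Just 5) should evaluate to Just 5, and identityLhs Nothing to Nothing.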

Exercises part 2

  1. Implement Applicative' for Either e and Identity.

  2. Implement Applicative' for Vect n. Note: In order to implement pure, the length must be known at runtime. This can be done by passing it as an unerased implicit to the interface implementation:

    implementation {n : _} -> Applicative' (Vect n) where
    
  3. Implement Applicative' for Pair e, with e having a Monoid constraint.

  4. Implement Applicative for Const e, with e having a Monoid constraint.

  5. Implement Applicative for Validated e, with e having a Semigroup constraint. This will allow us to use (<+>) to accumulate errors in case of two Invalid values in the implementation of apply.

  6. Add an additional data constructor of type CSVError -> CSVError -> CSVError to CSVError and use this to implement Semigroup for CSVError.

  7. Refactor our CSV-parsers and all related functions so that they return Validated instead of Either. This will only work, if you solved exercise 6.

    Two things to note: You will have to adjust very little of the existing code, as we can still use applicative syntax with Validated. Also, with this change, we enhanced our CSV-parsers with the ability of error accumulation. Here are some examples from a REPL session:

    Solutions.Functor> hdecode [Bool,Nat,Gender] 1 "t,12,f"
    Valid [True, 12, Female]
    Solutions.Functor> hdecode [Bool,Nat,Gender] 1 "o,-12,f"
    Invalid (App (FieldError 1 1 "o") (FieldError 1 2 "-12"))
    Solutions.Functor> hdecode [Bool,Nat,Gender] 1 "o,-12,foo"
    Invalid (App (FieldError 1 1 "o")
      (App (FieldError 1 2 "-12") (FieldError 1 3 "foo")))
    

    Behold the power of applicative functors and heterogeneous lists: With only a few lines of code we wrote a pure, type-safe, and total parser with error accumulation for lines in CSV-files, which is very convenient to use at the same time!

  8. Since we introduced heterogeneous lists in this chapter, it would be a pity not to experiment with them a little.

    This exercise is meant to sharpen your skills in type wizardry. It therefore comes with very few hints. Try to decide yourself what behavior you'd expect from a given function, how to express this in the types, and how to implement it afterwards. If your types are correct and precise enough, the implementations will almost come for free. Don't give up too early if you get stuck. Only if you truly run out of ideas should you have a glance at the solutions (and then, only at the types at first!)

    1. Implement head for HList.

    2. Implement tail for HList.

    3. Implement (++) for HList.

    4. Implement index for HList. This might be harder than the other three. Go back and look how we implemented indexList in an earlier exercise and start from there.

    5. Package contrib, which is part of the Idris project, provides Data.HVect.HVect, a data type for heterogeneous vectors. The only difference to our own HList is, that HVect is indexed over a vector of types instead of a list of types. This makes it easier to express certain operations at the type level.

      Write your own implementation of HVect together with functions head, tail, (++), and index.

    6. For a real challenge, try implementing a function for transposing a Vect m (HVect ts). You'll first have to be creative about how to even express this in the types.

      Note: In order to implement this, you'll need to pattern match on an erased argument in at least one case to help Idris with type inference. Pattern matching on erased arguments is forbidden (they are erased after all, so we can't inspect them at runtime), unless the structure of the value being matched on can be derived from another, un-erased argument.

      Also, don't worry if you get stuck on this one. It took me several tries to figure it out. But I enjoyed the experience, so I just had to include it here. :-)

      Note, however, that such a function might be useful when working with CSV-files, as it allows us to convert a table represented as rows (a vector of tuples) to one represented as columns (a tuple of vectors).

  9. Show, that the composition of two applicative functors is again an applicative functor by implementing Applicative for Comp f g.

  10. Show, that the product of two applicative functors is again an applicative functor by implementing Applicative for Prod f g.

Monad

module Tutorial.Functor.Monad

import Tutorial.Functor.Functor
import Tutorial.Functor.Applicative

import Data.List1
import Data.String
import Data.Vect

%default total

Finally, Monad. A lot of ink has been spilled about this one. However, after what we already saw in the chapter about IO, there is not much left to discuss here. Monad extends Applicative and adds two new related functions: The bind operator ((>>=)) and function join. Here is its definition:

interface Applicative' m => Monad' m where
  bind  : m a -> (a -> m b) -> m b
  join' : m (m a) -> m a

Implementers of Monad are free to choose to either implement (>>=) or join or both. You will show in an exercise, how join can be implemented in terms of bind and vice versa.

The big difference between Monad and Applicative is, that the former allows a computation to depend on the result of an earlier computation. For instance, we could decide based on a string read from standard input whether to delete a file or play a song. The result of the first IO action (reading some user input) will affect, which IO action to run next. This is not possible with the apply operator:

(<*>) : IO (a -> b) -> IO a -> IO b

The two IO actions have already been decided on when they are being passed as arguments to (<*>). The result of the first cannot - in the general case - affect which computation to run in the second. (Actually, with IO this would theoretically be possible via side effects: The first action could write some command to a file or overwrite some mutable state, and the second action could read from that file or state, thus deciding on the next thing to do. But this is a speciality of IO, not of applicative functors in general. If the functor in question were Maybe, List, or Vect, no such thing would be possible.)

Let's demonstrate the difference with an example. Assume we'd like to enhance our CSV-reader with the ability to decode a line of tokens to a sum type. For instance, we'd like to decode CRUD requests from the lines of a CSV-file:

data Crud : (i : Type) -> (a : Type) -> Type where
  Create : (value : a) -> Crud i a
  Update : (id : i) -> (value : a) -> Crud i a
  Read   : (id : i) -> Crud i a
  Delete : (id : i) -> Crud i a

On each line, we need a way to decide which data constructor to choose for our decoding. One way to do this is to put the name of the data constructor (or some other tag of identification) in the first column of the CSV-file:

hlift : (a -> b) -> HList [a] -> b
hlift f [x] = f x

hlift2 : (a -> b -> c) -> HList [a,b] -> c
hlift2 f [x,y] = f x y

decodeCRUD :  CSVField i
           => CSVField a
           => (line : Nat)
           -> (s    : String)
           -> Either CSVError (Crud i a)
decodeCRUD l s =
  let h ::: t = split (',' ==) s
   in do
     MkName n <- readField l 1 h
     case n of
       "Create" => hlift  Create  <$> decodeAt l 2 t
       "Update" => hlift2 Update  <$> decodeAt l 2 t
       "Read"   => hlift  Read    <$> decodeAt l 2 t
       "Delete" => hlift  Delete  <$> decodeAt l 2 t
       _        => Left (FieldError l 1 n)

I added two utility functions for helping with type inference and to get slightly nicer syntax. The important thing to note is, how we pattern match on the result of the first parsing function to decide on the data constructor and thus the next parsing function to use.

Here's how this works at the REPL:

Tutorial.Functor> decodeCRUD {i = Nat} {a = Email} 1 "Create,jon@doe.ch"
Right (Create (MkEmail "jon@doe.ch"))
Tutorial.Functor> decodeCRUD {i = Nat} {a = Email} 1 "Update,12,jane@doe.ch"
Right (Update 12 (MkEmail "jane@doe.ch"))
Tutorial.Functor> decodeCRUD {i = Nat} {a = Email} 1 "Delete,jon@doe.ch"
Left (FieldError 1 2 "jon@doe.ch")

To conclude, Monad, unlike Applicative, allows us to chain computations sequentially, where intermediary results can affect the behavior of later computations. So, if you have n unrelated effectful computations and want to combine them under a pure, n-ary function, Applicative will be sufficient. If, however, you want to decide based on the result of an effectful computation what computation to run next, you need a Monad.
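The contrast can be condensed into a small, self-contained sketch using Maybe. All names below (Shape, rect, splitHead, circle, rectFrom, and decodeShape) are hypothetical, and the numeric tags 0 and 1 are an arbitrary choice made for this example:

```idris
%default total

data Shape = Circle Nat | Rect Nat Nat

-- Applicative is sufficient here: the two arguments are unrelated
-- computations combined under a pure binary function.
rect : Maybe Nat -> Maybe Nat -> Maybe Shape
rect mw mh = [| Rect mw mh |]

-- Splits off the first element of a list, if there is one.
splitHead : List a -> Maybe (a, List a)
splitHead []        = Nothing
splitHead (x :: xs) = Just (x, xs)

circle : List Nat -> Maybe Shape
circle [r] = Just (Circle r)
circle _   = Nothing

rectFrom : List Nat -> Maybe Shape
rectFrom [w, h] = Just (Rect w h)
rectFrom _      = Nothing

-- Here we need Monad: the tag read in the first step decides
-- which decoder to run in the second step.
decodeShape : List Nat -> Maybe Shape
decodeShape ns = do
  (tag, rest) <- splitHead ns
  case tag of
    0 => circle rest
    1 => rectFrom rest
    _ => Nothing
```

For instance, decodeShape [1,2,3] should yield Just (Rect 2 3), while decodeShape [0,5,6] yields Nothing, because the tag 0 selects the decoder expecting exactly one more value.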

Note, however, that Monad has one important drawback compared to Applicative: In general, monads don't compose. For instance, there is no Monad instance for Either e . IO. We will later learn about monad transformers, which can be composed with other monads.

Monad Laws

Without further ado, here are the laws for Monad:

  • ma >>= pure = ma and pure v >>= f = f v. These are monad's identity laws. Here they are as concrete examples:

    id1L : Maybe a -> Maybe a
    id1L ma = ma >>= pure
    
    id2L : a -> (a -> Maybe b) -> Maybe b
    id2L v f = pure v >>= f
    
    id2R : a -> (a -> Maybe b) -> Maybe b
    id2R v f = f v
    

    These two laws state that pure should behave neutrally w.r.t. bind.

  • (m >>= f) >>= g = m >>= (f >=> g). This is the law of associativity for monad. You might not have seen the second operator (>=>). It can be used to sequence effectful computations and has the following type:

    Tutorial.Functor> :t (>=>)
    Prelude.>=> : Monad m => (a -> m b) -> (b -> m c) -> a -> m c
    

The above are the official monad laws. However, we need to consider a third one, given that in Idris (and Haskell) Monad extends Applicative: As (<*>) can be implemented in terms of (>>=), the actual implementation of (<*>) must behave the same as the implementation in terms of (>>=):

  • mf <*> ma = mf >>= (\fun => map (fun $) ma).
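Since operator (>=>) from the law of associativity might be new to you, here is a small sketch of how it behaves with Maybe. The functions predM, nonZero, predNonZero, assocLhs, and assocRhs are made up for this illustration:

```idris
%default total

-- Predecessor of a natural number, failing for zero.
predM : Nat -> Maybe Nat
predM 0     = Nothing
predM (S k) = Just k

-- Accepts only non-zero numbers.
nonZero : Nat -> Maybe Nat
nonZero 0 = Nothing
nonZero n = Just n

-- Kleisli composition: run `predM`, then pass its result to `nonZero`.
predNonZero : Nat -> Maybe Nat
predNonZero = predM >=> nonZero

-- The two sides of the law of associativity, specialized to `Maybe`.
assocLhs : Maybe Nat -> Maybe Nat
assocLhs m = (m >>= predM) >>= nonZero

assocRhs : Maybe Nat -> Maybe Nat
assocRhs m = m >>= (predM >=> nonZero)
```

Both assocLhs (Just 2) and assocRhs (Just 2) should evaluate to Just 1, and both return Nothing for Just 1, Just 0, and Nothing.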

Exercises part 3

  1. Applicative extends Functor, because every Applicative is also a Functor. Prove this by implementing map in terms of pure and (<*>).

  2. Monad extends Applicative, because every Monad is also an Applicative. Prove this by implementing (<*>) in terms of (>>=) and pure.

  3. Implement (>>=) in terms of join and other functions in the Monad hierarchy.

  4. Implement join in terms of (>>=) and other functions in the Monad hierarchy.

  5. There is no lawful Monad implementation for Validated e. Why?

  6. In this slightly extended exercise, we are going to simulate CRUD operations on a data store. We will use a mutable reference (imported from Data.IORef from the base library) holding a list of Users paired with a unique ID of type Nat as our user database:

    DB : Type
    DB = IORef (List (Nat,User))
    

    Most operations on a database come with a risk of failure: When we try to update or delete a user, the entry in question might no longer be there. When we add a new user, a user with the given email address might already exist. Here is a custom error type to deal with this:

    data DBError : Type where
      UserExists        : Email -> Nat -> DBError
      UserNotFound      : Nat -> DBError
      SizeLimitExceeded : DBError
    

    In general, our functions will therefore have a type similar to the following:

    someDBProg : arg1 -> arg2 -> DB -> IO (Either DBError a)
    

    We'd like to abstract over this, by introducing a new wrapper type:

    record Prog a where
      constructor MkProg
      runProg : DB -> IO (Either DBError a)
    

    We are now ready to write us some utility functions. Make sure to follow the following business rules when implementing the functions below:

    • Email addresses in the DB must be unique. (Consider implementing Eq Email to verify this).

    • The size limit of 1000 entries must not be exceeded.

    • Operations trying to look up a user by their ID must fail with UserNotFound in case no entry was found in the DB.

    You'll need the following functions from Data.IORef when working with mutable references: newIORef, readIORef, and writeIORef. In addition, functions Data.List.lookup and Data.List.find might be useful to implement some of the functions below.

    1. Implement interfaces Functor, Applicative, and Monad for Prog.

    2. Implement interface HasIO for Prog.

    3. Implement the following utility functions:

      throw : DBError -> Prog a
      
      getUsers : Prog (List (Nat,User))
      
      -- check the size limit!
      putUsers : List (Nat,User) -> Prog ()
      
      -- implement this in terms of `getUsers` and `putUsers`
      modifyDB : (List (Nat,User) -> List (Nat,User)) -> Prog ()
      
    4. Implement function lookupUser. This should fail with an appropriate error, if a user with the given ID cannot be found.

      lookupUser : (id : Nat) -> Prog User
      
    5. Implement function deleteUser. This should fail with an appropriate error, if a user with the given ID cannot be found. Make use of lookupUser in your implementation.

      deleteUser : (id : Nat) -> Prog ()
      
    6. Implement function addUser. This should fail, if a user with the given Email already exists, or if the database's size limit of 1000 entries is exceeded. In addition, this should create and return a unique ID for the new user entry.

      addUser : (new : User) -> Prog Nat
      
    7. Implement function updateUser. This should fail, if the user in question cannot be found or a user with the updated user's Email already exists. The returned value should be the updated user.

      updateUser : (id : Nat) -> (mod : User -> User) -> Prog User
      
    8. Data type Prog is actually too specific. We could just as well abstract over the error type and the DB environment:

      record Prog' env err a where
        constructor MkProg'
        runProg' : env -> IO (Either err a)
      

      Verify, that all interface implementations you wrote for Prog can be used verbatim to implement the same interfaces for Prog' env err. The same goes for throw with only a slight adjustment in the function's type.

Background and further Reading

Concepts like functor and monad have their origin in category theory, a branch of mathematics. That is also where their laws come from. Category theory was found to have applications in programming language theory, especially functional programming. It is a highly abstract topic, but there is a pretty accessible introduction for programmers, written by Bartosz Milewski.

The usefulness of applicative functors as a middle ground between functor and monad was discovered several years after monads had already been in use in Haskell. They were introduced in the article Applicative Programming with Effects, which is freely available online and a highly recommended read.

Conclusion

  • Interfaces Functor, Applicative, and Monad abstract over programming patterns that come up when working with type constructors of type Type -> Type. Such data types are also referred to as values in a context, or effectful computations.

  • Functor allows us to map over values in a context without affecting the context's underlying structure.

  • Applicative allows us to apply n-ary functions to n effectful computations and to lift pure values into a context.

  • Monad allows us to chain effectful computations, where the intermediary results can affect, which computation to run further down the chain.

  • Unlike Monad, Functor and Applicative compose: The product and composition of two functors or applicatives are again functors or applicatives, respectively.

  • Idris provides syntactic sugar for working with some of the interfaces presented here: Idiom brackets for Applicative, do blocks and the bang operator for Monad.

What's next?

In the next chapter we get to learn more about recursion, totality checking, and an interface for collapsing container types: Foldable.

Recursion and Folds

In this chapter, we are going to have a closer look at the computations we typically perform with container types: Parameterized data types like List, Maybe, or Identity, holding zero or more values of the parameter's type. Many of these functions are recursive in nature, so we start with a discourse about recursion in general, and tail recursion as an important optimization technique in particular. Most recursive functions in this part will describe pure iterations over lists.

Totality is hardest to determine for recursive functions, so we will next have a quick look at the totality checker and learn when it will refuse to accept a function as being total and what to do about this.

Finally, we will start looking for common patterns in the recursive functions from the first part and will eventually introduce a new interface for consuming container types: Interface Foldable.

Recursion

module Tutorial.Folds.Recursion

import Data.List1
import Data.Maybe
import Data.Vect
import Debug.Trace

%default total

In this section, we are going to have a closer look at recursion in general and at tail recursion in particular.

Recursive functions are functions that call themselves to repeat a task or calculation until a certain terminating condition (called the base case) holds. Please note that it is recursion that makes totality hard to verify: Non-recursive functions that are covering (they cover all possible cases in their pattern matches) are automatically total if they only invoke other total functions.

Here is an example of a recursive function: It generates a list of the given length filling it with identical values:

replicateList : Nat -> a -> List a
replicateList 0     _ = []
replicateList (S k) x = x :: replicateList k x

As you can see (this module has the %default total pragma at the top), this function is provably total. Idris verifies, that the Nat argument gets strictly smaller in each recursive call, and that therefore, the function must eventually come to an end. Of course, we can do the same thing for Vect, where we can even show that the length of the resulting vector matches the given natural number:

replicateVect : (n : Nat) -> a -> Vect n a
replicateVect 0     _ = []
replicateVect (S k) x = x :: replicateVect k x

While we often use recursion to create values of data types like List or Vect, we also use recursion, when we consume such values. For instance, here is a function for calculating the length of a list:

len : List a -> Nat
len []        = 0
len (_ :: xs) = 1 + len xs

Again, Idris can verify that len is total, as the list we pass in the recursive case is strictly smaller than the original list argument.

But when is a recursive function non-total? Here is an example: The following function creates a sequence of values until the given generation function (gen) returns a Nothing. Note, how we use a state value (of generic type s) and use gen to calculate a value together with the next state:

covering
unfold : (gen : s -> Maybe (s,a)) -> s -> List a
unfold gen vs = case gen vs of
  Just (vs',va) => va :: unfold gen vs'
  Nothing       => []

With unfold, Idris can't verify that any of its arguments is converging towards the base case. It therefore rightfully refuses to accept that unfold is total. And indeed, the following function produces an infinite list (so please, don't try to inspect this at the REPL, as doing so will consume all your computer's memory):

fiboHelper : (Nat,Nat) -> ((Nat,Nat),Nat)
fiboHelper (f0,f1) = ((f1, f0 + f1), f0)

covering
fibonacci : List Nat
fibonacci = unfold (Just . fiboHelper) (1,1)

In order to safely create a (finite) sequence of Fibonacci numbers, we need to make sure the function generating the sequence will stop after a finite number of steps, for instance by limiting the length of the list:

unfoldTot : Nat -> (gen : s -> Maybe (s,a)) -> s -> List a
unfoldTot 0     _   _  = []
unfoldTot (S k) gen vs = case gen vs of
  Just (vs',va) => va :: unfoldTot k gen vs'
  Nothing       => []

fibonacciN : Nat -> List Nat
fibonacciN n = unfoldTot n (Just . fiboHelper) (1,1)
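
The additional Nat argument only sets an upper bound on the number of steps: a generator that returns Nothing still ends the sequence earlier. Here is a minimal sketch of this interplay. The names unfoldFuel, countdownGen, and countdown are hypothetical and not part of the tutorial's sources; unfoldFuel merely repeats unfoldTot so that the snippet stands on its own:

```idris
%default total

-- Same recursion scheme as `unfoldTot`: the first argument limits
-- the number of generated values.
unfoldFuel : (fuel : Nat) -> (gen : s -> Maybe (s,a)) -> s -> List a
unfoldFuel 0     _   _  = []
unfoldFuel (S k) gen vs = case gen vs of
  Just (vs',va) => va :: unfoldFuel k gen vs'
  Nothing       => []

-- A generator that stops on its own once the state reaches zero.
countdownGen : Nat -> Maybe (Nat,Nat)
countdownGen 0     = Nothing
countdownGen (S k) = Just (k, S k)

-- Starting from `n`, the generator yields `n` values before it returns
-- `Nothing`, so a fuel of `n` never cuts the sequence short.
countdown : Nat -> List Nat
countdown n = unfoldFuel n countdownGen n
```

With these definitions, countdown 3 evaluates to [3, 2, 1], while unfoldFuel 2 countdownGen 5 runs out of fuel after two steps and yields [5, 4].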

The Call Stack

In order to demonstrate what tail recursion is about, we require the following main function:

main : IO ()
main = printLn . len $ replicateList 10000 10

If you have Node.js installed on your system, you might try the following experiment. Compile this module using the Node.js backend of Idris instead of the default Chez Scheme backend, and run the resulting JavaScript source file with the Node.js binary:

idris2 --cg node -o test.js --find-ipkg src/Tutorial/Folds.md
node build/exec/test.js

Node.js will fail with the following error message and a lengthy stack trace: RangeError: Maximum call stack size exceeded. What's going on here? How can it be that main fails with an exception although it is provably total?

First, remember that a function being total means that it will eventually produce a value of the given type in a finite amount of time, given enough resources like computer memory. Here, main hasn't been given enough resources, as Node.js has a very small size limit on its call stack. The call stack can be thought of as a stack data structure (last in, first out) on which nested function calls are placed. In the case of recursive functions, the stack size increases by one with every recursive function call. In the case of our main function, we create and consume a list of length 10'000, so the call stack will hold at least 10'000 pending function calls before the first of them returns and the stack's size is reduced again. This exceeds Node.js's stack size limit by far, hence the overflow error.

Now, before we look at a way to circumvent this issue, please note that this is a very serious and limiting source of bugs when using the JavaScript backends of Idris. In Idris, having no access to control structures like for or while loops, we always have to resort to recursion in order to describe iterative computations. Luckily (or should I say "unfortunately", since otherwise this issue would already have been addressed with all seriousness), the Scheme backends don't have this issue, as their stack size limit is much larger and they perform all kinds of optimizations internally to prevent the call stack from overflowing.

Tail Recursion

A recursive function is said to be tail recursive if all recursive calls occur at tail position: The last function call in a (sub)expression. For instance, the following version of len is tail recursive:

lenOnto : Nat -> List a -> Nat
lenOnto k []        = k
lenOnto k (_ :: xs) = lenOnto (k + 1) xs

Compare this to len as defined above: There, the last function call is an invocation of operator (+), and the recursive call happens in one of its arguments:

len (_ :: xs) = 1 + len xs

We can use lenOnto as a utility to implement a tail recursive version of len without the additional Nat argument:

lenTR : List a -> Nat
lenTR = lenOnto 0
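
The same accumulator idea carries over to other list functions. The following sketch (sumOnto and sumTR are hypothetical names used only for illustration) sums a list of natural numbers. The trace in the comments shows that every step is a single call with an updated accumulator, so no pending additions pile up:

```idris
%default total

-- The first argument accumulates the sum of the elements seen so far.
sumOnto : Nat -> List Nat -> Nat
sumOnto acc []        = acc
sumOnto acc (x :: xs) = sumOnto (acc + x) xs

-- Wrapper hiding the accumulator from callers.
sumTR : List Nat -> Nat
sumTR = sumOnto 0

-- Evaluation trace for `sumTR [1,2,3]`:
--   sumOnto 0 [1,2,3]
--   sumOnto 1 [2,3]     -- 0 + 1
--   sumOnto 3 [3]       -- 1 + 2
--   sumOnto 6 []        -- 3 + 3
--   6
```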

This is a common pattern when writing tail recursive functions: We typically add an additional function argument for accumulating intermediary results, which is then passed on explicitly at each recursive call. For instance, here is a tail recursive version of replicateList:

replicateListTR : Nat -> a -> List a
replicateListTR n v = go Nil n
  where go : List a -> Nat -> List a
        go xs 0     = xs
        go xs (S k) = go (v :: xs) k

The big advantage of tail recursive functions is that they can be easily converted to efficient, imperative loops by the Idris compiler, and are thus stack safe: Recursive function calls are not added to the call stack, thus avoiding the dreaded stack overflow errors.

main1 : IO ()
main1 = printLn . lenTR $ replicateListTR 10000 10

We can again run main1 using the Node.js backend. This time, we use slightly different syntax to execute a function other than main (Remember: The dollar prefix is only there to distinguish a terminal command from its output. It is not part of the command you enter in a terminal session.):

$ idris2 --cg node --exec main1 --find-ipkg src/Tutorial/Folds.md
10000

As you can see, this time the computation finished without overflowing the call stack.

Tail recursive functions are allowed to consist of (possibly nested) pattern matches, with recursive calls at tail position in several of the branches. Here is an example:

countTR : (a -> Bool) -> List a -> Nat
countTR p = go 0
  where go : Nat -> List a -> Nat
        go k []        = k
        go k (x :: xs) = case p x of
          True  => go (S k) xs
          False => go k xs

Note how each invocation of go is in tail position in its branch of the case expression.

Mutual Recursion

It is sometimes convenient to implement several related functions, which call each other recursively. In Idris, unlike in many other programming languages, a function must be declared in a source file before it can be called by other functions, as in general a function's implementation must be available during type checking (because Idris has dependent types). There are two ways around this, which actually result in the same internal representation in the compiler. Our first option is to write down the functions' declarations first with the implementations following after. Here's a silly example:

even : Nat -> Bool

odd : Nat -> Bool

even 0     = True
even (S k) = odd k

odd 0     = False
odd (S k) = even k

As you can see, function even is allowed to call function odd in its implementation, since odd has already been declared (but not yet implemented).

If you're like me and want to keep declarations and implementations next to each other, you can introduce a mutual block, which has the same effect. Like with other code blocks, functions in a mutual block must all be indented by the same amount of whitespace:

mutual
  even' : Nat -> Bool
  even' 0     = True
  even' (S k) = odd' k

  odd' : Nat -> Bool
  odd' 0     = False
  odd' (S k) = even' k

Just like with single recursive functions, mutually recursive functions can be optimized to imperative loops if all recursive calls occur at tail position. This is the case with functions even and odd, as can again be verified at the Node.js backend:

main2 : IO ()
main2 =  printLn (even 100000)
      >> printLn (odd 100000)

$ idris2 --cg node --exec main2 --find-ipkg src/Tutorial/Folds.md
True
False

Final Remarks

In this section, we learned about several important aspects of recursion and totality checking, which are summarized here:

  • In pure functional programming, recursion is the way to implement iterative procedures.

  • Recursive functions pass the totality checker if it can verify that one of the arguments is getting strictly smaller in every recursive function call.

  • Arbitrary recursion can lead to stack overflow exceptions on backends with small stack size limits.

  • The JavaScript backends of Idris perform mutual tail call optimization: Tail recursive functions are converted to stack safe, imperative loops.

Note that not all Idris backends you will come across in the wild will perform tail call optimization. Please check the corresponding documentation.

Note also that most recursive functions in the core libraries (prelude and base) do not yet make use of tail recursion. There is an important reason for this: In many cases, non-tail recursive functions are easier to use in compile-time proofs, as they unify more naturally than their tail recursive counterparts. Compile-time proofs are an important aspect of programming in Idris (as we will see in later chapters), so there is a compromise to be made between what performs well at runtime and what works well at compile time. Eventually, the way to go might be to provide two implementations for most recursive functions with a transform rule telling the compiler to use the optimized version at runtime whenever programmers use the non-optimized version in their code. Such transform rules have - for instance - already been written for functions pack and unpack (which use fastPack and fastUnpack at runtime; see the corresponding rules in the following source file).
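
To give an impression of what such a rule might look like, here is a hedged sketch. The functions lenNaive and lenLoop as well as the rule name are hypothetical, and the sketch assumes that the %transform pragma takes the form of a rule name followed by an equation, as in the rules written for pack and unpack:

```idris
%default total

-- Easy to reason about at compile time, but not stack safe.
lenNaive : List a -> Nat
lenNaive []        = 0
lenNaive (_ :: xs) = 1 + lenNaive xs

-- Tail recursive counterpart with an accumulator.
lenLoop : List a -> Nat
lenLoop = go 0
  where go : Nat -> List a -> Nat
        go k []        = k
        go k (_ :: xs) = go (k + 1) xs

-- Assumed syntax: %transform "rule name" lhs = rhs
-- The intent is that compiled code calls `lenLoop` wherever the
-- source code mentions `lenNaive`.
%transform "lenLoop" lenNaive = lenLoop
```

Note that the compiler does not check that both sides of such a rule behave identically. It is up to the author of the rule to make sure that they do.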

Exercises part 1

In these exercises you are going to implement several recursive functions. Make sure to use tail recursion whenever possible and quickly verify the correct behavior of all functions at the REPL.

  1. Implement functions anyList and allList, which return True if any element (or all elements in case of allList) in a list fulfills the given predicate:

    anyList : (a -> Bool) -> List a -> Bool
    
    allList : (a -> Bool) -> List a -> Bool
    
  2. Implement function findList, which returns the first value (if any) fulfilling the given predicate:

    findList : (a -> Bool) -> List a -> Maybe a
    
  3. Implement function collectList, which returns the first value (if any), for which the given function returns a Just:

    collectList : (a -> Maybe b) -> List a -> Maybe b
    

    Implement lookupList in terms of collectList:

    lookupList : Eq a => a -> List (a,b) -> Maybe b
    
  4. For functions like map or filter, which must loop over a list without affecting the order of elements, it is harder to write a tail recursive implementation. The safest way to do so is by using a SnocList (a reverse kind of list that's built from head to tail instead of from tail to head) to accumulate intermediate results. Its two constructors are Lin and (:<) (called the snoc operator). Module Data.SnocList exports two tail recursive operators called fish and chips ((<><) and (<>>)) for going from SnocList to List and vice versa. Have a look at the types of all new data constructors and operators before continuing with the exercise.

    Implement a tail recursive version of map for List by using a SnocList to reassemble the mapped list. Then use the chips operator with a Nil argument to convert the SnocList back to a List at the end.

    mapTR : (a -> b) -> List a -> List b
    
  5. Implement a tail recursive version of filter, which only keeps those values in a list, which fulfill the given predicate. Use the same technique as described in exercise 4.

    filterTR : (a -> Bool) -> List a -> List a
    
  6. Implement a tail recursive version of mapMaybe, which only keeps those values in a list, for which the given function argument returns a Just:

    mapMaybeTR : (a -> Maybe b) -> List a -> List b
    

    Implement catMaybesTR in terms of mapMaybeTR:

    catMaybesTR : List (Maybe a) -> List a
    
  7. Implement a tail recursive version of list concatenation:

    concatTR : List a -> List a -> List a
    
  8. Implement tail recursive versions of bind and join for List:

    bindTR : List a -> (a -> List b) -> List b
    
    joinTR : List (List a) -> List a
    

Notes on Totality Checking

module Tutorial.Folds.Totality

%default total

The totality checker in Idris verifies that at least one (possibly erased!) argument in a recursive call converges towards a base case. For instance, with natural numbers, if the base case is zero (corresponding to data constructor Z), and we continue with k after pattern matching on S k, Idris can derive from Nat's constructors that k is strictly smaller than S k and therefore the recursive call must converge towards a base case. Exactly the same reasoning is used when pattern matching on a list and continuing only with its tail in the recursive call.

While this works in many cases, it doesn't always go as expected. Below, I'll show you a couple of examples where totality checking fails although we know that the functions in question are definitely total.

Case 1: Recursion over a Primitive

Idris doesn't know anything about the internal structure of primitive data types. So the following function, although being obviously total, will not be accepted by the totality checker:

covering
replicatePrim : Bits32 -> a -> List a
replicatePrim 0 v = []
replicatePrim x v = v :: replicatePrim (x - 1) v

Unlike with natural numbers (Nat), which are defined as an inductive data type and are only converted to integer primitives during compilation, Idris can't tell that x - 1 is strictly smaller than x, and so it fails to verify that this must converge towards the base case. (The reason is that x - 1 is implemented in terms of primitive function prim__sub_Bits32, which is built into the compiler and must be implemented by each backend individually. The totality checker knows about data types, constructors, and functions defined in Idris, but not about (primitive) functions and foreign functions implemented at the backends. While it is theoretically possible to also define and use laws for primitive and foreign functions, this hasn't yet been done for most of them.)

Since non-totality is highly contagious (all functions invoking a partial function are themselves considered to be partial by the totality checker), there is a utility function, assert_smaller, which we can use to convince the totality checker and still annotate our functions with the total keyword:

replicatePrim' : Bits32 -> a -> List a
replicatePrim' 0 v = []
replicatePrim' x v = v :: replicatePrim' (assert_smaller x $ x - 1) v

Please note, though, that whenever you use assert_smaller to silence the totality checker, the burden of proving totality rests on your shoulders. Failing to do so can lead to arbitrary and unpredictable program behavior (which is the default with most other programming languages).
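
Where the primitive value only serves as a counter, one way to avoid assert_smaller altogether is to convert it to a Nat first and recurse on that. The following is a sketch under that assumption: replicateViaNat is a hypothetical name, and the code relies on the Prelude's cast from Bits32 to Nat:

```idris
%default total

-- Converts the primitive counter to a `Nat` once and then recurses
-- structurally, so the totality checker is satisfied without assertions.
replicateViaNat : Bits32 -> a -> List a
replicateViaNat x v = go (cast x)
  where go : Nat -> List a
        go 0     = []
        go (S k) = v :: go k
```

The price is the up-front conversion, but in exchange the burden of proof stays with the compiler.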

Ex Falso Quodlibet

Below - as a demonstration - is a simple proof of Void. Void is an uninhabited type: a type with no values. Proving Void means that we implement a function accepted by the totality checker, which returns a value of type Void, although this is supposed to be impossible as there is no such value. Doing so allows us to completely disable the type system together with all the guarantees it provides. Here's the code and its dire consequences:

-- In order to prove `Void`, we just loop forever, using
-- `assert_smaller` to silence the totality checker.
proofOfVoid : Bits8 -> Void
proofOfVoid n = proofOfVoid (assert_smaller n n)

-- From a value of type `Void`, anything follows!
-- This function is safe and total, as there is no
-- value of type `Void`!
exFalsoQuodlibet : Void -> a
exFalsoQuodlibet _ impossible

-- By passing our proof of void to `exFalsoQuodlibet`
-- (exported by the *Prelude* by the name of `void`), we
-- can coerce any value to a value of any other type.
-- This renders type checking completely useless, as
-- we can freely convert between values of different
-- types.
coerce : a -> b
coerce _ = exFalsoQuodlibet (proofOfVoid 0)

-- Finally, we invoke `putStrLn` with a number instead
-- of a string. `coerce` allows us to do just that.
pain : IO ()
pain = putStrLn $ coerce 0

Please take a moment to marvel at provably total function coerce: It claims to convert any value to a value of any other type. And it is completely safe, as it only uses total functions in its implementation. The problem is - of course - that proofOfVoid should never ever have been a total function.

In pain we use coerce to conjure a string from an integer. In the end, we get what we deserve: The program crashes with an error. While things could have been much worse, it can still be quite time consuming and annoying to locate the source of such an error.

$ idris2 --cg node --exec pain --find-ipkg src/Tutorial/Folds.md
ERROR: No clauses

So, with a single thoughtless placement of assert_smaller we wrought havoc within our pure and total codebase sacrificing totality and type safety in one fell swoop. Therefore: Use at your own risk!

Note: I do not expect you to understand all the dark magic at work in the code above. I'll explain the details in due time in another chapter.

Second note: Ex falso quodlibet, also called the principle of explosion, is a law in logic: From a contradiction, any statement can be proven. In our case, the contradiction was our proof of Void: The claim that we wrote a total function producing such a value, although Void is an uninhabited type. You can verify this by inspecting Void at the REPL with :doc Void: It has no data constructors.

Case 2: Recursion via Function Calls

Below is an implementation of a rose tree. Rose trees can represent search paths in computer algorithms, for instance in graph theory.

record Tree a where
  constructor Node
  value  : a
  forest : List (Tree a)

Forest : Type -> Type
Forest = List . Tree

We could try and compute the size of such a tree as follows:

covering
size : Tree a -> Nat
size (Node _ forest) = S . sum $ map size forest

In the code above, the recursive call happens within map. We know that we are using only subtrees in the recursive calls (since we know how map is implemented for List), but Idris can't know this (teaching a totality checker how to figure this out on its own seems to be an open research question). So it will refuse to accept the function as being total.

There are two ways to handle the case above. If we don't mind writing a bit of otherwise unneeded boilerplate code, we can use explicit recursion. In fact, since we often also work with search forests, this is the preferable way here.

mutual
  treeSize : Tree a -> Nat
  treeSize (Node _ forest) = S $ forestSize forest

  forestSize : Forest a -> Nat
  forestSize []        = 0
  forestSize (x :: xs) = treeSize x + forestSize xs

In the case above, Idris can verify that we don't blow up our trees behind its back as we are explicit about what happens in each recursive step. This is the safe, preferable way of going about this, especially if you are new to the language and totality checking in general.
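
The same recursion scheme works for other functions on rose trees. As an illustration, here is a self-contained sketch that counts the leaves of a tree. The names Rose, MkRose, roseLeaves, and leavesOf are hypothetical and only mirror Tree and Node from above so that the snippet compiles on its own:

```idris
%default total

-- A stand-in for the `Tree` record defined in the text.
record Rose a where
  constructor MkRose
  label    : a
  children : List (Rose a)

mutual
  -- A node without children is a leaf. Otherwise we count the
  -- leaves of all subtrees.
  roseLeaves : Rose a -> Nat
  roseLeaves (MkRose _ []) = 1
  roseLeaves (MkRose _ ts) = leavesOf ts

  -- Explicit recursion over the list of subtrees.
  leavesOf : List (Rose a) -> Nat
  leavesOf []        = 0
  leavesOf (t :: ts) = roseLeaves t + leavesOf ts

-- A small example tree with two leaves (labelled 2 and 4).
example : Rose Nat
example = MkRose 1 [MkRose 2 [], MkRose 3 [MkRose 4 []]]
```

With these definitions, roseLeaves example evaluates to 2.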

However, sometimes the solution presented above is just too cumbersome to write. For instance, here is an implementation of Show for rose trees:

Show a => Show (Tree a) where
  showPrec p (Node v ts) =
    assert_total $ showCon p "Node" (showArg v ++ showArg ts)

In this case, we'd have to manually reimplement Show for lists of trees: A tedious task - and error-prone on its own. Instead, we resort to using the mighty sledgehammer of totality checking: assert_total. Needless to say that this comes with the same risks as assert_smaller, so be very careful.

Exercises part 2

Implement the following functions in a provably total way without "cheating". Note: It is not necessary to implement these in a tail recursive way.

  1. Implement function depth for rose trees. This should return the maximal number of Node constructors from the current node to the farthest child node. For instance, the current node should be at depth one, all its direct child nodes are at depth two, their immediate child nodes at depth three and so on.

  2. Implement interface Eq for rose trees.

  3. Implement interface Functor for rose trees.

  4. For the fun of it: Implement interface Show for rose trees.

  5. In order not to forget how to program with dependent types, implement function treeToVect for converting a rose tree to a vector of the correct size.

    Hint: Make sure to follow the same recursion scheme as in the implementation of treeSize. Otherwise, this might be very hard to get to work.

Interface Foldable

module Tutorial.Folds.Foldable

import Debug.Trace

%default total

When looking back at all the exercises we solved in the section about recursion, most tail recursive functions on lists were of the following pattern: Iterate over all list elements from head to tail while passing along some state for accumulating intermediate results. At the end of the list, return the final state or convert it with an additional function call.

Left Folds

This is functional programming, and we'd like to abstract over such reoccurring patterns. In order to tail recursively iterate over a list, all we need is an accumulator function and some initial state. But what should be the type of the accumulator? Well, it combines the current state with the list's next element and returns an updated state: state -> elem -> state. Surely, we can come up with a higher-order function to encapsulate this behavior:

leftFold : (acc : state -> el -> state) -> (st : state) -> List el -> state
leftFold _   st []        = st
leftFold acc st (x :: xs) = leftFold acc (acc st x) xs

We call this function a left fold, as it iterates over the list from left to right (head to tail), collapsing (or folding) the list until just a single value remains. This new value might still be a list or other container type, but the original list has been consumed from head to tail. Note how leftFold is tail recursive, and therefore all functions implemented in terms of leftFold are tail recursive (and thus, stack safe!) as well.

Here are a few examples:

sumLF : Num a => List a -> a
sumLF = leftFold (+) 0

reverseLF : List a -> List a
reverseLF = leftFold (flip (::)) Nil

-- this is more natural than `reverseLF`!
toSnocListLF : List a -> SnocList a
toSnocListLF = leftFold (:<) Lin
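
To see how the state evolves, it can help to trace a left fold by hand. The sketch below repeats leftFold under the hypothetical name foldLeft (so that it stands on its own) and adds two more illustrative folds, lengthLF and maximumLF, which are not part of the tutorial's sources:

```idris
%default total

-- Same definition as `leftFold` in the text.
foldLeft : (acc : state -> el -> state) -> (st : state) -> List el -> state
foldLeft _   st []        = st
foldLeft acc st (x :: xs) = foldLeft acc (acc st x) xs

-- Ignores the elements and just counts them.
lengthLF : List a -> Nat
lengthLF = foldLeft (\n,_ => S n) 0

-- Largest element of a list of naturals (0 for the empty list).
maximumLF : List Nat -> Nat
maximumLF = foldLeft max 0

-- Trace of reversing a list with `foldLeft (flip (::)) Nil`:
--   foldLeft (flip (::)) []      [1,2,3]
--   foldLeft (flip (::)) [1]     [2,3]
--   foldLeft (flip (::)) [2,1]   [3]
--   foldLeft (flip (::)) [3,2,1] []
--   [3,2,1]
```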

Right Folds

The example functions we implemented in terms of leftFold had to always completely traverse the whole list, as every single element was required to compute the result. This is not always necessary, however. For instance, if you look at findList from the exercises, we could abort iterating over the list as soon as our search was successful. It is not possible to implement this more efficient behavior in terms of leftFold: There, the result will only be returned when our pattern match reaches the Nil case.

Interestingly, there is another, non-tail recursive fold, which reflects the list structure more naturally and which we can use for breaking out early from an iteration. We call this a right fold. Here is its implementation:

rightFold : (acc : el -> state -> state) -> state -> List el -> state
rightFold acc st []        = st
rightFold acc st (x :: xs) = acc x (rightFold acc st xs)

Now, it might not immediately be obvious how this differs from leftFold. In order to see this, we will have to talk about lazy evaluation first.

Lazy Evaluation in Idris

For some computations, it is not necessary to evaluate all function arguments in order to return a result. For instance, consider boolean operator (&&): If the first argument evaluates to False, we already know that the result is False without even looking at the second argument. In such a case, we don't want to unnecessarily evaluate the second argument, as this might include a lengthy computation.

Consider the following REPL session:

Tutorial.Folds> False && (length [1..10000000000] > 100)
False

If the second argument were evaluated, this computation would most certainly blow up your computer's memory, or at least take a very long time to run to completion. However, in this case, the result False is printed immediately. If you look at the type of (&&), you'll see the following:

Tutorial.Folds> :t (&&)
Prelude.&& : Bool -> Lazy Bool -> Bool

As you can see, the second argument is wrapped in a Lazy type constructor. This is a built-in type, and the details are handled by Idris automatically most of the time. For instance, when passing arguments to (&&), we don't have to manually wrap the values in some data constructor. A lazy function argument will only be evaluated at the moment it is required in the function's implementation, for instance, because it is being pattern matched on, or it is being passed as a strict argument to another function. In the implementation of (&&), the pattern match happens on the first argument, so the second will only be evaluated if the first argument is True and the second is returned as the function's (strict) result.

There are two utility functions for working with lazy evaluation: Function delay wraps a value in the Lazy data type. Note that the argument of delay is strict, so the following might take several seconds to print its result:

Tutorial.Folds> False && (delay $ length [1..10000] > 100)
False

In addition, there is function force, which forces evaluation of a Lazy value.
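
We can use Lazy in our own function signatures in just the same way. Here is a small sketch (orDefault is a hypothetical name chosen for illustration) of a function whose fallback value is only computed if it is actually needed:

```idris
%default total

-- The second argument is only evaluated in the `Nothing` case.
-- Idris inserts the call to `force` for us when we return `def`
-- as a strict value of type `a`.
orDefault : Maybe a -> Lazy a -> a
orDefault (Just v) _   = v
orDefault Nothing  def = def
```

At the call site, no wrapping is necessary either: In an expression such as orDefault (Just 1) (length [1..10000000000]), the expensive second argument is delayed automatically and, in this case, never evaluated.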

Lazy Evaluation and Right Folds

We will now learn how to make use of rightFold and lazy evaluation to implement folds, which can break out from iteration early. Note that in the implementation of rightFold the result of folding over the remainder of the list is passed as an argument to the accumulator (instead of the result of invoking the accumulator being used in the recursive call):

rightFold acc st (x :: xs) = acc x (rightFold acc st xs)

If the second argument of acc were lazily evaluated, it would be possible to abort the computation of acc's result without having to iterate till the end of the list:

foldHead : List a -> Maybe a
foldHead = force . rightFold first Nothing
  where first : a -> Lazy (Maybe a) -> Lazy (Maybe a)
        first v _ = Just v

Note how Idris takes care of the bookkeeping of laziness most of the time. (It doesn't handle the curried invocation of rightFold correctly, though, so we either must pass on the list argument of foldHead explicitly, or compose the curried function with force to get the types right.)

In order to verify that this works correctly, we need a debugging utility called trace from module Debug.Trace. This "function" allows us to print debugging messages to the console at certain points in our pure code. Please note that this is for debugging purposes only and should never be left lying around in production code, as, strictly speaking, printing stuff to the console breaks referential transparency.

Here is an adjusted version of foldHead, which prints "folded" to standard output every time utility function first is being invoked:

foldHeadTraced : List a -> Maybe a
foldHeadTraced = force . rightFold first Nothing
  where first : a -> Lazy (Maybe a) -> Lazy (Maybe a)
        first v _ = trace "folded" (Just v)

In order to test this at the REPL, we need to know that trace uses unsafePerformIO internally and therefore will not reduce during evaluation. We have to resort to the :exec command to see this in action at the REPL:

Tutorial.Folds> :exec printLn $ foldHeadTraced [1..10]
folded
Just 1

As you can see, although the list holds ten elements, first is only called once resulting in a considerable increase of efficiency.

Let's see what happens if we change the implementation of first to use strict evaluation:

foldHeadTracedStrict : List a -> Maybe a
foldHeadTracedStrict = rightFold first Nothing
  where first : a -> Maybe a -> Maybe a
        first v _ = trace "folded" (Just v)

Although we don't use the second argument in the implementation of first, it is still being evaluated before evaluating the body of first, because Idris - unlike Haskell! - defaults to use strict semantics. Here's how this behaves at the REPL:

Tutorial.Folds> :exec printLn $ foldHeadTracedStrict [1..10]
folded
folded
folded
folded
folded
folded
folded
folded
folded
folded
Just 1

While this technique can sometimes lead to very elegant code, always remember that rightFold is not stack safe in the general case. So, unless your accumulator is guaranteed to return a result after not too many iterations, consider implementing your function tail recursively with an explicit pattern match. Your code will be slightly more verbose, but with the guaranteed benefit of stack safety.
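
As a further illustration of breaking out early, here is a sketch of a membership test. It uses a variation of the right fold, hypothetically called lazyFoldr, whose type makes the laziness of the accumulated state explicit. Neither lazyFoldr nor containsLazy is part of the tutorial's sources:

```idris
%default total

-- A right fold in which the result of folding the remainder of the
-- list is passed to the accumulator as a lazy value.
lazyFoldr : (acc : el -> Lazy state -> state) -> Lazy state -> List el -> state
lazyFoldr _   st []        = st
lazyFoldr acc st (x :: xs) = acc x (lazyFoldr acc st xs)

-- The second argument of `(||)` is lazy as well, so the rest of the
-- list is only inspected if the current element does not match.
containsLazy : Eq a => a -> List a -> Bool
containsLazy v xs = lazyFoldr check False xs
  where check : a -> Lazy Bool -> Bool
        check x rest = x == v || rest
```

The caveat about stack safety applies here, too: If the value is missing from a very long list, all elements are visited by non-tail recursive calls.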

Folds and Monoids

Left and right folds share a common pattern: In both cases, we start with an initial state value and use an accumulator function for combining the current state with the current element. This principle of combining values after starting from an initial value lies at the heart of an interface we've already learned about: Monoid. It therefore makes sense to fold a list over a monoid:

foldMapList : Monoid m => (a -> m) -> List a -> m
foldMapList f = leftFold (\vm,va => vm <+> f va) neutral

Note how, with foldMapList, we no longer need to pass an accumulator function. All we need is a conversion from the element type to a type with an implementation of Monoid. As we have already seen in the chapter about interfaces, there are many monoids in functional programming, and therefore, foldMapList is an incredibly useful function.

We could make this even shorter: If the elements in our list already are of a type with a monoid implementation, we don't even need a conversion function to collapse the list:

concatList : Monoid m => List m -> m
concatList = foldMapList id
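
Here are two small use cases, sketched with a self-contained copy of the fold (hypothetically named foldMapL, with the accumulator loop written out explicitly). The helper names duplicateEach and joinShown are made up for this example:

```idris
%default total

-- Tail recursive fold over a monoid, equivalent in spirit to
-- `foldMapList` from the text.
foldMapL : Monoid m => (a -> m) -> List a -> m
foldMapL f = go neutral
  where go : m -> List a -> m
        go acc []        = acc
        go acc (x :: xs) = go (acc <+> f x) xs

-- Lists form a monoid under concatenation.
duplicateEach : List a -> List a
duplicateEach xs = foldMapL (\x => [x,x]) xs

-- Strings form a monoid under concatenation, too.
joinShown : Show a => List a -> String
joinShown xs = foldMapL show xs
```

For instance, duplicateEach [1,2] evaluates to [1, 1, 2, 2], and joinShown [1,2,3] yields "123".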

Stop Using List for Everything

And here we are, finally, looking at a large pile of utility functions all dealing in some way with the concept of collapsing (or folding) a list of values into a single result. But all of these folding functions are just as useful when working with vectors, with non-empty lists, with rose trees, even with single-value containers like Maybe, Either e, or Identity. Heck, for the sake of completeness, they are even useful when working with zero-value containers like Control.Applicative.Const e! And since there are so many of these functions, we'd better look out for an essential set of them in terms of which we can implement all the others, and wrap up the whole bunch in an interface. This interface is called Foldable, and is available from the Prelude. When you look at its definition in the REPL (:doc Foldable), you'll see that it consists of six essential functions:

  • foldr, for folds from the right
  • foldl, for folds from the left
  • null, for testing if the container is empty or not
  • foldlM, for effectful folds in a monad
  • toList, for converting the container to a list of values
  • foldMap, for folding over a monoid

For a minimal implementation of Foldable, it is sufficient to only implement foldr. However, consider implementing all six functions manually, because folds over container types are often performance critical operations, and each of them should be optimized accordingly. For instance, implementing toList in terms of foldr for List just makes no sense, as this is a non-tail recursive function running in linear time complexity, while a hand-written implementation can just return its argument without any modifications.
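
As a sketch of such a minimal implementation, consider a simple binary tree. The type BTree and its constructors are hypothetical and are not among the data types used in the exercises. Only foldr is implemented here, and all other functions of the interface fall back to their default implementations:

```idris
%default total

-- A binary tree storing values at its branches.
data BTree : Type -> Type where
  Leaf   : BTree a
  Branch : BTree a -> a -> BTree a -> BTree a

-- In-order fold: first the right subtree is folded into the state,
-- then the value at the branch, and finally the left subtree.
Foldable BTree where
  foldr _ acc Leaf           = acc
  foldr f acc (Branch l v r) = foldr f (f v (foldr f acc r)) l

sample : BTree Nat
sample = Branch (Branch Leaf 1 Leaf) 2 (Branch Leaf 3 Leaf)
```

With this in place, toList sample evaluates to [1, 2, 3]. This implementation is not tail recursive, so for large trees a hand-written foldl or toList might be worth the effort.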

Exercises part 3

In these exercises, you are going to implement Foldable for different data types. Make sure to try and manually implement all six functions of the interface.

  1. Implement Foldable for Crud i:

    data Crud : (i : Type) -> (a : Type) -> Type where
      Create : (value : a) -> Crud i a
      Update : (id : i) -> (value : a) -> Crud i a
      Read   : (id : i) -> Crud i a
      Delete : (id : i) -> Crud i a
    
  2. Implement Foldable for Response e i:

    data Response : (e, i, a : Type) -> Type where
      Created : (id : i) -> (value : a) -> Response e i a
      Updated : (id : i) -> (value : a) -> Response e i a
      Found   : (values : List a) -> Response e i a
      Deleted : (id : i) -> Response e i a
      Error   : (err : e) -> Response e i a
    
  3. Implement Foldable for List01. Use tail recursion in the implementations of toList, foldMap, and foldl.

    data List01 : (nonEmpty : Bool) -> Type -> Type where
      Nil  : List01 False a
      (::) : a -> List01 False a -> List01 ne a
    
  4. Implement Foldable for Tree. There is no need to use tail recursion in your implementations, but your functions must be accepted by the totality checker, and you are not allowed to cheat by using assert_smaller or assert_total.

    Hint: You can test the correct behavior of your implementations by running the same folds on the result of treeToVect and verify that the outcome is the same.

  5. Like Functor and Applicative, Foldable composes: The product and composition of two foldable container types are again foldable container types. Prove this by implementing Foldable for Comp and Product:

    record Comp (f,g : Type -> Type) (a : Type) where
      constructor MkComp
      unComp  : f (g a)
    
    record Product (f,g : Type -> Type) (a : Type) where
      constructor MkProduct
      fst : f a
      snd : g a
    

Conclusion

We learned a lot about recursion, totality checking, and folds in this chapter, all of which are important concepts in pure functional programming in general. Wrapping one's head around recursion takes time and experience. Therefore - as usual - try to solve as many exercises as you can.

In the next chapter, we will take the concept of iterating over container types one step further and look at effectful data traversals.

Effectful Traversals

In this chapter, we are going to bring our treatment of the higher-kinded interfaces in the Prelude to an end. In order to do so, we will continue developing the CSV reader we started implementing in chapter Functor and Friends. I moved some of the data types and interfaces from that chapter to their own modules, so we can import them here without the need to start from scratch.

Note that unlike in our original CSV reader, we will use Validated instead of Either for handling exceptions, since this will allow us to accumulate all errors when reading a CSV file.

Reading CSV Tables

module Tutorial.Traverse.CSV

import Data.HList
import Data.IORef
import Data.List1
import Data.String
import Data.Validated
import Data.Vect
import Text.CSV

%default total

We stopped developing our CSV reader with function hdecode, which allows us to read a single line in a CSV file and decode it to a heterogeneous list. As a reminder, here is how to use hdecode at the REPL:

Tutorial.Traverse> hdecode [Bool,String,Bits8] 1 "f,foo,12"
Valid [False, "foo", 12]

The next step will be to parse a whole CSV table, represented as a list of strings, where each string corresponds to one of the table's rows. We will go about this stepwise, as there are several aspects to doing this properly. What we are looking for - eventually - is a function of the following type (we are going to implement several versions of this function, hence the numbering):

hreadTable1 :  (0 ts : List Type)
            -> CSVLine (HList ts)
            => List String
            -> Validated CSVError (List $ HList ts)

In our first implementation, we are not going to care about line numbers:

hreadTable1 _  []        = pure []
hreadTable1 ts (s :: ss) = [| hdecode ts 0 s :: hreadTable1 ts ss |]

Note how we can just use applicative syntax in the implementation of hreadTable1. To make this clearer, I used pure [] on the first line instead of the more specific Valid []. In fact, if we used Either or Maybe instead of Validated for error handling, the implementation of hreadTable1 would look exactly the same.

The question is: Can we extract a pattern to abstract over from this observation? What we do in hreadTable1 is run an effectful computation of type String -> Validated CSVError (HList ts) over a list of strings, so that the result is a list of HList ts wrapped in a Validated CSVError. The first step of abstraction should be to use type parameters for the input and output: Run a computation of type a -> Validated CSVError b over a list of type List a:

traverseValidatedList :  (a -> Validated CSVError b)
                      -> List a
                      -> Validated CSVError (List b)
traverseValidatedList _ []        = pure []
traverseValidatedList f (x :: xs) = [| f x :: traverseValidatedList f xs |]

hreadTable2 :  (0 ts : List Type)
            -> CSVLine (HList ts)
            => List String
            -> Validated CSVError (List $ HList ts)
hreadTable2 ts = traverseValidatedList (hdecode ts 0)

But our observation was that the implementation of hreadTable1 would be exactly the same if we used Either CSVError or Maybe as our effect types instead of Validated CSVError. So, the next step should be to abstract over the effect type. We note that we used applicative syntax (idiom brackets and pure) in our implementation, so we will need to write a function with an Applicative constraint on the effect type:

traverseList :  Applicative f => (a -> f b) -> List a -> f (List b)
traverseList _ []        = pure []
traverseList f (x :: xs) = [| f x :: traverseList f xs |]

hreadTable3 :  (0 ts : List Type)
            -> CSVLine (HList ts)
            => List String
            -> Validated CSVError (List $ HList ts)
hreadTable3 ts = traverseList (hdecode ts 0)

Note how the implementation of traverseList is exactly the same as the one of traverseValidatedList, but the types are more general, and therefore traverseList is much more powerful.
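
To see this generality at work, here is a minimal standalone sketch that runs the same traverseList with Maybe as the effect type. The helper safePred and the function preds are made-up names for illustration and are not part of the CSV reader.

```idris
%default total

traverseList : Applicative f => (a -> f b) -> List a -> f (List b)
traverseList _ []        = pure []
traverseList f (x :: xs) = [| f x :: traverseList f xs |]

-- Hypothetical helper: predecessor of a natural number, failing on zero.
safePred : Nat -> Maybe Nat
safePred Z     = Nothing
safePred (S k) = Just k

-- The very same traversal, now with Maybe as the effect type.
preds : List Nat -> Maybe (List Nat)
preds = traverseList safePred
```

Evaluating preds [1, 2, 3] at the REPL should yield Just [0, 1, 2], while preds [1, 0, 3] should yield Nothing, because a single failing element makes the whole traversal fail.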

Let's give this a go at the REPL:

Tutorial.Traverse> hreadTable3 [Bool,Bits8] ["f,12","t,0"]
Valid [[False, 12], [True, 0]]
Tutorial.Traverse> hreadTable3 [Bool,Bits8] ["f,12","t,1000"]
Invalid (FieldError 0 2 "1000")
Tutorial.Traverse> hreadTable3 [Bool,Bits8] ["1,12","t,1000"]
Invalid (Append (FieldError 0 1 "1") (FieldError 0 2 "1000"))

This works very well already, but note how our error messages do not yet print the correct line numbers. That's not surprising, as we are using a dummy constant in our call to hdecode. We will look at how we can come up with the line numbers on the fly when we talk about stateful computations later in this chapter. For now, we could just manually annotate the lines with their numbers and pass a list of pairs to hreadTable:

hreadTable4 :  (0 ts : List Type)
            -> CSVLine (HList ts)
            => List (Nat, String)
            -> Validated CSVError (List $ HList ts)
hreadTable4 ts = traverseList (uncurry $ hdecode ts)

If this is the first time you came across function uncurry, make sure you have a look at its type and try to figure out why it is used here. There are several utility functions like this in the Prelude, such as curry, uncurry, flip, or even id, all of which can be very useful when working with higher-order functions.
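
As a small illustration of what uncurry does, the following sketch applies a two-argument function to a list of pairs. The names scale and scaled are hypothetical and only serve this example.

```idris
%default total

-- For reference, the Prelude declares:
--   uncurry : (a -> b -> c) -> (a, b) -> c

-- Hypothetical two-argument function.
scale : Nat -> Nat -> Nat
scale factor x = factor * x

-- `uncurry scale` accepts a pair instead of two separate arguments,
-- so it can be mapped over a list of pairs.
scaled : List Nat
scaled = map (uncurry scale) [(2, 10), (3, 5)]
```

Here, scaled evaluates to [20, 15]. In hreadTable4, uncurry plays the same role: it turns the curried hdecode ts into a function that accepts a (Nat, String) pair.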

While not perfect, this version at least allows us to verify at the REPL that the line numbers are passed to the error messages correctly:

Tutorial.Traverse> hreadTable4 [Bool,Bits8] [(1,"t,1000"),(2,"1,100")]
Invalid (Append (FieldError 1 2 "1000") (FieldError 2 1 "1"))

Interface Traversable

Now, here is an interesting observation: We can implement a function like traverseList for other container types as well. You might think that's obvious, given that we can convert container types to lists via function toList from interface Foldable. However, while going via List might be feasible on some occasions, it is undesirable in general, as we lose typing information. For instance, here is such a function for Vect:

traverseVect' : Applicative f => (a -> f b) -> Vect n a -> f (List b)
traverseVect' fun = traverseList fun . toList

Note how we lost all information about the structure of the original container type. What we are looking for is a function like traverseVect', which keeps this type level information: The result should be a vector of the same length as the input.

traverseVect : Applicative f => (a -> f b) -> Vect n a -> f (Vect n b)
traverseVect _   []        = pure []
traverseVect fun (x :: xs) = [| fun x :: traverseVect fun xs |]

That's much better! And as I wrote above, we can easily get the same for other container types like List1, SnocList, Maybe, and so on. As usual, some derived functions will follow immediately from traverseXY. For instance:

sequenceList : Applicative f => List (f a) -> f (List a)
sequenceList = traverseList id

All of this calls for a new interface, which is called Traversable and is exported from the Prelude. Here is its definition (with primes for disambiguation):

interface Functor t => Foldable t => Traversable' t where
  traverse' : Applicative f => (a -> f b) -> t a -> f (t b)

Function traverse is one of the most abstract and versatile functions available from the Prelude. Just how powerful it is will only become clear once you start using it over and over again in your code. However, it will be the goal of the remainder of this chapter to show you several diverse and interesting use cases.

For now, we will quickly focus on the degree of abstraction. Function traverse is parameterized over no less than four parameters: The container type t (List, Vect n, Maybe, to just name a few), the effect type (Validated e, IO, Maybe, and so on), the input element type a, and the output element type b. Considering that the libraries bundled with the Idris project export more than 30 data types with an implementation of Applicative and more than ten traversable container types, there are literally hundreds of combinations for traversing a container with an effectful computation. This number gets even larger once we realize that traversable containers - like applicative functors - are closed under composition (see the exercises and the final section in this chapter).
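
The following sketch makes a few of these combinations concrete by reusing one effectful function with three container types, and then switching the effect type. All names defined here (nonZero, checkList, checkVect, checkMaybe, headsOf) are hypothetical; head' is assumed to come from Data.List in base.

```idris
import Data.List
import Data.Vect

%default total

-- Hypothetical effectful check: zero is rejected with an error message.
nonZero : Nat -> Either String Nat
nonZero Z = Left "zero"
nonZero n = Right n

-- One effectful function, three different container types.
checkList : List Nat -> Either String (List Nat)
checkList = traverse nonZero

checkVect : Vect n Nat -> Either String (Vect n Nat)
checkVect = traverse nonZero

checkMaybe : Maybe Nat -> Either String (Maybe Nat)
checkMaybe = traverse nonZero

-- Same container type (List), but Maybe as the effect type.
headsOf : List (List Nat) -> Maybe (List Nat)
headsOf = traverse head'
```

For instance, checkVect [1, 0] should give Left "zero", and headsOf [[1, 2], [3]] should give Just [1, 3].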

Traversable Laws

There are two laws function traverse must obey:

  • traverse (Id . f) = Id . map f: Traversing over the Identity monad is just functor map.
  • traverse (MkComp . map f . g) = MkComp . map (traverse f) . traverse g: Traversing with a composition of effects must be the same when being done in a single traversal (left hand side) or a sequence of two traversals (right hand side).

Since map id = id (functor's identity law), we can derive from the first law that traverse Id = Id. This means that traverse must not change the size or shape of the container type, nor is it allowed to change the order of elements.
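
To illustrate what the first law rules out, here is a sketch of a deliberately unlawful traversal for lists. The names badTraverse, lawful, and lawBroken are made up; Id and runIdentity are assumed to come from Control.Monad.Identity in base.

```idris
import Control.Monad.Identity

%default total

-- A deliberately unlawful "traversal": it reverses the resulting list.
badTraverse : Applicative f => (a -> f b) -> List a -> f (List b)
badTraverse g = map reverse . traverse g

-- The Prelude's traverse satisfies `traverse Id = Id`:
-- we get the input list back unchanged.
lawful : List Nat
lawful = runIdentity (traverse Id [1, 2, 3])

-- badTraverse changes the order of elements and thus violates the law.
lawBroken : List Nat
lawBroken = runIdentity (badTraverse Id [1, 2, 3])
```

While lawful evaluates to [1, 2, 3], lawBroken evaluates to [3, 2, 1], even though both type check equally well. The laws are therefore obligations on the implementer that the types alone do not enforce.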

Exercises part 1

  1. It is interesting that Traversable has a Functor constraint. Prove that every Traversable is automatically a Functor by implementing map in terms of traverse.

    Hint: Remember Control.Monad.Identity.

  2. Likewise, prove that every Traversable is a Foldable by implementing foldMap in terms of traverse.

    Hint: Remember Control.Applicative.Const.

  3. To gain some routine, implement Traversable' for List1, Either e, and Maybe.

  4. Implement Traversable for List01 ne:

    data List01 : (nonEmpty : Bool) -> Type -> Type where
      Nil  : List01 False a
      (::) : a -> List01 False a -> List01 ne a
    
  5. Implement Traversable for rose trees. Try to satisfy the totality checker without cheating.

    record Tree a where
      constructor Node
      value  : a
      forest : List (Tree a)
    
  6. Implement Traversable for Crud i:

    data Crud : (i : Type) -> (a : Type) -> Type where
      Create : (value : a) -> Crud i a
      Update : (id : i) -> (value : a) -> Crud i a
      Read   : (id : i) -> Crud i a
      Delete : (id : i) -> Crud i a
    
  7. Implement Traversable for Response e i:

    data Response : (e, i, a : Type) -> Type where
      Created : (id : i) -> (value : a) -> Response e i a
      Updated : (id : i) -> (value : a) -> Response e i a
      Found   : (values : List a) -> Response e i a
      Deleted : (id : i) -> Response e i a
      Error   : (err : e) -> Response e i a
    
  8. Like Functor, Applicative and Foldable, Traversable is closed under composition. Prove this by implementing Traversable for Comp and Product:

    record Comp (f,g : Type -> Type) (a : Type) where
      constructor MkComp
      unComp  : f (g a)
    
    record Product (f,g : Type -> Type) (a : Type) where
      constructor MkProduct
      fst : f a
      snd : g a
    

Programming with State

module Tutorial.Traverse.State

import Data.HList
import Data.IORef
import Data.List1
import Data.String
import Data.Validated
import Data.Vect
import Text.CSV

%default total

Let's go back to our CSV reader. In order to get reasonable error messages, we'd like to tag each line with its index:

zipWithIndex : List a -> List (Nat, a)

It is, of course, very easy to come up with an ad hoc implementation for this:

zipWithIndex = go 1
  where go : Nat -> List a -> List (Nat,a)
        go _ []        = []
        go n (x :: xs) = (n,x) :: go (S n) xs

While this is perfectly fine, we should still note that we might want to do the same thing with the elements of trees, vectors, non-empty lists and so on. And again, we are interested in whether there is some form of abstraction we can use to describe such computations.

Mutable References in Idris

Let us for a moment think about how we'd do such a thing in an imperative language. There, we'd probably define a local (mutable) variable to keep track of the current index, which would then be increased while iterating over the list in a for- or while-loop.

In Idris, there is no such thing as mutable state. Or is there? Remember how we used a mutable reference to simulate a database connection in an earlier exercise. There, we actually used some truly mutable state. However, since accessing or modifying a mutable variable is not a referentially transparent operation, such actions have to be performed within IO. Other than that, nothing keeps us from using mutable variables in our code. The necessary functionality is available from module Data.IORef from the base library.

As a quick exercise, try to implement a function, which - given an IORef Nat - pairs a value with the current index and increases the index afterwards.

Here's how I would do this:

pairWithIndexIO : IORef Nat -> a -> IO (Nat,a)
pairWithIndexIO ref va = do
  ix <- readIORef ref
  writeIORef ref (S ix)
  pure (ix,va)

Note that every time we run pairWithIndexIO ref, the natural number stored in ref is incremented by one. Also, look at the type of pairWithIndexIO ref: a -> IO (Nat,a). We want to apply this effectful computation to each element in a list, which should lead to a new list wrapped in IO, since all of this describes a single computation with side effects. But this is exactly what function traverse does: Our input type is a, our output type is (Nat,a), our container type is List, and the effect type is IO!

zipListWithIndexIO : IORef Nat -> List a -> IO (List (Nat,a))
zipListWithIndexIO ref = traverse (pairWithIndexIO ref)

Now this is really powerful: We could apply the same function to any traversable data structure. It therefore makes absolutely no sense to specialize zipListWithIndexIO to lists only:

zipWithIndexIO : Traversable t => IORef Nat -> t a -> IO (t (Nat,a))
zipWithIndexIO ref = traverse (pairWithIndexIO ref)

To please our intellectual minds even more, here is the same function in point-free style:

zipWithIndexIO' : Traversable t => IORef Nat -> t a -> IO (t (Nat,a))
zipWithIndexIO' = traverse . pairWithIndexIO

All that's left to do now is to initialize a new mutable variable before passing it to zipWithIndexIO:

zipFromZeroIO : Traversable t => t a -> IO (t (Nat,a))
zipFromZeroIO ta = newIORef 0 >>= (`zipWithIndexIO` ta)

Quickly, let's give this a go at the REPL:

> :exec zipFromZeroIO {t = List} ["hello", "world"] >>= printLn
[(0, "hello"), (1, "world")]
> :exec zipFromZeroIO (Just 12) >>= printLn
Just (0, 12)
> :exec zipFromZeroIO {t = Vect 2} ["hello", "world"] >>= printLn
[(0, "hello"), (1, "world")]

Thus, we solved the problem of tagging each element with its index once and for all for all traversable container types.

The State Monad

Alas, while the solution presented above is elegant and performs very well, it still carries its IO stain, which is fine if we are already in IO land, but unacceptable otherwise. We do not want to make our otherwise pure functions much harder to test and reason about just for a simple case of stateful element tagging.

Luckily, there is an alternative to using a mutable reference, which allows us to keep our computations pure and untainted. However, it is not easy to come upon this alternative on one's own, and it can be hard to figure out what's going on here, so I'll try to introduce this slowly. We first need to ask ourselves what the essence of a "stateful" but otherwise pure computation is. There are two essential ingredients:

  1. Access to the current state. In case of a pure function, this means that the function should take the current state as one of its arguments.
  2. Ability to communicate the updated state to later stateful computations. In case of a pure function, this means that the function will return a pair of values: The computation's result plus the updated state.

These two prerequisites lead to the following generic type for a pure, stateful computation operating on state type st and producing values of type a:

Stateful : (st : Type) -> (a : Type) -> Type
Stateful st a = st -> (st, a)

Our use case is pairing elements with indices, which can be implemented as a pure, stateful computation like so:

pairWithIndex' : a -> Stateful Nat (Nat,a)
pairWithIndex' v index = (S index, (index,v))

Note how we increment the index and return the incremented value as the new state, while at the same time pairing the first argument with the original index.
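
To get a feeling for what such computations involve, here is a standalone sketch that runs two of them in sequence by threading the state through by hand. The names tag and tagTwo are hypothetical; tag is the same function as pairWithIndex', renamed so the snippet compiles on its own.

```idris
%default total

Stateful : (st : Type) -> (a : Type) -> Type
Stateful st a = st -> (st, a)

-- Same as pairWithIndex': pair a value with the current index
-- and return the incremented index as the new state.
tag : a -> Stateful Nat (Nat, a)
tag v index = (S index, (index, v))

-- Tag two values in a row. The state returned by the first
-- computation (s1) must be passed on to the second one.
tagTwo : a -> a -> Stateful Nat ((Nat, a), (Nat, a))
tagTwo x y s0 =
  let (s1, px) = tag x s0
      (s2, py) = tag y s1
   in (s2, (px, py))
```

For example, tagTwo "a" "b" 0 evaluates to (2, ((0, "a"), (1, "b"))). Passing s0 instead of s1 to the second call would still type check but would tag both values with the same index, which is exactly the kind of mistake the abstractions below help us avoid.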

Now, here is an important thing to note: While Stateful is a useful type alias, Idris in general does not resolve interface implementations for function types. If we want to write a small library of utility functions around such a type, it is therefore best to wrap it in a single-constructor data type and use this as our building block for writing more complex computations. We therefore introduce record State as a wrapper for pure, stateful computations:

public export
record State st a where
  constructor ST
  runST : st -> (st,a)

We can now implement pairWithIndex in terms of State like so:

export
pairWithIndex : a -> State Nat (Nat,a)
pairWithIndex v = ST $ \index => (S index, (index, v))

In addition, we can define some more utility functions. Here's one for getting the current state without modifying it (this corresponds to readIORef):

get : State st st
get = ST $ \s => (s,s)

Here are two others, for overwriting the current state. These correspond to writeIORef and modifyIORef:

put : st -> State st ()
put v = ST $ \_ => (v,())

modify : (st -> st) -> State st ()
modify f = ST $ \v => (f v,())

Finally, we can define three functions in addition to runST for running stateful computations:

runState : st -> State st a -> (st, a)
runState = flip runST

export
evalState : st -> State st a -> a
evalState s = snd . runState s

execState : st -> State st a -> st
execState s = fst . runState s
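
The following standalone sketch shows what these utilities compute for some concrete initial states. The record and helper definitions are repeated from above so that the snippet compiles on its own; readOnly, overwrite, and bumped are hypothetical names.

```idris
%default total

record State st a where
  constructor ST
  runST : st -> (st, a)

get : State st st
get = ST $ \s => (s, s)

put : st -> State st ()
put v = ST $ \_ => (v, ())

modify : (st -> st) -> State st ()
modify f = ST $ \v => (f v, ())

runState : st -> State st a -> (st, a)
runState = flip runST

execState : st -> State st a -> st
execState s = fst . runState s

-- get leaves the state untouched and also returns it as the result.
readOnly : (Nat, Nat)
readOnly = runState 5 get

-- put discards the old state.
overwrite : Nat
overwrite = execState 5 (put 7)

-- modify computes the new state from the old one.
bumped : Nat
bumped = execState 5 (modify (+ 1))
```

Here, readOnly is (5, 5), overwrite is 7, and bumped is 6.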

All of these are useful on their own, but the real power of State s comes from the observation that it is a monad. Before you go on, please spend some time and try implementing Functor, Applicative, and Monad for State s yourself. Even if you don't succeed, you will have an easier time understanding how the implementations below work.

export
Functor (State st) where
  map f (ST run) = ST $ \s => let (s2,va) = run s in (s2, f va)

export
Applicative (State st) where
  pure v = ST $ \s => (s,v)

  ST fun <*> ST val = ST $ \s =>
    let (s2, f)  = fun s
        (s3, va) = val s2
     in (s3, f va)

export
Monad (State st) where
  ST val >>= f = ST $ \s =>
    let (s2, va) = val s
     in runST (f va) s2

This may take some time to digest, so we will come back to it in a slightly advanced exercise. The most important thing to note is that we use every state value exactly once. We must make sure that the updated state is passed to later computations, otherwise the information about state updates is lost. This can best be seen in the implementation of Applicative: The initial state, s, is used in the computation of the function value, which will also return an updated state, s2, which is then used in the computation of the function argument. This will again return an updated state, s3, which is passed on to later stateful computations together with the result of applying f to va.
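
With a Monad implementation in place, the state no longer has to be threaded by hand. The sketch below illustrates this with do notation. To keep it short, it assumes the State type from Control.Monad.State in base, which (as far as this example is concerned) offers the same get, put, and runState as the record defined above, with the state as the first component of the result pair. The names next, threeTags, and result are hypothetical.

```idris
import Control.Monad.State

%default total

-- Return the current counter and increment the state afterwards.
next : State Nat Nat
next = do
  n <- get
  put (S n)
  pure n

-- Three steps in sequence: the updated state is passed along
-- implicitly by the monadic bind.
threeTags : State Nat (Nat, Nat, Nat)
threeTags = do
  a <- next
  b <- next
  c <- next
  pure (a, b, c)

-- Run the computation with an initial state of 10.
result : (Nat, (Nat, Nat, Nat))
result = runState 10 threeTags
```

Here, result should be (13, (10, 11, 12)): each call to next sees the state left behind by the previous one, and the final state is returned alongside the result.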

Exercises part 2

This section consists of two extended exercises, the aim of which is to increase your understanding of the state monad. In the first exercise, we will look at random value generation, a classical application of stateful computations. In the second exercise, we will look at an indexed version of a state monad, which allows us to not only change the state's value but also its type during computations.

  1. Below is the implementation of a simple pseudo-random number generator. We call this a pseudo-random number generator, because the numbers look pretty random but are generated predictably. If we initialize a series of such computations with a truly random seed, most users of our library will not be able to predict the outcome of our computations.

    rnd : Bits64 -> Bits64
    rnd seed = fromInteger
             $ (437799614237992725 * cast seed) `mod` 2305843009213693951
    

    The idea here is that the next pseudo-random number gets calculated from the previous one. But once we think about how we can use these numbers as seeds for computing random values of other types, we realize that these are just stateful computations. We can therefore write down an alias for random value generators as stateful computations:

    Gen : Type -> Type
    Gen = State Bits64
    

    Before we begin, please note that rnd is not a very strong pseudo-random number generator. It will not generate values in the full 64-bit range, nor is it safe to use in cryptographic applications. It is sufficient for our purposes in this chapter, however. Note also that we could replace rnd with a stronger generator without any changes to the functions you will implement as part of this exercise.

    1. Implement bits64 in terms of rnd. This should return the current state, updating it afterwards by invoking function rnd. Make sure the state is properly updated, otherwise this won't behave as expected.

      bits64 : Gen Bits64
      

      This will be our only primitive generator, from which we will derive all the others. Therefore, before you continue, quickly test your implementation of bits64 at the REPL:

      Solutions.Traverse> runState 100 bits64
      (2274787257952781382, 100)
      
    2. Implement range64 for generating random values in the range [0,upper]. Hint: Use bits64 and mod in your implementation but make sure to deal with the fact that mod x upper produces values in the range [0,upper).

      range64 : (upper : Bits64) -> Gen Bits64
      

      Likewise, implement interval64 for generating values in the range [min a b, max a b]:

      interval64 : (a,b : Bits64) -> Gen Bits64
      

      Finally, implement interval for arbitrary integral types.

      interval : Num n => Cast n Bits64 => (a,b : n) -> Gen n
      

      Note that interval will not generate all possible values in the given interval but only such values with a Bits64 representation in the range [0,2305843009213693950].

    3. Implement a generator for random boolean values.

    4. Implement a generator for Fin n. You'll have to think carefully about getting this one to typecheck and be accepted by the totality checker without cheating. Note: Have a look at function Data.Fin.natToFin.

    5. Implement a generator for selecting a random element from a vector of values. Use the generator from exercise 4 in your implementation.

    6. Implement vect and list. In case of list, the first argument should be used to randomly determine the length of the list.

      vect : {n : _} -> Gen a -> Gen (Vect n a)
      
      list : Gen Nat -> Gen a -> Gen (List a)
      

      Use vect to implement utility function testGen for testing your generators at the REPL:

      testGen : Bits64 -> Gen a -> Vect 10 a
      
    7. Implement choice.

      choice : {n : _} -> Vect (S n) (Gen a) -> Gen a
      
    8. Implement either.

      either : Gen a -> Gen b -> Gen (Either a b)
      
    9. Implement a generator for printable ASCII characters. These are characters with ASCII codes in the interval [32,126]. Hint: Function chr from the Prelude will be useful here.

    10. Implement a generator for strings. Hint: Function pack from the Prelude might be useful for this.

      string : Gen Nat -> Gen Char -> Gen String
      
    11. We shouldn't forget about our ability to encode interesting things in the types in Idris, so, for a challenge and without further ado, implement hlist (note the distinction between HListF and HList). If you are rather new to dependent types, this might take a moment to digest, so don't forget to use holes.

      data HListF : (f : Type -> Type) -> (ts : List Type) -> Type where
        Nil  : HListF f []
        (::) : (x : f t) -> (xs : HListF f ts) -> HListF f (t :: ts)
      
      hlist : HListF Gen ts -> Gen (HList ts)
      
    12. Generalize hlist to work with any applicative functor, not just Gen.

    If you arrived here, please realize how we can now generate pseudo-random values for most primitives, as well as regular sum- and product types. Here is an example REPL session:

    > testGen 100 $ hlist [bool, printableAscii, interval 0 127]
    [[True, ';', 5],
     [True, '^', 39],
     [False, 'o', 106],
     [True, 'k', 127],
     [False, ' ', 11],
     [False, '~', 76],
     [True, 'M', 11],
     [False, 'P', 107],
     [True, '5', 67],
     [False, '8', 9]]
    

    Final remarks: Pseudo-random value generators play an important role in property based testing libraries like QuickCheck or Hedgehog. The idea of property based testing is to test predefined properties of pure functions against a large number of randomly generated arguments, to gain confidence that these properties hold for all possible arguments. One example would be a test for verifying that the result of reversing a list twice equals the original list. While it is possible to prove many of the simpler properties in Idris directly without the need for tests, this is no longer possible as soon as functions are involved that don't reduce during unification, such as foreign function calls or functions not publicly exported from other modules.

  2. While State s a gives us a convenient way to talk about stateful computations, it only allows us to mutate the state's value but not its type. For instance, the following function cannot be encapsulated in State because the type of the state changes:

    uncons : Vect (S n) a -> (Vect n a, a)
    uncons (x :: xs) = (xs, x)
    

    Your task is to come up with a new state type allowing for such changes (sometimes referred to as an indexed state data type). The goal of this exercise is to also sharpen your skills in expressing things at the type level including derived function types and interfaces. Therefore, I will give only little guidance on how to go about this. If you get stuck, feel free to peek at the solutions but make sure to only look at the types at first.

    1. Come up with a parameterized data type for encapsulating stateful computations where the input and output state type can differ. It must be possible to wrap uncons in a value of this type.

    2. Implement Functor for your indexed state type.

    3. It is not possible to implement Applicative for this indexed state type (but see also sub-exercise 7 below). Still, implement the necessary functions to use it with idiom brackets.

    4. It is not possible to implement Monad for this indexed state type. Still, implement the necessary functions to use it in do blocks.

    5. Generalize the functions from exercises 3 and 4 with two new interfaces IxApplicative and IxMonad and provide implementations of these for your indexed state data type.

    6. Implement functions get, put, modify, runState, evalState, and execState for the indexed state data type. Make sure to adjust the type parameters where necessary.

    7. Show that your indexed state type is strictly more powerful than State by implementing Applicative and Monad for it.

      Hint: Keep the input and output state identical. Note also, that you might need to implement join manually if Idris has trouble inferring the types correctly.

    Indexed state types can be useful when we want to make sure that stateful computations are combined in the correct sequence, or that scarce resources get cleaned up properly. We might get back to such use cases in later examples.

The Power of Composition

module Tutorial.Traverse.Composition

import Tutorial.Traverse.State

import Data.HList
import Data.IORef
import Data.List1
import Data.String
import Data.Validated
import Data.Vect
import Text.CSV

%default total

After our excursion into the realms of stateful computations, we will go back and combine mutable state with error accumulation to tag and read CSV lines in a single traversal. We already defined pairWithIndex for tagging lines with their indices. We also have uncurry $ hdecode ts for decoding single tagged lines. We can now combine the two effects in a single computation:

tagAndDecode :  (0 ts : List Type)
             -> CSVLine (HList ts)
             => String
             -> State Nat (Validated CSVError (HList ts))
tagAndDecode ts s = uncurry (hdecode ts) <$> pairWithIndex s

Now, as we learned before, applicative functors are closed under composition, and the result of tagAndDecode is a nesting of two applicatives: State Nat and Validated CSVError. The Prelude exports a corresponding named interface implementation (Prelude.Applicative.Compose), which we can use for traversing a list of strings with tagAndDecode. Remember, that we have to provide named implementations explicitly. Since traverse has the applicative functor as its second constraint, we also need to provide the first constraint (Traversable) explicitly. But this is going to be the unnamed default implementation! To get our hands on such a value, we can use the %search pragma:

readTable :  (0 ts : List Type)
          -> CSVLine (HList ts)
          => List String
          -> Validated CSVError (List $ HList ts)
readTable ts = evalState 1 . traverse @{%search} @{Compose} (tagAndDecode ts)

This tells Idris to use the default implementation for the Traversable constraint, and Prelude.Applicative.Compose for the Applicative constraint. While this syntax is not very nice, it doesn't come up too often, and if it does, we can improve things by providing custom functions for better readability:

traverseComp : Traversable t
             => Applicative f
             => Applicative g
             => (a -> f (g b))
             -> t a
             -> f (g (t b))
traverseComp = traverse @{%search} @{Compose}

readTable' :  (0 ts : List Type)
           -> CSVLine (HList ts)
           => List String
           -> Validated CSVError (List $ HList ts)
readTable' ts = evalState 1 . traverseComp (tagAndDecode ts)

Note how this allows us to combine two computational effects (mutable state and error accumulation) in a single list traversal.
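
Since readTable depends on the CSV modules of this project, here is a reduced, standalone analogue of the same technique that can be tried in isolation. It repeats a minimal State record with its Functor and Applicative implementations, and uses Either Nat in place of Validated CSVError, so errors are not accumulated: only the index of the first offending element is reported. The names checkAt, tagAndCheck, and checkAll are hypothetical.

```idris
%default total

record State st a where
  constructor ST
  runST : st -> (st, a)

evalState : st -> State st a -> a
evalState s (ST run) = snd (run s)

Functor (State st) where
  map f (ST run) = ST $ \s => let (s2, va) = run s in (s2, f va)

Applicative (State st) where
  pure v = ST $ \s => (s, v)

  ST fun <*> ST val = ST $ \s =>
    let (s2, f)  = fun s
        (s3, va) = val s2
     in (s3, f va)

-- Hypothetical check: zero is rejected, reporting the given index.
checkAt : Nat -> Nat -> Either Nat Nat
checkAt ix Z = Left ix
checkAt _  n = Right n

-- Outer effect (State Nat): fetch and increment the index.
-- Inner effect (Either Nat): fail with the index of a zero.
tagAndCheck : Nat -> State Nat (Either Nat Nat)
tagAndCheck n = ST $ \ix => (S ix, checkAt ix n)

-- A single traversal using the composition of both applicatives.
checkAll : List Nat -> Either Nat (List Nat)
checkAll = evalState 1 . traverse @{%search} @{Compose} tagAndCheck
```

With these definitions, checkAll [3, 4] should give Right [3, 4], while checkAll [3, 0, 5] should give Left 2, the one-based index of the offending element.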

But I am not yet done demonstrating the power of composition. As you showed in one of the exercises, Traversable is also closed under composition, so a nesting of traversables is again a traversable. Consider the following use case: When reading a CSV file, we'd like to allow lines to be annotated with additional information. Such annotations could be mere comments, but formatting instructions or other custom data tags are conceivable as well. Annotations are supposed to be separated from the rest of the content by a single hash character (#). We want to keep track of these optional annotations, so we come up with a custom data type encapsulating this distinction:

data Line : Type -> Type where
  Annotated : String -> a -> Line a
  Clean     : a -> Line a

This is just another container type and we can easily implement Traversable for Line (do this yourself as a quick exercise):

Functor Line where
  map f (Annotated s x) = Annotated s $ f x
  map f (Clean x)       = Clean $ f x

Foldable Line where
  foldr f acc (Annotated _ x) = f x acc
  foldr f acc (Clean x)       = f x acc

Traversable Line where
  traverse f (Annotated s x) = Annotated s <$> f x
  traverse f (Clean x)       = Clean <$> f x

Below is a function for parsing a line and putting it in its correct category. For simplicity, we just split the line on hashes: If the result consists of exactly two strings, we treat the second part as an annotation, otherwise we treat the whole line as untagged CSV content.

readLine : String -> Line String
readLine s = case split ('#' ==) s of
  h ::: [t] => Annotated t h
  _         => Clean s

We are now going to implement a function for reading whole CSV tables, keeping track of line annotations:

readCSV :  (0 ts : List Type)
        -> CSVLine (HList ts)
        => String
        -> Validated CSVError (List $ Line $ HList ts)
readCSV ts = evalState 1
           . traverse @{Compose} @{Compose} (tagAndDecode ts)
           . map readLine
           . lines

Let's digest this monstrosity. This is written in point-free style, so we have to read it from end to beginning. First, we split the whole string at line breaks, getting a list of strings (function Data.String.lines). Next, we analyze each line, keeping track of optional annotations (map readLine). This gives us a value of type List (Line String). Since this is a nesting of traversables, we invoke traverse with a named instance from the Prelude: Prelude.Traversable.Compose. Idris can disambiguate this based on the types, so we can drop the namespace prefix. But the effectful computation we run over the list of lines results in a composition of applicative functors, so we also need the named implementation for compositions of applicatives in the second constraint (again without need of an explicit prefix, which would be Prelude.Applicative here). Finally, we evaluate the stateful computation with evalState 1.

Honestly, I wrote all of this without verifying if it works, so let's give it a go at the REPL. I'll provide two example strings for this, a valid one without errors, and an invalid one. I use multiline string literals here, about which I'll talk in more detail in a later chapter. For the moment, note that these allow us to conveniently enter string literals with line breaks:

validInput : String
validInput = """
  f,12,-13.01#this is a comment
  t,100,0.0017
  t,1,100.8#color: red
  f,255,0.0
  f,24,1.12e17
  """

invalidInput : String
invalidInput = """
  o,12,-13.01#another comment
  t,100,0.0017
  t,1,abc
  f,256,0.0
  f,24,1.12e17
  """

And here's how it goes at the REPL:

Tutorial.Traverse> readCSV [Bool,Bits8,Double] validInput
Valid [Annotated "this is a comment" [False, 12, -13.01],
       Clean [True, 100, 0.0017],
       Annotated "color: red" [True, 1, 100.8],
       Clean [False, 255, 0.0],
       Clean [False, 24, 1.12e17]]

Tutorial.Traverse> readCSV [Bool,Bits8,Double] invalidInput
Invalid (Append (FieldError 1 1 "o")
  (Append (FieldError 3 3 "abc") (FieldError 4 2 "256")))

It is pretty amazing how we wrote dozens of lines of code, always being guided by the type- and totality checkers, arriving eventually at a function for parsing properly typed CSV tables with automatic line numbering and error accumulation, all of which just worked on first try.

Exercises part 3

The Prelude provides three additional interfaces for container types parameterized over two type parameters such as Either or Pair: Bifunctor, Bifoldable, and Bitraversable. In the following exercises we get some hands-on experience working with these. You are supposed to look up what functions they provide and how to implement and use them yourself.

  1. Assume we'd like to not only interpret CSV content but also the optional comment tags in our CSV files. For this, we could use a data type such as Tagged:

    data Tagged : (tag, value : Type) -> Type where
      Tag  : tag -> value -> Tagged tag value
      Pure : value -> Tagged tag value
    

    Implement interfaces Functor, Foldable, and Traversable but also Bifunctor, Bifoldable, and Bitraversable for Tagged.

  2. Show that the composition of a bifunctor with two functors such as Either (List a) (Maybe b) is again a bifunctor by defining a dedicated wrapper type for such compositions and writing a corresponding implementation of Bifunctor. Likewise for Bifoldable/Foldable and Bitraversable/Traversable.

  3. Show that the composition of a functor with a bifunctor such as List (Either a b) is again a bifunctor by defining a dedicated wrapper type for such compositions and writing a corresponding implementation of Bifunctor. Likewise for Bifoldable/Foldable and Bitraversable/Traversable.

  4. We are now going to adjust readCSV in such a way that it decodes comment tags and CSV content in a single traversal. We need a new error type to include invalid tags for this:

    data TagError : Type where
      CE         : CSVError -> TagError
      InvalidTag : (line : Nat) -> (tag : String) -> TagError
      Append     : TagError -> TagError -> TagError
    
    Semigroup TagError where (<+>) = Append
    

    For testing, we also define a simple data type for color tags:

    data Color = Red | Green | Blue
    

    You should now implement the following functions, but please note that while readColor will need to access the current line number in case of an error, it must not increase it, as otherwise line numbers will be wrong in the invocation of tagAndDecodeTE.

    readColor : String -> State Nat (Validated TagError Color)
    
    readTaggedLine : String -> Tagged String String
    
    tagAndDecodeTE :  (0 ts : List Type)
                   -> CSVLine (HList ts)
                   => String
                   -> State Nat (Validated TagError (HList ts))
    

    Finally, implement readTagged by using the wrapper type from exercise 3 as well as readColor and tagAndDecodeTE in a call to bitraverse. The implementation will look very similar to readCSV but with some additional wrapping and unwrapping at the right places.

    readTagged :  (0 ts : List Type)
               -> CSVLine (HList ts)
               => String
               -> Validated TagError (List $ Tagged Color $ HList ts)
    

    Test your implementation with some example strings at the REPL.

You can find more examples for functor/bifunctor compositions in Haskell's bifunctors package.

Conclusion

Interface Traversable and its main function traverse are incredibly powerful forms of abstraction - even more so, because both Applicative and Traversable are closed under composition. If you are interested in additional use cases, the publication, which introduced Traversable to Haskell, is a highly recommended read: The Essence of the Iterator Pattern

The base library provides an extended version of the state monad in module Control.Monad.State. We will look at this in more detail when we talk about monad transformers. Please note also, that IO itself is implemented as a simple state monad over an abstract, primitive state type: %World.

Here's a short summary of what we learned in this chapter:

  • Function traverse is used to run effectful computations over container types without affecting their size or shape.
  • We can use IORef as mutable references in stateful computations running in IO.
  • For referentially transparent computations with "mutable" state, the State monad is extremely useful.
  • Applicative functors are closed under composition, so we can run several effectful computations in a single traversal.
  • Traversables are also closed under composition, so we can use traverse to operate on a nesting of containers.

For now, this concludes our introduction of the Prelude's higher-kinded interfaces, which started with the introduction of Functor, Applicative, and Monad, before moving on to Foldable, and - last but definitely not least - Traversable. There's one still missing - Alternative - but this will have to wait a bit longer, because we need to first make our brains smoke with some more type-level wizardry.

Sigma Types

So far in our examples of dependently typed programming, type indices such as the length of vectors were known at compile time or could be calculated from values known at compile time. In real applications, however, such information is often not available until runtime, where values depend on the decisions made by users or the state of the surrounding world. For instance, if we store a file's content as a vector of lines of text, the length of this vector is in general unknown until the file has been loaded into memory. As a consequence, the types of values we work with depend on other values only known at runtime, and we can often only figure out these types by pattern matching on the values they depend on. To express these dependencies, we need so-called sigma types: Dependent pairs and their generalization, dependent records.

Dependent Pairs

module Tutorial.DPair.DPair

import Data.DPair
import Data.Either
import Data.HList
import Data.List
import Data.List1
import Data.Singleton
import Data.String
import Data.Vect

import Text.CSV

%default total

We've already seen several examples of how useful the length index of a vector is to describe more precisely in the types what a function can and can't do. For instance, map or traverse operating on a vector will return a vector of exactly the same length. The types guarantee that this is true, therefore the following function is perfectly safe and provably total:

parseAndDrop : Vect (3 + n) String -> Maybe (Vect n Nat)
parseAndDrop = map (drop 3) . traverse parsePositive

Since the argument of traverse parsePositive is of type Vect (3 + n) String, its result will be of type Maybe (Vect (3 + n) Nat). It is therefore safe to use this in a call to drop 3. Note, how all of this is known at compile time: We encoded the prerequisite that the first argument is a vector of at least three elements in the length index and could derive the length of the result from this.

Vectors of Unknown Length

However, this is not always possible. Consider the following function, defined on List and exported by Data.List:

Tutorial.Relations> :t takeWhile
Data.List.takeWhile : (a -> Bool) -> List a -> List a

This will take the longest prefix of the list argument, for which the given predicate returns True. In this case, it depends on the list elements and the predicate, how long this prefix will be. Can we write such a function for vectors? Let's give it a try:

takeWhile' : (a -> Bool) -> Vect n a -> Vect m a

Go ahead, and try to implement this. Don't try too long, as you will not be able to do so in a provably total way. The question is: What is the problem here? In order to understand this, we have to realize what the type of takeWhile' promises: "For all predicates operating on values of type a, and for all vectors holding values of this type, and for all lengths m, I give you a vector of length m holding values of type a". All of these are said to be universally quantified: The caller of our function is free to choose the predicate, the input vector, the type of values the vector holds, and the length of the output vector. Don't believe me? See here:

-- This looks like trouble: We got a non-empty vector of `Void`...
voids : Vect 7 Void
voids = takeWhile' (const True) []

-- ...from which immediately follows a proof of `Void`
proofOfVoid : Void
proofOfVoid = head voids

See how I could freely decide on the value of m when invoking takeWhile'? Although I passed takeWhile' an empty vector (the only existing vector holding values of type Void), the function's type promises me to return a possibly non-empty vector holding values of the same type, from which I freely extracted the first one.

Luckily, Idris doesn't allow this: We won't be able to implement takeWhile' without cheating (for instance, by turning totality checking off and looping forever). So, the question remains, how to express the result of takeWhile' in a type. The answer to this is: "Use a dependent pair", a vector paired with a value corresponding to its length.

record AnyVect a where
  constructor MkAnyVect
  length : Nat
  vect   : Vect length a

This corresponds to existential quantification in predicate logic: There is a natural number, which corresponds to the length of the vector I have here. Note, how from the outside of AnyVect a, the length of the wrapped vector is no longer visible at the type level but we can still inspect it and learn something about it at runtime, since it is wrapped up together with the actual vector. We can implement takeWhile in such a way that it returns a value of type AnyVect a:

takeWhile : (a -> Bool) -> Vect n a -> AnyVect a
takeWhile f []        = MkAnyVect 0 []
takeWhile f (x :: xs) = case f x of
  False => MkAnyVect 0 []
  True  => let MkAnyVect n ys = takeWhile f xs in MkAnyVect (S n) (x :: ys)

This works in a provably total way, because callers of this function can no longer choose the length of the resulting vector themselves. Our function, takeWhile, decides on this length and returns it together with the vector, and the type checker verifies that we make no mistakes when pairing the two values. In fact, the length can be inferred automatically by Idris, so we can replace it with underscores, if we so desire:

takeWhile2 : (a -> Bool) -> Vect n a -> AnyVect a
takeWhile2 f []        = MkAnyVect _ []
takeWhile2 f (x :: xs) = case f x of
  False => MkAnyVect 0 []
  True  => let MkAnyVect _ ys = takeWhile2 f xs in MkAnyVect _ (x :: ys)

To summarize: Parameters in generic function types are universally quantified, and their values can be decided on at the call site of such functions. Dependent record types allow us to describe existentially quantified values. Callers cannot choose such values freely: They are returned as part of a function's result.

Note, that Idris allows us to be explicit about universal quantification. The type of takeWhile' can also be written like so:

takeWhile'' : forall a, n, m . (a -> Bool) -> Vect n a -> Vect m a

Universally quantified arguments are desugared to implicit erased arguments by Idris. The above is a less verbose version of the following function type, the likes of which we have seen before:

takeWhile''' :  {0 a : _}
             -> {0 n : _}
             -> {0 m : _}
             -> (a -> Bool)
             -> Vect n a
             -> Vect m a

In Idris, we are free to choose whether we want to be explicit about universal quantification. Sometimes it can help understanding what's going on at the type level. Other languages - for instance PureScript - are more strict about this: There, explicit annotations on universally quantified parameters are mandatory.

The Essence of Dependent Pairs

It can take some time and experience to understand what's going on here. At least in my case, it took many sessions programming in Idris, before I figured out what dependent pairs are about: They pair a value of some type with a second value of a type calculated from the first value. For instance, a natural number n (the value) paired with a vector of length n (the second value, the type of which depends on the first value). This is such a fundamental concept of programming with dependent types, that a general dependent pair type is provided by the Prelude. Here is its implementation (primed for disambiguation):

record DPair' (a : Type) (p : a -> Type) where
  constructor MkDPair'
  fst : a
  snd : p fst

It is essential to understand what's going on here. There are two parameters: A type a, and a function p, calculating a type from a value of type a. Such a value (fst) is then used to calculate the type of the second value (snd). For instance, here is AnyVect a represented as a DPair:

AnyVect' : (a : Type) -> Type
AnyVect' a = DPair Nat (\n => Vect n a)

Note, how \n => Vect n a is a function from Nat to Type. Idris provides special syntax for describing dependent pairs, as they are important building blocks for programming in languages with first class types:

AnyVect'' : (a : Type) -> Type
AnyVect'' a = (n : Nat ** Vect n a)

We can inspect at the REPL, that the right hand side of AnyVect'' gets desugared to the right hand side of AnyVect':

Tutorial.Relations> (n : Nat ** Vect n Int)
DPair Nat (\n => Vect n Int)

Idris can infer, that n must be of type Nat, so we can drop this information. (We still need to put the whole expression in parentheses.)

AnyVect3 : (a : Type) -> Type
AnyVect3 a = (n ** Vect n a)

This allows us to pair a natural number n with a vector of length n, which is exactly what we did with AnyVect. We can therefore rewrite takeWhile to return a DPair instead of our custom type AnyVect. Note, that like with regular pairs, we can use the same syntax (x ** y) for creating and pattern matching on dependent pairs:

takeWhile3 : (a -> Bool) -> Vect m a -> (n ** Vect n a)
takeWhile3 f []        = (_ ** [])
takeWhile3 f (x :: xs) = case f x of
  False => (_ ** [])
  True  => let (_  ** ys) = takeWhile3 f xs in (_ ** x :: ys)

Just like with regular pairs, we can use the dependent pair syntax to define dependent triples and larger tuples:

AnyMatrix : (a : Type) -> Type
AnyMatrix a = (m ** n ** Vect m (Vect n a))
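
As a minimal sketch of how such values are created and consumed, consider the following two definitions. They only rely on Data.Vect, and the names threeInts and storedLength are hypothetical and not part of any library:

```idris
import Data.Vect

-- A natural number paired with a vector of exactly that length.
-- Writing `(2 ** [1, 2, 3])` instead would be a type error.
threeInts : (n ** Vect n Int)
threeInts = (3 ** [1, 2, 3])

-- Pattern matching on the pair gives us access to the stored length
-- at runtime, even though it is no longer visible in the function's
-- result type.
storedLength : (n ** Vect n a) -> Nat
storedLength (n ** _) = n

-- `storedLength threeInts` evaluates to 3.
```

Inside the module of this section, Data.Vect is already imported, so the import line can be dropped there.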

Erased Existentials

Sometimes, it is possible to determine the value of an index by pattern matching on a value of the indexed type. For instance, by pattern matching on a vector, we can learn about its length index. In these cases, it is not strictly necessary to carry around the index at runtime, and we can write a special version of a dependent pair where the first argument has quantity zero. Module Data.DPair from base exports data type Exists for this use case.

As an example, here is a version of takeWhile returning a value of type Exists:

takeWhileExists : (a -> Bool) -> Vect m a -> Exists (\n => Vect n a)
takeWhileExists f []        = Evidence _ []
takeWhileExists f (x :: xs) = case f x of
  True  => let Evidence _ ys = takeWhileExists f xs
            in Evidence _ (x :: ys)
  False => Evidence _ []

In order to restore an erased value, data type Singleton from base module Data.Singleton can be useful: It is parameterized by the value it stores:

true : Singleton True
true = Val True

This is called a singleton type: A type corresponding to exactly one value. It is a type error to return any other value for constant true, and Idris knows this:

true' : Singleton True
true' = Val _

We can use this to conjure the (erased!) length of a vector out of thin air:

vectLength : Vect n a -> Singleton n
vectLength []        = Val 0
vectLength (x :: xs) = let Val k = vectLength xs in Val (S k)

This function comes with much stronger guarantees than Data.Vect.length: The latter claims to just return any natural number, while vectLength must return exactly n in order to type check. As a demonstration, here is a well-typed bogus implementation of length:

bogusLength : Vect n a -> Nat
bogusLength = const 0

This would not be accepted as a valid implementation of vectLength, as you may quickly verify yourself.

With the help of vectLength (but not with Data.Vect.length) we can convert an erased existential to a proper dependent pair:

toDPair : Exists (\n => Vect n a) -> (m ** Vect m a)
toDPair (Evidence _ as) = let Val m = vectLength as in (m ** as)

Again, as a quick exercise, try implementing toDPair in terms of length, and note how Idris will fail to unify the result of length with the actual length of the vector.

Exercises part 1

  1. Declare and implement a function for filtering a vector similar to Data.List.filter.

  2. Declare and implement a function for mapping a partial function over the values of a vector similar to Data.List.mapMaybe.

  3. Declare and implement a function similar to Data.List.dropWhile for vectors. Use Data.DPair.Exists as your return type.

  4. Repeat exercise 3 but return a proper dependent pair. Use the function from exercise 3 in your implementation.

Use Case: Nucleic Acids

module Tutorial.DPair.DNA

import Data.DPair
import Data.Either
import Data.HList
import Data.List
import Data.List1
import Data.Singleton
import Data.String
import Data.Vect

import Text.CSV

%default total

We'd like to come up with a small, simplified library for running computations on nucleic acids: RNA and DNA. These are built from five types of nucleobases: three are used in both types of nucleic acids, while the remaining two are each specific to one type of acid. We'd like to make sure that only valid bases are in strands of nucleic acids. Here's a possible encoding:

data BaseType = DNABase | RNABase

data Nucleobase : BaseType -> Type where
  Adenine  : Nucleobase b
  Cytosine : Nucleobase b
  Guanine  : Nucleobase b
  Thymine  : Nucleobase DNABase
  Uracile  : Nucleobase RNABase

NucleicAcid : BaseType -> Type
NucleicAcid = List . Nucleobase

RNA : Type
RNA = NucleicAcid RNABase

DNA : Type
DNA = NucleicAcid DNABase

encodeBase : Nucleobase b -> Char
encodeBase Adenine  = 'A'
encodeBase Cytosine = 'C'
encodeBase Guanine  = 'G'
encodeBase Thymine  = 'T'
encodeBase Uracile  = 'U'

encode : NucleicAcid b -> String
encode = pack . map encodeBase

It is a type error to use Uracile in a strand of DNA:

failing "Mismatch between: RNABase and DNABase."
  errDNA : DNA
  errDNA = [Uracile, Adenine]

Note, how we used a variable for nucleobases Adenine, Cytosine, and Guanine: These are again universally quantified, and client code is free to choose a value here. This allows us to use these bases in strands of DNA and RNA:

dna1 : DNA
dna1 = [Adenine, Cytosine, Guanine]

rna1 : RNA
rna1 = [Adenine, Cytosine, Guanine]

With Thymine and Uracile, we are more restrictive: Thymine is only allowed in DNA, while Uracile is restricted to be used in RNA strands. Let's write parsers for strands of DNA and RNA:

readAnyBase : Char -> Maybe (Nucleobase b)
readAnyBase 'A' = Just Adenine
readAnyBase 'C' = Just Cytosine
readAnyBase 'G' = Just Guanine
readAnyBase _   = Nothing

readRNABase : Char -> Maybe (Nucleobase RNABase)
readRNABase 'U' = Just Uracile
readRNABase c   = readAnyBase c

readDNABase : Char -> Maybe (Nucleobase DNABase)
readDNABase 'T' = Just Thymine
readDNABase c   = readAnyBase c

readRNA : String -> Maybe RNA
readRNA = traverse readRNABase . unpack

readDNA : String -> Maybe DNA
readDNA = traverse readDNABase . unpack

Again, in case of the bases appearing in both kinds of strands, users of the universally quantified readAnyBase are free to choose what base type they want, but they will never get a Thymine or Uracile value.

We can now implement some simple calculations on sequences of nucleobases. For instance, we can come up with the complementary strand:

complementRNA' : RNA -> RNA
complementRNA' = map calc
  where calc : Nucleobase RNABase -> Nucleobase RNABase
        calc Guanine  = Cytosine
        calc Cytosine = Guanine
        calc Adenine  = Uracile
        calc Uracile  = Adenine

complementDNA' : DNA -> DNA
complementDNA' = map calc
  where calc : Nucleobase DNABase -> Nucleobase DNABase
        calc Guanine  = Cytosine
        calc Cytosine = Guanine
        calc Adenine  = Thymine
        calc Thymine  = Adenine

Ugh, code repetition! Not too bad here, but imagine there were dozens of bases with only few specialized ones. Surely, we can do better? Unfortunately, the following won't work:

complementBase' : Nucleobase b -> Nucleobase b
complementBase' Adenine  = ?what_now
complementBase' Cytosine = Guanine
complementBase' Guanine  = Cytosine
complementBase' Thymine  = Adenine
complementBase' Uracile  = Adenine

All goes well with the exception of the Adenine case. Remember: Parameter b is universally quantified, and the callers of our function can decide what b is supposed to be. We therefore can't just return Thymine: Idris will respond with a type error since callers might want a Nucleobase RNABase instead. One way to go about this is to take an additional unerased argument (explicit or implicit) representing the base type:

complementBase : (b : BaseType) -> Nucleobase b -> Nucleobase b
complementBase DNABase Adenine  = Thymine
complementBase RNABase Adenine  = Uracile
complementBase _       Cytosine = Guanine
complementBase _       Guanine  = Cytosine
complementBase _       Thymine  = Adenine
complementBase _       Uracile  = Adenine

This is again an example of a dependent function type (also called a pi type): The input and output types both depend on the value of the first argument. We can now use this to calculate the complement of any nucleic acid:

complement : (b : BaseType) -> NucleicAcid b -> NucleicAcid b
complement b = map (complementBase b)
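
Here is a quick sketch of complement in action. It assumes that dna1, rna1, complement, and encode from this section are in scope; the names dnaComplement and rnaComplement are only illustrative:

```idris
-- The base type is passed explicitly, so the same function
-- works for both kinds of nucleic acids.
dnaComplement : String
dnaComplement = encode $ complement DNABase dna1
-- "TGC": Adenine is paired with Thymine in DNA

rnaComplement : String
rnaComplement = encode $ complement RNABase rna1
-- "UGC": Adenine is paired with Uracile in RNA
```

Note, how swapping the base types (for instance, complement RNABase dna1) would be rejected by the type checker, since the type of the second argument depends on the value of the first.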

Now, here is an interesting use case: We'd like to read a sequence of nucleobases from user input, accepting two strings: The first telling us, whether the user plans to enter a DNA or RNA sequence, the second being the sequence itself. What should be the type of such a function? Well, we're describing computations with side effects, so something involving IO seems about right. User input almost always needs to be validated or translated, so something might go wrong and we need an error type for this case. Finally, our users can decide whether they want to enter a strand of RNA or DNA, so this distinction should be encoded as well.

Of course, it is always possible to write a custom sum type for such a use case:

data Result : Type where
  UnknownBaseType : String -> Result
  InvalidSequence : String -> Result
  GotDNA          : DNA -> Result
  GotRNA          : RNA -> Result

This has all possible outcomes encoded in a single data type. However, it is lacking in terms of flexibility. If we want to handle errors early on and just extract a strand of RNA or DNA, we need yet another data type:

data RNAOrDNA = ItsRNA RNA | ItsDNA DNA

This might be the way to go, but for results with many options, this can get cumbersome quickly. Also: Why come up with a custom data type when we already have the tools to deal with this at our hands?

Here is how we can encode this with a dependent pair:

namespace InputError
  public export
  data InputError : Type where
    UnknownBaseType : String -> InputError
    InvalidSequence : String -> InputError

readAcid : (b : BaseType) -> String -> Either InputError (NucleicAcid b)
readAcid b str =
  let err = InvalidSequence str
   in case b of
        DNABase => maybeToEither err $ readDNA str
        RNABase => maybeToEither err $ readRNA str

getNucleicAcid : IO (Either InputError (b ** NucleicAcid b))
getNucleicAcid = do
  baseString <- getLine
  case baseString of
    "DNA" => map (MkDPair _) . readAcid DNABase <$> getLine
    "RNA" => map (MkDPair _) . readAcid RNABase <$> getLine
    _     => pure $ Left (UnknownBaseType baseString)

Note, how we paired the type of nucleobases with the nucleic acid sequence. Assume now we implement a function for transcribing a strand of DNA to RNA, and we'd like to convert a sequence of nucleobases from user input to the corresponding RNA sequence. Here's how to do this:

transcribeBase : Nucleobase DNABase -> Nucleobase RNABase
transcribeBase Adenine  = Uracile
transcribeBase Cytosine = Guanine
transcribeBase Guanine  = Cytosine
transcribeBase Thymine  = Adenine

transcribe : DNA -> RNA
transcribe = map transcribeBase

printRNA : RNA -> IO ()
printRNA = putStrLn . encode

transcribeProg : IO ()
transcribeProg = do
  Right (b ** seq) <- getNucleicAcid
    | Left (InvalidSequence str) => putStrLn $ "Invalid sequence: " ++ str
    | Left (UnknownBaseType str) => putStrLn $ "Unknown base type: " ++ str
  case b of
    DNABase => printRNA $ transcribe seq
    RNABase => printRNA seq

By pattern matching on the first value of the dependent pair we could determine, whether the second value is an RNA or DNA sequence. In case of DNA, we had to transcribe the sequence first; in case of RNA, we could invoke printRNA directly.
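
The same decision can also be written as a pure function, which keeps the pattern match out of IO. The following is only a sketch: the name toRNA is hypothetical, and it assumes transcribe, NucleicAcid, and RNA from above are in scope:

```idris
-- Matching on the first component refines the type of the second:
-- in the first clause, `acid` has type `DNA`, in the second it
-- already has type `RNA`.
toRNA : (b ** NucleicAcid b) -> RNA
toRNA (DNABase ** acid) = transcribe acid
toRNA (RNABase ** acid) = acid
```

With such a function, the last three lines of transcribeProg could be replaced with printRNA (toRNA (b ** seq)), which is a matter of taste more than anything else.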

In a more interesting scenario, we would translate the RNA sequence to the corresponding protein sequence. Still, this example shows how to deal with a simplified real world scenario: Data may be encoded differently and coming from different sources. By using precise types, we are forced to first convert values to the correct format. Failing to do so leads to a compile-time error instead of an error at runtime or - even worse - the program silently running a bogus computation.

Dependent Records vs Sum Types

Dependent records as shown for AnyVect a are a generalization of dependent pairs: We can have an arbitrary number of fields and use the values stored therein to calculate the types of other values. For very simple cases like the example with nucleobases, it doesn't matter too much, whether we use a DPair, a custom dependent record, or even a sum type. In fact, the three encodings are equally expressive:

Acid1 : Type
Acid1 = (b ** NucleicAcid b)

record Acid2 where
  constructor MkAcid2
  baseType : BaseType
  sequence : NucleicAcid baseType

data Acid3 : Type where
  SomeRNA : RNA -> Acid3
  SomeDNA : DNA -> Acid3

It is trivial to write lossless conversions between these encodings, and with each encoding we can decide with a simple pattern match, whether we currently have a sequence of RNA or DNA. However, dependent types can depend on more than one value, as we will see in the exercises. In such cases, sum types and dependent pairs quickly become unwieldy, and you should go for an encoding as a dependent record.

Exercises part 2

Sharpen your skills in using dependent pairs and dependent records! In exercises 2 to 7 you have to decide yourself, when a function should return a dependent pair or record, when a function requires additional arguments, on which you can pattern match, and what other utility functions might be necessary.

  1. Prove that the three encodings for nucleobases are isomorphic (meaning: of the same structure) by writing lossless conversion functions from Acid1 to Acid2 and back. Likewise for Acid1 and Acid3.

  2. Sequences of nucleobases can be encoded in one of two directions: Sense and antisense. Declare a new data type to describe the sense of a sequence of nucleobases, and add this as an additional parameter to type Nucleobase and types DNA and RNA.

  3. Refine the types of complement and transcribe, so that they reflect the changing of sense. In case of transcribe, a strand of antisense DNA is converted to a strand of sense RNA.

  4. Define a dependent record storing the base type and sense together with a sequence of nucleobases.

  5. Adjust readRNA and readDNA in such a way that the sense of a sequence is read from the input string. Sense strands are encoded like so: "5´-CGGTAG-3´". Antisense strands are encoded like so: "3´-CGGTAG-5´".

  6. Adjust encode in such a way that it includes the sense in its output.

  7. Enhance getNucleicAcid and transcribeProg in such a way that the sense and base type are stored together with the sequence, and that transcribeProg always prints the sense RNA strand (after transcription, if necessary).

  8. Enjoy the fruits of your labour and test your program at the REPL.

Note: Instead of using a dependent record, we could again have used a sum type of four constructors to encode the different types of sequences. However, the number of constructors required corresponds to the product of the number of values of each type level index. Therefore, this number can grow quickly and sum type encodings can lead to lengthy blocks of pattern matches in these cases.

Use Case: CSV Files with a Schema

module Tutorial.DPair.CSV

import Control.Monad.State

import Data.DPair
import Data.Either
import Data.HList
import Data.List
import Data.List1
import Data.Singleton
import Data.String
import Data.Vect

import Text.CSV

%default total

In this section, we are going to look at an extended example based on our previous work on CSV parsers. We'd like to write a small command-line program, where users can specify a schema for the CSV tables they'd like to parse and load into memory. Before we begin, here is a REPL session running the final program, which you will complete in the exercises:

Solutions.DPair> :exec main
Enter a command: load resources/example
Table loaded. Schema: str,str,fin2023,str?,boolean?
Enter a command: get 3
Row 3:

str   | str    | fin2023 | str? | boolean?
------------------------------------------
Floor | Jansen | 1981    |      | t

Enter a command: add Mikael,Stanne,1974,,
Row prepended:

str    | str    | fin2023 | str? | boolean?
-------------------------------------------
Mikael | Stanne | 1974    |      |

Enter a command: get 1
Row 1:

str    | str    | fin2023 | str? | boolean?
-------------------------------------------
Mikael | Stanne | 1974    |      |

Enter a command: delete 1
Deleted row: 1.
Enter a command: get 1
Row 1:

str | str     | fin2023 | str? | boolean?
-----------------------------------------
Rob | Halford | 1951    |      |

Enter a command: quit
Goodbye.

This example was inspired by a similar program used as an example in the Type-Driven Development with Idris book.

We'd like to focus on several things here:

  • Purity: With the exception of the main program loop, all functions used in the implementation should be pure, which in this context means "not running in any monad with side effects such as IO".
  • Fail early: With the exception of the command parser, all functions updating the table and handling queries should be typed and implemented in such a way that they cannot fail.

We are often well advised to adhere to these two guidelines, as they can make the majority of our functions easier to implement and test.

Since we allow users of our library to specify a schema (order and types of columns) for the table they work with, this information is not known until runtime. The same goes for the current size of the table. We will therefore store both values as fields in a dependent record.

Encoding the Schema

We need to inspect the table schema at runtime. Although theoretically possible, it is not advisable to operate on Idris types directly here. We'd rather use a closed custom data type describing the types of columns we understand. In a first try, we only support some Idris primitives:

data ColType = I64 | Str | Boolean | Float

Schema : Type
Schema = List ColType

Next, we need a way to convert a Schema to a list of Idris types, which we will then use as the index of a heterogeneous list representing the rows in our table:

IdrisType : ColType -> Type
IdrisType I64     = Int64
IdrisType Str     = String
IdrisType Boolean = Bool
IdrisType Float   = Double

Row : Schema -> Type
Row = HList . map IdrisType

We can now describe a table as a dependent record storing the table's content as a vector of rows. In order to safely index rows of the table and parse new rows to be added, the current schema and size of the table must be known at runtime:

record Table where
  constructor MkTable
  schema : Schema
  size   : Nat
  rows   : Vect size (Row schema)
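To make the record more concrete, here is a minimal sketch of a small table value. It repeats the definitions from above so that it can be type-checked on its own. The local HList is only a stand-in for the one imported from Data.HList, and the names row1, row2, and bands (as well as the sample data) are assumptions made for illustration.

```idris
import Data.Vect

%default total

data ColType = I64 | Str | Boolean | Float

IdrisType : ColType -> Type
IdrisType I64     = Int64
IdrisType Str     = String
IdrisType Boolean = Bool
IdrisType Float   = Double

-- Local stand-in (assumption) for the HList type from Data.HList.
data HList : List Type -> Type where
  Nil  : HList []
  (::) : t -> HList ts -> HList (t :: ts)

Row : List ColType -> Type
Row = HList . map IdrisType

record Table where
  constructor MkTable
  schema : List ColType
  size   : Nat
  rows   : Vect size (Row schema)

-- Two sample rows; their type is computed from the schema.
row1 : Row [Str, Str, I64]
row1 = ["Floor", "Jansen", 1981]

row2 : Row [Str, Str, I64]
row2 = ["Rob", "Halford", 1951]

-- The size field must match the length of the row vector,
-- and every row must match the schema field.
bands : Table
bands = MkTable [Str, Str, I64] 2 [row1, row2]
```

Changing the size to 3 or adding a row of a different shape should be rejected by the type checker, which is the guarantee the dependent record is meant to provide.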

Finally, we define an indexed data type describing commands operating on the current table. Using the current table as the command's index allows us to make sure that indices for accessing and deleting rows are within bounds and that new rows agree with the current schema. This is necessary to uphold our second design principle: All functions operating on tables must do so without the possibility of failure.

data Command : (t : Table) -> Type where
  PrintSchema : Command t
  PrintSize   : Command t
  New         : (newSchema : Schema) -> Command t
  Prepend     : Row (schema t) -> Command t
  Get         : Fin (size t) -> Command t
  Delete      : Fin (size t) -> Command t
  Quit        : Command t

We can now implement the main application logic: how user-entered commands affect the application's current state. As promised, this comes without the risk of failure, so we don't have to wrap the return type in an Either:

applyCommand : (t : Table) -> Command t -> Table
applyCommand t                 PrintSchema = t
applyCommand t                 PrintSize   = t
applyCommand _                 (New ts)    = MkTable ts _ []
applyCommand (MkTable ts n rs) (Prepend r) = MkTable ts _ $ r :: rs
applyCommand t                 (Get x)     = t
applyCommand t                 Quit        = t
applyCommand (MkTable ts n rs) (Delete x)  = case n of
  S k => MkTable ts k (deleteAt x rs)
  Z   => absurd x

Note that the constructors of Command t are typed in such a way that indices are always within bounds (constructors Get and Delete), and new rows adhere to the table's current schema (constructor Prepend).

One thing you might not have seen so far is the call to absurd on the last line. This is a derived function of the Uninhabited interface, which is used to describe types such as Void or - in the case above - Fin 0, of which there can be no value. Function absurd is then just another manifestation of the principle of explosion. If this doesn't make too much sense yet, don't worry. We will look at Void and its uses in the next chapter.
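The same pattern can be seen in isolation in the following sketch. The helper lookupAt is a hypothetical function, not part of the table editor: it takes the size as an explicit argument, so that matching on Z leaves us with an index of type Fin 0.

```idris
import Data.Fin
import Data.Vect

%default total

-- Hypothetical helper: look up an element for a size known only at runtime.
-- In the Z case, x has type Fin 0. There is no such value, so absurd
-- lets us produce a result of any type without ever running this branch.
lookupAt : (n : Nat) -> Fin n -> Vect n a -> a
lookupAt Z     x _  = absurd x
lookupAt (S _) x xs = index x xs
```

The Z clause can never be reached at runtime, since nobody can construct the Fin 0 required to call it.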

Parsing Commands

User input validation is an important topic when writing applications. If it happens early, you can keep larger parts of your application pure (which - in this context - means: "without the possibility of failure") and provably total. If done properly, this step encodes and handles most if not all ways in which things can go wrong in your program, allowing you to come up with clear error messages telling users exactly what caused an issue. As you surely have experienced yourself, there are few things more frustrating than a non-trivial computer program terminating with an unhelpful "There was an error" message.

So, in order to treat this important topic with all due respect, we are first going to implement a custom error type. This is not strictly necessary for small programs, but once your software gets more complex, it can be tremendously helpful for keeping track of what can go wrong where. In order to figure out what can possibly go wrong, we first need to decide on how the commands should be entered. Here, we use a single keyword for each command, together with an optional number of arguments separated from the keyword by a single space character. For instance: "new i64,boolean,str,str", for initializing an empty table with a new schema. With this settled, here is a list of things that can go wrong, and the messages we'd like to print:

  • A bogus command is entered. We repeat the input with a message that we don't know the command plus a list of commands we know about.
  • An invalid schema was entered. In this case, we list the position of the first unknown type, the string we found there, and a list of types we know about.
  • An invalid CSV encoding of a row was entered. We list the erroneous position, the string encountered there, plus the expected type. In case of a too small or too large number of fields, we also print a corresponding error message.
  • An index was out of bounds. This can happen when users try to access or delete specific rows. We print the current number of rows plus the value entered.
  • A value not representing a natural number was entered as an index. We print a corresponding error message.

That's a lot of stuff to keep track of, so let's encode this in a sum type:

data Error : Type where
  UnknownCommand : String -> Error
  UnknownType    : (pos : Nat) -> String -> Error
  InvalidField   : (pos : Nat) -> ColType -> String -> Error
  ExpectedEOI    : (pos : Nat) -> String -> Error
  UnexpectedEOI  : (pos : Nat) -> String -> Error
  OutOfBounds    : (size : Nat) -> (index : Nat) -> Error
  NoNat          : String -> Error

In order to conveniently construct our error messages, it is best to use Idris' string interpolation facilities: We can embed arbitrary string expressions in a string literal by enclosing them in curly braces, the first of which must be escaped with a backslash, like so: "foo \{myExpr a b c}". We can pair this with multiline string literals to get nicely formatted error messages.

showColType : ColType -> String
showColType I64      = "i64"
showColType Str      = "str"
showColType Boolean  = "boolean"
showColType Float    = "float"

showSchema : Schema -> String
showSchema = concat . intersperse "," . map showColType

allTypes : String
allTypes = concat
         . List.intersperse ", "
         . map showColType
         $ [I64,Str,Boolean,Float]

showError : Error -> String
showError (UnknownCommand x) = """
  Unknown command: \{x}.
  Known commands are: schema, size, new, add, get, delete, quit.
  """

showError (UnknownType pos x) = """
  Unknown type at position \{show pos}: \{x}.
  Known types are: \{allTypes}.
  """

showError (InvalidField pos tpe x) = """
  Invalid value at position \{show pos}.
  Expected type: \{showColType tpe}.
  Value found: \{x}.
  """

showError (ExpectedEOI k x) = """
  Expected end of input.
  Position: \{show k}
  Input: \{x}
  """

showError (UnexpectedEOI k x) = """
  Unexpected end of input.
  Position: \{show k}
  Input: \{x}
  """

showError (OutOfBounds size index) = """
  Index out of bounds.
  Size of table: \{show size}
  Index: \{show index}
  Note: Indices start at 1.
  """

showError (NoNat x) = "Not a natural number: \{x}"
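The interpolation syntax used in showError can also be tried in isolation. The following is a minimal sketch; summary and report are hypothetical helpers that are not part of the table editor.

```idris
%default total

-- Hypothetical helper: interpolates a String and a shown Nat
-- into a single-line literal.
summary : String -> Nat -> String
summary name n = "Table \{name} has \{show n} rows."

-- Hypothetical helper: the same information as a multiline literal.
-- The indentation of the closing delimiter is stripped from each line.
report : String -> Nat -> String
report name n = """
  Table: \{name}
  Rows:  \{show n}
  """
```

For instance, summary "bands" 2 evaluates to "Table bands has 2 rows.". Values that are not strings, such as n above, have to be converted explicitly, here with show.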

We can now write parsers for the different commands. We need facilities to parse vector indices, schemata, and CSV rows. Since we are using a CSV format for encoding and decoding rows, it makes sense to also encode the schema as a comma-separated list of values:

zipWithIndex : Traversable t => t a -> t (Nat, a)
zipWithIndex = evalState 1 . traverse pairWithIndex
  where pairWithIndex : a -> State Nat (Nat,a)
        pairWithIndex v = (,v) <$> get <* modify S

fromCSV : String -> List String
fromCSV = forget . split (',' ==)

readColType : Nat -> String -> Either Error ColType
readColType _ "i64"      = Right I64
readColType _ "str"      = Right Str
readColType _ "boolean"  = Right Boolean
readColType _ "float"    = Right Float
readColType n s          = Left $ UnknownType n s

readSchema : String -> Either Error Schema
readSchema = traverse (uncurry readColType) . zipWithIndex . fromCSV
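If the combination of State and traverse in zipWithIndex looks opaque, the position can also be threaded through the recursion by hand. The following sketch is an alternative under simplifying assumptions, not the version used in the rest of this section: SchemaError is a cut-down stand-in for Error, and readTypes and readSchemaRec are hypothetical names.

```idris
import Data.List1
import Data.String

%default total

data ColType = I64 | Str | Boolean | Float

-- Simplified stand-in (assumption) for the Error type: one case only.
data SchemaError = UnknownType Nat String

readColType : Nat -> String -> Either SchemaError ColType
readColType _ "i64"     = Right I64
readColType _ "str"     = Right Str
readColType _ "boolean" = Right Boolean
readColType _ "float"   = Right Float
readColType n s         = Left $ UnknownType n s

-- Passes the 1-based position along explicitly instead of
-- pairing every field with its index first.
readTypes : Nat -> List String -> Either SchemaError (List ColType)
readTypes _ []        = Right []
readTypes n (s :: ss) = [| readColType n s :: readTypes (S n) ss |]

readSchemaRec : String -> Either SchemaError (List ColType)
readSchemaRec = readTypes 1 . forget . split (',' ==)
```

With this definition, readSchemaRec "i64,foo" results in Left (UnknownType 2 "foo"): the first field is accepted and the second one is reported together with its position.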

We also need to decode CSV content based on the current schema. Note how we can do so in a type-safe manner by pattern matching on the schema, which will not be known until runtime. Unfortunately, we need to reimplement CSV parsing, because we want to add the expected type to the error messages (something that would be much harder to do with interface CSVLine and error type CSVError).

decodeField : Nat -> (c : ColType) -> String -> Either Error (IdrisType c)
decodeField k c s =
  let err = InvalidField k c s
   in case c of
        I64     => maybeToEither err $ read s
        Str     => maybeToEither err $ read s
        Boolean => maybeToEither err $ read s
        Float   => maybeToEither err $ read s

decodeRow : {ts : _} -> String -> Either Error (Row ts)
decodeRow s = go 1 ts $ fromCSV s
  where go : Nat -> (cs : Schema) -> List String -> Either Error (Row cs)
        go k []       []         = Right []
        go k []       (_ :: _)   = Left $ ExpectedEOI k s
        go k (_ :: _) []         = Left $ UnexpectedEOI k s
        go k (c :: cs) (s :: ss) = [| decodeField k c s :: go (S k) cs ss |]

There is no hard and fast rule about whether to pass an index as an implicit argument or not. Some considerations:

  • Pattern matching on explicit arguments comes with less syntactic overhead.
  • If an argument can be inferred from the context most of the time, consider passing it as an implicit to make your function nicer to use in client code.
  • Use explicit (possibly erased) arguments for values that can't be inferred by Idris most of the time.
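The following sketch illustrates the last two considerations. Both functions are hypothetical examples and not part of the table editor.

```idris
import Data.Vect

%default total

-- n is determined by the vector argument, so callers never have to
-- write it down: an implicit argument keeps call sites short.
vectSize : {n : Nat} -> Vect n a -> Nat
vectSize _ = n

-- Nothing else in the argument list determines n,
-- so it is passed explicitly.
fill : (n : Nat) -> a -> Vect n a
fill Z     _ = []
fill (S k) v = v :: fill k v
```

In vectSize (fill 3 'x'), the implicit n is inferred to be 3 from the type of the vector, while fill needs to be told the desired length.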

All that is missing now is a way to parse indices for accessing the current table's rows. We use the convention that indices start at one instead of zero, which feels more natural for most non-programmers.

readFin : {n : _} -> String -> Either Error (Fin n)
readFin s = do
  S k <- maybeToEither (NoNat s) $ parsePositive {a = Nat} s
    | Z => Left $ OutOfBounds n Z
  maybeToEither (OutOfBounds n $ S k) $ natToFin k n
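To see the shift from one-based input to zero-based Fin values without the error handling, consider this simplified sketch. parseIndex is a hypothetical variant of readFin that returns a Maybe and takes the size as an explicit argument.

```idris
import Data.Fin
import Data.String

%default total

-- Hypothetical variant of readFin without the custom error type.
-- The input "1" maps to FZ, "2" to FS FZ, and so on.
-- Zero, too large values, and non-numeric input are rejected.
parseIndex : (n : Nat) -> String -> Maybe (Fin n)
parseIndex n s = case parsePositive {a = Nat} s of
  Just (S k) => natToFin k n
  _          => Nothing
```

For a table of size 3, the inputs "1" and "3" yield Just FZ and Just (FS (FS FZ)), respectively, while "0" and "4" both yield Nothing.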

We are finally able to implement a parser for user commands. Function Data.String.words is used for splitting a string at space characters. In most cases, we expect the name of the command plus a single argument without additional spaces. CSV rows can have additional space characters, however, so we use Data.String.unwords on the split string.

readCommand :  (t : Table) -> String -> Either Error (Command t)
readCommand _                "schema"  = Right PrintSchema
readCommand _                "size"    = Right PrintSize
readCommand _                "quit"    = Right Quit
readCommand (MkTable ts n _) s         = case words s of
  ["new",    str] => New     <$> readSchema str
  "add" ::   ss   => Prepend <$> decodeRow (unwords ss)
  ["get",    str] => Get     <$> readFin str
  ["delete", str] => Delete  <$> readFin str
  _               => Left $ UnknownCommand s
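The interplay of words and unwords can be sketched as follows. splitCommand is a hypothetical helper used only for illustration; readCommand above pattern matches on the result of words directly.

```idris
import Data.String

%default total

-- Hypothetical helper: separates the command keyword from the
-- remaining argument text.
splitCommand : String -> (String, String)
splitCommand s = case words s of
  []            => ("", "")
  (cmd :: args) => (cmd, unwords args)
```

For example, splitCommand "add Floor, Jansen" yields ("add", "Floor, Jansen"). Because words discards the whitespace it splits on, a run of several spaces inside a row is reduced to a single space by unwords.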

Running the Application

All that's left to do is to write functions for printing the results of commands to users and run the application in a loop until command "quit" is entered.

encodeField : (t : ColType) -> IdrisType t -> String
encodeField I64     x     = show x
encodeField Str     x     = show x
encodeField Boolean True  = "t"
encodeField Boolean False = "f"
encodeField Float   x     = show x

encodeRow : (ts : List ColType) -> Row ts -> String
encodeRow ts = concat . intersperse "," . go ts
  where go : (cs : List ColType) -> Row cs -> Vect (length cs) String
        go []        []        = []
        go (c :: cs) (v :: vs) = encodeField c v :: go cs vs

result :  (t : Table) -> Command t -> String
result t PrintSchema = "Current schema: \{showSchema t.schema}"
result t PrintSize   = "Current size: \{show t.size}"
result _ (New ts)    = "Created table. Schema: \{showSchema ts}"
result t (Prepend r) = "Row prepended: \{encodeRow t.schema r}"
result _ (Delete x)  = "Deleted row: \{show $ FS x}."
result _ Quit        = "Goodbye."
result t (Get x)     =
  "Row \{show $ FS x}: \{encodeRow t.schema (index x t.rows)}"

covering
runProg : Table -> IO ()
runProg t = do
  putStr "Enter a command: "
  str <- getLine
  case readCommand t str of
    Left err   => putStrLn (showError err) >> runProg t
    Right Quit => putStrLn (result t Quit)
    Right cmd  => putStrLn (result t cmd) >>
                  runProg (applyCommand t cmd)

covering
main : IO ()
main = runProg $ MkTable [] _ []

Exercises part 3

The challenges presented here all deal with enhancing our table editor in several interesting ways. Some of them are more a matter of style and less a matter of learning to write dependently typed programs, so feel free to solve these as you please. Exercises 1 to 3 should be considered to be mandatory.

  1. Add support for storing Idris types Integer and Nat in CSV columns.

  2. Add support for Fin n to CSV columns. Note: We need runtime access to n in order for this to work.

  3. Add support for optional types to CSV columns. Since missing values should be encoded by empty strings, it makes no sense to allow for nested optional types, meaning that types like Maybe Nat should be allowed while Maybe (Maybe Nat) should not.

    Hint: There are several ways to encode these, one being to add a boolean index to ColType.

  4. Add a command for printing the whole table. Bonus points if all columns are properly aligned.

  5. Add support for simple queries: Given a column number and a value, list all rows where entries match the given value.

    This might be a challenge, as the types get pretty interesting.

  6. Add support for loading and saving tables from and to disk. A table should be stored in two files: One for the schema and one for the CSV content.

    Note: Reading files in a provably total way can be pretty hard and will be a topic for another day. For now, just use function readFile exported from System.File in base for reading a file as a whole. This function is partial, because it will not terminate when used with an infinite input stream such as /dev/urandom or /dev/zero. It is important to not use assert_total here. Using partial functions like readFile might well pose a security risk in a real-world application, so eventually, we'd have to deal with this and allow for some way to limit the size of accepted input. It is therefore best to make this partiality visible and annotate all downstream functions accordingly.
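As a starting point for exercise 6, here is a minimal sketch of how the partiality of readFile can be kept visible. loadText is a hypothetical helper, and turning the file error into a plain string is just one possible choice; you may prefer to add a dedicated constructor to Error instead.

```idris
import System.File

-- Hypothetical helper: reads a whole file and renders failures as a message.
-- It is annotated as covering instead of total, because readFile
-- itself is not total.
covering
loadText : (path : String) -> IO (Either String String)
loadText path = do
  Right str <- readFile path
    | Left err => pure (Left "Error when reading \{path}: \{show err}")
  pure (Right str)
```

Every function calling loadText has to be annotated with covering as well, which is exactly the kind of visibility asked for in the note above.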

You can find an implementation of these additions in the solutions. A small example table can be found in folder resources.

Note: There are of course tons of projects to pursue from here, such as writing a proper query language, calculating new rows from existing ones, accumulating values in a column, concatenating and zipping tables, and so on. We will stop for now, probably coming back to this in later examples.

Conclusion

Dependent pairs and records are necessary for inspecting, at runtime, the values defining the types we work with. By pattern matching on these values, we learn about the types and possible shapes of other values, allowing us to reduce the number of potential bugs in our programs.

In the next chapter, we start learning how to write data types that we use as proofs that certain contracts between values hold. These will eventually allow us to define pre- and postconditions for our function arguments and output types.

Propositional Equality

In the last chapter, we learned how dependent pairs and records can be used to calculate types from values only known at runtime by pattern matching on these values. We will now look at how we can describe relations - or contracts - between values as types, and how we can use values of these types as proofs that the contracts hold.

Equality as a Type

module Tutorial.Eq.Eq

import Data.Either
import Data.HList
import Data.Vect
import Data.String

%default total

Imagine we'd like to concatenate the contents of two CSV files, both of which we stored on disk as tables together with their schemata as shown in our discussion about dependent pairs:

public export
data ColType = I64 | Str | Boolean | Float

public export
Schema : Type
Schema = List ColType

IdrisType : ColType -> Type
IdrisType I64     = Int64
IdrisType Str     = String
IdrisType Boolean = Bool
IdrisType Float   = Double

Row : Schema -> Type
Row = HList . map IdrisType

record Table where
  constructor MkTable
  schema : Schema
  size   : Nat
  rows   : Vect size (Row schema)

concatTables1 : Table -> Table -> Maybe Table

We will not be able to implement concatTables1 by appending the two row vectors, unless we can somehow verify that the two schemata are identical. "Well," I hear you say, "that shouldn't be a big issue! Just implement Eq for ColType". Let's give this a try:

Eq ColType where
  I64     == I64     = True
  Str     == Str     = True
  Boolean == Boolean = True
  Float   == Float   = True
  _       == _       = False

concatTables1 (MkTable s1 m rs1) (MkTable s2 n rs2) = case s1 == s2 of
  True  => ?what_now
  False => Nothing

Somehow, this doesn't seem to work. If we inspect the context of hole what_now, Idris still thinks that s1 and s2 are different, and if we go ahead and invoke Vect.(++) anyway in the True case, Idris will respond with a type error.

Tutorial.Relations> :t what_now
   m : Nat
   s1 : List ColType
   rs1 : Vect m (HList (map IdrisType s1))
   n : Nat
   s2 : List ColType
   rs2 : Vect n (HList (map IdrisType s2))
------------------------------
what_now : Maybe Table

The problem is that there is no reason for Idris to unify the two values, even though (==) returned True, because the result of (==) holds no other information than the type being a Bool. We think that if this is True, the two values should be identical, but Idris is not convinced. In fact, the following implementation of Eq ColType would be perfectly fine as far as the type checker is concerned:

Eq ColType where
  _       == _       = True

So Idris is right in not trusting us. You might expect it to inspect the implementation of (==) and figure out on its own what the True result means, but this is not how these things work in general, because most of the time the number of computational paths to check would be far too large. As a consequence, Idris is able to evaluate functions during unification, but it will not trace back information about function arguments from a function's result for us. We can do so manually, however, as we will see later.

A Type for equal Schemata

The problem described above is similar to what we saw when we talked about the benefit of singleton types: The types are not precise enough. What we are going to do now is something we'll repeat time and again for different use cases: We encode a contract between values in an indexed data type:

public export
data SameSchema : (s1 : Schema) -> (s2 : Schema) -> Type where
  Same : SameSchema s s

First, note how SameSchema is a family of types indexed over two values of type Schema. But note also that the sole constructor restricts the values we allow for s1 and s2: The two indices must be identical.

Why is this useful? Well, imagine we had a function for checking the equality of two schemata, which would try and return a value of type SameSchema s1 s2:

sameSchema : (s1, s2 : Schema) -> Maybe (SameSchema s1 s2)

We could then use this function to implement concatTables:

concatTables : Table -> Table -> Maybe Table
concatTables (MkTable s1 m rs1) (MkTable s2 n rs2) = case sameSchema s1 s2 of
  Just Same => Just $ MkTable s1 _ (rs1 ++ rs2)
  Nothing   => Nothing

It worked! What's going on here? Well, let's inspect the types involved:

concatTables2 : Table -> Table -> Maybe Table
concatTables2 (MkTable s1 m rs1) (MkTable s2 n rs2) = case sameSchema s1 s2 of
  Just Same => ?almost_there
  Nothing   => Nothing

At the REPL, we get the following context for almost_there:

Tutorial.Relations> :t almost_there
   m : Nat
   s2 : List ColType
   rs1 : Vect m (HList (map IdrisType s2))
   n : Nat
   rs2 : Vect n (HList (map IdrisType s2))
   s1 : List ColType
------------------------------
almost_there : Maybe Table

See how the types of rs1 and rs2 unify? Value Same, coming as the result of sameSchema s1 s2, is a witness that s1 and s2 are actually identical, because this is what we specified in the definition of Same.
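The effect of matching on such a witness is not specific to schemata. The following sketch shows the same technique for vector lengths. SameLength and zipSame are hypothetical names introduced for this illustration only.

```idris
import Data.Vect

%default total

-- Hypothetical witness type, analogous to SameSchema but for lengths.
data SameLength : (m, n : Nat) -> Type where
  Same : SameLength k k

-- Matching on Same tells Idris that m and n are the same number,
-- so ys can be used where a Vect m b is expected.
zipSame : SameLength m n -> Vect m a -> Vect n b -> Vect m (a, b)
zipSame Same xs ys = zip xs ys
```

If the Same pattern is replaced by a plain variable, the call to zip should no longer type-check, because the two lengths stay distinct.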

All that remains to do is to implement sameSchema. For this, we will write another data type for specifying when two values of type ColType are identical:

public export
data SameColType : (c1, c2 : ColType) -> Type where
  SameCT : SameColType c1 c1

We can now define several utility functions. First, one for figuring out if two column types are identical:

sameColType : (c1, c2 : ColType) -> Maybe (SameColType c1 c2)
sameColType I64     I64     = Just SameCT
sameColType Str     Str     = Just SameCT
sameColType Boolean Boolean = Just SameCT
sameColType Float   Float   = Just SameCT
sameColType _ _             = Nothing

This will convince Idris, because in each pattern match, the return type will be adjusted according to the values we matched on. For instance, on the first line, the output type is Maybe (SameColType I64 I64) as you can easily verify yourself by inserting a hole and checking its type at the REPL.

We will need two additional utilities: Functions for creating values of type SameSchema for the nil and cons cases. Note how the implementations are trivial. Still, we often have to quickly write such small proofs (I'll explain in the next section why I call them proofs), which will then be used to convince the type checker about some fact we already take for granted but Idris does not.

sameNil : SameSchema [] []
sameNil = Same

sameCons :  SameColType c1 c2
         -> SameSchema s1 s2
         -> SameSchema (c1 :: s1) (c2 :: s2)
sameCons SameCT Same = Same

As usual, it can help to understand what's going on by replacing the right-hand side of sameCons with a hole and checking out its type and context at the REPL. The presence of values SameCT and Same on the left-hand side forces Idris to unify c1 and c2 as well as s1 and s2, from which the unification of c1 :: s1 and c2 :: s2 immediately follows. With these, we can finally implement sameSchema:

sameSchema []        []        = Just sameNil
sameSchema (x :: xs) (y :: ys) =
  [| sameCons (sameColType x y) (sameSchema xs ys) |]
sameSchema (x :: xs) []        = Nothing
sameSchema []        (x :: xs) = Nothing

What we described here is a far stronger form of equality than what is provided by interface Eq and the (==) operator: Equality of values that is accepted by the type checker when trying to unify type-level indices. This is also called propositional equality: We will see below that we can view types as mathematical propositions, and values of these types as proofs that these propositions hold.

Type Equal

Propositional equality is such a fundamental concept that the Prelude already exports a general data type for it: Equal, with its only data constructor Refl. In addition, there is a built-in operator for expressing propositional equality, which gets desugared to Equal: (=). This can sometimes lead to some confusion, because the equals symbol is also used for definitional equality: Describing in function implementations that the left-hand side and right-hand side are defined to be equal. If you want to disambiguate propositional from definitional equality, you can also use operator (===) for the former.
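The two roles of the equals symbol can be seen side by side in the following small sketch. Both function names are hypothetical.

```idris
%default total

-- The (===) in the type states a proposition. The = after the
-- function name and its argument starts the definition.
plusZeroLeft : (n : Nat) -> 0 + n === n
plusZeroLeft n = Refl

-- sym from the Prelude turns a proof of x = y into one of y = x.
flipEq : (x, y : Nat) -> x = y -> y = x
flipEq _ _ prf = sym prf
```

Refl is accepted in plusZeroLeft, because addition on natural numbers is defined by pattern matching on its first argument, so 0 + n evaluates to n during unification.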

Here is another implementation of concatTables:

eqColType : (c1,c2 : ColType) -> Maybe (c1 = c2)
eqColType I64     I64     = Just Refl
eqColType Str     Str     = Just Refl
eqColType Boolean Boolean = Just Refl
eqColType Float   Float   = Just Refl
eqColType _ _             = Nothing

eqCons :  {0 c1,c2 : a}
       -> {0 s1,s2 : List a}
       -> c1 = c2 -> s1 = s2 ->  c1 :: s1 = c2 :: s2
eqCons Refl Refl = Refl

eqSchema : (s1,s2 : Schema) -> Maybe (s1 = s2)
eqSchema []        []        = Just Refl
eqSchema (x :: xs) (y :: ys) = [| eqCons (eqColType x y) (eqSchema xs ys) |]
eqSchema (x :: xs) []        = Nothing
eqSchema []        (x :: xs) = Nothing

concatTables3 : Table -> Table -> Maybe Table
concatTables3 (MkTable s1 m rs1) (MkTable s2 n rs2) = case eqSchema s1 s2 of
  Just Refl => Just $ MkTable _ _ (rs1 ++ rs2)
  Nothing   => Nothing

Exercises part 1

In the following exercises, you are going to implement some very basic properties of equality proofs. You'll have to come up with the types of the functions yourself, as the implementations will be incredibly simple.

Note: If you can't remember what the terms "reflexive", "symmetric", and "transitive" mean, quickly read about equivalence relations.

  1. Show that SameColType is a reflexive relation.

  2. Show that SameColType is a symmetric relation.

  3. Show that SameColType is a transitive relation.

  4. Let f be a function of type ColType -> a for an arbitrary type a. Show that from a value of type SameColType c1 c2 it follows that f c1 and f c2 are equal.

    For (=) the above properties are available from the Prelude as functions sym, trans, and cong. Reflexivity comes from the data constructor Refl itself.

  5. Implement a function for verifying that two natural numbers are identical. Try using cong in your implementation.

  6. Use the function from exercise 5 for zipping two Tables if they have the same number of rows.

    Hint: Use Vect.zipWith. You will need to implement custom function appRows for this, since Idris will not automatically figure out that the types unify when using HList.(++):

    appRows : {ts1 : _} -> Row ts1 -> Row ts2 -> Row (ts1 ++ ts2)
    

We will later learn how to use rewrite rules to circumvent the need of writing custom functions like appRows and use (++) in zipWith directly.

Programs as Proofs

module Tutorial.Eq.ProgramsAsProofs

import Tutorial.Eq.Eq

import Data.Either
import Data.HList
import Data.Vect
import Data.String

%default total

A famous observation by mathematician Haskell Curry and logician William Alvin Howard leads to the conclusion that we can view a type in a programming language with a sufficiently rich type system as a mathematical proposition and a total program calculating a value of this type as a proof that the proposition holds. This is also known as the Curry-Howard isomorphism.

For instance, here is a simple proof that one plus one equals two:

onePlusOne : the Nat 1 + 1 = 2
onePlusOne = Refl

The above proof is trivial, as Idris solves this by unification. But we already stated some more interesting things in the exercises. For instance, the symmetry and transitivity of SameColType:

sctSymmetric : SameColType c1 c2 -> SameColType c2 c1
sctSymmetric SameCT = SameCT

sctTransitive : SameColType c1 c2 -> SameColType c2 c3 -> SameColType c1 c3
sctTransitive SameCT SameCT = SameCT

Note that a type alone is not a proof. For instance, we are free to state that one plus one equals three:

onePlusOneWrong : the Nat 1 + 1 = 3

We will, however, have a hard time implementing this in a provably total way. We say: "The type the Nat 1 + 1 = 3 is uninhabited", meaning that there is no value of this type.

When Proofs replace Tests

We will see several different use cases for compile-time proofs, a very straightforward one being to show that our functions behave as they should by proving some properties about them. For instance, here is a proposition that map on lists does not change the number of elements in the list:

mapListLength : (f : a -> b) -> (as : List a) -> length as = length (map f as)

Read this as a universally quantified statement: For all functions f from a to b and for all lists as holding values of type a, the length of map f as is the same as the length of the original list.

We can implement mapListLength by pattern matching on as. The Nil case will be trivial: Idris solves this by unification. It knows the value of the input list (Nil), and since map is implemented by pattern matching on the input as well, it follows immediately that the result will be Nil as well:

mapListLength f []        = Refl

The cons case is more involved, and we will do this stepwise. First, note that we can prove that the length of a map over the tail will stay the same by means of recursion:

mapListLength f (x :: xs) = case mapListLength f xs of
  prf => ?mll1

Let's inspect the types and context we have here:

 0 b : Type
 0 a : Type
   xs : List a
   f : a -> b
   x : a
   prf : length xs = length (map f xs)
------------------------------
mll1 : S (length xs) = S (length (map f xs))

So, we have a proof of type length xs = length (map f xs), and from the implementation of map Idris concludes that what we are actually looking for is a result of type S (length xs) = S (length (map f xs)). This is exactly what function cong from the Prelude is for ("cong" is an abbreviation for congruence). We can thus implement the cons case concisely like so:

mapListLength f (x :: xs) = cong S $ mapListLength f xs

Please take a moment to appreciate what we achieved here: A proof in the mathematical sense that our function will not affect the length of our list. We no longer need a unit test or similar program to verify this.
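The same recipe (Refl for the empty list, cong S around the recursive call) carries over to similar statements about lists. As a further sketch, here is a proof that appending two lists adds their lengths. The name appendLengthSum is an assumption made for this example.

```idris
%default total

-- Hypothetical name: appending two lists adds up their lengths.
appendLengthSum :  (xs, ys : List a)
                -> length (xs ++ ys) = length xs + length ys
-- Both sides evaluate to length ys, so unification is enough.
appendLengthSum []        ys = Refl
-- Both sides evaluate to S applied to the two sides of the
-- recursive proof, so cong S closes the gap.
appendLengthSum (x :: xs) ys = cong S $ appendLengthSum xs ys
```

This works so smoothly because length, (++), and addition on natural numbers are all defined by pattern matching on their first list or number argument, which is also the argument we match on in the proof.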

Before we continue, please note an important thing: In our case expression, we used a variable for the result from the recursive call:

mapListLength f (x :: xs) = case mapListLength f xs of
  prf => cong S prf

Here, we did not want the two lengths to unify, because we needed the distinction in our call to cong. Therefore: If you need a proof of type x = y in order for two variables to unify, use the Refl data constructor in the pattern match. If, on the other hand, you need to run further computations on such a proof, use a variable and the left and right-hand sides will remain distinct.
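The following sketch contrasts the two ways of handling an equality proof. castVect and succEq are hypothetical helper names.

```idris
import Data.Vect

%default total

-- Matching on Refl unifies m and n, so xs already has the required type.
castVect : {0 m, n : Nat} -> m = n -> Vect m a -> Vect n a
castVect Refl xs = xs

-- Binding the proof to a variable keeps m and n distinct, so the
-- proof can be passed on to cong like any other value.
succEq : {0 m, n : Nat} -> m = n -> S m = S n
succEq prf = cong S prf
```

In castVect, replacing Refl with a variable should lead to a type error, because Idris would then have no reason to treat Vect m a and Vect n a as the same type.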

Here is another example from the last chapter: We want to show that parsing and printing column types behaves correctly. Writing proofs about parsers can be very hard in general, but here it can be done with a mere pattern match:

showColType : ColType -> String
showColType I64      = "i64"
showColType Str      = "str"
showColType Boolean  = "boolean"
showColType Float    = "float"

readColType : String -> Maybe ColType
readColType "i64"      = Just I64
readColType "str"      = Just Str
readColType "boolean"  = Just Boolean
readColType "float"    = Just Float
readColType s          = Nothing

showReadColType : (c : ColType) -> readColType (showColType c) = Just c
showReadColType I64     = Refl
showReadColType Str     = Refl
showReadColType Boolean = Refl
showReadColType Float   = Refl

Such simple proofs give us quick but strong guarantees that we did not make any stupid mistakes.

The examples we saw so far were very easy to implement. In general, this is not the case, and we will have to learn about several additional techniques in order to prove interesting things about our programs. However, when we use Idris as a general purpose programming language and not as a proof assistant, we are free to choose whether some aspect of our code needs such strong guarantees or not.

A Note of Caution: Lowercase Identifiers in Function Types

When writing down the types of proofs as we did above, one has to be very careful not to fall into the following trap: In general, Idris will treat lowercase identifiers in function types as type parameters (erased implicit arguments). For instance, here is an attempt at proving the identity functor law for Maybe:

mapMaybeId1 : (ma : Maybe a) -> map id ma = ma
mapMaybeId1 Nothing  = Refl
mapMaybeId1 (Just x) = ?mapMaybeId1_rhs

You will not be able to implement the Just case, because Idris treats id as an implicit argument as can easily be seen when inspecting the context of mapMaybeId1_rhs:

Tutorial.Relations> :t mapMaybeId1_rhs
 0 a : Type
 0 id : a -> a
   x : a
------------------------------
mapMaybeId1_rhs : Just (id x) = Just x

As you can see, id is an erased argument of type a -> a. And in fact, when type-checking this module, Idris will issue a warning that parameter id is shadowing an existing function:

Warning: We are about to implicitly bind the following lowercase names.
You may be unintentionally shadowing the associated global definitions:
  id is shadowing Prelude.Basics.id

The same is not true for map: Since we explicitly pass arguments to map, Idris treats this as a function name and not as an implicit argument.

You have several options here. For instance, you could use an uppercase identifier, as these will never be treated as implicit arguments:

Id : a -> a
Id = id

mapMaybeId2 : (ma : Maybe a) -> map Id ma = ma
mapMaybeId2 Nothing  = Refl
mapMaybeId2 (Just x) = Refl

As an alternative - and this is the preferred way to handle this case - you can prefix id with part of its namespace, which will immediately resolve the issue:

mapMaybeId : (ma : Maybe a) -> map Prelude.id ma = ma
mapMaybeId Nothing  = Refl
mapMaybeId (Just x) = Refl

Note: If you have semantic highlighting turned on in your editor (for instance, by using the idris2-lsp plugin), you will note that map and id in mapMaybeId1 get highlighted differently: map as a function name, id as a bound variable.

Exercises part 2

In these exercises, you are going to prove several simple properties of small functions. When writing proofs, it is even more important to use holes to figure out what Idris expects from you next. Use the tools given to you, instead of trying to find your way in the dark!

  1. Prove that map id on an Either e returns the value unmodified.

  2. Prove that map id on a list returns the list unmodified.

  3. Prove that complementing a strand of nucleobases (see the previous chapter) twice leads to the original strand.

    Hint: Prove this for single bases first, and use cong2 from the Prelude in your implementation for sequences of nucleic acids.

  4. Implement function replaceVect:

    replaceVect : (ix : Fin n) -> a -> Vect n a -> Vect n a
    

    Now prove that after replacing an element in a vector using replaceVect, accessing the same element using index will return the value we just added.

  5. Implement function insertVect:

    insertVect : (ix : Fin (S n)) -> a -> Vect n a -> Vect (S n) a
    

    Use a similar proof as in exercise 4 to show that this behaves correctly.

Note: Functions replaceVect and insertVect are available from Data.Vect as replaceAt and insertAt.

Into the Void

module Tutorial.Eq.Void

import Tutorial.Eq.Eq

import Data.Either
import Data.HList
import Data.Vect
import Data.String

%default total

Remember function onePlusOneWrong from above? This was definitely a wrong statement: One plus one does not equal three. Sometimes, we want to express exactly this: That a certain statement is false and does not hold. Consider for a moment what it means to prove a statement in Idris: Such a statement (or proposition) is a type, and a proof of the statement is a value or expression of this type: The type is said to be inhabited. If a statement is not true, there can be no value of the given type. We say the given type is uninhabited. If we still manage to get our hands on a value of an uninhabited type, that is a logical contradiction and from this, anything follows (remember ex falso quodlibet).

So this is how to express that a proposition does not hold: We state that if it held, this would lead to a contradiction. The most natural way to express a contradiction in Idris is to return a value of type Void:

onePlusOneWrongProvably : the Nat 1 + 1 = 3 -> Void
onePlusOneWrongProvably Refl impossible

See how this is a provably total implementation of the given type: A function from 1 + 1 = 3 to Void. We implement this by pattern matching, and there is only one constructor to match on, which leads to an impossible case.
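
The principle that anything follows from a contradiction can be sketched in code as well. The block below restates the refutation under the hypothetical name twoIsNotThree so that it is self-contained, and uses void from the Prelude to produce a value of an arbitrary type:

```idris
%default total

-- Same idea as onePlusOneWrongProvably, restated here so that the
-- block stands on its own (`twoIsNotThree` is a hypothetical name).
twoIsNotThree : the Nat 2 = 3 -> Void
twoIsNotThree Refl impossible

-- From a contradiction, anything follows: `void` converts a
-- (non-existent) value of type Void to a value of any type `a`.
-- (`exFalso` is a hypothetical name.)
exFalso : the Nat 2 = 3 -> a
exFalso prf = void (twoIsNotThree prf)
```

Since no value of type 2 = 3 can ever be constructed, exFalso can never actually be called at runtime, which is why it is safe for it to promise a result of any type.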

We can also use contradictory statements to prove other such statements. For instance, here is a proof that if the lengths of two lists are not the same, then the two lists can't be the same either:

notSameLength1 : (List.length as = length bs -> Void) -> as = bs -> Void
notSameLength1 f prf = f (cong length prf)

This is cumbersome to write and pretty hard to read, so there is function Not in the Prelude to express the same thing more naturally:

notSameLength : Not (List.length as = length bs) -> Not (as = bs)
notSameLength f prf = f (cong length prf)

Actually, this is just a specialized version of the contraposition of cong: If from a = b follows f a = f b, then from not (f a = f b) follows not (a = b):

contraCong : {0 f : _} -> Not (f a = f b) -> Not (a = b)
contraCong fun x = fun $ cong f x
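
As a further sketch of reasoning with Not, the following block derives one negative statement from another. All names (succNotEq, oneNotTwo, twoNotThree) are hypothetical and only serve as an illustration:

```idris
%default total

-- If m and n differ, so do their successors. Matching on Refl
-- unifies m with n, after which `contra Refl` yields the value
-- of type Void we need.
succNotEq : Not (m = n) -> Not (S m = S n)
succNotEq contra Refl = contra Refl

-- A base fact: one does not equal two.
oneNotTwo : Not (the Nat 1 = 2)
oneNotTwo Refl impossible

-- We get the next fact for free from the lemma above.
twoNotThree : Not (the Nat 2 = 3)
twoNotThree = succNotEq oneNotTwo
```

Note that twoNotThree could also be implemented directly with an impossible clause. The point of the sketch is that refutations are ordinary functions and can be composed like any other function.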

Interface Uninhabited

There is an interface in the Prelude for uninhabited types: Uninhabited with its sole function uninhabited. Have a look at its documentation at the REPL. You will see that there is already an impressive number of implementations available, many of which involve data type Equal.

We can use Uninhabited to express, for instance, that the empty schema is not equal to a non-empty schema:

Uninhabited (SameSchema [] (h :: t)) where
  uninhabited Same impossible

Uninhabited (SameSchema (h :: t) []) where
  uninhabited Same impossible

There is a related function you need to know about: absurd, which combines uninhabited with void:

Tutorial.Eq> :printdef absurd
Prelude.absurd : Uninhabited t => t -> a
absurd h = void (uninhabited h)
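
Here is a small sketch of absurd in use. The data type Light and the function fromImpossible are hypothetical and were made up for this illustration:

```idris
%default total

-- A hypothetical two-constructor type used only for illustration.
data Light = Red | Green

-- Distinct constructors can never be equal.
Uninhabited (Red = Green) where
  uninhabited Refl impossible

-- `absurd` lets us return a value of any type (here: Nat)
-- once we hold a proof of an uninhabited type.
fromImpossible : Red = Green -> Nat
fromImpossible prf = absurd prf
```

This is convenient whenever a function has to produce a result in a branch that can provably never be reached.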

Decidable Equality

When we implemented sameColType, we got a proof that two column types are indeed the same, from which we could figure out whether two schemata are identical. The types guarantee we do not generate any false positives: If we generate a value of type SameSchema s1 s2, we have a proof that s1 and s2 are indeed identical. However, sameColType and thus sameSchema could theoretically still produce false negatives by returning Nothing although the two values are identical. For instance, we could implement sameColType in such a way that it always returns Nothing. This would be in agreement with the types, but definitely not what we want. So, here is what we'd like to do in order to get yet stronger guarantees: We'd either want to return a proof that the two schemata are the same, or return a proof that the two schemata are not the same. (Remember that Not a is an alias for a -> Void.)

We call a property that either holds or leads to a contradiction a decidable property, and the Prelude exports data type Dec prop, which encapsulates this distinction.
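
To get a feeling for such a type, the following sketch mirrors the general shape of Dec under hypothetical names (Decision, Holds, Refuted), so that it does not clash with the Prelude. The actual definition in the Prelude may differ in detail:

```idris
%default total

-- Either a proof that `prop` holds, or a proof that assuming
-- `prop` leads to a contradiction.
data Decision : (prop : Type) -> Type where
  Holds   : (prf : prop) -> Decision prop
  Refuted : (contra : prop -> Void) -> Decision prop

-- A tiny decision procedure: is a natural number zero?
isZero : (n : Nat) -> Decision (n = 0)
isZero 0     = Holds Refl
isZero (S k) = Refuted (\case Refl impossible)
```

Unlike a function returning Maybe (n = 0), isZero cannot cheat in the negative case: it has to supply an actual refutation.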

Here is a way to encode this for ColType:

decSameColType :  (c1,c2 : ColType) -> Dec (SameColType c1 c2)
decSameColType I64 I64         = Yes SameCT
decSameColType I64 Str         = No $ \case SameCT impossible
decSameColType I64 Boolean     = No $ \case SameCT impossible
decSameColType I64 Float       = No $ \case SameCT impossible

decSameColType Str I64         = No $ \case SameCT impossible
decSameColType Str Str         = Yes SameCT
decSameColType Str Boolean     = No $ \case SameCT impossible
decSameColType Str Float       = No $ \case SameCT impossible

decSameColType Boolean I64     = No $ \case SameCT impossible
decSameColType Boolean Str     = No $ \case SameCT impossible
decSameColType Boolean Boolean = Yes SameCT
decSameColType Boolean Float   = No $ \case SameCT impossible

decSameColType Float I64       = No $ \case SameCT impossible
decSameColType Float Str       = No $ \case SameCT impossible
decSameColType Float Boolean   = No $ \case SameCT impossible
decSameColType Float Float     = Yes SameCT

First, note how we could use a pattern match in a single argument lambda directly. This is sometimes called the lambda case style, named after an extension of the Haskell programming language. If we use the SameCT constructor in the pattern match, Idris is forced to try and unify for instance Float with I64. This is not possible, so the case as a whole is impossible.

Yet, this was pretty cumbersome to implement. In order to convince Idris we did not miss a case, there is no way around treating every possible pairing of constructors explicitly. However, we get much stronger guarantees out of this: We can no longer create false positives or false negatives, and therefore, decSameColType is provably correct.

Doing the same thing for schemata requires some utility functions, the types of which we can figure out by placing some holes:

decSameSchema' :  (s1, s2 : Schema) -> Dec (SameSchema s1 s2)
decSameSchema' []        []        = Yes Same
decSameSchema' []        (y :: ys) = No ?decss1
decSameSchema' (x :: xs) []        = No ?decss2
decSameSchema' (x :: xs) (y :: ys) = case decSameColType x y of
  Yes SameCT => case decSameSchema' xs ys of
    Yes Same => Yes Same
    No  contra => No $ \prf => ?decss3
  No  contra => No $ \prf => ?decss4

The first two cases are not too hard. The type of decss1 is SameSchema [] (y :: ys) -> Void, which you can easily verify at the REPL. But that's just uninhabited, specialized to SameSchema [] (y :: ys), and this we already implemented further above. The same goes for decss2.

The other two cases are harder, so I already filled in as much stuff as possible. We know that we want to return a No, if either the heads or tails are provably distinct. The No holds a function, so I already added a lambda, leaving a hole only for the return value. Here are the type and - more important - context of decss3:

Tutorial.Relations> :t decss3
   y : ColType
   xs : List ColType
   ys : List ColType
   x : ColType
   contra : SameSchema xs ys -> Void
   prf : SameSchema (y :: xs) (y :: ys)
------------------------------
decss3 : Void

The types of contra and prf are what we need here: If xs and ys are distinct, then y :: xs and y :: ys must be distinct as well. This is the contraposition of the following statement: If x :: xs is the same as y :: ys, then xs and ys are the same as well. We must therefore implement a lemma, which proves that the cons constructor is injective:

consInjective :  SameSchema (c1 :: cs1) (c2 :: cs2)
              -> (SameColType c1 c2, SameSchema cs1 cs2)
consInjective Same = (SameCT, Same)

We can now pass prf to consInjective to extract a value of type SameSchema xs ys, which we then pass to contra in order to get the desired value of type Void. With these observations and utilities, we can now implement decSameSchema:

decSameSchema :  (s1, s2 : Schema) -> Dec (SameSchema s1 s2)
decSameSchema []        []        = Yes Same
decSameSchema []        (y :: ys) = No absurd
decSameSchema (x :: xs) []        = No absurd
decSameSchema (x :: xs) (y :: ys) = case decSameColType x y of
  Yes SameCT => case decSameSchema xs ys of
    Yes Same   => Yes Same
    No  contra => No $ contra . snd . consInjective
  No  contra => No $ contra . fst . consInjective

There is an interface called DecEq exported by module Decidable.Equality for types for which we can implement a decision procedure for propositional equality. We can implement this to figure out if two values are equal or not.
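
As a hedged illustration of how decEq can be used, the following sketch compares the lengths of two vectors at runtime and zips them only if the lengths are provably equal. Function zipSameLength is a hypothetical name and not part of the tutorial:

```idris
import Data.Vect
import Decidable.Equality

%default total

-- Both lengths are unerased implicits, so we can compare them
-- at runtime. Matching on `Yes Refl` makes `m` and `n` unify,
-- after which `zip` from Data.Vect type-checks.
zipSameLength :  {m, n : Nat}
              -> Vect m a
              -> Vect n b
              -> Maybe (Vect m (a, b))
zipSameLength xs ys = case decEq m n of
  Yes Refl => Just (zip xs ys)
  No  _    => Nothing
```

The result is wrapped in a Maybe here because we do not need the refutation in the No case. If client code should learn why zipping failed, the refutation could be returned as well.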

Exercises part 3

  1. Show that there can be no non-empty vector of Void by writing a corresponding implementation of uninhabited.

  2. Generalize exercise 1 for all uninhabited element types.

  3. Show that if a = b cannot hold, then b = a cannot hold either.

  4. Show that if a = b holds, and b = c cannot hold, then a = c cannot hold either.

  5. Implement Uninhabited for Crud i a. Try to be as general as possible.

    data Crud : (i : Type) -> (a : Type) -> Type where
      Create : (value : a) -> Crud i a
      Update : (id : i) -> (value : a) -> Crud i a
      Read   : (id : i) -> Crud i a
      Delete : (id : i) -> Crud i a
    
  6. Implement DecEq for ColType.

  7. Implementations such as the one from exercise 6 are cumbersome to write as they require a quadratic number of pattern matches with relation to the number of data constructors. Here is a trick to make this more bearable.

    1. Implement a function ctNat, which assigns every value of type ColType a unique natural number.

    2. Prove that ctNat is injective. Hint: You will need to pattern match on the ColType values, but four matches should be enough to satisfy the coverage checker.

    3. In your implementation of DecEq for ColType, use decEq on the result of applying both column types to ctNat, thus reducing it to only two lines of code.

    We will later talk about with rules: Special forms of dependent pattern matches, that allow us to learn something about the shape of function arguments by performing computations on them. These will allow us to use a similar technique as shown here to implement DecEq requiring only n pattern matches for arbitrary sum types with n data constructors.

Rewrite Rules

module Tutorial.Eq.Rewrite

import Data.Either
import Data.HList
import Data.Vect
import Data.String

%default total

One of the most important use cases of propositional equality is to replace or rewrite existing types, which Idris can't unify automatically otherwise. For instance, the following is no problem: Idris knows that 0 + n equals n, because plus on natural numbers is implemented by pattern matching on the first argument. The two vector lengths therefore unify just fine.

leftZero :  List (Vect n Nat)
         -> List (Vect (0 + n) Nat)
         -> List (Vect n Nat)
leftZero = (++)

However, the example below can't be implemented as easily (try it!), because Idris can't figure out on its own that the two lengths unify.

rightZero' :  List (Vect n Nat)
           -> List (Vect (n + 0) Nat)
           -> List (Vect n Nat)

Probably for the first time, we realize just how little Idris knows about the laws of arithmetic. Idris is able to unify values when

  • all values in a computation are known at compile time
  • one expression follows directly from the other due to the pattern matches used in a function's implementation.

In expression n + 0, not all values are known (n is a variable), and (+) is implemented by pattern matching on the first argument, about which we know nothing here.
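
The following sketch illustrates these rules with three small propositions. All names are hypothetical. The last definition is wrapped in a failing block, which only type-checks if its content is rejected by Idris:

```idris
%default total

-- All values are known at compile time: both sides evaluate to 5.
allKnown : the Nat 2 + 3 = 5
allKnown = Refl

-- The first argument is known, and addition on Nat pattern matches
-- on its first argument, so `2 + n` reduces to `S (S n)`.
firstKnown : (n : Nat) -> 2 + n = S (S n)
firstKnown n = Refl

-- Here, the first argument is a variable, so evaluation is stuck
-- and Idris can't unify the two sides on its own.
failing
  stuck : (n : Nat) -> n + 2 = S (S n)
  stuck n = Refl
```

The proposition in the failing block is of course true. Idris just needs our help to see it, which is what the rest of this section is about.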

However, we can teach Idris. If we can prove that the two expressions are equivalent, we can substitute one expression for the other, so that the two unify again. Here is a lemma and its proof, that n + 0 equals n, for all natural numbers n.

addZeroRight : (n : Nat) -> n + 0 = n
addZeroRight 0     = Refl
addZeroRight (S k) = cong S $ addZeroRight k

Note, how the base case is trivial: Since there are no variables left, Idris can immediately figure out that 0 + 0 = 0. In the recursive case, it can be instructive to replace cong S with a hole and look at its type and context to figure out how to proceed.

The Prelude exports function replace for substituting one variable in a term by another, based on a proof of equality. Make sure to inspect its type first before looking at the example below:

replaceVect : Vect (n + 0) a -> Vect n a
replaceVect as = replace {p = \k => Vect k a} (addZeroRight n) as

As you can see, we replace a value of type p x with a value of type p y based on a proof that x = y, where p is a function from some type t to Type, and x and y are values of type t. In our replaceVect example, t equals Nat, x equals n + 0, y equals n, and p equals \k => Vect k a.
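
To demystify replace a bit, here is a sketch of how such a function could be written by hand. The primed name replace' and the function dropPlusZero are hypothetical, and the definition in the Prelude may differ in detail (for instance in the quantities it uses):

```idris
import Data.Vect

%default total

-- Matching on Refl makes x and y unify, so `p x` and `p y`
-- are the same type and we can return the value unchanged.
replace' : {0 p : t -> Type} -> (0 rule : x = y) -> p x -> p y
replace' Refl v = v

addZeroRight : (n : Nat) -> n + 0 = n
addZeroRight 0     = Refl
addZeroRight (S k) = cong S $ addZeroRight k

-- Here t is Nat, x is n + 0, y is n, and p is \k => Vect k a.
dropPlusZero : Vect (n + 0) a -> Vect n a
dropPlusZero xs = replace' {p = \k => Vect k a} (addZeroRight n) xs
```

The whole trick lies in the pattern match on Refl. Everything else is bookkeeping to tell Idris in which position of the type the substitution should happen.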

Using replace directly is not very convenient, because Idris can often not infer the value of p on its own. Indeed, we had to pass p explicitly in replaceVect. Idris therefore provides special syntax for such rewrite rules, which will get desugared to calls to replace with all the details filled in for us. Here is an implementation of replaceVect with a rewrite rule:

rewriteVect : Vect (n + 0) a -> Vect n a
rewriteVect as = rewrite sym (addZeroRight n) in as

One source of confusion is that rewrite uses proofs of equality the other way round: Given a proof of y = x, it replaces p x with p y. Hence the need to call sym in our implementation above.
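
A way to keep the direction straight is to remember that rewrite acts on the type of the result we are supposed to produce: the left-hand side of the equality is looked up in that type and replaced with the right-hand side. The sketch below shows both directions next to each other. Functions addPlusZero and dropPlusZero are hypothetical names:

```idris
import Data.Vect

%default total

addZeroRight : (n : Nat) -> n + 0 = n
addZeroRight 0     = Refl
addZeroRight (S k) = cong S $ addZeroRight k

-- The result type mentions n + 0. Rewriting with a proof of
-- n + 0 = n turns it into Vect n a, which is the type of xs.
addPlusZero : Vect n a -> Vect (n + 0) a
addPlusZero xs = rewrite addZeroRight n in xs

-- Here the result type is Vect n a, so we need the flipped proof
-- (n = n + 0) to turn it into Vect (n + 0) a, the type of xs.
dropPlusZero : Vect (n + 0) a -> Vect n a
dropPlusZero xs = rewrite sym (addZeroRight n) in xs
```

If a rewrite does not seem to have any effect, it is often worth checking whether the equality needs to be flipped with sym.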

Use Case: Reversing Vectors

Rewrite rules are often required when we perform interesting type-level computations. For instance, we have already seen many interesting examples of functions operating on Vect, which allowed us to keep track of the exact lengths of the vectors involved, but one key functionality has been missing from our discussions so far, and for good reasons: Function reverse. Here is a possible implementation, which is how reverse is implemented for lists:

revOnto' : Vect m a -> Vect n a -> Vect (m + n) a
revOnto' xs []        = xs
revOnto' xs (x :: ys) = revOnto' (x :: xs) ys


reverseVect' : Vect n a -> Vect n a
reverseVect' = revOnto' []

As you might have guessed, this will not compile as the length indices in the two clauses of revOnto' do not unify.

The nil case is a case we've already seen above: Here n is zero, because the second vector is empty, so we have to convince Idris once again that m + 0 = m:

revOnto : Vect m a -> Vect n a -> Vect (m + n) a
revOnto xs [] = rewrite addZeroRight m in xs

The second case is more complex. Here, Idris fails to unify S (m + len) with m + S len, where len is the length of ys, the tail of the second vector. Module Data.Nat provides many proofs about arithmetic operations on natural numbers, one of which is plusSuccRightSucc. Here's its type:

Tutorial.Eq> :t plusSuccRightSucc
Data.Nat.plusSuccRightSucc :  (left : Nat)
                           -> (right : Nat)
                           -> S (left + right) = left + S right

In our case, we want to replace S (m + len) with m + S len, so we will need the version with arguments flipped. However, there is one more obstacle: We need to invoke plusSuccRightSucc with the length of ys, which is not given as an implicit function argument of revOnto. We therefore need to pattern match on n (the length of the second vector), in order to bind the length of the tail to a variable. Remember, that we are allowed to pattern match on an erased argument only if the constructor used follows from a match on another, unerased, argument (ys in this case). Here's the implementation of the second case:

revOnto {n = S len} xs (x :: ys) =
  rewrite sym (plusSuccRightSucc m len) in revOnto (x :: xs) ys
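
For reference, here are the two clauses assembled into one self-contained sketch, together with a reverseVect function defined in the same way as reverseVect' above:

```idris
import Data.Nat
import Data.Vect

%default total

addZeroRight : (n : Nat) -> n + 0 = n
addZeroRight 0     = Refl
addZeroRight (S k) = cong S $ addZeroRight k

revOnto : Vect m a -> Vect n a -> Vect (m + n) a
-- Result type: Vect (m + 0) a, rewritten to Vect m a.
revOnto xs [] = rewrite addZeroRight m in xs
-- Result type: Vect (m + S len) a, rewritten to Vect (S (m + len)) a,
-- which is the type of the recursive call.
revOnto {n = S len} xs (x :: ys) =
  rewrite sym (plusSuccRightSucc m len) in revOnto (x :: xs) ys

-- `revOnto []` has type Vect n a -> Vect (0 + n) a, and 0 + n
-- reduces to n, so no rewrite is needed here.
reverseVect : Vect n a -> Vect n a
reverseVect = revOnto []
```

For instance, reverseVect applied to [1, 2, 3] produces [3, 2, 1], and the types guarantee that the length of the vector is preserved.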

I know from my own experience that this can be highly confusing at first. If you use Idris as a general purpose programming language and not as a proof assistant, you probably will not have to use rewrite rules too often. Still, it is important to know that they exist, as they allow us to teach complex equivalences to Idris.

A Note on Erasure

Single value data types like Unit, Equal, or SameSchema have no runtime relevance, as values of these types are always identical. We can therefore always use them as erased function arguments while still being able to pattern match on these values. For instance, when you look at the type of replace, you will see that the equality proof is an erased argument. This allows us to run arbitrarily complex computations to produce such values without fear of these computations slowing down the compiled Idris program.

Exercises part 4

  1. Implement plusSuccRightSucc yourself.

  2. Prove that minus n n equals zero for all natural numbers n.

  3. Prove that minus n 0 equals n for all natural numbers n.

  4. Prove that n * 1 = n and 1 * n = n for all natural numbers n.

  5. Prove that addition of natural numbers is commutative.

  6. Implement a tail-recursive version of map for vectors.

  7. Prove the following proposition:

    mapAppend :  (f : a -> b)
              -> (xs : List a)
              -> (ys : List a)
              -> map f (xs ++ ys) = map f xs ++ map f ys
    
  8. Use the proof from exercise 7 to implement again a function for zipping two Tables, this time using a rewrite rule plus Data.HList.(++) instead of custom function appRows.

Conclusion

The concept of types as propositions, values as proofs is a very powerful tool for writing provably correct programs. We will therefore spend some more time defining data types for describing contracts between values, and values of these types as proofs that the contracts hold. This will allow us to describe necessary pre- and postconditions for our functions, thus reducing the need to return a Maybe or other failure type, because due to the restricted input, our functions can no longer fail.

Predicates and Proof Search

In the last chapter we learned about propositional equality, which allowed us to prove that two values are equal. Equality is a relation between values, and we used an indexed data type to encode this relation by limiting the degrees of freedom of the indices in the sole data constructor. There are other relations and contracts we can encode this way. This will allow us to restrict the values we accept as a function's arguments or the values returned by functions.

Preconditions

module Tutorial.Predicates.Preconditions

import Data.Either
import Data.List1
import Data.String
import Data.Vect
import Data.HList
import Decidable.Equality

import Text.CSV
import System.File

%default total

Often, when we implement functions operating on values of a given type, not all values are considered to be valid arguments for the function in question. For instance, we typically do not allow division by zero, as the result is undefined in the general case. This concept of putting a precondition on a function argument comes up pretty often, and there are several ways to go about this.

A very common operation when working with lists or other container types is to extract the first value in the sequence. This function, however, cannot work in the general case, because in order to extract a value from a list, the list must not be empty. Here are a couple of ways to encode and implement this, each with its own advantages and disadvantages:

  • Wrap the result in a failure type, such as a Maybe or Either e with some custom error type e. This makes it immediately clear that the function might not be able to return a result. It is a natural way to deal with unvalidated input from unknown sources. The drawback of this approach is that results will carry the Maybe stain, even in situations when we know that the nil case is impossible, for instance because we know the value of the list argument at compile-time, or because we already refined the input value in such a way that we can be sure it is not empty (due to an earlier pattern match, for instance).

  • Define a new data type for non-empty lists and use this as the function's argument. This is the approach taken in module Data.List1. It allows us to return a pure value (meaning "not wrapped in a failure type" here), because the function cannot possibly fail, but it comes with the burden of reimplementing many of the utility functions and interfaces we already implemented for List. For a very common data structure this can be a valid option, but for rare use cases it is often too cumbersome.

  • Use an index to keep track of the property we are interested in. This was the approach we took with type family List01, which we saw in several examples and exercises in this guide so far. This is also the approach taken with vectors, where we use the exact length as our index, which is even more expressive. While this allows us to implement many functions only once and with greater precision at the type level, it also comes with the burden of keeping track of changes in the types, making for more complex function types and forcing us to at times return existentially quantified wrappers (for instance, dependent pairs), because the outcome of a computation is not known until runtime.

  • Fail with a runtime exception. This is a popular solution in many programming languages (even Haskell), but in Idris we try to avoid this, because it breaks totality in a way, which also affects client code. Luckily, we can make use of our powerful type system to avoid this situation in general.

  • Take an additional (possibly erased) argument of a type we can use as a witness that the input value is of the correct kind or shape. This is the solution we will discuss in this chapter in great detail. It is an incredibly powerful way to talk about restrictions on values without having to replicate a lot of already existing functionality.

There is a time and place for most if not all of the solutions listed above in Idris, but we will often turn to the last one and refine function arguments with predicates (so-called preconditions), because it makes our functions nice to use at runtime and compile time.

Example: Non-empty Lists

Remember how we implemented an indexed data type for propositional equality: We restricted the valid values of the indices in the constructors. We can do the same thing for a predicate for non-empty lists:

data NotNil : (as : List a) -> Type where
  IsNotNil : NotNil (h :: t)

This is a single-value data type, so we can always use it as an erased function argument and still pattern match on it. We can now use this to implement a safe and pure head function:

head1 : (as : List a) -> (0 _ : NotNil as) -> a
head1 (h :: _) _ = h
head1 [] IsNotNil impossible

Note, how value IsNotNil is a witness that its index, which corresponds to our list argument, is indeed non-empty, because this is what we specified in its type. The impossible case in the implementation of head1 is not strictly necessary here. It was given above for completeness.

We call NotNil a predicate on lists, as it restricts the values allowed in the index. We can express a function's preconditions by adding additional (possibly erased) predicates to the function's list of arguments.

The first really cool thing is how we can safely use head1, if we can at compile-time show that our list argument is indeed non-empty:

headEx1 : Nat
headEx1 = head1 [1,2,3] IsNotNil

It is a bit cumbersome that we have to pass the IsNotNil proof manually. Before we scratch that itch, we will first discuss what to do with lists, the values of which are not known until runtime. For these cases, we have to try and produce a value of the predicate programmatically by inspecting the runtime list value. In the most simple case, we can wrap the proof in a Maybe, but if we can show that our predicate is decidable, we can get even stronger guarantees by returning a Dec:

Uninhabited (NotNil []) where
  uninhabited IsNotNil impossible

nonEmpty : (as : List a) -> Dec (NotNil as)
nonEmpty (x :: xs) = Yes IsNotNil
nonEmpty []        = No uninhabited

With this, we can implement function headMaybe1, which is to be used with lists of unknown origin:

headMaybe1 : List a -> Maybe a
headMaybe1 as = case nonEmpty as of
  Yes prf => Just $ head1 as prf
  No  _   => Nothing

Of course, for trivial functions like headMaybe it makes more sense to implement them directly by pattern matching on the list argument, but we will soon see examples of predicates the values of which are more cumbersome to create.

Auto Implicits

Having to manually pass a proof of being non-empty to head1 makes this function unnecessarily verbose to use at compile time. Idris allows us to define implicit function arguments, the values of which it tries to assemble on its own by means of a technique called proof search. This is not to be confused with type inference, which means inferring values or types from the surrounding context. It's best to look at some examples to explain the difference.

Let us first have a look at the following implementation of replicate for vectors:

replicate' : {n : _} -> a -> Vect n a
replicate' {n = 0}   _ = []
replicate' {n = S _} v = v :: replicate' v

Function replicate' takes an unerased implicit argument. The value of this argument must be derivable from the surrounding context. For instance, in the following example it is immediately clear that n equals three, because that is the length of the vector we want:

replicateEx1 : Vect 3 Nat
replicateEx1 = replicate' 12

In the next example, the value of n is not known at compile time, but it is available as an unerased implicit, so this can again be passed as is to replicate':

replicateEx2 : {n : _} -> Vect n Nat
replicateEx2 = replicate' 12

However, in the following example, the value of n can't be inferred, as the intermediary vector is immediately converted to a list of unknown length. Although Idris could try and insert any value for n here, it won't do so, because it can't be sure that this is the length we want. We therefore have to pass the length explicitly:

replicateEx3 : List Nat
replicateEx3 = toList $ replicate' {n = 17} 12

Note, how the value of n had to be inferable in these examples, which means it had to make an appearance in the surrounding context. With auto implicit arguments, this works differently. Here is the head example, this time with an auto implicit:

head : (as : List a) -> {auto 0 prf : NotNil as} -> a
head (x :: _) = x
head [] impossible

Note the auto keyword before the quantity of implicit argument prf. This means that we want Idris to construct this value on its own, without it being visible in the surrounding context. In order to do so, Idris must know the structure of the list argument as at compile time. It will then try to build such a value from the data type's constructors. If it succeeds, this value will be filled in automatically as the desired argument; otherwise, Idris will fail with a type error.

Let's see this in action:

headEx3 : Nat
headEx3 = Preconditions.head [1,2,3]
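
Auto implicits are still ordinary implicit arguments, so we can pass them by name if proof search is not what we want. The following self-contained sketch repeats the NotNil predicate and uses the hypothetical names safeHead, searched, and manual:

```idris
%default total

data NotNil : (as : List a) -> Type where
  IsNotNil : NotNil (h :: t)

safeHead : (as : List a) -> {auto 0 prf : NotNil as} -> a
safeHead (x :: _) = x
safeHead [] impossible

-- Proof search builds the IsNotNil value for us.
searched : Nat
searched = safeHead [1,2,3]

-- Alternatively, we pass the proof explicitly by name.
manual : Nat
manual = safeHead [1,2,3] {prf = IsNotNil}
```

Both definitions evaluate to the first element of the list. Passing the proof manually can be useful when debugging a failing proof search, because the error message then refers to the concrete value we tried to pass.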

The following example fails with an error:

failing "Can't find an implementation\nfor NotNil []."
  errHead : Nat
  errHead = Preconditions.head []

Wait! "Can't find an implementation for..."? Is this not the error message we get for missing interface implementations? That's correct, and I'll show you that interface resolution is just proof search at the end of this chapter. What I can show you already is that writing the lengthy {auto prf : t} -> all the time can be cumbersome. Idris therefore allows us to use the same syntax as for constrained functions instead: (prf : t) =>, or even t =>, if we don't need to name the constraint. As usual, we can then access a constraint in the function body by its name (if any). Here is another implementation of head:

head' : (as : List a) -> (0 _ : NotNil as) => a
head' (x :: _) = x
head' [] impossible

During proof search, Idris will also look for values of the required type in the current function context. This allows us to implement headMaybe without having to pass on the NotNil proof manually:

headMaybe : List a -> Maybe a
headMaybe as = case nonEmpty as of
  -- `prf` is available during proof search
  Yes prf => Just $ Preconditions.head as
  No  _   => Nothing

To conclude: Predicates allow us to restrict the values a function accepts as arguments. At runtime, we need to build such witnesses by pattern matching on the function arguments. These operations can typically fail. At compile time, we can let Idris try and build these values for us using a technique called proof search. This allows us to make functions safe and convenient to use at the same time.

Exercises part 1

In these exercises, you'll have to implement several functions making use of auto implicits, to constrain the values accepted as function arguments. The results should be pure, that is, not wrapped in a failure type like Maybe.

  1. Implement tail for lists.

  2. Implement concat1 and foldMap1 for lists. These should work like concat and foldMap, but taking only a Semigroup constraint on the element type.

  3. Implement functions for returning the largest and smallest element in a list.

  4. Define a predicate for strictly positive natural numbers and use it to implement a safe and provably total division function on natural numbers.

  5. Define a predicate for a non-empty Maybe and use it to safely extract the value stored in a Just. Show that this predicate is decidable by implementing a corresponding conversion function.

  6. Define and implement functions for safely extracting values from a Left and a Right by using suitable predicates. Show again that these predicates are decidable.

The predicates you implemented in these exercises are already available in the base library: Data.List.NonEmpty, Data.Maybe.IsJust, Data.Either.IsLeft, Data.Either.IsRight, and Data.Nat.IsSucc.

Contracts between Values

module Tutorial.Predicates.Contracts

import Data.Either
import Data.List1
import Data.String
import Data.Vect
import Data.HList
import Decidable.Equality

import Text.CSV
import System.File

%default total

The predicates we saw so far restricted the values of a single type, but it is also possible to define predicates describing contracts between several values of possibly distinct types.

The Elem Predicate

Assume we'd like to extract a value of a given type from a heterogeneous list:

get' : (0 t : Type) -> HList ts -> t

This can't work in general: If we could implement this we would immediately have a proof of void:

voidAgain : Void
voidAgain = get' Void []

The problem is obvious: The type of which we'd like to extract a value must be an element of the index of the heterogeneous list. Here is a predicate, with which we can express this:

public export
data Elem : (elem : a) -> (as : List a) -> Type where
  Here  : Elem x (x :: xs)
  There : Elem x xs -> Elem x (y :: xs)

This is a predicate describing a contract between two values: A value of type a and a list of as. Values of this predicate are witnesses that the value is an element of the list. Note, how this is defined recursively: The case where the value we look for is at the head of the list is handled by the Here constructor, where the same variable (x) is used for the element and the head of the list. The case where the value is deeper within the list is handled by the There constructor. This can be read as follows: If x is an element of xs, then x is also an element of y :: xs for any value y. Let's write down some examples to get a feel for these:

MyList : List Nat
MyList = [1,3,7,8,4,12]

oneElemMyList : Elem 1 MyList
oneElemMyList = Here

sevenElemMyList : Elem 7 MyList
sevenElemMyList = There $ There Here

Now, Elem is just another way of indexing into a list of values. Instead of using a Fin index, which is limited by the list's length, we use a proof that a value can be found at a certain position.
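
To make the "proof as index" reading concrete, here is a small sketch that converts an Elem proof into the position it encodes. It uses the Elem type from Data.List.Elem in base (mentioned at the end of the exercises above as already being available there), which has the same Here and There constructors. The helper positionOf and the example value sevenInList are hypothetical names introduced for illustration:

```idris
import Data.List.Elem

%default total

-- Hypothetical helper: the position an `Elem` proof points at.
-- Every `There` moves one step further into the list.
positionOf : Elem x xs -> Nat
positionOf Here      = Z
positionOf (There p) = S (positionOf p)

-- 7 is the third element, so the proof consists of two `There`s.
sevenInList : Elem (the Nat 7) [1,3,7,8]
sevenInList = There (There Here)

sevenPos : Nat
sevenPos = positionOf sevenInList
```

Unlike a plain natural number, the proof cannot point past the end of the list, and its type records which value is found at that position.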

We can use the Elem predicate to extract a value of the desired type from a heterogeneous list:

get : (0 t : Type) -> HList ts -> (prf : Elem t ts) => t

It is important to note that the auto implicit must not be erased in this case. This is no longer a single-value data type, and we must be able to pattern match on this value in order to figure out how far within the heterogeneous list our value is stored:

get t (v :: vs) {prf = Here}    = v
get t (v :: vs) {prf = There p} = get t vs
get _ [] impossible

It can be instructive to implement get yourself, using holes on the right hand side to see the context and types of values Idris infers based on the value of the Elem predicate.

Let's give this a spin at the REPL:

Tutorial.Predicates> get Nat ["foo", Just "bar", S Z]
1
Tutorial.Predicates> get Nat ["foo", Just "bar"]
Error: Can't find an implementation for Elem Nat [String, Maybe String].

(Interactive):1:1--1:28
 1 | get Nat ["foo", Just "bar"]
     ^^^^^^^^^^^^^^^^^^^^^^^^^^^

With this example we start to appreciate what proof search actually means: Given a value v and a list of values vs, Idris tries to find a proof that v is an element of vs. Now, before we continue, please note that proof search is not a silver bullet. The search algorithm has a limited search depth, and the search will fail if this limit is exceeded. For instance:

Tps : List Type
Tps = List.replicate 50 Nat ++ [Maybe String]

hlist : HList Tps
hlist = [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
        , 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
        , 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
        , 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
        , 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
        , Nothing ]

And at the REPL:

Tutorial.Predicates> get (Maybe String) hlist
Error: Can't find an implementation for Elem (Maybe String) [Nat,...

As you can see, Idris fails to find a proof that Maybe String is an element of Tps. The search depth can be increased with the %auto_implicit_depth directive, which will hold for the rest of the source file or until set to a different value. The default value is set at 25. In general, it is not advisable to set this to too large a value, as this can drastically increase compile times.

%auto_implicit_depth 100
aMaybe : Maybe String
aMaybe = get _ hlist

%auto_implicit_depth 25

Use Case: A nicer Schema

In the chapter about sigma types, we introduced a schema for CSV files. This was not very nice to use, because we had to use natural numbers to access a certain column. Even worse, users of our small library had to do the same. There was no way to define a name for each column and access columns by name. We are going to change this. Here is an encoding for this use case:

public export
data ColType = I64 | Str | Boolean | Float

public export
IdrisType : ColType -> Type
IdrisType I64     = Int64
IdrisType Str     = String
IdrisType Boolean = Bool
IdrisType Float   = Double

public export
record Column where
  constructor MkColumn
  name : String
  type : ColType

infixr 8 :>

public export
(:>) : String -> ColType -> Column
(:>) = MkColumn

public export
Schema : Type
Schema = List Column

export
Show ColType where
  show I64     = "I64"
  show Str     = "Str"
  show Boolean = "Boolean"
  show Float   = "Float"

Show Column where
  show (MkColumn n ct) = "\{n}:\{show ct}"

export
showSchema : Schema -> String
showSchema = concat . intersperse "," . map show

As you can see, in a schema we now pair a column's type with its name. Here is an example schema for a CSV file holding information about employees in a company:

EmployeeSchema : Schema
EmployeeSchema = [ "firstName"  :> Str
                 , "lastName"   :> Str
                 , "email"      :> Str
                 , "age"        :> I64
                 , "salary"     :> Float
                 , "management" :> Boolean
                 ]

Such a schema could of course again be read from user input, but we will postpone implementing a parser until later in this chapter. Using this new schema with an HList directly led to issues with type inference, so I quickly wrote a custom row type: A heterogeneous list indexed over a schema.

public export
data Row : Schema -> Type where
  Nil  : Row []

  (::) :  {0 name : String}
       -> {0 type : ColType}
       -> (v : IdrisType type)
       -> Row ss
       -> Row (name :> type :: ss)

In the signature of cons, I list the erased implicit arguments explicitly. This is good practice, as otherwise Idris will often issue shadowing warnings when using such data constructors in client code.

We can now define a type alias for CSV rows representing employees:

0 Employee : Type
Employee = Row EmployeeSchema

hock : Employee
hock = [ "Stefan", "Höck", "hock@foo.com", 46, 5443.2, False ]

Note, how I gave Employee a zero quantity. This means, we are only ever allowed to use this function at compile time but never at runtime. This is a safe way to make sure our type-level functions and aliases do not leak into the executable when we build our application. We are allowed to use zero-quantity functions and values in type signatures and when computing other erased values, but not for runtime-relevant computations.
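
The effect of the zero quantity can be seen in a small standalone sketch. The alias Age and the definitions myAge and leak are hypothetical and only serve as illustration:

```idris
%default total

-- Hypothetical erased type alias, in the style of `Employee`.
0 Age : Type
Age = Nat

-- Fine: `Age` only appears in a type signature.
myAge : Age
myAge = 46

-- Rejected: an unrestricted definition may not use the erased `Age`
-- on its right-hand side.
failing
  leak : Type
  leak = Age
```

The failing block documents that the second use does not type check, because leak itself is not erased and would need Age at runtime.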

We would now like to access a value in a row based on the name given. For this, we write a custom predicate, which serves as a witness that a column with the given name is part of the schema. Now, here is an important thing to note: In this predicate we include an index for the type of the column with the given name. We need this because, when we access a column by name, we need a way to figure out the return type. But during proof search, this type will have to be derived by Idris based on the column name and schema in question (otherwise, the proof search will fail unless the return type is known in advance). We therefore must tell Idris that it must not include this type in the list of search criteria; otherwise, it will try to infer the column type from the context (using type inference) before running the proof search. This can be done by listing the indices to be used in the search like so: [search name schema].

public export
data InSchema :  (name    : String)
              -> (schema  : Schema)
              -> (colType : ColType)
              -> Type where
  [search name schema]
  IsHere  : InSchema n (n :> t :: ss) t
  IsThere : InSchema n ss t -> InSchema n (fld :: ss) t

export
Uninhabited (InSchema n [] c) where
  uninhabited IsHere impossible
  uninhabited (IsThere _) impossible

With this, we are now ready to access the value at a given column based on the column's name:

export
getAt :  {0 ss : Schema}
      -> (name : String)
      -> (row  : Row ss)
      -> (prf  : InSchema name ss c)
      => IdrisType c
getAt name (v :: vs) {prf = IsHere}    = v
getAt name (_ :: vs) {prf = IsThere p} = getAt name vs

Below is an example of how to use this at compile time. Note the amount of work Idris performs for us: It first comes up with proofs that firstName, lastName, and age are indeed valid names in the Employee schema. From these proofs it automatically figures out the return types of the calls to getAt and extracts the corresponding values from the row. All of this happens in a provably total and type safe way.

shoeck : String
shoeck =  getAt "firstName" hock
       ++ " "
       ++ getAt "lastName" hock
       ++ ": "
       ++ show (getAt "age" hock)
       ++ " years old."

In order to specify a column name at runtime, we need a way to compute values of type InSchema by comparing the given name with the column names in the schema in question. Since we have to compare two string values for being propositionally equal, we use the DecEq implementation for String here (Idris provides DecEq implementations for all primitives). We extract the column type at the same time and pair this (as a dependent pair) with the InSchema proof:

export
inSchema : (ss : Schema) -> (n : String) -> Maybe (c ** InSchema n ss c)
inSchema []                    _ = Nothing
inSchema (MkColumn cn t :: xs) n = case decEq cn n of
  Yes Refl   => Just (t ** IsHere)
  No  contra => case inSchema xs n of
    Just (t ** prf) => Just $ (t ** IsThere prf)
    Nothing         => Nothing
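
The role of decEq in inSchema can be studied in isolation. The sketch below uses only decEq from Decidable.Equality. The helper sameName is a hypothetical name introduced for illustration:

```idris
import Decidable.Equality

%default total

-- Hypothetical helper: compare two names and keep the equality proof
-- instead of collapsing the result to a `Bool`.
sameName : (x, y : String) -> Maybe (x = y)
sameName x y = case decEq x y of
  Yes prf => Just prf
  No  _   => Nothing
```

In inSchema, matching on Yes Refl is what lets Idris treat the column name and the searched name as the same value, which is required for IsHere to type check. A plain Boolean comparison with == would not provide this information.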

At the end of this chapter we will use InSchema in our CSV command-line application to list all values in a column.

Exercises part 2

  1. Show that InSchema is decidable by changing the output type of inSchema to Dec (c ** InSchema n ss c).

  2. Declare and implement a function for modifying a field in a row based on the column name given.

  3. Define a predicate to be used as a witness that one list contains only elements in the second list in the same order and use this predicate to extract several columns from a row at once.

    For instance, [2,4,5] contains elements from [1,2,3,4,5,6] in the correct order, but [4,2,5] does not.

  4. Improve the functionality from exercise 3 by defining a new predicate, witnessing that all strings in a list correspond to column names in a schema (in arbitrary order). Use this to extract several columns from a row at once in arbitrary order.

    Hint: Make sure to include the resulting schema as an index, but search only based on the list of names and the input schema.

Use Case: Flexible Error Handling

module Tutorial.Predicates.ErrorHandling

import Tutorial.Predicates.Contracts

import Data.Either
import Data.List1
import Data.String
import Data.Vect
import Data.HList
import Decidable.Equality

import Text.CSV
import System.File

%default total

A recurring pattern when writing larger applications is the combination of different parts of a program each with their own failure types in a larger effectful computation. We saw this, for instance, when implementing a command-line tool for handling CSV files. There, we read and wrote data from and to files, we parsed column types and schemata, we parsed row and column indices and command-line commands. All these operations came with the potential of failure and might be implemented in different parts of our application. In order to unify these different failure types, we wrote a custom sum type encapsulating each of them, and wrote a single handler for this sum type. This approach was alright then, but it does not scale well and is lacking in terms of flexibility. We are therefore trying a different approach here. Before we continue, we quickly implement a couple of functions with the potential of failure plus some custom error types:

public export
record NoNat where
  constructor MkNoNat
  str : String

readNat' : String -> Either NoNat Nat
readNat' s = maybeToEither (MkNoNat s) $ parsePositive s

public export
record NoColType where
  constructor MkNoColType
  str : String

readColType' : String -> Either NoColType ColType
readColType' "I64"     = Right I64
readColType' "Str"     = Right Str
readColType' "Boolean" = Right Boolean
readColType' "Float"   = Right Float
readColType' s         = Left $ MkNoColType s

However, if we wanted to parse a Fin n, there would already be two ways in which this could fail: The string in question might not represent a natural number (leading to a NoNat error), or the number might be out of bounds (leading to an OutOfBounds error). We have to somehow encode these two possibilities in the return type, for instance, by using an Either as the error type:

public export
record OutOfBounds where
  constructor MkOutOfBounds
  size  : Nat
  index : Nat

readFin' : {n : _} -> String -> Either (Either NoNat OutOfBounds) (Fin n)
readFin' s = do
  ix <- mapFst Left (readNat' s)
  maybeToEither (Right $ MkOutOfBounds n ix) $ natToFin ix n

This is incredibly ugly. A custom sum type might have been slightly better, but we still would have to use mapFst when invoking readNat', and writing custom sum types for every possible combination of errors will get cumbersome very quickly as well. What we are looking for, is a generalized sum type: A type indexed by a list of types (the possible choices) holding a single value of exactly one of the types in question. Here is a first naive try:

data Sum : List Type -> Type where
  MkSum : (val : t) -> Sum ts

However, there is a crucial piece of information missing: We have not verified that t is an element of ts, nor which type it actually is. In fact, this is another case of an erased existential, and we will have no way to learn anything about t at runtime. What we need to do is to pair the value with a proof that its type t is an element of ts. We could use Elem again for this, but for some use cases we will require access to the number of types in the list. We will therefore use a vector instead of a list as our index. Here is a predicate similar to Elem but for vectors:

public export
data Has :  (v : a) -> (vs  : Vect n a) -> Type where
  Z : Has v (v :: vs)
  S : Has v vs -> Has v (w :: vs)

export
Uninhabited (Has v []) where
  uninhabited Z impossible
  uninhabited (S _) impossible

A value of type Has v vs is a witness that v is an element of vs. With this, we can now implement an indexed sum type (also called an open union):

public export
data Union : Vect n Type -> Type where
  U : (ix : Has t ts) -> (val : t) -> Union ts

export
Uninhabited (Union []) where
  uninhabited (U ix _) = absurd ix
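
To see how the index proof selects the type of the stored value, here is a standalone sketch with two example values and a small consumer. Has and Union are repeated from above so that the snippet compiles on its own. The names aString, aNat, and describe are hypothetical:

```idris
import Data.Vect

%default total

-- `Has` and `Union` repeated from the text so the snippet stands alone.
data Has : (v : a) -> (vs : Vect n a) -> Type where
  Z : Has v (v :: vs)
  S : Has v vs -> Has v (w :: vs)

data Union : Vect n Type -> Type where
  U : (ix : Has t ts) -> (val : t) -> Union ts

-- Hypothetical example values: the proof picks the type of `val`.
aString : Union [String, Nat]
aString = U Z "hello"

aNat : Union [String, Nat]
aNat = U (S Z) 42

-- Hypothetical consumer: matching on the proof reveals the type of `val`.
describe : Union [String, Nat] -> String
describe (U Z s)     = "string: " ++ s
describe (U (S Z) n) = "nat: " ++ show n
describe (U (S (S x)) _) impossible
```

Because the proof is not erased, it acts as a runtime tag: pattern matching on it is what tells us which of the listed types the value has.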

Note the difference between HList and Union. HList is a generalized product type: It holds a value for each type in its index. Union is a generalized sum type: It holds only a single value, which must be of a type listed in the index. With this we can now define a much more flexible error type:

public export
0 Err : Vect n Type -> Type -> Type
Err ts t = Either (Union ts) t

A function returning an Err ts a describes a computation, which can fail with one of the errors listed in ts. We first need some utility functions.

inject : (prf : Has t ts) => (v : t) -> Union ts
inject v = U prf v

export
fail : Has t ts => (err : t) -> Err ts a
fail err = Left $ inject err

failMaybe : Has t ts => (err : Lazy t) -> Maybe a -> Err ts a
failMaybe err = maybeToEither (inject err)

Next, we can write more flexible versions of the parsers we wrote above:

readNat : Has NoNat ts => String -> Err ts Nat
readNat s = failMaybe (MkNoNat s) $ parsePositive s

readColType : Has NoColType ts => String -> Err ts ColType
readColType "I64"     = Right I64
readColType "Str"     = Right Str
readColType "Boolean" = Right Boolean
readColType "Float"   = Right Float
readColType s         = fail $ MkNoColType s

Before we implement readFin, we introduce a short cut for specifying that several error types must be present:

public export
0 Errs : List Type -> Vect n Type -> Type
Errs []        _  = ()
Errs (x :: xs) ts = (Has x ts, Errs xs ts)

Function Errs returns a tuple of constraints. This can be used as a witness that all listed types are present in the vector of types: Idris will automatically extract the proofs from the tuple as needed.

export
readFin : {n : _} -> Errs [NoNat, OutOfBounds] ts => String -> Err ts (Fin n)
readFin s = do
  S ix <- readNat s | Z => fail (MkOutOfBounds n Z)
  failMaybe (MkOutOfBounds n (S ix)) $ natToFin ix n
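
Note that readFin treats the parsed number as a one-based index: zero is rejected as out of bounds, and S ix is converted via natToFin ix n. The following sketch isolates this index logic without any error types. The helper oneBased is a hypothetical name. Only natToFin from Data.Fin is taken from the base library:

```idris
import Data.Fin

%default total

-- Hypothetical helper: interpret a natural number as a one-based index
-- into a collection of size `n`.
oneBased : {n : Nat} -> Nat -> Maybe (Fin n)
oneBased Z     = Nothing        -- there is no row number zero
oneBased (S k) = natToFin k n   -- row `k + 1` lives at zero-based index `k`
```

This matches the REPL session of the CSV application, where rows are counted starting at one.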

As a last example, here are parsers for schemata and CSV rows:

fromCSV : String -> List String
fromCSV = forget . split (',' ==)

public export
record InvalidColumn where
  constructor MkInvalidColumn
  str : String

readColumn : Errs [InvalidColumn, NoColType] ts => String -> Err ts Column
readColumn s = case forget $ split (':' ==) s of
  [n,ct] => MkColumn n <$> readColType ct
  _      => fail $ MkInvalidColumn s

export
readSchema : Errs [InvalidColumn, NoColType] ts => String -> Err ts Schema
readSchema = traverse readColumn . fromCSV

public export
data RowError : Type where
  InvalidField  : (row, col : Nat) -> (ct : ColType) -> String -> RowError
  UnexpectedEOI : (row, col : Nat) -> RowError
  ExpectedEOI   : (row, col : Nat) -> RowError

decodeField :  Has RowError ts
            => (row,col : Nat)
            -> (c : ColType)
            -> String
            -> Err ts (IdrisType c)
decodeField row col c s =
  let err = InvalidField row col c s
   in case c of
        I64     => failMaybe err $ read s
        Str     => failMaybe err $ read s
        Boolean => failMaybe err $ read s
        Float   => failMaybe err $ read s

export
decodeRow :  Has RowError ts
          => {s : _}
          -> (row : Nat)
          -> (str : String)
          -> Err ts (Row s)
decodeRow row = go 1 s . fromCSV
  where go : Nat -> (cs : Schema) -> List String -> Err ts (Row cs)
        go k []       []                    = Right []
        go k []       (_ :: _)              = fail $ ExpectedEOI row k
        go k (_ :: _) []                    = fail $ UnexpectedEOI row k
        go k (MkColumn n c :: cs) (s :: ss) =
          [| decodeField row k c s :: go (S k) cs ss |]

Here is an example REPL session, where I test readSchema. I defined variable ts using the :let command to make this more convenient. Note, how the order of error types is of no importance, as long as types InvalidColumn and NoColType are present in the list of errors:

Tutorial.Predicates> :let ts = the (Vect 3 _) [NoColType,NoNat,InvalidColumn]
Tutorial.Predicates> readSchema {ts} "foo:bar"
Left (U Z (MkNoColType "bar"))
Tutorial.Predicates> readSchema {ts} "foo:Float"
Right [MkColumn "foo" Float]
Tutorial.Predicates> readSchema {ts} "foo Float"
Left (U (S (S Z)) (MkInvalidColumn "foo Float"))

Error Handling

There are several techniques for handling errors, all of which are useful at times. For instance, we might want to handle some errors early on and individually, while dealing with others much later in our application. Or we might want to handle them all in one fell swoop. We look at both approaches here.

First, in order to handle a single error individually, we need to split a union into one of two possibilities: A value of the error type in question or a new union, holding one of the other error types. We need a new predicate for this, which not only encodes the presence of a value in a vector but also the result of removing that value:

data Rem : (v : a) -> (vs : Vect (S n) a) -> (rem : Vect n a) -> Type where
  [search v vs]
  RZ : Rem v (v :: rem) rem
  RS : Rem v vs rem -> Rem v (w :: vs) (w :: rem)

Once again, we want to use one of the indices (rem) in our functions' return types, so we only use the other indices during proof search. Here is a function for splitting off a value from an open union:

split : (prf : Rem t ts rem) => Union ts -> Either t (Union rem)
split {prf = RZ}   (U Z     val) = Left val
split {prf = RZ}   (U (S x) val) = Right (U x val)
split {prf = RS p} (U Z     val) = Right (U Z val)
split {prf = RS p} (U (S x) val) = case split {prf = p} (U x val) of
  Left vt        => Left vt
  Right (U ix y) => Right $ U (S ix) y

This tries to extract a value of type t from a union. If it works, the result is wrapped in a Left, otherwise a new union is returned in a Right, but this one has t removed from its list of possible types.

With this, we can implement a handler for single errors. Error handling often happens in an effectful context (we might want to print a message to the console or write the error to a log file), so we use an applicative effect type to handle errors in.

handle :  Applicative f
       => Rem t ts rem
       => (h : t -> f a)
       -> Err ts a
       -> f (Err rem a)
handle h (Left x)  = case split x of
  Left v    => Right <$> h v
  Right err => pure $ Left err
handle _ (Right x) = pure $ Right x

For handling all errors at once, we can use a handler type indexed by the vector of errors, and parameterized by the output type:

namespace Handler
  public export
  data Handler : (ts : Vect n Type) -> (a : Type) -> Type where
    Nil  : Handler [] a
    (::) : (t -> a) -> Handler ts a -> Handler (t :: ts) a

extract : Handler ts a -> Has t ts -> t -> a
extract (f :: _)  Z     val = f val
extract (_ :: fs) (S y) val = extract fs y val
extract []        ix    _   = absurd ix

handleAll : Applicative f => Handler ts (f a) -> Err ts a -> f a
handleAll _ (Right v)       = pure v
handleAll h (Left $ U ix v) = extract h ix v

Below, we will see an additional way of handling all errors at once by defining a custom interface for error handling.

Exercises part 3

  1. Implement the following utility functions for Union:

    project : (0 t : Type) -> (prf : Has t ts) => Union ts -> Maybe t
    
    project1 : Union [t] -> t
    
    safe : Err [] a -> a
    
  2. Implement the following two functions for embedding an open union in a larger set of possibilities. Note the unerased implicit in extend!

    weaken : Union ts -> Union (ts ++ ss)
    
    extend : {m : _} -> {0 pre : Vect m _} -> Union ts -> Union (pre ++ ts)
    
  3. Find a general way to embed a Union ts in a Union ss, so that the following is possible:

    embedTest :  Err [NoNat,NoColType] a
              -> Err [FileError, NoColType, OutOfBounds, NoNat] a
    embedTest = mapFst embed
    
  4. Make handle more powerful, by letting the handler convert the error in question to an f (Err rem a).

The Truth about Interfaces

module Tutorial.Predicates.Truth

import Tutorial.Predicates.Contracts
import Tutorial.Predicates.ErrorHandling

import Data.Either
import Data.List1
import Data.String
import Data.Vect
import Data.HList
import Decidable.Equality

import Text.CSV
import System.File

%default total

Well, here it finally is: The truth about interfaces. Internally, an interface is just a record data type, with its fields corresponding to the members of the interface. An interface implementation is a value of such a record, annotated with a %hint pragma (see below) to make the value available during proof search. Finally, a constrained function is just a function with one or more auto implicit arguments. For instance, here is the same function for looking up an element in a list, once with the known syntax for constrained functions, and once with an auto implicit argument. The code produced by Idris is the same in both cases:

isElem1 : Eq a => a -> List a -> Bool
isElem1 v []        = False
isElem1 v (x :: xs) = x == v || isElem1 v xs

isElem2 : {auto _ : Eq a} -> a -> List a -> Bool
isElem2 v []        = False
isElem2 v (x :: xs) = x == v || isElem2 v xs

Being mere records, we can also take interfaces as regular function arguments and dissect them with a pattern match:

eq : Eq a -> a -> a -> Bool
eq (MkEq feq fneq) = feq

A manual Interface Definition

I'll now demonstrate how we can achieve the same behavior with proof search as with a regular interface definition plus implementations. Since I want to finish the CSV example with our new error handling tools, we are going to implement some error handlers. First, an interface is just a record:

record Print a where
  constructor MkPrint
  print' : a -> String

In order to access the record in a constrained function, we use the %search keyword, which will try to conjure a value of the desired type (Print a in this case) by means of a proof search:

print : Print a => a -> String
print = print' %search

As an alternative, we could use a named constraint, and access it directly via its name:

print2 : (impl : Print a) => a -> String
print2 = print' impl

As yet another alternative, we could use the syntax for auto implicit arguments:

print3 : {auto impl : Print a} -> a -> String
print3 = print' impl

All three versions of print behave exactly the same at runtime. So, whenever we write {auto x : Foo} -> we can just as well write (x : Foo) => and vice versa.

Interface implementations are just values of the given record type, but in order to be available during proof search, these need to be annotated with a %hint pragma:

%hint
noNatPrint : Print NoNat
noNatPrint = MkPrint $ \e => "Not a natural number: \{e.str}"

%hint
noColTypePrint : Print NoColType
noColTypePrint = MkPrint $ \e => "Not a column type: \{e.str}"

%hint
outOfBoundsPrint : Print OutOfBounds
outOfBoundsPrint = MkPrint $ \e => "Index is out of bounds: \{show e.index}"

%hint
rowErrorPrint : Print RowError
rowErrorPrint = MkPrint $
  \case InvalidField r c ct s =>
          "Not a \{show ct} in row \{show r}, column \{show c}. \{s}"
        UnexpectedEOI r c =>
          "Unexpected end of input in row \{show r}, column \{show c}."
        ExpectedEOI r c =>
          "Expected end of input in row \{show r}, column \{show c}."

We can also write an implementation of Print for a union of errors. For this, we first come up with a proof that all types in the union's index come with an implementation of Print:

0 All : (f : a -> Type) -> Vect n a -> Type
All f []        = ()
All f (x :: xs) = (f x, All f xs)

unionPrintImpl : All Print ts => Union ts -> String
unionPrintImpl (U Z val)     = print val
unionPrintImpl (U (S x) val) = unionPrintImpl $ U x val

%hint
unionPrint : All Print ts => Print (Union ts)
unionPrint = MkPrint unionPrintImpl

Defining interfaces this way can be an advantage, as there is much less magic going on, and we have more fine grained control over the types and values of our fields. Note also, that all of the magic comes from the search hints, with which our "interface implementations" were annotated. These made the corresponding values and functions available during proof search.
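
The following standalone sketch shows the difference a search hint makes, and how a value that is not registered as a hint can still be passed explicitly. To avoid clashing with the definitions above, it uses a hypothetical record Render with the hypothetical values boolRender and shoutRender:

```idris
%default total

-- Hypothetical "interface" in the style of `Print`.
record Render a where
  constructor MkRender
  render' : a -> String

render : Render a => a -> String
render = render' %search

-- Registered for proof search via `%hint`.
%hint
boolRender : Render Bool
boolRender = MkRender $ \b => if b then "yes" else "no"

-- Not annotated with `%hint`, so proof search will not pick it up.
shoutRender : Render Bool
shoutRender = MkRender $ \b => if b then "YES" else "NO"

-- Uses the hinted value.
viaHint : String
viaHint = render True

-- Passes the other value by hand with the `@{..}` syntax.
byHand : String
byHand = render @{shoutRender} True
```

This is the same mechanism Idris offers for named interface implementations: a value that is not found by proof search can always be supplied manually.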

Parsing CSV Commands

To conclude this chapter, we reimplement our CSV command parser, using the flexible error handling approach from the last section. While not necessarily less verbose than the original parser, this approach decouples the handling of errors and printing of error messages from the rest of the application: Functions with a possibility of failure are reusable in different contexts, as are the pretty printers we use for the error messages.

First, we repeat some stuff from earlier chapters. I sneaked in a new command for printing all values in a column:

record Table where
  constructor MkTable
  schema : Schema
  size   : Nat
  rows   : Vect size (Row schema)

data Command : (t : Table) -> Type where
  PrintSchema :  Command t
  PrintSize   :  Command t
  New         :  (newSchema : Schema) -> Command t
  Prepend     :  Row (schema t) -> Command t
  Get         :  Fin (size t) -> Command t
  Delete      :  Fin (size t) -> Command t
  Col         :  (name : String)
              -> (tpe  : ColType)
              -> (prf  : InSchema name t.schema tpe)
              -> Command t
  Quit        : Command t

applyCommand : (t : Table) -> Command t -> Table
applyCommand t                 PrintSchema = t
applyCommand t                 PrintSize   = t
applyCommand _                 (New ts)    = MkTable ts _ []
applyCommand (MkTable ts n rs) (Prepend r) = MkTable ts _ $ r :: rs
applyCommand t                 (Get x)     = t
applyCommand t                 Quit        = t
applyCommand t                 (Col _ _ _) = t
applyCommand (MkTable ts n rs) (Delete x)  = case n of
  S k => MkTable ts k (deleteAt x rs)
  Z   => absurd x

Next, below is the reimplemented command parser. In total, it can fail in seven different ways, at least some of which might also be possible in other parts of a larger application.

record UnknownCommand where
  constructor MkUnknownCommand
  str : String

%hint
unknownCommandPrint : Print UnknownCommand
unknownCommandPrint = MkPrint $ \v => "Unknown command: \{v.str}"

record NoColName where
  constructor MkNoColName
  str : String

%hint
noColNamePrint : Print NoColName
noColNamePrint = MkPrint $ \v => "Unknown column: \{v.str}"

0 CmdErrs : Vect 7 Type
CmdErrs = [ InvalidColumn
          , NoColName
          , NoColType
          , NoNat
          , OutOfBounds
          , RowError
          , UnknownCommand ]

readCommand : (t : Table) -> String -> Err CmdErrs (Command t)
readCommand _                "schema"  = Right PrintSchema
readCommand _                "size"    = Right PrintSize
readCommand _                "quit"    = Right Quit
readCommand (MkTable ts n _) s         = case words s of
  ["new",    str] => New     <$> readSchema str
  "add" ::   ss   => Prepend <$> decodeRow 1 (unwords ss)
  ["get",    str] => Get     <$> readFin str
  ["delete", str] => Delete  <$> readFin str
  ["column", str] => case inSchema ts str of
    Just (ct ** prf) => Right $ Col str ct prf
    Nothing          => fail $ MkNoColName str
  _               => fail $ MkUnknownCommand s

Note, how we could invoke functions like readFin or readSchema directly, because the necessary error types are part of our list of possible errors.

To conclude this section, here is the functionality for printing the result of a command plus the application's main loop. Most of this is repeated from earlier chapters, but note how we can handle all errors at once with a single call to print:

encodeField : (t : ColType) -> IdrisType t -> String
encodeField I64     x     = show x
encodeField Str     x     = show x
encodeField Boolean True  = "t"
encodeField Boolean False = "f"
encodeField Float   x     = show x

encodeRow : (s : Schema) -> Row s -> String
encodeRow s = concat . intersperse "," . go s
  where go : (s' : Schema) -> Row s' -> Vect (length s') String
        go []        []        = []
        go (MkColumn _ c :: cs) (v :: vs) = encodeField c v :: go cs vs

encodeCol :  (name : String)
          -> (c    : ColType)
          -> InSchema name s c
          => Vect n (Row s)
          -> String
encodeCol name c = unlines . toList . map (\r => encodeField c $ getAt name r)

result :  (t : Table) -> Command t -> String
result t PrintSchema   = "Current schema: \{showSchema t.schema}"
result t PrintSize     = "Current size: \{show t.size}"
result _ (New ts)      = "Created table. Schema: \{showSchema ts}"
result t (Prepend r)   = "Row prepended: \{encodeRow t.schema r}"
result _ (Delete x)    = "Deleted row: \{show $ FS x}."
result _ Quit          = "Goodbye."
result t (Col n c prf) = "Column \{n}:\n\{encodeCol n c t.rows}"
result t (Get x)       =
  "Row \{show $ FS x}: \{encodeRow t.schema (index x t.rows)}"

covering
runProg : Table -> IO ()
runProg t = do
  putStr "Enter a command: "
  str <- getLine
  case readCommand t str of
    Left err   => putStrLn (print err) >> runProg t
    Right Quit => putStrLn (result t Quit)
    Right cmd  => putStrLn (result t cmd) >>
                  runProg (applyCommand t cmd)

covering
main : IO ()
main = runProg $ MkTable [] _ []

Here is an example REPL session:

Tutorial.Predicates> :exec main
Enter a command: new name:Str,age:Int64,salary:Float
Not a column type: Int64
Enter a command: new name:Str,age:I64,salary:Float
Created table. Schema: name:Str,age:I64,salary:Float
Enter a command: add John Doe,44,3500
Row prepended: "John Doe",44,3500.0
Enter a command: add Jane Doe,50,4000
Row prepended: "Jane Doe",50,4000.0
Enter a command: get 1
Row 1: "Jane Doe",50,4000.0
Enter a command: column salary
Column salary:
4000.0
3500.0

Enter a command: quit
Goodbye.

Conclusion

Predicates allow us to describe contracts between types and to refine the values we accept as valid function arguments. They allow us to make a function safe and convenient to use at runtime and compile time by using them as auto implicit arguments, which Idris should try to construct on its own if it has enough information about the structure of a function's arguments.

Primitives

In the topics we covered so far, we hardly ever talked about primitive types in Idris. They were around and we used them in some computations, but I never really explained how they work and where they come from, nor did I show in detail what we can and can't do with them.

How Primitives are Implemented

module Tutorial.Prim.Prim

import Data.Bits
import Data.String

%default total

A Short Note on Backends

According to Wikipedia, a compiler is "a computer program that translates computer code written in one programming language (the source language) into another language (the target language)". The Idris compiler is exactly that: A program translating programs written in Idris into programs written in Chez Scheme. This Scheme code is then parsed and interpreted by a Chez Scheme interpreter, which must be installed on the computers we use to run compiled Idris programs.

But that's only part of the story. Idris 2 was from the beginning designed to support different code generators (so called backends), which allows us to write Idris code to target different platforms, and your Idris installation comes with several additional backends available. You can specify the backend to use with the --cg command-line argument (cg stands for code generator). For instance:

idris2 --cg racket

Here is a non-comprehensive list of the backends available with a standard Idris installation (the name to be used in the command-line argument is given in parentheses):

  • Racket Scheme (racket): This is a different flavour of the Scheme programming language, which can be useful when Chez Scheme is not available on your operating system.
  • Node.js (node): This converts an Idris program to JavaScript.
  • Browser (javascript): Another JavaScript backend which allows you to write web applications which run in the browser in Idris.
  • RefC (refc): A backend compiling Idris to C code, which is then further compiled by a C compiler.

I plan to at least cover the JavaScript backends in some more detail in another part of this Idris guide, as I use them pretty often myself.

There are also several external backends not officially supported by the Idris project, amongst which are backends for compiling Idris code to Java and Python. You can find a list of external backends on the Idris Wiki.

The Idris Primitives

A primitive data type is a type that is built into the Idris compiler together with a set of primitive functions, which are used to perform calculations on the primitives. You will therefore not find a definition of a primitive type or function in the source code of the Prelude.

Here is again the list of primitive types in Idris:

  • Signed, fixed precision integers:
    • Int8: Integer in the range [-128,127]
    • Int16: Integer in the range [-32768,32767]
    • Int32: Integer in the range [-2147483648,2147483647]
    • Int64: Integer in the range [-9223372036854775808,9223372036854775807]
  • Unsigned, fixed precision integers:
    • Bits8: Integer in the range [0,255]
    • Bits16: Integer in the range [0,65535]
    • Bits32: Integer in the range [0,4294967295]
    • Bits64: Integer in the range [0,18446744073709551615]
  • Integer: A signed, arbitrary precision integer.
  • Double: A double precision (64 bit) floating point number.
  • Char: A unicode character.
  • String: A sequence of unicode characters.
  • %World: A symbolic representation of the current world state. We learned about this when I showed you how IO is implemented. Most of the time, you will not handle values of this type in your own code.
  • Int: This one is special. It is a fixed precision, signed integer, but the bit size is somewhat dependent on the backend and (maybe) platform we use. For instance, if you use the default Chez Scheme backend, Int is a 64 bit signed integer, while on the JavaScript backends it is a 32 bit signed integer for performance reasons. Therefore, Int comes with very few guarantees, and you should use one of the well specified integer types listed above whenever possible.

It can be instructive to learn where in the compiler's source code the primitive types and functions are defined. This source code can be found in folder src of the Idris project, and the primitive types are the constant constructors of data type Core.TT.Constant.

Primitive Functions

All calculations operating on primitives are based on two kinds of primitive functions: The ones built into the compiler (see below) and the ones defined by programmers via the foreign function interface (FFI), about which I'll talk in another chapter.

Built-in primitive functions are functions known to the compiler the definition of which can not be found in the Prelude. They define the core functionality available for the primitive types. Typically, you do not invoke these directly (although it is perfectly fine to do so in most cases) but via functions and interfaces exported by the Prelude or the base library.

For instance, the primitive function for adding two eight bit unsigned integers is prim__add_Bits8. You can inspect its type and behavior at the REPL:

Tutorial.Prim> :t prim__add_Bits8
prim__add_Bits8 : Bits8 -> Bits8 -> Bits8
Tutorial.Prim> prim__add_Bits8 12 100
112

If you look at the source code implementing interface Num for Bits8, you will see that the plus operator just invokes prim__add_Bits8 internally. The same goes for most of the other functions in primitive interface implementations. For instance, every primitive type with the exception of %World comes with primitive comparison functions. For Bits8, these are: prim__eq_Bits8, prim__gt_Bits8, prim__lt_Bits8, prim__gte_Bits8, and prim__lte_Bits8. Note that these functions do not return a Bool (which is not a primitive type in Idris), but an Int. They are therefore not as safe or convenient to use as the corresponding operator implementations from interfaces Eq and Ord. On the other hand, they do not go via a conversion to Bool and might therefore perform slightly better in performance critical code (which you can only identify after some serious profiling).
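To make the Int result concrete, here is a minimal sketch of wrapping such a primitive so that it returns a Bool. The name ltBits8 is made up for this illustration; intToBool is the Prelude function that maps 0 to False and every other Int to True. This is roughly what the interface implementations do, not a copy of the Prelude source.

```idris
-- Hypothetical helper (not part of the Prelude): a Bool-returning
-- wrapper around the primitive less-than test for Bits8.
ltBits8 : Bits8 -> Bits8 -> Bool
ltBits8 x y = intToBool (prim__lt_Bits8 x y)
```

In everyday code, the (<) operator from interface Ord is the more convenient choice.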

As with primitive types, the primitive functions are listed as constructors in a data type (Core.TT.PrimFn) in the compiler sources. We will look at most of these in the following sections.

Consequences of being Primitive

Primitive functions and types are opaque to the compiler in most regards: They have to be defined and implemented by each backend individually. Therefore, the compiler knows nothing about the inner structure of a primitive value nor about the inner workings of primitive functions. For instance, in the following recursive function, we know that the argument in the recursive call must be converging towards the base case (unless there is a bug in the backend we use), but the compiler does not:

covering
replicateBits8' : Bits8 -> a -> List a
replicateBits8' 0 _ = []
replicateBits8' n v = v :: replicateBits8' (n - 1) v

In these cases, we either must be content with just a covering function, or we use assert_smaller to convince the totality checker (the preferred way):

replicateBits8 : Bits8 -> a -> List a
replicateBits8 0 _ = []
replicateBits8 n v = v :: replicateBits8 (assert_smaller n $ n - 1) v

I have shown you the risks of using assert_smaller before, so we must be extra careful in making sure that the new function argument is indeed smaller with relation to the base case.

While Idris knows nothing about the internal workings of primitives and related functions, most of these functions still reduce during evaluation when fed with values known at compile time. For instance, we can trivially prove that for Bits8 the following equation holds:

zeroBits8 : the Bits8 0 = 255 + 1
zeroBits8 = Refl

Having no clue about the internal structure of a primitive nor about the implementations of primitive functions, Idris can't help us prove any general properties of such functions and values. Here is an example to demonstrate this. Assume we'd like to wrap a list in a data type indexed by the list's length:

data LenList : (n : Nat) -> Type -> Type where
  MkLenList : (as : List a) -> LenList (length as) a

When we concatenate two LenLists, the length indices should be added. That's how list concatenation affects the length of lists. We can safely teach Idris that this is true:

0 concatLen : (xs,ys : List a) -> length xs + length ys = length (xs ++ ys)
concatLen []        ys = Refl
concatLen (x :: xs) ys = cong S $ concatLen xs ys

With the above lemma, we can implement concatenation of LenList:

(++) : LenList m a -> LenList n a -> LenList (m + n) a
MkLenList xs ++ MkLenList ys =
  rewrite concatLen xs ys in MkLenList (xs ++ ys)

The same is not possible for strings. There are applications where pairing a string with its length would be useful (for instance, if we wanted to make sure that strings are getting strictly shorter during parsing and will therefore eventually be wholly consumed), but Idris cannot help us get these things right. There is no way to implement and thus prove the following lemma in a safe way:

0 concatLenStr : (a,b : String) -> length a + length b = length (a ++ b)

Believe Me!

In order to implement concatLenStr, we have to abandon all safety and use the ten ton wrecking ball of type coercion: believe_me. This primitive function allows us to freely coerce a value of any type into a value of any other type. Needless to say, this is only safe if we really know what we are doing:

concatLenStr a b = believe_me $ Refl {x = length a + length b}

The explicit assignment of variable x in {x = length a + length b} is necessary, because otherwise Idris will complain about an unsolved hole: It can't infer the type of parameter x in the Refl constructor. We could assign any type to x here, because we are passing the result to believe_me anyway, but I consider it to be good practice to assign one of the two sides of the equality to make our intention clear.

The higher the complexity of a primitive type, the riskier it is to assume even the most basic properties for it to hold. For instance, we might act under the delusion that floating point addition is associative:

0 doubleAddAssoc : (x,y,z : Double) -> x + (y + z) = (x + y) + z
doubleAddAssoc x y z = believe_me $ Refl {x = x + (y + z)}

Well, guess what: That's a lie. And lies lead us straight into the Void:

Tiny : Double
Tiny = 0.0000000000000001

One : Double
One = 1.0

wrong : (0 _ : 1.0000000000000002 = 1.0) -> Void
wrong Refl impossible

boom : Void
boom = wrong (doubleAddAssoc One Tiny Tiny)

Here's what happens in the code above: The call to doubleAddAssoc returns a proof that One + (Tiny + Tiny) is equal to (One + Tiny) + Tiny. But One + (Tiny + Tiny) equals 1.0000000000000002, while (One + Tiny) + Tiny equals 1.0. We can therefore pass our (wrong) proof to wrong, because it is of the correct type, and from this follows a proof of Void.

Working with Strings

module Tutorial.Prim.Strings

import Data.Bits
import Data.String

%default total

Module Data.String in base offers a rich set of functions for working with strings. All these are based on the following primitive operations built into the compiler:

  • prim__strLength: Returns the length of a string.
  • prim__strHead: Extracts the first character from a string.
  • prim__strTail: Removes the first character from a string.
  • prim__strCons: Prepends a character to a string.
  • prim__strAppend: Appends two strings.
  • prim__strIndex: Extracts a character at the given position from a string.
  • prim__strSubstr: Extracts the substring between the given positions.

Needless to say, not all of these functions are total. Therefore, Idris must make sure that invalid calls do not reduce during compile time, as otherwise the compiler would crash. If, however, we force the evaluation of a partial primitive function by compiling and running the corresponding program, this program will crash with an error:

Tutorial.Prim> prim__strTail ""
prim__strTail ""
Tutorial.Prim> :exec putStrLn (prim__strTail "")
Exception in substring: 1 and 0 are not valid start/end indices for ""

Note how prim__strTail "" is not reduced at the REPL and how the same expression leads to a runtime exception if we compile and execute the program. Valid calls to prim__strTail are reduced just fine, however:

tailExample : prim__strTail "foo" = "oo"
tailExample = Refl

Pack and Unpack

Two of the most important functions for working with strings are unpack and pack, which convert a string to a list of characters and vice versa. This allows us to conveniently implement many string operations by iterating or folding over the list of characters instead. This might not always be the most efficient thing to do, but unless you plan to handle very large amounts of text, they work and perform reasonably well.
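As a rough illustration of this style, the sketch below implements two small string utilities by going through the list of characters. The names countChar, isPalindrome, and abc are invented for this example; Data.List is imported only to be on the safe side regarding where filter is exported from.

```idris
import Data.List

-- Count how often a character occurs in a string by
-- filtering the unpacked list of characters.
countChar : Char -> String -> Nat
countChar c s = length (filter (== c) (unpack s))

-- A string is a palindrome if its list of characters
-- equals the reversed list.
isPalindrome : String -> Bool
isPalindrome s = let cs = unpack s in cs == reverse cs

-- Going the other way: build a string from a list of characters.
abc : String
abc = pack ['a', 'b', 'c']
```

Each call to unpack allocates a list with one cell per character, which is the cost alluded to above.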

String Interpolation

Idris allows us to include arbitrary string expressions in a string literal by wrapping them in curly braces, the first of which has to be escaped with a backslash. For instance:

interpEx1 : Bits64 -> Bits64 -> String
interpEx1 x y = "\{show x} + \{show y} = \{show $ x + y}"

This is a very convenient way to assemble complex strings from values of different types. In addition, there is interface Interpolation, which allows us to use values in interpolated strings without having to convert them to strings first:

data Element = H | He | C | N | O | F | Ne

Formula : Type
Formula = List (Element,Nat)

Interpolation Element where
  interpolate H  = "H"
  interpolate He = "He"
  interpolate C  = "C"
  interpolate N  = "N"
  interpolate O  = "O"
  interpolate F  = "F"
  interpolate Ne = "Ne"

Interpolation (Element,Nat) where
  interpolate (_, 0) = ""
  interpolate (x, 1) = "\{x}"
  interpolate (x, k) = "\{x}\{show k}"

Interpolation Formula where
  interpolate = foldMap interpolate

ethanol : String
ethanol = "The formula of ethanol is: \{[(C,2),(H,6),(O, the Nat 1)]}"

Raw and Multiline String Literals

In string literals, we have to escape certain characters like quotes, backslashes or new line characters. For instance:

escapeExample : String
escapeExample = "A quote: \". \nThis is on a new line.\nA backslash: \\"

Idris allows us to enter raw string literals, where there is no need to escape quotes and backslashes, by pre- and postfixing the wrapping quote characters with the same number of hash characters. For instance:

rawExample : String
rawExample = #"A quote: ". A backslash: \"#

rawExample2 : String
rawExample2 = ##"A quote: ". A backslash: \"##

With raw string literals, it is still possible to use string interpolation, but the opening curly brace has to be prefixed with a backslash and the same number of hashes as are being used for opening and closing the string literal:

rawInterpolExample : String
rawInterpolExample = ##"An interpolated "string": \##{rawExample}"##

Finally, Idris also allows us to conveniently write multiline strings. These can be pre- and postfixed with hashes if we want raw multiline string literals, and they also can be combined with string interpolation. Multiline literals are opened and closed with triple quote characters. Indenting the closing triple quotes allows us to indent the whole multiline literal. Whitespace used for indentation will not appear in the resulting string. For instance:

multiline1 : String
multiline1 = """
  And I raise my head and stare
  Into the eyes of a stranger
  I've always known that the mirror never lies
  People always turn away
  From the eyes of a stranger
  Afraid to see what hides behind the stare
  """

multiline2 : String
multiline2 = #"""
  An example for a simple expression:
  "foo" ++ "bar".
  This is reduced to "\#{"foo" ++ "bar"}".
  """#

Make sure to look at the example strings at the REPL to see the effect of interpolation and raw string literals and compare it with the syntax we used.

Exercises part 1

In these exercises, you are supposed to implement a bunch of utility functions for consuming and converting strings. I don't give the expected types here, because you are supposed to come up with those yourself.

  1. Implement functions similar to map, filter, and mapMaybe for strings. The output type of these should always be a string.

  2. Implement functions similar to foldl and foldMap for strings.

  3. Implement a function similar to traverse for strings. The output type should be a wrapped string.

  4. Implement the bind operator for strings. The output type should again be a string.

Integers

module Tutorial.Prim.Integers

import Data.Bits
import Data.String

%default total

As listed at the beginning of this chapter, Idris provides different fixed-precision signed and unsigned integer types as well as Integer, an arbitrary precision signed integer type. All of them come with the following primitive functions (given here for Bits8 as an example):

  • prim__add_Bits8: Integer addition.
  • prim__sub_Bits8: Integer subtraction.
  • prim__mul_Bits8: Integer multiplication.
  • prim__div_Bits8: Integer division.
  • prim__mod_Bits8: Modulo function.
  • prim__shl_Bits8: Bitwise left shift.
  • prim__shr_Bits8: Bitwise right shift.
  • prim__and_Bits8: Bitwise and.
  • prim__or_Bits8: Bitwise or.
  • prim__xor_Bits8: Bitwise xor.

Typically, you use the functions for addition and multiplication through the operators from interface Num, the function for subtraction through interface Neg, and the functions for division (div and mod) through interface Integral. The bitwise operations are available through interfaces Data.Bits.Bits and Data.Bits.FiniteBits.

For all integral types, the following laws are assumed to hold for numeric operations (x, y, and z are arbitrary values of the same primitive integral type):

  • x + y = y + x: Addition is commutative.
  • x + (y + z) = (x + y) + z: Addition is associative.
  • x + 0 = x: Zero is the neutral element of addition.
  • x - x = x + (-x) = 0: -x is the additive inverse of x.
  • x * y = y * x: Multiplication is commutative.
  • x * (y * z) = (x * y) * z: Multiplication is associative.
  • x * 1 = x: One is the neutral element of multiplication.
  • x * (y + z) = x * y + x * z: The distributive law holds.
  • y * (x `div` y) + (x `mod` y) = x (for y /= 0).

Please note that the officially supported backends use Euclidean modulus for calculating mod: For y /= 0, x `mod` y is always a non-negative value strictly smaller than abs y, so that the law given above does hold. If x or y are negative numbers, this is different from what many other languages do, but for good reasons, as explained in the following article.
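As a small sanity check, the last law can be turned into a Boolean test. This is only a sketch: the name divModLaw is hypothetical, and the function is marked partial because div and mod are not total. With Euclidean modulus, x = -7 and y = 2 should give a quotient of -4 and a remainder of 1, and indeed 2 * (-4) + 1 = -7.

```idris
-- Hypothetical helper: checks y * (x `div` y) + (x `mod` y) == x
-- for a concrete pair of integers. Partial, because `div` and
-- `mod` are undefined for y == 0.
partial
divModLaw : Integer -> Integer -> Bool
divModLaw x y = y * (x `div` y) + (x `mod` y) == x
```

Trying this at the REPL with a few negative arguments is a quick way to see how a given backend rounds the quotient.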

Unsigned Integers

The unsigned fixed precision integer types (Bits8, Bits16, Bits32, and Bits64) come with implementations of all integral interfaces (Num, Neg, and Integral) and the two interfaces for bitwise operations (Bits and FiniteBits). All functions with the exception of div and mod are total. Overflows are handled by calculating the remainder modulo 2^bitsize. For instance, for Bits8, all operations calculate their results modulo 256:

Main> the Bits8 255 + 1
0
Main> the Bits8 255 + 255
254
Main> the Bits8 128 * 2 + 7
7
Main> the Bits8 12 - 13
255

Signed Integers

Like the unsigned integer types, the signed fixed precision integer types (Int8, Int16, Int32, and Int64) come with implementations of all integral interfaces and the two interfaces for bitwise operations (Bits and FiniteBits). Overflows are handled by calculating the remainder modulo 2^bitsize and subtracting 2^bitsize if the result is still out of range. For instance, for Int8, all operations calculate their results modulo 256, subtracting 256 if the result is still out of bounds:

Main> the Int8 2 * 127
-2
Main> the Int8 3 * 127
125
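
The wrapping rule described above can be spelled out on arbitrary precision integers. The following is a sketch under the assumption that mod on Integer returns a non-negative remainder, as stated earlier; the name wrapInt8 is invented for this illustration and is not how the backends implement Int8.

```idris
-- Hypothetical model of Int8 overflow on Integer: take the
-- remainder modulo 256, then subtract 256 if the result is
-- still above the largest Int8 value (127).
partial
wrapInt8 : Integer -> Integer
wrapInt8 n = let r = n `mod` 256 in if r > 127 then r - 256 else r
```

For 2 * 127 = 254, the remainder is 254, which is larger than 127, so we arrive at 254 - 256 = -2. For 3 * 127 = 381, the remainder is 125, which is already in range.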

Bitwise Operations

Module Data.Bits exports interfaces for performing bitwise operations on integral types. I'm going to show a couple of examples on unsigned 8-bit numbers (Bits8) to explain the concept to readers new to bitwise arithmetic. Note that this is much easier to grasp for unsigned integer types than for the signed versions. Those have to include information about the sign of numbers in their bit pattern, and it is assumed that signed integers in Idris use a two's complement representation, the details of which I will not go into here.

An unsigned 8-bit binary number is represented internally as a sequence of eight bits (with values 0 or 1), each of which corresponds to a power of 2. For instance, the number 23 (= 16 + 4 + 2 + 1) is represented as 0001 0111:

23 in binary:    0  0  0  1    0  1  1  1

Bit number:      7  6  5  4    3  2  1  0
Decimal value: 128 64 32 16    8  4  2  1

We can use function testBit to check if the bit at the given position is set or not:

Tutorial.Prim> testBit (the Bits8 23) 0
True
Tutorial.Prim> testBit (the Bits8 23) 1
True
Tutorial.Prim> testBit (the Bits8 23) 3
False

Likewise, we can use functions setBit and clearBit to set or unset a bit at a certain position:

Tutorial.Prim> setBit (the Bits8 23) 3
31
Tutorial.Prim> clearBit (the Bits8 23) 2
19

There are also operators (.&.) (bitwise and) and (.|.) (bitwise or) as well as function xor (bitwise exclusive or) for performing boolean operations on integral values. For instance x .&. y has exactly those bits set, which both x and y have set, while x .|. y has all bits set that are either set in x or y (or both), and x `xor` y has those bits set that are set in exactly one of the two values:

23 in binary:          0  0  0  1    0  1  1  1
11 in binary:          0  0  0  0    1  0  1  1

23 .&. 11 in binary:   0  0  0  0    0  0  1  1
23 .|. 11 in binary:   0  0  0  1    1  1  1  1
23 `xor` 11 in binary: 0  0  0  1    1  1  0  0

And here are the examples at the REPL:

Tutorial.Prim> the Bits8 23 .&. 11
3
Tutorial.Prim> the Bits8 23 .|. 11
31
Tutorial.Prim> the Bits8 23 `xor` 11
28

Finally, it is possible to shift all bits to the right or left by a certain number of steps by using functions shiftR and shiftL, respectively (overflowing bits will just be dropped). A left shift can therefore be viewed as a multiplication by a power of two, while a right shift can be seen as a division by a power of two:

22 in binary:            0  0  0  1    0  1  1  0

22 `shiftL` 2 in binary: 0  1  0  1    1  0  0  0
22 `shiftR` 1 in binary: 0  0  0  0    1  0  1  1

And at the REPL:

Tutorial.Prim> the Bits8 22 `shiftL` 2
88
Tutorial.Prim> the Bits8 22 `shiftR` 1
11

Bitwise operations are often used in specialized code or certain high-performance applications. As programmers, we have to know they exist and how they work.
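
One typical use is packing several Boolean flags into a single number. The sketch below shows this for three made-up permission flags; all names (CanRead, CanWrite, CanExec, grant, hasFlag, toggle) are invented for this illustration.

```idris
import Data.Bits

-- Hypothetical permission flags, one bit each.
CanRead : Bits8
CanRead = 0b100

CanWrite : Bits8
CanWrite = 0b010

CanExec : Bits8
CanExec = 0b001

-- Set all bits of `flag` in `perms`.
grant : Bits8 -> Bits8 -> Bits8
grant perms flag = perms .|. flag

-- Check that every bit of `flag` is also set in `perms`.
hasFlag : Bits8 -> Bits8 -> Bool
hasFlag flag perms = (perms .&. flag) == flag

-- Flip the bits of `flag` in `perms`.
toggle : Bits8 -> Bits8 -> Bits8
toggle perms flag = perms `xor` flag
```

Here, grant CanRead CanWrite has bit pattern 0000 0110 (decimal 6). Testing this value with hasFlag CanExec yields False, since bit 0 is not set, and toggling CanWrite clears bit 1 again, leaving 0000 0100 (decimal 4).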

Integer Literals

So far, we always required an implementation of Num in order to be able to use integer literals for a given type. However, it is actually only necessary to implement a function fromInteger converting an Integer to the type in question. As we will see in the last section, such a function can even restrict the values allowed as valid literals.

For instance, assume we'd like to define a data type for representing the charge of a chemical molecule. Such a value can be positive or negative and (theoretically) of almost arbitrary magnitude:

record Charge where
  constructor MkCharge
  value : Integer

It makes sense to be able to sum up charges, but not to multiply them. They should therefore have an implementation of Monoid but not of Num. Still, we'd like to have the convenience of integer literals when using constant charges at compile time. Here's how to do this:

fromInteger : Integer -> Charge
fromInteger = MkCharge

Semigroup Charge where
  x <+> y = MkCharge $ x.value + y.value

Monoid Charge where
  neutral = 0

Alternative Bases

In addition to the well known decimal literals, it is also possible to use integer literals in binary, octal, or hexadecimal representation. These have to be prefixed with a zero followed by a b, o, or x for binary, octal, and hexadecimal, respectively:

Tutorial.Prim> 0b1101
13
Tutorial.Prim> 0o773
507
Tutorial.Prim> 0xffa2
65442

Exercises part 2

  1. Define a wrapper record for integral values and implement Monoid so that (<+>) corresponds to (.&.).

    Hint: Have a look at the functions available from interface Bits to find a value suitable as the neutral element.

  2. Define a wrapper record for integral values and implement Monoid so that (<+>) corresponds to (.|.).

  3. Use bitwise operations to implement a function, which tests if a given value of type Bits64 is even or not.

  4. Convert a value of type Bits64 to a string in binary representation.

  5. Convert a value of type Bits64 to a string in hexadecimal representation.

    Hint: Use shiftR and (.&. 15) to access subsequent packages of four bits.

Refined Primitives

module Tutorial.Prim.Refined

import Data.Bits
import Data.String

%default total

We often do not want to allow all values of a type in a certain context. For instance, String as an arbitrary sequence of Unicode characters (several of which are not even printable) is too general most of the time. Therefore, it is usually advisable to rule out invalid values early on, by pairing a value with an erased proof of validity.

We have learned how we can write elegant predicates, with which we can prove our functions to be total, and from which we can - in the ideal case - derive other, related predicates. However, when we define predicates on primitives they are to a certain degree doomed to live in isolation, unless we come up with a set of primitive axioms (implemented most likely using believe_me), with which we can manipulate our predicates.

Use Case: ASCII Strings

String encoding is a difficult topic, so in many low level routines it makes sense to rule out most characters from the beginning. Assume therefore, we'd like to make sure the strings we accept in our application only consist of ASCII characters:

isAsciiChar : Char -> Bool
isAsciiChar c = ord c <= 127

isAsciiString : String -> Bool
isAsciiString = all isAsciiChar . unpack

We can now refine a string value by pairing it with an erased proof of validity:

record Ascii where
  constructor MkAscii
  value : String
  0 prf : isAsciiString value === True

It is now impossible to create a value of type Ascii, at runtime or at compile time, without first validating the wrapped string. With this, it is already pretty easy to safely wrap strings at compile time in a value of type Ascii:

hello : Ascii
hello = MkAscii "Hello World!" Refl

And yet, it would be much more convenient to still use string literals for this, without having to sacrifice the comfort of safety. To do so, we can't use interface FromString, as its function fromString would force us to convert any string, even an invalid one. However, we actually don't need an implementation of FromString to support string literals, just like we didn't require an implementation of Num to support integer literals. What we really need is a function named fromString. When string literals are desugared, they are converted to invocations of fromString with the given string value as its argument. For instance, literal "Hello" gets desugared to fromString "Hello". This happens before type checking and filling in of (auto) implicit values. It is therefore perfectly fine to define a custom fromString function with an erased auto implicit argument as a proof of validity:

fromString : (s : String) -> {auto 0 prf : isAsciiString s === True} -> Ascii
fromString s = MkAscii s prf

With this, we can use (valid) string literals for coming up with values of type Ascii directly:

hello2 : Ascii
hello2 = "Hello World!"

In order to create values of type Ascii at runtime from strings of an unknown source, we can use a refinement function returning some kind of failure type:

test : (b : Bool) -> Dec (b === True)
test True  = Yes Refl
test False = No absurd

ascii : String -> Maybe Ascii
ascii x = case test (isAsciiString x) of
  Yes prf   => Just $ MkAscii x prf
  No contra => Nothing

Disadvantages of Boolean Proofs

For many use cases, what we described above for ASCII strings can take us very far. However, one drawback of this approach is that we can't safely perform any computations with the proofs at hand.

For instance, we know it will be perfectly fine to concatenate two ASCII strings, but in order to convince Idris of this, we will have to use believe_me, because we will not be able to prove the following lemma otherwise:

0 allAppend :  (f : Char -> Bool)
            -> (s1,s2 : String)
            -> (p1 : all f (unpack s1) === True)
            -> (p2 : all f (unpack s2) === True)
            -> all f (unpack (s1 ++ s2)) === True
allAppend f s1 s2 p1 p2 = believe_me $ Refl {x = True}

namespace Ascii
  export
  (++) : Ascii -> Ascii -> Ascii
  MkAscii s1 p1 ++ MkAscii s2 p2 =
    MkAscii (s1 ++ s2) (allAppend isAsciiChar s1 s2 p1 p2)

The same goes for all operations extracting a substring from a given string: We will have to implement the corresponding rules using believe_me. Finding a reasonable set of axioms to conveniently deal with refined primitives can therefore be challenging at times, and whether such axioms are even required very much depends on the use case at hand.

Use Case: Sanitized HTML

Assume you write a simple web application for scientific discourse between registered users. To keep things simple, we only consider unformatted text input here. Users can write arbitrary text in a text field and upon hitting Enter, the message is displayed to all other registered users.

Assume now a user decides to enter the following text:

<script>alert("Hello World!")</script>

Well, it could have been (much) worse. Still, unless we take measures to prevent this from happening, this might embed a JavaScript program in our web page that we never intended to have there! What I described here is a well-known security vulnerability called cross-site scripting. It allows users of web pages to enter malicious JavaScript code in text fields, which will then be included in the page's HTML structure and executed when it is displayed to other users.

We want to make sure that this cannot happen on our own web page. In order to protect ourselves from this attack, we could for instance disallow certain characters like '<' or '>' completely (although this might not be enough!), but if our chat service is targeted at programmers, this will be overly restrictive. An alternative is to escape certain characters before rendering them on the page.

escape : String -> String
escape = concat . map esc . unpack
  where esc : Char -> String
        esc '<'  = "&lt;"
        esc '>'  = "&gt;"
        esc '"'  = "&quot;"
        esc '&'  = "&amp;"
        esc '\'' = "&apos;"
        esc c    = singleton c
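
To see the effect, we can run escape on the malicious input from above. The sketch below repeats the definition so that it compiles on its own; the import of Data.String is an assumption about where singleton lives in your version of the base library.

```idris
import Data.String

escape : String -> String
escape = concat . map esc . unpack
  where esc : Char -> String
        esc '<'  = "&lt;"
        esc '>'  = "&gt;"
        esc '"'  = "&quot;"
        esc '&'  = "&amp;"
        esc '\'' = "&apos;"
        esc c    = singleton c

main : IO ()
main = putStrLn $ escape "<script>alert(\"Hello World!\")</script>"
-- prints: &lt;script&gt;alert(&quot;Hello World!&quot;)&lt;/script&gt;
```

A browser renders the escaped text as the literal characters the user typed instead of interpreting them as markup.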

What we now want to do is to store a string together with a proof that it was properly escaped. This is another form of existential quantification: "Here is a string, and there once existed another string, which we passed to escape and arrived at the string we have now". Here's how to encode this:

record Escaped where
  constructor MkEscaped
  value    : String
  0 origin : String
  0 prf    : escape origin === value

Whenever we now embed a string of unknown origin in our web page, we can request a value of type Escaped and have the very strong guarantee that we are no longer vulnerable to cross-site scripting attacks. Even better, it is also possible to safely embed string literals known at compile time without the need to escape them first:

namespace Escaped
  export
  fromString : (s : String) -> {auto 0 prf : escape s === s} -> Escaped
  fromString s = MkEscaped s s prf

escaped : Escaped
escaped = "Hello World!"
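
The following sketch shows both sides of this guarantee: a literal that is accepted, a literal that is rejected at compile time, and a hypothetical function mkEscaped for strings of unknown origin. It repeats the definitions from above to be self-contained and assumes a compiler version that supports failing blocks.

```idris
import Data.String

escape : String -> String
escape = concat . map esc . unpack
  where esc : Char -> String
        esc '<'  = "&lt;"
        esc '>'  = "&gt;"
        esc '"'  = "&quot;"
        esc '&'  = "&amp;"
        esc '\'' = "&apos;"
        esc c    = singleton c

record Escaped where
  constructor MkEscaped
  value    : String
  0 origin : String
  0 prf    : escape origin === value

namespace Escaped
  export
  fromString : (s : String) -> {auto 0 prf : escape s === s} -> Escaped
  fromString s = MkEscaped s s prf

-- Accepted: escaping this literal leaves it unchanged.
ok : Escaped
ok = "Hello World!"

-- Rejected: `escape "<b>"` evaluates to "&lt;b&gt;", so no proof of
-- `escape "<b>" === "<b>"` can be found.
failing
  bad : Escaped
  bad = "<b>"

-- Hypothetical smart constructor for strings of unknown origin:
-- escape first, then record where the value came from.
mkEscaped : String -> Escaped
mkEscaped s = MkEscaped (escape s) s Refl
```

Note that mkEscaped cannot fail, in contrast to the refinement function for Ascii: every string can be escaped, so no Maybe is needed.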

Exercises part 3

In this massive set of exercises, you are going to build a small library for working with predicates on primitives. We want to keep the following goals in mind:

  • We want to use the usual operations of propositional logic to combine predicates: Negation, conjunction (logical and), and disjunction (logical or).
  • All predicates should be erased at runtime. If we prove something about a primitive number, we want to make sure not to carry around a huge proof of validity.
  • Calculations on predicates should make no appearance at runtime (with the exception of decide; see below).
  • Recursive calculations on predicates should be tail recursive if they are used in implementations of decide. This might be tough to achieve. If you can't find a tail recursive solution for a given problem, use what feels most natural instead.

A note on efficiency: In order to be able to run computations on our predicates, we try to convert primitive values to algebraic data types as often and as soon as possible: Unsigned integers will be converted to Nat using cast, and strings will be converted to List Char using unpack. This allows us to work with proofs on Nat and List most of the time, and such proofs can be implemented without resorting to believe_me or other cheats. However, the one advantage of primitive types over algebraic data types is that they often perform much better. This is especially critical when comparing integral types with Nat: Operations on natural numbers often run with O(n) time complexity, where n is the size of one of the natural numbers involved, while with Bits64, for instance, many operations run in fast constant time (O(1)). Luckily, the Idris compiler optimizes many functions on natural numbers to use the corresponding Integer operations at runtime. This has the advantage that we can still use proper induction to prove things about natural numbers at compile time, while getting the benefit of fast integer operations at runtime. However, operations on Nat do run with O(n) time complexity at compile time. Proofs working on large natural numbers will therefore drastically slow down the compiler. A way out of this is discussed at the end of this section of exercises.

Enough talk, let's begin! To start with, you are given the following utilities:

-- Like `Dec` but with erased proofs. Constructors `Yes0`
-- and `No0` will be converted to constants `0` and `1` by
-- the compiler!
data Dec0 : (prop : Type) -> Type where
  Yes0 : (0 prf : prop) -> Dec0 prop
  No0  : (0 contra : prop -> Void) -> Dec0 prop

-- For interfaces with more than one parameter (`a` and `p`
-- in this example) sometimes one parameter can be determined
-- by knowing the other. For instance, if we know what `p` is,
-- we will most certainly also know what `a` is. We therefore
-- specify that proof search on `Decidable` should only be
-- based on `p` by listing `p` after a vertical bar: `| p`.
-- This is like specifying the search parameter(s) of
-- a data type with `[search p]` as was shown in the chapter
-- about predicates.
-- Specifying a single search parameter as shown here can
-- drastically help with type inference.
interface Decidable (0 a : Type) (0 p : a -> Type) | p where
  decide : (v : a) -> Dec0 (p v)

-- We often have to pass `p` explicitly in order to help Idris with
-- type inference. In such cases, it is more convenient to use
-- `decideOn pred` instead of `decide {p = pred}`.
decideOn : (0 p : a -> Type) -> Decidable a p => (v : a) -> Dec0 (p v)
decideOn _ = decide

-- Some primitive predicates can only be reasonably implemented
-- using boolean functions. This utility helps with decidability
-- on such proofs.
test0 : (b : Bool) -> Dec0 (b === True)
test0 True  = Yes0 Refl
test0 False = No0 absurd

We also want to run decidable computations at compile time. This is often much more efficient than running a direct proof search on an inductive type. We therefore come up with a predicate witnessing that a Dec0 value is actually a Yes0 together with two utility functions:

data IsYes0 : (d : Dec0 prop) -> Type where
  ItIsYes0 : {0 prf : _} -> IsYes0 (Yes0 prf)

0 fromYes0 : (d : Dec0 prop) -> (0 prf : IsYes0 d) => prop
fromYes0 (Yes0 x) = x
fromYes0 (No0 contra) impossible

0 safeDecideOn :  (0 p : a -> Type)
               -> Decidable a p
               => (v : a)
               -> (0 prf : IsYes0 (decideOn p v))
               => p v
safeDecideOn p v = fromYes0 $ decideOn p v
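
To get a feeling for how these utilities fit together before starting with the exercises, here is a small sketch using a toy predicate that is not part of the exercises. IsZero and its instances are hypothetical, the utilities are repeated from above so that the block checks on its own, and a compiler version with support for failing blocks is assumed.

```idris
%default total

data Dec0 : (prop : Type) -> Type where
  Yes0 : (0 prf : prop) -> Dec0 prop
  No0  : (0 contra : prop -> Void) -> Dec0 prop

interface Decidable (0 a : Type) (0 p : a -> Type) | p where
  decide : (v : a) -> Dec0 (p v)

decideOn : (0 p : a -> Type) -> Decidable a p => (v : a) -> Dec0 (p v)
decideOn _ = decide

data IsYes0 : (d : Dec0 prop) -> Type where
  ItIsYes0 : {0 prf : _} -> IsYes0 (Yes0 prf)

0 fromYes0 : (d : Dec0 prop) -> (0 prf : IsYes0 d) => prop
fromYes0 (Yes0 x) = x
fromYes0 (No0 contra) impossible

0 safeDecideOn :  (0 p : a -> Type)
               -> Decidable a p
               => (v : a)
               -> (0 prf : IsYes0 (decideOn p v))
               => p v
safeDecideOn p v = fromYes0 $ decideOn p v

-- Hypothetical toy predicate: the given number is zero.
data IsZero : Nat -> Type where
  ItIsZero : IsZero Z

Uninhabited (IsZero (S k)) where
  uninhabited ItIsZero impossible

Decidable Nat IsZero where
  decide Z     = Yes0 ItIsZero
  decide (S k) = No0 absurd

-- `decideOn IsZero 0` evaluates to a `Yes0` during type checking,
-- so the `IsYes0` witness is found automatically.
0 zeroIsZero : IsZero 0
zeroIsZero = safeDecideOn IsZero 0

-- For 1, `decideOn` evaluates to a `No0`: no witness exists.
failing
  0 oneIsZero : IsZero 1
  oneIsZero = safeDecideOn IsZero 1
```

The point of going through decide instead of searching for ItIsZero directly is that the same pattern still works when the proof term would be too large for a direct proof search.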

Finally, as we are planning to refine mostly primitives, we will at times require some sledge hammer to convince Idris that we know what we are doing:

-- only use this if you are sure that `decideOn p v`
-- will return a `Yes0`!
0 unsafeDecideOn : (0 p : a -> Type) -> Decidable a p => (v : a) -> p v
unsafeDecideOn p v = case decideOn p v of
  Yes0 prf => prf
  No0  _   =>
    assert_total $ idris_crash "Unexpected refinement failure in `unsafeDecideOn`"

  1. We start with equality proofs. Implement Decidable for Equal v.

    Hint: Use DecEq from module Decidable.Equality as a constraint and make sure that v is available at runtime.

  2. We want to be able to negate a predicate:

    data Neg : (p : a -> Type) -> a -> Type where
      IsNot : {0 p : a -> Type} -> (contra : p v -> Void) -> Neg p v
    

    Implement Decidable for Neg p using a suitable constraint.

  3. We want to describe the conjunction of two predicates:

    data (&&) : (p,q : a -> Type) -> a -> Type where
      Both : {0 p,q : a -> Type} -> (prf1 : p v) -> (prf2 : q v) -> (&&) p q v
    

    Implement Decidable for (p && q) using suitable constraints.

  4. Come up with a data type called (||) for the disjunction (logical or) of two predicates and implement Decidable using suitable constraints.

  5. Prove De Morgan's laws by implementing the following propositions:

    negOr : Neg (p || q) v -> (Neg p && Neg q) v
    
    andNeg : (Neg p && Neg q) v -> Neg (p || q) v
    
    orNeg : (Neg p || Neg q) v -> Neg (p && q) v
    

    The last of De Morgan's implications is harder to type and prove as we need a way to come up with values of type p v and q v and show that not both can exist. Here is a way to encode this (annotated with quantity 0 as we will need to access an erased contraposition):

    0 negAnd :  Decidable a p
             => Decidable a q
             => Neg (p && q) v
             -> (Neg p || Neg q) v
    

    When you implement negAnd, remember that you can freely access erased (implicit) arguments, because negAnd itself can only be used in an erased context.

    So far, we implemented the tools to algebraically describe and combine several predicates. It is now time to come up with some examples. As a first use case, we will focus on limiting the valid range of natural numbers. For this, we use the following data type:

    -- Proof that m <= n
    data (<=) : (m,n : Nat) -> Type where
      ZLTE : 0 <= n
      SLTE : m <= n -> S m <= S n
    

    This is similar to Data.Nat.LTE but I find operator notation often to be clearer. We also can define and use the following aliases:

    (>=) : (m,n : Nat) -> Type
    m >= n = n <= m
    
    (<) : (m,n : Nat) -> Type
    m < n = S m <= n
    
    (>) : (m,n : Nat) -> Type
    m > n = n < m
    
    LessThan : (m,n : Nat) -> Type
    LessThan m = (< m)
    
    To : (m,n : Nat) -> Type
    To m = (<= m)
    
    GreaterThan : (m,n : Nat) -> Type
    GreaterThan m = (> m)
    
    From : (m,n : Nat) -> Type
    From m = (>= m)
    
    FromTo : (lower,upper : Nat) -> Nat -> Type
    FromTo l u = From l && To u
    
    Between : (lower,upper : Nat) -> Nat -> Type
    Between l u = GreaterThan l && LessThan u
    
  6. Coming up with a value of type m <= n by pattern matching on m and n is highly inefficient for large values of m, as it will require m iterations to do so. However, while in an erased context, we don't need to hold a value of type m <= n. We only need to show that such a value follows from a more efficient computation. Such a computation is compare for natural numbers: Although this is implemented in the Prelude with a pattern match on its arguments, it is optimized by the compiler to a comparison of integers which runs in constant time even for very large numbers. Since Prelude.(<=) for natural numbers is implemented in terms of compare, it runs just as efficiently.

    We therefore need to prove the following two lemmas (make sure not to confuse Prelude.(<=) with Prim.(<=) in these declarations):

    0 fromLTE : (n1,n2 : Nat) -> (n1 <= n2) === True -> n1 <= n2
    
    0 toLTE : (n1,n2 : Nat) -> n1 <= n2 -> (n1 <= n2) === True
    

    They come with a quantity of 0, because they are just as inefficient as the other computations we discussed above. We therefore want to make absolutely sure that they will never be used at runtime!

    Now, implement Decidable Nat (<= n), making use of test0, fromLTE, and toLTE. Likewise, implement Decidable Nat (m <=), because we require both kinds of predicates.

    Note: You should by now figure out yourself that n must be available at runtime and how to make sure that this is the case.

  7. Prove that (<=) is reflexive and transitive by declaring and implementing corresponding propositions. As we might require the proof of transitivity to chain several values of type (<=), it makes sense to also define a short operator alias for this.

  8. Prove that n > 0 implies IsSucc n and vice versa.

  9. Declare and implement safe division and modulo functions for Bits64, by requesting an erased proof that the denominator is strictly positive when cast to a natural number. In case of the modulo function, return a refined value carrying an erased proof that the result is strictly smaller than the modulus:

    safeMod :  (x,y : Bits64)
            -> (0 prf : cast y > 0)
            => Subset Bits64 (\v => cast v < cast y)
    
  10. We will use the predicates and utilities we defined so far to convert a value of type Bits64 to a string of digits in base b with 2 <= b && b <= 16. To do so, implement the following skeleton definitions:

    -- this will require some help from `assert_total`
    -- and `idris_crash`.
    digit : (v : Bits64) -> (0 prf : cast v < 16) => Char
    
    record Base where
      constructor MkBase
      value : Bits64
      0 prf : FromTo 2 16 (cast value)
    
    base : Bits64 -> Maybe Base
    
    namespace Base
      public export
      fromInteger : (v : Integer) -> {auto 0 _ : IsJust (base $ cast v)} -> Base
    

    Finally, implement digits, using safeDiv and safeMod in your implementation. This might be challenging, as you will have to manually transform some proofs to satisfy the type checker. You might also require assert_smaller in the recursive step.

    digits : Bits64 -> Base -> String
    

    We will now turn our focus on strings. Two of the most obvious ways in which we can restrict the strings we accept are by limiting the set of characters and limiting their lengths. More advanced refinements might require strings to match a certain pattern or regular expression. In such cases, we might either go for a boolean check or use a custom data type representing the different parts of the pattern, but we will not cover these topics here.

  11. Implement the following aliases for useful predicates on characters.

    Hint: Use cast to convert characters to natural numbers, use (<=) and FromTo to specify regions of characters, and use (||) to combine regions of characters.

    -- Characters <= 127
    IsAscii : Char -> Type
    
    -- Characters <= 255
    IsLatin : Char -> Type
    
    -- Characters in the interval ['A','Z']
    IsUpper : Char -> Type
    
    -- Characters in the interval ['a','z']
    IsLower : Char -> Type
    
    -- Lower or upper case characters
    IsAlpha : Char -> Type
    
    -- Characters in the range ['0','9']
    IsDigit : Char -> Type
    
    -- Digits or characters from the alphabet
    IsAlphaNum : Char -> Type
    
    -- Characters in the ranges [0,31] or [127,159]
    IsControl : Char -> Type
    
    -- An ASCII character that is not a control character
    IsPlainAscii : Char -> Type
    
    -- A latin character that is not a control character
    IsPlainLatin : Char -> Type
    
  12. The advantage of this more modular approach to predicates on primitives is that we can safely run calculations on our predicates and get the strong guarantees from the existing proofs on inductive types like Nat and List. Here are some examples of such calculations and conversions, all of which can be implemented without cheating:

    0 plainToAscii : IsPlainAscii c -> IsAscii c
    
    0 digitToAlphaNum : IsDigit c -> IsAlphaNum c
    
    0 alphaToAlphaNum : IsAlpha c -> IsAlphaNum c
    
    0 lowerToAlpha : IsLower c -> IsAlpha c
    
    0 upperToAlpha : IsUpper c -> IsAlpha c
    
    0 lowerToAlphaNum : IsLower c -> IsAlphaNum c
    
    0 upperToAlphaNum : IsUpper c -> IsAlphaNum c
    

    The following (asciiToLatin) is trickier. Remember that (<=) is transitive. However, in your invocation of the proof of transitivity, you will not be able to apply direct proof search using %search because the search depth is too small. You could increase the search depth, but it is much more efficient to use safeDecideOn instead.

    0 asciiToLatin : IsAscii c -> IsLatin c
    
    0 plainAsciiToPlainLatin : IsPlainAscii c -> IsPlainLatin c
    

    Before we turn our full attention to predicates on strings, we have to cover lists first, because we will often treat strings as lists of characters.

  13. Implement Decidable for Head:

    data Head : (p : a -> Type) -> List a -> Type where
      AtHead : {0 p : a -> Type} -> (0 prf : p v) -> Head p (v :: vs)
    
  14. Implement Decidable for Length:

    data Length : (p : Nat -> Type) -> List a -> Type where
      HasLength :  {0 p : Nat -> Type}
                -> (0 prf : p (List.length vs))
                -> Length p vs
    
  15. The following predicate is a proof that all values in a list of values fulfill the given predicate. We will use this to limit the valid set of characters in a string.

    data All : (p : a -> Type) -> (as : List a) -> Type where
      Nil  : All p []
      (::) :  {0 p : a -> Type}
           -> (0 h : p v)
           -> (0 t : All p vs)
           -> All p (v :: vs)
    

    Implement Decidable for All.

    For a real challenge, try to make your implementation of decide tail recursive. This will be important for real world applications on the JavaScript backends, where we might want to refine strings of thousands of characters without overflowing the stack at runtime. In order to come up with a tail recursive implementation, you will need an additional data type AllSnoc witnessing that a predicate holds for all elements in a SnocList.

  16. It's time to come to an end here. An identifier in Idris is a sequence of alphanumeric characters, possibly separated by underscore characters (_). In addition, all identifiers must start with a letter. Given this specification, implement predicate IdentChars, from which we can define a new wrapper type for identifiers:

    0 IdentChars : List Char -> Type
    
    record Identifier where
      constructor MkIdentifier
      value : String
      0 prf : IdentChars (unpack value)
    

    Implement a factory method identifier for converting strings of unknown source at runtime:

    identifier : String -> Maybe Identifier
    

    In addition, implement fromString for Identifier and verify that the following is a valid identifier:

    testIdent : Identifier
    testIdent = "fooBar_123"
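
Looking back at the (<=) type introduced before exercise 6: to get a feeling for why its proofs should stay erased, it can help to construct a few values by hand. The following is only a sketch with hypothetical names (twoLTEFour, oneLTThree); it repeats the data type and the (<) alias so that it checks on its own.

```idris
%default total

-- Proof that m <= n (repeated from the exercises)
data (<=) : (m,n : Nat) -> Type where
  ZLTE : 0 <= n
  SLTE : m <= n -> S m <= S n

(<) : (m,n : Nat) -> Type
m < n = S m <= n

-- One `SLTE` per unit of the smaller number, closed by `ZLTE`.
twoLTEFour : 2 <= 4
twoLTEFour = SLTE (SLTE ZLTE)

-- `1 < 3` unfolds to `2 <= 3`, so the proof has the same shape.
oneLTThree : 1 < 3
oneLTThree = SLTE (SLTE ZLTE)
```

A proof of m <= n built this way contains m applications of SLTE, which is exactly the linear cost that fromLTE and toLTE keep out of the runtime.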
    

Final remarks: Proving things about primitives can be challenging, both when deciding on what axioms to use and when trying to make things perform well at runtime and compile time. I'm experimenting with a library that deals with these issues. It is not yet finished, but you can have a look at it here.

Getting Started with pack and Idris2

Here I describe what I find to be the most convenient way to get up and running with Idris2. We are going to install the pack package manager, which will install a recent version of the Idris compiler along the way. However, this means that you need access to a Unix-like operating system such as Linux or macOS. Windows users can make use of WSL to get access to a Linux environment on their system. As a prerequisite, it is assumed that readers know how to start a terminal session on their system, and how to run commands from the terminal's command-line. In addition, readers need to know how to add directories to the $PATH variable on their system.

Installing pack

In order to install the pack package manager together with a recent version of the Idris2 compiler, follow the instructions on pack's GitHub page.

If all goes well, I suggest you take a moment to inspect the default settings available in your global pack.toml file, which can be found at $HOME/.pack/user/pack.toml (unless you explicitly set the $PACK_DIR environment variable to a different directory). If possible, I suggest you install the rlwrap tool and change the following setting in your global pack.toml file to true:

repl.rlwrap = true

This will lead to a nicer experience when running REPL sessions. You might also want to set up your editor to make use of the interactive editing features provided by Idris. Instructions for doing this in Neovim can be found here.

Updating pack and Idris

Both projects, pack and the Idris compiler, are still being actively developed. It is therefore a good idea to update them regularly. To update pack itself, just run the following command:

pack update

To build and install the latest commit of the Idris compiler and use the latest package collection, run

pack switch latest

Setting up your Playground

If you are going to solve the exercises in this tutorial (you should!), you'll have to write a lot of code. It is best to set up a small playground project for tinkering with Idris. In a directory of your choice, run the following command:

pack new lib tut

This will set up a minimal Idris package in directory tut together with an .ipkg file called tut.ipkg, a directory to put your Idris sources called src, and a minimal Idris module at src/Tut.idr.

In addition, it sets up a minimal test suite in directory test. All of this is put together and made accessible to pack in a pack.toml file in the project's root directory. Take your time and quickly inspect the content of every file created by pack: The .idr files contain Idris source code. The .ipkg files contain detailed descriptions of packages for the Idris compiler including where the sources are located, the modules a package makes available to other projects, and a list of packages the project itself depends on. Finally, the pack.toml file informs pack about the local packages in the current project.

With this, here is a bunch of things you can do, but first, make sure you are in the project's root directory (called tut if you followed my suggestion) or one of its child folders when running these commands.

To typecheck the library sources, run

pack typecheck tut

To build and execute the test suite, run

pack test tut

To start a REPL session with src/Tut.idr loaded, run

pack repl src/Tut.idr

Conclusion

In this very short tutorial you set up an environment for working on Idris projects and following along with the main part of the tutorial. You are now ready to start with the first chapter, or - if you already wrote some Idris code - to learn about the details of the Idris module system.

Please note that this tutorial itself is set up as a pack project: It contains a pack.toml and tutorial.ipkg file in its root directory (have a look at them to get a feel for how such projects are set up) and a lot of Idris sources in the subfolders of directory src.

Interactive Editing in Neovim

Idris provides extensive capabilities to interactively analyze the types of values and expressions in our programs and fill out skeleton implementations and sometimes even whole programs for us based on the types provided. These interactive editing features are available via plugins in different editors. Since I am a Neovim user, I explain the Idris related parts of my own setup in detail here.

The main component required to get all these features to run in Neovim is an executable provided by the idris2-lsp project. This executable makes use of the Idris compiler API (application programming interface) internally and can check the syntax and types of the source code we are working on. It communicates with Neovim via the language server protocol (LSP). This communication is set up through the idris2-nvim plugin.

As we will see in this tutorial, the idris2-lsp executable not only supports syntax and type checking, but also comes with additional interactive editing features. Finally, the Idris compiler API supports semantic highlighting of Idris source code: Identifiers and keywords are highlighted not only based on the language's syntax (that would be syntax highlighting, a feature expected from all modern programming environments and editors), but also based on their semantics. For instance, a local variable in a function implementation gets highlighted differently than the name of a top level function, although syntactically these are both just identifiers.

module Appendices.Neovim

import Data.Vect

%default total

Setup

In order to make full use of interactive Idris editing in Neovim, at least the following tools need to be installed:

  • A recent version of Neovim (version 0.5 or later).
  • A recent version of the Idris compiler (at least version 0.5.1).
  • The Idris compiler API.
  • The idris2-lsp package.
  • The following Neovim plugins:

The idris2-lsp project gives detailed instructions about how to install Idris 2 together with its standard libraries and compiler API. Make sure to follow these instructions so that your compiler and idris2-lsp executable are in sync.

If you are new to Neovim, you might want to use the init.vim file provided in the resources folder. In that case, the necessary Neovim plugins are already included, but you need to install vim-plug, a plugin manager. Afterwards, copy all or parts of resources/init.vim to your own init.vim file. (Use :help init.vim from within Neovim in order to find out where to look for this file.) After setting up your init.vim file, restart Neovim and run :PlugUpdate to install the necessary plugins.

A Typical Workflow

In order to check out the interactive editing features available to us, we will reimplement some small utilities from the Prelude. To follow along, you should have already worked through the Introduction, Functions Part 1, and at least parts of Algebraic Data Types, otherwise it will be hard to understand what's going on here.

Before we begin, note that the commands and actions shown in this tutorial might not work correctly after you edited a source file but did not write your changes to disk. Therefore, the first thing you should try if the things described here do not work, is to quickly save the current file (:w).

Let's start with negation of a boolean value:

negate1 : Bool -> Bool

Typically, when writing Idris code we follow the mantra "types first". Although you might already have an idea about how to implement a certain piece of functionality, you still need to provide an accurate type before you can start writing your implementation. This means, when programming in Idris, we have to mentally keep track of the implementation of an algorithm and the types involved at the same time, both of which can become arbitrarily complex. Or do we? Remember that Idris knows at least as much about the variables and their types available in the current context of a function implementation as we do, so we probably should ask it for guidance instead of trying to do everything on our own.

So, in order to proceed, we ask Idris for a skeleton function body: In normal editor mode, move your cursor onto the line where negate1 is declared and enter <LocalLeader>a in quick succession. <LocalLeader> is a special key that can be specified in the init.vim file. If you use the init.vim from the resources folder, it is set to the comma character (,), in which case the above command consists of a comma quickly followed by the lowercase letter "a". See also :help leader and :help localleader in Neovim.

Idris will generate a skeleton implementation similar to the following:

negate2 : Bool -> Bool
negate2 x = ?negate2_rhs

Note that on the left hand side a new variable with name x was introduced, while on the right hand side Idris added a metavariable (also called a hole). This is an identifier prefixed with a question mark. It signals to Idris that we will implement this part of the function at a later time. The great thing about holes is that we can hover over them and inspect their types and the types of values in the surrounding context. You can do so by placing the cursor on the identifier of a hole and entering K (the uppercase letter) in normal mode. This will open a popup displaying the type of the variable under the cursor plus the types and quantities of the variables in the surrounding context. You can also have this information displayed in a separate window: Enter <LocalLeader>so to open this window and repeat the hovering. The information will appear in the new window and as an additional benefit, it will be semantically highlighted. Enter <LocalLeader>sc to close this window again. Go ahead and check out the type and context of ?negate2_rhs.

Most functions in Idris are implemented by pattern matching on one or more of the arguments. Idris, knowing the data constructors of all non-primitive data types, can write such pattern matches for us (a process also called case splitting). To give this a try, move the cursor onto the x in the skeleton implementation of negate2, and enter <LocalLeader>c in normal mode. The result will look as follows:

negate3 : Bool -> Bool
negate3 False = ?negate3_rhs_0
negate3 True = ?negate3_rhs_1

As you can see, Idris inserted a hole for each of the cases on the right hand side. We can again inspect their types or replace them with a proper implementation directly.
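
For completeness, replacing both holes by hand leads to a definition like the following (negate4 is just a hypothetical name continuing the numbering used here):

```idris
negate4 : Bool -> Bool
negate4 False = True
negate4 True  = False
```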

This concludes the introduction of the (in my opinion) core features of interactive editing: Hovering on metavariables, adding skeleton function implementations, and case splitting (which also works in case blocks and for nested pattern matches). You should start using these all the time now!

Sometimes, Idris knows enough about the types involved to come up with a function implementation on its own. For instance, let us implement function either from the Prelude. After giving its type, creating a skeleton implementation, and case splitting on the Either argument, we arrive at something similar to the following:

either2 : (a -> c) -> (b -> c) -> Either a b -> c
either2 f g (Left x) = ?either2_rhs_0
either2 f g (Right x) = ?either2_rhs_1

Idris can come up with expressions for the two metavariables on its own, because the types are specific enough. Move the cursor onto one of the metavariables and enter <LocalLeader>o in normal mode. You will be given a selection of possible expressions (only one in this case), of which you can choose a fitting one (or abort with q).
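
After accepting the proposed expression for each of the two holes, the result should look similar to the following sketch (either3 is a hypothetical name, and the exact layout produced by your editor may differ):

```idris
either3 : (a -> c) -> (b -> c) -> Either a b -> c
either3 f g (Left x)  = f x
either3 f g (Right x) = g x
```

There is no other way to produce a value of type c from the variables in scope, which is why the search yields only a single candidate per hole.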

Here is another example: A reimplementation of function maybe. If you run an expression search on ?maybe2_rhs_1, you will get a larger list of choices.

maybe2 : b -> (a -> b) -> Maybe a -> b
maybe2 x f Nothing = x
maybe2 x f (Just y) = ?maybe2_rhs_1
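
The list is longer here because more than one expression of type b can be built from the variables in scope: both x and f y type check. Only the latter gives the behavior we expect from maybe, so this is a case where the choice matters. A sketch of the intended result (maybe3 is a hypothetical name):

```idris
maybe3 : b -> (a -> b) -> Maybe a -> b
maybe3 x f Nothing  = x
maybe3 x f (Just y) = f y
```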

Idris is also sometimes capable of coming up with complete function implementations based on a function's type. For this to work well in practice, the number of possible implementations satisfying the type checker must be pretty small. As an example, here is function zipWith for vectors. You might not have heard about vectors yet: They will be introduced in the chapter about dependent types. You can still give this a go to check out its effect. Just move the cursor on the line declaring zipWithV, enter <LocalLeader>gd and select the first option. This will automatically generate the whole function body including case splits and implementations.

zipWithV : (a -> b -> c) -> Vect n a -> Vect n b -> Vect n c
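
For reference, the generated definition should be equivalent to the following sketch. The name zipWithV2 is hypothetical, and the variable names chosen by Idris may differ from the ones used here.

```idris
import Data.Vect

zipWithV2 : (a -> b -> c) -> Vect n a -> Vect n b -> Vect n c
zipWithV2 f [] [] = []
zipWithV2 f (x :: xs) (y :: ys) = f x y :: zipWithV2 f xs ys
```

Since both vectors share the length n, only these two combinations of constructors are possible, which keeps the number of candidate implementations small.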

Expression search only works well if the types are specific enough. If you feel like that might be the case, go ahead and give it a go, either by running <LocalLeader>o on a metavariable, or by trying <LocalLeader>gd on a function declaration.

More Code Actions

There are other shortcuts available for generating part of your code, two of which I'll explain here.

First, it is possible to add a new case block by entering <LocalLeader>mc in normal mode when on a metavariable. For instance, here is part of an implementation of filterList, which appears in an exercise in the chapter about algebraic data types. I arrived at this by letting Idris generate a skeleton implementation followed by a case split and an expression search on the first metavariable:

filterList : (a -> Bool) -> List a -> List a
filterList f [] = []
filterList f (x :: xs) = ?filterList_rhs_1

We will next have to pattern match on the result of applying f to x. Idris can introduce a new case block for us, if we move the cursor onto metavariable ?filterList_rhs_1 and enter <LocalLeader>mc in normal mode. We can then continue with our implementation by first giving the expression to use in the case block (f x) followed by a case split on the new variable in the case block. This will lead us to an implementation similar to the following (I had to fix the indentation, though):

filterList2 : (a -> Bool) -> List a -> List a
filterList2 f [] = []
filterList2 f (x :: xs) = case f x of
  False => ?filterList2_rhs_2
  True => ?filterList2_rhs_3
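
Filling in the two remaining holes by hand completes the function. One possible result is sketched below (filterList3 is a hypothetical name):

```idris
filterList3 : (a -> Bool) -> List a -> List a
filterList3 f [] = []
filterList3 f (x :: xs) = case f x of
  False => filterList3 f xs       -- drop `x`
  True  => x :: filterList3 f xs  -- keep `x`
```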

Sometimes, we want to extract a utility function from an implementation we are working on. For instance, this is often useful or even necessary when we write proofs about our code (see chapters Propositional Equality and Predicates, for instance). In order to do so, we can move the cursor on a metavariable, and enter <LocalLeader>ml. Give this a try with ?whatNow in the following example (this will work better in a regular Idris source file instead of the literate file I use for this tutorial):

traverseEither : (a -> Either e b) -> List a -> Either e (List b)
traverseEither f [] = Right []
traverseEither f (x :: xs) = ?whatNow x xs f (f x) (traverseEither f xs)

Idris will create a new function declaration with the type and name of ?whatNow, which takes as arguments all variables currently in scope. It also replaces the hole in traverseEither with a call to this new function. Typically, you will have to manually remove unneeded arguments afterwards. This led me to the following version:

whatNow2 : Either e b -> Either e (List b) -> Either e (List b)

traverseEither2 : (a -> Either e b) -> List a -> Either e (List b)
traverseEither2 f [] = Right []
traverseEither2 f (x :: xs) = whatNow2 (f x) (traverseEither2 f xs)
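The extracted utility still lacks an implementation. The following sketch shows one plausible way to write it, assuming we want to stop at the first error. The names combine and traverseEither3 are hypothetical and only used for this illustration:

```idris
-- Prepends a successfully converted head to the converted tail,
-- or returns the first error that occurred.
combine : Either e b -> Either e (List b) -> Either e (List b)
combine (Left err) _          = Left err
combine (Right _)  (Left err) = Left err
combine (Right v)  (Right vs) = Right (v :: vs)

traverseEither3 : (a -> Either e b) -> List a -> Either e (List b)
traverseEither3 f []        = Right []
traverseEither3 f (x :: xs) = combine (f x) (traverseEither3 f xs)
```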

Getting Information

The idris2-lsp executable, and through it the idris2-nvim plugin, supports more than just the code actions described above. Here is a non-comprehensive list of other capabilities. I suggest you try out each of them from within this source file.

  • Typing K when on an identifier or operator in normal mode shows its type and namespace (if any). In case of a metavariable, variables in the current context are displayed as well together with their types and quantities (quantities will be explained in Functions Part 2). If you don't like popups, enter <LocalLeader>so to open a new window where this information is displayed and semantically highlighted instead.
  • Typing gd on a function, operator, data constructor or type constructor in normal mode jumps to the item's definition. For external modules, this works only if the module in question has been installed together with its source code (by using the idris2 --install-with-src command).
  • Typing <LocalLeader>mm opens a popup window listing all metavariables in the current module. You can place the cursor on an entry and jump to its location by pressing <Enter>.
  • Typing <LocalLeader>mn (or <LocalLeader>mp) jumps to the next (or previous) metavariable in the current module.
  • Typing <LocalLeader>br opens a popup where you can enter a namespace. Idris will then show all functions (plus their types) exported from that namespace in a popup window, and you can jump to a function's definition by pressing enter on one of the entries. Note: The module in question must be imported in the current source file.
  • Typing <LocalLeader>x opens a popup where you can enter a REPL command or Idris expression, and the plugin will reply with a response from the REPL. Whenever REPL examples are shown in the main part of this guide, you can try them from within Neovim with this shortcut if you like.
  • Typing <LocalLeader><LocalLeader>e will display the error message from the current line in a popup window. This can be highly useful if error messages are too long to fit on a single line. Likewise, <LocalLeader><LocalLeader>el will list all error messages from the current buffer in a new window. You can then select an error message and jump to its origin by pressing <Enter>.

Other use cases and examples are described on the GitHub page of the idris2-nvim plugin and can be included as described there.

The %name Pragma

When you ask Idris for a skeleton implementation with <LocalLeader>a or a case split with <LocalLeader>c, it has to decide on what names to use for the new variables it introduces. If these variables already have predefined names (from the function's signature, record fields, or named data constructor arguments), those names will be used, but otherwise Idris will as a default use names x, y, and z, followed by other letters. You can change this default behavior by specifying a list of names to use for such occasions for any data type.

For instance:

data Element = H | He | C | N | O | F | Ne

%name Element e,f

Idris will then use these names (followed by these names postfixed with increasing integers), when it has to come up with variable names of this type on its own. For instance, here is a test function and the result of adding a skeleton definition to it:

test : Element -> Element -> Element -> Element -> Element -> Element
test e f e1 f1 e2 = ?test_rhs

Conclusion

Neovim, together with the idris2-lsp executable and the idris2-nvim editor plugin, provides extensive utilities for interactive editing when programming in Idris. Similar functionality is available for some other editors, so feel free to ask what's available for your editor of choice, for instance on the Idris 2 Discord channel.

Structuring Idris Projects

In this section I'm going to show how to organize, install, and depend on larger Idris projects. We will have a look at Idris packages, the module system, visibility of types and functions, writing comments and doc strings, and using pack for managing our libraries.

This section should be useful for all readers who have already written a bit of Idris code. We will not do any fancy type level wizardry in here, but I'll demonstrate several concepts using failing code blocks, which you might not have seen before. This rather new addition to the language allows us to write code that is expected to fail during elaboration (type checking). For instance:

failing "Can't find an implementation for FromString Bits8."
  ohno : Bits8
  ohno = "Oh no!"

As part of a failing block, we can give a substring of the compiler's error message for documentation purposes and to make sure the block fails with the expected error.

Modules

Every Idris source file defines a module, typically starting with a module header like the one below:

module Appendices.Projects

A module's name consists of several upper case identifiers separated by dots, which must reflect the path of the .idr file where the module is stored. For instance, this module is stored in file Appendices/Projects.md, so the module's name is Appendices.Projects.

"But wait!", I hear you say, "What about the parent folder(s) of Appendices? Why aren't those part of the module's name?" In order to understand this, we must talk about the concept of the source directory. The source directory is where Idris looks for source files. It defaults to the directory from which the Idris executable is run. For instance, when in folder src of this project, you can open this source file like so:

idris2 Appendices/Projects.md

This will not work, however, if you try the same thing from this project's root folder:

$ idris2 src/Appendices/Projects.md
...
Error: Module name Appendices.Projects does not match file name "src/Appendices/Projects.md"
...

So, which folder names to include in a module name depends on the parent folder we consider to be our source directory. It is common practice to name the source directory src, although this is not mandatory (as I said above, the default is actually the directory from which we run Idris). It is possible to change the source directory with the --source-dir command-line option. The following works from within this project's root directory:

idris2 --source-dir src src/Appendices/Projects.md

And the following would work from a parent directory (assuming this tutorial is stored in folder tutorial):

idris2 --source-dir tutorial/src tutorial/src/Appendices/Projects.md

Most of the time, however, you will specify an .ipkg file for your project (see later in this section) and define the source directory there. Afterwards, you can use pack (instead of the idris2 executable) to start REPL sessions and load your source files.
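As a minimal sketch, the relevant part of such an .ipkg file could look as follows. The package name and the value of sourcedir are assumptions based on this project's layout:

```
package tutorial

sourcedir = "src"
```

With this in place, module names are resolved relative to the src folder, no matter from which directory the project is built.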

Module Imports

You often need to import functions and data types from other modules when writing Idris code. This can be done with an import statement. Here are several examples showing what these might look like:

import Data.String
import Data.List
import Text.CSV
import public Appendices.Neovim
import Data.Vect as V
import public Data.List1 as L

The first two lines import modules from another package (we will learn about packages below): Data.String and Data.List from the base package, which will be installed as part of your Idris installation.

The third line imports module Text.CSV from within our own source directory src. It is always possible to import modules that are part of the same source directory as the file we are working on.

The fourth line imports module Appendices.Neovim, again from our own source directory. Note, however, that this import statement comes with an additional public keyword. This allows us to re-export a module, so that it is available from within other modules in addition to the current module: If another module imports Appendices.Projects, module Appendices.Neovim will be imported as well without the need of an additional import statement. This is useful when we split some complex functionality across different modules and want to import the lot via a single catch-all module. See module Control.Monad.State in base for an example. You can look at the Idris sources on GitHub or locally after cloning the Idris2 project. The base library can be found in the libs/base subfolder.

It often happens that in order to make use of functions from some module A we also require utilities from another module B, so A should re-export B. For instance, Data.Vect in base re-exports Data.Fin, because the latter is often required when working with vectors.

The fifth line imports module Data.Vect, giving it a new name V, to be used as a shorter prefix. If you often need to disambiguate identifiers by prefixing them with a module's name, this can help make your code more concise:

vectSum : Nat
vectSum = sum $ V.fromList [1..10]

Finally, on the sixth line we publicly import a module and give it a new name. This name will then be the one seen when we transitively import Data.List1 via Appendices.Projects. To see this, start a REPL session (after type checking the tutorial) without loading a source file from this project's root folder:

pack typecheck tutorial
pack repl

Now load module Appendices.Projects and check out the type of singleton:

Main> :module Appendices.Projects
Imported module Appendices.Projects
Main> :t singleton
Data.String.singleton : Char -> String
Data.List.singleton : a -> List a
L.singleton : a -> List1 a

As you can see, the List1 version of singleton is now prefixed with L instead of Data.List1. It is still possible to use the "official" prefix, though:

Main> List1.singleton 12
12 ::: []
Main> L.singleton 12
12 ::: []

Namespaces

At times, we want to define several functions or data types with the same name in a single module. Idris does not allow this, because every name must be unique in its namespace, and the namespace of a module is just the fully qualified module name. However, it is possible to define additional namespaces within a module by using the namespace keyword followed by the name of the namespace. All functions which should belong to this namespace must then be indented by the same amount of whitespace.

Here's an example:

data HList : List Type -> Type where
  Nil  : HList []
  (::) : (v : t) -> (vs : HList ts) -> HList (t :: ts)

head : HList (t :: ts) -> t
head (v :: _) = v

tail : HList (t :: ts) -> HList ts
tail (_ :: vs) = vs

namespace HVect
  public export
  data HVect : Vect n Type -> Type where
    Nil  : HVect []
    (::) : (v : t) -> (vs : HVect ts) -> HVect (t :: ts)

  public export
  head : HVect (t :: ts) -> t
  head (v :: _) = v

  public export
  tail : HVect (t :: ts) -> HVect ts
  tail (_ :: vs) = vs

Function names HVect.head and HVect.tail as well as constructors HVect.Nil and HVect.(::) would clash with functions and constructors of the same names from the outer namespace (Appendices.Projects), so we had to put them in their own namespace. In order to be able to use them from outside their namespace, they need to be exported (see the section on visibility below). In case we need to disambiguate between these names, we can prefix them with part of their namespace. For instance, the following fails with a disambiguation error, because there are several functions called head in scope and it is not clear from head's argument (some data type supporting list syntax, of which again several are in scope), which version we want:

failing "Ambiguous elaboration."
  whatHead : Nat
  whatHead = head [12,"foo"]

By prefixing head with part of its namespace, we can resolve both ambiguities. It is now immediately clear that [12,"foo"] must be an HVect, because that's the type of HVect.head's argument:

thisHead : Nat
thisHead = HVect.head [12,"foo"]
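The same technique works for plain values. Below is a small self-contained sketch with two namespaces that both define a value called label. The namespaces Metric and Imperial as well as function labelFor are made up for this illustration:

```idris
namespace Metric
  public export
  label : String
  label = "km"

namespace Imperial
  public export
  label : String
  label = "mi"

-- Both values have the same name and type, so we select
-- the one we want by prefixing it with its namespace.
labelFor : (metric : Bool) -> String
labelFor True  = Metric.label
labelFor False = Imperial.label
```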

In the following subsection I'll make use of namespaces to demonstrate the principles of visibility.

Visibility

In order to use functions and data types outside of the module or namespace they were defined in, we need to change their visibility. The default visibility is private: Such a function or data type is not visible from outside its module or namespace:

namespace Foo
  foo : Nat
  foo = 12

failing "Name Appendices.Projects.Foo.foo is private."
  bar : Nat
  bar = 2 * foo

To make a function visible, annotate it with the export keyword:

namespace Square
  export
  square : Num a => a -> a
  square v = v * v

This will allow us to invoke function square from within other modules or namespaces (after importing Appendices.Projects):

OneHundred : Bits8
OneHundred = square 10

However, the implementation of square will not be exported, so square will not reduce during elaboration:

failing "Can't solve constraint between: 100 and square 10."
  checkOneHundred : OneHundred === 100
  checkOneHundred = Refl

For this to work, we need to publicly export square:

namespace SquarePub
  public export
  squarePub : Num a => a -> a
  squarePub v = v * v

OneHundredAgain : Bits8
OneHundredAgain = squarePub 10

checkOneHundredAgain : OneHundredAgain === 100
checkOneHundredAgain = Refl

Therefore, if you need a function to reduce during elaboration, annotate it with public export instead of export. This is especially important if you use a function to compute a type. Such functions must reduce during elaboration, otherwise they are completely useless:

namespace Stupid
  export
  0 NatOrString : Type
  NatOrString = Either String Nat

failing "Can't solve constraint between: Either String ?b and NatOrString."
  natOrString : NatOrString
  natOrString = Left "foo"

If we publicly export our type alias, everything type checks fine:

namespace Better
  public export
  0 NatOrString : Type
  NatOrString = Either String Nat

natOrString : Better.NatOrString
natOrString = Left "bar"

Visibility of Data Types

Visibility of data types behaves slightly differently. If set to private (the default), neither the type constructor nor the data constructors are visible outside of the namespace they were defined in. If annotated with export, the type constructor is exported but not the data constructors:

namespace Export
  export
  data Foo : Type where
    Foo1 : String -> Foo
    Foo2 : Nat -> Foo

  export
  mkFoo1 : String -> Export.Foo
  mkFoo1 = Foo1

foo1 : Export.Foo
foo1 = mkFoo1 "foo"

As you can see, we can use the type Foo as well as function mkFoo1 outside of namespace Export. However, we cannot use the Foo1 constructor to create a value of type Foo directly:

failing "Export.Foo1 is private."
  foo : Export.Foo
  foo = Foo1 "foo"
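Hiding data constructors in this way can be put to good use, for instance to make sure that all values of a type have passed some validation. The following sketch assumes a hypothetical type Percent, which should only hold numbers up to 100:

```idris
namespace Percent
  -- The type is exported, its data constructor is not.
  export
  data Percent : Type where
    MkPercent : Nat -> Percent

  -- The only way to create a `Percent` from outside this namespace.
  export
  mkPercent : Nat -> Maybe Percent
  mkPercent n = if n <= 100 then Just (MkPercent n) else Nothing

  -- Extracts the wrapped number.
  export
  value : Percent -> Nat
  value (MkPercent n) = n
```

Outside of namespace Percent, constructor MkPercent cannot be used, so every value of type Percent must have been created via mkPercent.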

This changes when we publicly export the data type:

namespace PublicExport
  public export
  data Foo : Type where
    Foo1 : String -> PublicExport.Foo
    Foo2 : Nat -> PublicExport.Foo

foo2 : PublicExport.Foo
foo2 = Foo2 12

The same goes for interfaces: If they are publicly exported, the interface (a type constructor) plus all its functions are exported and you can write implementations outside the namespace where they were defined:

namespace PEI
  public export
  interface Sized a where
    size : a -> Nat

Sized Nat where size = id

sumSizes : Foldable t => Sized a => t a -> Nat
sumSizes = foldl (\n,e => n + size e) 0

If they are not publicly exported, you will not be able to write implementations outside the namespace they were defined in (but you can still use the type and its functions in your code):

namespace EI
  export
  interface Empty a where
    empty : a -> Bool

  export
  Empty (List a) where
    empty [] = True
    empty _  = False

failing
  Empty Nat where
    empty Z = True
    empty (_) = False

nonEmpty : Empty a => a -> Bool
nonEmpty = not . empty

Child Namespaces

Sometimes, it is necessary to access a private function in another module or namespace. This is possible from within child namespaces (for want of a better name): Modules and namespaces sharing the parent module's or namespace's prefix. For instance:

namespace Inner
  testEmpty : Bool
  testEmpty = nonEmpty (the (List Nat) [12])

As you can see, we can access function nonEmpty from within namespace Appendices.Projects.Inner, although it is a private function of module Appendices.Projects. This is even possible for modules: If we were to write a module Data.List.Magic, we'd have access to private utility functions defined in module Data.List in base. Actually, I did just that and added module Data.List.Magic demonstrating this quirk of the Idris module system (go have a look!). In general, this is a rather hacky way to work around visibility constraints, but it can be useful at times.
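The same principle can be seen in a smaller, self-contained setting with nested namespaces. The names Vault, Door, secret, and reveal in this sketch are made up for the example:

```idris
namespace Vault
  -- Private by default: not visible outside of `Vault`.
  secret : Nat
  secret = 42

  namespace Door
    -- `Vault.Door` is a child namespace of `Vault`, so it
    -- is allowed to refer to the private value `secret`.
    export
    reveal : Nat
    reveal = secret
```

Code outside of Vault can invoke reveal, but it cannot refer to secret directly.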

Parameter Blocks

In this subsection, we are going to have a look at a language construct called a parameters block, which enables us to share a set of common read-only arguments (parameters) across several functions, thus allowing us to write more concise function signatures. I'm going to demonstrate their usability with a small example program.

The most basic way to make some piece of external information available to a function is by passing it as an additional argument. In object-oriented programming, this principle is sometimes called dependency injection, and a lot of fuss is being made about it, and whole libraries and frameworks have been built around it.

In functional programming, we can be perfectly relaxed about all of this: Need access to some configuration data for your application? Pass it as an additional argument to your functions. Want to use some local mutable state? Pass the corresponding IORef as an additional argument to your functions. This is both highly efficient and incredibly simple. The only drawback it has: It can blow up our function signatures. There is even a monad for abstracting over this concept, called the Reader monad. It can be found in module Control.Monad.Reader, in the base library.

In Idris, however, there is an even simpler approach: We can use proof search with auto implicit arguments for dependency injection. Here's some example code:

data Error : Type where
  NoNat  : String -> Error
  NoBool : String -> Error

record Console where
  constructor MkConsole
  read : IO String
  put  : String -> IO ()

record ErrorHandler where
  constructor MkHandler
  handle : Error -> IO ()

getCount' : (h : ErrorHandler) => (c : Console) => IO Nat
getCount' = do
  str <- c.read
  case parsePositive str of
    Nothing => h.handle (NoNat str) $> 0
    Just n  => pure n

getText' : (h : ErrorHandler) => (c : Console) => (n : Nat) -> IO (Vect n String)
getText' n = sequence $ replicate n c.read

prog' : ErrorHandler => (c : Console) => IO ()
prog' = do
  c.put "Please enter the number of lines to read."
  n  <- getCount'
  c.put "Please enter \{show n} lines of text."
  ls <- getText' n
  c.put "Read \{show n} lines and \{show . sum $ map length ls} characters."

The example program reads input from and prints output to some Console type, the implementation of which is left to the caller of the function. This is a typical example of dependency injection: Our IO actions know nothing about how to read and write lines of text (they do not, for instance, invoke putStrLn or getLine directly), but rely on an external object to handle these tasks for us. This allows us to use a simple mock object during testing, while using - for instance - two file handles or database connections when running the application for real. These are typical techniques often found in object-oriented programming, and in fact, this example emulates typical object-oriented patterns in a purely functional programming language: A type like Console can be viewed as a class providing pieces of functionality (methods read and put), and a value of type Console can be viewed as an object of this class, on which we can invoke those methods.

The same goes for error handling: Our error handler could just silently ignore any error that occurs, or it could print it to stderr and write it to a log file at the same time. Whatever it does, our functions need not care.

Note, however, that even in this very simple example we already introduced two additional function arguments, and we can easily see how in a real-world application we might need many more of those and how this would quickly blow up our function signatures. Luckily, there is a very clean and simple solution to this in Idris: parameter blocks. These allow us to specify lists of parameters (unchanging function arguments) shared by all functions listed inside the block. These arguments need then no longer be listed with each function, thus decluttering our function signatures. Here's the example from above in a parameter block:

parameters {auto c : Console} {auto h : ErrorHandler}
  getCount : IO Nat
  getCount = do
    str <- c.read
    case parsePositive str of
      Nothing => h.handle (NoNat str) $> 0
      Just n  => pure n

  getText : (n : Nat) -> IO (Vect n String)
  getText n = sequence $ replicate n c.read

  prog : IO ()
  prog = do
    c.put "Please enter the number of lines to read."
    n  <- getCount
    c.put "Please enter \{show n} lines of text."
    ls <- getText n
    c.put "Read \{show n} lines and \{show . sum $ map length ls} characters."

We are free to list arbitrary arguments (implicit, explicit, auto-implicit, named and unnamed) of any quantity as the parameters in a parameters block, but it works best with implicit and auto implicit arguments. Explicit arguments will have to be passed explicitly to functions in a parameter block, even when invoking them from other parameter blocks with the same explicit argument. This can be rather confusing.

To complete this example, here is a main function for running the program. Note how we explicitly assemble the Console and ErrorHandler to be used when invoking prog.

main : IO ()
main =
  let cons := MkConsole (trim <$> getLine) putStrLn
      err  := MkHandler (const $ putStrLn "It didn't work")
   in prog
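The mock objects mentioned earlier can be assembled in the same way. Here is a self-contained sketch, in which a hypothetical Logger record takes the role of Console, and the mock implementation collects all messages in an IORef instead of printing them. All names in this example (Logger, greet, greetAll, runWithMock) are assumptions made for the illustration:

```idris
import Data.IORef

record Logger where
  constructor MkLogger
  logMsg : String -> IO ()

parameters {auto l : Logger}
  greet : String -> IO ()
  greet name = l.logMsg "Hello, \{name}!"

  greetAll : List String -> IO ()
  greetAll names = traverse_ greet names

-- Runs `greetAll` with a mock logger and returns the
-- logged messages in the order they were written.
runWithMock : List String -> IO (List String)
runWithMock names = do
  ref <- newIORef (the (List String) [])
  -- the mock prepends every message to the list stored in `ref`
  let mock := MkLogger (\msg => modifyIORef ref (msg ::))
  greetAll names
  reverse <$> readIORef ref
```

Since greet and greetAll only ever talk to the Logger they are given, a test can inspect the collected messages without anything being written to the terminal.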

Dependency injection via auto-implicit arguments is only one possible application of parameter blocks. They are useful in general whenever we have repeating argument lists for several functions.
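For comparison, here is a small sketch of a parameters block with an explicit parameter. The names factor, scale, scaleAll, and doubled are hypothetical. Inside the block, the parameter is passed along automatically, while callers outside the block have to provide it as the first argument:

```idris
parameters (factor : Nat)
  scale : Nat -> Nat
  scale n = factor * n

  -- Inside the block, `scale` is invoked without passing `factor`.
  scaleAll : List Nat -> List Nat
  scaleAll ns = map scale ns

-- Outside the block, the explicit parameter comes first.
doubled : List Nat
doubled = scaleAll 2 [1, 2, 3]
```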

Documentation

Documentation is key. Be it for other programmers using a library we wrote, or for people (including our future selves) trying to understand our code, it is important to annotate our code with comments explaining non-trivial implementation details and docstrings describing the intent and functionality of exported data types and functions.

Comments

Writing a comment in an Idris source file is as simple as adding some text after two hyphens:

-- this is a truly boring comment
boring : Bits8 -> Bits8
boring a = a -- probably I should just use `id` from the Prelude

Whenever a line contains two hyphens that are not part of a string literal, the remainder of the line will be interpreted as a comment by Idris.

It is also possible to write multiline comments using delimiters {- and -}:

{-
  This is a multiline comment. It can be used to comment
  out whole blocks of code, for instance if we get several
  type errors in a larger source file.
-}

Doc Strings

While comments are targeted at programmers reading and trying to understand our source code, doc strings provide documentation for exported functions and data types, explaining their intent and behavior to others.

Here's an example of a documented function:

||| Tries to extract the first two elements from the beginning
||| of a list.
|||
||| Returns a pair of values wrapped in a `Just` if the list has
||| two elements or more. Returns `Nothing` if the list has fewer
||| than two elements.
export
firstTwo : List a -> Maybe (a,a)
firstTwo (x :: y :: _) = Just (x,y)
firstTwo _             = Nothing

We can view a doc string at the REPL:

Appendices.Projects> :doc firstTwo
Appendices.Projects.firstTwo : List a -> Maybe (a,a)
  Tries to extract the first two elements from the beginning
  of a list.

  Returns a pair of values wrapped in a `Just` if the list has
  two elements or more. Returns `Nothing` if the list has fewer
  than two elements.
  Visibility: export

We can document data types and their constructors in a similar manner:

||| A binary tree indexed by the number of values it holds.
|||
||| @param `n` : Number of values stored in the `Tree`
||| @param `a` : Type of values stored in the `Tree`
public export
data Tree : (n : Nat) -> (a : Type) -> Type where
  ||| A single value stored at the leaf of a binary tree.
  Leaf   : (v : a) -> Tree 1 a

  ||| A branch unifying two subtrees.
  Branch : Tree m a -> Tree n a -> Tree (m + n) a

Go ahead and have a look at the doc strings this generates at the REPL.

Documenting our code is very important. You will realize this once you try to understand other people's code, or when you come back to a non-trivial piece of source code you wrote yourself a couple of months ago and haven't looked at since. If it is not well documented, this can be an unpleasant experience. Idris provides us with the tools necessary to document and annotate our code, so we should take our time and do so. It is time well spent.

Packages

Idris packages allow us to assemble several modules into a logical unit and make them available to other Idris projects by installing the packages. In this section, we are going to learn about the structure of an Idris package and how to depend on other packages in our projects.

The .ipkg File

At the heart of an Idris package lies its .ipkg file, which is usually but not necessarily stored at a project's root directory. For instance, for this Idris tutorial, there is file tutorial.ipkg at the tutorial's root directory.

An .ipkg file consists of several key-value pairs (most of them optional), the most important of which I'll describe here. By far the easiest way to set up a new Idris project is by letting pack or Idris itself do it for you. Just run

pack new lib pkgname

to create the skeleton of a new library or

pack new bin appname

to set up a new application. In addition to creating a new directory plus a suitable .ipkg file, these commands will also add a pack.toml file, which we will discuss further below.

Dependencies

One of the most important aspects of an .ipkg file is listing the packages the library depends on in the depends field. Here is an example from the hedgehog package, a framework for writing property tests in Idris:

depends    = base         >= 0.5.1
           , contrib      >= 0.5.1
           , elab-util    >= 0.5.0
           , pretty-show  >= 0.5.0
           , sop          >= 0.5.0

As you can see, hedgehog depends on base and contrib, both of which are part of every Idris installation, but also on elab-util, a library of utilities for writing elaborator scripts (a powerful technique for creating Idris declarations by writing Idris code; it comes with its own lengthy tutorial if you are interested), sop, a library for generically deriving interface implementations via a sum of products representation (this is a useful thing you might want to check out some day), and pretty-show, a library for pretty printing Idris values (hedgehog makes use of this in case a test fails).

So, before you can actually use hedgehog to write some property tests for your own project, you will need to install the packages it depends on before installing hedgehog itself. Since this can be tedious to do manually, it is best to let a package manager like pack handle this task for you.

Dependency Versions

You might want to specify a certain version (or a range) Idris should use for your dependencies. This might be useful if you have several versions of the same package installed and not all of them are compatible with your project. Here are several examples:

depends    = base         == 0.5.1
           , contrib      == 0.5.1
           , elab-util    >= 0.5.0
           , pretty-show
           , sop          >= 0.5.0 && < 0.6.0

This will look for packages base and contrib of exactly the given version, package elab-util of a version greater than or equal to 0.5.0, package pretty-show of any version, and package sop of a version in the given range. In all cases, if several installed versions of a package match the specified range, the latest version will be used.

In order to make use of this for your own packages, every .ipkg file should give the package's name and current version:

package tutorial

version    = 0.1.0

As I'll show below, package versions play a much less crucial role when using pack and its curated package collection. But even then you might want to consider restricting the versions of packages you accept in order to make sure you catch any breaking changes introduced upstream.

Library Modules

Many if not most Idris packages available on GitHub are programming libraries: They implement some piece of functionality and make it available to all projects depending on the given package. This is unlike Idris applications, which are supposed to be compiled to an executable that can then be run on your computer. The Idris project itself provides both: The Idris compiler application, which we use to type check and build other Idris libraries and applications, and several libraries like prelude, base, and contrib, which provide basic data types and functions useful in most Idris projects.

In order to type check and install the modules you wrote in a library, you must list them in the .ipkg file's modules field. Here is an excerpt from the sop package:

modules = Data.Lazy
        , Data.SOP
        , Data.SOP.Interfaces
        , Data.SOP.NP
        , Data.SOP.NS
        , Data.SOP.POP
        , Data.SOP.SOP
        , Data.SOP.Utils

Modules missing from this list will not be installed and hence will not be available for other packages depending on the sop library.
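For an application, the .ipkg file additionally has to name the module holding the main function and the executable to generate. Here is a minimal sketch of such a file. The package name appname, the version, and the source directory are placeholders and not taken from an actual project:

```
package appname

version    = 0.1.0
sourcedir  = "src"
depends    = base
main       = Main
executable = "appname"
```

Here, main refers to the module containing the application's main function, while executable sets the name of the generated binary.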

Pack and its curated Collection of Packages

When the dependency graph of your project is getting large and complex, that is, when your project depends on many libraries, which themselves depend on yet other libraries, it can happen that two packages both depend on different - and, possibly, incompatible - versions of a third package. This situation can be next to impossible to resolve, and can lead to a lot of frustration when working with conflicting libraries.

It is therefore the philosophy of the pack project to avoid such a situation from the very beginning by making use of curated package collections. A pack collection consists of a specific Git commit of the Idris compiler and a set of packages, again each at a specific Git commit, all of which have been tested to work well and without issues together. You can see a list of packages available to pack here.

Whenever a project you are working on depends on one of the libraries listed in pack's package collection, pack will automatically install it and all of its dependencies for you. However, you might also want to depend on a library that is not yet part of pack's collection. In that case, you must specify the library in question in one of your pack.toml files - the global one found at $HOME/.pack/user/pack.toml, or one local to your current project or one of its parent directories (if any). There, you can either specify a dependency local to your system or a Git project (local or remote). An example for each is shown below:

[custom.all.foo]
type = "local"
path = "/path/to/foo"
ipkg = "foo.ipkg"

[custom.all.bar]
type   = "github"
url    = "https://github.com/me/bar"
commit = "latest:main"
ipkg   = "bar.ipkg"

As you can see, in both cases you have to specify where the project can be found as well as the name and location of its .ipkg file. In the case of a Git project, you also need to tell pack which commit it should use. In the example above, we want to use the latest commit from the main branch. We can use pack fetch to fetch and store the current latest commit hash.

Entries like the ones given above are all that is needed to add support for custom libraries to pack. You can now list these libraries as dependencies in your own project's .ipkg file and pack will automatically install them for you.

Conclusion

This concludes our section about structuring Idris projects. We have learned about several types of code blocks - failing blocks for showing that a piece of code fails to elaborate, namespaces for having overloaded names in the same source file, and parameter blocks for sharing lists of parameters between functions - and how to group several source files into an Idris library or application. Finally, we learned how to include external libraries in an Idris project and how to use pack to help us keep track of these dependencies.

A Deep Dive into Quantitative Type Theory

This section was guest-written by Kiana Sheibani.

In the tutorial proper, when discussing functions, Idris 2's quantity system was introduced. The description was intentionally a bit simplified - the inner workings of quantities are complicated, and that complication would have only confused any newcomers to Idris 2.

Here, I'll provide a more proper and thorough treatment of Quantitative Type Theory (QTT), including how quantity checking is performed and the theory behind it. Most of the information here will be unnecessary for understanding and writing Idris programs, and you are free to keep thinking about quantities like they were explained before. When working with quantities in their full complexity, however, a better understanding of how they work can be helpful to avoid misconceptions.

The Quantity Semiring

Quantitative Type Theory, as you probably already know, uses a set of quantities. The core theory allows for any quantities to be used, but Idris 2 in particular has three: erased, linear, and unrestricted. These are usually written as 0, 1, and ω (the Greek lowercase omega) respectively.

As QTT requires, these three quantities are equipped with the structure of an ordered semiring. The exact mathematical details of what that means aren't important; what it means for us is that quantities can be added and multiplied together, and that there is an ordering relation on them. Here are the tables for each of these operations, where the first argument is on the left and the second is on the top:

Addition

| + | 0 | 1 | ω |
|---|---|---|---|
| 0 | 0 | 1 | ω |
| 1 | 1 | ω | ω |
| ω | ω | ω | ω |

Multiplication

| * | 0 | 1 | ω |
|---|---|---|---|
| 0 | 0 | 0 | 0 |
| 1 | 0 | 1 | ω |
| ω | 0 | ω | ω |

Order

| ≤ | 0     | 1     | ω    |
|---|-------|-------|------|
| 0 | true  | false | true |
| 1 | false | true  | true |
| ω | false | false | true |

These operations behave mostly how you might expect, with 0 and 1 being the usual numbers and ω being a sort of "infinity" value. (We have 1 + 1 = ω instead of 2 because there isn't a 2 quantity in our system.)

There is one big difference in our ordering, though: 0 ≤ 1 is false! We have that 0 ≤ ω and 1 ≤ ω, but not 0 ≤ 1, or 1 ≤ 0 for that matter. In the language of mathematics, we say that 0 and 1 are incomparable. We'll get into why this is the case later, when we talk about what these operations mean and how they're used.
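The three tables can be written down directly as Idris functions. The following is only an illustrative sketch: the names Quantity, Zero, One, Many, addQ, mulQ, and lteQ are hypothetical and not part of Idris 2's compiler API or of the tutorial's sources.

```idris
%default total

||| The three quantities of Idris 2: 0, 1, and ω (hypothetical encoding).
data Quantity = Zero | One | Many

||| Addition table: 0 is neutral, everything else sums to ω.
addQ : Quantity -> Quantity -> Quantity
addQ Zero q    = q
addQ q    Zero = q
addQ _    _    = Many

||| Multiplication table: 0 absorbs, 1 is neutral, ω * ω = ω.
mulQ : Quantity -> Quantity -> Quantity
mulQ Zero _    = Zero
mulQ _    Zero = Zero
mulQ One  q    = q
mulQ q    One  = q
mulQ Many Many = Many

||| Ordering table: `lteQ q r` is `True` exactly if q ≤ r.
lteQ : Quantity -> Quantity -> Bool
lteQ Zero Zero = True
lteQ One  One  = True
lteQ _    Many = True
lteQ _    _    = False

-- Compile-time checks of two entries discussed in the text.
onePlusOne : addQ One One = Many
onePlusOne = Refl

zeroNotBelowOne : lteQ Zero One = False
zeroNotBelowOne = Refl
```

The two Refl proofs at the end are checked by the type checker, so the file only compiles if those table entries are encoded as claimed.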

Variables and Contexts

In QTT, each variable in each context has an associated quantity. These quantities can be plainly seen when inspecting holes in the REPL. Here's an example from the tutorial:

 0 b : Type
 0 a : Type
   xs : List a
   f : a -> b
   x : a
   prf : length xs = length (map f xs)
------------------------------
mll1 : S (length xs) = S (length (map f xs))

In this hole's context, the type variables a and b have 0 quantity, while the others have ω quantity.

Since the context is what stores quantities, only names that appear in the context can have a quantity, including:

  • Function/lambda parameters
  • Pattern matching bindings
  • let bindings

These do not appear in the context, and thus do NOT have quantities:

  • Top-level definitions
  • where definitions
  • All non-variable expressions

A Change in Perspective

When writing Idris programs using holes, we tend to use a top-to-bottom approach: we start with looking at the context for the whole function, and then we look at smaller and smaller sub-expressions as we fill in the code. This means that quantities in the context tend to decrease over time - if the variable x has quantity 1 and you use it once, the quantity will decrease to 0.

When looking at how typechecking works, however, it's more natural to look at contexts in the other direction, from smaller sub-expressions to larger ones. This means that the quantities we're looking at will tend to increase instead. As an example, let's look at this simple function:

square : Num a => a -> a
square x = x * x

Let's first look at the context for the smallest sub-expression of this function, just the variable x:

 0 a : Type
 1 x : a
------------------------------
x : a

Now let's look at the context for the larger expression x * x:

 0 a : Type
   x : a
------------------------------
(x * x) : a

The quantity of the parameter x increased from 1 to ω, since we went from using it once to using it multiple times. When looking at expressions like this, we can think of the quantity q as saying that the variable is "used q times" in the expression.

Quantity Checking

With all of that background information established, we can finally see how quantity checking actually works. Let's follow what happens to a single variable x in our context as we perform different operations.

To illustrate how quantities evolve, I will provide Idris-style context diagrams showing the various cases. In these, capital-letter names T, E, etc. stand for any expression, and q, r, etc. stand for any quantity.

Variables and Literals

 1 x : T
------------------------------
x : T

In the simplest case, an expression is just a single variable. That variable will have quantity 1 in the context, while all others have quantity 0. (Other variables may also be missing entirely, which for quantity checking is equivalent to them having 0 quantity.)

 0 x : T
------------------------------
True : Bool

For literals such as 1, or constructors such as True, all variables in the context have quantity 0, since all variables are used 0 times in a constructor.

Function Application

 qf x : T
------------------------------
F : (r _ : A) -> B

 qe x : T
------------------------------
E : A

 (qf + r*qe) x : T
------------------------------
(F E) : B

This is the most complicated of QTT's rules. We have a function F whose parameter has r quantity, and we're applying it to E. If our variable x is used qf times in F and qe times in E, then it is used qf + r*qe times in the full expression.

To better understand this rule, let's look at some simpler cases. First, let's assume that x is not used in the function F, so that qf = 0. Then, x's full quantity is r * qe. For example, let's look at these two functions:

f x = id x

g x = id 1

Here, id has type a -> a, where its input is unrestricted (ω). In the first function, we can see that x is used once in the input of id, so the quantity of x in the whole expression is ω * 1 = ω. In the second function, x is used zero times in the input of id, so its quantity in the whole expression is ω * 0 = 0. The function g will typecheck if you mark its input as erased, but not f.
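The two functions above can be given explicit type signatures to check this claim. This is a minimal sketch: the signatures are my own additions, and the failing block (introduced earlier in the tutorial) is used to state that the second definition is rejected.

```idris
-- x has quantity ω * 0 = 0 in `id 1`, so it may be declared erased.
g : (0 x : a) -> Integer
g x = id 1

-- x has quantity ω * 1 = ω in `id x`, which an erased
-- parameter does not permit, so this definition is rejected.
failing
  f : (0 x : a) -> a
  f x = id x
```

If the 0 annotation is removed from the signature of f, the definition inside the failing block type checks and the block itself becomes an error.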

As another simplified case, let's assume that F is a linear function, meaning that r = 1. Then x's full quantity is qf + qe, the simple sum of the quantities of each part. Here's a function that demonstrates this:

ldup x = (#) x x

The linear pair constructor (#) is linear in both arguments, so to find the quantity of x in the full expression we can just add up the quantities in each part. x is used zero times in (#) itself and once in each of the two arguments, so the total quantity is 0 + 1 + 1 = ω. If the second x were replaced by something else, like a literal, the quantity would only be 0 + 1 + 0 = 1. Intuitively, you can think of these as "parallel expressions", and the addition operation tells you how quantities combine in parallel.
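Both sums can be checked against the compiler by choosing suitable signatures. In this sketch, the names pairWithLiteral and ldupLinear as well as all three signatures are hypothetical additions for illustration.

```idris
-- 0 + 1 + 1 = ω: fine for an unrestricted parameter.
ldup : a -> LPair a a
ldup x = x # x

-- 0 + 1 + 0 = 1: x is used exactly once, so it may be linear.
pairWithLiteral : (1 x : a) -> LPair a Integer
pairWithLiteral x = x # 0

-- 0 + 1 + 1 = ω, but the parameter is declared linear: rejected.
failing
  ldupLinear : (1 x : a) -> LPair a a
  ldupLinear x = x # x
```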

Subusaging

 q x : T
------------------------------
E : T'

(q ≤ r)

 r x : T
------------------------------
E : T'

This rule is where the order relation on quantities comes in. It allows us to convert a quantity in our context to another one, given that the new quantity is greater than or equal to the old one. Type theorists call this subusaging, as it lets us use variables less often than we claim in our types.

Subusaging is why this function definition is allowed:

ignore : a -> Int
ignore x = 42

The input x is used zero times, which would normally mean its quantity would have to be 0; however, since 0 ≤ ω, we can use subusaging to increase the quantity to ω.

This also explains the mysterious fact we pointed out earlier, that 0 ≰ 1 in our quantity ordering. If it were true that 0 ≤ 1, then we could also increase the quantity of x from 0 to 1:

ignoreLinear : (1 x : a) -> Int
ignoreLinear x = 42

This would mean that the quantity 1 would be for variables used at most once, rather than exactly once. Idris's designers decided that they wanted linearity to have the second meaning, not the first.
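The three possible annotations for an unused parameter can be compared side by side. This is a small sketch with hypothetical function names; the failing block states that the linear variant is rejected by the quantity checker.

```idris
-- 0 ≤ ω holds: an unused unrestricted parameter is accepted.
ignoreUnrestricted : a -> Int
ignoreUnrestricted x = 42

-- 0 ≤ 0 holds: an unused erased parameter is accepted.
ignoreErased : (0 x : a) -> Int
ignoreErased x = 42

-- 0 ≤ 1 does not hold: an unused linear parameter is rejected.
failing
  ignoreLinear : (1 x : a) -> Int
  ignoreLinear x = 42
```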

Lambdas and Other Bindings

 q x : A
------------------------------
E : B

(\q x => E) : (q x : A) -> B

This rule is the most important, as it is the only one in which quantities actually impact typechecking. It is also one of the most straightforward: a lambda expression \q x => E is only valid if x is used q times inside E. This rule doesn't only apply to lambdas, actually - it applies to any syntax where a variable that has a quantity is bound, such as function parameters, let, case, with, and so on.

let x = 1 in x + x

To see how quantity checking would work with this let-expression, we can simply desugar it into its equivalent lambda form:

(\x => x + x) 1

An explicit quantity q isn't given for the lambda in this expression, so Idris will try to infer the quantity, then check to see if it's valid. In this case, Idris will infer that x is unrestricted.

Pattern Matching

All of the binding constructs that this rule applies to support pattern matching, so we need to determine how quantities interact with patterns. To be more specific, if we have a function that pattern-matches like this:

func : (1 _ : LPair a b) -> c
func (x # y) = ?impl

How does the linear quantity of this function's input "descend" into the bindings x and y?

A simple rule is to apply the same function-application rule we looked at earlier, but to the left side of the equation. For example, here's how we compute the quantity required for x in this function definition:

func      (((#)      x)       y)
  0 + 1 * (( 0 + 1 * 1) + 1 * 0)  = 1

We start from the outside and work our way inwards, applying the qf + r*qe rule as we go. x is used zero times in the constant func, and its argument is linear. We know that x is used once inside of the linear pair (x # y) (aside from being obvious, we can compute this fact ourselves), so the number of times x must be used in func's definition is 0 + 1 * 1 = 1.

The same argument applies to y, meaning that y should also be used once inside of func for this definition to pass quantity checking. And in fact, if we look at the context of the hole ?impl, that's exactly what we see!

 0 a : Type
 0 b : Type
 0 c : Type
 1 x : a
 1 y : b
------------------------------
impl : c
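Since func has an abstract result type c, its hole cannot actually be filled. As a sketch of the same situation with a result type we can construct, here are two hypothetical functions (swapL and dropSecond are not part of the tutorial's sources): one uses both bindings exactly once, the other drops one of them.

```idris
-- Both x and y have quantity 1 and are used exactly once.
swapL : (1 _ : LPair a b) -> LPair b a
swapL (x # y) = y # x

-- y also has quantity 1 here, but it is used zero times: rejected.
failing
  dropSecond : (1 _ : LPair a b) -> a
  dropSecond (x # y) = x
```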

As a final note, pattern matching in Idris 2 is only allowed when the value in question exists at runtime, meaning that it isn't erased. This is because in QTT, a value must be constructed before it can be pattern-matched: if you match on a variable x, the resources required to make that variable's value are added to the total count.

 1 x : T
------------------------------
x : T

 q x : T
------------------------------
E : T'

 (1 + q) x : T
------------------------------
(case x of ... => E) : T'

For this reason, the total uses of the variable x when pattern-matching on it must be 1 + q, where q is the number of uses of x after the pattern-match (x can still be used afterwards through an as-pattern x@...). Since 1 + q is never 0, whatever q is, this prevents the quantity from being 0.
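The effect of the 1 + q rule can be observed with a small experiment. This is only a sketch with hypothetical function names: matching on a linear Bool accounts for its single use (q = 0), while matching on an erased Nat is rejected.

```idris
-- Matching consumes the linear argument: 1 + 0 = 1.
notL : (1 b : Bool) -> Bool
notL True  = False
notL False = True

-- The same kind of match on an unrestricted argument is fine as well.
isZ : Nat -> Bool
isZ Z     = True
isZ (S _) = False

-- Matching on an erased argument would need quantity 1 + q, never 0.
failing
  isZErased : (0 n : Nat) -> Bool
  isZErased Z     = True
  isZErased (S _) = False
```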

The Erased Fragment

Earlier I stated that only variables in the context can have quantities, which in particular means top-level definitions cannot have them. This is mostly true, but there is one slight exception: a function can be marked as erased by placing a 0 before its name.

0 erasedId : (0 x : a) -> a
erasedId x = x

This tells the type system to define this function within the erased fragment, which is a fragment of the type system wherein all quantity checks are ignored. In the erasedId function above, we use the function's input x once despite labeling it as erased. This would normally result in a quantity error, but this function is allowed due to being defined in the erased fragment.

This quantity freedom the erased fragment gives us comes with a big drawback, though - erased functions are banned from being used at runtime. In terms of the type theory, what this means is that an erased function can only ever be used in these two places:

  1. Inside of another erased-fragment function or expression;
  2. Inside of a function argument that's erased:

constInt : (0 _ : a) -> Int
constInt _ = 2

erased2 : Int
erased2 = constInt (erasedId 1)

This makes sure that quantities are always handled correctly at runtime, which is where it matters!
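The following sketch puts the allowed and the forbidden uses next to each other. It repeats erasedId and constInt from above so that it is self-contained; erasedTwice, viaErasedArg, and runtimeUse are hypothetical names added for illustration.

```idris
0 erasedId : (0 x : a) -> a
erasedId x = x

constInt : (0 _ : a) -> Int
constInt _ = 2

-- Case 1: used inside another erased definition.
0 erasedTwice : (0 x : a) -> a
erasedTwice x = erasedId (erasedId x)

-- Case 2: used inside an erased function argument.
viaErasedArg : Int
viaErasedArg = constInt (erasedId 'x')

-- Anywhere else, an erased function is not available: rejected.
failing
  runtimeUse : Int
  runtimeUse = erasedId 1
```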

There is another important place where the erased fragment comes into play, and that's in type signatures. The type signatures of definitions are always erased, so erased functions can be used inside of them.

erasedPrf : erasedId 0 = 0
erasedPrf = Refl

For this reason, erased functions are sometimes thought of as "exclusively type-level functions", though as we've seen, that's not entirely accurate.

Conclusion

This concludes our thorough discussion of Quantitative Type Theory. In this section, we learned about the various operations on quantities: their addition, multiplication, and ordering. We saw how quantities were linked to the context, and how to properly think about the context when analyzing type systems (bottom-to-top instead of top-to-bottom). We then moved on to studying QTT proper, and we saw how the quantities in our context change as the expressions we write grow more complex. Finally, we looked at the erased fragment, and how we can define erased functions.

In Idris 2's current state, most of this information is still entirely unnecessary for learning the language. That may not always be the case, though: there have been some discussions to change the quantity semiring that Idris 2 uses, or even to allow the programmer to choose which set of quantities to use. Whether those discussions lead to anything or not, it can still be useful to better understand how Quantitative Type Theory functions in order to write better Idris 2 code.

A Note on Mathematical Accuracy

The information in this appendix is partially based on Robert Atkey's 2018 paper Syntax and Semantics of Quantitative Type Theory, which outlines QTT in the standard language of type theory. The QTT presented in Atkey's paper is roughly similar to Idris 2's type system except for these differences:

  1. Atkey's theory does not have subusaging, and so the quantity semiring in Atkey's paper is not ordered.
  2. In Atkey's theory, types can only be constructed in the erased fragment, which means it is impossible to construct a type at runtime. Idris 2 allows constructing types at runtime, but still uses the erased fragment when inside of type signatures.

To resolve these differences, I directly observed how Idris 2's type system behaved in practice in order to determine where to deviate from Atkey's paper.

While I tried to be as mathematically accurate as possible in this section, some accuracy had to be sacrificed for the sake of simplicity. In particular, the description of pattern matching given here is substantially oversimplified. A proper formal treatment of pattern matching would require introducing an eliminator function for each datatype; this eliminator would serve to determine how that datatype's constructors interacted with quantity checking. The details of how this would work for a few simple types (such as the boolean type Bool) are in Atkey's paper above. I did not include these details because I decided that what I was describing was complicated enough already.

src/Solutions/Functions1.idr

module Solutions.Functions1

--------------------------------------------------------------------------------
--          Exercise 1
--------------------------------------------------------------------------------

square : Integer -> Integer
square n = n * n

testSquare : (Integer -> Bool) -> Integer -> Bool
testSquare fun = fun . square

twice : (Integer -> Integer) -> Integer -> Integer
twice f = f . f

--------------------------------------------------------------------------------
--          Exercise 2
--------------------------------------------------------------------------------

isEven : Integer -> Bool
isEven n = (n `mod` 2) == 0

isOdd : Integer -> Bool
isOdd = not . isEven

--------------------------------------------------------------------------------
--          Exercise 3
--------------------------------------------------------------------------------

isSquareOf : Integer -> Integer -> Bool
isSquareOf n x = n == x * x

--------------------------------------------------------------------------------
--          Exercise 4
--------------------------------------------------------------------------------

isSmall : Integer -> Bool
isSmall n = n <= 100

--------------------------------------------------------------------------------
--          Exercise 5
--------------------------------------------------------------------------------

absIsSmall : Integer -> Bool
absIsSmall = isSmall . abs

--------------------------------------------------------------------------------
--          Exercise 6
--------------------------------------------------------------------------------

and : (Integer -> Bool) -> (Integer -> Bool) -> Integer -> Bool
and f1 f2 n = f1 n && f2 n

or : (Integer -> Bool) -> (Integer -> Bool) -> Integer -> Bool
or f1 f2 n = f1 n || f2 n

negate : (Integer -> Bool) -> Integer -> Bool
negate f = not . f

--------------------------------------------------------------------------------
--          Exercise 7
--------------------------------------------------------------------------------

(&&) : (Integer -> Bool) -> (Integer -> Bool) -> Integer -> Bool
(&&) = and

(||) : (Integer -> Bool) -> (Integer -> Bool) -> Integer -> Bool
(||) = or

not : (Integer -> Bool) -> Integer -> Bool
not = negate

src/Solutions/DataTypes.idr

module Solutions.DataTypes

-- If all or almost all functions in a module are provably
-- total, it is convenient to add the following pragma
-- at the top of the module. It is then no longer necessary
-- to annotate each function with the `total` keyword.
%default total

--------------------------------------------------------------------------------
--          Enumerations
--------------------------------------------------------------------------------

-- 1
and : Bool -> Bool -> Bool
and True  b = b
and False _ = False

or : Bool -> Bool -> Bool
or True  _ = True
or False b = b

--2
data UnitOfTime = Second | Minute | Hour | Day | Week

toSeconds : UnitOfTime -> Integer -> Integer
toSeconds Second y = y
toSeconds Minute y = 60 * y
toSeconds Hour y   = 60 * 60 * y
toSeconds Day y    = 24 * 60 * 60 * y
toSeconds Week y   = 7 * 24 * 60 * 60 * y

fromSeconds : UnitOfTime -> Integer -> Integer
fromSeconds u s = s `div` toSeconds u 1

convert : UnitOfTime -> Integer -> UnitOfTime -> Integer
convert u1 n u2 = fromSeconds u2 (toSeconds u1 n)

--3

data Element = H | C | N | O | F

atomicMass : Element -> Double
atomicMass H = 1.008
atomicMass C = 12.011
atomicMass N = 14.007
atomicMass O = 15.999
atomicMass F = 18.9984


--------------------------------------------------------------------------------
--          Sum Types
--------------------------------------------------------------------------------

data Title = Mr | Mrs | Other String

eqTitle : Title -> Title -> Bool
eqTitle Mr        Mr        = True
eqTitle Mrs       Mrs       = True
eqTitle (Other x) (Other y) = x == y
eqTitle _         _         = False

isOther : Title -> Bool
isOther (Other _) = True
isOther _         = False

data LoginError = UnknownUser String | InvalidPassword | InvalidKey

showError : LoginError -> String
showError (UnknownUser x) = "Unknown user: " ++ x
showError InvalidPassword = "Invalid password"
showError InvalidKey      = "Invalid key"

--------------------------------------------------------------------------------
--          Records
--------------------------------------------------------------------------------

-- 1
record TimeSpan where
  constructor MkTimeSpan
  unit  : UnitOfTime
  value : Integer

timeSpanToSeconds : TimeSpan -> Integer
timeSpanToSeconds (MkTimeSpan unit value) = toSeconds unit value

-- 2
eqTimeSpan : TimeSpan -> TimeSpan -> Bool
eqTimeSpan x y = timeSpanToSeconds x == timeSpanToSeconds y

-- alternative equality check using `on` from the Idris Prelude
eqTimeSpan' : TimeSpan -> TimeSpan -> Bool
eqTimeSpan' = (==) `on` timeSpanToSeconds

-- 3
showUnit : UnitOfTime -> String
showUnit Second = "s"
showUnit Minute = "min"
showUnit Hour   = "h"
showUnit Day    = "d"
showUnit Week   = "w"

prettyTimeSpan : TimeSpan -> String
prettyTimeSpan (MkTimeSpan Second v) = show v ++ " s"
prettyTimeSpan (MkTimeSpan u v)      =
  show v ++ " " ++ showUnit u ++ " (" ++ show (toSeconds u v) ++ " s)"

-- 4
compareUnit : UnitOfTime -> UnitOfTime -> Ordering
compareUnit = compare `on` (\x => toSeconds x 1)

minUnit : UnitOfTime -> UnitOfTime -> UnitOfTime
minUnit x y = case compareUnit x y of
  LT => x
  _  => y

addTimeSpan : TimeSpan -> TimeSpan -> TimeSpan
addTimeSpan (MkTimeSpan u1 v1) (MkTimeSpan u2 v2) =
  case minUnit u1 u2 of
    u => MkTimeSpan u (convert u1 v1 u + convert u2 v2 u)

--------------------------------------------------------------------------------
--          Generic Data Types
--------------------------------------------------------------------------------

-- 1
mapMaybe : (a -> b) -> Maybe a -> Maybe b
mapMaybe _ Nothing  = Nothing
mapMaybe f (Just x) = Just (f x)

appMaybe : Maybe (a -> b) -> Maybe a -> Maybe b
appMaybe (Just f) (Just v) = Just (f v)
appMaybe _        _        = Nothing

bindMaybe : Maybe a -> (a -> Maybe b) -> Maybe b
bindMaybe Nothing  _ = Nothing
bindMaybe (Just x) f = f x

filterMaybe : (a -> Bool) -> Maybe a -> Maybe a
filterMaybe f Nothing  = Nothing
filterMaybe f (Just x) = if (f x) then Just x else Nothing

first : Maybe a -> Maybe a -> Maybe a
first Nothing  y = y
first (Just x) _ = Just x

last : Maybe a -> Maybe a -> Maybe a
last x y = first y x

foldMaybe : (acc -> el -> acc) -> acc -> Maybe el -> acc
foldMaybe f x = maybe x (f x)

-- 2
mapEither : (a -> b) -> Either e a -> Either e b
mapEither _ (Left x)  = Left x
mapEither f (Right x) = Right (f x)

appEither : Either e (a -> b) -> Either e a -> Either e b
appEither (Left x)  _         = Left x
appEither (Right _) (Left x)  = Left x
appEither (Right f) (Right v) = Right (f v)

bindEither : Either e a -> (a -> Either e b) -> Either e b
bindEither (Left x)  _ = Left x
bindEither (Right x) f = f x

firstEither : (e -> e -> e) -> Either e a -> Either e a -> Either e a
firstEither fun (Left e1) (Left e2) = Left (fun e1 e2)
firstEither _   (Left e1) y         = y
firstEither _   (Right x) _         = Right x

-- instead of implementing this via pattern matching, we use
-- firstEither and swap the arguments. Since this would mean that
-- in the case of two `Left`s the errors would be in the wrong
-- order, we have to swap the arguments of `fun` as well.
-- Function `flip` from the prelude does this for us.
lastEither : (e -> e -> e) -> Either e a -> Either e a -> Either e a
lastEither fun x y = firstEither (flip fun) y x

fromEither : (e -> c) -> (a -> c) -> Either e a -> c
fromEither f _ (Left x)  = f x
fromEither _ g (Right x) = g x

-- 3
mapList : (a -> b) -> List a -> List b
mapList f Nil       = Nil
mapList f (x :: xs) = f x :: mapList f xs

filterList : (a -> Bool) -> List a -> List a
filterList f Nil       = Nil
filterList f (x :: xs) =
  if f x then x :: filterList f xs else filterList f xs

(++) : List a -> List a -> List a
(++) Nil ys = ys
(++) (x :: xs) ys = x :: (Solutions.DataTypes.(++) xs ys)

headMaybe : List a -> Maybe a
headMaybe Nil      = Nothing
headMaybe (x :: _) = Just x

tailMaybe : List a -> Maybe (List a)
tailMaybe Nil       = Nothing
tailMaybe (x :: xs) = Just xs

lastMaybe : List a -> Maybe a
lastMaybe Nil        = Nothing
lastMaybe (x :: Nil) = Just x
lastMaybe (_ :: xs)  = lastMaybe xs

initMaybe : List a -> Maybe (List a)
initMaybe l = case l of
  Nil => Nothing
  x :: xs => case initMaybe xs of
    Nothing => Just Nil
    Just ys => Just (x :: ys)

foldList : (acc -> el -> acc) -> acc -> List el -> acc
foldList fun vacc Nil       = vacc
foldList fun vacc (x :: xs) = foldList fun (fun vacc x) xs

-- 4
record Client where
  constructor MkClient
  name          : String
  title         : Title
  age           : Bits8
  passwordOrKey : Either Bits64 String

data Credentials = Password String Bits64 | Key String String

login1 : Client -> Credentials -> Either LoginError Client
login1 c (Password u y) =
  if c.name == u then
    if c.passwordOrKey == Left y then Right c else Left InvalidPassword
  else Left (UnknownUser u)

login1 c (Key u x) =
  if c.name == u then
    if c.passwordOrKey == Right x then Right c else Left InvalidKey
  else Left (UnknownUser u)

login : List Client -> Credentials -> Either LoginError Client
login Nil       (Password u _) = Left (UnknownUser u)
login Nil       (Key u _)      = Left (UnknownUser u)
login (x :: xs) cs             = case login1 x cs of
  Right c               => Right c
  Left  InvalidPassword => Left InvalidPassword
  Left  InvalidKey      => Left InvalidKey
  Left _                => login xs cs

--5

formulaMass : List (Element,Nat) -> Double
formulaMass []             = 0
formulaMass ((e, n) :: xs) = atomicMass e * cast n + formulaMass xs



src/Solutions/Interfaces.idr

module Solutions.Interfaces

%default total

--------------------------------------------------------------------------------
--          Basics
--------------------------------------------------------------------------------

interface Comp a where
  comp : a -> a -> Ordering

-- 1
anyLarger : Comp a => a -> List a -> Bool
anyLarger va []        = False
anyLarger va (x :: xs) = comp va x == GT || anyLarger va xs

-- 2
allLarger : Comp a => a -> List a -> Bool
allLarger va []        = True
allLarger va (x :: xs) = comp va x == GT && allLarger va xs

-- 3
maxElem : Comp a => List a -> Maybe a
maxElem []        = Nothing
maxElem (x :: xs) = case maxElem xs of
  Nothing => Just x
  Just v  => if comp x v == GT then Just x else Just v

minElem : Comp a => List a -> Maybe a
minElem []        = Nothing
minElem (x :: xs) = case minElem xs of
  Nothing => Just x
  Just v  => if comp x v == LT then Just x else Just v

-- 4
interface Concat a where
  concat : a -> a -> a

implementation Concat String where
  concat = (++)

implementation Concat (List a) where
  concat = (++)

-- 5
concatList : Concat a => List a -> Maybe a
concatList []        = Nothing
concatList (x :: xs) = case concatList xs of
  Nothing => Just x
  Just v  => Just (concat x v)

--------------------------------------------------------------------------------
--          More about Interfaces
--------------------------------------------------------------------------------

-- 1
interface Equals a where
  eq : a -> a -> Bool

  neq : a -> a -> Bool
  neq x y = not (eq x y)

interface Concat a => Empty a where
  empty : a

Equals a => Equals b => Equals (a,b) where
  eq (x1,y1) (x2,y2) = eq x1 x2 && eq y1 y2

Comp a => Comp b => Comp (a,b) where
  comp (x1,y1) (x2,y2) = case comp x1 x2 of
    EQ => comp y1 y2
    v  => v

Concat a => Concat b => Concat (a,b) where
  concat (x1,y1) (x2,y2) = (concat x1 x2, concat y1 y2)

Empty a => Empty b => Empty (a,b) where
  empty = (empty, empty)

-- 2
data Tree : Type -> Type where
  Leaf : a -> Tree a
  Node : Tree a -> Tree a -> Tree a

Equals a => Equals (Tree a) where
  eq (Leaf x)     (Leaf y)     = eq x y
  eq (Node l1 r1) (Node l2 r2) = eq l1 l2 && eq r1 r2
  eq _            _            = False

Concat (Tree a) where
  concat = Node

--------------------------------------------------------------------------------
--          Interfaces in the Prelude
--------------------------------------------------------------------------------

-- 1
record Complex where
  constructor MkComplex
  rel : Double
  img : Double

Eq Complex where
  MkComplex r1 i1 == MkComplex r2 i2 = r1 == r2 && i1 == i2

Num Complex where
  MkComplex r1 i1 + MkComplex r2 i2 = MkComplex (r1 + r2) (i1 + i2)
  MkComplex r1 i1 * MkComplex r2 i2 =
    MkComplex (r1 * r2 - i1 * i2) (r1 * i2 + r2 * i1)
  fromInteger n = MkComplex (fromInteger n) 0.0

Neg Complex where
  negate (MkComplex r i) = MkComplex (negate r) (negate i)
  MkComplex r1 i1 - MkComplex r2 i2 = MkComplex (r1 - r2) (i1 - i2)

Fractional Complex where
  MkComplex r1 i1 / MkComplex r2 i2 = case r2 * r2 + i2 * i2 of
    denom => MkComplex ((r1 * r2 + i1 * i2) / denom)
                       ((i1 * r2 - r1 * i2) / denom)

-- 2
Show Complex where
  showPrec p c = showCon p "MkComplex" (showArg c.rel ++ showArg c.img)

-- 3
record First a where
  constructor MkFirst
  value : Maybe a

pureFirst : a -> First a
pureFirst = MkFirst . Just

mapFirst : (a -> b) -> First a -> First b
mapFirst f = MkFirst . map f . value

mapFirst2 : (a -> b -> c) -> First a -> First b -> First c
mapFirst2 f (MkFirst (Just va)) (MkFirst (Just vb)) = pureFirst (f va vb)
mapFirst2 _ _ _ = MkFirst Nothing

Eq a => Eq (First a) where
  (==) = (==) `on` value

Ord a => Ord (First a) where
  compare = compare `on` value

Show a => Show (First a) where
  show = show . value

FromString a => FromString (First a) where
  fromString = pureFirst . fromString

FromDouble a => FromDouble (First a) where
  fromDouble = pureFirst . fromDouble

FromChar a => FromChar (First a) where
  fromChar = pureFirst . fromChar

Num a => Num (First a) where
  (+) = mapFirst2 (+)
  (*) = mapFirst2 (*)
  fromInteger = pureFirst . fromInteger

Neg a => Neg (First a) where
  negate = mapFirst negate
  (-) = mapFirst2 (-)

Integral a => Integral (First a) where
  mod = mapFirst2 mod
  div = mapFirst2 div

Fractional a => Fractional (First a) where
  (/) = mapFirst2 (/)
  recip = mapFirst recip

-- 4
Semigroup (First a) where
  l@(MkFirst (Just _)) <+> _ = l
  _                    <+> r = r

Monoid (First a) where
  neutral = MkFirst Nothing

-- 5
record Last a where
  constructor MkLast
  value : Maybe a

pureLast : a -> Last a
pureLast = MkLast . Just

mapLast : (a -> b) -> Last a -> Last b
mapLast f = MkLast . map f . value

mapLast2 : (a -> b -> c) -> Last a -> Last b -> Last c
mapLast2 f (MkLast (Just va)) (MkLast (Just vb)) = pureLast (f va vb)
mapLast2 _ _ _ = MkLast Nothing

Eq a => Eq (Last a) where
  (==) = (==) `on` value

Ord a => Ord (Last a) where
  compare = compare `on` value

Show a => Show (Last a) where
  show = show . value

FromString a => FromString (Last a) where
  fromString = pureLast . fromString

FromDouble a => FromDouble (Last a) where
  fromDouble = pureLast . fromDouble

FromChar a => FromChar (Last a) where
  fromChar = pureLast . fromChar

Num a => Num (Last a) where
  (+) = mapLast2 (+)
  (*) = mapLast2 (*)
  fromInteger = pureLast . fromInteger

Neg a => Neg (Last a) where
  negate = mapLast negate
  (-) = mapLast2 (-)

Integral a => Integral (Last a) where
  mod = mapLast2 mod
  div = mapLast2 div

Fractional a => Fractional (Last a) where
  (/) = mapLast2 (/)
  recip = mapLast recip

Semigroup (Last a) where
  _ <+> r@(MkLast (Just _)) = r
  l <+> _                   = l

Monoid (Last a) where
  neutral = MkLast Nothing

-- 6
last : List a -> Maybe a
last = value . foldMap pureLast

-- 7
record Any where
  constructor MkAny
  any : Bool

Semigroup Any where
  MkAny x <+> MkAny y = MkAny (x || y)

Monoid Any where
  neutral = MkAny False

record All where
  constructor MkAll
  all : Bool

Semigroup All where
  MkAll x <+> MkAll y = MkAll (x && y)

Monoid All where
  neutral = MkAll True

-- 8
anyElem : (a -> Bool) -> List a -> Bool
anyElem f = any . foldMap (MkAny . f)

allElems : (a -> Bool) -> List a -> Bool
allElems f = all . foldMap (MkAll . f)

-- 9
record Sum a where
  constructor MkSum
  value : a

record Product a where
  constructor MkProduct
  value : a

Num a => Semigroup (Sum a) where
  MkSum x <+> MkSum y = MkSum (x + y)

Num a => Monoid (Sum a) where
  neutral = MkSum 0

Num a => Semigroup (Product a) where
  MkProduct x <+> MkProduct y = MkProduct (x * y)

Num a => Monoid (Product a) where
  neutral = MkProduct 1

-- 10

sumList : Num a => List a -> a
sumList = value . foldMap MkSum

productList : Num a => List a -> a
productList = value . foldMap MkProduct

-- 12

data Element = H | C | N | O | F

record Mass where
  constructor MkMass
  value : Double

FromDouble Mass
  where fromDouble = MkMass

Eq Mass where
  (==) = (==) `on` value

Ord Mass where
  compare = compare `on` value

Show Mass where
  show = show . value

Semigroup Mass where
  x <+> y = MkMass $ x.value + y.value

Monoid Mass where
  neutral = 0.0

-- 13

atomicMass : Element -> Mass
atomicMass H = 1.008
atomicMass C = 12.011
atomicMass N = 14.007
atomicMass O = 15.999
atomicMass F = 18.9984

formulaMass : List (Element,Nat) -> Mass
formulaMass = foldMap pairMass
  where pairMass : (Element,Nat) -> Mass
        pairMass (e, n) = MkMass $ value (atomicMass e) * cast n

src/Solutions/Functions2.idr

module Solutions.Functions2

import Data.List

%default total

--------------------------------------------------------------------------------
--          Let Bindings and Where Blocks
--------------------------------------------------------------------------------

-- 1
record Artist where
  constructor MkArtist
  name : String

record Album where
  constructor MkAlbum
  name   : String
  artist : Artist

record Email where
  constructor MkEmail
  value : String

record Password where
  constructor MkPassword
  value : String

record User where
  constructor MkUser
  name     : String
  email    : Email
  password : Password
  albums   : List Album

Eq Artist where (==) = (==) `on` name

Eq Email where (==) = (==) `on` value

Eq Password where (==) = (==) `on` value

Eq Album where (==) = (==) `on` \a => (a.name, a.artist)

record Credentials where
  constructor MkCredentials
  email    : Email
  password : Password

record Request where
  constructor MkRequest
  credentials : Credentials
  album       : Album

data Response : Type where
  UnknownUser     : Email -> Response
  InvalidPassword : Response
  AccessDenied    : Email -> Album -> Response
  Success         : Album -> Response

DB : Type
DB = List User

handleRequest : DB -> Request -> Response
handleRequest xs (MkRequest (MkCredentials e pw) album) =
  case find ((e ==) . email) xs of
    Nothing => UnknownUser e
    Just (MkUser _ _ pw' albums)  =>
      if      pw' /= pw         then InvalidPassword
      else if elem album albums then Success album
      else                           AccessDenied e album

-- 2

namespace Ex2

  data Failure : Type where
    UnknownUser : Email -> Failure
    InvalidPassword : Failure
    AccessDenied : Email -> Album -> Failure

  handleRequest : DB -> Request -> Either Failure Album
  handleRequest db req = case find ((==) req.credentials.email . email) db of
    Nothing => Left (UnknownUser req.credentials.email)
    Just u2 => case (u2.email == req.credentials.email && u2.password == req.credentials.password) of
      False => Left InvalidPassword
      True => case elem req.album u2.albums of
        False => Left (AccessDenied req.credentials.email req.album)
        True => Right req.album

-- 3
data Nucleobase = Adenine | Cytosine | Guanine | Thymine

readBase : Char -> Maybe Nucleobase
readBase 'A' = Just Adenine
readBase 'C' = Just Cytosine
readBase 'G' = Just Guanine
readBase 'T' = Just Thymine
readBase c   = Nothing

-- 4
traverseList : (a -> Maybe b) -> List a -> Maybe (List b)
traverseList _ []        = Just []
traverseList f (x :: xs) =
  case f x of
    Just y  => case traverseList f xs of
      Just ys => Just (y :: ys)
      Nothing => Nothing
    Nothing => Nothing

-- 5
DNA : Type
DNA = List Nucleobase

readDNA : String -> Maybe DNA
readDNA = traverseList readBase . unpack

-- 6
complement : DNA -> DNA
complement = map comp
  where comp : Nucleobase -> Nucleobase
        comp Adenine  = Thymine
        comp Cytosine = Guanine
        comp Guanine  = Cytosine
        comp Thymine  = Adenine

src/Solutions/Dependent.idr

module Solutions.Dependent

%default total

--------------------------------------------------------------------------------
--          Length-Indexed Lists
--------------------------------------------------------------------------------

data Vect : (len : Nat) -> Type -> Type where
  Nil  : Vect 0 a
  (::) : a -> Vect n a -> Vect (S n) a

-- 1
len : List a -> Nat
len Nil = Z
len (_ :: xs) = S (len xs)

-- 2
head : Vect (S n) a -> a
head (x :: _) = x
head Nil impossible

-- 3
tail : Vect (S n) a -> Vect n a
tail (_ :: xs) = xs
tail Nil impossible

-- 4
zipWith3 : (a -> b -> c -> d) -> Vect n a -> Vect n b -> Vect n c -> Vect n d
zipWith3 f []        []        []        = []
zipWith3 f (x :: xs) (y :: ys) (z :: zs) = f x y z :: zipWith3 f xs ys zs

-- 5
-- Since we only have a `Semigroup` constraint, we can't conjure
-- a value of type `a` out of nothing in case of an empty list.
-- We therefore have to return a `Nothing` in case of an empty list.
foldSemi : Semigroup a => List a -> Maybe a
foldSemi []        = Nothing
foldSemi (x :: xs) = Just . maybe x (x <+>) $ foldSemi xs

-- 6
-- The `Nil` case is impossible here, so unlike in Exercise 5,
-- we don't need to wrap the result in a `Maybe`.
-- However, we need to pattern match on the tail of the Vect to
-- decide whether to invoke `foldSemiVect` recursively or not.
foldSemiVect : Semigroup a => Vect (S n) a -> a
foldSemiVect (x :: [])         = x
foldSemiVect (x :: t@(_ :: _)) = x <+> foldSemiVect t

-- 7
iterate : (n : Nat) -> (a -> a) -> a -> Vect n a
iterate 0     _ _ = Nil
iterate (S k) f v = v :: iterate k f (f v)

-- 8
generate : (n : Nat) -> (s -> (s,a)) -> s -> Vect n a
generate 0     _ _ = Nil
generate (S k) f v =
  let (v', va) = f v
   in va :: generate k f v'

-- 9
fromList : (as : List a) -> Vect (length as) a
fromList []        = []
fromList (x :: xs) = x :: fromList xs

-- 10
-- Look up the types and implementations of the functions `maybe` and
-- `const` and try to figure out what's going on here. An alternative
-- implementation would of course just pattern match on the argument.
maybeSize : Maybe a -> Nat
maybeSize = maybe 0 (const 1)

fromMaybe : (m : Maybe a) -> Vect (maybeSize m) a
fromMaybe Nothing  = []
fromMaybe (Just x) = [x]
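
-- A sketch of the alternative mentioned above: the same size function
-- written with explicit pattern matching. The name `maybeSize'` is
-- chosen here for illustration only and is not part of the exercise.
maybeSize' : Maybe a -> Nat
maybeSize' Nothing  = 0
maybeSize' (Just _) = 1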

--------------------------------------------------------------------------------
--          Fin: Safe Indexing into Vectors
--------------------------------------------------------------------------------

data Fin : (n : Nat) -> Type where
  FZ : {0 n : Nat} -> Fin (S n)
  FS : (k : Fin n) -> Fin (S n)

(++) : Vect m a -> Vect n a -> Vect (m + n) a
(++) []        ys = ys
(++) (x :: xs) ys = x :: (xs ++ ys)

replicate : (n : Nat) -> a -> Vect n a
replicate 0     _ = []
replicate (S k) x = x :: replicate k x

zipWith : (a -> b -> c) -> Vect n a -> Vect n b -> Vect n c
zipWith _ []        []        = []
zipWith f (x :: xs) (y :: ys) = f x y :: zipWith f xs ys



-- 1
update : (a -> a) -> Fin n -> Vect n a -> Vect n a
update f FZ     (x :: xs) = f x :: xs
update f (FS k) (x :: xs) = x :: update f k xs

-- 2
insert : a -> Fin (S n) -> Vect n a -> Vect (S n) a
insert v FZ     xs         = v :: xs
insert v (FS k) (x :: xs)  = x :: insert v k xs
insert v (FS k) []  impossible

-- 3
-- The trick here is to pattern match on the tail of the
-- vector in the `FS k` case and realize that an empty
-- tail is impossible. Otherwise we won't be able to
-- convince the type checker that the vector's tail is
-- non-empty in the recursive case.
delete : Fin (S n) -> Vect (S n) a -> Vect n a
delete FZ     (_ :: xs)          = xs
delete (FS k) (x :: xs@(_ :: _)) = x :: delete k xs
delete (FS k) (x :: []) impossible

-- 4
safeIndexList : (xs : List a) -> Fin (length xs) -> a
safeIndexList (x :: _)  FZ     = x
safeIndexList (x :: xs) (FS k) = safeIndexList xs k
safeIndexList Nil _ impossible

-- 5
finToNat : Fin n -> Nat
finToNat FZ     = Z
finToNat (FS k) = S $ finToNat k

take : (k : Fin (S n)) -> Vect n a -> Vect (finToNat k) a
take FZ     x         = []
take (FS k) (x :: xs) = x :: take k xs

-- 6
minus : (n : Nat) -> Fin (S n) -> Nat
minus n FZ         = n
minus (S j) (FS k) = minus j k
minus 0 (FS k) impossible

-- 7
drop : (k : Fin (S n)) -> Vect n a -> Vect (minus n k) a
drop FZ     xs        = xs
drop (FS k) (_ :: xs) = drop k xs

-- 8
splitAt :  (k : Fin (S n))
        -> Vect n a
        -> (Vect (finToNat k) a, Vect (minus n k) a)
splitAt k xs = (take k xs, drop k xs)

--------------------------------------------------------------------------------
--          Compile-Time Computations
--------------------------------------------------------------------------------

-- 1
flattenList : List (List a) -> List a
flattenList []          = []
flattenList (xs :: xss) = xs ++ flattenList xss

flattenVect : Vect m (Vect n a) -> Vect (m * n) a
flattenVect []          = []
flattenVect (xs :: xss) = xs ++ flattenVect xss

-- 2
take' : (m : Nat) -> Vect (m + n) a -> Vect m a
take' 0     _         = []
take' (S k) (x :: xs) = x :: take' k xs
take' (S k) Nil impossible

drop' : (m : Nat) -> Vect (m + n) a -> Vect n a
drop' 0     xs        = xs
drop' (S k) (x :: xs) = drop' k xs
drop' (S k) Nil impossible

splitAt' : (m : Nat) -> Vect (m + n) a -> (Vect m a, Vect n a)
splitAt' m xs = (take' m xs, drop' m xs)

-- 3
-- Since we must call `replicate` in the `Nil` case, `k`
-- must be a non-erased argument. I used an implicit argument here,
-- since this reflects the type of the mathematical function
-- more closely.
--
-- Empty matrices probably don't make too much sense,
-- so we could also request at the type-level that `k` and `m`
-- are non-zero, in which case both values could be derived
-- by pattern matching on the vectors.
transpose : {k : _} -> Vect m (Vect k a) -> Vect k (Vect m a)
transpose []          = replicate k []
transpose (xs :: xss) = zipWith (::) xs (transpose xss)
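
-- A possible sketch of the variant hinted at above, assuming at least
-- one row: the base case can then be built from the single remaining
-- row, so no implicit argument is needed at runtime. The names
-- `singletons` and `transposeNE` are hypothetical helpers for this
-- sketch; only the number of rows is required to be non-zero here.
singletons : Vect n a -> Vect n (Vect 1 a)
singletons []        = []
singletons (x :: xs) = [x] :: singletons xs

transposeNE : Vect (S m) (Vect k a) -> Vect k (Vect (S m) a)
transposeNE (xs :: [])         = singletons xs
transposeNE (xs :: t@(_ :: _)) = zipWith (::) xs (transposeNE t)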

src/Solutions/IO.idr

module Solutions.IO

import Data.List1
import Data.String

import System.File

%default total

--------------------------------------------------------------------------------
--          Pure Side Effects?
--------------------------------------------------------------------------------

-- 1
rep : (String -> String) -> IO ()
rep f = do
  s <- getLine
  putStrLn (f s)

-- 2
covering
repl : (String -> String) -> IO ()
repl f = do
  _ <- rep f
  repl f

-- 3
covering
replTill : (String -> Either String String) -> IO ()
replTill f = do
  s <- getLine
  case f s of
    Left  msg => putStrLn msg
    Right msg => do
      _ <- putStrLn msg
      replTill f

-- 4
data Error : Type where
  NotAnInteger    : (value : String) -> Error
  UnknownOperator : (value : String) -> Error
  ParseError      : (input : String) -> Error

dispError : Error -> String
dispError (NotAnInteger v)    = "Not an integer: " ++ v ++ "."
dispError (UnknownOperator v) = "Unknown operator: " ++ v ++ "."
dispError (ParseError v)      = "Invalid expression: " ++ v ++ "."

readInteger : String -> Either Error Integer
readInteger s = maybe (Left $ NotAnInteger s) Right $ parseInteger s

readOperator : String -> Either Error (Integer -> Integer -> Integer)
readOperator "+" = Right (+)
readOperator "*" = Right (*)
readOperator s   = Left (UnknownOperator s)

eval : String -> Either Error Integer
eval s =
  let [x,y,z]  := forget $ split isSpace s | _ => Left (ParseError s)
      Right v1 := readInteger x  | Left e => Left e
      Right op := readOperator y | Left e => Left e
      Right v2 := readInteger z  | Left e => Left e
   in Right $ op v1 v2

covering
exprProg : IO ()
exprProg = replTill prog
  where prog : String -> Either String String
        prog "done" = Left "Goodbye!"
        prog s      = Right . either dispError show $ eval s

-- 5
covering
replWith :  (state      : s)
         -> (next       : s -> String -> Either res s)
         -> (dispState  : s -> String)
         -> (dispResult : res -> s -> String)
         -> IO ()
replWith state next dispState dispResult = do
  _     <- putStrLn (dispState state)
  input <- getLine
  case next state input of
    Left  result => putStrLn (dispResult result state)
    Right state' => replWith state' next dispState dispResult

-- 6
data Abort : Type where
  NoNat : (input : String) -> Abort
  Done  : Abort

printSum : Nat -> String
printSum n =
  "Current sum: " ++ show n ++ "\nPlease enter a natural number:"

printRes : Abort -> Nat -> String
printRes (NoNat input) _ =
  "Not a natural number: " ++ input ++ ". Aborting..."
printRes Done k =
  "Final sum: " ++ show k ++ "\nHave a nice day."

readInput : Nat -> String -> Either Abort Nat
readInput _ "done" = Left Done
readInput n s      = case parseInteger {a = Integer} s of
  Nothing => Left $ NoNat s
  Just v  => if v >= 0 then Right (cast v + n) else Left (NoNat s)

covering
sumProg : IO ()
sumProg = replWith 0 readInput printSum printRes

--------------------------------------------------------------------------------
--          Do Blocks, Desugared
--------------------------------------------------------------------------------

-- 1
ex1a : IO String
ex1a = do
  s1 <- getLine
  s2 <- getLine
  s3 <- getLine
  pure $ s1 ++ reverse s2 ++ s3

ex1aBind : IO String
ex1aBind =
  getLine >>= (\s1 =>
    getLine >>= (\s2 =>
      getLine >>= (\s3 =>
        pure $ s1 ++ reverse s2 ++ s3
      )
    )
  )

ex1aBang : IO String
ex1aBang =
  pure $ !getLine ++ reverse !getLine ++ !getLine

ex1b : Maybe Integer
ex1b = do
  n1 <- parseInteger "12"
  n2 <- parseInteger "300"
  Just $ n1 + n2 * 100

ex1bBind : Maybe Integer
ex1bBind =
  parseInteger "12" >>= (\n1 =>
    parseInteger "300" >>= (\n2 =>
      Just $ n1 + n2 * 100
    )
  )

ex1bBang : Maybe Integer
ex1bBang =
  Just $ !(parseInteger "12") + !(parseInteger "300") * 100

-- 2
data List01 : (nonEmpty : Bool) -> Type -> Type where
  Nil  : List01 False a
  (::) : a -> List01 False a -> List01 ne a

head : List01 True a -> a
head (x :: _) = x

weaken : List01 ne a -> List01 False a
weaken []       = []
weaken (h :: t) = h :: t

map01 : (a -> b) -> List01 ne a -> List01 ne b
map01 _ []       = []
map01 f (x :: y) = f x :: map01 f y

tail : List01 True a -> List01 False a
tail (_ :: t) = weaken t

(++) : List01 ne1 a -> List01 ne2 a -> List01 (ne1 || ne2) a
(++) []       []       = []
(++) []       (h :: t) = h :: t
(++) (h :: t) xs       = h :: weaken (t ++ xs)

concat' : List01 ne1 (List01 ne2 a) -> List01 False a
concat' []       = []
concat' (x :: y) = weaken (x ++ concat' y)

concat :  {ne1, ne2 : _}
       -> List01 ne1 (List01 ne2 a)
       -> List01 (ne1 && ne2) a
concat {ne1 = True}  {ne2 = True}  (x :: y) = x ++ concat' y
concat {ne1 = True}  {ne2 = False} x        = concat' x
concat {ne1 = False} {ne2 = _}     x        = concat' x

namespace List01
  export
  (>>=) :  {ne1, ne2 : _}
        -> List01 ne1 a
        -> (a -> List01 ne2 b)
        -> List01 (ne1 && ne2) b
  as >>= f = concat (map01 f as)

--------------------------------------------------------------------------------
--          Working with Files
--------------------------------------------------------------------------------

-- 1
namespace IOErr
  export
  pure : a -> IO (Either e a)
  pure = pure . Right

  export
  fail : e -> IO (Either e a)
  fail = pure . Left

  export
  lift : IO a -> IO (Either e a)
  lift = map Right

  export
  catch : IO (Either e1 a) -> (e1 -> IO (Either e2 a)) -> IO (Either e2 a)
  catch io f = do
    Left err <- io | Right v => pure v
    f err

  export
  (>>=) : IO (Either e a) -> (a -> IO (Either e b)) -> IO (Either e b)
  io >>= f = Prelude.do
    Right v <- io | Left err => fail err
    f v

  export
  (>>) : IO (Either e ()) -> Lazy (IO (Either e a)) -> IO (Either e a)
  iou >> ioa = Prelude.do
    Right _ <- iou | Left err => fail err
    ioa

covering
countEmpty'' : (path : String) -> IO (Either FileError Nat)
countEmpty'' path = withFile path Read pure (go 0)
  where covering go : Nat -> File -> IO (Either FileError Nat)
        go k file = do
          False <- lift (fEOF file) | True => pure k
          "\n"  <- fGetLine file    | _  => go k file
          go (k + 1) file

-- 2
covering
countWords : (path : String) -> IO (Either FileError Nat)
countWords path = withFile path Read pure (go 0)
  where covering go : Nat -> File -> IO (Either FileError Nat)
        go k file = do
          False <- lift (fEOF file) | True => pure k
          s     <- fGetLine file
          go (k + length (words s)) file

-- 3
covering
withLines :  (path : String)
          -> (accum : s -> String -> s)
          -> (initialState : s)
          -> IO (Either FileError s)
withLines path accum ini = withFile path Read pure (go ini)
  where covering go : s -> File -> IO (Either FileError s)
        go st file = do
          False <- lift (fEOF file) | True => pure st
          line  <- fGetLine file
          go (accum st line) file

covering
countEmpty3 : (path : String) -> IO (Either FileError Nat)
countEmpty3 path = withLines path acc 0
  where acc : Nat -> String -> Nat
        acc k "\n" = k + 1
        acc k _    = k

covering
countWords2 : (path : String) -> IO (Either FileError Nat)
countWords2 path = withLines path (\n,s => n + length (words s)) 0

-- 4
covering
foldLines :  Monoid s
          => (path : String)
          -> (f    : String -> s)
          -> IO (Either FileError s)
foldLines path f = withLines path (\vs => (vs <+>) . f) neutral

-- 5

-- Instead of returning a triple of natural numbers,
-- it is better to make the semantics clear and use
-- a custom record type to store the result.
--
-- In a larger, more complex application it might be
-- even better to make things truly type safe and
-- define a single-field record together with an
-- implementation of `Monoid` for each kind of count.
record WC where
  constructor MkWC
  lines : Nat
  words : Nat
  chars : Nat

Semigroup WC where
  MkWC l1 w1 c1 <+> MkWC l2 w2 c2 = MkWC (l1 + l2) (w1 + w2) (c1 + c2)

Monoid WC where
  neutral = MkWC 0 0 0

covering
toWC : String -> WC
toWC s = MkWC 1 (length (words s)) (length s)

covering
wordCount : (path : String) -> IO (Either FileError WC)
wordCount path = foldLines path toWC

-- This is for testing the `wordCount` example.
covering
testWC : (path : String) -> IO ()
testWC path = Prelude.do
  Right (MkWC ls ws cs) <- wordCount path
    | Left err => putStrLn "Error: \{show err}"
  putStrLn "\{show ls} lines, \{show ws} words, \{show cs} characters"

src/Solutions/Functor.idr

module Solutions.Functor

import Data.IORef
import Data.List
import Data.List1
import Data.String
import Data.Vect

%default total

--------------------------------------------------------------------------------
--          Code Required from the Tutorial
--------------------------------------------------------------------------------

interface Functor' (0 f : Type -> Type) where
  map' : (a -> b) -> f a -> f b

interface Functor' f => Applicative' f where
  app   : f (a -> b) -> f a -> f b
  pure' : a -> f a

record Comp (f,g : Type -> Type) (a : Type) where
  constructor MkComp
  unComp  : f (g a)

implementation Functor f => Functor g => Functor (Comp f g) where
  map f (MkComp v) = MkComp $ map f <$> v

record Product (f,g : Type -> Type) (a : Type) where
  constructor MkProduct
  fst  : f a
  snd  : g a

implementation Functor f => Functor g => Functor (Product f g) where
  map f (MkProduct l r) = MkProduct (map f l) (map f r)

data Gender = Male | Female | Other

record Name where
  constructor MkName
  value : String

record Email where
  constructor MkEmail
  value : String

record Password where
  constructor MkPassword
  value : String

record User where
  constructor MkUser
  firstName : Name
  lastName  : Name
  age       : Maybe Nat
  email     : Email
  gender    : Gender
  password  : Password

interface CSVField a where
  read : String -> Maybe a

CSVField Gender where
  read "m" = Just Male
  read "f" = Just Female
  read "o" = Just Other
  read _   = Nothing

CSVField Bool where
  read "t" = Just True
  read "f" = Just False
  read _   = Nothing

CSVField Nat where
  read = parsePositive

CSVField Integer where
  read = parseInteger

CSVField Double where
  read = parseDouble

CSVField a => CSVField (Maybe a) where
  read "" = Just Nothing
  read s  = Just <$> read s

readIf : (String -> Bool) -> (String -> a) -> String -> Maybe a
readIf p mk s = if p s then Just (mk s) else Nothing

isValidName : String -> Bool
isValidName s =
  let len = length s
   in 0 < len && len <= 100 && all isAlpha (unpack s)

CSVField Name where
  read = readIf isValidName MkName

isEmailChar : Char -> Bool
isEmailChar '.' = True
isEmailChar '@' = True
isEmailChar c   = isAlphaNum c

isValidEmail : String -> Bool
isValidEmail s =
  let len = length s
   in 0 < len && len <= 100 && all isEmailChar (unpack s)

CSVField Email where
  read = readIf isValidEmail MkEmail

isPasswordChar : Char -> Bool
isPasswordChar ' ' = True
isPasswordChar c   = not (isControl c) && not (isSpace c)

isValidPassword : String -> Bool
isValidPassword s =
  let len = length s
   in 8 < len && len <= 100 && all isPasswordChar (unpack s)

CSVField Password where
  read = readIf isValidPassword MkPassword

data HList : (ts : List Type) -> Type where
  Nil  : HList Nil
  (::) : (v : t) -> (vs : HList ts) -> HList (t :: ts)

--------------------------------------------------------------------------------
--          Functor
--------------------------------------------------------------------------------

-- 1

Functor' Maybe where
  map' _ Nothing  = Nothing
  map' f (Just v) = Just $ f v

Functor' List where
  map' _ []        = []
  map' f (x :: xs) = f x :: map' f xs

Functor' List1 where
  map' f (h ::: t) = f h ::: map' f t

Functor' (Vect n) where
  map' _ []        = []
  map' f (x :: xs) = f x :: map' f xs

Functor' (Either e) where
  map' _ (Left ve)  = Left ve
  map' f (Right va) = Right $ f va

Functor' (Pair e) where
  map' f (ve,va) = (ve, f va)

-- 2

[Prod] Functor f => Functor g => Functor (\a => (f a, g a)) where
  map fun (fa, ga) = (map fun fa, map fun ga)

-- 3

record Identity a where
  constructor Id
  value : a

Functor Identity where
  map f (Id va) = Id $ f va

-- 4

record Const (e,a : Type) where
  constructor MkConst
  value : e

Functor (Const e) where
  map _ (MkConst v) = MkConst v

-- 5

data Crud : (i : Type) -> (a : Type) -> Type where
  Create : (value : a) -> Crud i a
  Update : (id : i) -> (value : a) -> Crud i a
  Read   : (id : i) -> Crud i a
  Delete : (id : i) -> Crud i a

Functor (Crud i) where
  map f (Create value)    = Create $ f value
  map f (Update id value) = Update id $ f value
  map _ (Read id)         = Read id
  map _ (Delete id)       = Delete id

-- 6

data Response : (e, i, a : Type) -> Type where
  Created : (id : i) -> (value : a) -> Response e i a
  Updated : (id : i) -> (value : a) -> Response e i a
  Found   : (values : List a) -> Response e i a
  Deleted : (id : i) -> Response e i a
  Error   : (err : e) -> Response e i a

Functor (Response e i) where
  map f (Created id value) = Created id $ f value
  map f (Updated id value) = Updated id $ f value
  map f (Found values)     = Found $ map f values
  map _ (Deleted id)       = Deleted id
  map _ (Error err)        = Error err


-- 7

data Validated : (e,a : Type) -> Type where
  Invalid : (err : e) -> Validated e a
  Valid   : (val : a) -> Validated e a

Functor (Validated e) where
  map _ (Invalid err) = Invalid err
  map f (Valid val)   = Valid $ f val

--------------------------------------------------------------------------------
--          Applicative
--------------------------------------------------------------------------------

-- 1

Applicative' (Either e) where
  pure' = Right
  app (Right f) (Right v) = Right $ f v
  app (Left ve) _         = Left ve
  app _         (Left ve) = Left ve

Applicative Identity where
  pure = Id
  Id f <*> Id v = Id $ f v

-- 2

{n : _} -> Applicative' (Vect n) where
  pure' = replicate n
  app []        []        = []
  app (f :: fs) (v :: vs) = f v :: app fs vs

-- 3

Monoid e => Applicative' (Pair e) where
  pure' v = (neutral, v)
  app (e1,f) (e2,v) = (e1 <+> e2, f v)

-- 4

Monoid e => Applicative (Const e) where
  pure _ = MkConst neutral
  MkConst e1 <*> MkConst e2 = MkConst $ e1 <+> e2

-- 5

Semigroup e => Applicative (Validated e) where
  pure = Valid
  Valid   f  <*> Valid v    = Valid $ f v
  Valid   _  <*> Invalid ve = Invalid ve
  Invalid e1 <*> Invalid e2 = Invalid $ e1 <+> e2
  Invalid ve <*> Valid _    = Invalid ve

-- 6

data CSVError : Type where
  FieldError           : (line, column : Nat) -> (str : String) -> CSVError
  UnexpectedEndOfInput : (line, column : Nat) -> CSVError
  ExpectedEndOfInput   : (line, column : Nat) -> CSVError
  App                  : (fst, snd : CSVError) -> CSVError

Semigroup CSVError where
  (<+>) = App

-- 7

readField : CSVField a => (line, column : Nat) -> String -> Validated CSVError a
readField line col str =
  maybe (Invalid $ FieldError line col str) Valid (read str)

toVect : (n : Nat) -> (line, col : Nat) -> List a -> Validated CSVError (Vect n a)
toVect 0     line _   []        = Valid []
toVect 0     line col _         = Invalid (ExpectedEndOfInput line col)
toVect (S k) line col []        = Invalid (UnexpectedEndOfInput line col)
toVect (S k) line col (x :: xs) = (x ::) <$> toVect k line (S col) xs

-- We can't use do notation here as we don't have an implementation
-- of Monad for `Validated`
readUser' : (line : Nat) -> List String -> Validated CSVError User
readUser' line ss = case toVect 6 line 0 ss of
  Valid [fn,ln,a,em,g,pw] =>
    [| MkUser (readField line 1 fn)
              (readField line 2 ln)
              (readField line 3 a)
              (readField line 4 em)
              (readField line 5 g)
              (readField line 6 pw) |]
  Invalid err => Invalid err

readUser : (line : Nat) -> String -> Validated CSVError User
readUser line = readUser' line . forget . split (',' ==)

interface CSVLine a where
  decodeAt : (line, col : Nat) -> List String -> Validated CSVError a

CSVLine (HList []) where
  decodeAt _ _ [] = Valid Nil
  decodeAt l c _  = Invalid (ExpectedEndOfInput l c)

CSVField t => CSVLine (HList ts) => CSVLine (HList (t :: ts)) where
  decodeAt l c []        = Invalid (UnexpectedEndOfInput l c)
  decodeAt l c (s :: ss) = [| readField l c s :: decodeAt l (S c) ss |]

decode : CSVLine a => (line : Nat) -> String -> Validated CSVError a
decode line = decodeAt line 1 . forget . split (',' ==)

hdecode :  (0 ts : List Type)
        -> CSVLine (HList ts)
        => (line : Nat)
        -> String
        -> Validated CSVError (HList ts)
hdecode _ = decode

-- 8

-- 8.1
head : HList (t :: ts) -> t
head (v :: _) = v

-- 8.2
tail : HList (t :: ts) -> HList ts
tail (_ :: t) = t

-- 8.3
(++) : HList xs -> HList ys -> HList (xs ++ ys)
[]        ++ ws = ws
(v :: vs) ++ ws = v :: (vs ++ ws)

-- 8.4
indexList : (as : List a) -> Fin (length as) -> a
indexList (x :: _)   FZ    = x
indexList (_ :: xs) (FS y) = indexList xs y
indexList []        x impossible

index : (ix : Fin (length ts)) -> HList ts -> indexList ts ix
index FZ     (v :: _)  = v
index (FS x) (_ :: vs) = index x vs
index ix [] impossible

-- 8.5
namespace HVect
  public export
  data HVect : (ts : Vect n Type) -> Type where
    Nil  : HVect Nil
    (::) : (v : t) -> (vs : HVect ts) -> HVect (t :: ts)

  public export
  head : HVect (t :: ts) -> t
  head (v :: _) = v

  public export
  tail : HVect (t :: ts) -> HVect ts
  tail (_ :: t) = t

  public export
  (++) : HVect xs -> HVect ys -> HVect (xs ++ ys)
  []        ++ ws = ws
  (v :: vs) ++ ws = v :: (vs ++ ws)

  public export
  index :  {0 n : Nat}
        -> {0 ts : Vect n Type}
        -> (ix : Fin n)
        -> HVect ts -> index ix ts
  index FZ     (v :: _)  = v
  index (FS x) (_ :: vs) = index x vs
  index ix [] impossible

-- 8.6

-- Note: We are usually not allowed to pattern match
-- on an erased argument. However, in this case, the
-- shape of `ts` follows from `n`, so we can pattern
-- match on `ts` to help Idris infer the types.
--
-- Note also that we create an `HVect` holding only empty
-- `Vect`s. We therefore only need to know about the length
-- of the type-level vector to implement this.
empties :  {n : Nat} -> {0 ts : Vect n Type} -> HVect (Vect 0 <$> ts)
empties {n = 0}   {ts = []}     = []
empties {n = S _} {ts = _ :: _} = [] :: empties

hcons :  {0 ts : Vect n Type}
      -> HVect ts
      -> HVect (Vect m <$> ts)
      -> HVect (Vect (S m) <$> ts)
hcons []        []        = []
hcons (v :: vs) (w :: ws) = (v :: w) :: hcons vs ws

htranspose :  {n : Nat}
           -> {0 ts : Vect n Type}
           -> Vect m (HVect ts)
           -> HVect (Vect m <$> ts)
htranspose []        = empties
htranspose (x :: xs) = hcons x (htranspose xs)

vects : Vect 3 (HVect [Bool, Nat, String])
vects = [[True, 100, "Hello"], [False, 0, "Idris"], [False, 2, "!"]]

vects' : HVect [Vect 3 Bool, Vect 3 Nat, Vect 3 String]
vects' = htranspose vects

-- 9
Applicative f => Applicative g => Applicative (Comp f g) where
  pure = MkComp . pure . pure
  MkComp ff <*> MkComp fa = MkComp [| ff <*> fa |]

-- 10
Applicative f => Applicative g => Applicative (Product f g) where
  pure v = MkProduct (pure v) (pure v)
  MkProduct ffl ffr  <*> MkProduct fal far =
    MkProduct (ffl <*> fal) (ffr <*> far)

--------------------------------------------------------------------------------
--          Monad
--------------------------------------------------------------------------------

-- 1
mapWithApp : Applicative f => (a -> b) -> f a -> f b
mapWithApp fun fa = pure fun <*> fa

-- 2
appWithBind : Monad f => f (a -> b) -> f a -> f b
appWithBind ff fa = ff >>= (\fun => fa >>= (\va => pure $ fun va))

-- or, more readable, the same thing with do notation
appWithBindDo : Monad f => f (a -> b) -> f a -> f b
appWithBindDo ff fa = do
  fun <- ff
  va  <- fa
  pure $ fun va

-- 3
bindFromJoin : Monad m => m a -> (a -> m b) -> m b
bindFromJoin ma f = join $ map f ma

-- 4
joinFromBind : Monad m => m (m a) -> m a
joinFromBind = (>>= id)

-- 5
-- The third law
-- `mf <*> ma = mf >>= (\fun => map (fun $) ma)`
-- does not hold, as the implementation of *apply* on the
-- right-hand side does not perform error accumulation.
--
-- `Validated e` therefore comes without an implementation of
-- `Monad`. In order to use it in do blocks, it's best to
-- convert it to `Either` and back.
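
-- A minimal sketch of such a round trip. The names
-- `validatedToEither`, `eitherToValidated` and `bindValidated` are
-- assumptions made for this example, not part of the exercise.
validatedToEither : Validated e a -> Either e a
validatedToEither (Invalid err) = Left err
validatedToEither (Valid val)   = Right val

eitherToValidated : Either e a -> Validated e a
eitherToValidated (Left err)  = Invalid err
eitherToValidated (Right val) = Valid val

-- Sequencing in a do block over `Either`. Note that this stops at
-- the first error instead of accumulating errors.
bindValidated : Validated e a -> (a -> Validated e b) -> Validated e b
bindValidated v f = eitherToValidated $ do
  va <- validatedToEither v
  validatedToEither (f va)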

-- 6

DB : Type
DB = IORef (List (Nat,User))

data DBError : Type where
  UserExists        : Email -> Nat -> DBError
  UserNotFound      : Nat -> DBError
  SizeLimitExceeded : DBError

record Prog a where
  constructor MkProg
  runProg : DB -> IO (Either DBError a)

-- 6.1

-- make sure you are able to read and understand the
-- point-free style in the implementation of `map`!
Functor Prog where
  map f (MkProg run) = MkProg $ map (map f) . run

Applicative Prog where
  pure v = MkProg $ \_ => pure (Right v)
  MkProg rf <*> MkProg ra = MkProg $ \db => do
    Right fun <- rf db | Left err => pure (Left err)
    Right va  <- ra db | Left err => pure (Left err)
    pure (Right $ fun va)

Monad Prog where
  MkProg ra >>= f = MkProg $ \db => do
    Right va <- ra db | Left err => pure (Left err)
    runProg (f va) db

-- 6.2

HasIO Prog where
  liftIO act = MkProg $ \_ => map Right act

-- 6.3
throw : DBError -> Prog a
throw err = MkProg $ \_ => pure (Left err)

getUsers : Prog (List (Nat,User))
getUsers = MkProg (map Right . readIORef)

putUsers : List (Nat,User) -> Prog ()
putUsers us =
  if length us > 1000 then throw SizeLimitExceeded
  else MkProg $ \db => Right <$> writeIORef db us

modifyDB : (List (Nat,User) -> List (Nat,User)) -> Prog ()
modifyDB f = getUsers >>= putUsers . f

-- 6.4
lookupUser : (id : Nat) -> Prog User
lookupUser id = do
  db <- getUsers
  case lookup id db of
    Just u  => pure u
    Nothing => throw (UserNotFound id)

-- 6.5
deleteUser : (id : Nat) -> Prog ()
deleteUser id =
  -- In the first step, we are only interested in the potential
  -- of failure, not the actual user value.
  -- We can therefore use `(>>)` to chain the operations.
  -- In order to do so, we must wrap `lookupUser` in a call
  -- to `ignore`.
  ignore (lookupUser id) >> modifyDB (filter $ (id /=) . fst)

-- 6.6
Eq Email where (==) = (==) `on` value

newId : List (Nat,User) -> Nat
newId = S . foldl (\n1,(n2,_) => max n1 n2) 0

addUser : (u : User) -> Prog Nat
addUser u = do
  us <- getUsers
  case find ((u.email ==) . email . snd) us of
    Just (id,_) => throw $ UserExists u.email id
    Nothing     => let id = newId us in putUsers ((id, u) :: us) $> id

-- 6.7

update : Eq a => a -> b -> List (a,b) -> List (a,b)
update va vb = map (\p@(va',vb') => if va == va' then (va,vb) else p)

updateUser : (id : Nat) -> (mod : User -> User) -> Prog User
updateUser id mod = do
  u  <- mod <$> lookupUser id
  us <- getUsers
  case find ((u.email ==) . email . snd) us of
    Just (id',_) => if id /= id'
                      then throw $ UserExists u.email id'
                      else putUsers (update id u us) $> u
    Nothing      => putUsers (update id u us) $> u

-- 6.8

record Prog' env err a where
  constructor MkProg'
  runProg' : env -> IO (Either err a)

Functor (Prog' env err) where
  map f (MkProg' run) = MkProg' $ map (map f) . run

Applicative (Prog' env err) where
  pure v = MkProg' $ \_ => pure (Right v)
  MkProg' rf <*> MkProg' ra = MkProg' $ \db => do
    Right fun <- rf db | Left err => pure (Left err)
    Right va  <- ra db | Left err => pure (Left err)
    pure (Right $ fun va)

Monad (Prog' env err) where
  MkProg' ra >>= f = MkProg' $ \db => do
    Right va <- ra db | Left err => pure (Left err)
    runProg' (f va) db

HasIO (Prog' env err) where
  liftIO act = MkProg' $ \_ => map Right act

throw' : err -> Prog' env err a
throw' ve = MkProg' $ \_ => pure (Left ve)
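
-- A sketch of how `Prog'` might be specialized to recover the original
-- `Prog` (the names `ProgDB` and `getUsersDB` are hypothetical and only
-- used for illustration; they are not part of the exercise).
ProgDB : Type -> Type
ProgDB = Prog' DB DBError

-- Same implementation as `getUsers`, but wrapped in `MkProg'`.
getUsersDB : ProgDB (List (Nat,User))
getUsersDB = MkProg' (map Right . readIORef)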

src/Solutions/Folds.idr

module Solutions.Folds

import Data.Maybe
import Data.SnocList
import Data.Vect

%default total

--------------------------------------------------------------------------------
--          Recursion
--------------------------------------------------------------------------------

-- 1

anyList : (a -> Bool) -> List a -> Bool
anyList p []        = False
anyList p (x :: xs) = case p x of
  False => anyList p xs
  True  => True

anyList' : (a -> Bool) -> List a -> Bool
anyList' p Nil       = False
anyList' p (x :: xs) = p x || anyList' p xs

allList : (a -> Bool) -> List a -> Bool
allList p []        = True
allList p (x :: xs) = case p x of
  True  => allList p xs
  False => False

allList' : (a -> Bool) -> List a -> Bool
allList' p Nil       = True
allList' p (x :: xs) = p x && allList' p xs

-- 2

findList : (a -> Bool) -> List a -> Maybe a
findList f []        = Nothing
findList f (x :: xs) = if f x then Just x else findList f xs

-- 3

collectList : (a -> Maybe b) -> List a -> Maybe b
collectList f []        = Nothing
collectList f (x :: xs) = case f x of
  Just vb => Just vb
  Nothing => collectList f xs

-- Note utility function `Data.Maybe.toMaybe` in the implementation
lookupList : Eq a => a -> List (a,b) -> Maybe b
lookupList va = collectList (\(k,v) => toMaybe (k == va) v)
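
-- A usage sketch (`lookupExample` is a hypothetical name): the search
-- stops at the first matching key, so later duplicates are ignored.
lookupExample : Maybe String
lookupExample = lookupList (the Nat 2) [(1,"one"),(2,"two"),(2,"zwei")]
-- evaluates to Just "two"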

-- 4

mapTR' : (a -> b) -> List a -> List b
mapTR' f = go Lin
  where go : SnocList b -> List a -> List b
        go sx []        = sx <>> Nil
        go sx (x :: xs) = go (sx :< f x) xs

-- 5

filterTR' : (a -> Bool) -> List a -> List a
filterTR' f = go Lin
  where go : SnocList a -> List a -> List a
        go sx []        = sx <>> Nil
        go sx (x :: xs) = if f x then go (sx :< x) xs else go sx xs

-- 6

mapMayTR : (a -> Maybe b) -> List a -> List b
mapMayTR f = go Lin
  where go : SnocList b -> List a -> List b
        go sx []        = sx <>> Nil
        go sx (x :: xs) = case f x of
          Just vb => go (sx :< vb) xs
          Nothing => go sx xs

catMaybesTR : List (Maybe a) -> List a
catMaybesTR = mapMayTR id

-- 7

concatTR : List a -> List a -> List a
concatTR xs ys = (Lin <>< xs) <>> ys

-- 8

bindTR : List a -> (a -> List b) -> List b
bindTR xs f = go Lin xs
  where go : SnocList b -> List a -> List b
        go sx []        = sx <>> Nil
        go sx (x :: xs) = go (sx <>< f x) xs

joinTR : List (List a) -> List a
joinTR = go Lin
  where go : SnocList a -> List (List a) -> List a
        go sx []        = sx <>> Nil
        go sx (x :: xs) = go (sx <>< x) xs

-- Alternatively, we can use the connection between join and bind
-- (`join = (>>= id)`). This yields a tail recursive implementation,
-- because `bindTR` is tail recursive.
joinTR' : List (List a) -> List a
joinTR' xss = bindTR xss id
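
-- A usage sketch (`joinExample` is a hypothetical name): the inner lists
-- are concatenated in order, and empty lists contribute nothing.
joinExample : List Nat
joinExample = joinTR' [[1,2],[],[3]]
-- evaluates to [1,2,3]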


--------------------------------------------------------------------------------
--          A few Notes on Totality Checking
--------------------------------------------------------------------------------

record Tree a where
  constructor Node
  value  : a
  forest : List (Tree a)

Forest : Type -> Type
Forest = List . Tree

example : Tree Bits8
example = Node 0 [Node 1 [], Node 2 [Node 3 [], Node 4 [Node 5 []]]]

mutual
  treeSize : Tree a -> Nat
  treeSize (Node _ forest) = S $ forestSize forest

  forestSize : Forest a -> Nat
  forestSize []        = 0
  forestSize (x :: xs) = treeSize x + forestSize xs

-- 1

mutual
  treeDepth : Tree a -> Nat
  treeDepth (Node _ forest) = S $ forestDepth forest

  forestDepth : Forest a -> Nat
  forestDepth []        = 0
  forestDepth (x :: xs) = max (treeDepth x) (forestDepth xs)
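
-- A usage sketch (`exampleDepth` is a hypothetical name): the longest
-- path in `example` runs through the nodes 0, 2, 4, and 5.
exampleDepth : Nat
exampleDepth = treeDepth example
-- evaluates to 4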

-- 2

-- It's often easier to write complex interface implementations
-- via a utility function.
--
-- Of course, we could also use a `mutual` block as with
-- `treeSize` and `forestSize` here.
treeEq : Eq a => Tree a -> Tree a -> Bool
treeEq (Node v1 f1) (Node v2 f2) = v1 == v2 && go f1 f2
  where go : Forest a -> Forest a -> Bool
        go []        []        = True
        go (x :: xs) (y :: ys) = treeEq x y && go xs ys
        go _         _         = False

Eq a => Eq (Tree a) where (==) = treeEq

-- 3

treeMap : (a -> b) -> Tree a -> Tree b
treeMap f (Node value forest) = Node (f value) (go forest)
  where go : Forest a -> Forest b
        go []        = []
        go (x :: xs) = treeMap f x :: go xs

Functor Tree where map = treeMap

-- 4

treeShow : Show a => Prec -> Tree a -> String
treeShow p (Node value forest) =
  showCon p "Node" $ showArg value ++ case forest of
    []      => " []"
    x :: xs => " [" ++ treeShow Open x ++ go xs ++ "]"

  where go : Forest a -> String
        go []        = ""
        go (y :: ys) = ", " ++ treeShow Open y ++ go ys

Show a => Show (Tree a) where showPrec = treeShow

-- 5

mutual
  treeToVect : (tr : Tree a) -> Vect (treeSize tr) a
  treeToVect (Node value forest) = value :: forestToVect forest

  forestToVect : (f : Forest a) -> Vect (forestSize f) a
  forestToVect []        = []
  forestToVect (x :: xs) = treeToVect x ++ forestToVect xs
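
-- A usage sketch (`exampleVect` is a hypothetical name): since
-- `treeSize example` should reduce to 6 during type checking, the
-- result can be given a concrete length in its type.
exampleVect : Vect 6 Bits8
exampleVect = treeToVect example
-- evaluates to [0,1,2,3,4,5]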

--------------------------------------------------------------------------------
--          Interface Foldable
--------------------------------------------------------------------------------

-- 1

data Crud : (i : Type) -> (a : Type) -> Type where
  Create : (value : a) -> Crud i a
  Update : (id : i) -> (value : a) -> Crud i a
  Read   : (id : i) -> Crud i a
  Delete : (id : i) -> Crud i a

Foldable (Crud i) where
  foldr acc st (Create value)   = acc value st
  foldr acc st (Update _ value) = acc value st
  foldr _   st (Read _)         = st
  foldr _   st (Delete _)       = st

  foldl acc st (Create value)   = acc st value
  foldl acc st (Update _ value) = acc st value
  foldl _   st (Read _)         = st
  foldl _   st (Delete _)       = st

  null (Create _)   = False
  null (Update _ _) = False
  null (Read _)     = True
  null (Delete _)   = True

  foldMap f (Create value)   = f value
  foldMap f (Update _ value) = f value
  foldMap _ (Read _)         = neutral
  foldMap _ (Delete _)       = neutral

  foldlM acc st (Create value)   = acc st value
  foldlM acc st (Update _ value) = acc st value
  foldlM _   st (Read _)         = pure st
  foldlM _   st (Delete _)       = pure st

  toList (Create v)   = [v]
  toList (Update _ v) = [v]
  toList (Read _)     = []
  toList (Delete _)   = []

-- 2

data Response : (e, i, a : Type) -> Type where
  Created : (id : i) -> (value : a) -> Response e i a
  Updated : (id : i) -> (value : a) -> Response e i a
  Found   : (values : List a) -> Response e i a
  Deleted : (id : i) -> Response e i a
  Error   : (err : e) -> Response e i a

Foldable (Response e i) where
  foldr acc st (Created _ value) = acc value st
  foldr acc st (Updated _ value) = acc value st
  foldr acc st (Found values)    = foldr acc st values
  foldr _   st (Deleted _)       = st
  foldr _   st (Error _)         = st

  foldl acc st (Created _ value) = acc st value
  foldl acc st (Updated _ value) = acc st value
  foldl acc st (Found values)    = foldl acc st values
  foldl _   st (Deleted _)       = st
  foldl _   st (Error _)         = st

  null (Created _ _)     = False
  null (Updated _ _)     = False
  null (Found values)    = null values
  null (Deleted _)       = True
  null (Error _)         = True

  foldMap f (Created _ value) = f value
  foldMap f (Updated _ value) = f value
  foldMap f (Found values)    = foldMap f values
  foldMap f (Deleted _)       = neutral
  foldMap f (Error _)         = neutral

  toList (Created _ value) = [value]
  toList (Updated _ value) = [value]
  toList (Found values)    = values
  toList (Deleted _)       = []
  toList (Error _)         = []

  foldlM acc st (Created _ value) = acc st value
  foldlM acc st (Updated _ value) = acc st value
  foldlM acc st (Found values)    = foldlM acc st values
  foldlM _   st (Deleted _)       = pure st
  foldlM _   st (Error _)         = pure st

-- 3

data List01 : (nonEmpty : Bool) -> Type -> Type where
  Nil  : List01 False a
  (::) : a -> List01 False a -> List01 ne a

list01ToList : List01 ne a -> List a
list01ToList = go Lin
  where go : SnocList a -> List01 ne' a -> List a
        go sx []        = sx <>> Nil
        go sx (x :: xs) = go (sx :< x) xs

list01FoldMap : Monoid m => (a -> m) -> List01 ne a -> m
list01FoldMap f = go neutral
  where go : m -> List01 ne' a -> m
        go vm []        = vm
        go vm (x :: xs) = go (vm <+> f x) xs


Foldable (List01 ne) where
  foldr acc st []        = st
  foldr acc st (x :: xs) = acc x (foldr acc st xs)

  foldl acc st []        = st
  foldl acc st (x :: xs) = foldl acc (acc st x) xs

  null []       = True
  null (_ :: _) = False

  toList = list01ToList

  foldMap = list01FoldMap

  foldlM _ st []        = pure st
  foldlM f st (x :: xs) = f st x >>= \st' => foldlM f st' xs
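
-- A usage sketch (`nonEmptyExample` and `nonEmptySum` are hypothetical
-- names): as `Nil` has type `List01 False a`, a value of type
-- `List01 True a` is guaranteed to hold at least one element.
nonEmptyExample : List01 True Nat
nonEmptyExample = [1,2,3]

-- Any function working with `Foldable` can now consume a `List01`.
nonEmptySum : Nat
nonEmptySum = sum nonEmptyExample
-- evaluates to 6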

-- 4

mutual
  foldrTree : (el -> st -> st) -> st -> Tree el -> st
  foldrTree f v (Node value forest) = f value (foldrForest f v forest)

  foldrForest : (el -> st -> st) -> st -> Forest el -> st
  foldrForest _ v []        = v
  foldrForest f v (x :: xs) = foldrTree f (foldrForest f v xs) x

mutual
  foldlTree : (st -> el -> st) -> st -> Tree el -> st
  foldlTree f v (Node value forest) = foldlForest f (f v value) forest

  foldlForest : (st -> el -> st) -> st -> Forest el -> st
  foldlForest _ v []        = v
  foldlForest f v (x :: xs) = foldlForest f (foldlTree f v x) xs

mutual
  foldMapTree : Monoid m => (el -> m) -> Tree el -> m
  foldMapTree f (Node value forest) = f value <+> foldMapForest f forest

  foldMapForest : Monoid m => (el -> m) -> Forest el -> m
  foldMapForest _ []        = neutral
  foldMapForest f (x :: xs) = foldMapTree f x <+> foldMapForest f xs

mutual
  toListTree : Tree el -> List el
  toListTree (Node value forest) = value :: toListForest forest

  toListForest : Forest el -> List el
  toListForest []        = []
  toListForest (x :: xs) = toListTree x ++ toListForest xs

mutual
  foldlMTree : Monad m => (st -> el -> m st) -> st -> Tree el -> m st
  foldlMTree f v (Node value forest) =
    f v value >>= \v' => foldlMForest f v' forest

  foldlMForest : Monad m => (st -> el -> m st) -> st -> Forest el -> m st
  foldlMForest _ v []        = pure v
  foldlMForest f v (x :: xs) =
    foldlMTree f v x >>= \v' => foldlMForest f v' xs

Foldable Tree where
  foldr   = foldrTree
  foldl   = foldlTree
  foldMap = foldMapTree
  foldlM  = foldlMTree
  null _  = False
  toList  = toListTree
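
-- A usage sketch (`exampleValues` is a hypothetical name): the
-- implementation above visits a node's value before its subtrees.
exampleValues : List Bits8
exampleValues = toList example
-- evaluates to [0,1,2,3,4,5]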

-- 5

record Comp (f,g : Type -> Type) (a : Type) where
  constructor MkComp
  unComp  : f (g a)

Foldable f => Foldable g => Foldable (Comp f g) where
  foldr f st (MkComp v)  = foldr (flip $ foldr f) st v
  foldl f st (MkComp v)  = foldl (foldl f) st v
  foldMap f (MkComp v)   = foldMap (foldMap f) v
  foldlM f st (MkComp v) = foldlM (foldlM f) st v
  toList (MkComp v)      = foldMap toList v
  null (MkComp v)        = all null v

record Product (f,g : Type -> Type) (a : Type) where
  constructor MkProduct
  fst : f a
  snd : g a

Foldable f => Foldable g => Foldable (Product f g) where
  foldr f st (MkProduct v w)  = foldr f (foldr f st w) v
  foldl f st (MkProduct v w)  = foldl f (foldl f st v) w
  foldMap f (MkProduct v w)   = foldMap f v <+> foldMap f w
  toList  (MkProduct v w)     = toList v ++ toList w
  null (MkProduct v w)        = null v && null w
  foldlM f st (MkProduct v w) = foldlM f st v >>= \st' => foldlM f st' w
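
-- A usage sketch (`productValues` is a hypothetical name): the values of
-- the first component are visited before those of the second.
productValues : List Nat
productValues =
  toList (the (Product List Maybe Nat) (MkProduct [1,2] (Just 3)))
-- evaluates to [1,2,3]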

--------------------------------------------------------------------------------
--          Tests
--------------------------------------------------------------------------------

iterateTR : Nat -> (a -> a) -> a -> List a
iterateTR k f = go k Lin
  where go : Nat -> SnocList a -> a -> List a
        go 0     sx _ = sx <>> Nil
        go (S k) sx x = go k (sx :< x) (f x)
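
-- A usage sketch (`doublings` is a hypothetical name): the seed itself is
-- the first element, and `f` is applied once more for each later one.
doublings : List Nat
doublings = iterateTR 3 (*2) 1
-- evaluates to [1,2,4]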


values : List Integer
values = iterateTR 100000 (+1) 0

main : IO ()
main = do
  printLn . length $ mapTR' (*2)  values
  printLn . length $ filterTR' (\n => n `mod` 2 == 0)  values
  printLn . length $ mapMayTR (\n => toMaybe (n `mod` 2 == 1) "foo")  values
  printLn . length $ concatTR values values
  printLn . length $ bindTR [1..500] (\n => iterateTR n (+1) n)

src/Solutions/Traverse.idr

module Solutions.Traverse

import Control.Applicative.Const
import Control.Monad.Identity
import Data.HList
import Data.List1
import Data.Singleton
import Data.String
import Data.Validated
import Data.Vect
import Text.CSV

%default total

record State state a where
  constructor ST
  runST : state -> (state,a)

get : State state state
get = ST $ \s => (s,s)

put : state -> State state ()
put v = ST $ \_ => (v,())

modify : (state -> state) -> State state ()
modify f = ST $ \v => (f v,())

runState : state -> State state a -> (state, a)
runState = flip runST

evalState : state -> State state a -> a
evalState s = snd . runState s

execState : state -> State state a -> state
execState s = fst . runState s

Functor (State state) where
  map f (ST run) = ST $ \s => let (s2,va) = run s in (s2, f va)

Applicative (State state) where
  pure v = ST $ \s => (s,v)
  ST fun <*> ST val = ST $ \s =>
    let (s2, f)  = fun s
        (s3, va) = val s2
     in (s3, f va)

Monad (State state) where
  ST val >>= f = ST $ \s =>
    let (s2, va) = val s
     in runST (f va) s2
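
-- A usage sketch (`tick` and `twoTicks` are hypothetical names): each
-- `tick` returns the current counter and increases it by one, so the
-- state is threaded from left to right through the applicative effects.
tick : State Nat Nat
tick = get <* modify S

twoTicks : (Nat, (Nat, Nat))
twoTicks = runState 10 [| MkPair tick tick |]
-- evaluates to (12, (10, 11))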

--------------------------------------------------------------------------------
--          Reading CSV Tables
--------------------------------------------------------------------------------

-- 1

mapFromTraverse : Traversable t => (a -> b) -> t a -> t b
mapFromTraverse f = runIdentity . traverse (Id . f)

-- 2

-- Since Idris can't infer the type `b` in the call to `MkConst`, we have
-- to pass it explicitly (we are free to choose any type here).
foldMapFromTraverse : Traversable t => Monoid m => (a -> m) -> t a -> m
foldMapFromTraverse f = runConst . traverse (MkConst {b = ()} . f)

-- 3

interface Functor t => Foldable t => Traversable' t where
  traverse' : Applicative f => (a -> f b) -> t a -> f (t b)

Traversable' List where
  traverse' f Nil       = pure Nil
  traverse' f (x :: xs) = [| f x :: traverse' f xs |]

Traversable' List1 where
  traverse' f (h ::: t) = [| f h ::: traverse' f t |]

Traversable' (Either e) where
  traverse' f (Left ve)  = pure $ Left ve
  traverse' f (Right va) = Right <$> f va

Traversable' Maybe where
  traverse' f Nothing   = pure Nothing
  traverse' f (Just va) = Just <$> f va
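
-- A usage sketch (`allPositive` is a hypothetical name): the traversal
-- succeeds only if the effectful function succeeds for every element.
allPositive : List Integer -> Maybe (List Integer)
allPositive = traverse' (\x => if x > 0 then Just x else Nothing)
-- allPositive [1,2,3] evaluates to Just [1,2,3]
-- allPositive [1,0,3] evaluates to Nothing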

-- 4

data List01 : (nonEmpty : Bool) -> Type -> Type where
  Nil  : List01 False a
  (::) : a -> List01 False a -> List01 ne a

Functor (List01 ne) where
  map f Nil       = Nil
  map f (x :: xs) = f x :: map f xs

Foldable (List01 ne) where
  foldr acc st []        = st
  foldr acc st (x :: xs) = acc x (foldr acc st xs)

Traversable (List01 ne) where
  traverse _ Nil       = pure Nil
  traverse f (x :: xs) = [| f x :: traverse f xs |]

-- 5

record Tree a where
  constructor Node
  value  : a
  forest : List (Tree a)

Forest : Type -> Type
Forest = List . Tree

treeMap : (a -> b) -> Tree a -> Tree b
treeMap f (Node value forest) = Node (f value) (go forest)
  where go : Forest a -> Forest b
        go []        = []
        go (x :: xs) = treeMap f x :: go xs

Functor Tree where map = treeMap

mutual
  foldrTree : (el -> st -> st) -> st -> Tree el -> st
  foldrTree f v (Node value forest) = f value (foldrForest f v forest)

  foldrForest : (el -> st -> st) -> st -> Forest el -> st
  foldrForest _ v []        = v
  foldrForest f v (x :: xs) = foldrTree f (foldrForest f v xs) x

Foldable Tree where
  foldr   = foldrTree

mutual
  traverseTree : Applicative f => (a -> f b) -> Tree a -> f (Tree b)
  traverseTree g (Node v fo) = [| Node (g v) (traverseForest g fo) |]

  traverseForest : Applicative f => (a -> f b) -> Forest a -> f (Forest b)
  traverseForest g []        = pure []
  traverseForest g (x :: xs) = [| traverseTree g x :: traverseForest g xs |]

Traversable Tree where
  traverse = traverseTree

-- 6

data Crud : (i : Type) -> (a : Type) -> Type where
  Create : (value : a) -> Crud i a
  Update : (id : i) -> (value : a) -> Crud i a
  Read   : (id : i) -> Crud i a
  Delete : (id : i) -> Crud i a

Functor (Crud i) where
  map f (Create value)    = Create $ f value
  map f (Update id value) = Update id $ f value
  map f (Read id)         = Read id
  map f (Delete id)       = Delete id

Foldable (Crud i) where
  foldr acc st (Create value)   = acc value st
  foldr acc st (Update _ value) = acc value st
  foldr _   st (Read _)         = st
  foldr _   st (Delete _)       = st

Traversable (Crud i) where
  traverse f (Create value)    = Create <$> f value
  traverse f (Update id value) = Update id <$> f value
  traverse f (Read id)         = pure $ Read id
  traverse f (Delete id)       = pure $ Delete id

-- 7

data Response : (e, i, a : Type) -> Type where
  Created : (id : i) -> (value : a) -> Response e i a
  Updated : (id : i) -> (value : a) -> Response e i a
  Found   : (values : List a) -> Response e i a
  Deleted : (id : i) -> Response e i a
  Error   : (err : e) -> Response e i a

Functor (Response e i) where
  map f (Created id value) = Created id $ f value
  map f (Updated id value) = Updated id $ f value
  map f (Found values)     = Found $ map f values
  map _ (Deleted id)       = Deleted id
  map _ (Error err)        = Error err

Foldable (Response e i) where
  foldr acc st (Created _ value) = acc value st
  foldr acc st (Updated _ value) = acc value st
  foldr acc st (Found values)    = foldr acc st values
  foldr _   st (Deleted _)       = st
  foldr _   st (Error _)         = st

Traversable (Response e i) where
  traverse f (Created id value) = Created id <$> f value
  traverse f (Updated id value) = Updated id <$> f value
  traverse f (Found values)     = Found <$> traverse f values
  traverse _ (Deleted id)       = pure $ Deleted id
  traverse _ (Error err)        = pure $ Error err

-- 8

record Comp (f,g : Type -> Type) (a : Type) where
  constructor MkComp
  unComp  : f (g a)

Functor f => Functor g => Functor (Comp f g) where
  map fun = MkComp . (map . map) fun . unComp

Foldable f => Foldable g => Foldable (Comp f g) where
  foldr f st (MkComp v)  = foldr (flip $ foldr f) st v

Traversable f => Traversable g => Traversable (Comp f g) where
  traverse fun = map MkComp . (traverse . traverse) fun . unComp

record Product (f,g : Type -> Type) (a : Type) where
  constructor MkProduct
  fst : f a
  snd : g a

Functor f => Functor g => Functor (Product f g) where
  map fun (MkProduct fa ga) = MkProduct (map fun fa) (map fun ga)

Foldable f => Foldable g => Foldable (Product f g) where
  foldr f st (MkProduct v w)  = foldr f (foldr f st w) v

Traversable f => Traversable g => Traversable (Product f g) where
  traverse fun (MkProduct fa ga) =
    [| MkProduct (traverse fun fa) (traverse fun ga) |]

--------------------------------------------------------------------------------
--          Programming with State
--------------------------------------------------------------------------------

-- 1

rnd : Bits64 -> Bits64
rnd seed = fromInteger
         $ (437799614237992725 * cast seed) `mod` 2305843009213693951

Gen : Type -> Type
Gen = State Bits64

-- 1.1

bits64 : Gen Bits64
bits64 = get <* modify rnd

-- 1.2

range64 : (upper : Bits64) -> Gen Bits64
range64 18446744073709551615 = bits64
range64 n                    = (`mod` (n + 1)) <$> bits64

interval64 : (a,b : Bits64) -> Gen Bits64
interval64 a b =
  let mi = min a b
      ma = max a b
   in (mi +) <$> range64 (ma - mi)

interval : Num n => Cast n Bits64 => (a,b : n) -> Gen n
interval a b = fromInteger . cast <$> interval64 (cast a) (cast b)

-- 1.3

bool : Gen Bool
bool = (== 0) <$> range64 1

-- 1.4

fin : {n : _} -> Gen (Fin $ S n)
fin = (\x => fromMaybe FZ $ natToFin x _) <$> interval 0 n

-- 1.5

element : {n : _} -> Vect (S n) a -> Gen a
element vs = (`index` vs) <$> fin

-- 1.6

vect : {n : _} -> Gen a -> Gen (Vect n a)
vect = sequence . replicate n

list : Gen Nat -> Gen a -> Gen (List a)
list gnat ga = gnat >>= \n => toList <$> vect {n} ga

testGen : Bits64 -> Gen a -> Vect 10 a
testGen seed = evalState seed . vect

-- 1.7

choice : {n : _} -> Vect (S n) (Gen a) -> Gen a
choice gens = element gens >>= id

-- 1.8

either : Gen a -> Gen b -> Gen (Either a b)
either ga gb = choice [Left <$> ga, Right <$> gb]

-- 1.9

printableAscii : Gen Char
printableAscii = chr <$> interval 32 126

-- 1.10

string : Gen Nat -> Gen Char -> Gen String
string gn = map pack . list gn

-- 1.11

namespace HListF

  public export
  data HListF : (f : Type -> Type) -> (ts : List Type) -> Type where
    Nil  : HListF f []
    (::) : (x : f t) -> (xs : HListF f ts) -> HListF f (t :: ts)

hlist : HListF Gen ts -> Gen (HList ts)
hlist Nil        = pure Nil
hlist (gh :: gt) = [| gh :: hlist gt |]

-- 1.12

hlistT : Applicative f => HListF f ts -> f (HList ts)
hlistT Nil        = pure Nil
hlistT (fh :: ft) = [| fh :: hlistT ft |]

-- 2

-- 2.1

record IxState s t a where
  constructor IxST
  runIxST : s -> (t,a)

-- 2.2

Functor (IxState s t) where
  map f (IxST run) = IxST $ \vs => let (vt,va) = run vs in (vt, f va)

-- 2.3

pure : a -> IxState s s a
pure va = IxST $ \vs => (vs,va)

(<*>) : IxState r s (a -> b) -> IxState s t a -> IxState r t b
IxST ff <*> IxST fa = IxST $ \vr =>
  let (vs,f)  = ff vr
      (vt,va) = fa vs
   in (vt, f va)

-- 2.4

(>>=) : IxState r s a -> (a -> IxState s t b) -> IxState r t b
IxST fa >>= f = IxST $ \vr =>
  let (vs,va) = fa vr in runIxST (f va) vs

(>>) : IxState r s () -> IxState s t a -> IxState r t a
IxST fu >> IxST fb = IxST $ fb . fst . fu

-- 2.5

namespace IxMonad
  interface Functor (m s t) =>
            IxApplicative (0 m : Type -> Type -> Type -> Type) where
    pure : a -> m s s a
    (<*>) : m r s (a -> b) -> m s t a -> m r t b

  interface IxApplicative m => IxMonad m where
    (>>=) : m r s a -> (a -> m s t b) -> m r t b

  IxApplicative IxState where
    pure = Traverse.pure
    (<*>) = Traverse.(<*>)

  IxMonad IxState where
    (>>=) = Traverse.(>>=)

-- 2.6

namespace IxState
  get : IxState s s s
  get = IxST $ \vs => (vs,vs)

  put : t -> IxState s t ()
  put vt = IxST $ \_ => (vt,())

  modify : (s -> t) -> IxState s t ()
  modify f = IxST $ \vs => (f vs, ())

  runState : s -> IxState s t a -> (t,a)
  runState = flip runIxST

  evalState : s -> IxState s t a -> a
  evalState vs = snd . runState vs

  execState : s -> IxState s t a -> t
  execState vs = fst . runState vs

-- 2.7

Applicative (IxState s s) where
  pure = Traverse.pure
  (<*>) = Traverse.(<*>)

Monad (IxState s s) where
  (>>=) = Traverse.(>>=)
  join = (>>= id)

--------------------------------------------------------------------------------
--          The Power of Composition
--------------------------------------------------------------------------------

-- 1

data Tagged : (tag, val : Type) -> Type where
  Tag  : tag -> val -> Tagged tag val
  Pure : val -> Tagged tag val

Functor (Tagged tag) where
  map f (Tag x y) = Tag x (f y)
  map f (Pure x)  = Pure (f x)

Foldable (Tagged tag) where
  foldr f acc (Tag _ x) = f x acc
  foldr f acc (Pure x)  = f x acc

Traversable (Tagged tag) where
  traverse f (Tag x y) = Tag x <$> f y
  traverse f (Pure x)  = Pure <$> f x

Bifunctor Tagged where
  bimap f g (Tag x y) = Tag (f x) (g y)
  bimap _ g (Pure x)  = Pure (g x)

  mapFst f (Tag x y) = Tag (f x) y
  mapFst _ (Pure x)  = Pure x

  mapSnd g (Tag x y) = Tag x (g y)
  mapSnd g (Pure x)  = Pure (g x)

Bifoldable Tagged where
  bifoldr f g acc (Tag x y) = f x (g y acc)
  bifoldr f g acc (Pure x)  = g x acc

  bifoldl f g acc (Tag x y) = g (f acc x) y
  bifoldl _ g acc (Pure x)  = g acc x

  binull _ = False


Bitraversable Tagged where
  bitraverse f g (Tag x y) = [| Tag (f x) (g y) |]
  bitraverse _ g (Pure x)  = Pure <$> g x

-- 2

record Biff (p : Type -> Type -> Type) (f,g : Type -> Type) (a,b : Type) where
  constructor MkBiff
  runBiff : p (f a) (g b)

Bifunctor p => Functor f => Functor g => Bifunctor (Biff p f g) where
  bimap ff fg = MkBiff . bimap (map ff) (map fg) . runBiff

Bifoldable p => Foldable f => Foldable g => Bifoldable (Biff p f g) where
  bifoldr ff fg acc = bifoldr (flip $ foldr ff) (flip $ foldr fg) acc . runBiff

Bitraversable p => Traversable f => Traversable g =>
  Bitraversable (Biff p f g) where
    bitraverse ff fg =
      map MkBiff . bitraverse (traverse ff) (traverse fg) . runBiff

-- 3

record Tannen (f : Type -> Type) (p : Type -> Type -> Type) (a,b : Type) where
  constructor MkTannen
  runTannen : f (p a b)

Bifunctor p => Functor f => Bifunctor (Tannen f p) where
  bimap ff fg = MkTannen . map (bimap ff fg) . runTannen

Bifoldable p => Foldable f => Bifoldable (Tannen f p) where
  bifoldr ff fg acc = foldr (flip $ bifoldr ff fg) acc . runTannen

Bitraversable p => Traversable f => Bitraversable (Tannen f p) where
  bitraverse ff fg = map MkTannen . traverse (bitraverse ff fg) . runTannen

-- 4

data TagError : Type where
  CE         : CSVError -> TagError
  InvalidTag : (line : Nat) -> (tag : String) -> TagError
  Append     : TagError -> TagError -> TagError

Semigroup TagError where (<+>) = Append

pairWithIndex : a -> State Nat (Nat,a)
pairWithIndex v = ST $ \index => (S index, (index, v))

data Color = Red | Green | Blue

readColor : String -> State Nat (Validated TagError Color)
readColor s = uncurry decodeTag . (`MkPair` s) <$> get
  where decodeTag : Nat -> String -> Validated TagError Color
        decodeTag k "red"   = pure Red
        decodeTag k "green" = pure Green
        decodeTag k "blue"  = pure Blue
        decodeTag k s       = Invalid $ InvalidTag k s

readTaggedLine : String -> Tagged String String
readTaggedLine s = case split ('#' ==) s of
  h ::: [t] => Tag t h
  _         => Pure s
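
-- A usage sketch (`taggedExample` and `untaggedExample` are hypothetical
-- names): the text after a single '#' becomes the tag, while a line
-- without a tag is wrapped in `Pure`.
taggedExample : Tagged String String
taggedExample = readTaggedLine "f,12,-13.01#green"
-- evaluates to Tag "green" "f,12,-13.01"

untaggedExample : Tagged String String
untaggedExample = readTaggedLine "t,100,0.0017"
-- evaluates to Pure "t,100,0.0017"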

tagAndDecodeTE :  (0 ts : List Type)
               -> CSVLine (HList ts)
               => String
               -> State Nat (Validated TagError (HList ts))
tagAndDecodeTE ts s = mapFst CE . uncurry (hdecode ts) <$> pairWithIndex s

readTagged :  (0 ts : List Type)
           -> CSVLine (HList ts)
           => String
           -> Validated TagError (List $ Tagged Color $ HList ts)
readTagged ts = map runTannen
              . evalState 1
              . bitraverse @{%search} @{Compose} readColor (tagAndDecodeTE ts)
              . MkTannen {f = List} {p = Tagged}
              . map readTaggedLine
              . lines

validInput : String
validInput = """
  f,12,-13.01#green
  t,100,0.0017
  t,1,100.8#blue
  f,255,0.0
  f,24,1.12e17
  """

invalidInput : String
invalidInput = """
  o,12,-13.01#yellow
  t,100,0.0017
  t,1,abc
  f,256,0.0
  f,24,1.12e17
  """

src/Solutions/DPair.idr

module Solutions.DPair

import Control.Monad.State

import Data.DPair
import Data.Either
import Data.HList
import Data.List
import Data.List1
import Data.Singleton
import Data.String
import Data.Vect

import Text.CSV

import System.File

%default total

--------------------------------------------------------------------------------
--          Dependent Pairs
--------------------------------------------------------------------------------

-- 1

filterVect : (a -> Bool) -> Vect m a -> (n ** Vect n a)
filterVect f []        = (_ ** [])
filterVect f (x :: xs) = case f x of
  True  => let (_ ** ys) = filterVect f xs in (_ ** x :: ys)
  False => filterVect f xs
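
-- A usage sketch (`filterExample` is a hypothetical name): the length of
-- the result is not known in advance, so it is returned as the first
-- component of the dependent pair.
filterExample : (n ** Vect n Nat)
filterExample = filterVect (> 2) [1,2,3,4]
-- evaluates to (2 ** [3,4])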

-- 2

mapMaybeVect : (a -> Maybe b) -> Vect m a -> (n ** Vect n b)
mapMaybeVect f []        = (_ ** [])
mapMaybeVect f (x :: xs) = case f x of
  Just v  => let (_ ** vs) = mapMaybeVect f xs in (_ ** v :: vs)
  Nothing => mapMaybeVect f xs

-- 3

dropWhileVect : (a -> Bool) -> Vect m a -> Exists (\n => Vect n a)
dropWhileVect f []        = Evidence _ []
dropWhileVect f (x :: xs) = case f x of
  True  => dropWhileVect f xs
  False => Evidence _ (x :: xs)

-- 4

vectLength : Vect n a -> Singleton n
vectLength []        = Val 0
vectLength (x :: xs) = let Val k = vectLength xs in Val (S k)

dropWhileVect' : (a -> Bool) -> Vect m a -> (n ** Vect n a)
dropWhileVect' f xs =
  let Evidence _ ys = dropWhileVect f xs
      Val n         = vectLength ys
   in (n ** ys)

--------------------------------------------------------------------------------
--          Use Case: Nucleic Acids
--------------------------------------------------------------------------------

-- 1

data BaseType = DNABase | RNABase

data Nucleobase' : BaseType -> Type where
  Adenine'  : Nucleobase' b
  Cytosine' : Nucleobase' b
  Guanine'  : Nucleobase' b
  Thymine'  : Nucleobase' DNABase
  Uracile'  : Nucleobase' RNABase

RNA' : Type
RNA' = List (Nucleobase' RNABase)

DNA' : Type
DNA' = List (Nucleobase' DNABase)

Acid1 : Type
Acid1 = (b ** List (Nucleobase' b))

record Acid2 where
  constructor MkAcid2
  baseType : BaseType
  sequence : List (Nucleobase' baseType)

data Acid3 : Type where
  SomeRNA : RNA' -> Acid3
  SomeDNA : DNA' -> Acid3

nb12 : Acid1 -> Acid2
nb12 (fst ** snd) = MkAcid2 fst snd

nb21 : Acid2 -> Acid1
nb21 (MkAcid2 bt seq) = (bt ** seq)

nb13 : Acid1 -> Acid3
nb13 (DNABase ** snd) = SomeDNA snd
nb13 (RNABase ** snd) = SomeRNA snd

nb31 : Acid3 -> Acid1
nb31 (SomeRNA xs) = (RNABase ** xs)
nb31 (SomeDNA xs) = (DNABase ** xs)

-- 2

data Dir = Sense | Antisense

data Nucleobase : BaseType -> Dir -> Type where
  Adenine  : Nucleobase b d
  Cytosine : Nucleobase b d
  Guanine  : Nucleobase b d
  Thymine  : Nucleobase DNABase d
  Uracile  : Nucleobase RNABase d

RNA : Dir -> Type
RNA d = List (Nucleobase RNABase d)

DNA : Dir -> Type
DNA d = List (Nucleobase DNABase d)

-- 3

inverse : Dir -> Dir
inverse Sense     = Antisense
inverse Antisense = Sense

complementBase :  (b : BaseType)
               -> Nucleobase b dir
               -> Nucleobase b (inverse dir)
complementBase DNABase Adenine  = Thymine
complementBase RNABase Adenine  = Uracile
complementBase _       Cytosine = Guanine
complementBase _       Guanine  = Cytosine
complementBase _       Thymine  = Adenine
complementBase _       Uracile  = Adenine

complement :  (b : BaseType)
           -> List (Nucleobase b dir)
           -> List (Nucleobase b $ inverse dir)
complement b = map (complementBase b)

transcribeBase : Nucleobase DNABase Antisense -> Nucleobase RNABase Sense
transcribeBase Adenine  = Uracile
transcribeBase Cytosine = Guanine
transcribeBase Guanine  = Cytosine
transcribeBase Thymine  = Adenine

transcribe : DNA Antisense -> RNA Sense
transcribe = map transcribeBase

transcribeAny : (dir : Dir) -> DNA dir -> RNA Sense
transcribeAny Antisense = transcribe
transcribeAny Sense     = transcribe . complement _
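
-- A usage sketch (the names `senseStrand` and `transcript` are hypothetical).
-- A sense strand is first complemented to an antisense strand and then
-- transcribed, so the resulting RNA should match the input with `Thymine`
-- replaced by `Uracile`.
senseStrand : DNA Sense
senseStrand = [Adenine, Thymine, Guanine]

transcript : RNA Sense
transcript = transcribeAny Sense senseStrand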

-- 4

record NucleicAcid where
  constructor MkNucleicAcid
  baseType : BaseType
  dir      : Dir
  sequence : List (Nucleobase baseType dir)

-- 5

readAnyBase : {0 dir : _} -> Char -> Maybe (Nucleobase b dir)
readAnyBase 'A' = Just Adenine
readAnyBase 'C' = Just Cytosine
readAnyBase 'G' = Just Guanine
readAnyBase _   = Nothing

readRNABase : {0 dir : _} -> Char -> Maybe (Nucleobase RNABase dir)
readRNABase 'U' = Just Uracile
readRNABase c   = readAnyBase c

readDNABase : {0 dir : _} -> Char -> Maybe (Nucleobase DNABase dir)
readDNABase 'T' = Just Thymine
readDNABase c   = readAnyBase c

readRNA : String -> Maybe (dir : Dir ** RNA dir)
readRNA str = case forget $ split ('-' ==) str of
  ["5´",s,"3´"] => MkDPair Sense     <$> traverse readRNABase (unpack s)
  ["3´",s,"5´"] => MkDPair Antisense <$> traverse readRNABase (unpack s)
  _             => Nothing

readDNA : String -> Maybe (dir : Dir ** DNA dir)
readDNA str = case forget $ split ('-' ==) str of
  ["5´",s,"3´"] => MkDPair Sense     <$> traverse readDNABase (unpack s)
  ["3´",s,"5´"] => MkDPair Antisense <$> traverse readDNABase (unpack s)
  _             => Nothing

-- 6

preSuf : Dir -> (String,String)
preSuf Sense     = ("5´-", "-3´")
preSuf Antisense = ("3´-", "-5´")

encodeBase : Nucleobase c d -> Char
encodeBase Adenine  = 'A'
encodeBase Cytosine = 'C'
encodeBase Guanine  = 'G'
encodeBase Thymine  = 'T'
encodeBase Uracile  = 'U'

encode : (dir : Dir) -> List (Nucleobase b dir) -> String
encode dir seq =
  let (pre,suf) = preSuf dir
   in pre ++ pack (map encodeBase seq) ++ suf
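
-- A usage sketch (the name `encodedRNA` is hypothetical). The base type is
-- fixed by the annotation, since `Adenine` alone would fit both DNA and RNA.
-- The direction determines the prefix and suffix added by `preSuf`.
encodedRNA : String
encodedRNA = encode Sense (the (RNA Sense) [Adenine, Uracile, Guanine])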

-- 7

public export
data InputError : Type where
  UnknownBaseType : String -> InputError
  InvalidSequence : String -> InputError

readAcid :  (b : BaseType)
         -> String
         -> Either InputError (d ** List $ Nucleobase b d)
readAcid b str =
  let err = InvalidSequence str
   in case b of
        DNABase => maybeToEither err $ readDNA str
        RNABase => maybeToEither err $ readRNA str

toAcid : (b : BaseType) -> (d ** List $ Nucleobase b d) -> NucleicAcid
toAcid b (d ** seq) = MkNucleicAcid b d seq

getNucleicAcid : IO (Either InputError NucleicAcid)
getNucleicAcid = do
  baseString <- getLine
  case baseString of
    "DNA" => map (toAcid _) . readAcid DNABase <$> getLine
    "RNA" => map (toAcid _) . readAcid RNABase <$> getLine
    _     => pure $ Left (UnknownBaseType baseString)

printRNA : RNA Sense -> IO ()
printRNA = putStrLn . encode _

transcribeProg : IO ()
transcribeProg = do
  Right (MkNucleicAcid b d seq) <- getNucleicAcid
    | Left (InvalidSequence str) => putStrLn $ "Invalid sequence: " ++ str
    | Left (UnknownBaseType str) => putStrLn $ "Unknown base type: " ++ str
  case b of
    DNABase => printRNA $ transcribeAny d seq
    RNABase => case d of
      Sense     => printRNA seq
      Antisense => printRNA $ complement _ seq

--------------------------------------------------------------------------------
--          Use Case: CSV Files with a Schema
--------------------------------------------------------------------------------

-- A lot of code was copy-pasted from the chapter's text and is therefore
-- not very interesting. I tried to annotate the new parts with some hints
-- for better understanding. Also, instead of grouping code by exercise number,
-- I organized it thematically.



--   *** Types ***



-- I used an indexed type here to make sure the data
-- constructor `Optional` takes only non-nullary types
-- as arguments. As noted in exercise 3, having a nesting
-- of nullary types does not make sense without a way to
-- distinguish between a `Nothing` and a `Just Nothing`,
-- both of which would be encoded as the empty string.
-- For `Finite`, we have to add `n` as an argument to the
-- data constructor, so we can use it to decode values
-- of type `Fin n`.
data ColType0 : (nullary : Bool) -> Type where
  I64      : ColType0 b
  Str      : ColType0 b
  Boolean  : ColType0 b
  Float    : ColType0 b
  Natural  : ColType0 b
  BigInt   : ColType0 b
  Finite   : Nat -> ColType0 b
  Optional : ColType0 False -> ColType0 True

-- This is the type used in schemata, where nullary types
-- are explicitly allowed.
ColType : Type
ColType = ColType0 True

Schema : Type
Schema = List ColType

-- The only interesting new parts are the last two
-- lines. They should be pretty self-explanatory.
IdrisType : ColType0 b -> Type
IdrisType I64          = Int64
IdrisType Str          = String
IdrisType Boolean      = Bool
IdrisType Float        = Double
IdrisType Natural      = Nat
IdrisType BigInt       = Integer
IdrisType (Finite n)   = Fin n
IdrisType (Optional t) = Maybe $ IdrisType t

Row : Schema -> Type
Row = HList . map IdrisType

record Table where
  constructor MkTable
  schema : Schema
  size   : Nat
  rows   : Vect size (Row schema)

data Error : Type where
  ExpectedEOI    : (pos : Nat) -> String -> Error
  ExpectedLine   : Error
  InvalidCell    : (row, col : Nat) -> ColType0 b -> String -> Error
  NoNat          : String -> Error
  OutOfBounds    : (size : Nat) -> (index : Nat) -> Error
  ReadError      : (path : String) -> FileError -> Error
  UnexpectedEOI  : (pos : Nat) -> String -> Error
  UnknownCommand : String -> Error
  UnknownType    : (pos : Nat) -> String -> Error
  WriteError     : (path : String) -> FileError -> Error

-- Oh, the type of `Query` is a nice one. :-)
-- `PrintTable`, on the other hand, is trivial.
-- The save and load commands are special: They will
-- already have carried out their tasks after parsing.
-- This allows us to keep `applyCommand` pure.
data Command : (t : Table) -> Type where
  PrintSchema : Command t
  PrintSize   : Command t
  PrintTable  : Command t
  Load        : Table -> Command t
  Save        : Command t
  New         : (newSchema : Schema) -> Command t
  Prepend     : Row (schema t) -> Command t
  Get         : Fin (size t) -> Command t
  Delete      : Fin (size t) -> Command t
  Quit        : Command t
  Query       :  (ix  : Fin (length $ schema t))
              -> (val : IdrisType $ indexList (schema t) ix)
              -> Command t



--   *** Core Functionality ***



-- Compares two values for equality.
eq : (c : ColType0 b) -> IdrisType c -> IdrisType c -> Bool
eq I64          x        y        = x == y
eq Str          x        y        = x == y
eq Boolean      x        y        = x == y
eq Float        x        y        = x == y
eq Natural      x        y        = x == y
eq BigInt       x        y        = x == y
eq (Finite k)   x        y        = x == y
eq (Optional z) (Just x) (Just y) = eq z x y
eq (Optional z) Nothing  Nothing  = True
eq (Optional z) _        _        = False

-- Note: It would have been quite a bit easier to type and
-- implement this, had we used a heterogeneous vector instead
-- of a heterogeneous list for encoding table rows. However,
-- I still think it's pretty cool that this type checks!
eqAt :  (ts  : Schema)
     -> (ix  : Fin $ length ts)
     -> (val : IdrisType $ indexList ts ix)
     -> (row : Row ts)
     -> Bool
eqAt (x :: _)  FZ     val (v :: _)  = eq x val v
eqAt (_ :: xs) (FS y) val (_ :: vs) = eqAt xs y val vs
eqAt []        _      _   _ impossible

-- Most new commands don't change the table,
-- so their cases are trivial. The exception is
-- `Load`, which replaces the table completely.
applyCommand : (t : Table) -> Command t -> Table
applyCommand t                 PrintSchema    = t
applyCommand t                 PrintSize      = t
applyCommand t                 PrintTable     = t
applyCommand t                 Save           = t
applyCommand _                 (Load t')      = t'
applyCommand _                 (New ts)       = MkTable ts _ []
applyCommand (MkTable ts n rs) (Prepend r)    = MkTable ts _ $ r :: rs
applyCommand t                 (Get x)        = t
applyCommand t                 Quit           = t
applyCommand t                 (Query ix val) = t
applyCommand (MkTable ts n rs) (Delete x)  = case n of
  S k => MkTable ts k (deleteAt x rs)
  Z   => absurd x



--   *** Parsers ***



zipWithIndex : Traversable t => t a -> t (Nat, a)
zipWithIndex = evalState 1 . traverse pairWithIndex
  where pairWithIndex : a -> State Nat (Nat,a)
        pairWithIndex v = (,v) <$> get <* modify S
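
-- A usage sketch (the name `indexedCells` is hypothetical). The counter
-- starts at 1 and is incremented after each element, so the three strings
-- are paired with 1, 2, and 3 in order.
indexedCells : List (Nat, String)
indexedCells = zipWithIndex ["a", "b", "c"]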

fromCSV : String -> List String
fromCSV = forget . split (',' ==)

-- Reads a primitive (non-nullary) type. This is therefore
-- universally quantified over parameter `b`.
-- The only interesting part is the parsing of `finXYZ`,
-- where we `break` the string at the occurrence of
-- the first digit.
readPrim : Nat -> String -> Either Error (ColType0 b)
readPrim _ "i64"      = Right I64
readPrim _ "str"      = Right Str
readPrim _ "boolean"  = Right Boolean
readPrim _ "float"    = Right Float
readPrim _ "natural"  = Right Natural
readPrim _ "bigint"   = Right BigInt
readPrim n s          =
  let err = Left $ UnknownType n s
   in case break isDigit s of
        ("fin",r) => maybe err (Right . Finite) $ parsePositive r
        _         => err

-- This is the parser for (possibly nullary) column types.
-- A nullary type is encoded as the corresponding non-nullary
-- type with a question mark appended. We therefore first check
-- for the presence of said question mark at the end of the string.
readColType : Nat -> String -> Either Error ColType
readColType n s = case reverse (unpack s) of
  '?' :: t => Optional <$> readPrim n (pack $ reverse t)
  _        => readPrim n s
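
-- A usage sketch (the names `optionalInt` and `finTen` are hypothetical).
-- A trailing question mark wraps the primitive type in `Optional`. Without
-- it, the string is passed to `readPrim` unchanged.
optionalInt : Either Error ColType
optionalInt = readColType 1 "i64?"   -- Right (Optional I64)

finTen : Either Error ColType
finTen = readColType 1 "fin10"       -- Right (Finite 10)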

readSchema : String -> Either Error Schema
readSchema = traverse (uncurry readColType) . zipWithIndex . fromCSV

readSchemaList : List String -> Either Error Schema
readSchemaList [s] = readSchema s
readSchemaList _   = Left ExpectedLine

-- For all except nullary types we can just use the `CSVField`
-- implementation for reading values.
-- For values of nullary types, we treat the empty string specially.
decodeF : (c : ColType0 b) -> String -> Maybe (IdrisType c)
decodeF I64          s  = read s
decodeF Str          s  = read s
decodeF Boolean      s  = read s
decodeF Float        s  = read s
decodeF Natural      s  = read s
decodeF BigInt       s  = read s
decodeF (Finite k)   s  = read s
decodeF (Optional y) "" = Just Nothing
decodeF (Optional y) s  = Just <$> decodeF y s

decodeField : (row,col : Nat) -> (c : ColType0 b) -> String -> Either Error (IdrisType c)
decodeField row k c s = maybeToEither (InvalidCell row k c s) $ decodeF c s

decodeRow : {ts : _} -> (row : Nat) -> String -> Either Error (Row ts)
decodeRow row s = go 1 ts $ fromCSV s
  where go : Nat -> (cs : Schema) -> List String -> Either Error (Row cs)
        go k []       []         = Right []
        go k []       (_ :: _)   = Left $ ExpectedEOI k s
        go k (_ :: _) []         = Left $ UnexpectedEOI k s
        go k (c :: cs) (s :: ss) = [| decodeField row k c s :: go (S k) cs ss |]

decodeRows : {ts : _} -> List String -> Either Error (List $ Row ts)
decodeRows = traverse (uncurry decodeRow) . zipWithIndex

readFin : {n : _} -> String -> Either Error (Fin n)
readFin s = do
  S k <- maybeToEither (NoNat s) $ parsePositive {a = Nat} s
    | Z => Left $ OutOfBounds n Z
  maybeToEither (OutOfBounds n $ S k) $ natToFin k n
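
-- A usage sketch (the name `firstOfThree` is hypothetical). User input is
-- one-based, so "1" is mapped to the first index `FZ`, while "0" and any
-- number greater than `n` are rejected with `OutOfBounds`.
firstOfThree : Either Error (Fin 3)
firstOfThree = readFin "1"           -- Right FZ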

readCommand :  (t : Table) -> String -> Either Error (Command t)
readCommand _                "schema"  = Right PrintSchema
readCommand _                "size"    = Right PrintSize
readCommand _                "table"   = Right PrintTable
readCommand _                "quit"    = Right Quit
readCommand (MkTable ts n _) s         = case words s of
  ["new",    str]    => New     <$> readSchema str
  "add" ::   ss      => Prepend <$> decodeRow 1 (unwords ss)
  ["get",    str]    => Get     <$> readFin str
  ["delete", str]    => Delete  <$> readFin str
  "query" :: n :: ss => do
    ix  <- readFin n
    val <- decodeField 1 1 (indexList ts ix) (unwords ss)
    pure $ Query ix val
  _                  => Left $ UnknownCommand s



--   *** Printers ***



toCSV : List String -> String
toCSV = concat . intersperse ","

-- We mark optional types by appending a question
-- mark after the corresponding non-nullary type.
showColType : ColType0 b -> String
showColType I64          = "i64"
showColType Str          = "str"
showColType Boolean      = "boolean"
showColType Float        = "float"
showColType Natural      = "natural"
showColType BigInt       = "bigint"
showColType (Finite n)   = "fin\{show n}"
showColType (Optional t) = showColType t ++ "?"

-- Again, only nullary values are treated specially. This
-- is another case of a dependent pattern match: We use
-- explicit pattern matches on the value to encode based
-- on the type calculated from the `ColType0 b` parameter.
-- There are few languages capable of expressing this as
-- cleanly as Idris does.
encodeField : (t : ColType0 b) -> IdrisType t -> String
encodeField I64          x        = show x
encodeField Str          x        = x
encodeField Boolean      True     = "t"
encodeField Boolean      False    = "f"
encodeField Float        x        = show x
encodeField Natural      x        = show x
encodeField BigInt       x        = show x
encodeField (Finite k)   x        = show x
encodeField (Optional y) (Just v) = encodeField y v
encodeField (Optional y) Nothing  = ""

encodeFields : (ts : Schema) -> Row ts -> Vect (length ts) String
encodeFields []        []        = []
encodeFields (c :: cs) (v :: vs) = encodeField c v :: encodeFields cs vs

encodeTable : Table -> String
encodeTable (MkTable ts _ rows) =
  unlines . toList $ map (toCSV . toList . encodeFields ts) rows

encodeSchema : Schema -> String
encodeSchema = toCSV . map showColType

-- Pretty printing a table plus header. All cells are right-padded
-- with spaces to adjust their size to the cell with the longest
-- entry for each column.
-- Value `ls` is a `Vect n Nat` holding these lengths.
-- Here is an example of what the output looks like:
--
-- fin100 | boolean | natural | str         | bigint?
-- --------------------------------------------------
-- 88     | f       | 10      | stefan      |
-- 13     | f       | 10      | hock        | -100
-- 58     | t       | 1000    | hello world | -1234
--
-- Ideally, numeric values would be right-aligned, but since this
-- whole exercise is already quite long and complex, I refrained
-- from adding this luxury.
prettyTable :  {n : _}
            -> (header : Vect n String)
            -> (table  : Vect m (Vect n String))
            -> String
prettyTable h t =
  let -- vector holding the maximal length of each column
      ls  = foldl (zipWith $ \k => max k . length) (replicate n Z) (h::t)

      -- horizontal bar used to separate the header from the rows
      bar = concat . intersperse "---" $ map (`replicate` '-') ls
   in unlines . toList $ line ls h :: bar :: map (line ls) t

  where pad : Nat -> String -> String
        pad v = padRight v ' '

        -- given a vector of lengths, pads each string to the
        -- desired length, separating cells with a vertical bar.
        line : Vect n Nat -> Vect n String -> String
        line lengths = concat . intersperse " | " . zipWith pad lengths

printTable :  (cs   : List ColType)
           -> (rows : Vect n (Row cs))
           -> String
printTable cs rows =
  let header  = map showColType $ fromList cs
      table   = map (encodeFields cs) rows
   in prettyTable header table

allTypes : String
allTypes = concat
         . List.intersperse ", "
         . map (showColType {b = True})
         $ [I64,Str,Boolean,Float]

showError : Error -> String
showError ExpectedLine = """
  Error when reading schema.
  Expected a single line of content.
  """

showError (UnknownCommand x) = """
  Unknown command: \{x}.
  Known commands are: schema, size, table, new, add, get, delete, query, save, load, quit.
  """

showError (UnknownType pos x) = """
  Unknown type at position \{show pos}: \{x}.
  Known types are: \{allTypes}.
  """

showError (InvalidCell row col tpe x) = """
  Invalid value at row \{show row}, column \{show col}.
  Expected type: \{showColType tpe}.
  Value found: \{x}.
  """

showError (ExpectedEOI k x) = """
  Expected end of input.
  Position: \{show k}
  Input: \{x}
  """

showError (UnexpectedEOI k x) = """
  Unexpected end of input.
  Position: \{show k}
  Input: \{x}
  """

showError (OutOfBounds size index) = """
  Index out of bounds.
  Size of table: \{show size}
  Index: \{show index}
  Note: Indices start at one.
  """

showError (WriteError path err) = """
  Error when writing file \{path}.
  Message: \{show err}
  """

showError (ReadError path err) = """
  Error when reading file \{path}.
  Message: \{show err}
  """

showError (NoNat x) = "Not a natural number: \{x}"

result :  (t : Table) -> Command t -> String
result t PrintSchema    = "Current schema: \{encodeSchema t.schema}"
result t PrintSize      = "Current size: \{show t.size}"
result t PrintTable     = "Table:\n\n\{printTable t.schema t.rows}"
result _ Save           = "Table written to disk."
result _ (Load t)       = "Table loaded. Schema: \{encodeSchema t.schema}"
result _ (New ts)       = "Created table. Schema: \{encodeSchema ts}"
result t (Prepend r)    = "Row prepended:\n\n\{printTable t.schema [r]}"
result _ (Delete x)     = "Deleted row: \{show $ FS x}."
result _ Quit           = "Goodbye."
result t (Query ix val) =
  let (_ ** rs) = filter (eqAt t.schema ix val) t.rows
   in "Result:\n\n\{printTable t.schema rs}"
result t (Get x)        =
  "Row \{show $ FS x}:\n\n\{printTable t.schema [index x t.rows]}"



--   *** File IO ***



-- We use partial function `readFile` for simplicity here.
partial
load :  (path   : String)
     -> (decode : List String -> Either Error a)
     -> IO (Either Error a)
load path decode = do
  Right ls <- readFile path
    | Left err        => pure $ Left (ReadError path err)
  pure $ decode (filter (not . null) $ lines ls)

write : (path : String) -> (content : String) -> IO (Either Error ())
write path content = mapFst (WriteError path) <$> writeFile path content

namespace IOEither
  export
  (>>=) : IO (Either err a) -> (a -> IO (Either err b)) -> IO (Either err b)
  ioa >>= f = Prelude.(>>=) ioa (either (pure . Left) f)

  export
  (>>) : IO (Either err ()) -> IO (Either err a) -> IO (Either err a)
  (>>) x y = x >>= const y

  export
  pure : a -> IO (Either err a)
  pure = Prelude.pure . Right

partial
readCommandIO : (t : Table) -> String -> IO (Either Error (Command t))
readCommandIO t s = case words s of
  ["save", pth] => IOEither.do
    write (pth ++ ".schema") (encodeSchema t.schema)
    write (pth ++ ".csv") (encodeTable t)
    pure Save

  ["load", pth] => IOEither.do
    schema <- load (pth ++ ".schema") readSchemaList
    rows   <- load (pth ++ ".csv") (decodeRows {ts = schema})
    pure . Load $ MkTable schema (length rows) (fromList rows)

  _ => Prelude.pure $ readCommand t s



--   *** Main Loop ***



partial
runProg : Table -> IO ()
runProg t = do
  putStr "Enter a command: "
  str <- getLine
  cmd <- readCommandIO t str
  case cmd of
    Left err   => putStrLn (showError err) >> runProg t
    Right Quit => putStrLn (result t Quit)
    Right cmd  => putStrLn (result t cmd) >>
                  runProg (applyCommand t cmd)

partial
main : IO ()
main = runProg $ MkTable [] _ []

src/Solutions/Eq.idr

module Solutions.Eq

import Data.HList
import Data.Vect
import Decidable.Equality

%default total

data ColType = I64 | Str | Boolean | Float

Schema : Type
Schema = List ColType

IdrisType : ColType -> Type
IdrisType I64     = Int64
IdrisType Str     = String
IdrisType Boolean = Bool
IdrisType Float   = Double

Row : Schema -> Type
Row = HList . map IdrisType

record Table where
  constructor MkTable
  schema : Schema
  size   : Nat
  rows   : Vect size (Row schema)

data SameColType : (c1, c2 : ColType) -> Type where
  SameCT : SameColType c1 c1

--------------------------------------------------------------------------------
--          Equality as a Type
--------------------------------------------------------------------------------

-- 1

sctReflexive : SameColType c1 c1
sctReflexive = SameCT

-- 2

sctSymmetric : SameColType c1 c2 -> SameColType c2 c1
sctSymmetric SameCT = SameCT

-- 3

sctTransitive : SameColType c1 c2 -> SameColType c2 c3 -> SameColType c1 c3
sctTransitive SameCT SameCT = SameCT

-- 4

sctCong : (f : ColType -> a) -> SameColType c1 c2 -> f c1 = f c2
sctCong f SameCT = Refl

-- 5

natEq : (n1,n2 : Nat) -> Maybe (n1 = n2)
natEq 0     0     = Just Refl
natEq (S k) (S j) = (\x => cong S x) <$> natEq k j
natEq (S k) 0     = Nothing
natEq 0     (S _) = Nothing

-- 6

appRows : {ts1 : _} -> Row ts1 -> Row ts2 -> Row (ts1 ++ ts2)
appRows {ts1 = []}     Nil      y = y
appRows {ts1 = _ :: _} (h :: t) y = h :: appRows t y

zip : Table -> Table -> Maybe Table
zip (MkTable s1 m rs1) (MkTable s2 n rs2) = case natEq m n of
  Just Refl => Just $ MkTable _ _ (zipWith appRows rs1 rs2)
  Nothing   => Nothing

--------------------------------------------------------------------------------
--          Programs as Proofs
--------------------------------------------------------------------------------

-- 1

mapIdEither : (ea : Either e a) -> map Prelude.id ea = ea
mapIdEither (Left ve)  = Refl
mapIdEither (Right va) = Refl

-- 2

mapIdList : (as : List a) -> map Prelude.id as = as
mapIdList []        = Refl
mapIdList (x :: xs) = cong (x ::) $ mapIdList xs

-- 3

data BaseType = DNABase | RNABase

data Nucleobase : BaseType -> Type where
  Adenine  : Nucleobase b
  Cytosine : Nucleobase b
  Guanine  : Nucleobase b
  Thymine  : Nucleobase DNABase
  Uracile  : Nucleobase RNABase

NucleicAcid : BaseType -> Type
NucleicAcid = List . Nucleobase

complementBase : (b : BaseType) -> Nucleobase b -> Nucleobase b
complementBase DNABase Adenine  = Thymine
complementBase RNABase Adenine  = Uracile
complementBase _       Cytosine = Guanine
complementBase _       Guanine  = Cytosine
complementBase _       Thymine  = Adenine
complementBase _       Uracile  = Adenine

complement : (b : BaseType) -> NucleicAcid b -> NucleicAcid b
complement b = map (complementBase b)

complementBaseId :  (b  : BaseType)
                 -> (nb : Nucleobase b)
                 -> complementBase b (complementBase b nb) = nb
complementBaseId DNABase Adenine  = Refl
complementBaseId RNABase Adenine  = Refl
complementBaseId DNABase Cytosine = Refl
complementBaseId RNABase Cytosine = Refl
complementBaseId DNABase Guanine  = Refl
complementBaseId RNABase Guanine  = Refl
complementBaseId DNABase Thymine  = Refl
complementBaseId RNABase Uracile  = Refl

complementId :  (b  : BaseType)
             -> (na : NucleicAcid b)
             -> complement b (complement b na) = na
complementId b []        = Refl
complementId b (x :: xs) =
  cong2 (::) (complementBaseId b x) (complementId b xs)

-- 4

replaceVect : Fin n -> a -> Vect n a -> Vect n a
replaceVect FZ     v (x :: xs) = v :: xs
replaceVect (FS k) v (x :: xs) = x :: replaceVect k v xs

indexReplace :  (ix : Fin n)
             -> (v : a)
             -> (as : Vect n a)
             -> index ix (replaceVect ix v as) = v
indexReplace FZ     v (x :: xs) = Refl
indexReplace (FS k) v (x :: xs) = indexReplace k v xs

-- 5

insertVect : (ix : Fin (S n)) -> a -> Vect n a -> Vect (S n) a
insertVect FZ     v xs        = v :: xs
insertVect (FS k) v (x :: xs) = x :: insertVect k v xs

indexInsert :  (ix : Fin (S n))
             -> (v : a)
             -> (as : Vect n a)
             -> index ix (insertVect ix v as) = v
indexInsert FZ     v xs        = Refl
indexInsert (FS k) v (x :: xs) = indexInsert k v xs

--------------------------------------------------------------------------------
--          Into the Void
--------------------------------------------------------------------------------

-- 1

Uninhabited (Vect (S n) Void) where
  uninhabited (_ :: _) impossible

-- 2

Uninhabited a => Uninhabited (Vect (S n) a) where
  uninhabited = uninhabited . head

-- 3

notSym : Not (a = b) -> Not (b = a)
notSym f prf = f $ sym prf

-- 4

notTrans : a = b -> Not (b = c) -> Not (a = c)
notTrans ab f ac = f $ trans (sym ab) ac

-- 5

data Crud : (i : Type) -> (a : Type) -> Type where
  Create : (value : a) -> Crud i a
  Update : (id : i) -> (value : a) -> Crud i a
  Read   : (id : i) -> Crud i a
  Delete : (id : i) -> Crud i a

Uninhabited a => Uninhabited i => Uninhabited (Crud i a) where
  uninhabited (Create value)    = uninhabited value
  uninhabited (Update id value) = uninhabited value
  uninhabited (Read id)         = uninhabited id
  uninhabited (Delete id)       = uninhabited id

-- 6

namespace DecEq
  DecEq ColType where
    decEq I64 I64         = Yes Refl
    decEq I64 Str         = No $ \case Refl impossible
    decEq I64 Boolean     = No $ \case Refl impossible
    decEq I64 Float       = No $ \case Refl impossible

    decEq Str I64         = No $ \case Refl impossible
    decEq Str Str         = Yes Refl
    decEq Str Boolean     = No $ \case Refl impossible
    decEq Str Float       = No $ \case Refl impossible

    decEq Boolean I64     = No $ \case Refl impossible
    decEq Boolean Str     = No $ \case Refl impossible
    decEq Boolean Boolean = Yes Refl
    decEq Boolean Float   = No $ \case Refl impossible

    decEq Float I64       = No $ \case Refl impossible
    decEq Float Str       = No $ \case Refl impossible
    decEq Float Boolean   = No $ \case Refl impossible
    decEq Float Float     = Yes Refl

-- 7

ctNat : ColType -> Nat
ctNat I64     = 0
ctNat Str     = 1
ctNat Boolean = 2
ctNat Float   = 3

ctNatInjective : (c1,c2 : ColType) -> ctNat c1 = ctNat c2 -> c1 = c2
ctNatInjective I64     I64     Refl = Refl
ctNatInjective Str     Str     Refl = Refl
ctNatInjective Boolean Boolean Refl = Refl
ctNatInjective Float   Float   Refl = Refl

DecEq ColType where
  decEq c1 c2 = case decEq (ctNat c1) (ctNat c2) of
    Yes prf    => Yes $ ctNatInjective c1 c2 prf
    No  contra => No $ \x => contra $ cong ctNat x

--------------------------------------------------------------------------------
--          Rewrite Rules
--------------------------------------------------------------------------------

-- 1

psuccRightSucc : (m,n : Nat) -> S (m + n) = m + S n
psuccRightSucc 0     n = Refl
psuccRightSucc (S k) n = cong S $ psuccRightSucc k n

-- 2

minusSelfZero : (n : Nat) -> minus n n = 0
minusSelfZero 0     = Refl
minusSelfZero (S k) = minusSelfZero k

-- 3

minusZero : (n : Nat) -> minus n 0 = n
minusZero 0     = Refl
minusZero (S k) = Refl

-- 4

timesOneLeft : (n : Nat) -> 1 * n = n
timesOneLeft 0     = Refl
timesOneLeft (S k) = cong S $ timesOneLeft k

timesOneRight : (n : Nat) -> n * 1 = n
timesOneRight 0     = Refl
timesOneRight (S k) = cong S $ timesOneRight k


-- 5

plusCommutes : (m,n : Nat) -> m + n = n + m
plusCommutes 0     n = rewrite plusZeroRightNeutral n in Refl
plusCommutes (S k) n =
  rewrite sym (psuccRightSucc n k)
  in cong S (plusCommutes k n)

-- 6

mapOnto : (a -> b) -> Vect k b -> Vect m a -> Vect (k + m) b
mapOnto            _ xs []        =
  rewrite plusZeroRightNeutral k in reverse xs
mapOnto {m = S m'} f xs (y :: ys) =
  rewrite sym (plusSuccRightSucc k m') in mapOnto f (f y :: xs) ys

mapTR : (a -> b) -> Vect n a -> Vect n b
mapTR f = mapOnto f Nil
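
-- A usage sketch (the name `incremented` is hypothetical). `mapOnto`
-- collects the mapped values in reverse order and reverses the accumulator
-- once at the end, so the original order is preserved.
incremented : Vect 3 Nat
incremented = mapTR (+ 1) [1,2,3]    -- [2, 3, 4]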

-- 7

mapAppend :  (f : a -> b)
          -> (xs : List a)
          -> (ys : List a)
          -> map f (xs ++ ys) = map f xs ++ map f ys
mapAppend f []        ys = Refl
mapAppend f (x :: xs) ys = cong (f x ::) $ mapAppend f xs ys

-- 8

zip2 : Table -> Table -> Maybe Table
zip2 (MkTable s1 m rs1) (MkTable s2 n rs2) = case decEq m n of
  Yes Refl =>
    let rs2 = zipWith (++) rs1 rs2
     in Just $ MkTable (s1 ++ s2) _ (rewrite mapAppend IdrisType s1 s2 in rs2)
  No  _    => Nothing

src/Solutions/Predicates.idr

module Solutions.Predicates

import Data.Vect
import Decidable.Equality

%default total

--------------------------------------------------------------------------------
--          Preconditions
--------------------------------------------------------------------------------

data NonEmpty : (as : List a) -> Type where
  IsNonEmpty : NonEmpty (h :: t)

-- 1

tail : (as : List a) -> (0 _ : NonEmpty as) => List a
tail (_ :: xs) = xs
tail [] impossible

-- 2

concat1 : Semigroup a => (as : List a) -> (0 _ : NonEmpty as) => a
concat1 (h :: t) = foldl (<+>) h t

foldMap1 : Semigroup m => (a -> m) -> (as : List a) -> (0 _ : NonEmpty as) => m
foldMap1 f (h :: t) = foldl (\x,y => x <+> f y) (f h) t

-- 3

maximum : Ord a => (as : List a) -> (0 _ : NonEmpty as) => a
maximum (x :: xs) = foldl max x xs

minimum : Ord a => (as : List a) -> (0 _ : NonEmpty as) => a
minimum (x :: xs) = foldl min x xs

-- 4

data Positive : Nat -> Type where
  IsPositive : Positive (S n)

saveDiv : (m,n : Nat) -> (0 _ : Positive n) => Nat
saveDiv m (S k) = go 0 m k
  where go : (res, rem, sub : Nat) -> Nat
        go res 0       _     = res
        go res (S rem) 0     = go (res + 1) rem k
        go res (S rem) (S x) = go res rem x
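
-- A usage sketch (the name `tenByThree` is hypothetical). The proof of
-- `Positive 3` is found automatically, and the result is rounded down.
-- A literal zero as the divisor would not type check, since no value of
-- type `Positive 0` exists.
tenByThree : Nat
tenByThree = saveDiv 10 3            -- 3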

-- 5

data IJust : Maybe a -> Type where
  ItIsJust : IJust (Just v)

Uninhabited (IJust Nothing) where
  uninhabited ItIsJust impossible

isJust : (m : Maybe a) -> Dec (IJust m)
isJust Nothing  = No uninhabited
isJust (Just x) = Yes ItIsJust

fromJust : (m : Maybe a) -> (0 _ : IJust m) => a
fromJust (Just x) = x
fromJust Nothing  impossible

-- 6

data IsLeft : Either e a -> Type where
  ItIsLeft : IsLeft (Left v)

Uninhabited (IsLeft $ Right w) where
  uninhabited ItIsLeft impossible

isLeft : (v : Either e a) -> Dec (IsLeft v)
isLeft (Right _) = No uninhabited
isLeft (Left x)  = Yes ItIsLeft

data IsRight : Either e a -> Type where
  ItIsRight : IsRight (Right v)

Uninhabited (IsRight $ Left w) where
  uninhabited ItIsRight impossible

isRight : (v : Either e a) -> Dec (IsRight v)
isRight (Left _)  = No uninhabited
isRight (Right x) = Yes ItIsRight

fromLeft : (v : Either e a) -> (0 _ : IsLeft v) => e
fromLeft (Left x) = x
fromLeft (Right x) impossible

fromRight : (v : Either e a) -> (0 _ : IsRight v) => a
fromRight (Right x) = x
fromRight (Left x) impossible

--------------------------------------------------------------------------------
--          Contracts between Values
--------------------------------------------------------------------------------

data ColType = I64 | Str | Boolean | Float

IdrisType : ColType -> Type
IdrisType I64     = Int64
IdrisType Str     = String
IdrisType Boolean = Bool
IdrisType Float   = Double

record Column where
  constructor MkColumn
  name : String
  type : ColType

infixr 8 :>

(:>) : String -> ColType -> Column
(:>) = MkColumn

Schema : Type
Schema = List Column

data Row : Schema -> Type where
  Nil  : Row []
  (::) :  {0 name : String}
       -> {0 type : ColType}
       -> (v : IdrisType type)
       -> Row ss
       -> Row (name :> type :: ss)

data InSchema :  (name    : String)
              -> (schema  : Schema)
              -> (colType : ColType)
              -> Type where
  [search name schema]
  IsHere  : InSchema n (n :> t :: ss) t
  IsThere : InSchema n ss t -> InSchema n (fld :: ss) t

getAt :  {0 ss   : Schema}
      -> (name : String)
      -> Row ss
      -> (prf : InSchema name ss c)
      => IdrisType c
getAt name (v :: vs) {prf = IsHere}    = v
getAt name (_ :: vs) {prf = IsThere p} = getAt name vs

-- 1

Uninhabited (InSchema n [] c) where
  uninhabited IsHere impossible
  uninhabited (IsThere _) impossible

inSchema : (ss : Schema) -> (n : String) -> Dec (c ** InSchema n ss c)
inSchema []                    _ = No $ \(_ ** prf) => uninhabited prf
inSchema (MkColumn cn t :: xs) n = case decEq cn n of
  Yes Refl   => Yes (t ** IsHere)
  No  contra => case inSchema xs n of
    Yes (t ** prf) => Yes (t ** IsThere prf)
    No  contra2    => No $ \case (_ ** IsHere)    => contra Refl
                                 (t ** IsThere p) => contra2 (t ** p)

-- 2

updateAt : (name : String)
         -> Row ss
         -> (prf : InSchema name ss c)
         => (f : IdrisType c -> IdrisType c)
         -> Row ss
updateAt name (v :: vs) {prf = IsHere}    f = f v :: vs
updateAt name (v :: vs) {prf = IsThere p} f = v :: updateAt name vs f

-- 3

public export
data Elems : (xs,ys : List a) -> Type where
  ENil   : Elems [] ys
  EHere  : Elems xs ys -> Elems (x :: xs) (x :: ys)
  EThere : Elems xs ys -> Elems xs (y :: ys)

extract :  (0 s1 : Schema)
        -> (row : Row s2)
        -> (prf : Elems s1 s2)
        => Row s1
extract []       _         {prf = ENil}     = []
extract (_ :: t) (v :: vs) {prf = EHere x}  = v :: extract t vs
extract s1       (v :: vs) {prf = EThere x} = extract s1 vs

-- 4

namespace AllInSchema
  public export
  data AllInSchema :  (names : List String)
                   -> (schema : Schema)
                   -> (result : Schema)
                   -> Type where
    [search names schema]
    Nil  :  AllInSchema [] s []
    (::) :  InSchema n s c
         -> AllInSchema ns s res
         -> AllInSchema (n :: ns) s (n :> c :: res)

getAll :  {0 ss  : Schema}
       -> (names : List String)
       -> Row ss
       -> (prf : AllInSchema names ss res)
       => Row res
getAll []        _   {prf = []}     = []
getAll (n :: ns) row {prf = _ :: _} = getAt n row :: getAll ns row

--------------------------------------------------------------------------------
--          Use Case: Flexible Error Handling
--------------------------------------------------------------------------------

data Has :  (v : a) -> (vs  : Vect n a) -> Type where
  Z : Has v (v :: vs)
  S : Has v vs -> Has v (w :: vs)

Uninhabited (Has v []) where
  uninhabited Z impossible
  uninhabited (S _) impossible

data Union : Vect n Type -> Type where
  U : {0 ts : _} -> (ix : Has t ts) -> (val : t) -> Union ts

Uninhabited (Union []) where
  uninhabited (U ix _) = absurd ix

0 Err : Vect n Type -> Type -> Type
Err ts t = Either (Union ts) t

-- 1

project : (0 t : Type) -> (prf : Has t ts) => Union ts -> Maybe t
project t {prf = Z}   (U Z val)     = Just val
project t {prf = S p} (U (S x) val) = project t (U x val)
project t {prf = Z}   (U (S x) val) = Nothing
project t {prf = S p} (U Z val)     = Nothing

project1 : Union [t] -> t
project1 (U Z val) = val
project1 (U (S x) val) impossible

safe : Err [] a -> a
safe (Right x) = x
safe (Left x)  = absurd x

-- 2

weakenHas : Has t ts -> Has t (ts ++ ss)
weakenHas Z     = Z
weakenHas (S x) = S (weakenHas x)

weaken : Union ts -> Union (ts ++ ss)
weaken (U ix val) = U (weakenHas ix) val

extendHas : {m : _} -> {0 pre : Vect m a} -> Has t ts -> Has t (pre ++ ts)
extendHas {m = Z}   {pre = []}     x = x
extendHas {m = S p} {pre = _ :: _} x = S (extendHas x)

extend : {m : _} -> {0 pre : Vect m _} -> Union ts -> Union (pre ++ ts)
extend (U ix val) = U (extendHas ix) val

-- 3

0 Errs : Vect m Type -> Vect n Type -> Type
Errs []        _  = ()
Errs (x :: xs) ts = (Has x ts, Errs xs ts)

inject : Has t ts => (v : t) -> Union ts
inject v = U %search v

embed : (prf : Errs ts ss) => Union ts -> Union ss
embed (U Z val)     = inject val
embed (U (S x) val) = embed (U x val)

-- 4

data Rem : (v : a) -> (vs : Vect (S n) a) -> (rem : Vect n a) -> Type where
  [search v vs]
  RZ : Rem v (v :: rem) rem
  RS : Rem v vs rem -> Rem v (w :: vs) (w :: rem)

split : (prf : Rem t ts rem) => Union ts -> Either t (Union rem)
split {prf = RZ}   (U Z     val) = Left val
split {prf = RZ}   (U (S x) val) = Right (U x val)
split {prf = RS p} (U Z     val) = Right (U Z val)
split {prf = RS p} (U (S x) val) = case split {prf = p} (U x val) of
  Left vt        => Left vt
  Right (U ix y) => Right $ U (S ix) y

handle :  Applicative f
       => Rem t ts rem
       => (h : t -> f (Err rem a))
       -> Err ts a
       -> f (Err rem a)
handle h (Left x)  = case split x of
  Left v    => h v
  Right err => pure $ Left err
handle _ (Right x) = pure $ Right x

--------------------------------------------------------------------------------
--          Tests
--------------------------------------------------------------------------------

EmployeeSchema : Schema
EmployeeSchema = [ "firstName"  :> Str
                 , "lastName"   :> Str
                 , "email"      :> Str
                 , "age"        :> I64
                 , "salary"     :> Float
                 , "management" :> Boolean
                 ]

0 Employee : Type
Employee = Row EmployeeSchema

hock : Employee
hock = [ "Stefan", "Höck", "hock@foo.com", 46, 5443.2, False ]

shoeck : String
shoeck = getAt "firstName" hock ++ " " ++ getAt "lastName" hock

shoeck2 : String
shoeck2 = case getAll ["firstName", "lastName", "age"] hock of
  [fn,ln,a] => "\{fn} \{ln}: \{show a} years old."

embedTest :  Err [Nat,Bits8] a
          -> Err [String, Bits8, Int32, Nat] a
embedTest = mapFst embed
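
-- Usage sketch (not part of the original solutions; `hockAge` is a
-- hypothetical name): the `InSchema` proof is found by proof search, and
-- the result type is computed from the column type `I64`.
hockAge : Int64
hockAge = getAt "age" hock -- 46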

src/Solutions/Prim.idr

module Solutions.Prim

import Data.Bits
import Data.DPair
import Data.List
import Data.Maybe
import Data.SnocList
import Decidable.Equality

%default total

--------------------------------------------------------------------------------
--          Working with Strings
--------------------------------------------------------------------------------

-- 1

map : (Char -> Char) -> String -> String
map f = pack . map f . unpack

filter : (Char -> Bool) -> String -> String
filter f = pack . filter f . unpack

mapMaybe : (Char -> Maybe Char) -> String -> String
mapMaybe f = pack . mapMaybe f . unpack

-- 2

foldl : (a -> Char -> a) -> a -> String -> a
foldl f v = foldl f v . unpack

foldMap : Monoid m => (Char -> m) -> String -> m
foldMap f = foldMap f . unpack

-- 3

traverse : Applicative f => (Char -> f Char) -> String -> f String
traverse fun = map pack . traverse fun . unpack

-- 4
(>>=) : String -> (Char -> String) -> String
str >>= f = foldMap f $ unpack str

--------------------------------------------------------------------------------
--          Integers
--------------------------------------------------------------------------------

-- 1

record And a where
  constructor MkAnd
  value : a

Bits a => Semigroup (And a) where
  MkAnd x <+> MkAnd y = MkAnd $ x .&. y

Bits a => Monoid (And a) where
  neutral = MkAnd oneBits

-- 2

record Or a where
  constructor MkOr
  value : a

Bits a => Semigroup (Or a) where
  MkOr x <+> MkOr y = MkOr $ x .|. y

Bits a => Monoid (Or a) where
  neutral = MkOr zeroBits

-- 3

even : Bits64 -> Bool
even x = not $ testBit x 0

-- 4

binChar : Bits64 -> Char
binChar x = if testBit x 0 then '1' else '0'

toBin : Bits64 -> String
toBin 0 = "0"
toBin v = go [] v
  where go : List Char -> Bits64 -> String
        go cs 0 = pack cs
        go cs v = go (binChar v :: cs) (assert_smaller v $ v `shiftR` 1)

-- 5

-- Note: We know that `x .&. 15` must be a value in the range
-- [0,15] (unless there is a bug in the backend we use), but since
-- `Bits64` is a primitive, Idris can't know this. We therefore
-- fail with a runtime crash in the impossible case, but annotate the
-- call to `idris_crash` with `assert_total` (otherwise, `hexChar` would
-- be a partial function).
hexChar : Bits64 -> Char
hexChar x = case x .&. 15 of
  0  => '0'
  1  => '1'
  2  => '2'
  3  => '3'
  4  => '4'
  5  => '5'
  6  => '6'
  7  => '7'
  8  => '8'
  9  => '9'
  10 => 'a'
  11 => 'b'
  12 => 'c'
  13 => 'd'
  14 => 'e'
  15 => 'f'
  x  => assert_total $ idris_crash "IMPOSSIBLE: Invalid hex digit (\{show x})"

toHex : Bits64 -> String
toHex 0 = "0"
toHex v = go [] v
  where go : List Char -> Bits64 -> String
        go cs 0 = pack cs
        go cs v = go (hexChar v :: cs) (assert_smaller v $ v `shiftR` 4)
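
-- Usage sketch (not part of the original solutions; `binExample` and
-- `hexExample` are hypothetical names): both conversions peel off the
-- lowest digit and prepend it to the accumulator, so the resulting string
-- lists the most significant digit first.
binExample : String
binExample = toBin 5   -- "101"

hexExample : String
hexExample = toHex 255 -- "ff"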

--------------------------------------------------------------------------------
--          Refined Primitives
--------------------------------------------------------------------------------

data Dec0 : (prop : Type) -> Type where
  Yes0 : (0 prf : prop) -> Dec0 prop
  No0  : (0 contra : prop -> Void) -> Dec0 prop

data IsYes0 : (d : Dec0 prop) -> Type where
  ItIsYes0 : IsYes0 (Yes0 prf)

0 fromYes0 : (d : Dec0 prop) -> (0 prf : IsYes0 d) => prop
fromYes0 (Yes0 x) = x
fromYes0 (No0 contra) impossible

interface Decidable (0 a : Type) (0 p : a -> Type) | p where
  decide : (v : a) -> Dec0 (p v)

decideOn : (0 p : a -> Type) -> Decidable a p => (v : a) -> Dec0 (p v)
decideOn _ = decide

test0 : (b : Bool) -> Dec0 (b === True)
test0 True  = Yes0 Refl
test0 False = No0 absurd

0 unsafeDecideOn : (0 p : a -> Type) -> Decidable a p => (v : a) -> p v
unsafeDecideOn p v = case decideOn p v of
  Yes0 prf => prf
  No0  _   =>
    assert_total $ idris_crash "Unexpected refinement failure in `unsafeDecideOn`"

0 safeDecideOn :  (0 p : a -> Type)
               -> Decidable a p
               => (v : a)
               -> (0 prf : IsYes0 (decideOn p v))
               => p v
safeDecideOn p v = fromYes0 $ decideOn p v

-- 1

{x : a} -> DecEq a => Decidable a (Equal x) where
  decide v = case decEq x v of
    Yes prf   => Yes0 prf
    No contra => No0 contra

-- 2

data Neg : (p : a -> Type) -> a -> Type where
  IsNot : {0 p : a -> Type} -> (contra : p v -> Void) -> Neg p v

Decidable a p => Decidable a (Neg p) where
  decide v = case decideOn p v of
    Yes0 prf   => No0 $ \(IsNot contra) => contra prf
    No0 contra => Yes0 $ IsNot contra

-- 3

data (&&) : (p,q : a -> Type) -> a -> Type where
  Both : {0 p,q : a -> Type} -> (prf1 : p v) -> (prf2 : q v) -> (&&) p q v

Decidable a p => Decidable a q => Decidable a (p && q) where
  decide v = case decideOn p v of
    Yes0 prf1 => case decideOn q v of
      Yes0 prf2   => Yes0 $ Both prf1 prf2
      No0  contra => No0 $ \(Both _ prf2) => contra prf2
    No0  contra => No0 $ \(Both prf1 _) => contra prf1

-- 4

data (||) : (p,q : a -> Type) -> a -> Type where
  L : {0 p,q : a -> Type} -> (prf : p v) -> (p || q) v
  R : {0 p,q : a -> Type} -> (prf : q v) -> (p || q) v

Decidable a p => Decidable a q => Decidable a (p || q) where
  decide v = case decideOn p v of
    Yes0 prf1    => Yes0 $ L prf1
    No0  contra1 => case decideOn q v of
      Yes0 prf2    => Yes0 $ R prf2
      No0  contra2 => No0 $ \case L prf => contra1 prf
                                  R prf => contra2 prf

-- 5

negOr : Neg (p || q) v -> (Neg p && Neg q) v
negOr (IsNot contra) = Both (IsNot $ contra . L) (IsNot $ contra . R)

andNeg : (Neg p && Neg q) v -> Neg (p || q) v
andNeg (Both (IsNot c1) (IsNot c2)) =
  IsNot $ \case L p1 => c1 p1
                R p2 => c2 p2

orNeg : (Neg p || Neg q) v -> Neg (p && q) v
orNeg (L (IsNot contra)) = IsNot $ \(Both p1 _) => contra p1
orNeg (R (IsNot contra)) = IsNot $ \(Both _ p2) => contra p2

0 negAnd :  Decidable a p
         => Decidable a q
         => Neg (p && q) v
         -> (Neg p || Neg q) v
negAnd (IsNot contra) = case decideOn p v of
  Yes0 p1 => case decideOn q v of
    Yes0 p2 => void (contra $ Both p1 p2)
    No0 c   => R $ IsNot c
  No0 c    => L $ IsNot c

-- 6

data (<=) : (m,n : Nat) -> Type where
  ZLTE : 0 <= n
  SLTE : m <= n -> S m <= S n

(>=) : (m,n : Nat) -> Type
m >= n = n <= m

(<) : (m,n : Nat) -> Type
m < n = S m <= n

(>) : (m,n : Nat) -> Type
m > n = n < m

LessThan : (m,n : Nat) -> Type
LessThan m = (< m)

To : (m,n : Nat) -> Type
To m = (<= m)

GreaterThan : (m,n : Nat) -> Type
GreaterThan m = (> m)

From : (m,n : Nat) -> Type
From m = (>= m)

FromTo : (lower,upper : Nat) -> Nat -> Type
FromTo l u = From l && To u

Between : (lower,upper : Nat) -> Nat -> Type
Between l u = GreaterThan l && LessThan u

Uninhabited (S n <= 0) where
  uninhabited ZLTE impossible
  uninhabited (SLTE _) impossible

0 fromLTE : (n1,n2 : Nat) -> (n1 <= n2) === True -> n1 <= n2
fromLTE 0     n2    prf = ZLTE
fromLTE (S k) (S j) prf = SLTE $ fromLTE k j prf
fromLTE (S k) 0     prf = absurd prf

0 toLTE : (n1,n2 : Nat) -> n1 <= n2 -> (n1 <= n2) === True
toLTE 0     0     _        = Refl
toLTE 0     (S k) _        = Refl
toLTE (S k) (S j) (SLTE x) = toLTE k j x
toLTE (S k) 0     x        = absurd x

{n : Nat} -> Decidable Nat (<= n) where
  decide m = case test0 (m <= n) of
    Yes0 prf   => Yes0 $ fromLTE m n prf
    No0 contra => No0 $ contra . toLTE m n

{m : Nat} -> Decidable Nat (m <=) where
  decide n = case test0 (m <= n) of
    Yes0 prf   => Yes0 $ fromLTE m n prf
    No0 contra => No0 $ contra . toLTE m n

-- 7

0 refl : {n : Nat} -> n <= n
refl {n = 0}   = ZLTE
refl {n = S _} = SLTE refl

0 trans : {l,m,n : Nat} -> l <= m -> m <= n -> l <= n
trans {l = 0}   _        _        = ZLTE
trans {l = S _} (SLTE x) (SLTE y) = SLTE $ trans x y

0 (>>) : {l,m,n : Nat} -> l <= m -> m <= n -> l <= n
(>>) = trans

-- 8

0 toIsSucc : {n : Nat} -> n > 0 -> IsSucc n
toIsSucc {n = S _} (SLTE _) = ItIsSucc

0 fromIsSucc : {n : Nat} -> IsSucc n -> n > 0
fromIsSucc {n = S _} ItIsSucc = SLTE ZLTE

-- 9

safeDiv : (x,y : Bits64) -> (0 prf : cast y > 0) => Bits64
safeDiv x y = x `div` y

safeMod :  (x,y : Bits64)
        -> (0 prf : cast y > 0)
        => Subset Bits64 (\v => cast v < cast y)
safeMod x y = Element (x `mod` y) (unsafeDecideOn (<= cast y) _)

-- 10

digit : (v : Bits64) -> (0 prf : cast v < 16) => Char
digit 0  = '0'
digit 1  = '1'
digit 2  = '2'
digit 3  = '3'
digit 4  = '4'
digit 5  = '5'
digit 6  = '6'
digit 7  = '7'
digit 8  = '8'
digit 9  = '9'
digit 10 = 'a'
digit 11 = 'b'
digit 12 = 'c'
digit 13 = 'd'
digit 14 = 'e'
digit 15 = 'f'
digit x  = assert_total $ idris_crash "IMPOSSIBLE: Invalid digit (\{show x})"

record Base where
  constructor MkBase
  value : Bits64
  0 prf : FromTo 2 16 (cast value)

base : Bits64 -> Maybe Base
base v = case decideOn (FromTo 2 16) (cast v) of
  Yes0 prf => Just $ MkBase v prf
  No0  _   => Nothing

namespace Base
  public export
  fromInteger : (v : Integer) -> {auto 0 _ : IsJust (base $ cast v)} -> Base
  fromInteger v = fromJust $ base (cast v)

digits : Bits64 -> Base -> String
digits 0 _ = "0"
digits x (MkBase b $ Both p1 p2) = go [] x
  where go : List Char -> Bits64 -> String
        go cs 0 = pack cs
        go cs v =
          let Element d p = (v `safeMod` b) {prf = %search >> p1}
              v2          = (v `safeDiv` b) {prf = %search >> p1}
           in go (digit d {prf = p >> p2} :: cs) (assert_smaller v v2)

-- 11

data CharOrd : (p : Nat -> Type) -> Char -> Type where
  IsCharOrd : {0 p : Nat -> Type} -> (prf : p (cast c)) -> CharOrd p c

Decidable Nat p => Decidable Char (CharOrd p) where
  decide c = case decideOn p (cast c) of
    Yes0 prf   => Yes0 $ IsCharOrd prf
    No0 contra => No0 $ \(IsCharOrd prf) => contra prf

-- 12

IsAscii : Char -> Type
IsAscii = CharOrd (< 128)

IsLatin : Char -> Type
IsLatin = CharOrd (< 255)

IsUpper : Char -> Type
IsUpper = CharOrd (FromTo (cast 'A') (cast 'Z'))

IsLower : Char -> Type
IsLower = CharOrd (FromTo (cast 'a') (cast 'z'))

IsAlpha : Char -> Type
IsAlpha = IsUpper || IsLower

IsDigit : Char -> Type
IsDigit = CharOrd (FromTo (cast '0') (cast '9'))

IsAlphaNum : Char -> Type
IsAlphaNum = IsAlpha || IsDigit

IsControl : Char -> Type
IsControl = CharOrd (FromTo 0 31 || FromTo 127 159)

IsPlainAscii : Char -> Type
IsPlainAscii = IsAscii && Neg IsControl

IsPlainLatin : Char -> Type
IsPlainLatin = IsLatin && Neg IsControl

-- 12

0 plainToAscii : IsPlainAscii c -> IsAscii c
plainToAscii (Both prf1 _) = prf1

0 digitToAlphaNum : IsDigit c -> IsAlphaNum c
digitToAlphaNum = R

0 alphaToAlphaNum : IsAlpha c -> IsAlphaNum c
alphaToAlphaNum = L

0 lowerToAlpha : IsLower c -> IsAlpha c
lowerToAlpha = R

0 upperToAlpha : IsUpper c -> IsAlpha c
upperToAlpha = L

0 lowerToAlphaNum : IsLower c -> IsAlphaNum c
lowerToAlphaNum = L . R

0 upperToAlphaNum : IsUpper c -> IsAlphaNum c
upperToAlphaNum = L . L

0 asciiToLatin : IsAscii c -> IsLatin c
asciiToLatin (IsCharOrd x) = IsCharOrd (trans x $ safeDecideOn _ _)

0 plainAsciiToPlainLatin : IsPlainAscii c -> IsPlainLatin c
plainAsciiToPlainLatin (Both x y) = Both (asciiToLatin x) y

-- 13

data Head : (p : a -> Type) -> List a -> Type where
  AtHead : {0 p : a -> Type} -> (0 prf : p v) -> Head p (v :: vs)

Uninhabited (Head p []) where
  uninhabited (AtHead _) impossible

Decidable a p => Decidable (List a) (Head p) where
  decide []        = No0 $ \prf => absurd prf
  decide (x :: xs) = case decide {p} x of
    Yes0 prf    => Yes0 $ AtHead prf
    No0  contra => No0 $ \(AtHead prf) => contra prf

-- 14

data Length : (p : Nat -> Type) -> List a -> Type where
  HasLength :  {0 p : Nat -> Type}
            -> (0 prf : p (List.length vs))
            -> Length p vs

Decidable Nat p => Decidable (List a) (Length p) where
  decide vs = case decideOn p (length vs) of
    Yes0 prf   => Yes0 $ HasLength prf
    No0 contra => No0 $ \(HasLength prf) => contra prf

-- 15

data All : (p : a -> Type) -> (as : List a) -> Type where
  Nil  : All p []
  (::) :  {0 p : a -> Type}
       -> (0 h : p v)
       -> (0 t : All p vs)
       -> All p (v :: vs)

data AllSnoc : (p : a -> Type) -> (as : SnocList a) -> Type where
  Lin  : AllSnoc p [<]
  (:<) :  {0 p : a -> Type}
       -> (0 i : AllSnoc p vs)
       -> (0 l : p v)
       -> AllSnoc p (vs :< v)

0 head : All p (x :: xs) -> p x
head (h :: _) = h

0 (<>>) : AllSnoc p sx -> All p xs -> All p (sx <>> xs)
(<>>) [<]      y = y
(<>>) (i :< l) y = i <>> l :: y

0 suffix : (sx : SnocList a) -> All p (sx <>> xs) -> All p xs
suffix [<]       x = x
suffix (sx :< y) x = let (_ :: t) = suffix {xs = y :: xs} sx x in t

0 notInner :  {0 p : a -> Type}
           -> (sx : SnocList a)
           -> (0 contra : (prf : p x) -> Void)
           -> (0 prfs : All p (sx <>> x :: xs))
           -> Void
notInner sx contra prfs = let prfs2 = suffix sx prfs in contra (head prfs2)

allTR : {0 p : a -> Type} -> Decidable a p => (as : List a) -> Dec0 (All p as)
allTR as = go Lin as
  where go : (0 sp : AllSnoc p sx) -> (xs : List a) -> Dec0 (All p (sx <>> xs))
        go sp []        = Yes0 $ sp <>> Nil
        go sp (x :: xs) = case decide {p} x of
          Yes0 prf    => go (sp :< prf) xs
          No0  contra => No0 $ \prf => notInner sx contra prf

Decidable a p => Decidable (List a) (All p) where decide = allTR

-- 16

0 IsIdentChar : Char -> Type
IsIdentChar = IsAlphaNum || Equal '_'

0 IdentChars : List Char -> Type
IdentChars = Length (<= 100) && Head IsAlpha && All IsIdentChar

record Identifier where
  constructor MkIdentifier
  value : String
  0 prf : IdentChars (unpack value)

identifier : String -> Maybe Identifier
identifier s = case decideOn IdentChars (unpack s) of
  Yes0 prf => Just $ MkIdentifier s prf
  No0  _   => Nothing

namespace Identifier
  public export
  fromString :  (s : String)
             -> (0 _ : IsYes0 (decideOn IdentChars (unpack s)))
             => Identifier
  fromString s = MkIdentifier s (fromYes0 $ decide (unpack s))

testIdent : Identifier
testIdent = "fooBar_123"
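
-- Usage sketch (not part of the original solutions; `notAnIdent` is a
-- hypothetical name): the literal above is validated at compile time via
-- `fromString`, while `identifier` runs the same check at runtime. This
-- input should be rejected because its first character is not a letter.
notAnIdent : Maybe Identifier
notAnIdent = identifier "1abc" -- Nothing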