> {-# OPTIONS_GHC -Wall #-}
> {-# LANGUAGE GeneralizedNewtypeDeriving #-}
>
> import Control.Monad (when)
> import Data.Char

IO
==

CIS 194 Week 8, 1 March 2012

Suggested reading:

-   [LYAH Chapter 9: Input and
    Output](http://learnyouahaskell.com/input-and-output)
-   [RWH Chapter 7: I/O](http://book.realworldhaskell.org/read/io.html)

First, finishing up a few things from last week:

What’s the point of `Monoid`?
-----------------------------

What’s the use of the `Monoid` type class we saw last week?

Type synonyms and newtypes
--------------------------

Suppose we have a type `T` and we want to make another type which is
“the same as” `T`. We have two options.

**Type synonyms**

The first option is to make a *type synonym*, introduced with the `type`
keyword:

~~~~ {.haskell}
type S = T
~~~~

This creates `S` as a type synonym for `T`. `S` is simply a *different
name* for `T`, so they can be used interchangeably.

When would we want to do this?

-   As abbreviation: `T` is long and frequently used, so we want to give
    it a shorter name `S` in order to save us some typing (and make type
    signatures easier to read).
-   As documentation: for example, suppose a function takes three
    `String`s, where the first two represent names and the third
    represents a message. We could write

    > type Name = String
    > type Message = String
    >
    > f :: Name -> Name -> Message -> Int

    This way it would be more obvious to someone reading the type of `f`
    what arguments it expects, but they can still pass `String` values
    to `f` as before.
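
As a small sketch of the documentation use: the body of `f` below is a
hypothetical one invented for illustration (the notes give only its
type). The point is that a plain `String` is accepted wherever a `Name`
or `Message` is expected, because the synonyms are just other names for
`String`.

```haskell
type Name    = String
type Message = String

-- Hypothetical body, for illustration only: the notes give just the type.
f :: Name -> Name -> Message -> Int
f from to msg = length from + length to + length msg

-- An ordinary String, declared without mentioning the synonyms.
plain :: String
plain = "hello"

main :: IO ()
main = print (f "alice" plain "hi")   -- prints 12
```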

**Newtypes**

The other option is to create a `newtype`, like this:

~~~~ {.haskell}
newtype N = C T
~~~~

The idea is that this creates a “new type” `N` which is *isomorphic* to
`T` but is a separate type—unlike with type synonyms, the compiler will
complain if we mix them up. We cannot pass a value of type `T` as an
argument when an `N` is expected, or vice versa. In order to convert
between them we need to use the constructor `C`—either applying it (to
convert from `T` to `N`) or pattern-matching on it (to convert from `N`
to `T`).
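
A minimal sketch of both conversions, assuming `Int` as a stand-in for
the abstract type `T`; the helper names `toN` and `fromN` are made up
for this example.

```haskell
-- Int stands in for the notes' abstract type T (an assumption).
newtype N = C Int

-- Convert from the underlying type to N by applying the constructor.
toN :: Int -> N
toN x = C x

-- Convert from N back to the underlying type by pattern-matching.
fromN :: N -> Int
fromN (C x) = x

main :: IO ()
main = print (fromN (toN 3) + 1)   -- prints 4
```

Writing `toN 3 + 1` instead would be rejected by the type checker,
since `N` has no `Num` instance here.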

This has several common uses:

1.  To get the compiler’s help distinguishing between types we do not
    want to mix up. For example, if we have two types, one for
    representing meters and one for representing feet, we probably want
    to represent them as

~~~~ {.haskell}
newtype Meters = Meters Double
newtype Feet   = Feet Double
~~~~

    This way, if we accidentally use a value in meters where we meant to
    use one in feet, we will get a compiler error instead of having our
    $125 million Mars probe crash.

2.  As we saw last week, if we want to give multiple type class
    instances to the same type, we can instead wrap the type in several
    different `newtype`s and give an instance for each.
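
The first use can be sketched as follows. The functions `feetToMeters`
and `addMeters` are hypothetical helpers, not part of the notes; they
only show that a conversion between the two units has to be written out
explicitly.

```haskell
newtype Meters = Meters Double deriving (Show)
newtype Feet   = Feet Double   deriving (Show)

-- Explicit conversion: unwrap Feet, scale, rewrap as Meters.
feetToMeters :: Feet -> Meters
feetToMeters (Feet ft) = Meters (ft * 0.3048)

-- Only accepts Meters; passing a Feet value here is a type error.
addMeters :: Meters -> Meters -> Meters
addMeters (Meters a) (Meters b) = Meters (a + b)

main :: IO ()
main = print (addMeters (Meters 1) (feetToMeters (Feet 10)))
-- addMeters (Meters 1) (Feet 10) would not compile.
```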

So why use `newtype` instead of just

~~~~ {.haskell}
data N = C T
~~~~

? There are several ways in which `newtype` differs from `data`:

1.  `newtype`s may only have a single constructor with a single
    argument. This may seem like an annoying restriction, but the point
    is that…

2.  `newtype`s have *no run-time cost*. That is, at run-time, values of
    the types `N` and `T` will be represented *identically* in memory.
    If we had instead written `data N = C T`, values of type `N` would be
    paired with a “tag” to indicate the constructor `C`. Since
    `newtype`s can only have a single constructor with a single value
    inside it, there is no need to actually store the constructor.

3.  GHC has an extension called `GeneralizedNewtypeDeriving` which
    allows one to automatically derive type class instances for a
    `newtype` based on instances for the underlying type. For example,
    instead of writing

~~~~ {.haskell}
newtype Moo = Moo Int

instance Num Moo where
  (Moo x) + (Moo y) = Moo (x + y)
  (Moo x) * (Moo y) = Moo (x * y)
  abs (Moo x)       = Moo (abs x)
  ...
~~~~

    we can just write

~~~~ {.haskell}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

newtype Moo = Moo Int
  deriving (Num)
~~~~

Record syntax
-------------

Suppose we have a data type such as

~~~~ {.haskell}
data D = C T1 T2 T3
~~~~

We could also declare this data type with *record syntax* as follows:

~~~~ {.haskell}
data D = C { field1 :: T1, field2 :: T2, field3 :: T3 }
~~~~

where we specify not just a type but also a *name* for each field stored
inside the `C` constructor. This new version of `D` can be used in all
the same ways as the old version (in particular we can still construct
and pattern-match on values of type `D` as `C v1 v2 v3`). However, we
get some additional benefits.

1.  Each field name is automatically a *projection function* which gets
    the value of that field out of a value of type `D`. For example,
    `field2` is a function of type

~~~~ {.haskell}
field2 :: D -> T2
~~~~

    Before, we would have had to implement `field2` ourselves by writing

~~~~ {.haskell}
field2 (C _ f _) = f
~~~~

    This gets rid of a lot of boilerplate if we have a data type with
    many fields!

2.  There is special syntax for *constructing*, *modifying*, and
    *pattern-matching* on values of type `D` (in addition to the usual
    syntax for such things).

    We can *construct* a value of type `D` using syntax like

~~~~ {.haskell}
C { field3 = ..., field1 = ..., field2 = ... }
~~~~

    with the `...` filled in by expressions of the right type. Note that
    we can specify the fields in any order.

    Suppose we have a value `d :: D`. We can *modify* `d` using syntax
    like

~~~~ {.haskell}
d { field3 = ... }
~~~~

    Of course, by “modify” we don’t mean actually mutating `d`, but
    rather constructing a new value of type `D` which is the same as `d`
    except with the `field3` field replaced by the given value.

    Finally, we can *pattern-match* on values of type `D` like so:

~~~~ {.haskell}
foo (C { field1 = x }) = ... x ...
~~~~

    This matches only on the `field1` field from the `D` value, calling
    it `x` (of course, in place of `x` we could also put an arbitrary
    pattern), ignoring the other fields.
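
The sketch below puts the pieces together. The `Person` type and its
fields are hypothetical, chosen to stand in for the abstract `D`, `C`,
and `field1` to `field3` used above.

```haskell
-- Hypothetical record standing in for the notes' abstract type D.
data Person = Person { name :: String, age :: Int, email :: String }
  deriving (Show)

-- Construction: fields may be given in any order.
alice :: Person
alice = Person { email = "alice@example.com", name = "Alice", age = 30 }

-- "Modification": builds a new Person; alice itself is unchanged.
older :: Person
older = alice { age = 31 }

-- Pattern-matching on a single field, ignoring the others.
greeting :: Person -> String
greeting (Person { name = n }) = "Hello, " ++ n

main :: IO ()
main =
  putStrLn (greeting older) >>     -- prints Hello, Alice
  print (age alice, age older)     -- prints (30,31)
```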

Now onwards to `IO`!

The problem with purity
-----------------------

Remember that Haskell is *lazy* and therefore *pure*. This means two
primary things:

1.  Functions may not have any external effects. For example, a function
    may not print anything on the screen. Functions may only compute
    their outputs.

2.  Functions may not depend on external stuff. For example, they may
    not read from the keyboard, or filesystem, or network. Functions may
    depend only on their inputs—put another way, functions should give
    the same output for the same input every time.

But—sometimes we *do* want to be able to do stuff like this! If the only
thing we could do with Haskell is write functions which we can then
evaluate at the ghci prompt, it would be theoretically interesting but
practically useless.

In fact, it *is* possible to do these sorts of things with Haskell, but
it looks very different from how it is done in most other languages.

The `IO` type
-------------

Haskell has a special monad called `IO` which encapsulates I/O
operations (like printing to the screen, reading from or writing to
disk, communicating over a network…). A value of type `IO a` represents a
*description of* a computation which, *when executed by Haskell’s
runtime system*, will (possibly) perform some I/O operations and
(eventually) produce a value of type `a`. There is a level of
indirection here that’s crucial to understand. A value of type `IO a`,
*in and of itself*, is just an inert, perfectly safe thing with no
effects. It is only when it gets *run* that effects are produced.

As an illustration, suppose you have

    c :: Cake

What do you have? Why, a delicious cake, of course. Plain and simple. By
contrast, suppose you have

    r :: Recipe Cake

What do you have? A cake? No, you have some *instructions* for how to
make a cake, just a sheet of paper with some writing on it. Not only do
you not actually have a cake, merely being in possession of the recipe
has no effect on anything else whatsoever. Simply holding the recipe in
your hand does not cause your oven to get hot or flour to be spilled all
over your floor or anything of that sort. To actually produce a cake,
the recipe must be *followed* (causing flour to be spilled, ingredients
mixed, the oven to get hot, etc.).

In the same way, a value of type `IO a` is just a “recipe” for producing
a value of type `a` (and possibly having some effects along the way).
Like any other value, it can be passed as an argument, returned as the
output of a function, stored in a data structure, or (using the `(>>=)`
operator) combined with other `IO` values into more complex recipes.

So, how do values of type `IO a` actually get *executed* by the Haskell
runtime? There is only one way: the Haskell compiler looks for a special
value

    main :: IO ()

which will actually get handed to the runtime system and executed.
That’s it! Of course, `main` can be arbitrarily complicated, and will
usually be composed of many smaller `IO` computations.
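
To make the "recipe" idea concrete, here is a small sketch in which
`IO` values are stored in a list, inspected without being run, and only
then combined into `main`. The list `recipes` is a made-up name;
`sequence_` is the standard Prelude function that runs a list of
actions in order.

```haskell
-- Three inert recipes sitting in an ordinary list; nothing is printed
-- merely because this list exists.
recipes :: [IO ()]
recipes = [putStrLn "mix", putStrLn "bake", putStrLn "eat"]

main :: IO ()
main =
  print (length recipes) >>      -- prints 3; no recipe has been run yet
  sequence_ (reverse recipes)    -- now runs them: eat, bake, mix
```

Only the actions that end up as part of `main` are ever executed, and
in the order in which `main` combines them.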

So let’s write our first actual, executable Haskell program! We can use
the function

    putStrLn :: String -> IO ()

which, given a `String`, returns an `IO` computation that will (when
run) print out that `String` on the screen. So we simply put this in a
file called `Hello.hs`:

    main = putStrLn "Hello, Haskell!"

Then typing `runhaskell Hello.hs` at a command-line prompt results in
our message getting printed to the screen! We can also use
`ghc --make Hello.hs` to produce an executable version called `Hello`
(or `Hello.exe` on Windows).

How about a program to take whatever the user types and echo it back in
all uppercase?

> uppercaseify :: IO ()
> uppercaseify = getLine >>= \l -> putStrLn (map toUpper l)

…forever?

> uppercaseify2 :: IO ()
> uppercaseify2 = getLine >>= \l -> putStrLn (map toUpper l) >> uppercaseify2

…until the user types “quit”?

> uppercaseify3 :: IO ()
> uppercaseify3 = getLine >>= \l ->
>                   if (l /= "quit")
>                     then putStrLn (map toUpper l) >> uppercaseify3
>                     else return ()

The same loop, written with `when` from `Control.Monad`:

> uppercaseify3a :: IO ()
> uppercaseify3a = getLine >>= \l ->
>                    when (l /= "quit") $
>                      putStrLn (map toUpper l) >> uppercaseify3a

How about writing the processed contents of one file to another file?

> processFile :: IO ()
> processFile = putStrLn "Please enter an input file name: " >>
>                 getLine >>= \input ->
>                   putStrLn "And an output file name: " >>
>                     getLine >>= \output ->
>                       readFile input >>= \str ->
>                         writeFile output (process str)
>
> process :: String -> String
> process = map toUpper . filter (not . isSpace)

OK, seriously? That indentation is really annoying. Why not just write
it like this:

> processFile2 :: IO ()
> processFile2 = putStrLn "Please enter an input file name: " >>
>                getLine >>= \input ->
>                putStrLn "And an output file name: " >>
>                getLine >>= \output ->
>                readFile input >>= \str ->
>                writeFile output (process str)
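
Both layouts mean the same thing because the indentation plays no role
here: the body of a lambda extends as far to the right as possible, and
`>>` and `>>=` associate to the left. As a sketch, the version below
(with the made-up name `processFileParens`) writes out the parentheses
that both layouts imply.

```haskell
import Data.Char (isSpace, toUpper)

process :: String -> String
process = map toUpper . filter (not . isSpace)

-- Same program as processFile, with the implied grouping made explicit.
processFileParens :: IO ()
processFileParens =
  (putStrLn "Please enter an input file name: " >> getLine) >>= (\input ->
    (putStrLn "And an output file name: " >> getLine) >>= (\output ->
      readFile input >>= (\str ->
        writeFile output (process str))))

main :: IO ()
main = processFileParens
```

Each lambda body runs to the end of the whole expression, which is why
`input` and `output` are still in scope in the final `writeFile` call.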

Hmm, this is starting to look sort of like a Java program…

* * * * *

`Generated 2012-03-01 16:39:34.491923`