Reflecting on Haskell in 2017

Alas, another year has come and gone. It feels like just yesterday I was writing
the last reflection blog post on my flight back to Boston for Christmas. I've
spent most of the last year traveling and working in Europe, meeting a lot of
new Haskellers and putting a lot of faces to names.

Haskell has had a great year and 2017 was defined by vast quantities of new
code, including 14,000 new Haskell projects on Github .
The amount of writing this year was voluminous and my list of interesting work
is eight times as large as last year. At least seven new companies came into
existence and many existing firms unexpectedly dropped large open source Haskell
projects into the public sphere. Driven by a lot of software catastrophes, the
intersection of security, software correctness and formal methods have been
become quite an active area of investment and research across both industry and
academia. It's really never been an easier and more exciting time to be
programming professionally in the world's most advanced (yet usable) statically
typed language.

Per what I guess is now a tradition, I will write my end of year retrospective
on my highlights of what happened in the Haskell scene in retrospect.

Writing

The amount of Haskell writing this year was vast and it's impossible to
enumerate all of the excellent writing that occurred. Some of the truly
exceptional bits that were drafted in 2017 included:

Books & Courses

Vladislav Zavialov and Artyom Kazak set out to write a book on the
netherlands of Intermediate Haskell development, a mythical place that we all
seemingly pass through but never speak of again. The project is intuitively called
Intermediate Haskell and is slated to be
released in 2018

Bartosz Milewski finished writing Category Theory for
Programmers

which is freely available and also generated as
PDF.

Brent Yorgey drafted a new course for teaching introduction to Computer Science
using Haskell at Hendrix
University
. Dmitry Kovanikov and
Arseniy Seroka also drafted a course for a wide variety of intermediate to
advanced Haskell topics
at ITMO
university. Some of which are in Russian but nevertheless большое письмо!

GHC

The Glorious Glasgow Haskell Compiler had it's 8.2 release, and landed to much
rejoicing. Major features such as unboxed sum types landed as planned in GHC
8.2. There were many longstanding issues that were closed and many miscellaneous
fixes contributed in 2017. For instance GHC now uses alternative linkers such as
ld.gold or ld.lld instead of the system default ld.

Semigroup is now a superclass of Monoid.

There was quite a bit of work on GHC cross
compilation
to ARM. The new build system
Hadrian has been in work for the past
three years, has was finally been merged into the GHC tree.

The DevOps Group has officially started and is being funded to help maintain the
infrastructure used to host Haskell packages and build GHC. The general push of
the group has been toward using hosted CI services, Appveyor and CircleCI and a
greater use of more transparent platforms such as Github for GHC development.

There is work on a major refactor of the AST types to use the new Trees that
Grow
research
to allow GHC API user to extend the AST for their own purposes. Eventually this
may allow the split of the AST types out of the ghc package, allowing tooling
authors, Template Haskell users, and the compiler itself to use the same AST
representation.

GHC is partially accepting pull requests on
Github although most of the development
still occurs on the mailing list and Phabricator.

There was significant engineering effort to allow GHC to produce deterministic
build artifacts
to
play nicely with caching reproducible build systems such as Buck and Bazel.
Previously the emitted .hi files would contain non-deterministic data such
as hashes, timestamps and unique name supply counts.

Errors

GHC 8.2 added wonderful new colorful error messages with caret diagnostics for
syntax and type errors:

Colorful errors GHC 8.2

Compact Regions

Support for ‘Compact Regions’ landed in GHC 8.2. Compact regions are manually
allocated regions where the data allocated inside it are compacted and not
traversed by the GC. This is amenable for long-lived data structures that are
resident in memory without mutation frequently occurring.

The interface can be accessed through the
ghc-compact modules and used
to create compact malloc'd memory.

import Control.DeepSeq
import Control.Exception
import qualified Data.ByteString.Char8 as B
import Data.Compact

main = compact (B.pack ['a'..'c'])

The test suite
inside GHC maintains the best illustrations of its use for complex non-traditional data structures.

Type.Reflection

GHC added a new more expressive Typeable mechanism using the
Type.Reflection
module. typeRep can be applied with explicit type applications to arrows and
functions can checked for type-level equality after application.

λ> typeRep @(->)
(->) 'LiftedRep 'LiftedRep
λ> typeRep @(Maybe Int) == App (typeRep @Maybe) (typeRep @Int)
True

Coercible

Not new to GHC 8.2 although the base library now exposes the Coercible
constraints allowing polymorphism over types which have the same runtime
representations to be safely coerced at runtime. This can also be extended to
polymorphic functions which take a compile-time proof of the equivalence of two
runtime data layouts.

import Data.Coerce

newtype Meters = Meters Integer deriving (Num)
newtype Feet = Feet Integer deriving (Num)

f :: (Coercible a b, Num b) => a -> b -> b
f x y = coerce x + y
λ> f (Meters 3) 4
7
λ> f (Feet 3) 4
7
λ> f 4 (Feet 3)
Feet 7

GHC tracks the usage (i.e. role) of type variables used as parameters as either
nominal representational or phantom allowing types that differ only in a
phantom type of nominal parameters to be safely coerced.

Join Points

Luke Maurer and Simon Peyton Jones merged new work on join
points
which modifies GHC
Core to optimize for join points in code. In Core, a join point is a specially
tagged function whose only occurrences are saturated tail calls. In the actual
GHC Core AST, a join point is simple bit of metadata indicated by IdDetails of
the binder.

Simon Peyton Jones presented the keynote at Haskell Exchange on his
collaboration on Compiling without
Continuation

which present the ideas and core optimizations that are allowed by the new join
points.

Deriving Strategies

In GHC 8.0 there were two alternative methods for automatic deriving of
typeclass instances, using GeneralizedNewtypeDeriving and DeriveAnyClass. In
addition there was also the wired-in deriving mechanisms for Show, Read, etc
that were hardcoded into the compiler. These all used the same syntax and would
conflict if multiple pragmas were enabled in a module.

The addition of DerivingStrategies allows us to disambiguate which deriving
mechanism to use for a specific instance generation.

newtype Meters = Meters Int
  deriving stock    (Read, Show)
  deriving newtype  (Num)
  deriving anyclass (ToJSON, FromJSON)

Backpack

Edward Yang finished his PhD thesis on Backpack which was integrated in the GHC
tree. The new branch adds support .bkp files, which specify abstract
interfaces which can be instantiated in modules and used to construct Haskell
modules which work polymorphically across multiple module instantiations.

For example an abstract string type can be written which operates over a
module parameter `Str``:

unit abstract-str where
    signature Str where
        data Str
        len :: Str -> Int

    module AStr (alen) where
        import Str

        alen :: Str -> Int
        alen = len

We can create (contrived) instantiations of this module for lists of ints and
chars which expose a polymorphic length function over both.

unit str-string where
    module Str where
        type Str = String

        len :: Str -> Int
        len = length

unit str-list where
    module Str where
        type Str = [Int]

        len :: Str -> Int
        len = length

The modules can then be opened as specific namespaces with the exported functions
able to be called over both module types.

unit main where
    dependency abstract-str[Str=str-string:Str] (AStr as AStr.Int)
    dependency abstract-str[Str=str-list:Str] (AStr as AStr.String)

    module Main (main) where
        import qualified AStr.Int
        import qualified AStr.String

        main :: IO ()
        main = do
            print $ AbstractStr.String.alen "Hello world"
            print $ AbstractStr.Int.alen [1, 2, 3]

With the latest GHC this can be compiled with the --backpack which generates
the sum of all the hi files specified in the .bkp file and resolves
values at link-time.

$ stack exec -- ghc --backpack example.bkp

While the functionality exists today, I'm not aware of any large projects using
Backpack. Bolting this functionality onto an ecosystem that has spent a decade
routing around many of the problems this system aims to solve, poses a huge
engineering cost and may take a while to crystallize.

Summer of Haskell

Google lacked vision this year and did not sponsor the Haskell Organization for
Summer of Code. But the program proceeded regardless with private sponsorship
from industrial users. Fifteen students were paired with mentors and many
successful projects and collaborations
resulted
.

LLVM

The LLVM bindings for Haskell saw quite a bit of work this year and were forked
into a new organization llvm-hs and added
support for LLVM 4.0 - 5.1:

  1. llvm-hs
  2. llvm-hs-pure
  3. llvm-hs-pretty

In a collaboration with Joachim Breitner and myself at Zurihac a type-safe LLVM
library which embeds the semantics of LLVM instructions into the Haskell
type-system was written.

Siddharth Bhat started work on an STG to LLVM backend
simplehxc before doing a
rewrite in C++.

Moritz Angermann has been continuing to develop a Haskell library for emitting
and reading LLVM Bitcode format
as well as work on the llvm-ng
backend which is a major rewrite of the GHC LLVM code generator.

Linear Types

Arnaud Spiwack prototyped a extension of
GHC
which augments the type system with
linear types. Edsko de
Vries
wrote a detailed blog
post about the nature of linearity and its uses.

The proposal extends the typing of functions to include linearity constraints on
arrows, enforcing that variables or references are created and consumed with
constrained reference counts. This allows us to statically enforce reference
borrowing and allocations in the typechecker potentially allowing us to enforce
lifetime constraints on closures and eliminating long-lived memory from being
used with constructed unbounded lifetimes, thereby eliminating garbage
collection for some Haskell functions.

For instance, use of the linear arrow (a ->. b) can enrich the existing raw
memory access functions enforcing the matching of allocation and free commands
statically. The multiplicity of usage is either 0, 1 or ω and the
linear arrow is syntatic sugar for unit multiplicity are aliases for (:'1 ->).

malloc :: Storable a => a ->. (Ptr a ->. Unrestricted b) ->. Unrestricted b
read :: Storable a => Ptr a ->. (Ptr a, a)
free :: Ptr a ->. ()

New abstractions such as movable, consumable and dupable references can be
constructed out of existing class hierarchies and enriched with static linearity
checks:

class Consumable a where
  consume :: a ->. ()

class Consumable a => Dupable a where
  dup :: a ->. (a, a)

class Dupable a => Movable a where
  move :: a ->. Unrestricted a

instance Dupable a => Dupable [a] where
  dup [] = ([], [])
  dup (a:l) = shuffle (dup a) (dup l)
    where
      shuffle :: (a, a) ->. ([a], [a]) ->. ([a], [a])
      shuffle (a, a') (l, l') = (a:l, a':l')

instance Consumable a => Consumable [a] where
  consume [] = ()
  consume (a:l) = consume a `lseq` consume l

As part of the Summer of Haskell Edvard Hübinette used linear types to construct
a safer stream processing library which can enforce resource consumption
statically using linearity.

There is considerable debate about the
proposal
and the nature
of integration of linear types into GHC. With some community involvement these
patches could be integrated quite quickly in GHC.

LiquidHaskell

LiquidHaskell the large
suite of tools for adding refinement types to GHC Haskell continued development
and became considerably more polished. At HaskellExchange several companies were
using it in anger in production. For example, we can enforce statically the
lists given at compile-time statically cannot contain certain values by
constructing a proposition function in a (subset) of Haskell which can refine
other definitions:

measure hasZero :: [Int] -> Prop
hasZero [] = false
hasZero (x:xs) = x == 0 || (hasZero xs)

type HasZero = {v : [Int] | (hasZero v)}

-- Rejected
xs :: HasZero
xs = [1,2,3,4]

-- Accepted
ys :: HasZero
ys = [0,1,2,3]

This can be used to statically
enforce

that logic that consumes only a Just value can provably only be called with a
Just with a isJust measure:

measure isJust :: forall a. Data.Maybe.Maybe a -> Bool
isJust (Data.Maybe.Just x)  = true
isJust (Data.Maybe.Nothing) = false

This year saw the addition of inductive predicates, allowing more complex
properties about non-arithmetic refinements to be checked. Including properties
about lists:

measure len :: [a] -> Int
len [] = 0
len (x:xs) = 1 + (len xs)

-- Spec for Data.List exports refined types which can be statically refined with
-- length constraints.
[] :: {v:[a]| len v = 0}
(:) :: _ -> xs:_ -> {v:[a]| len v = 1 + len xs}

append :: xs:[a] -> ys:[a] -> {v:[a]| len v = len xs + len ys}

The full library of specifications is now quite extensive and adding
LiquidHaskell
to an existing codebase is pretty seamless.

Foundation

Foundation is an alternative Prelude informed by modern design practices and
data structures. It ships a much more sensible and efficient packed array of
UTF8 points as its default String
type. Rethinks the Num numerical tower , and statically distinguishes partial functions.
Also has fledgling documentation beyond just

Last year Foundation was a bit early, but this year at Zurihac several companies
in London reported using it fully in production as a full industrial focused
Prelude.

Editor Tooling

Editor integration improved, adding best in modern tooling to most of the
common editors:

+----------------------+-----------------------------------------------------------------------+
| Editor | Haskell Integration |
+======================+=======================================================================+
| Atom | https://atom.io/packages/ide-haskell |
+----------------------+-----------------------------------------------------------------------+
| Emacs | https://commercialhaskell.github.io/intero/ |
+----------------------+-----------------------------------------------------------------------+
| IntelliJ | https://plugins.jetbrains.com/plugin/8258-intellij-haskell |
+----------------------+-----------------------------------------------------------------------+
| VSCode | https://marketplace.visualstudio.com/items?itemName=Vans.haskero |
+----------------------+-----------------------------------------------------------------------+
| Sublime | https://packagecontrol.io/packages/SublimeHaskell |
+----------------------+-----------------------------------------------------------------------+

Monica Lent wrote a lovely post on the state of art in Vim and Haskell
integration
.

Rik van der Kleij has done an impressive amount of work adapting the IntellJ IDE
to work with Haskell.
Including a custom parser handling all of the syntax extensions, name lookup,
Intero integration and integration with haskell-tools refactoring framework.

Development on the
haskell-ide-engine has picked
up again in the last few months.

Formal Methods

The DeepSpec, a collaboration between
MIT, UPenn, Princeton and Yale, is working on a network of specification that
span many compilers, languages and intermediate representations with the goal of
achieving full functional correctness of software and hardware. Both Haskell and
LLVM are part of this network of specifications. The group has successfully
written a new formal calculus describing the GHC core language and proved it
type sound in Coq. The project is called
corespec and is described in the paper
"A Specification for Dependent Types in Haskell".

In addition the group also published a paper “Total Haskell is Reasonable Coq”
and provided a utlity hs-to-coq which
converts haskell code to equivalent Coq code.

The Galois group started formalizing the
semantics
of Cryptol, a
compiler for high-assurance cryptographic protocols which is itself
written in Haskell.

Michael Burge wrote a cheeky article about extracting a specification for a
"domain specific"
browser

from Coq into Haskell.

Pragma Proliferation & Prelude

Writing Haskell is almost trivial in practice. You just start with the magic
fifty line {-# LANGUAGE ... #-} incantation to fast-forward to 2017,
then add 150 libraries that you've blessed by trial and error to your cabal
file, and then just write down a single type signature whose inhabitant is
Refl to your program. In fact if your program is longer than your import list
you're clearly doing Haskell all wrong.

In all seriousness, Haskell is not the small language it once was in 1998, it's
reach spans many industries, hobbyists, academia and many classes of people with
different incentives and whose ideas about the future of the language are
mutually incompatible. Realistically the reason why the Prelude and extension
situation aren't going to change anytime soon is that no one person or company
has the economic means to champion such a change. It would be enormously
expensive and any solution will never satisfy everyone's concerns and desires.
Consensus is expensive, while making everything opt-in is relatively cheap.
This is ultimately the equilibrium we've converged on and barring some large sea
change the language is going to remain in this equilibrium.

Haskell Survey

Taylor Fausak conducted an unofficial
survey

of Haskell users with some surprising results about widespread use of Haskell.
Surprisingly there are reportedly 100 or more people who maintain 100,000 or
more lines of Haskell code. Not so surprisingly most people have migrated to
Stack while vim and emacs are the editors of choice.

While the majority of respondents are satisfied with Haskell the language the
response are somewhat mixed the quality of libraries and the bulk of respondents
reported Haskell libraries being undocumented , hard to use hard
to find
, and don't integrate well.

Projects

Idris, the experimental dependently typed language, reached 1.0 release and
became one of the larger languages which is itself written in Haskell.

The most prolific Haskell library Pandoc released its version 2.0.

Several other groups published new compilers in the Haskell-family of languages.
Intel finally open sourced the Intell Haskell
compiler
which was a research project in
more optimal compilation techniques. Morgan Stanley also released
Hobbes a Haskell-like language used
internally at the bank featuring several novel extensions to row-types and C++
FFI integration. A prototype Haskell compiler was also written in
Rust.

The SMT solver integration library SBV saw a major rewrite of its internal and
its old tactics
system
.
The library is heavily used in various projects as an interface to Z3 and CVC4
solvers.

Uber released a library for parsing and analysis of Vertica, Hive, and Presto
SQL queries.

Wire released the backend service to
their commercial offering.

The Advanced Telematic Systems group in Berlin released a Quickcheck family
library for doing for property testing of models about state
machines
. Bose
also released Smudge a tool for doing development and analysis of large state
machines
for hardware testing.

Florian Knupfer released a new high-performance HTML combinator
library
for templating.

Galois continued with HalVM unikernel development this year, and
several HalVM Docker
Imagaes

were published allowing a very convenient way to write and test code against
HalVM.

Facebook released a Haskell string parsing library
duckling which parses human input into a
restricted set of semantically tagged data. Facebook also prototyped a technique
for hot-swapping Haskell code at runtime
using clever GHC API trickery.

Chris Done released vado a browser engine
written in Haskell.

Jonas Carpay released apecs an entity
component system for game development in Haskell.

Csaba Hruska continued work on a GRIN
compiler
and code generator, an
alternative core language for compiling lazy functional languages based on the
work by Urban Boquist.

Dima Szamozvancev released mezzo a library and
embedded domain-specific language for music description that can enforce rules
of compositionality of music statically and prevent bad music from being
unleashed on the world.

Conal Elliot and John Wiegley advanced a novel set of ideas on Compiling with
Categories
which allows the bidirectional
representation of Haskell functions as categorical structures. Although
currently implemented as a GHC typechecker extension it is a promising research
area.

Harry Clarke published a paper on Generics Layout-Preserving
Refactoring

using a reprinter. i.e. a tool which
takes a syntax tree, the original source file, and produces an updated source
file which preserves secondary notation. This can be used to build tools like
haskell-tools for arbitrary languages and write complicated refactoring suites.

Joachim Breitner contributed a prolific amount of open source projects including
a new technique (ghc-proofs) for proving the equivalence of Haskell programs
using a GHC plugin, (veggies) a verified simple LLVM code generator for GHC, and
(inspection-testing) new technique for verifying properties of Core.

  1. veggies
  2. ghc-proofs
  3. inspecction-testing

The ghc-proofs plugin allows us
to embed equation and relations into our Haskell module and potentially allow
GHC to prove them correct by doing symbolic evaluation and evaluating them as
far as possible and checking the equational relations of the equations sides.
Right now it works for contrived and simple examples, but is quite a interesting
approach that may yield further fruit.

{-# OPTIONS_GHC -O -fplugin GHC.Proof.Plugin #-}
module Simple where

import GHC.Proof
import Data.Maybe

my_proof1 = (\f x -> isNothing (fmap f x))
        === (\f x -> isNothing x)
$ ghc Simple.hs
[1 of 1] Compiling Simple           ( Simple.hs, Simple.o )
GHC.Proof: Proving my_proof1 …
GHC.Proof proved 1 equalities

Haddock

Haddock is creaking at the seams. Most large Haskell projects (GHC, Stack, Agda,
Cabal, Idris, etc) no longer use it for documentation. The codebase is dated and
long standing issues like dealing with reexported modules are still open.

There is a large vacuum for a better solution to emerge and compatibility with
RestructuredText would allow easy migration of existing documentation.

Databases

This year saw two new approaches to Haskell database integration:

  1. Selda - A library interacting with relational databases inspired by LINQ and Opaleye.
  2. Squeal - A deep embedding of PostgreSQL in Haskell using generics-sop.

This still remains an area in the ecosystem where many solutions end up
needlessly falling back to TemplateHaskell to generate data structures and
typically tend to be a pain point. Both these libraries use generics and
advanced type system features to statically enforce well-formed SQL query
construction.

Data Science & Numerical Computing

Chris Doran presented a concise interpretation of Geometric Algebra, a
generalization of linear algebra in terms of graded algebra written in
Haskell
. A talk on this
project was presented at the Haskell
Exchange
.

At ZuriHac several people expressed interest in forming a Data
Haskell
organization to work on advancing the
state of Haskell libraries for data science and macing learning. There is
considerable interest of the constructing the right abstractions for a
well-typed dataframe library.

Tom Nielson is furiously working on a
suite of of projects, mostly
related to data science, machine learning and statistics in Haskell.

The Accelerate project has been seeing a lot of
development work recently and now has support for latest versions of LLVM and
CUDA through llvm-hs project.

A group at the University of Gothenburg released
TypedFlow a statically typed
higher-order frontend to TensorFlow.

Eta

Typelead continued working on a Java Virtual Machine backend to GHC called
Eta. The project has seen considerable amount of
person-hours invested in compatibility and the firm has raised capital to pursue
the project as a commercial venture.

The project does seem to be diverging from the Haskell mainline language in both
syntax and semantics which raises some interesting questions about viability.
Although the Haskell language standard seems to be stuck and not moving forward,
so there's an interesting conundrum faced by those trying to build on top of GHC
in the long run.

WebAssembly

WebAssembly has been released and is supported by all major browsers as of
December 2017. This is been piquing the interest of quite a few Haskellers
(including myself) who have been looking for a saner way to interact with
browsers than the ugly hack of ejecting hundreds of thousands of lines of
generated Javascript source. WebAssembly in either standalone projects or as a
target from GHC's intermediate form STG to the browser offers an interesting
path forward.

  1. Haskell WASM
  2. WebGHC
  3. ministgwasm
  4. forest-lang

Industry

An incomplete list of non-consulting companies who are actively using Haskell or
have published Haskell open sources libraries follows:

+----------------------+
| Product Companies |
+======================+
| Standard Chartered |
+----------------------+
| Galois |
+----------------------+
| Intel |
+----------------------+
| Target |
+----------------------+
| Uber |
+----------------------+
| Vente Privee |
+----------------------+
| FrontRow |
+----------------------+
| NStack |
+----------------------+
| Morgan Stanley |
+----------------------+
| Takt |
+----------------------+
| Fugue |
+----------------------+
| Habito |
+----------------------+
| Asahi Net |
+----------------------+
| Sentenai |
+----------------------+
| IOHK |
+----------------------+
| Awake Networks |
+----------------------+
| Facebook |
+----------------------+
| Adjoint |
+----------------------+
| DigitalAsset |
+----------------------+
| AlphaSheets |
+----------------------+
| Channable |
+----------------------+
| SlamData |
+----------------------+
| Wire |
+----------------------+
| JP Morgan |
+----------------------+
| Bose |
+----------------------+

There are four consulting companies also supporting Haskell in industry.

+----------------------+
| Consultancies |
+======================+
| Well-Typed |
+----------------------+
| Tweag |
+----------------------+
| FP Complete |
+----------------------+
| Obsidian Systems |
+----------------------+

Conferences

Both the Haskell
Exchange

and ZuriHac conference had record attendance this year. Haskell was
well-represented in many interdisciplinary conferences include StrangeLoop,
CurryOn, LambdaDays, LambdaConf, and YOW.

Most of the videos are freely available online.

  1. Haskell Exchange Videos
  2. ComposeConf Videos
  3. ZuriHac Videos
  4. ICFP Videos
Haskellers by the lake in Rapperswil, Switzerland

In 2018 I have plans to visit Zurich, Sydney, Budapest, Amsterdam, San
Francisco, Paris, Copenhagen and Tokyo this year. Hopefully I get to meet more
Haskellers and share some ideas about the future of our beautiful language.
Until then, Merry Christmas and Happy New Haskell Year.