Haskell Poll Results

I put out a call for data and comments about topics that Haskell people felt
were underrepresented. I'm sure I'll take some flak for the informal poll and
methodology, but I feel that having at least some concrete data about the
Haskell zeitgeist is better than nothing.

As my friends have noted, baked into the poll is a hypothesis that people will
give different responses based on which domain they use Haskell for (e.g.
compiler developers have a different focus than web developers). The data seems
to confirm this, but the poll also self-selects for people who have a narrow
focus. Given the nature of the collection, it also only reaches people who are
active on Haskell forums and willing to respond.

The total counts include all individuals, even those who gave a "None of the
above" or write-in for their domain. The totals contain some of the most bizarre
results, but when these respondents are excluded and the data is factored by
domain, the results seem much more sensible.

Thanks again to everyone who volunteered their opinions.

Domains

What category best classifies the domain of problems you use Haskell for?

The first question concerned which domain of programming the respondent is
involved in. This field was an exclusive choice so that we could bin on it when
doing statistics later. The most popular domains, in order, are:

  1. Web Development
  2. Compiler Design
  3. Pure Mathematics or CS Theory
  4. Data Analysis
  5. Numerical Computing
  6. Education
  7. Financial Modeling

The number of people involved in compiler development was a somewhat surprising
result, to say the least. The other domains seemed to fall out fairly naturally.
There were also quite a few write-ins in various forms, and many comments
indicating multidisciplinary fields. The write-ins were excluded from the later
binning on the various factors and only included in the total count.

Skill Level

How would you subjectively rate your Haskell skill level?

The self-rated skill level turned out to be a fairly typical distribution with a
median of 5, mode of 5, and mean of 5.3. 53% of Haskellers in the poll rated
themselves 5 or lower while 32% rated themselves as 7 or higher.

Given the self-selecting nature of this question, this result is probably not
meaningful or accurate.

Type System

What Haskell type system features do you feel are under represented or often misunderstood?

The six most mentioned type system features surprised me a bit. They were:

  1. Impredicative Types
  2. Kind Polymorphism
  3. Singletons
  4. Rank-N Types
  5. GADTs
  6. Type Families

Impredicative Types is a curious answer. I'm baffled why it seems so
dominant. The extension is widely considered to be broken or a misfeature, and I
don't think I've ever seen it used in the wild. My only guess is that it shows
up so frequently in GHC error messages that many people are curious about it
despite never having used it.

Kind Polymorphism is understandable since it's a fairly recent addition to
GHC and already there seems to be a need for poly-kinded versions of many
existing structures in base. Kind promotion itself is still a very under-used
feature.
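
To make the point about poly-kinded structures concrete, here is a minimal
sketch, assuming a reasonably recent GHC. The Proxy type and the Named class
are hand-rolled for illustration only (base ships its own poly-kinded
Data.Proxy); none of these names come from the poll.

```haskell
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE KindSignatures #-}

-- Hand-rolled proxy (illustrative): with PolyKinds the parameter 't'
-- may have any kind 'k', not just the kind of ordinary types.
data Proxy (t :: k) = Proxy

-- A kind-polymorphic class (hypothetical name): one class serves
-- types of every kind.
class Named (t :: k) where
  name :: Proxy t -> String

-- An ordinary type.
instance Named Int where
  name _ = "Int"

-- A one-argument type constructor.
instance Named Maybe where
  name _ = "Maybe"

-- A two-argument type constructor.
instance Named Either where
  name _ = "Either"
```

With kind polymorphism switched off and the kind annotations removed, the
parameter would default to the kind of ordinary types and the Maybe and Either
instances would be rejected, which is why separate poly-kinded variants of
existing structures end up being needed.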

Singletons is also a rather fruitful modern area of research in bringing
some semblance of dependent types to Haskell. The singletons library has been
the subject of several ICFP and meetup talks.
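
For readers who have not met the idea, the following is a hand-rolled sketch of
the singleton pattern. It deliberately does not use the singletons library's
API; every name in it (Nat, SNat, Vec, replicateVec, vecToList) is defined here
for illustration.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}

-- Natural numbers; DataKinds also promotes them to the type level.
data Nat = Z | S Nat

-- The singleton: exactly one runtime value for each type-level Nat,
-- so pattern matching on the value tells the type checker what 'n' is.
data SNat (n :: Nat) where
  SZ :: SNat 'Z
  SS :: SNat n -> SNat ('S n)

-- A list whose length is tracked in its type.
data Vec (n :: Nat) a where
  Nil  :: Vec 'Z a
  Cons :: a -> Vec n a -> Vec ('S n) a

-- The length is needed at runtime to build the vector; the singleton
-- supplies it while keeping it tied to the type index.
replicateVec :: SNat n -> a -> Vec n a
replicateVec SZ     _ = Nil
replicateVec (SS n) x = Cons x (replicateVec n x)

-- Forget the length index again.
vecToList :: Vec n a -> [a]
vecToList Nil         = []
vecToList (Cons x xs) = x : vecToList xs
```

Here replicateVec (SS (SS SZ)) 'x' has type Vec ('S ('S 'Z)) Char, so the
length of the result is known to the type checker.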

Rank-N Types invariably seems to be a point of confusion in discussions. I
would indeed say that higher-ranked polymorphism is not widely understood and
can be very subtle.
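
A small sketch of where the subtlety comes from, assuming only the RankNTypes
extension. The function name applyBoth is made up for this example.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Rank-2: the 'forall' sits to the left of an arrow, so the *callee*
-- chooses 'a'. The argument must therefore work for every element
-- type, which lets us use it at Int and at Char in the same body.
applyBoth :: (forall a. [a] -> [a]) -> ([Int], String) -> ([Int], String)
applyBoth f (xs, s) = (f xs, f s)

-- The rank-1 signature
--   ([a] -> [a]) -> ([Int], String) -> ([Int], String)
-- would be rejected for the same body: there the *caller* picks a
-- single 'a', and it cannot be both Int and Char.
```

Calling applyBoth reverse or applyBoth (take 1) type checks, while passing a
function that only works on [Int], such as map (+ 1), does not.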

Type Families is also a fairly new feature in GHC, and the subject of much
active exploration. Only a few months ago did GHC 7.8 get closed type families,
giving us the ability to encode much more complex logic at the type level.
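
As a rough illustration of the kind of type-level logic closed families allow,
here is a sketch assuming a recent GHC (it imports Data.Kind, which postdates
GHC 7.8). The names IsInt, KnownBool, boolVal and isInt are invented for this
example.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeFamilies #-}

import Data.Kind (Type)
import Data.Proxy (Proxy (..))

-- A closed family: equations are tried top to bottom, so an
-- overlapping catch-all equation is allowed. An open family would
-- reject this pair of equations.
type family IsInt (a :: Type) :: Bool where
  IsInt Int = 'True
  IsInt a   = 'False

-- Reflect a type-level Bool back to a value (illustrative helper).
class KnownBool (b :: Bool) where
  boolVal :: Proxy b -> Bool

instance KnownBool 'True where
  boolVal _ = True

instance KnownBool 'False where
  boolVal _ = False

-- Ask, at runtime, what the type-level computation decided.
isInt :: forall a. KnownBool (IsInt a) => Proxy a -> Bool
isInt _ = boolVal (Proxy :: Proxy (IsInt a))
```

The catch-all second equation is the part that needs a closed family; the
first equation takes priority whenever the argument is known to be Int.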

Binned amongst the Web Development user group, the most mentioned topics
are:

  1. Impredicative Types
  2. GADTs
  3. Type Families

Binned amongst the Compiler Design user group, the most mentioned topics
are:

  1. Impredicative Types
  2. Kind Polymorphism
  3. Type Families

Binned amongst the Pure Mathematics or CS Theory user group, the most
mentioned topics are:

  1. Singletons
  2. Kind Polymorphism
  3. Type Families

Binned amongst the Data Analysis user group, the most mentioned topics are:

  1. Kind Polymorphism
  2. Type Families
  3. Rank-N Types

Binned amongst individuals who rated their skill as 7 or higher, the most
mentioned topics are:

  1. Singletons
  2. Impredicative Types
  3. Kind Polymorphism

Patterns

Which common language patterns do you feel are under represented or misunderstood?

The pattern results were also somewhat surprising. They were:

  1. F-Algebras
  2. Cont
  3. GHC.Generics
  4. Profunctors
  5. Final Interpreters
  6. Arrows

F-Algebras is also a puzzling response, but was overwhelmingly the most
mentioned response in the total count. There are some great articles about the
relation between F-Algebras and catamorphisms. They are used somewhat rarely,
and I'm genuinely surprised that this answer is the most popular in the total
count.
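
For context, the relation between F-algebras and catamorphisms can be sketched
in a few lines. This is a minimal hand-written version, not the API of any
particular library; Fix, Algebra, cata and the ExprF example are all defined
here for illustration.

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- Fixed point of a functor: ties the recursive knot.
newtype Fix f = Fix (f (Fix f))

-- An F-algebra is just a function collapsing one layer of 'f'.
type Algebra f a = f a -> a

-- The catamorphism: fold a whole structure by applying the algebra
-- bottom-up, one layer at a time.
cata :: Functor f => Algebra f a -> Fix f -> a
cata alg (Fix layer) = alg (fmap (cata alg) layer)

-- One non-recursive layer of an arithmetic expression.
data ExprF r = Lit Int | Add r r | Mul r r
  deriving Functor

type Expr = Fix ExprF

-- Smart constructors hiding the Fix wrapper.
lit :: Int -> Expr
lit = Fix . Lit

add, mul :: Expr -> Expr -> Expr
add a b = Fix (Add a b)
mul a b = Fix (Mul a b)

-- Two algebras over the same functor: evaluation and printing.
evalAlg :: Algebra ExprF Int
evalAlg (Lit n)   = n
evalAlg (Add a b) = a + b
evalAlg (Mul a b) = a * b

prettyAlg :: Algebra ExprF String
prettyAlg (Lit n)   = show n
prettyAlg (Add a b) = "(" ++ a ++ " + " ++ b ++ ")"
prettyAlg (Mul a b) = "(" ++ a ++ " * " ++ b ++ ")"
```

The recursion lives entirely in cata; each algebra only describes what to do
with a single layer, so cata evalAlg and cata prettyAlg reuse the same
traversal.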

Continuation passing and CPS conversion seem to be among those thuddingly
concrete topics that confuse more than they should. Continuations do invert the
way we normally think about control flow, which can be confusing.
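
The inversion is easiest to see in plain continuation-passing style, without
the Cont monad. The sketch below uses made-up function names and only the
Prelude.

```haskell
-- In CPS a function does not return its result; it hands the result
-- to a continuation 'k' that represents "the rest of the program".
squareCPS :: Int -> (Int -> r) -> r
squareCPS x k = k (x * x)

addCPS :: Int -> Int -> (Int -> r) -> r
addCPS x y k = k (x + y)

-- Sequencing becomes nesting: each step receives the previous result
-- as the argument of a lambda.
pythagorasCPS :: Int -> Int -> (Int -> r) -> r
pythagorasCPS x y k =
  squareCPS x $ \x2 ->
  squareCPS y $ \y2 ->
  addCPS x2 y2 k

-- With two continuations the callee decides which path the program
-- takes, which gives early exit without any special machinery.
safeDivCPS :: Int -> Int -> (String -> r) -> (Int -> r) -> r
safeDivCPS _ 0 onErr _    = onErr "divide by zero"
safeDivCPS x y _     onOk = onOk (x `div` y)

-- (a / b) / c + 1, abandoning the computation at the first zero.
calc :: Int -> Int -> Int -> (String -> r) -> (Int -> r) -> r
calc a b c onErr onOk =
  safeDivCPS a b onErr $ \q ->
  safeDivCPS q c onErr $ \q' ->
  onOk (q' + 1)
```

Passing id as the final continuation recovers an ordinary result, and passing
Left and Right to calc turns the two continuations into an Either.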

GHC.Generics is another topic which is indeed rather underrepresented. At the
time of writing I cannot actually think of a resource to point anyone at that
explains how to use Generics beyond what the GHC manual explains. At the same
time Generics are incredibly powerful and useful.
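
As a small taste of what that looks like in practice, here is a sketch of a
generic function that returns the constructor name of any value whose type
derives Generic. The class GConName, the function constructorName and the
Shape type are invented for this example; the representation types and conName
come from GHC.Generics in base.

```haskell
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE TypeOperators #-}

import GHC.Generics

-- A class over the generic *representation* of a type.
class GConName f where
  gconName :: f p -> String

-- Datatype metadata wrapper: nothing to do, look inside.
instance GConName f => GConName (D1 c f) where
  gconName (M1 x) = gconName x

-- A choice between constructors: follow whichever side was taken.
instance (GConName f, GConName g) => GConName (f :+: g) where
  gconName (L1 x) = gconName x
  gconName (R1 x) = gconName x

-- Constructor metadata wrapper: read the name and stop.
instance Constructor c => GConName (C1 c f) where
  gconName m = conName m

-- Convert a value to its representation, then dispatch on it.
constructorName :: (Generic a, GConName (Rep a)) => a -> String
constructorName = gconName . from

-- Any type deriving Generic gets the function for free.
data Shape = Circle Double | Rect Double Double | Unit
  deriving (Generic)
```

The instances never mention Shape, so the same constructorName works for any
other type that derives Generic.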

Profunctors is understandably confusing, and puzzlingly it seems to be a
dependency of a large number of libraries on Hackage while the library itself
has limited documentation.
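
The core idea is smaller than the confusion suggests. Below is a hand-rolled
sketch of the class, written out here rather than imported so that it depends
only on the Prelude; the profunctors package exposes a class of the same shape
in Data.Profunctor. The Validator type is made up for illustration.

```haskell
-- A profunctor is contravariant in its first argument (the input)
-- and covariant in its second (the output).
class Profunctor p where
  dimap :: (a' -> a) -> (b -> b') -> p a b -> p a' b'

-- Map over the input only.
lmap :: Profunctor p => (a' -> a) -> p a b -> p a' b
lmap f = dimap f id

-- Map over the output only.
rmap :: Profunctor p => (b -> b') -> p a b -> p a b'
rmap = dimap id

-- Plain functions: pre-compose on the input, post-compose on the
-- output.
instance Profunctor (->) where
  dimap f g h = g . h . f

-- A function that may fail is a profunctor in the same way.
newtype Validator a b = Validator (a -> Maybe b)

runValidator :: Validator a b -> a -> Maybe b
runValidator (Validator h) = h

instance Profunctor Validator where
  dimap f g (Validator h) = Validator (fmap g . h . f)
```

For example, dimap length show (* 2) is a function from String to String that
takes the length, doubles it and renders the result.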

Arrows is also very understandable. They seem to have been a very active
area of research 10 or so years ago, leaving us with a lot of half-baked
libraries around seemingly beautiful ideas that then died out, leaving us only
with hints of the possibilities of arrows. That, and the Arrows syntax
extension, which is very odd and seems to be understood or used by shockingly
few people.
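
For anyone who has never seen the arrow syntax, here is a minimal sketch using
the Arrows extension and Control.Arrow from base. The function names addA and
addA' are chosen for this example.

```haskell
{-# LANGUAGE Arrows #-}

import Control.Arrow

-- proc notation: 'x' names the input, and each line feeds a value
-- into an arrow (right of -<) and names its output (left of <-).
addA :: Arrow a => a b Int -> a b Int -> a b Int
addA f g = proc x -> do
  y <- f -< x
  z <- g -< x
  returnA -< y + z

-- The same arrow written with plain combinators: run both arrows on
-- the input, then add the pair of results.
addA' :: Arrow a => a b Int -> a b Int -> a b Int
addA' f g = (f &&& g) >>> arr (uncurry (+))
```

Since ordinary functions are an instance of Arrow, both versions can be tried
directly on functions such as (+ 1) and (* 2).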

Binned amongst the Web Development user group, the most mentioned topics
are:

  1. van Laarhoven Lenses
  2. Exception Handling
  3. Template Haskell

Binned amongst the Compiler Design user group, the most mentioned topics
are:

  1. Cont
  2. Free Monads
  3. Profunctors

Binned amongst the Pure Mathematics or CS Theory user group, the most
mentioned topics are:

  1. F-Algebras
  2. Cont
  3. Profunctors

Binned amongst the Data Analysis user group, the most mentioned topics are:

  1. Free Monads
  2. van Laarhoven Lenses
  3. Heterogeneous Lists

Binned amongst individuals who rated their skill as 7 or higher, the most
mentioned topics are:

  1. GHC.Generics
  2. F-Algebras
  3. Cont

There were no write-ins for this category.

Libraries

Which common ( non-web ) libraries do you feel are under represented or misunderstood? Be generous in your responses!

The libraries section was admittedly a bit of a grab bag. There is no way to
poll on all of Hackage, so inevitably I had to choose an arbitrary sample of
cross-domain libraries. A more exhaustive poll of the Hackage libraries people
are interested in is something I would like to do, but I'm not sure how to do
it in a methodical way.

I chose not to include web libraries since they often tend to fall under an
umbrella project (yesod, snap, happstack) and exhibit some odd clustering
behavior that sets them apart from other packages. The top 20 packages are
listed below:

  1. repa - A numerical library for high performance, regular,
    multi-dimensional, shape polymorphic parallel arrays.
  2. uniplate - A generics library for traversals and rewrites.
  3. mmorph - Monad morphisms, a utility library for working with monad
    transformers.
  4. free - An implementation of free monads.
  5. lens-family - A lightweight minimalistic lens library in the van
    Laarhoven style.
  6. unbound - A binder library for capture avoiding substitution for building
    type checkers and interpreters.
  7. operational - A monadic utility library for building complex monadic
    control flow.
  8. pipes - A coroutine streaming library with strong categorical
    foundations.
  9. parsec - A parser combinator library.
  10. esqueleto - A SQL query embedded DSL.
  11. safe - A utility library providing total function variants for many
    Prelude partial functions.
  12. accelerate - A numerical library for parallel array computing with
    various backends.
  13. resourcet - Deterministic allocation and freeing of scarce resources.
  14. fgl - Functional graph theory library.
  15. optparse-applicative - Command line option parsing.
  16. quickcheck - Property based testing framework.
  17. hakyll - Static website generator.
  18. vector - Generic computing library providing boxed and unboxed
    contiguous memory arrays and fusion.
  19. llvm-general - Bindings to the LLVM code generation and compiler
    framework.
  20. diagrams - Drawing library and embedded domain language for vector
    graphics.

Binned amongst the Web Development user group, the most mentioned libraries
are:

  1. pipes
  2. esqueleto
  3. mmorph

Binned amongst the Compiler Design user group, the most mentioned libraries
are:

  1. uniplate
  2. graphscc
  3. llvm-general

Binned amongst the Pure Mathematics or CS Theory user group, the most mentioned libraries
are:

  1. repa
  2. uniplate
  3. free

Binned amongst the Data Analysis user group, the most mentioned libraries
are:

  1. repa
  2. accelerate
  3. lens-family

Binned amongst individuals who rated their skill as 7 or higher, the most
mentioned libraries are:

  1. mmorph
  2. repa
  3. uniplate
  4. free
  5. alex/happy
  6. criterion

The most popular write-ins were:

  1. reactive
  2. uu-parsinglib
  3. lambdacube-gl
  4. trifecta
  5. machines
  6. recursion-schemes

Most of these results are self-explanatory and reflect my intuition about
Hackage as well. There are some weird anomalies though:

Repa seemingly has a large number of tutorials and worked examples, so this
result has me scratching my head a little bit.

On a personal note, I'm somewhat saddened by how often llvm-general shows up
given how much time I spent on what I thought was a very extensive tutorial on
the subject.

A lot of people don't understand the difference between mtl and transformers and think mtl is the only way to do monad transformers

hsc3, the supercollider library, could really use some documentation IMO. One can rely on the supercollider docs, but that's an extra layer of lookup, and you have to infer the meaning of arguments that don't always correspond exactly. Seems to me sound is one area where new programmers might be interested in playing with haskell, unfortunately its not too noob friendly.

Language Features

Which aspects of GHC do you feel are under represented or misunderstood?

For language features I tried to poll on topics specific to GHC's implementation
details. The results were overwhelmingly about performance and profiling:

  1. Profiling Memory
  2. Rewrite Rules / Fusion
  3. Cross Compilation
  4. Profiling CPU
  5. Memory Representation
  6. Inlining

Binned amongst the Web Development user group, the most mentioned topics
are:

  1. Profiling Memory
  2. Profiling CPU
  3. Laziness ( Strictness Annotations )

Binned amongst the Compiler Design user group, the most mentioned topics
are:

  1. Cmm
  2. STG
  3. Memory Representation

Binned amongst the Pure Mathematics or CS Theory user group, the most
mentioned topics are:

  1. Profiling Memory
  2. Profiling CPU
  3. Inlining

Binned amongst the Data Analysis user group, the most mentioned topics are:

  1. Profiling Memory
  2. Laziness ( Strictness Annotations )
  3. Inlining

Binned amongst individuals who rated their skill as 7 or higher, the most
mentioned topics are:

  1. Profiling Memory
  2. Cross Compilation
  3. Inlining

The most mentioned write-ins were:

  1. SIMD
  2. Compiler Passes
  3. Compiler Plugins

Along the lines of performance profiling, I think GHC's execution model and heap representation are discussed less frequently than they deserve.

Quality overviews on term rewriting and optimization steps on Haskell Core(System FC) in the GHC. I can tell it's out there, but information seems fragmented and a good quality article on the wiki would be very appreciated.

Language interop, the C FFI is just the start of the story. How to play nice with the GC with foreign data? How to play nice with Haskell data from the other side?

Critical Comments

In my work, arrows and categories are most useful in constructing lenses. I think lenses are actually a pretty simple idea but the most popular lens library is bloated and defines a multitude of esoteric infix operators.

There doesn't seem to be any areas in any of the categories above which wouldn't benefit from more documentation. Almost all areas suffer from a lack of explained examples. The more I use almost any library, the more it seems to be lacking in good extensive documentation and examples.

Conventions in web API client design and trade offs for different choices. E.g. Typeclasses, free monads, etc. for example, suppose you want to make a web client agnostic of the underlying HTTP client, what's the best approach? Most people use typeclasses for this, but Haskell has many tools to tackle this problem.

The "reactive" library seems to be very useful, but it is still very abstract. It would be nice to see more focus on this, providing more examples for how it can be used.

I think the biggest use to myself and the community would be more articles like Gabriel Gonzalez has done that show how to use important Haskell constructs, like monoids or free monads, to structure help program design.

There's some exciting developments in this area--see the Haste presentation by StrangeLoop and others). And the efforts to bring React bindings to Haskell. All of the pieces exists in some form currently, but we have a ways to go before they mature.

Takeaway

This is of course an unscientific poll, so please don't read too much into the
data. The goal was to generate a rough list of the topics that people are
interested in and feel need some more context.

On that note, if you are looking for topics for your next blog post and want to
maximize the coverage of misunderstood topics and advance the state of Haskell
knowledge, consider one of the following subjects:

  1. Types: Impredicative Types
  2. Types: Kind Polymorphism
  3. Types: Singletons
  4. Language: Profiling Memory
  5. Language: Rewrite Rules / Fusion
  6. Language: Cross Compilation
  7. Library: repa
  8. Library: uniplate
  9. Library: mmorph
  10. Library: free
  11. Library: lens-family
  12. Pattern: F-Algebras
  13. Pattern: Cont Monad
  14. Pattern: GHC.Generics