F# microbenchmark study

Recently, there was some discussion about a set of microbenchmarks reported in a study called Clash of the Lambdas, which compared simple stream/sequence benchmarks across Java 8 Streams, Scala, C# LINQ and F#. I am learning F#, and as a learning exercise I decided to re-implement one of the benchmarks (Sum of Squares Even) myself in F# without referring to the code provided by the authors.

The source of my implementation can be found on Bitbucket, and binaries are also provided. My interest was in comparing various F# implementations, not in cross-language comparison. I initially implemented it in four different ways, and added a fifth later:

  • Imperative sequential for-loop
  • Imperative parallel version using Parallel.For from Task Parallel Library
  • Functional sequential version using F# sequences
  • Functional parallel version using F# PSeq from FSharp.ParallelSeq
  • UPDATE: I added a functional version using the Nessos Streams package, as suggested by Nick Palladinos on Twitter

I compiled using VS 2013 Express and F# 3.1 with “Release” settings, Any CPU (32-bit not preferred), and ran it on my machine on three different CLR implementations: MS CLR from .NET SDK 4.5.2 running on Windows 8.1, MS CLR RyuJIT CTP4, and Mono 3.4 (sgen GC, no LLVM) running on OpenSUSE 13.1.

The results are as follows:

                  Imperative              Functional
                  Sequential   Parallel   Sequential   Parallel   Streams
MS CLR            17           8          172          81         45
MS RyuJIT CTP4    18           7          168          76         44
Mono              88           23         240          797        97

Some observations for this microbenchmark:

  • The imperative version is far faster than the functional version (17 vs. 172 for the sequential versions on MS CLR), but the functional version was shorter and clearer to me. I wonder if there is some opportunity for compiler optimizations in the F# compiler for the functional version, such as inlining sequence operations or fusing a pipeline of operations where possible.
  • MS RyuJIT CTP4, which is the beta version of the next-gen MS CLR JIT, performs similarly to the current MS CLR. This is good to see.
  • Mono is much slower than the MS CLR. Also, it absolutely hates F# parallel sequences for some reason: the PSeq version (797) is more than three times slower than the sequential one (240). I guess I will have to try installing Mono with LLVM enabled and then check the performance again.
  • The Streams package from Nessos looks to be faster than F# sequences in this microbenchmark. It is currently sequential only, but it performs much faster than even PSeq.

These observations only apply to this microbenchmark, and probably should not be considered general results. Overall, it was a fun learning experience, especially as a newcomer to both F# and the .NET ecosystem. F# looks like a really elegant and powerful language and is a joy to write. There is still a LOT more to learn about both. For example, I am not quite clear on the best way to distribute .NET projects as open source. Should I distribute VS solution files? I am more used to distributing build files for CMake, Make, scons, ant etc., and am looking more into FAKE. NuGet is also nice-ish: it appears to be useful but not very powerful (e.g. can’t remove packages), and merits further investigation.

Getting F# running on Linux

Getting F# running on Linux took a lot more effort than I anticipated. I am documenting the process here in the hope it may benefit someone (maybe myself) in the future. For reference, I am using OpenSUSE 13.1.

  • F# is not compatible with all versions of Mono. For example, my distro repos have Mono 3.0.6, which appears to have some issues with F#. However, some people make newer Mono packages available for various distros using the OpenSUSE Build Service (OBS). For example, check out the tpokorra repos for distros such as OpenSUSE, CentOS, Debian etc. I installed “mono-opt” and related packages, which installed Mono 3.4 into the /opt/mono directory.
  • If you install Mono into /opt/mono, make sure you append “/opt/mono/lib” to the LD_LIBRARY_PATH environment variable and “/opt/mono/bin” to the PATH variable. I did this in my .bashrc.
  • By default, /opt/mono/bin/mono turned out to be a symlink to /opt/mono/bin/mono-sgen. It appears that Mono ships two versions of the runtime: one using the sgen GC and one using the Boehm GC. I had trouble compiling F# using mono-sgen, so I removed /opt/mono/bin/mono and then recreated it as a symlink to /opt/mono/bin/mono-boehm.
  • Now open up a new shell. In this shell, set up a few environment variables temporarily required for building F#. First, “export PKG_CONFIG_PATH=/opt/mono/lib/pkgconfig”. Next, we need to set some GC parameters for Mono. It turns out compiling F# requires a lot of memory, and Mono fails with the default GC parameters. I have a lot of memory in my laptop, so I set the Mono GC to use up to 2GB as follows: “export MONO_GC_PARAMS=max-heap-size=2G”. These two settings likely won’t be required after you have compiled and installed F#.
  • Now you can follow the instructions given on the F# webpage.
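
The preparation steps in the list above can be collected into a shell fragment. This is only a sketch under the assumptions already stated (Mono installed under /opt/mono by the mono-opt packages). The variable MONO_PREFIX and the helper use_boehm_mono are hypothetical names introduced here for illustration, and the heap limit is spelled max-heap-size, which is the option name listed in the Mono man page.

```shell
# Assumed install prefix of the mono-opt packages.
MONO_PREFIX=/opt/mono

# Persistent settings (the ones that went into .bashrc).
export PATH="$PATH:$MONO_PREFIX/bin"
export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$MONO_PREFIX/lib"

# Temporary settings, only needed in the shell that builds F#.
export PKG_CONFIG_PATH="$MONO_PREFIX/lib/pkgconfig"
export MONO_GC_PARAMS=max-heap-size=2G

# Hypothetical helper: repoint <prefix>/bin/mono at the Boehm build.
# Needs root for a system-wide prefix such as /opt/mono.
use_boehm_mono() {
    prefix="${1:-$MONO_PREFIX}"
    if [ ! -x "$prefix/bin/mono-boehm" ]; then
        echo "no mono-boehm under $prefix/bin" >&2
        return 1
    fi
    # -s: symbolic link, -f: replace the existing link, -n: do not follow it
    ln -sfn "$prefix/bin/mono-boehm" "$prefix/bin/mono"
}
```

Calling use_boehm_mono without an argument targets /opt/mono. Afterwards, “mono --version” should report which GC the runtime was built with.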

Specifically I did

  • git clone https://github.com/fsharp/fsharp
  • cd fsharp
  • ./autogen.sh --prefix /opt/mono  # Keep things consistent with the rest of the Mono install
  • make  #Takes a lot of time
  • su
  • make install
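
Once “make install” finishes, a quick smoke test can confirm that the compiler works. This check is not part of the steps I originally ran. It is a sketch that assumes the install put the fsharpc wrapper script on the PATH (under /opt/mono/bin with the prefix used above), and the function name smoke_test_fsharp is hypothetical.

```shell
# Hypothetical smoke test: compile and run a tiny F# program.
# Assumes fsharpc and mono are on PATH after the install.
smoke_test_fsharp() {
    if ! command -v fsharpc >/dev/null 2>&1; then
        echo "fsharpc not found on PATH" >&2
        return 1
    fi
    workdir=$(mktemp -d) || return 1
    # Program: sum of the squares of the even numbers in 1..10.
    cat > "$workdir/hello.fs" <<'EOF'
[1 .. 10]
|> List.filter (fun x -> x % 2 = 0)
|> List.sumBy (fun x -> x * x)
|> printfn "%d"
EOF
    fsharpc --nologo --out:"$workdir/hello.exe" "$workdir/hello.fs" \
        && mono "$workdir/hello.exe"
    status=$?
    rm -rf "$workdir"
    return $status
}
```

Running smoke_test_fsharp in a fresh shell also checks that the PATH and LD_LIBRARY_PATH settings from .bashrc are being picked up.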