Count Your BLAS-ings

One nice thing about open-source software is that users often have a lot of choices.  Such is the case with R, for instance the thousands of contributed packages available on CRAN.  My focus here is on BLAS, the core of matrix operations in R, where again there are interesting choices available to users who wish to take advantage of them.

The Basic Linear Algebra Subroutines have been around for many decades, used throughout science and engineering, in various implementations.  A fairly efficient BLAS implementation is included with R, but for those with heavy linear algebra needs, several open-source alternatives are available, such as ATLAS, ACML and OpenBLAS, as well as the commercial Intel MKL.  (Recently Revolution Analytics announced an open version of their R platform that includes the MKL.)

Here I will discuss  OpenBLAS, a library currently attractive to many, due to its open-source nature and ability to make use of the multicore machines that are so common today.   I’ll focus on numerical accuracy.

This should not be considered a detailed tutorial on OpenBLAS, but here is a “hand-waving” overview of its usage and installation. Usage couldn’t be simpler, actually; you just continue business as usual,  with OpenBLAS transparently doing what base-R BLAS has always done for you.   Under more advanced usage, you might try to tweak things by setting the number of cores.

Installation is only a bit more elaborate, if you are comfortable building R from source.  At the configure stage, I ran

configure --prefix=/home/matloff/MyR311 --enable-BLAS-shlib

After running the usual make and make install.  I then needed to do a symbolic link of libRblas.so to the OpenBLAS library.  Since I’ve been doing timing comparisons for my book, I’ve made shell aliases to run either stock BLAS or OpenBLAS; a more sophisticated approach would have been to use update-alternatives.  

No doubt about it, OpenBLAS is fast, and many timing comparisons for R are to be found on the Web, including the Revo link above.  But what about numerical accuracy?  After seeing Mike Hannon’s recent post on R-help, along with Brian Ripley’s reply, I became curious, I searched the Web for information on this aspect, and came up empty-handed.  So, I  will present here the results of some simple experiments I’ve done as a result.

First, though, a disclaimer:  Although I know the basics of numerical analysis, I am not an expert in any sense, including the sense of being an expert on the various BLASes.  If anyone out there has more to add, that would be highly appreciated.

OpenBLAS derives its speed not just from making use of multiple cores, but also from various tweaks of the code, yielding a very fine degree of optimization.  One can thus envision a development team (which, by the way, took over the old Goto BLAS project) so obsessed with speed that they might cut some corners regarding numerical accuracy.  Thus the latter is a subject of legitimate concern.

For my little test here, I chose to compute eigenvalues, using R’s eigen() function.  I generated p x p unit covariance matrices (1s on the diagonal, ρ everywhere off the diagonal) for my test:

covrho <- function(p,rho) {
 m <- diag(p)
 m[row(m) != col(m)] <- rho
 m
}

I tried this with various values of p and ρ; here I’ll show the results for 2500 and 0.95, respectively.  The machine used has 16 cores, plus a hyperthreading degree of 2; OpenBLAS likely used 32 threads.

With the standard R BLAS, the elapsed time was 57.407.  Under OpenBLAS, that time was reduced to 12.101.

But interestingly, the first eigenvalue was found to be 2375.05 in both cases.  (This was the exact value in eout$values[1], where eout was the return value from eigen().)

Changing ρ to 0.995, I got reported principal eigenvalues of 2487.505 in both cases.  (Timings were roughly as before.)

As another example, I also tried finding the matrix inverse for this last matrix, using solve().   Both versions of BLAS gave 199.92 as the [1,1] element of the inverse.  Interestingly, though, there was a wider time discrepancy, 53.955 seconds versus 0.933.

It is a little odd that the numbers come out with so few decimal places.  I wonder whether R is deliberately doing some rounding, based on estimates of accuracy.  In any event, I would generally caution against looking at too many decimal places, no matter how good the accuracy is, since typically the input data itself is not so accurate.

So, it seems, at first glance, that OpenBLAS is doing fine.  But Brian has an excellent point about the value of sticking with the tried-and-true, in this case meaning, R’s default BLAS implementation.

I invite you to try your own accuracy comparisons, and post them here.

Advertisements

5 thoughts on “Count Your BLAS-ings”

  1. I don’t think that comparing a few numbers with 2 or 3 decimal digits is a valid way of comparing. It would be most surprising if the difference between computed numbers would show any difference at all in the first 2 or 3 decimal digits.

    One should compare all the numerical outcomes on a binary level or at least with all.equal and a tight tolerance level.

    1. Excellent idea, thanks.

      I just tried it on the examples in my post, and all.equal() returned TRUE in all cases, with the default value of tolerance, about 10-8. I tried a more stringent value in one case, and still got TRUE.

      As I said, it may well be that R has already rounded off the results, based on an estimated accuracy level. If so, all.equal() would not catch discrepancies between the two BLAS implementations.

  2. You should be able to see many more digits by invoking the following in R:
    options(digits = 20)

    This should allow you to see the values as stored in memory in the C SEXP object underlying the R object. R isn’t doing any rounding except in the sense of only printing out a limited number of digits. The numerical precision of computing with and storing numbers as 64 bit double precision floating points is the real limitation here.

    1. I tried it, again with a 2500×2500 correlation matrix, off-diagonal elements equal to 0.95. For plain R, the first eigenvalue was reported to be 2375.0500000000147338, while OpenBLAS gave it as 2375.049999999991087. The proportional difference, about 10-14, seems pretty good for such an ill-conditioned matrix. And of course, we don’t know which reported eigenvalue is closer to the “true” one.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s