The “Secret Sauce” Used in Many qeML Functions

In writing an R package, it is often useful to build up some function call in string form, then “execute” the string. To give a really simple example:

> s <- '1+1'
> eval(parse(text=s))
[1] 2

Quite a lot of trouble to go to just to find that 1+1 = 2? Yes, but this trick can be extremely useful, as we’ll see here. Consider the following call to the qeML function qePCA:

data(svcensus)
z <- qePCA(svcensus,'wageinc','qeKNN',pcaProp=0.5)

This says, “Apply Principal Component Analysis to the ‘svcensus’ data, with enough PCs to get 0.5 of the total variance. Then do k-Nearest Neighbor Analysis, fitting qeKNN to the PCs to predict wage income.”

So we are invoking a qeML function that the user requested, on the user’s requested data. Fine, but the requested qeML function is called using its default values, in this case k = 25, the number of neighbors. What if we want a different value of k, say 50? We can run

z <- qePCA(svcensus,'wageinc','qeKNN',pcaProp=0.5,
   opts=list(k=50))

But how does the internal code of qePCA handle this? Here is where eval comes in. Typing qePCA at the R > prompt shows us the code, two key lines of which are

cmd <- buildQEcall(qeName,"newData",yName, 
   opts=opts,holdout=holdout)
qeOut <- eval(parse(text = cmd))

So qeML:::buildQEcall builds up the full call to the user-requested function, including optional arguments, in a character string. Then we use eval and parse to execute the string. Inserting a call in qePCA to browser (not shown), we can take a look at that string:

Browse[1]> cmd
[1] "qeKNN(data = newData,yName=\"wageinc\",holdout = 1000,k=50)"

So qeML:::buildQEcall pieced together the call to the user’s requested function, qeKNN, on the user’s requested data, and with the user’s requested optional arguments. In handling the latter, it called, among other things, names(opts), which returned the argument names, ‘k’ here, in string form, which is exactly what we need.
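To make the idea concrete, here is a minimal sketch of how such a string might be assembled from an 'opts' list. The helper buildCallString is hypothetical and is not the actual qeML:::buildQEcall source; base R's mean stands in for qeKNN so that the example runs without qeML installed.

```r
# Hypothetical helper (NOT the qeML:::buildQEcall source): paste together a
# call string from a function name, a data object name, and a list of
# optional arguments.
buildCallString <- function(ftnName, dataName, opts = NULL) {
   cmd <- paste0(ftnName, "(", dataName)
   if (length(opts) > 0) {
      # names(opts) supplies the argument names as strings, e.g. "trim";
      # deparse() converts each value back into R source text
      argVals <- vapply(opts,
         function(v) paste(deparse(v), collapse = ""), character(1))
      optArgs <- paste0(names(opts), "=", argVals, collapse = ",")
      cmd <- paste0(cmd, ",", optArgs)
   }
   paste0(cmd, ")")
}

x <- c(1, 2, 3, 100)
cmd <- buildCallString("mean", "x", opts = list(trim = 0.25))
print(cmd)                          # "mean(x,trim=0.25)"
result <- eval(parse(text = cmd))   # same as typing mean(x,trim=0.25)
print(result)                       # 2.5
```

The string refers to the data by name ("x" here, "newData" in qePCA), so the object must exist in the environment where eval runs.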

Note too R’s call function, which similarly creates a function call. (Yes, call produces a call, just like function produces a function.) E.g.,

> sqrt(2)
[1] 1.414214
> sqcall <- call('sqrt',2)
> class(sqcall)
[1] "call"
> eval(sqcall)
[1] 1.414214
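Optional arguments can be handled this way too, without going through a string. The following is only a sketch using base R's as.call and do.call, again with mean standing in for a qeML function; it is not how qeML itself does it.

```r
opts <- list(trim = 0.25)
x <- c(1, 2, 3, 100)

# as.call() builds a call object from a list: the function name first,
# then the arguments; the names in 'opts' become named arguments
meanCall <- as.call(c(list(as.name("mean"), as.name("x")), opts))
print(class(meanCall))      # "call"
print(deparse(meanCall))    # "mean(x, trim = 0.25)"
viaCall <- eval(meanCall)

# do.call() constructs and evaluates the call in a single step
viaDoCall <- do.call("mean", c(list(x), opts))
print(c(viaCall, viaDoCall))
```

Both routes give the same result as typing mean(x, trim = 0.25) directly.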

All of this is an example of R’s metaprogramming capabilities–code that produces code–a really cool feature of R that computer science people tend to be ignorant of. Next time you encounter a CS Pythonista who dismisses R as “not a real language,” reply “R is built on functional programming and OOP models, with powerful metaprogramming facilities…”

7 thoughts on “The “Secret Sauce” Used in Many qeML Functions”

  1. It may be good to note that you can pass in functions directly; the argument does not have to be a string. So for example:

    f2 <- function(x,f){
       x2 <- f(x)
       return(x2/2)
    }

    # Built in dnorm
    f2(1,dnorm)

    # custom function
    cf <- function(x){return(x+1)}
    f2(1,cf)

    1. Right, but the key point is that the function in question may have optional arguments. If qeML were not to offer the ‘opts’ argument here, the user would have to create his/her own function with those arguments set, essentially what you are doing here with ‘cf’. I always say, “Convenience is the name of the game in computing,” and the ‘opts’ argument is conveniencing the user.

  2. Horses for courses, I know, and Python makes this easier via **kwargs, but here is how I like to do data pipelines with multiple steps.

    # Example with multiple functions
    pipeline <- list(
       Step1=list(func=dnorm,kwargs=list(mean=2,sd=3)),
       Step2=list(func=dexp,kwargs=list(rate=0.5)),
       Step3=list(func=dnorm,kwargs=list())
    )

    fpipe <- function(x,fs){
       for (step in fs){
          x <- do.call(step$func,c(list(x),step$kwargs))
       }
       return(x)
    }

    d |> dnorm(mean=2,sd=3) |> dexp(rate=0.5) |> dnorm()
