The “Secret Sauce” Used in Many qeML Functions

In writing an R package, it is often useful to build up some function call in string form, then “execute” the string. To give a really simple example:

> s <- '1+1'
> eval(parse(text=s))
[1] 2

Quite a lot of trouble to go to just to find that 1+1 = 2? Yes, but this trick can be extremely useful, as we’ll see here. Consider this qeML call:

data(svcensus)
z <- qePCA(svcensus,'wageinc','qeKNN',pcaProp=0.5)

This says, “Apply Principal Component Analysis to the ‘svcensus’ data, with enough PCs to get 0.5 of the total variance. Then do k-Nearest Neighbor Analysis, fitting qeKNN to the PCs to predict wage income.”

So we are invoking a qeML function that the user requested, on the user’s requested data. Fine, but the requested qeML function is called using its default values, in this case k = 25, the number of neighbors. What if we want a different value of k, say 50? We can run

z <- qePCA(svcensus,'wageinc','qeKNN',pcaProp=0.5,
   opts=list(k=50))

But how does the internal code of qePCA handle this? Here is where eval comes in. Typing qePCA at the R > prompt shows us the code, two key lines of which are

cmd <- buildQEcall(qeName,"newData",yName, 
   opts=opts,holdout=holdout)
qeOut <- eval(parse(text = cmd))

So qeML:::buildQEcall builds up the full call to the user-requested function, including optional arguments, in a character string. Then we use eval and parse to execute the string. Inserting a call in qePCA to browser (not shown), we can take a look at that string:

Browse[1]> cmd
[1] "qeKNN(data = newData,yName=\"wageinc\",holdout = 1000,k=50)"

So qeML:::buildQEcall pieced together the call to the user’s requested function, qeKNN, on the user’s requested data, and with the user’s requested optional arguments. In handling the latter, it called, among other things, names(opts), which returned the argument names (just ‘k’ here) in string form, exactly what we need.
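
To make the string-building idea concrete, here is a minimal sketch of how such a helper might work. It is not the actual qeML:::buildQEcall source; the names buildCallString and toyFit are hypothetical, and toyFit is just a stand-in so that the example runs without qeML installed.

```r
# Hypothetical helper, NOT the real qeML:::buildQEcall.
# Builds a call in string form, e.g. "f(data = d,yName=\"y\",k=50)".
buildCallString <- function(fName, dataName, yName, opts = NULL) {
   cmd <- paste0(fName, '(data = ', dataName, ',yName="', yName, '"')
   if (length(opts) > 0) {
      # names(opts) gives the argument names as strings, e.g. "k";
      # deparse() turns each value back into R source text, e.g. "50"
      vals <- vapply(opts, function(v) paste(deparse(v), collapse = ''),
         character(1))
      cmd <- paste0(cmd, ',', paste0(names(opts), '=', vals, collapse = ','))
   }
   paste0(cmd, ')')
}

# Toy stand-in for a qeML function (an assumption for illustration only)
toyFit <- function(data, yName, k = 25) {
   list(n = nrow(data), yName = yName, k = k)
}

newData <- data.frame(x = 1:10, wageinc = 11:20)
cmd <- buildCallString('toyFit', 'newData', 'wageinc', opts = list(k = 50))
print(cmd)

# "Execute" the string, just as in the 1+1 example
out <- eval(parse(text = cmd))
out$k
```

Note that the data is referred to by name ("newData") inside the string, so that variable must exist in the environment where eval is run.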

Note too R’s call function, which similarly creates a function call. (Yes, call produces a call, just like function produces a function.) E.g.,

> sqrt(2)
[1] 1.414214
> sqcall <- call('sqrt',2)
> class(sqcall)
[1] "call"
> eval(sqcall)
[1] 1.414214
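
As a rough sketch of how call objects could handle the optional-arguments situation without going through a string, one can assemble the arguments in a list and use base R's as.call or do.call. Here toyFit is a hypothetical stand-in for a qeML function, and this is not how qeML itself does it.

```r
# Toy stand-in for a qeML function (hypothetical; keeps the example self-contained)
toyFit <- function(data, yName, k = 25) {
   list(n = nrow(data), yName = yName, k = k)
}

newData <- data.frame(x = 1:10, wageinc = 11:20)
opts <- list(k = 50)

# as.call() takes a list whose first element is the function (here its name,
# as a symbol) and whose remaining elements are the arguments
fitCall <- as.call(c(list(as.name('toyFit'), data = as.name('newData'),
   yName = 'wageinc'), opts))
print(fitCall)
out1 <- eval(fitCall)

# do.call() builds and evaluates such a call in one step
out2 <- do.call('toyFit', c(list(data = newData, yName = 'wageinc'), opts))
identical(out1, out2)
```

The list-based form avoids worrying about quoting and escaping inside a string, at the cost of being a bit less easy to inspect in the browser.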

All of this is an example of R’s metaprogramming capabilities (code that produces code), a really cool feature of R that computer science people tend to be ignorant of. Next time you encounter a CS Pythonista who dismisses R as “not a real language,” reply “R is built on functional programming and OOP models, with powerful metaprogramming facilities…”

7 thoughts on “The “Secret Sauce” Used in Many qeML Functions”


It may be good to note that you can pass in functions directly; it does not have to be a string. For example:

f2 <- function(x,f){
   x2 <- f(x)
   return(x2/2)
}

# Built-in dnorm
f2(1,dnorm)

# Custom function
cf <- function(x){return(x+1)}
f2(1,cf)

matloff says:

November 22, 2023 at 3:00 pm

Right, but the key point is that the function in question may have optional arguments. If qeML were not to offer the ‘opts’ argument here, the user would have to create his/her own function with those arguments set, essentially what you are doing here with ‘cf’. I always say, “Convenience is the name of the game in computing,” and the ‘opts’ argument is conveniencing the user.

1. carlwitthoft says:
  
  November 22, 2023 at 5:03 pm
  
  That’s what the “…” argument is for
  
  1. matloff says:
    
    November 22, 2023 at 5:09 pm
    
    Definitely another way to do it.
    

Horses for courses, I know, and Python makes this easier via **kwargs, but here is how I like to do data pipelines with multiple steps.

# Example with multiple functions
pipeline <- list(Step1=list(func=dnorm,kwargs=list(mean=2,sd=3)),
   Step2=list(func=dexp,kwargs=list(rate=0.5)),
   Step3=list(func=dnorm,kwargs=list())
)

fpipe <- function(x,fs){
   for (step in fs){
      # prepend x as the first argument, then append the step's stored arguments
      x <- do.call(step$func,c(list(x),step$kwargs))
   }
   return(x)
}

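# Input value 1 is assumed here for illustration
d <- fpipe(1,pipeline)
# Equivalent with the native pipe:
# 1 |> dnorm(mean=2,sd=3) |> dexp(rate=0.5) |> dnorm()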

matloff says:

November 22, 2023 at 4:50 pm

I’m a big Python fan, but NOT for Data Science (https://tinyurl.com/y3tbozlv), and I wish to keep my remarks on R.


Mad (Data) Scientist

Musings, useful code etc. on R and data science
