In my last post, I discussed R software, including my own, that handles heteroscedasticity in linear and nonlinear regression models. Several readers had interesting comments and questions, which I will address here.
To review: Though most books and software assume homoscedasticity, i.e., constancy of the variance of the response variable across all levels of the predictor variables, this condition seldom if ever holds in real life. That can make a mockery of the reported standard errors, rendering the resulting confidence intervals and tests rather meaningless. Fortunately, for the linear case there is asymptotic theory, due to Eicker, White and others, that gives us valid standard errors under heteroscedasticity. In R, this is available via the functions vcovHC() and hccm() in the CRAN packages sandwich and car, respectively.
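For those who haven't used these functions before, here is a minimal sketch on a built-in dataset; the model is arbitrary, chosen only for illustration:

library(sandwich)
library(car)
lmout <- lm(mpg ~ wt + hp, data = mtcars)  # any linear model will do
vcovHC(lmout)              # Eicker-White robust covariance, from sandwich
hccm(lmout)                # the analogous computation, from car
sqrt(diag(vcovHC(lmout)))  # heteroscedasticity-robust standard errors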
I mentioned that I had adapted the E-W idea to the nonlinear realm, leveraging the linear-approximation core of the Gauss-Newton method for nonlinear least-squares computation: I call vcovHC() on the linear approximation. The result is the function nlshc() in my package regtools. (Unfortunately, I erroneously used its old name, nlsvcovhc(), in my post.) One simply calls nlshc() on an object of class 'nls' (the output of R's nls() function) to obtain standard errors that are valid under heteroscedasticity.
Achim Zeileis, one of the authors of sandwich, then commented on my blog post. He pointed out that one can actually call sandwich() directly on an 'nls' object (again, thanks to the linear approximation). After reading my post, he tried that, and found that the resulting standard errors were more realistic than what one gets directly from nls(), but still not as good as mine, where "good" means correct confidence levels for confidence intervals and so on. He speculated that this was due to the particular variant of E-W that I use. Well, here is an update.
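For concreteness, here is a small self-contained sketch of that direct call; the data and model are made up purely for illustration:

library(sandwich)
set.seed(1)
x <- runif(100, 1, 10)
y <- 2 * x / (x + 3) + rnorm(100, sd = 0.05 * x)  # noise grows with x
zfit <- nls(y ~ a * x / (x + b), start = list(a = 1, b = 1))
sandwich(zfit)  # E-W style covariance, via the linear approximation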
First, Achim's speculation was right: I tried some of the other E-W variants, and they didn't do as well. Of course, other factors may also be at work.
Upon further investigation, I found another intriguing phenomenon. I tried nlshc() on one of the NIST datasets, as follows:
library(nlstools)
library(regtools)  # for nlshc()
data(vmkmki)
# competitive-inhibition kinetics model
regftn <- function(t, u, b1, b2, b3) b1 * t / (t + b2 * (1 + u/b3))
bstart <- list(b1=1, b2=1, b3=1)
z <- nls(v ~ regftn(S, I, b1, b2, b3), data = vmkmki, start = bstart)
nlshc(z)  # heteroscedasticity-robust covariance matrix
To my surprise, the output was an all-NAs matrix. This did not happen with hccm(). This is strange, as vcovHC() works fine otherwise, e.g., in the simulation I posted last time.
So, for the time being, in the new nlshc() I have replaced the call to vcovHC() with a call to hccm(). I've also added an option for the user to choose a different E-W variant, though given what we know so far, I don't recommend it.
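In use, that option looks roughly like this; the argument name type is my shorthand here, mirroring hccm()'s interface, so check the package documentation for the actual name:

nlshc(z)                # default variant, recommended
nlshc(z, type = "hc0")  # hypothetical: request a different E-W variant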
A couple of commenters asked about using alternatives to nls(), which in fact I had done with nlsLM(). I chose that function because I had run into some convergence problems with nls(), but I have reverted to nls() in the new version of the package. (I will also make my function an S3 method later on, which I had originally decided not to do, but I have changed my mind with Achim's encouragement 🙂 )
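Just to illustrate the mechanism, here is a sketch of one way the S3 arrangement could look; this is not the final design:

# sketch only: set up nlshc() as a generic with an 'nls' method
nlshc <- function(object, ...) UseMethod("nlshc")
nlshc.nls <- function(object, ...) {
  # ... the existing nlshc() computation would go here ...
}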
What was the reason to revert from nlsLM to nls again? Any issues I should know of?
Cheers,
Andrej (one of the nlsLM developers)
I was having convergence problems with one particular data set, and thought that the Levenberg-Marquardt approach might be more promising. However, this was counterbalanced by a desire to minimize the number of external packages that my regtools package depends on. So, I stopped insisting that nlsLM() be used instead of nls().
However, users certainly can still use nlsLM() if they choose to do so, since it is compatible with nls() in the ways I need, and I will add a comment to that effect in the help file.
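To illustrate, since nlsLM() returns an object inheriting from class 'nls', the earlier example works with it unchanged (a sketch, assuming the vmkmki setup from above):

library(minpack.lm)
zLM <- nlsLM(v ~ regftn(S, I, b1, b2, b3), data = vmkmki, start = bstart)
class(zLM)  # inherits from "nls"
nlshc(zLM)  # so nlshc() applies as before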
The problem with vcovHC() is due to the last sixteen observations, where all regressors are exactly zero. I've added a bug fix in the sandwich package, will do some testing, and then make a new CRAN release.
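For reference, those rows can be inspected directly, per the description above:

tail(vmkmki, 16)  # the final sixteen observations, with regressors equal to zero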
Version 2.3-4 of the “sandwich” package fixes the problem and is now online on the CRAN master site.
Interesting, thanks for checking. Odd that the problem didn’t occur with sandwich().
In sandwich() I just compute the outer product of the gradients. But in vcovHC() I want to determine the working residuals, and I do that by dividing the gradients by the regressors, which led to a 0/0 division in this case.
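Schematically, the failure mode looks like this (not the actual package code, just the arithmetic):

x <- c(2, 0)       # regressor values, the second exactly zero
r <- c(1.5, -0.7)  # working residuals
g <- r * x         # gradient contributions
g / x              # recovering residuals: 1.5 and NaN, from 0/0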
I chose this solution because some other conceivable routes did not work in the same way for all the objects I wanted to support. In hindsight, it would have been better to encapsulate the standardization in some method dispatch rather than trying to unify everything through bread() plus estfun()…