Most books on regression analysis assume homoscedasticity, the situation in which Var(Y | X = t), for a response variable Y and vector of predictor variables X, is the same for all t. Yet, needless to say, almost all data in real life is heteroscedastic. For Y = human weight and X = height, say, we know that the assumption of homoscedasticity can’t be true, even approximately.
Typical books discuss assessment of that assumption using residual plots and the like, but then leave it at that. Rather few books mention Eicker-White theory, which develops valid asymptotic inference for heteroscedastic data. E-W is really nice, and guess what! — it's long been available in R, in the sandwich and car packages on CRAN. (Note that the latter package is intended for use with a book that does cover this topic, J. Fox and S. Weisberg, An R Companion to Applied Regression, Second Edition, Sage, 2011.) Then, instead of using R's standard vcov() function to obtain the estimated variances and covariances of the estimates of the βi, we use vcovHC() from sandwich or hccm() from car.
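For instance, here is a minimal sketch (not from the original post; the data are simulated purely for illustration) of how the two functions would be used with an ordinary lm() fit:

library(sandwich)   # vcovHC()
library(car)        # hccm()
set.seed(1)
x <- runif(200)
y <- 1 + 2*x + rnorm(200, sd = x)   # error variance grows with x
lmout <- lm(y ~ x)
vcov(lmout)      # usual estimate, assumes homoscedasticity
vcovHC(lmout)    # Eicker-White (sandwich) estimate
hccm(lmout)      # car's heteroscedasticity-corrected covariance matrix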
One can carry out a similar derivation for nonlinear regression; the result is implemented as the function nlshc() in my regtools package. The package will be associated with my own book on regression, currently in progress. (The package is currently in progress as well, but already contains several useful functions.) The rest of this post is adapted from an example in the book.
The model I chose for this simple example was E(Y | X = t) = 1 / t’β, where the distributions of the quantities can be seen in the following simulation code:
sim <- function(n,nreps) {
   b <- 1:2   # true coefficient vector
   res <- replicate(nreps,{
      # generate predictors and a heteroscedastic response:
      # Var(Y | X = t) is proportional to the square of the conditional mean
      x <- matrix(rexp(2*n),ncol=2)
      meany <- 1 / (x %*% b)
      y <- meany + (runif(n) - 0.5) * meany
      xy <- cbind(x,y)
      xy <- data.frame(xy)
      # fit the nonlinear model E(Y | X = t) = 1 / t'beta
      nlout <- nls(X3 ~ 1 / (b1*X1+b2*X2),
         data=xy,start=list(b1 = 1,b2=1))
      b <- coef(nlout)
      vc <- vcov(nlout)          # usual covariance estimate from nls()
      vchc <- nlsvcovhc(nlout)   # Eicker-White estimate, from the regtools package
      # standardized estimates of beta_1 under each covariance estimate
      z1 <- (b[1] - 1) / sqrt(vc[1,1])
      z2 <- (b[1] - 1) / sqrt(vchc[1,1])
      c(z1,z2)
   })
   # proportion of runs in which each z fell below the N(0,1) 0.90 quantile
   print(mean(res[1,] < 1.28))
   print(mean(res[2,] < 1.28))
}
The results were striking:
> sim(250,2500)
[1] 0.6188
[1] 0.9096
Since the true value should be 0.90 (1.28 is the 0.90 quantile of N(0,1)), the Eicker-White method is doing an outstanding job here, while R's built-in nonlinear regression function, nls(), is not. The latter function's reported standard errors are way, way off the mark.
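(For readers who want to check that cutoff in R: qnorm(0.90) returns about 1.2816, the value compared against z1 and z2 above.)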
Hi Norm: I don't have time to try it right now, but a better comparison might be to use, say, Rvmmin (I guess you could try optim with, say, BFGS, but I know for a fact that L-BFGS-B is buggy, and I've had good results with Rvmmin, so I currently stick with that and stay away from anything related to BFGS, including BFGS itself), numerically optimize the likelihood, and then use the Hessian to calculate the asymptotic covariance matrix, and see how that compares. I've done things like that and, in some cases, the results were pretty comparable. But I was dealing with simple error terms like MA(1) and AR(1), which weren't a function of X.
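To make that concrete, here is a rough sketch of the idea (my own illustration, not code from the comment; it uses base R's optim() only to keep the sketch self-contained, whereas Rvmmin would be substituted in practice, and it assumes Gaussian errors, which is not what the simulation above uses):

# Sketch only: fit the model by maximizing a Gaussian log-likelihood numerically,
# then invert the Hessian at the optimum to estimate the asymptotic covariance.
set.seed(2)
n <- 250
x <- matrix(rexp(2*n), ncol = 2)
meany <- 1 / (x %*% 1:2)
y <- meany + (runif(n) - 0.5) * meany

negll <- function(theta) {                 # negative log-likelihood
   b <- theta[1:2]
   sigma <- exp(theta[3])                  # log scale keeps sigma positive
   mu <- 1 / (x %*% b)
   -sum(dnorm(y, mean = mu, sd = sigma, log = TRUE))
}

opt <- optim(c(1, 1, 0), negll, hessian = TRUE)
solve(opt$hessian)   # approximate asymptotic covariance of (b1, b2, log sigma)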
Using something like nls is kind of black-boxy, and I was never successful (I emailed Douglas Bates but never heard back) in figuring out exactly what it does under the hood. But I also wasn't familiar with the function you used to obtain vchc, so, in retrospect, that could have been quite useful. Where did you find that function? Thanks.
I’ve always found nonlinear regression to be rather treacherous. Often it is difficult to find starting values that lead to convergence, and if one does finally achieve convergence in such a setting, will the result mean anything statistically? I note such issues in my book.
I generally try to stay away from likelihood-based models. They can never be exactly correct, and will typically be way wrong in the tails.
I’m not sure what the query in your final paragraph means.
Possibly Mark was confused by the name of the function. Mark, you need to run devtools::install_github("matloff/regtools") and then you can say nlshc(nlout) – not nlsvcovhc(nlout) – to obtain the HC covariance matrix estimator.
Norm, thanks for the interesting post.
Note that the sandwich() function from the package of the same name can also be applied to "nls" objects. Using sandwich's object-oriented approach – see vignette("sandwich-OOP", package = "sandwich") – the package just provides bread() and estfun() methods for "nls" objects and then sandwich() works automatically.
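For example, something along these lines should work (a sketch with simulated data; the nls() call mirrors the one in the post):

library(sandwich)
set.seed(3)
n <- 250
x1 <- rexp(n); x2 <- rexp(n)
meany <- 1 / (x1 + 2*x2)
y <- meany + (runif(n) - 0.5) * meany
nlout <- nls(y ~ 1 / (b1*x1 + b2*x2), data = data.frame(x1, x2, y),
   start = list(b1 = 1, b2 = 1))
sandwich(nlout)               # HC covariance estimate via bread()/estfun()
sqrt(diag(sandwich(nlout)))   # the corresponding robust standard errors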
If I add the resulting simple sandwich() estimator to your simulation, it performs much better than the usual standard errors but not as well as your HC3-adapted version. I think this may be due to using the hat values in HC3…but I didn’t investigate this in more detail.
Further comments: The replication code in your blog calls the function nlsvcovhc() although it is actually called nlshc(). Maybe it would make more sense to call it vcovHC.nls and declare it an S3 method? Also, you could allow the type argument to be set by the user instead of hard-coding HC3. Then vcovHC(nlout, type = "HC2") etc. would also work. Finally, the nlshc() manual page has % signs which need to be escaped in .Rd format. Otherwise example(nlshc) does not work.
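A hypothetical sketch of that S3 arrangement (neither this method nor the type argument exists yet, so the names here are illustrative only):

# Hypothetical, not actual regtools or sandwich code: register an "nls" method
# for sandwich's vcovHC() generic, assuming nlshc() gained a 'type' argument.
vcovHC.nls <- function(x, type = "HC3", ...) {
   regtools::nlshc(x, type = type)
}
# Then vcovHC(nlout) and vcovHC(nlout, type = "HC2") would both dispatch here.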
Interesting remarks! Thanks very much, Achim.
As I recall, I did consider applying the sandwich functions directly. I’ll look into it again, and report the results back here on the blog.
I’ll also consider a more OOP approach.
From what I see in the nlshc code, this function would work with nls objects too. So why is there a dependence on the minpack.lm package? Also, is there a reference for the approach you used, i.e., that solving a certain regression gives heteroscedasticity-robust inference for the nls problem? I am asking since I maintain an R package, midasr, which solves a special case of NLS, so I am always looking out for ways to improve it. Robust inference and superior optimisation routines are of great interest to me.
Thanks very much for the comments. I’ll reply in a new post, probably either today or tomorrow.