# Why Are We Still Teaching t-Tests?

My posting about the statistics profession losing ground to computer science drew many comments, not only here in Mad (Data) Scientist, but also in the co-posting at Revolution Analytics, and in Slashdot.  One of the themes in those comments was that Statistics Departments are out of touch and have failed to modernize their curricula.  Though I may disagree with the commenters’ definitions of “modern,” I have in fact long felt that there are indeed serious problems in statistics curricula.

I must clarify before continuing that I do NOT advocate that, to paraphrase Shakespeare, “First thing we do, we kill all the theoreticians.”   A precise mathematical understanding of the concepts is crucial to good applications.  But stat curricula are not realistic.

I’ll use Student t-tests to illustrate.  (This is material from my open-source book on probability and statistics.)  The t-test is an exemplar for the curricular ills in three separate senses:

• Significance testing has long been known to be under-informative at best, and highly misleading at worst.  Yet it is the core of almost any applied stat course.  Why are we still teaching — actually highlighting — a method that is recognized to be harmful?
• We prescribe the use of the t-test in situations in which  the sampled population has an exact normal distribution — when we know full well that there is no such animal.  All real-life random variables are bounded (as opposed to the infinite-support normal distributions) and discrete (unlike the continuous normal family).  [Clarification, added 9/17:  I advocate skipping the t-distribution,  and going directly to inference based on the Central Limit Theorem.  Same for regression.  See my book.]
• Going hand-in-hand with the t-test is the sample variance. The classic quantity s² is an unbiased estimate of the population variance σ², with s² defined as 1/(n-1) times the sum of squares of our data relative to the sample mean.  The concept of unbiasedness does have a place, yes, but in this case there really is no point to dividing by n-1 rather than n.  Indeed, even if we do divide by n-1, it is easily shown that the quantity that we actually need, s rather than s², is a BIASED (downward) estimate of σ.  So that n-1 factor is much ado about nothing.
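
The bias claim in the third bullet can be checked exactly rather than by simulation. The sketch below is only an illustration, written in Python for convenience: it assumes i.i.d. sampling from a population uniform on 1,2,…,10 with sample size n = 3 (both choices are mine), and the function name `expected_estimates` is hypothetical. It averages each estimator over all 1,000 equally likely samples.

```python
from fractions import Fraction
from itertools import product
from math import sqrt

def expected_estimates(population, n):
    """Average s^2 (both divisors) and s over every equally likely sample of size n."""
    samples = list(product(population, repeat=n))  # i.i.d. draws, with replacement
    total_var_n1 = Fraction(0)  # running sum of s^2 with divisor n - 1
    total_var_n = Fraction(0)   # running sum of s^2 with divisor n
    total_sd_n1 = 0.0           # running sum of s = sqrt(s^2), divisor n - 1
    for sample in samples:
        mean = Fraction(sum(sample), n)
        ss = sum((x - mean) ** 2 for x in sample)  # sum of squared deviations
        total_var_n1 += ss / (n - 1)
        total_var_n += ss / n
        total_sd_n1 += sqrt(ss / (n - 1))
    count = len(samples)
    return total_var_n1 / count, total_var_n / count, total_sd_n1 / count

population = range(1, 11)  # assumed population: uniform on 1..10
sigma2 = Fraction(sum(x * x for x in population), 10) - Fraction(sum(population), 10) ** 2
mean_s2_n1, mean_s2_n, mean_s = expected_estimates(population, n=3)

print("population variance:", sigma2)               # 33/4
print("E[s^2], divisor n-1:", mean_s2_n1)           # 33/4, i.e. unbiased
print("E[s^2], divisor n:  ", mean_s2_n)            # 11/2
print("E[s] < sigma?       ", mean_s < sqrt(sigma2))  # True
```

Dividing by n-1 does make s² unbiased, but the average of s still falls below σ, which is what Jensen's inequality predicts for the square root of an unbiased variance estimate.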

Right from the beginning, then, in the very first course a student takes in statistics, the star of the show, the t-test, has three major problems.
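
As for what would replace the t-test, the clarification in the second bullet points to inference based on the Central Limit Theorem. A minimal sketch of such an interval for a mean follows. The function name `clt_interval`, the choice of divisor n, and the data are all assumptions made for illustration, not code from my book.

```python
from math import sqrt
from statistics import NormalDist, fmean

def clt_interval(data, level=0.95):
    """Approximate confidence interval for a population mean via the CLT."""
    n = len(data)
    mean = fmean(data)
    # Divisor n here; swapping in n - 1 barely moves the interval for moderate n.
    s = sqrt(sum((x - mean) ** 2 for x in data) / n)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 when level is 0.95
    half_width = z * s / sqrt(n)
    return mean - half_width, mean + half_width

# Made-up, bounded, discrete data (illustration only).
scores = [2, 3, 3, 4, 5, 5, 6, 7, 7, 8] * 5
low, high = clt_interval(scores)
print(f"approximate 95% interval for the mean: ({low:.2f}, {high:.2f})")
```

Nothing here assumes a normal population. The interval is frankly approximate, which is arguably more honest than calling a t-based interval "exact."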

Sadly, the R language largely caters to this old-fashioned, unwarranted thinking.  The var() and sd() functions use that 1/(n-1) factor, for example — a bit of a shock to unwary students who wish to find the variance of a random variable uniformly distributed on, say, 1,2,…,10.
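
The surprise is easy to reproduce. For a random variable uniform on 1,2,…,10 the variance is (10² - 1)/12 = 8.25, yet R's var(1:10) returns 9.166667 because of the n-1 divisor. Python's standard `statistics` module makes the same distinction with two separate functions, so here is a small sketch in Python (chosen only for illustration):

```python
from statistics import pvariance, variance

values = list(range(1, 11))  # the support 1, 2, ..., 10

# Divisor n - 1: treats the ten values as a sample (what R's var() does).
print(f"{variance(values):.4f}")   # 9.1667

# Divisor n: treats the ten values as the whole population.
print(f"{pvariance(values):.4f}")  # 8.2500
```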

Much more importantly, R’s statistical procedures are centered far too much on significance testing.  Take ks.test(), for instance; all one can do is a significance test, when it would be nice to be able to obtain a confidence band for the true cdf.  Or consider log-linear models:  The loglin() function is so centered on testing that the user must proactively request parameter estimates, never mind standard errors.  (One can get the latter by using glm() as a workaround, but one shouldn’t have to do this.)
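
To make the confidence band idea concrete, one standard construction (my choice for this sketch, not something ks.test() provides) uses the Dvoretzky-Kiefer-Wolfowitz inequality: with probability at least 1 - α, the true cdf lies within ε = sqrt(ln(2/α)/(2n)) of the empirical cdf everywhere. The function names and data below are hypothetical.

```python
from bisect import bisect_right
from math import log, sqrt

def dkw_epsilon(n, alpha=0.05):
    """Half-width of the DKW band: solve 2 * exp(-2 * n * eps**2) = alpha for eps."""
    return sqrt(log(2 / alpha) / (2 * n))

def dkw_band(data, alpha=0.05):
    """Return (x, lower, upper) triples forming a 1 - alpha band for the true cdf."""
    xs = sorted(data)
    n = len(xs)
    eps = dkw_epsilon(n, alpha)
    band = []
    for x in sorted(set(xs)):
        ecdf = bisect_right(xs, x) / n  # proportion of observations <= x
        # Clip to [0, 1], since a cdf cannot leave that range.
        band.append((x, max(ecdf - eps, 0.0), min(ecdf + eps, 1.0)))
    return band

# Made-up data, n = 100 (illustration only).
data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 6] * 10
for x, lower, upper in dkw_band(data):
    print(f"F({x}) is between {lower:.3f} and {upper:.3f}")
```

The band is conservative, but it answers the question an analyst actually has, namely where the true cdf plausibly lies, rather than a yes/no verdict.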

I loved the suggestion by Frank Harrell in r-devel to at least remove the “star system” (asterisks of varying numbers for different p-values) from R output.  A Quixotic action on Frank’s part (so of course I chimed in, in support of his point); sadly, no way would such a change be made.  To be sure, R in fact is modern in many ways, but there are some problems nevertheless.

In my blog posting cited above, I was especially worried that the stat field is not attracting enough of the “best and brightest” students.  Well, any thoughtful student can see the folly of claiming the t-test to be “exact.”  And if a sharp student looks closely, he/she will notice the hypocrisy of using the 1/(n-1) factor in estimating variance for comparing two general means, but NOT doing so when comparing two proportions.  If unbiasedness is so vital, why not use 1/(n-1) in the proportions case, a skeptical student might ask?
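
The inconsistency is easy to see on paper. Write the data in the proportions case as 0/1 values $X_1,\dots,X_n$ with sample proportion $\hat p = \bar X$ (notation introduced here for illustration). Since $X_i^2 = X_i$, the divisor-n sample variance is exactly the quantity used in the standard formulas for proportions:

$$
\frac{1}{n}\sum_{i=1}^{n} (X_i - \hat p)^2
= \frac{1}{n}\sum_{i=1}^{n} X_i^2 - \hat p^2
= \hat p - \hat p^2
= \hat p\,(1 - \hat p),
\qquad
\frac{1}{n-1}\sum_{i=1}^{n} (X_i - \hat p)^2 = \frac{n}{n-1}\,\hat p\,(1 - \hat p).
$$

So the usual standard error for a proportion, based on $\hat p(1-\hat p)/n$, quietly divides by n, while the same course insists on n-1 for general means.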

Some years ago, an Israeli statistician, upon hearing me kvetch like this, said I would enjoy a book written by one of his countrymen, titled What’s Not What in Statistics.  Unfortunately, I’ve never been able to find it.  But a good cleanup along those lines of the way statistics is taught is long overdue.

# Good for TI, Good for Schools, Bad for Kids, Bad for Stat

In my last post, I agreed with Prof. Xiao-Li Meng that Advanced Placement (AP) Statistics courses turn many students away from the statistics field, because they are structured in a manner that makes for a boring class.  I cited as one of the problems the fact that the course officially requires TI calculators.  This is a sad waste of resources, as the machines are expensive while R is free, and R is capable of doing things that are much more engaging for kids.

Interestingly, this week the Washington Post ran an article on the monopoly that TI calculators have in the schools.  This was picked up by a Slashdot poster, who connected it to my blog post on AP Stat.  The Post article has some interesting implications.

As the article notes, it’s not just an issue of calculators vs. R.  It’s an issue of calculators in general vs. the TI calculator.  Whether by shrewd business strategy or just luck, TI has attained a structural monopoly.  The textbooks and standardized exams make use of TI calculators, which forces all the teachers to use that particular brand.

Further reinforcing that monopoly are the kickbacks, er, donations to the schools.  When my daughter was in junior high school and was told by the school to buy a TI calculator, I noticed at the store that Casio calculators were both cheaper and had more capabilities.  I asked the teacher about this, and she explained that TI makes donations to the schools.

All this shows why Ms. Chow, the Casio rep quoted in the article, is facing an uphill battle in trying to get schools to use her brand. But there is also something very troubling about Chow’s comment, “That is one thing we do struggle with, teachers worried about how long it is going to take them to learn [Casio products].”  Math teachers would have trouble learning to use a calculator?  MATH teachers?!  I am usually NOT one to bash the U.S. school system, but if many math teachers are this technically challenged, one must question whether they should be teaching math in the first place.  This also goes to the point in my last blog post that kids generally are not getting college-level instruction in the nominally college-level AP Stat courses.

Chow’s comment also relates to my speculation that, if there were a serious proposal to switch from TI to R, the biggest source of resistance would be the AP Stat teachers themselves.  Yet I contend that even they would find that it is easy to learn R to the level needed, meaning being able to do what they currently do on TIs—and to go further, such as analyzing large data sets that engage kids, producing nice color graphics.  This is not hard at all; the teachers don’t need to become programmers.

The Post article also brings up the issue of logistics.  How would teachers give in-class tests in an R-based AP Stat curriculum?  How would the national AP Stat exam handle this?

Those who dismiss using R for AP Stat on such logistical grounds may be shocked to know that the AP Computer Science exam is not conducted with a live programmable computer at hand either. It’s all on paper, with the form of the questions being designed so that a computer is not needed.  (See the sample test here.)  My point is that, if even a test that is specifically about programming can be given without a live computer present, certainly the AP Stat course doesn’t need one either.  For that matter, most questions on the AP Stat exam  concentrate on concepts, not computation, anyway, which is the way it should be.

The teachers should demand a stop to this calculator scam, and demand that the textbooks, the AP Stat exam, and so on be based on R (or some other free software) rather than on expensive calculators. The kids would benefit, and so would the field of statistics.