Innumeracy, Statistics and R

A couple of years ago, when an NPR journalist was interviewing me, the conversation turned to quantitative matters. The reporter said, only half jokingly, “We journalists are innumerate and proud.” 🙂

Sometimes it shows, badly. This morning a radio reporter stated, “Hillary Clinton beat Bernie Sanders among South Carolina African-Americans by an almost 9-to-1 ratio.” Actually, that vote was 86% to 14%, just above 6-to-1, not 9-to-1. A very troubling outcome for Bernie, to be sure, but an even more troubling error in quantitative reasoning by someone in the Fourth Estate who should know better.
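The arithmetic is easy to check. Here is a minimal sketch in Python (the helper name `ratio_to_one` is purely illustrative; the same division is a one-liner in R or on any calculator):

```python
def ratio_to_one(winner_pct, loser_pct):
    """Express a two-way vote split as an 'x-to-1' ratio."""
    return winner_pct / loser_pct

# The actual split reported above: 86% to 14%.
print(round(ratio_to_one(86, 14), 2))  # 6.14

# The split that a true 9-to-1 ratio would require: 90% to 10%.
print(round(ratio_to_one(90, 10), 2))  # 9.0
```

In other words, "9-to-1" describes a 90% to 10% result, not 86% to 14%.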

One of my favorite quotes is from Chen Lixin, an engineering professor at Northwestern Polytechnical University in Xi’an, who has warned that China produces students who can’t think independently or creatively, and have trouble solving practical problems. He wrote in 1999 that the Chinese education system “results in the phenomenon of high scores and low ability.” I must warn that we in the U.S. are moving in that same disastrous direction.

But I would warn even more urgently that the solution is NOT to make math education “more practical,” as proposed recently by noted political scientist Andrew Hacker. His solution is that, instead of requiring Algebra II of high school kids, we should allow them to substitute — you know what’s coming, don’t you? — statistics.  Ah, yes. Well, I disagree.

Hacker’s rationale is explained by the article I’ve linked to above:

Most CUNY students come from low-income families, and a 2009 faculty report found that 57 percent fail the system’s required algebra course. A subsequent study showed that when students were allowed to take a statistics class instead, only 44 percent failed.

Most statistics courses are taught, sad to say, in a formula-plugging manner. So, aside from the dubious, not to mention insulting, message that the above passage sends about students from the lower class, it’s just plain wrong in terms of the putative goal of achieving numeracy. In my teaching experience, having students take so-called “practical” courses will not avoid producing innumerate people who come up with things like the pathetic “9-to-1” statistic I referred to earlier in this article.

What does achieve the numeracy goal much better, in my opinion, is intensive hands-on experience, and current high-school statistics courses are NOT taught in that manner at all. They use handheld calculators, which are quite expensive — somehow that doesn’t seem to bother those who are otherwise concerned about students from financially strapped families — and which are pedagogical disasters.

My solution has been to use R as the computational vehicle in statistics courses, including at the high school level. Our real goal is to develop in kids an intuitive feel for numbers, how they work, what they are useful for and so on. Most current stat courses fail to do that, and as we know, actually dull the senses. We should have the students actively explore data sets, both with formal statistical analyses and with graphical description.

But in no way should it be “easy.” It should challenge the students, get them to think, in fact to THINK HARD. I strongly disagree with the notion that some kids are “incapable” of this, though of course it is easier to achieve with kids from stronger backgrounds.

I agree with the spouse of the author of the article, whose point is that Algebra II — and even more so, Geometry, if properly taught — develops analytical abilities in students. Isn’t that the whole point?

Finally, the formal aspects — the classical statistical inference procedures — DO matter. Data rummaging with R is great, but it should not replace formal concepts such as sampling, confidence intervals and so on. I was quite troubled by this statement by a professor who seems otherwise to be doing great things with R:

Creating a student who is capable of performing coherent statistical analysis in a single semester course is challenging. We [in the profession of teaching statistics] spend a fair amount of time discussing topics that may not be as useful as they once were (e.g., t-tests, inference for a single proportion, chi-squared tests) and not enough time building skills students are likely to use in their future research (e.g., a deeper understanding of regression, logistic regression, data visualization, and data wrangling skills).

It does NOT have to be either/or.
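Hands-on simulation and formal inference can reinforce each other. As a minimal sketch, assuming a hypothetical classroom exercise (the function names, the sample size of 500, and the 2,000 repetitions are illustrative assumptions, and Python is used here only for concreteness; R would serve equally well), students could draw repeated samples from a population in which 86% favor one candidate, compute the usual normal-approximation 95% confidence interval each time, and count how often the interval captures the true value:

```python
import math
import random

def wald_interval(successes, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p_hat = successes / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

def coverage(true_p, n, reps, seed=1):
    """Fraction of simulated samples whose interval contains true_p."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(reps):
        # Simulate n independent voters, each favoring the candidate with prob. true_p.
        successes = sum(rng.random() < true_p for _ in range(n))
        low, high = wald_interval(successes, n)
        hits += low <= true_p <= high
    return hits / reps

# Hypothetical setup: true proportion 0.86, samples of 500, 2000 repetitions.
print(coverage(true_p=0.86, n=500, reps=2000))
```

The printed fraction should land close to 0.95. That is the point of the exercise: a "95% confidence interval" is a statement about how the procedure behaves over repeated samples, which students can see for themselves rather than take on faith.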

The innumeracy problem is quite pressing. We might even say we are in a crisis. But let’s take care to find solutions that really do solve the problem.

22 thoughts on “Innumeracy, Statistics and R”

  1. I suggest all high school students be taught to calculate the odds properly in blackjack, poker, and craps.

  2. Professor Matloff, thank you ever so much for sharing your thoughts. I am a ‘machine learner’ as they call me, but I haven’t yet developed the ‘intuitive feel for numbers, what they are for, how they work, etc.’. Could you please suggest how I ought to make a start to achieve this? Any guidance at all, and I shall be immensely thankful.
    I agree that currently Stats. courses are taught in a way that dulls the senses. I want to learn the subject in a way that our predecessors might have approached it – Descartes, Wittgenstein, Leibniz, and others.

    1. This really comes from within. As I said, work a lot with data, think about what we are really trying to do, and whether existing methods achieve that goal.

  3. At least in the US, the word “statistics”, in common parlance, is taken as synonymous with sports performance data; batting average and the like, advanced analytics for crying out loud. What is normally covered in chap. 1 or 2 of any Baby Stat book: descriptive statistics. Even then, I object to “descriptive” statistics.

    The other issue is that high school math is seldom well connected into college math. For many students, high school is the last time any see “math”. And we’re all aghast that the London Whale, et al, can’t even run an Excel macro (much less write one)? What should be a settled progression of material, semantics, and syntax just isn’t. Thus, the low level prof condemned to teach freshmen is always dealing with a Tower of Babel student body. There is no common foundation of prior learning, and a strong instinct to “drown the bunnies” (I know, I was in just such a class, and not recently, but decades ago; bunny-cide has been part and parcel of math ed for more decades than I’ve been alive).

    As to t-tests vs. regression and such. It is a legitimate controversy. I don’t know of a text, whether just stat or just R or R/stat that makes the intellectual connection from one to the other; only that t-tests are early in the book while regression is later. Fact is, I tell my students the ultimate apostasy, “it’s all just squared differences, variously configured”.

  4. Good teachers are hard to come by, but curricula could be better constructed to support mediocre teachers. Unfortunately, there is not yet enough agreement on what constitutes a “good” curriculum. There is too much sway held by professional “educators” who have no idea what real mathematics literacy entails, nor what is essential to learn and absorb in that field or science in general. Almost all the new curricula I have seen waste way too much time on what really are “junk” activities that have no lasting value.

  5. I couldn’t agree more. Even statistics BSc courses here in Australia tend to be overly ‘practical’ – which means that graduates have a very sketchy understanding of inference. There is nothing more practical than a solid theoretical basis, and well-honed analytical ability. As an employer, I never minded teaching young statisticians consulting skills, but I really resented having to teach them what a confidence interval actually is.

  6. “‘results in the phenomenon of high scores and low ability.’ I must warn that we in the U.S. are moving in that same disastrous direction.”

    My wife is Chinese & currently tenured in the US higher education system. I am a statistician/“data scientist”/insert-markety-buzzword-here in the healthcare sector. She & I have conversations about this pretty much daily. The main point of our discussions can be summed up in your line that I quoted above, which I would love to tweet to the US Dept of Education. Over. And over. And over again. Although your blog post is particularly focused on math (which I can relate to since I discovered my love of math & statistics when I was already in the workforce, after growing up being told that I wasn’t good at math), I think it is indicative of our entire educational system. And I would echo that it is, indeed, dangerous. I could write paragraphs about this, but suffice it to say, my wife lived through this educational approach at its finest (read: most unhealthily regimented & competitive), & what we are creating in the US is merely a knockoff of another culture’s system. I cynically believe it to be a race toward mediocrity. Ultimately my wife & I have put our own kids in an alternative private school which is focused on internalizing the love & relevance of learning, & connecting academic/theoretical concepts to everyday life.

    As a previous commenter noted, it’s hard to find “good” educators; what we really need is a better system to support teaching math/STEM. It’s hard to see how initiatives like Common Core are even remotely aligned to that, since the purpose of these kinds of programs is simply to further standardize & increase requirements in the curricula. If you’re using R or any program to help reinforce or impart mathematical concepts, more power to you. But when I read that there will be a nationwide initiative to implement programming in all schools, and this is our solution to being a STEM-deficient nation, my first thought is: wow, so now we’ll make kids hate programming just as much as they hate math, English, & history. Awesome! Personally, I am really hoping I can introduce my kids to stats & computational statistics, using R (or maybe Python). But I’d rather be doing it in a very personalized way that they can relate to and at their own pace.

    You seem to be taking a very real-world & integrated approach to teaching statistics, & I wish I had an educator like you 10-15 years ago. I hope my kids will find educators such as yourself (and my wife) sooner rather than later in their educational journeys, who can help them to truly understand & appreciate the principles of what they are learning, and not just cram for the exam.

    Sorry my comment was so long & tangential to the point of your post. This topic stirs up a great amount of emotion in me…

    1. Thanks for the interesting remarks.

      I think that both you and your wife are putting too much blame on the schools themselves (U.S. and China). Ideally the schools would do all kinds of things, but it really goes back to the parents, who (usually unconsciously) set examples for their kids. Hopefully those examples include intellectual curiosity, a love of puzzles, enjoyment of reading, and above all, a healthy skepticism.

      The quote of Prof. Chen is indeed great at encapsulating the problem, but again I think he is not going to the root of the problem, which is far deeper than the actual schools. It is really more a matter of culture. One illustration I like to give is that in Chinese, one way to say “imitate” is the word for “learn” (学), which symbolizes the rote-memory education problem. Education professor Yong Zhao of the University of Oregon has written extensively about this; you and your wife may enjoy reading his books. Also, Nobel laureate CN Yang has remarked about this problem quite a bit.

      I really need to learn more about Common Core. I’ve heard a lot of criticism of it, but it is my understanding that in math it requires the kids to explain WHY their solution works, which certainly would be a step in the right direction. On the other hand, I agree that standardized testing can be harmful, a good example being the subject I mentioned, Geometry. At least in California, most high schools today put rather little emphasis on proofs, which I think is a real shame. Proofs really develop analytical ability! Yet one reason for their de-emphasis today is that proofs are not on (and cannot be on) the SAT and the other standardized tests. (Note: By proofs, I mean assigning proofs in homework, not a short proof here and there in the lecture.)

      We don’t have a STEM-deficient nation, at least in terms of producing enough graduates with STEM degrees. But that is a different subject.

      Students can’t cram for my exams, because I give one every week. 🙂 Seriously. I give a quiz every week, and no midterm and no final. There is a term project. The quiz problems range from the trivial (to make sure everyone gets some points) to those requiring more thought. You may be interested in my open source textbook on probability and statistics.

  7. Imagine if we could have more teachers who would start by illustrating practical aspects of statistics. Think of the paper airplanes (or helicopters?) experimentation in Box, Hunter and Hunter: in a multiweek experiment you can start with DOE, data collection and management, different experimental conditions, then you move to R, do some data cleaning, descriptive stats and plots, perform ANOVA, t-tests, regression, add interactions, interaction surfaces or 3d plots. It’s practical, fun, can be done with pen and paper as well as R, and covers all key aspects of a real analysis project. Not least, it helps students think in terms of variation and probability.

  8. Dear Norm,

    I can tell you how that reporter did the calculation:
    “What’s 86 divided by 14? Beats me, but 86 is close to 90 and 14 is close to 10, so 90 divided by 10 is a 9 to 1 ratio.”

  9. > He wrote in 1999 that the Chinese education system “results in the phenomenon of high scores and low ability.” I must warn that we in the U.S. are moving in that same disastrous direction.

    I have seen the same concern raised in public discourse here in Poland since the education reform in 1999. Funny thing is, the PISA people say that Poland has made huge progress in the last 15 years and set it as an example for other countries.

    I wonder how much of this concern is legitimate, and how much is just adults complaining about kids these days.

    1. There is no real contradiction in what you have observed in Poland.

      First, the PISA test measures fundamental skills, rather than the ability to apply them in real life. The designers do attempt to measure the latter, but it really is not possible in the highly constrained context of a standardized test. Look again at Professor Chen’s wording, “High scores, low ability.”

      Second, since this is a statistics blog here, I should point out the crucial necessity of looking at subgroups. I don’t know about Poland, but in the U.S. we are very worried about the academic performance of the underclass. It would be wonderful if we could bring up even the scores of children in the underclass, even if the higher scores do not necessarily mean full facility in practical usage.

  10. I don’t see what is insulting about Hacker’s claim that “a faculty report” found that 57 percent of low-income students fail a required algebra course. Am I missing something here? Seems to me it’s either true or false and, if it is the latter, perhaps either Hacker or the faculty have some possibly insulting motive which is unstated in your quote.

    As someone who taught “developmental math” (formerly bonehead math) at a university, I can attest that a large percentage of, yes predominantly low-income students, could not pass the course. The only insult I can see there is the contempt the public school system, from which they by-and-large graduated, showed them by not giving them what they were owed and needed.
