Neural Networks Are Essentially Polynomial Regression

You may be interested in my new arXiv paper, joint work with Xi Cheng, an undergraduate at UC Davis (now heading to Cornell for grad school); Bohdan Khomtchouk, a postdoc in biology at Stanford; and Pete Mohanty, a Science, Engineering & Education Fellow in statistics at Stanford. The paper is provocative, and we welcome feedback.

A summary of the paper is:

  • We present a very simple, informal mathematical argument that neural networks (NNs) are in essence polynomial regression (PR). We refer to this as NNAEPR.
  • NNAEPR implies that we can use our knowledge of the “old-fashioned” method of PR to gain insight into how NNs, widely viewed somewhat warily as a “black box,” work inside.
  • One such insight is that the outputs of an NN layer will be prone to multicollinearity, with the problem becoming worse with each successive layer. This in turn may explain why convergence issues often develop in NNs. It also suggests that NN users tend to use overly large networks.
  • NNAEPR suggests that one may abandon using NNs altogether, and simply use PR instead.
  • We investigated this on a wide variety of datasets, and found that in every case PR did as well as, and often better than, NNs.
  • We have developed a feature-rich R package, polyreg, to facilitate using PR in multivariate settings.
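
To give a flavor of the NNAEPR argument in the first bullet, here is a minimal Python sketch. It rests on an assumption made purely for illustration: the activation function is itself a polynomial (here t squared). Real activations such as ReLU or tanh are not polynomials and would only be approximated by one. Under that assumption, each layer maps polynomials to polynomials, so the whole network is a polynomial in the input, with the degree growing layer by layer. The weights, the helper names, and the single-input setup are all hypothetical and are not taken from the paper.

```python
# Polynomials in one variable are stored as coefficient lists [c0, c1, c2, ...].

def poly_add(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0.0) + (q[i] if i < len(q) else 0.0)
            for i in range(n)]

def poly_scale(p, c):
    return [c * a for a in p]

def poly_mul(p, q):
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_eval(p, x):
    return sum(c * x**k for k, c in enumerate(p))

def activation(t):
    # ASSUMPTION: a polynomial activation, chosen only for illustration.
    return t * t

def layer_numeric(inputs, weights, biases):
    # Ordinary forward pass: activation(bias + weighted sum) for each unit.
    return [activation(b + sum(w * v for w, v in zip(ws, inputs)))
            for ws, b in zip(weights, biases)]

def layer_symbolic(inputs, weights, biases):
    # Same layer, but each input is a polynomial in x rather than a number.
    outputs = []
    for ws, b in zip(weights, biases):
        pre = [b]
        for w, p in zip(ws, inputs):
            pre = poly_add(pre, poly_scale(p, w))
        outputs.append(poly_mul(pre, pre))  # activation(t) = t * t
    return outputs

# Hypothetical weights: 1 input -> 2 hidden units -> 1 output.
W1, B1 = [[0.5], [-1.0]], [0.1, 0.3]
W2, B2 = [[1.0, 2.0]], [-0.2]

def network(x):
    hidden = layer_numeric([x], W1, B1)
    return layer_numeric(hidden, W2, B2)[0]

x_poly = [0.0, 1.0]                       # the polynomial "x"
hidden_poly = layer_symbolic([x_poly], W1, B1)
out_poly = layer_symbolic(hidden_poly, W2, B2)[0]
degree = len(out_poly) - 1                # degree doubles at each layer
```

Evaluating `network(x)` and `poly_eval(out_poly, x)` at the same point gives the same value up to floating-point error, and `degree` shows how quickly the polynomial degree grows with depth in this toy setting.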

Much work remains to be done (see paper), but our results so far are very encouraging. By using PR, one can avoid the headaches of NNs, such as selecting good combinations of tuning parameters, dealing with convergence problems, and so on.
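
As a rough illustration of why PR sidesteps iterative training, here is a minimal Python sketch of a one-variable polynomial fit by ordinary least squares. It is not the polyreg interface; the function names `fit_poly` and `solve` are hypothetical, and the data are synthetic and noise-free. The fit comes from solving the normal equations directly, so there is no learning rate or convergence criterion, though the degree still has to be chosen.

```python
def design_row(x, degree):
    # Row of the design matrix: [1, x, x^2, ..., x^degree].
    return [x**k for k in range(degree + 1)]

def solve(A, b):
    # Gaussian elimination with partial pivoting for a small square system.
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def fit_poly(xs, ys, degree):
    # Least squares via the normal equations (X'X) beta = X'y.
    X = [design_row(x, degree) for x in xs]
    p = degree + 1
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)]
           for i in range(p)]
    Xty = [sum(row[i] * y for row, y in zip(X, ys)) for i in range(p)]
    return solve(XtX, Xty)

# Synthetic data generated from a known quadratic (illustration only).
xs = [i / 10 for i in range(-10, 11)]
ys = [1.0 - 2.0 * x + 0.5 * x**2 for x in xs]
coefs = fit_poly(xs, ys, 2)   # coefficients for [1, x, x^2]
```

For higher degrees or many predictors, the normal equations become ill-conditioned, so a real implementation would more likely rely on a QR-based solver than on this direct approach.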

Also available are the slides for our presentation at GRAIL on this project.

Women in R

Last week I gave one of the keynote addresses at R/Finance 2018 in Chicago. I considered it an honor and a pleasure to be there, both because of the stimulating intellectual exchange and the fine level of camaraderie and hospitality that prevailed. I mentioned at the start of my talk that the success of this conference, now in its tenth year, epitomized the wonderful success enjoyed nowadays by the R language.

On the first day of the conference, one of the session chairs announced that a complaint had been made by the group R-Ladies, concerning the fact that all of the talks were given by men. The chair apologized for that, and promised efforts to remedy the situation in the future. Then on the second day, room was made in the schedule for two young women from R-Ladies to make a presentation. There also was a research paper presented by a woman, added at the last minute; she had presented work at the conference in the past.

I have been interested in status-of-women issues for a long time, and I spoke briefly with one of the R-Ladies women after the session. I suggested that she read a blog post I had written that raised some troubling related issues.

But I didn’t give the matter much further thought until Tuesday of this week, when a friend asked me about the “highly eventful” conference. That comment initially baffled me, but it turned out that he was referring to the R-Ladies controversy, which he had been following in the “tweetstorm” on the issue in #rfinance2018. Not being a regular Twitter user, I had been unaware of this.

Again, issues of gender inequity (however defined) have been a serious, proactive concern of mine over the years. I have been quite active in championing the cases of talented female applicants for faculty positions at my university, for instance. Of my five current research students, four are women. In fact, one of them is a coauthor with me of the partools package that played a prominent role in my talk at this conference. I’ve taught my department’s ethics course a couple of times, and gender issues form a major part of the reading list I assign.

That said, I must also say that those tweets criticizing the conference organizers were harsh and unfair. As a member of the program committee pointed out, other than keynote speakers, the program is formed from papers submitted for consideration by potential authors, and it turned out that no papers had been submitted by women. Many readers of those tweets will think that the program committee is prejudiced against women, which I really doubt is the case.

The women who complained also cited lack of a Code of Conduct for the conference. This too turned out to be a misleading claim, as there had been a Code of Conduct posted by the University of Illinois at Chicago, the host of the conference.

So, apparently there was no error of commission here, but some may feel an error of omission did occur. Arguably any conference should make more proactive efforts to encourage female potential authors to submit papers for consideration in the program. Many conferences have invited talks, for instance, and R/Finance may wish to consider this.

However, there is, as is often the case, an issue of breadth of the pool. Granted, things like applicant pools are often used as excuses by, for example, employers for significant gender imbalances in their workforces. But as far as I know, the current state of affairs is:

  • The vast majority of creators (i.e. ‘cre’ status) of R packages in CRAN etc. are men.
  • The authors of the vast majority of books involving R are men.
  • The authors of the vast majority of research papers related to R are men.

It is these activities that lead to giving conference talks, and groups like R-Ladies should promote more female participation in them. We all know some outstanding women in those activities, but to truly solve the problem, many more women need to get involved.

Addendum: Unfortunately, this blog post brought me under criticism in that Twitter discussion. My comments urging women to become more involved in writing packages and papers were interpreted by some as “blaming the victim.” I strongly disagreed, considering my advice to be good mentoring; I still do.

(Some material here was updated on July 21, 2018 and on May 18, 2019. Also, see my post about the 2019 R/F meeting.)