I’m deeply grateful to Hui Lin and the inimitable Yihui Xie for arranging for me to give a “virtual seminar talk” to the Central Iowa R Users Group. You can view my talk, including an interesting Q&A session, online. (The actual start is at 0:34.) There are two separate topics: my regtools package (related to my forthcoming book, From Linear Algebra to Machine Learning: Regression and Classification, with Examples in R), and the recent ASA report on p-values.

That’s a very interesting talk. I especially agree with the point you make at 51:30 that one problem with changing the use of p-values is that “people like simple, crisp answers”. I think the three asterisks denoting a “very significant” p-value are often viewed much like three cherries on a slot machine: they’re taken to mean you have a winner, at least if your goal is to be published. But as suggested by the video at http://fivethirtyeight.com/features/not-even-scientists-can-easily-explain-p-values/ , even scientists who work with p-values are often at a loss to explain what they are. As best as I can understand, a p-value is roughly the probability of seeing an association this strong in the data if there were in fact no relationship between the variables being studied. But this tells you nothing about the size of any relationship, or whether the relationship is actually due to a common dependence on some additional variable(s).
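The two limitations mentioned above can be made concrete with a short R simulation (a sketch by way of illustration, not material from the talk):

```r
# Under the null of no relationship, "significant" p-values still occur
# about 5% of the time by chance alone:
set.seed(1)
x <- rnorm(100)
y <- rnorm(100)  # generated independently of x, so no true relationship
summary(lm(y ~ x))$coefficients["x", "Pr(>|t|)"]

# Conversely, a tiny p-value need not mean a large effect; with enough
# data, even a minuscule true slope comes out "very significant":
n  <- 1e5
x2 <- rnorm(n)
y2 <- 0.02 * x2 + rnorm(n)  # true slope of only 0.02
summary(lm(y2 ~ x2))$coefficients["x2", c("Estimate", "Pr(>|t|)")]
```

The second fit illustrates the point in the comment: the asterisks measure evidence against “no relationship at all”, not the size of the relationship.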

Because p-values represent such “simple, crisp answers”, I think the six ASA guidelines at https://www.amstat.org/newsroom/pressreleases/P-ValueStatement.pdf , while they may provoke some discussion, will do little to reform the use of p-values. It might be more helpful if the ASA could suggest some relatively simple, concrete tests that should accompany any reported p-values. For example, I looked at one study that used regression to quantify the relationship between two variables over a certain time span, and it came up with a significant p-value for that span. However, when I expanded the period to include all of the available data and looked at every possible time span longer than some minimal length, the same model produced widely conflicting relationships, all with significant p-values. Hence, it would seem that some type of cross-validation of p-values would be helpful.
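The “all possible time spans” check described in the comment might be sketched in R as follows (hypothetical data and window sizes, not the actual study):

```r
# Refit the same simple regression on every sufficiently long window and
# collect both the estimated slope and its p-value. Widely varying
# slopes, all "significant", would be the red flag described above.
set.seed(2)
n <- 200
x <- 1:n
y <- cumsum(rnorm(n))  # a random walk: no stable linear trend exists
minlen <- 60           # minimal span length (arbitrary choice)

# For tractability, step through candidate start/end points in tens:
spans <- expand.grid(start = seq(1, n - minlen + 1, by = 10),
                     end   = seq(minlen, n, by = 10))
spans <- subset(spans, end - start + 1 >= minlen)

res <- t(apply(spans, 1, function(s) {
  idx <- s["start"]:s["end"]
  co  <- summary(lm(y[idx] ~ x[idx]))$coefficients
  c(slope = co[2, "Estimate"], p = co[2, "Pr(>|t|)"])
}))

# Slopes of the spans that came out "significant" -- if these disagree
# wildly in sign or magnitude, the per-span p-values are misleading:
res[res[, "p"] < 0.05, "slope"]
```

This is only a rough form of the cross-validation idea: rather than trusting one window’s p-value, it checks whether the fitted relationship is stable across windows.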

Really nice talk. You mentioned a good comparison study on classification that used cross validation. The winners were SVM and logistic regression. Could you provide a reference?

Unfortunately, I can’t find it now. It was the results of an investigation by a consultant on a large number of diverse data sets.
