Saturday, March 23, 2019

Teaching Undergrad Stats without p, F, or t.

I taught a 10-week intro-level stats course without p, F, or t.

The course proceeded in the usual way:
• one-group designs
• two-group designs
• many-group designs
• factorial designs
• OLS regression
• multiple regression
• mixed factors and continuous covariates (ANCOVA)
The key to the course was the concept of a model. I spent quite a bit of time on what a distribution is and what it means for an observation to come from it. I also introduced random-variable notation, such as $$Y_i$$ for the $$i$$th observation in a data set. Then, all inference in the class was model comparison. No hypotheses, only models that instantiate theoretically interesting positions. The models always have to be written down before analysis.

So, for the one-group design, we carried two models.  Students had to master the following notation:

$$\mbox{Null Model}: \quad Y_i \sim \mbox{Normal}(0,\sigma^2)$$
$$\mbox{Effect Model}: \quad Y_i \sim \mbox{Normal}(\mu,\sigma^2)$$

I then asked them how each model accounted for each observation. For example, if the data were the observations (-2, -1, 0, 1), we would make the following tables, with the effect model's account set to the sample mean, -.5:

Null-Model Table

Data Account Error Squared Error
-2 0 -2 4
-1 0 -1 1
0 0 0 0
1 0 1 1

Effects-Model Table

Data Account Error Squared Error
-2 -.5 -1.5 2.25
-1 -.5 -.5 .25
0 -.5 .5 .25
1 -.5 1.5 2.25
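
The two tables above can be reproduced with a short Python sketch. This is only an illustration of the bookkeeping, not the course's own materials (students used calculators and spreadsheets); the helper name `account_table` is hypothetical, and the effect model's account is assumed to be the sample mean.

```python
def account_table(data, account):
    """Return rows of (datum, account, error, squared error) for a constant account."""
    rows = []
    for y in data:
        error = y - account  # what the model fails to account for
        rows.append((y, account, error, error ** 2))
    return rows


data = [-2, -1, 0, 1]

# Null model: every observation is accounted for by 0.
null_rows = account_table(data, 0.0)

# Effect model: the account is the estimate of mu, assumed here to be the sample mean.
mu_hat = sum(data) / len(data)
effect_rows = account_table(data, mu_hat)

for name, rows in [("Null-Model Table", null_rows), ("Effects-Model Table", effect_rows)]:
    print(name)
    print("Data Account Error Squared Error")
    for y, acc, err, sq in rows:
        print(y, acc, err, sq)
```

Summing the last column of each table gives the error total that the next step builds on.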

The next step is to calculate $$SS_E$$, $$R^2$$, and BIC, with $$R^2$$ serving as a measure of effect size and BIC as a relative model comparison statistic.  Of course, I taught the formulas and meanings of these three quantities.  BIC, for example, was taught in terms of error and penalty, e.g., $$\mbox{BIC}=n\log(SS_E/n)+k\log(n)$$.   Continuing, the following model-comparison table was produced:
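
As a worked instance of the formula, take the squared errors from the two tables above, which sum to 6 for the null model and 5 for the effect model, with $$n=4$$. The natural logarithm is assumed here; the base is not stated, but it is the one that reproduces the reported values.

$$\mbox{Null Model}: \quad \mbox{BIC} = 4\log(6/4) + 0\cdot\log(4) \approx 1.622$$
$$\mbox{Effect Model}: \quad \mbox{BIC} = 4\log(5/4) + 1\cdot\log(4) \approx 0.893 + 1.386 = 2.279$$

The first term shrinks as the error shrinks, and the second term grows with each added parameter. The effect-model value appears as 2.27 in the table that follows, which is this number truncated to two decimals.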

Model Parameters SSE R^2 BIC
Null 0 6 0 1.62
Effect 1 5 .167 2.27
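
The model-comparison table can be computed along the following lines. This is a minimal sketch under two assumptions that are not spelled out above: the logarithm in BIC is the natural log, and $$R^2$$ is the proportional reduction in squared error relative to the null model. The function names are hypothetical.

```python
import math


def sum_squared_error(data, account):
    """Sum of squared errors when every observation gets the same account."""
    return sum((y - account) ** 2 for y in data)


def bic(sse, n, k):
    """BIC as error plus penalty: n*log(SSE/n) + k*log(n), natural log assumed."""
    return n * math.log(sse / n) + k * math.log(n)


data = [-2, -1, 0, 1]
n = len(data)

sse_null = sum_squared_error(data, 0.0)              # account fixed at 0, no free parameters
sse_effect = sum_squared_error(data, sum(data) / n)  # account is the sample mean, one parameter

# Assumption: R^2 is the share of the null model's error removed by the effect model.
r2_effect = 1 - sse_effect / sse_null

bic_null = bic(sse_null, n, k=0)
bic_effect = bic(sse_effect, n, k=1)

print("Model Parameters SSE R^2 BIC")
print("Null", 0, sse_null, 0.0, round(bic_null, 3))
print("Effect", 1, sse_effect, round(r2_effect, 3), round(bic_effect, 3))
```

Under this formula a lower BIC indicates the preferred model, so the null model comes out ahead for these four observations.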

Here the effect was actually pretty big in $$R^2$$, but we do not have the resolution to prefer the effects model over the null model given the small sample size. To help students understand what $$R^2$$ means, I provide a list of accounted-for variances for various phenomena, for example, how much variance smoking accounts for in lung-cancer rates.

Students were taught how to calculate all the values by calculator in small data sets and by spreadsheet in large data sets.

Interpretation

For interpretation, students were taught that:
a. their inference was only as good as their models
b. no model was true or false, all were just helpful abstractions
c. their primary goal was model comparison rather than testing
d. they need not make decisions, just assess evidence judiciously
e. they should consider both model comparison (BIC) and effect size ($$R^2$$) in assessment

Extension

The above model-comparison-through-error approach extends gracefully to all linear model applications. In this sense, once the mechanics and interpretations are mastered in the one-sample case, the extensions to more complex models, including multiple regression and multi-factor ANOVA, are straightforward. By getting the mechanics out early, we can focus on the models and how they account for phenomena. Contrast this to the usual case where you may teach one set of mechanics for the t-test, another for F-tests, and a third for regression.
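
To illustrate how the same mechanics might carry over to a two-group design, here is a hedged sketch. Everything specific in it is an assumption made for illustration: the scores are invented, the null model is taken to be a single common mean (one parameter), and the effect model gives each group its own mean (two parameters). The course's actual two-group models may have been written differently.

```python
import math


def mean(values):
    return sum(values) / len(values)


def sse_about(values, account):
    """Sum of squared errors of the values around a single account."""
    return sum((y - account) ** 2 for y in values)


def bic(sse, n, k):
    """Same error-plus-penalty form as in the one-group case (natural log assumed)."""
    return n * math.log(sse / n) + k * math.log(n)


# Hypothetical scores, invented purely for illustration.
group_a = [1.0, 2.0, 3.0]
group_b = [4.0, 5.0, 6.0]
everyone = group_a + group_b
n = len(everyone)

# Null model (assumed): one common mean accounts for every observation, k = 1.
sse_null = sse_about(everyone, mean(everyone))

# Effect model (assumed): each group is accounted for by its own mean, k = 2.
sse_effect = sse_about(group_a, mean(group_a)) + sse_about(group_b, mean(group_b))

r2 = 1 - sse_effect / sse_null
bic_null = bic(sse_null, n, k=1)
bic_effect = bic(sse_effect, n, k=2)

print("Model Parameters SSE R^2 BIC")
print("Null", 1, sse_null, 0.0, round(bic_null, 2))
print("Effect", 2, sse_effect, round(r2, 3), round(bic_effect, 2))
```

Only the accounts and the parameter counts change; the error, $$R^2$$, and BIC calculations are the ones already used for the one-group design.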

Results

The quarter went well.  The students mastered the material effectively.  I had fun.  They had fun.  And I never lost any sleep at night wondering why I was teaching what I was teaching.

JR said...

This is really interesting! Is there a way to get the lecture notes of your class?

MichiganWater said...

JR, I was going to ask the same, but then saw that Professor Rouder posted a link to the files on his Twitter account.