Thursday, March 24, 2016

The Effect-Size Puzzler

Effect sizes are bandied about as useful summaries of the data.  Most people think they are straightforward and obvious.  If you think so too, perhaps you won't mind a bit of a challenge?  Let's call it "The Effect-Size Puzzler," in homage to NPR's Car Talk.  I'll buy the first US winner a nice Mizzou sweatshirt (see here).  Standardized effect size, please.

I have created a data set with 25 people each observing 50 trials in 2 conditions.  It's from a priming experiment.  It looks about like real data.  Here is the download.

The three columns are:

  • id (participant: 1...25)
  • cond (condition: 1,2)
  • rt (response time in seconds).  

There are a total of 2500 rows.

I think it will take you just a few moments to load it and tabulate your effect size for the condition effect.  Have fun.  Write your answer in a comment or write me an email.
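
Loading the file and tabulating the raw condition means might look roughly like the following sketch. The file name is a placeholder (use the actual download), and the data here are simulated purely to match the stated shape: 25 participants, 2 conditions, 50 trials each, 2500 rows.

```r
## Placeholder for the real data; the file name is an assumption:
# rlong <- read.table("effectSizePuzzler.txt", header = TRUE)

## Self-contained stand-in with the stated structure (not the real data):
set.seed(1)
rlong <- data.frame(
  id   = rep(1:25, each = 100),                 # participant 1...25
  cond = rep(rep(1:2, each = 50), times = 25),  # condition 1, 2
  rt   = .3 + rlnorm(2500, meanlog = -1, sdlog = .5)  # skewed RTs in seconds
)

nrow(rlong)                        # total rows
tapply(rlong$rt, rlong$cond, mean) # raw condition means
```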

I'll provide the correct answer in a blog next week.

HINT: If you wish to get rid of the skew and stabilize the variances, try the transform y=log(rt-.3)
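
The hinted transform can be sketched as follows; the RTs here are simulated (shifted lognormal, an assumption about the data's rough shape), and the transform requires rt > .3.

```r
## Sketch of the hinted transform y = log(rt - .3): .3 s is treated as an
## irreducible shift, and the log removes the right skew.
set.seed(1)
rt <- .3 + rlnorm(1000, meanlog = -1, sdlog = .5)  # simulated skewed RTs
y  <- log(rt - .3)                                 # shifted log transform

## rough moment-based skewness, to see the effect of the transform
skew <- function(x) mean((x - mean(x))^3) / sd(x)^3
skew(rt)  # clearly positive (right-skewed)
skew(y)   # near zero after the transform
```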


jeromyanglim said...

# import data
rlong <- effectSizePuzzler

# aggregate to person by condition stats
r2long <- aggregate(rt ~ cond + id, rlong, mean)

means <- sapply(split(r2long$rt, r2long$cond), mean)
sds <- sapply(split(r2long$rt, r2long$cond), sd)

# difference in means using sd based on pooled variance
es <- diff(means) / sqrt(mean(sds^2))

round(es, 2)

# Answer
# d = .84

Jake Westfall said...

I am aware of 4 different and generally non-equivalent ways that people might commonly compute even just a d-like effect size for this dataset. (Let alone all the possibilities for variance-explained-type measures!) I've actually been meaning to blog about this, so I guess it's time I finally do so.

I think standardized effect sizes are generally a bad idea for data summary and meta-analytic purposes. They can be useful, though, if you want to do a power analysis or define reasonably informative priors but don't have previous experimental data.

Anyway, of the possible ways to compute a d-like statistic here, I think the least crazy way is to use...wait for it...the classical definition of Cohen's d. Crucially, this ignores information about the experimental design at hand: it is always computed simply as the mean difference over the standard deviation of a single observation (pooled across conditions). In R that would look like:

with(df, diff(tapply(rt, cond, mean)) / sqrt(mean(tapply(rt, cond, var)))) # about .25

where df is the effectSizePuzzler data.frame. This differs from Jeromy Anglim's method, which first aggregates the responses within subject-by-condition, as well as from other possible approaches that I'll hopefully discuss in my blog post.
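
The gap between the two approaches can be illustrated on simulated data (not the puzzler file; the variance components below are made-up numbers): aggregating 50 trials per person-by-condition cell averages out trial noise, which shrinks the SD in the denominator and inflates d relative to the observation-level version.

```r
## Simulated illustration of observation-level d vs. aggregate-first d.
set.seed(1)
n_id <- 25; n_trial <- 50
dat <- data.frame(
  id   = rep(1:n_id, each = 2 * n_trial),
  cond = rep(rep(1:2, each = n_trial), times = n_id)
)
dat$rt <- .5 + .05 * (dat$cond == 2) +   # small condition effect
  rnorm(n_id, 0, .1)[dat$id] +           # person-level variation
  rnorm(nrow(dat), 0, .2)                # trial-level noise

## observation-level d (classical Cohen's d, ignoring the design)
d_obs <- as.vector(with(dat,
  diff(tapply(rt, cond, mean)) / sqrt(mean(tapply(rt, cond, var)))))

## aggregate-first d (person-by-condition means, as in Anglim's answer)
agg   <- aggregate(rt ~ cond + id, dat, mean)
d_agg <- as.vector(with(agg,
  diff(tapply(rt, cond, mean)) / sqrt(mean(tapply(rt, cond, var)))))

round(c(obs = d_obs, agg = d_agg), 2)  # the aggregate-first d is larger
```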

Simon Columbus said...

This looks like a case where a mixed-effects model would be appropriate; but I don't know of any methods that provide effect sizes for individual fixed effects. So my answer would be: NA.

Jeff Rouder said...

Hi All, The answer is up!