Sunday, May 31, 2015

Simulating Bayes Factors and p-Values

I see people critiquing Bayes factors based on simulations these days; examples include recent blog posts by Uri Simonsohn and Dr.-R. These authors assume some truth, say that the true effect size is .4, and then simulate the distribution of Bayes factors across many replicate samples.  The resulting claim is that Bayes factors are biased and don't control long-run error rates.   I think the use of such simulations is not helpful.  With tongue in cheek, I consider them frequocentrist.  Yeah, I just made up that word.  Let's pronounce it "freak-quo-centrist."  It refers to using frequentist criteria and standards to evaluate Bayesian arguments.

To show that frequocentric arguments are lacking, I am going to do the reverse here.  I am going to evaluate p-values with a Bayescentric simulation.

I created a set of 40,000 replicate experiments of 10 observations each.  Half of these sets were from the null model; half were from an alternative model with a true effect size of .4.   Let's suppose you picked one of these 40,000 and asked if it were from the null model or from the effect model.  If you ignore the observations entirely, then you would rightly think it is a 50-50 proposition.  The question is how much do you gain from looking at the data.

Figure 1A shows the histograms of observed effect sizes for each model.  The top histogram (salmon) is for the effect model; the bottom, downward going histogram (blue) is for the null model.  I drew it downward to reduce clutter.

The arrows highlight the bin between .5 and .6.  Suppose we had observed an effect size there.  According to the simulation, 2,221 of the 20,000 replicates under the alternative model are in this bin.  And 599 of the 20,000 replicates under the null model are in this bin.   If we had observed an effect size in this bin, then the proportion of times it comes from the null model is 599/(2,221+599) = .21.  So, with this observed effect size, the probability goes from 50-50 to 20-80.  Figure 1B shows the proportion of replicates from the null model, and the dark point is for the highlighted bin.  As a rule, the proportion of replicates from the null decreases with effect size.
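The bin calculation can be reproduced with a short simulation.  What follows is a minimal sketch, not the code behind the figures.  It assumes normally distributed observations with unit variance, takes the observed effect size to be the sample mean divided by the sample standard deviation, and uses an arbitrary seed, so the counts will differ somewhat from 2,221 and 599.

```python
import math
import random

N_OBS = 10           # observations per replicate experiment
N_REPS = 20_000      # replicates per model (40,000 in total)
TRUE_EFFECT = 0.4    # true effect size under the effect model
BIN = (0.5, 0.6)     # the highlighted bin of observed effect sizes


def observed_effect(true_effect, rng):
    """Simulate one experiment and return its observed effect size."""
    sample = [rng.gauss(true_effect, 1.0) for _ in range(N_OBS)]
    mean = sum(sample) / N_OBS
    var = sum((x - mean) ** 2 for x in sample) / (N_OBS - 1)
    return mean / math.sqrt(var)


def count_in_bin(true_effect, rng):
    """Count the replicates whose observed effect size lands in BIN."""
    return sum(
        BIN[0] <= observed_effect(true_effect, rng) < BIN[1]
        for _ in range(N_REPS)
    )


rng = random.Random(2015)  # arbitrary seed, chosen only for repeatability
null_count = count_in_bin(0.0, rng)
effect_count = count_in_bin(TRUE_EFFECT, rng)
prop_null = null_count / (null_count + effect_count)

print(f"null: {null_count}, effect: {effect_count}, "
      f"proportion from null: {prop_null:.2f}")
```

Under these assumptions the proportion from the null should land near the .21 reported above, give or take simulation noise.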

We can see how well p-values match these probabilities.  The dark red solid line is the one-tailed p-values, and these are miscalibrated.  They clearly overstate the evidence against the null and for an effect.  Bayes factors, in contrast, get this problem exactly right---it is the problem they are designed to solve.  The dashed lines show the probabilities derived from the Bayes factors, and they are spot on.  Of course, we didn't need simulations to show this concordance.  It follows directly from the law of conditional probability.
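To make the link between a Bayes factor and these proportions concrete, here is a small sketch of the conversion.  It is an illustration under stated assumptions rather than the calculation behind the dashed lines: the Bayes factor is estimated coarsely from the counts in the highlighted bin instead of from the exact observed effect size, and the function name prob_null is made up for this example.

```python
def prob_null(bf10, prior_null=0.5):
    """Posterior probability of the null given BF10 (effect over null)."""
    prior_odds_null = prior_null / (1.0 - prior_null)
    posterior_odds_null = prior_odds_null / bf10
    return posterior_odds_null / (1.0 + posterior_odds_null)


# Counts from the highlighted bin, each out of 20,000 replicates.
effect_count, null_count, reps = 2221, 599, 20_000

# Ratio of the bin's probability under each model: a coarse estimate of the
# Bayes factor for "observed effect size between .5 and .6".
bf10 = (effect_count / reps) / (null_count / reps)

print(f"BF10 = {bf10:.2f}, P(null | bin) = {prob_null(bf10):.2f}")
```

With 50-50 prior odds, the posterior probability of the null is 1/(1 + BF10), which for these counts is the same 599/(2,221+599) computed above.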

Some of you might find this demonstration unhelpful because it misses the point of what a p-value is and what it does.  I get it.  It's exactly how I feel about others' simulations of Bayes factors.

This blog post is based on my recent PBR paper: Optional Stopping: No Problem for Bayesians.  It shows that Bayes factors solve the problem they are designed to solve even in the presence of optional stopping.

Monday, May 18, 2015

Ben's Letter: Who Raised This Kid?

UPDATE (5/21): With hurt feelings all around, Ben has been excused for the two days.  We are grateful.  It is a tough situation because their primary concern is safety, and I get that.  Hopefully, the hurt feelings will slowly melt, because the camp folks have done right by us over the years.

-------------

ORIGINAL POST:

"I just wrote camp a letter," said Ben.  "Oh God," I thought.  "Please, this is a delicate situation," I said to myself.  "He is just going to make it worse...."

My 16-year-old son has been going to camp for some six or seven years, and he loves it there.  We trusted this camp with our kids and they have delivered year after year.  We have a relationship of sorts.  He is scheduled to be a first-year counselor, and is excited.

My mother-in-law  is 90 and is in failing health.  She won't be able to travel to my nephew's wedding or my daughter's Bar Mitzvah in the coming year, assuming she is still alive then.

The conflict is that after much wrangling over dates, my wife's family is honoring my mother-in-law the first weekend of training for Ben's counselor gig.  Everyone will be there, and we thought it would be a no-brainer for camp to excuse him for the weekend.  He knew everyone; he knew camp; he knew the routines.  But they did not.  And we are very hurt.

We have been going back and forth with them, expressing our hurt and listening to their reasons, and Ben has been cc'd on the emails.

Dear (I redacted the name, it is not important),

I'm writing this E-mail because I don't think my Mom is going to say this - you are completely missing the point.  In your emails you used words such as "Birthday Party" to describe the event that we wish to attend.  Not only is this inaccurate, it completely undertones the value of this event.  This isn't just some "Birthday Party." It’s meant to be the last time my Grandma will ever be able to see her ENTIRE family alive.  It’s about being able to celebrate my Grandma's life while she is still alive, because the scary truth is, if I don't go, the next time I will be in California will probably be her funeral.  To our family, this isn't a "Birthday Party," it is like a Bar/Bat Mitzvah, and to others in our family it is even more important and valuable than one. To define this event as a "Birthday Party" is not only severely incorrect, but just reflects a lack of understanding of what this means to my Mom and my Family.

I understand that missing two days is very inconvenient for the camp, but if I do go the plan is to come back on the 14th. If camp starts on the 21st that gives me 7 days to bond with the other counselor's (whom I already know) and to learn the camp rules.  I also understand that you have to think about the safety of the campers and I truly do respect that, but to tell me that missing two days of content is going to endanger my campers, and that I won't be able to make up those two days of content, is absurd.  If missing two days is truly going to put my campers at risk, then tell us what we’re missing, prove us wrong, because to us it sounds like you're putting bonding time (with people I already know) over seeing my grandma for a possible last time, especially since you said that you would be able to handle it if I missed a few days for a family emergency.  If anything I said at all reflects that I don't understand the gravity of missing camp, then tell me exactly how missing camp will put my campers in danger, because I'm trying to understand your situation but I really don't. To us, your situation sounds like an excuse compared to what could be the last time our entire family is united before my grandma dies.

The only point you've made in this argument that I've seen as valid is you mentioning the contract.  Yes, I signed the contract saying that I would be there, and yes, technically by not going I would NOT be honoring my contract.  But what is more important in life, and dare I say it, in Judaism - honoring a contract or honoring a family?  I personally feel like family is a much more important concept in Judaism than honoring a contract, and I feel like that should also be a value a Jewish Camp respects. It shouldn't be camp policy to turn someone down because they want to try to see their grandma one last time with their entire family.  My family has already tried to change the date. They tried everything before making me aware of this date, and it just won't work any other day.  You mentioned that there would be an exception if there was a family crisis (funeral), but that only further reflected your lack of understanding towards our scenario. In saying this, you send the message that it’s more important to celebrate the life of someone when they are dead, rather than celebrating their life when they are alive.  I feel like this contradicts many important values in Judaism.  Rules and policies shouldn't restrict the celebration of family, or the values of Judaism.

You could tell me that the fact that I'm putting family over camp is my decision and not the camp's decision, but the point that we are trying to make is that it is camp's policy that is forcing me to have to make a decision, and that is disgraceful.  I shouldn't have to choose between my two families, especially if the one forcing the decision is Jewish, but right now I'm being forced to all because in your eyes, two days of bonding is more important than seeing my entire family together one last time.  Family is supposed to be an important value in Judaism and should not be a topic you can deescalate by calling our important gathering a mere "Birthday Party".  I hope this Email both makes our anger, and disappointment towards your decision clear, but also shows how we view your perspective.  If you could help us better understand how your two days of training is more important than seeing a scattered family united one last time before my Grandma dies, then maybe this decision will be easier to make.  If the only way you will excuse us is if we have a family emergency, then consider this a family emergency. That is how important this is.

Sincerely,
Ben Rouder

It is a wondrous feeling when your child is more elegant, logical, articulate, and authentic than you could have imagined.

Sunday, May 17, 2015

The Self-Propagated Myth of Bayesian Unity

Substantive psychologists are really uncomfortable with disagreements in the methodological and statistical communities.  The reason is clear enough---substantive psychologists by and large just want to follow the rules and get on with it.

Although our substantive colleagues would prefer if we had a unified and uniform set of rules, we methodologists don't abide.  Statistics and methodology are varied fields with important, different points of view that need to be read, understood, and discussed.

Bayesian thought itself is not uniform.  There are critical, deep, and important differences among us, so much so that behind closed doors we have sharp and negative opinions about what others advocate.  Yet, at least in the psychological press, we have been fairly tame and reticent to critique each other.   We fear our rule-seeking substantive colleagues may use these differences as an excuse to ignore Bayesian methods altogether.  That would be a shame.

In what follows, I give the briefest and coarsest description of the types of Bayesians out there.  In the interest of being brief and coarse, I am going to do some points of view an injustice.  Write me a nice comment if you want to point out a particular injustice.  My hope is simply to do more good than harm.

Also, I am not naming names.  You all know who you are:

Strategic vs. Complete Bayesians:

The first and most important dimension of difference is whether one uses Bayes Rule completely or strategically.

Complete Bayesians are those that use Bayes rule always, usually in the form of Bayes factors.  They are willing to place probabilities on models themselves and use Bayes rule to update these probabilities in light of data.  The outline of the endeavor is that theories naturally predict constraint in data, and this constraint is captured by models.  Model comparison provides a means of assessing competing theoretical statements of constraint, and the appropriate model comparison is by Bayes factors or posterior odds.  In this view, models predict relations among observables, and parameters are convenient devices to make conditional statements about these relations.  Statements about theories are made based on predictions about data rather than about parameter values.  This usage follows immediately and naturally from Bayes rule.

Strategic Bayesians are those that use Bayes rule for updating parameters and related quantities, but not for updating beliefs about models themselves.  In this view, parameters and their estimates become the quantities of interest, and the resultants are naturally interpretable in theoretical contexts. These Bayesians stress highest density regions, posterior predictive p-values, and estimation precision.  Strategic Bayesians may argue that the level of specification needed for Bayes factors is difficult to justify in practice especially given the attractiveness of estimation.

The Difference: The difference between Complete and Strategic Bayesians may sound small, but it is quite large.  At stake are the very premise of why we model, what a model is, how it relates to data, what counts as evidence, and what are the roles of parameters and predictions.  Some statisticians, philosophers, and psychologists take these elements very seriously.  I am not sure anyone is willing to die on a hill in battle for these positions, but maybe.

I would argue that the difference between Complete and Strategic Bayesians is the most important one in understanding the diversity of Bayesian thought in the social sciences.   It is also the most difficult and the most papered over.

Subjectivity vs. Objectivity in Analysis

The nature of subjectivity is debated in the Bayesian community.  I have broken out here a few positions that might be helpful.

Subjective Bayesians ask analysts to query their beliefs and represent them as probability statements on parameters and models as part of the process of model specification.  For example, if a researcher believes that an effect should be small in size and positive, they may place a normal on effect size centered at .3 with a standard deviation of .2.  This prior would then provide constraint for posterior beliefs.
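It can help to check what such a prior actually commits the analyst to.  The sketch below, which uses only the mean of .3 and standard deviation of .2 from the example above, computes the prior probability that the effect is positive and a central 95% prior interval.

```python
from statistics import NormalDist

# The prior from the example: normal on effect size, mean .3, sd .2.
prior = NormalDist(mu=0.3, sigma=0.2)

p_positive = 1.0 - prior.cdf(0.0)          # prior mass on positive effects
lower = prior.inv_cdf(0.025)               # lower end of central 95% interval
upper = prior.inv_cdf(0.975)               # upper end of central 95% interval

print(f"P(effect > 0) = {p_positive:.3f}")
print(f"central 95% prior interval: ({lower:.2f}, {upper:.2f})")
```

About 93% of the prior mass is on positive effects, so this prior encodes "small and probably positive" rather than strictly positive.  An analyst who wanted to rule out negative effects altogether would need a prior with support only on positive values.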

A variant of the subjective approach is to consider the beliefs of a generic, reasonable analyst rather than personal beliefs. For example, I might personally have no faith in a finding (or, in my case, most findings), yet I still may assign probabilities to parameter values and hypotheses that I think capture what a reasonable colleague might feel.  This process is familiar and natural---we routinely take the position of others in professional communication.

Objective Bayesians stipulate desirable properties of posteriors and updating factors and choose priors that ensure these desired properties hold.  A simple example might be that in the large-sample limit, the Bayesian posterior of a parameter should converge to the true value.  Such a desideratum would necessitate priors that have certain support, say all positive reals for a variance parameter or all values between 0 and 1 for a probability parameter.

There are more subtle examples.  Consider a comparison of a null model vs. an alternative model.  It may be desirable to place the following constraint on the Bayes factor.  As the t-value increases without bound, the Bayes factor should favor without bound the alternative.  This constraint is met if a Cauchy prior is placed on effect size, but it is not met if a normal prior is placed on effect size.
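The contrast can be checked numerically.  The sketch below is one illustration under several assumptions of mine: a one-sample t-test with 10 observations, a zero-centered normal prior with variance g = 1 on effect size, and a standard Cauchy prior written as a mixture of such normals with an inverse-chi-square(1) distribution on g.  The function names are made up for this example, and the integral is done with a plain trapezoid rule over log(g).

```python
import math


def log_bf10_fixed_g(t, n, g):
    """Log Bayes factor (effect over null) for a N(0, g) prior on effect size."""
    nu = n - 1                      # degrees of freedom
    shrink = 1.0 + n * g
    return (-0.5 * math.log(shrink)
            - 0.5 * (nu + 1) * (math.log1p(t * t / (shrink * nu))
                                - math.log1p(t * t / nu)))


def bf10_normal(t, n, g=1.0):
    """Bayes factor under the normal prior with fixed variance g."""
    return math.exp(log_bf10_fixed_g(t, n, g))


def bf10_cauchy(t, n, lo=-10.0, hi=40.0, steps=5000):
    """Bayes factor under a standard Cauchy prior on effect size.

    Averages the fixed-g Bayes factor over g ~ inverse-chi-square(1),
    integrating over x = log(g) with the trapezoid rule.
    """
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        g = math.exp(x)
        log_prior = -0.5 * math.log(2.0 * math.pi) - 1.5 * x - 0.5 / g
        # The extra "+ x" is the Jacobian of the change of variable g = exp(x).
        term = math.exp(log_bf10_fixed_g(t, n, g) + log_prior + x)
        total += term * (0.5 if i in (0, steps) else 1.0)
    return total * h


n = 10
bound = (1.0 + n) ** ((n - 1) / 2)   # limit of the normal-prior Bayes factor
for t in (2, 5, 20, 100, 1000):
    print(f"t = {t:>4}: normal {bf10_normal(t, n):.4g}, "
          f"Cauchy {bf10_cauchy(t, n):.4g}")
print(f"normal-prior Bayes factor never exceeds {bound:.4g}")
```

In this setup the normal-prior Bayes factor levels off at (1 + ng)^((n-1)/2) no matter how large t becomes, while the Cauchy-prior Bayes factor keeps growing.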

There are many other desiderata that have been proposed to place constraints on priors in a variety of situations, and understanding these desiderata and their consequences remains the topic of objective Bayesian development.

The Difference:

My own view is that there is not as much difference between the objective and subjective points of view as there might seem.

1.  Almost all objective criteria yield flexibility that still needs to be subjectively nailed down.  For example, if one uses a Cauchy prior on effect size, one still needs to specify a scale setting.  This specification is subjective.

2.  Objective Bayesian statisticians often value substantive information and are eager to incorporate it when available.  The call to use desiderata is usually made in the absence of such substantive information.

3.  Most subjective Bayesians understand that the desiderata are useful as constraints and most subjective priors adopt some of these properties.

4.  My colleagues and I try to merge and balance subjective and objective considerations in our default priors.  We think these are broadly though not universally useful.  We always recommend they be tuned to reflect reasoned beliefs about phenomena under consideration.  People who accuse us of being too objective may be surprised by the degree of subjectivity we recommend; those who accuse us of being too subjective may be surprised by the desiderata we follow.

Take Home

Bayesians do disagree over when and how to apply Bayes rule, and these disagreements are critical.  They also disagree about the role of belief and more objectively-defined desiderata, but these disagreements seem more overstated, especially in light of the disagreements over how and when Bayes rule should be used.

Sunday, May 10, 2015

Using Git and GitHub to Archive Data

This blog post is for those of you who have never used Git or GitHub.   I use Git and GitHub to archive my behavioral data.    These data are uploaded to GitHub, an open web repository where they may be viewed by anyone at any time without any restrictions.  This upload occurs nightly, that is, the data are available within 24 hours of their creation.  The upload is automatic---no lab personnel are needed to start it or approve it.  The upload is comprehensive in that all data files from all experiments are uploaded, even those that correspond to aborted experimental runs or pilot experiments.  The data are uploaded with time stamps and with an automatically generated log.  The system is versioned so that any changes to data files are logged, and the new and old versions are saved.  In summary, if we collect it, it is there, and it is transparent.   I call data generated this way Born Open Data.

Since setting up the born-open-data system, I have gotten a few queries about Git and GitHub, the heart of the system.  Git is the versioning software; GitHub is a place on the web (github.com) where the data are stored.  They work hand in hand.

In this post, I walk through a few steps of setting up GitHub for archiving.  I take the perspective of Kirby, my dog, who wishes to archive the following four photos of himself:

Here are Kirby's steps:

1. The first step is to create a repository on the GitHub server.

1a.  Kirby goes to GitHub (github.com)  and signs up for a free account (last option).  Once the account is set up (with user name KirbyHerby) he is given a screen with a lot of options for exploring GitHub.  He ignores these as they are not relevant for his task.

1b.  To create his first repository on the server, Kirby presses the green button that says "+ New repository" on the bottom left.

1c. Kirby now has to make some choices about the repository.  He names it "data," enters a description of the repository, makes it public, initializes it with a README, and does not specify which files to ignore or a license.  He then presses the green "Create repository" button on the bottom, and is given his first view of the repository.

Kirby's repository is now at github.com/KirbyHerby/data, and he will bark out this URL to anyone interested.  The repository contains only the README.md file at this point.

2.  The next step is getting a linked copy of this repository on Kirby's local computer.

2a. Kirby downloads the GitHub application for his operating system (mac.github.com or windows.github.com), and on installation, chooses to install the command-line tools (trust me, you will use these some day).

2b.  Kirby enters his GitHub username ("KirbyHerby") and password.

2c. He next has to create a local repository and link it to the one on the server.   To do so, he chooses "Add repository" and is given a choice to "Add," "Create," or "Clone."  Since the repository already exists at GitHub, he presses "Clone."  A list of his repositories shows up, and in this case, it is a short list of one repository, "data."   Kirby then selects "data" and presses the bottom button "Clone repository."  The repository now exists on the local computer under the folder "data."   There are two separate copies of the same repository: one on the GitHub server and one on Kirby's local machine.

3. Kirby wishes to add files to the server repository so others may see them.

3a. Kirby first adds the photo files to the local repository as follows: Kirby copies the photos into the repository folder in the usual way, which for Mac-OSX is by using the Finder.  The following screen shot shows the Finder window in the foreground and the GitHub client window in the background.  As can be seen, Kirby has added three files, and these show up in both applications.  Kirby has no more need for the Finder and closes it to get a better view of the local repository in the GitHub client window.

3b.  Kirby is now going to save the updated state of the local repository, which is called committing it.  Committing is a local action, and can be thought of as taking a snapshot of the repository at this point in time.  Kirby turns his attention to the bottom part of the screen.  To commit, Kirby must add a log entry, which in this case is, "Added three great photos."  The log will contain not only this message, but a description of what files were added, when, and by whom.  This log message is enforced---one cannot make a commit without it.  Finally Kirby presses "Commit to master."

3c.  Kirby now has to push his changes from the local repository to the GitHub server so everyone may see them.  He can do so by pressing the "sync" button.

That's it.  Kirby's additions are now available to everyone at github.com/KirbyHerby/data

Suppose Kirby realizes that he had forgotten his absolutely favorite photo of him hugging his favorite toy, Panda.  So he copies the photo over in Finder, commits a new version of the repository with a new message, and syncs up the local with the GitHub server version.
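For those who installed the command-line tools, the same add, commit, and sync cycle can be scripted.  The following is a rough sketch rather than the script behind my nightly upload.  The helper names archive_commands and archive are made up for this example, the local folder is assumed to be the "data" clone from step 2c, and by default the script only prints the commands instead of running them.

```python
import subprocess


def archive_commands(message, paths=(".",)):
    """Return git commands that roughly mirror Kirby's steps 3a to 3c."""
    return [
        ["git", "add", *paths],              # stage new or changed files
        ["git", "commit", "-m", message],    # local snapshot with a log message
        ["git", "push"],                     # send the commits to GitHub
    ]


def archive(repo_dir, message, dry_run=True):
    """Print each command, and run it inside repo_dir unless dry_run is set."""
    for cmd in archive_commands(message):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, cwd=repo_dir, check=True)


# Dry run: nothing is executed, the three commands are only printed.
archive("data", "Added Kirby hugging Panda")
```

Two caveats apply.  The client's "sync" button also fetches changes from the server, whereas "git push" only sends them.  And "git commit" exits with an error when nothing has changed, so an unattended nightly job would need to handle that case.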

There is a lot more to Git and GitHub than this.  Git and GitHub are very powerful, so much so that they are the default for open-source software development world wide.  Multiple people may work on multiple parts of the same project.  Git and GitHub have support for branches, tagging versions, merging files, and resolving conflicts.  More about the system may be learned by studying the wonderful Git Book at git-scm.com/book/en/v2.

Finally, you may wonder why Kirby wanted to post these photos.  Well, Kirby doesn't know anything about Bayesian statistics, but he is loyal.  He knows I advocate Bayes factors.  He also knows that others who advocate ROPEs and credible intervals sell their wares with photos of dogs.  Kirby happens to believe that by posting these, he is contributing to my Bayes-factor cause.  After all, he is cuter than Kruschke's puppies and perhaps he is more talented.  He does know Git and GitHub and has his own repository to prove it.