Are Mice Reliable Models for Human Disease Studies?


An article published in the New York Times on February 11 questioned the efficacy of using mice to model human diseases that involve the immune system. The article reports findings of a research paper by Seok et al. published in Proceedings of the National Academy of Sciences on February 10. The researchers assert that years and billions of dollars were wasted studying mice.

So should we scrap mice as a model system for human disease, or is there more to the story? 

The original paper compared the genetic responses of humans and mice following an injury, such as a burn. Biomedical researchers assume that if mice and humans have a similar genetic response to an injury, then treatments that are effective on mice should also be effective on humans. Seok et al. found that mice had a much different genetic response than humans following an injury. They concluded that mouse models are poor representatives of the human inflammatory response, and that this is why many therapies fail to translate from mice to humans.

The study results are interesting and call for further analysis and investigation. But experienced mouse geneticists point to one problem with the study, and perhaps the reason it was rejected by prestigious journals such as Nature and Science: it examined only one strain of mice, C57BL/6 (B6).

There are hundreds of strains of laboratory mice, each with a different genetic composition and different disease susceptibilities. Each strain of mice is comparable to a single human in a population. Just as no one human is representative of an entire population of humans, no one strain of mouse is representative of all mice, much less humans. When constructing a mouse study, researchers should ideally use several strains of mice to better model a human patient population.

To clarify, consider this example. Acetaminophen, the active ingredient in Tylenol, has variable toxicity in humans likely based on their genetic background. Some people can ingest elevated amounts of acetaminophen and experience no side effects, while other people can ingest a small amount of acetaminophen and suffer liver damage.

Response to acetaminophen has been tested in a number of mouse strains. Some strains, such as CAST, are nearly immune to liver damage from acetaminophen. Other strains, such as CBA, suffer toxic effects from the drug even at low doses. Most strains fall somewhere in between. If a researcher only used the CAST strain to analyze acetaminophen toxicity, he would conclude that acetaminophen is completely safe. If he only used CBA, he would conclude that acetaminophen is completely unsafe.

Seok et al. only tested one strain of mice yet concluded that all mice are poor models of human inflammation. If more strains had been analyzed, there might well be variation in inflammatory response, as seen with acetaminophen response.
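The single-strain problem can be illustrated with a small Python sketch. The strain names CAST and CBA come from the acetaminophen example above; everything else (the numeric damage scores, the STRAIN_X and STRAIN_Y placeholders, the threshold, and the function names) is invented for illustration and is not measured data.

```python
from statistics import mean

# Hypothetical liver-damage scores (0 = none, 1 = severe) at one fixed dose.
# All values are invented for illustration; they are NOT measured data.
HYPOTHETICAL_DAMAGE = {
    "CAST": 0.05,      # described in the text as nearly immune
    "STRAIN_X": 0.40,  # placeholder for an intermediate strain
    "STRAIN_Y": 0.55,  # placeholder for an intermediate strain
    "CBA": 0.90,       # described in the text as sensitive even at low doses
}

TOXIC_THRESHOLD = 0.5  # assumed cutoff for calling the drug "unsafe"


def verdict_from_strains(strains):
    """Return 'safe' or 'unsafe' from the mean score of the chosen strains."""
    avg = mean(HYPOTHETICAL_DAMAGE[s] for s in strains)
    return "unsafe" if avg >= TOXIC_THRESHOLD else "safe"


def fraction_affected(strains):
    """Fraction of the chosen strains whose score meets the threshold."""
    hits = sum(HYPOTHETICAL_DAMAGE[s] >= TOXIC_THRESHOLD for s in strains)
    return hits / len(strains)


if __name__ == "__main__":
    # A study limited to one strain reaches opposite conclusions
    # depending on which strain happened to be chosen.
    print("CAST only:", verdict_from_strains(["CAST"]))
    print("CBA only:", verdict_from_strains(["CBA"]))
    # A panel of strains reports the spread instead of a single verdict.
    panel = list(HYPOTHETICAL_DAMAGE)
    print("Panel fraction affected:", fraction_affected(panel))
```

With these made-up numbers, a CAST-only study and a CBA-only study disagree completely, while the panel shows that only some strains are affected. That is the kind of variation a single-strain study cannot reveal.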

To better model human populations using mice, an international collaboration of geneticists produced a reference panel of mice called the “Collaborative Cross.” The Collaborative Cross mixes the genetics of eight founder strains to yield over 300 new mouse strains, collectively capturing nearly 90% of known genetic variation in laboratory mice. The Collaborative Cross recapitulates the genetic diversity seen in human populations.

Researchers studying the inflammatory response to injury could use the Collaborative Cross to identify strains of mice that closely model human inflammatory response. Thus, while the B6 mouse strain appears to be a poor model for studying inflammatory response to injury, the genetic diversity present in the Collaborative Cross promises much better mouse models.

The findings of Seok et al. are solely applicable to the B6 strain of mice in the three models of inflammation they tested. They unduly generalize these findings to mouse models of inflammation in general.

Mouse models are an indispensable tool in biomedical science. We owe a great deal of our understanding of health, disease, behavior and general biology to mouse models. It’s unfortunate that the New York Times chose to publish a story on what appear to be seriously flawed conclusions based upon the available data.

Instead of “Mice Fall Short as Test Subjects for Humans’ Deadly Ills,” a more appropriate title for the New York Times article would have been “The B6 Mouse Strain Poorly Reflects Human Inflammatory Response.”

19 thoughts on “Are Mice Reliable Models for Human Disease Studies?”

  1. I think another good candidate for a title would be “Researchers Over-Interpret Mouse Data”. I have seen lots of hand-wringing about the lack of predictivity of mouse models, but in my experience the ‘failures’ of these models have more to do with the experimental design, how the data is analyzed, and/or how it’s interpreted. The literature is full of examples of improper statistics (e.g., using a t-test instead of ANOVA), poorly designed experiments (such as starting a therapeutic regimen on a tumor before it’s established), and overly broad conclusions as described above. But when things go wrong in the clinic the models get most of the blame.

    • Yes, that title would be fitting as well. I agree there are often mistakes in the actual design or carrying out of the experiment. I think oftentimes when researchers find an exciting result they jump to broad conclusions too quickly, before verifying the results.

  2. Nicely stated. We use B6 mice for mechanism studies and outbred CD-1 mice for pre-clinical type research. This collaborative cross mouse model seems great and will be an excellent tool.

    I believe the authors greatly overstated their conclusions because the experiments were not designed to do the type of analysis that they performed. Cells may not have been prepared the same way.

    • Yes, using outbred models and using multiple strains is definitely needed in these types of experiments. That’s an interesting point about cells not being prepared in the same way. It could be that different protocols were used when analyzing human vs mouse gene expression since it was done by different groups. That would be interesting to investigate.

  3. Great article! Mice can be powerful models of human disease. While we have to ensure that we optimize the models to more carefully reflect human pathology (and understand how they differ from the human diseases), they have been invaluable in primary & clinical research. The NYT article has the potential to do a lot of harm in the eyes of the public who don’t see the “other side”.

  4. Larger concern for me is that I am also seeing these kinds of responses in my summary statements. Everyone expects the phenotype to fully recapitulate the human. These models are based on evaluating small parts of a much larger condition in humans (e.g. follow a single pathway in liver). Once someone has genomic data in general they have opened Pandora’s box, people, mice or whatever. You need to cull that data to the biology you are following in either setting or you are set to fail.

    For those using CD1, they are not the best replacement of inbreds, still some very weird genetic interactions going on there. The diversity outbreds that are being worked on at Jackson (developed from the CC) may be a better option, but only time will tell.

      • But don’t you think that the majority of people who are ‘being careful about which strains they use’ are careful to select the strain that gives them the best outcome for the test they are trying to perform? That has been my experience in mouse immunology at the very least. If you never read human studies, test human samples, or interact with labs doing human research how can one possibly conclude what the ‘best’ mouse strain for modeling humans is?

      • Jeff, I think this is why collaboration is so important. There needs to be crosstalk between human and mouse researchers. This is why conferences are so important: to understand areas outside of the particular pathway one is researching every day. This is also why most academic programs have weekly seminar speakers and journal clubs. In order to perform good science, one must become deeply immersed in the specifics of their project, but also must take time to look more broadly. Many times, great discoveries come from integrating two distantly related fields.

        As far as picking mouse strains, I hope researchers don’t simply pick the types they’re most likely to get a positive result from. If they do, this will likely come out in the peer review process and later in their grant applications.

  5. Thank you. As another example: Adult mice are extremely resistant to the toxic and carcinogenic effects of aflatoxin, a potent human carcinogen. This is due to a subunit of glutathione-S-transferase (mGSTA3) that is not found in humans, rats, trout or other aflatoxin susceptible species. By constructing a mGSTA3 knockout mouse, our laboratory has produced a mouse model that is analogous to humans in regard to sensitivity to aflatoxin. The message is that you need to select the mouse model that reflects the human condition.

  6. Pingback: Never Trust a Rat, or a Mouse « Seeing the Sword

  7. In the spirit of fairness can we also agree that the vast majority of papers published in prestigious journals using the mouse only use a single strain, gender, and age of mice to support their claims? A triple knockout, CRE-inducible, GFP expressing C57BL/6 mouse is still a C57BL/6 mouse. Can you imagine how a journal would react if I submitted a paper using a single human donor for all of my experiments but performed each experiment 3 times? Is it not time that we have the same attitudes towards using a single strain of mouse? The common response is that it is too difficult to make transgenics on different backgrounds… Well I can tell you it is pretty difficult to find enough human donors to put together a rigorous study as well but that doesn’t stop many of us.

    • Thanks for your comment, Jeff. Agreed, the findings from a study of one strain of mouse cannot be generally applied to other strains or humans until the experiment is repeated in other strains. I think the difference is that mouse knockout studies in one particular strain are looking more for a mechanism or are looking to fill in missing molecular links in a pathway. It’s generally not a preclinical type of experiment. I would hope that the researchers wouldn’t generalize that finding to all types of mice. If they do, I’ll call them out 🙂

  8. Pingback: Of Mice and Men Again: New Genomic Study Helps Explain why Mouse Models of Acute Inflammation do not Work in Men | Laika's MedLibLog

  9. An informative post that has shifted my view of the research results reported. However, a point you miss in your analysis is the frequency of using the B6 mouse strain in published reports of development of drugs to treat the targeted diseases. If the B6 strain is commonly used, then their main points are probably valid. If not, then their research might be largely irrelevant. Do you know which is true? Thanks for this!

  10. Indeed, Bill, this is an informative post that has contributed to my view about this study.

    BL, you are right about this. The use of one single mouse strain wouldn’t be important if this was the sole rodent model used. But it is not. More importantly, the B6 model seems to be quite resistant to sepsis. Another post addresses this.
    Mark Wanner of the Jackson Laboratory [quote]:

    “It is now well known that some inbred mouse strains, such as the C57BL/6J (B6 for short) strain used, are resistant to septic shock. Other strains, such as BALB and A/J, are much more susceptible, however. So use of a single strain will not provide representative results.”

    In general, however, all models failed till now (not only the mouse models). So one other shortcoming of this study is that it offers no new insights. (& it only applies to sepsis)

  11. How often do researchers use multiple strains of mice in a single study (or across an entire career)? Most studies I read in Nature journals and Cell Press journals not only use a single strain but also use a selected age range and gender. Maybe it is high time for people to do some soul searching about the intent of their research. Getting published in top journals is a game in science for many people, and doing mechanistic mouse studies focused on a single strain is (still) the way to win.

    We need to get over our obsession with striking results that are only possible with cloned, inbred mouse strains selected for giving strong responses. If I picked a single human donor to test based on a particularly strong response and did 10 experiments on that one person I would be laughed out of the conference.
