Week 2. Data: objectivity, management, and presentation


Modern science is often called “hypothesis-driven”, and “phenomenological studies” may be accorded lower status. However, the very existence of a hypothesis introduces the possibility of bias (conscious or not) on the part of the investigator: in designing or interpreting experiments, and even in deciding which data are “acceptable” and which should be disregarded. As scientists, we often respect researchers who can “winnow the wheat from the chaff” and “show the big picture”, but it can be a short step from there to “overlooking the obvious” and “willful dismissal of contradictory observations”.

Objectivity can be lost in many ways. Data can be dismissed or simply ignored. Specific data values can be “corrected” (without actual fabrication). Control experiments can be omitted or “explained away”. Contradictory observations from other labs, or other sets of experiments, can be ignored. Statistical tests can be avoided, or selectively applied or reported.

Going beyond the data, objectivity can be lost in presenting or testing alternative hypotheses. Sometimes an individual investigator will have a favored hypothesis. However, probably more frequently the “most popular” hypothesis in the field will be uncritically applied to the data. 

    • What criteria or approaches are appropriate to avoid subjective pruning of the data?
    • Who should set the criteria?
    • How should differences in opinion be settled when questions arise about data selection?
    • How does the common attitude toward “pilot experiments”, whose results are often dismissed, differ from post-hoc editing of data sets?
    • What criteria are appropriate for deciding whether the data are consistent with the hypothesis to be tested, or lead to its rejection?


Ziman J. Is science losing its objectivity? Nature 1996; 382:751-754.

Data Management

Scientific research relies upon proper record keeping. Yet, training in record keeping is often passive. What are our obligations for managing data?

“Data Book Zen” from Francis Macrina’s text, Scientific Integrity, 3rd edition, 2005, ASM Press

    Useful data records explain:
    • What you did
    • Why you did it
    • How you did it
    • When you did it
    • Where materials are
    • What happened (and what did not)
    • Your interpretations
    • Contributions of others
    • What’s next
    Good data records:
    • Are legible
    • Are well organized
    • Are accurate and complete
    • Allow repetition of experiments
    • Are compliant with granting agency and institutional requirements
    • Are accessible to authorized persons, stored properly, and appropriately backed up
    • Are the ultimate record of your scientific contributions

Are these appropriate standards?
How do you achieve them?
Who is responsible for data management?
     ...the institution?
     ...the PI?
     ...the laboratory worker?

What are the NIH requirements for data management?

Data Presentation

How should data be prepared for publication? Is it manipulation to “photoshop out” a spurious smudge in an image of a Western blot? Should a gel be cropped to show only the bands of interest, to save space?


Rossner M, Yamada KM. What’s in a picture? The temptation of image manipulation. J Cell Biol 2004; 166:11-15.

It May Look Authentic; Here's How to Tell It Isn't
By Nicholas Wade, January 24, 2006 NY Times

Nature editorial: Not picture-perfect

Nature: Guide for digital images

Sample Scenarios

Data Objectivity (from a former DBBS student)
Cast of characters:
Dr. Bane – PI of the lab
Dr. Clark – Semi-evil post-doc
Dan – generally good, but naïve (or perhaps greedy) grad student

Dr. Bane’s lab is getting ready to write a big, really important, earth-shaking paper that will guarantee funding for the next 10 years. But it has to go out quickly, because another lab on the east coast is about to publish the same data. The #2 lab isn’t going to get any grant money, and Stockholm isn’t going to call #2.

There’s one final experiment to do: a cell-culture experiment measuring regulation of CMV-luciferase expression, with a luciferase reading recorded for each well and β-gal measured for normalization.

Wells 1-3 are treated with growth factor; wells 4-6 are treated with vehicle. Dr. Clark feels that the data from well #2 should be thrown out because they don’t fit nicely. The other data imply that GF treatment activates the reporter, but keeping well #2 in the analysis would indicate repression. What do you do? What if well #2 had a fungal contamination? Would that change your decision? Would the knowledge that fungal metabolites alter transcription of the CMV promoter affect your decision? Remember, there isn’t enough time to repeat the experiment.
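The arithmetic behind the dispute can be made concrete with a short sketch. The numbers below are invented purely for discussion (they are not data from the scenario); the point is how excluding a single well can flip the apparent direction of the effect:

```python
from statistics import mean

# Invented, purely illustrative luciferase/b-gal normalized ratios, one per well.
gf_treated = {1: 1.4, 2: 0.1, 3: 1.3}   # well 2 is the point that "doesn't fit"
vehicle    = {4: 1.0, 5: 1.1, 6: 0.9}

with_well2    = mean(gf_treated.values())                          # all three GF wells
without_well2 = mean(v for w, v in gf_treated.items() if w != 2)   # well 2 excluded
vehicle_mean  = mean(vehicle.values())

print(f"GF mean, all wells:   {with_well2:.2f}")    # below vehicle: looks like repression
print(f"GF mean, well 2 out:  {without_well2:.2f}") # above vehicle: looks like activation
print(f"Vehicle mean:         {vehicle_mean:.2f}")
```

With these invented values, the treated mean sits on opposite sides of the vehicle mean depending on a single exclusion decision, which is exactly why criteria for dropping a data point need to be set before the result is known.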

The paper also involves analysis of a mouse treated with experimental gene therapy that could have a profound impact on cardiovascular disease. The research was funded under an agreement among the university, the PI, and the giant chemical and pharmaceutical company in the neighborhood. The PI has applied for a patent on the treatment, and revenue will be shared among the company, the university, the PI, and the grad student.

Mouse aortas from treated and control mice are evaluated by histology. The grad student scores the data unblinded and bases his decisions on whether the arterial wall looks thicker. Dr. Clark, the post-doc, has been asked to help evaluate the sections. He scores them single blinded, having his friend down the hall judge which sections have thicker arterial walls. Does Dan, the grad student, lose his objectivity because he stands to gain from a positive result? Or is Dan just sloppy? Suppose Dr. Clark had suggested that Dan actually measure arterial wall thickness and do statistical analysis. Would that change your opinion of Dan’s method and objectivity? What if measuring the walls resulted in the same outcome as Dan’s unblinded scoring method?
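Dr. Clark’s single-blind procedure can be sketched in a few lines. Everything here (mouse IDs, group sizes, slide codes) is invented for illustration; the essential feature is that the scorer never sees treatment labels until scoring is complete:

```python
import random

# Purely illustrative single-blind setup; mouse IDs and group sizes are invented.
rng = random.Random(42)  # fixed seed so the illustration is reproducible

sections = [(f"mouse_{i:02d}", "treated" if i < 4 else "control") for i in range(8)]
shuffled = sections[:]
rng.shuffle(shuffled)

# The key linking coded slides to mice/groups is held by someone other than the scorer.
key = {f"slide_{n:02d}": (mouse, group) for n, (mouse, group) in enumerate(shuffled)}

# The scorer works only from coded slide IDs -- no mouse IDs, no treatment labels.
blinded_ids = sorted(key)

# Only after all thickness scores are recorded is the key used to unblind the groups.
groups = [key[s][1] for s in blinded_ids]
```

Blinding removes the scorer’s opportunity to favor a result, but as the scenario notes, it does not by itself replace an objective measurement of wall thickness with statistical analysis.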
Data Reduction Techniques (from Onlineethics.org)

by Caroline Whitbeck, Ph.D.

You are working on a team of undergraduates, headed by a graduate student, that is attempting to replicate a result obtained by another group the previous spring. Your supervising professor needs your experiment completed in time for an upcoming conference at which the results are to be presented. You rigorously perform the tests and collect the data, which are then subjected to a series of reduction and transformation programs on the computer. The team finds that each data-reduction step distorts the results to the extent that they no longer match the subtle phenomena that were observed. These are the same phenomena that the group that obtained the earlier results chose to consider negligible.

What do you do and how do you go about it?
