Chapter 4: Observational Research

Important Terms

You should know the definitions of the following terms. You should also be able to apply these concepts (i.e., recognize examples of them in several contexts and use them to critically evaluate a study).

case study
construct validity
demand characteristics
descriptive observations
deviant-case analysis
ecological function
ecological validity
forced-choice test
internal validity
interobserver reliability
motivated forgetting
naturalistic observation
relational observations
response acquiescence
response styles
response deviation
social desirability
stratified sample
subject (participant) roles
unobtrusive measures
volunteer problem



We have seen that science is concerned with accurately describing a phenomenon (identifying the important dimensions and relevant variables), specifying the relationships between two or more variables (e.g., the effect of an IV on a DV), and explaining why these relationships exist (developing theories that organize our observations and generate predictions about future observations). Thus, the first step in the scientific process is observing/describing the phenomenon.

As the text's examples of perceptual illusions illustrate, although it may be that "seeing is believing," our perceptions certainly do not always correspond accurately to objective reality. What amazes me to this day is how compelling these illusions are--even when they have been explained (so we understand, or "know," their basis), we can't overcome them--our brain persists in misperceiving these stimuli. Thus, as scientists, we must be constantly vigilant about the various threats to the validity (i.e., accuracy) of our observations.

We saw in Chapter 3 that reliability (consistency) of measurement is essential to scientific observation. Reliability is a necessary, but not a sufficient, condition for validity (accuracy) of measurement. That is, reliability is a "prerequisite" of validity (a measure can't be valid if it is not consistent), yet a scale can be reliable and still not be valid. For example, we could develop an "algebra test" that would yield very similar scores each time it is administered, but if the test purported to measure "reading comprehension," then although the test is reliable, it is not an accurate measure of reading comprehension.

So we need to be careful to ensure the validity as well as the reliability of our observations. I like to play darts, so I'll give a "dart board" example of the relationship between reliability and validity. Think of observation/measurement as trying to "hit the bullseye" on the board (i.e., that's the variable, or construct, we are trying to measure). In the worst-case scenario, a measure is neither reliable nor valid--my friend, Jody, illustrates this: her darts land all over the board (if they hit it at all!), and if she hits the bullseye, it's a random event! Next, a measure can be reliable, yet still be invalid--my friend, Ray, is very consistent: all of his darts land close to each other (near the same place), but the problem is that they miss the bullseye! What we strive for is a measure that is both reliable and valid: now, when I throw the darts (of course!), most of them consistently hit the target (bullseye), so I am both consistent and accurate--thereby winning the game whenever we play (well, ok--maybe I'm stretching the truth a bit, but hopefully you get the point!).
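If it helps to see the dart board example in numbers, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the dart coordinates are made up, the bullseye is placed at (0, 0), and the function names spread and mean_error are hypothetical. Spread stands in for (un)reliability and mean error stands in for (in)validity.

```python
import math

def centroid(throws):
    """Center point of a group of darts."""
    xs = [x for x, _ in throws]
    ys = [y for _, y in throws]
    return (sum(xs) / len(xs), sum(ys) / len(ys))

def spread(throws):
    """Average distance of each dart from the group's own center.
    A small value means the throws are consistent (reliable)."""
    cx, cy = centroid(throws)
    return sum(math.hypot(x - cx, y - cy) for x, y in throws) / len(throws)

def mean_error(throws, bullseye=(0.0, 0.0)):
    """Average distance of each dart from the bullseye.
    A small value means the throws are accurate (valid)."""
    bx, by = bullseye
    return sum(math.hypot(x - bx, y - by) for x, y in throws) / len(throws)

# Made-up dart positions (illustrative only, not real data).
jody = [(-8, 6), (7, -9), (9, 8), (-6, -7)]              # all over the board
ray = [(5.0, 5.0), (5.5, 4.5), (4.5, 5.5), (5.0, 5.2)]   # tight cluster, off target
me = [(0.2, -0.1), (-0.3, 0.2), (0.1, 0.3), (-0.1, -0.2)]  # tight cluster, on target

for name, throws in [("Jody", jody), ("Ray", ray), ("Me", me)]:
    print(f"{name}: spread={spread(throws):.2f}, mean error={mean_error(throws):.2f}")
```

With these made-up throws, Jody comes out large on both numbers, Ray has a small spread but a large mean error, and only the third thrower is small on both. Note that a large spread forces a large mean error, which is the sense in which reliability is a prerequisite for validity.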

Your authors touch on this issue in Chapter 3, where they discuss the "predictive validity" of measurement scales (a good test will yield strong, positive correlations between scores on the test and scores on some relevant behavior the test should be able to predict). For example, we become more confident of the SAT if it reliably predicts college GPA (high SAT scores are related to high GPA). The present chapter discusses some other aspects of validity, but they all relate to the accuracy--or truth--of our observations.
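To make the idea of predictive validity concrete, here is a minimal sketch that computes a Pearson correlation between test scores and a later outcome. The SAT scores and GPAs below are made up for illustration (they are not from the text or any real data set), and the function name pearson_r is just a label I chose.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the sums of squared deviations.
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# Hypothetical students: SAT score at admission, GPA after the first year.
sat = [1000, 1100, 1200, 1300, 1400, 1500]
gpa = [2.4, 2.9, 2.7, 3.3, 3.4, 3.8]

r = pearson_r(sat, gpa)
print(f"Correlation between SAT and GPA: r = {r:.2f}")
```

A strong positive r in a real sample would be evidence of predictive validity; a value near zero would suggest the test does not predict the behavior it should.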


Much research in psychology can be classified as "descriptive observation," that is, empirical data that have been obtained from systematic observation. This section describes several examples of observational research and discusses how to ensure the reliability and validity of these observations.



Despite its advantages, observational research is limited in that it is purely descriptive and does not allow inferences about cause-effect relationships. We do not have the control found in an experiment, so there are a host of uncontrolled extraneous variables that could account for the relationships observed. Thus, descriptive research tends to be low on internal validity. Sometimes we can't be confident of the reliability of the observations, especially if they are difficult for other observers to replicate. The tendency to anthropomorphize (attributing human characteristics to animals or inanimate objects) is hard to resist in many observational studies.
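One way to check whether observations can be replicated by another observer is to have two people code the same behavior independently and compare their records. The sketch below is only an illustration: the behavior codes and both observers' records are made up, and Cohen's kappa is a standard agreement statistic that I am bringing in here as an assumption (it may not be the index your text uses for interobserver reliability).

```python
from collections import Counter

def percent_agreement(a, b):
    """Proportion of observation intervals on which two observers agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    """Agreement corrected for the agreement expected by chance.
    Undefined (division by zero) if chance agreement is exactly 1."""
    n = len(a)
    p_observed = percent_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement: product of each observer's category proportions.
    p_chance = sum(counts_a[c] * counts_b[c] for c in set(a) | set(b)) / (n * n)
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical codes from two observers watching the same ten intervals.
observer_1 = ["play", "play", "groom", "rest", "rest",
              "play", "groom", "rest", "play", "rest"]
observer_2 = ["play", "play", "groom", "rest", "play",
              "play", "rest", "rest", "play", "rest"]

agreement = percent_agreement(observer_1, observer_2)
kappa = cohens_kappa(observer_1, observer_2)
print(f"Percent agreement: {agreement:.0%}, kappa: {kappa:.2f}")
```

Kappa comes out lower than raw percent agreement because some agreement would occur even if both observers were guessing.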

The remainder of this chapter considers other sources of error in observation, as well as ways to guard against these problems.