Ch. 9: Complex Designs
Important Terms
You should
know the definitions of the following terms. You should also be able to apply
these concepts (i.e., recognize examples of them in several contexts and use
them to critically evaluate a study, as well as apply them in the design of
your research proposal).
block randomization
cell means relevant to interaction effects
crossover interaction
df between, df within, df total
interaction effects
main effects
marginal means relevant to main effects
matched-groups design
mixed designs
random-groups designs
SS between, SS within, SS total
treatment x treatment x subjects
2 x 2 between groups factorial design
3 x 3 between groups factorial design
2 x 2 within groups factorial design
2 x 2 mixed model design
2 x 2 x 2 factorial design
FACTORIAL DESIGNS
The reasons for employing factorial designs were discussed in Chapter 6.
I've pasted my lecture notes about this below:
- Adding IV's
- This is where things
really start to get interesting from a scientific perspective (and
my own!). So far we have mainly talked about experiments which study the
effects of one IV on one DV. But human behavior is much more complex
than can be understood by research that only looks at one IV-DV relationship.
- For example, exam
performance is likely to be influenced by a multitude of variables, and
"incentive" is only one of them. Exam performance is likely
to be complexly determined by more than one causal variable, and our research
should strive to reflect this complexity. In other words, our subject
matter is too complex for us to hope to make progress by simple "one-shot"
studies involving only one IV.
- The solution to this
problem: include two or more IV's in the experiment (see
above discussion of extraneous variables). Your authors discuss several
reasons besides the above for complex experimental designs:
- Greater efficiency/economy: It's easier to do one experiment including
two IV's than to do two separate experiments--it's like getting "two
for the price of one" (though not quite, since complex designs
require more participants).
- Greater experimental control is obtained over extraneous variables,
because these are more likely to be constant across conditions in
a single experiment conducted at one time than in two separate experiments
conducted at different times.
- Greater generality of results: if it can be shown that the effects
of one IV on the DV are the same at both levels of the second IV,
then this tells us that its causal effects generalize to multiple
conditions/situations.
- Greater "interest-value" of the results: the concept of
interaction effects among multiple IV's is an extremely
important "bonus" (at no extra charge!) of a complex experimental
design (and note that it is impossible to uncover interaction
effects in single-IV experiments).
- An interaction
between IV's means that the effect of one IV depends on
the level of the second IV. In other words, the effect of IVa may be different
at Level 1 of IVb than at Level 2 of IVb.
- Interactions are
of great interest to psychologists (and me in particular--indeed, colleagues
call me "Mr. Interaction," because my own research is always
multifactorial--I'm always looking for interesting interaction effects!).
They get at the complexity of behavior discussed above. As one researcher
put it: "It's a pretzel-shaped world we live in, so we need to develop
pretzel-shaped experimental designs to understand it."
- The "Simplest" Complex Design: A 2 x 2 Between-Groups Factorial
- The abbreviation, "2
x 2," indicates that there are two between-groups "factors"
(that is, two IV's: "Factor A" and "Factor B"), with
two levels of each factor/IV. A "factorial" design means that
all possible combinations of the levels of each IV/factor are created.
By multiplying the number of levels of each factor, we can determine how
many possible combinations of the two levels of each factor there are
(i.e., 2 x 2 = 4 possible combinations). The four possible combinations
are illustrated in the four "cells" of Table 10-1 below:
Table 10-1. The "Basics" of a 2 x 2 Factorial Design
(Marginal Means reflect Main Effects of Factors A & B)
(Cell Means reflect the A x B Interaction Effect)

| Factor B | Factor A: A1 | Factor A: A2 | |
|---|---|---|---|
| B1 | (A1B1) Cell Mean | (A2B1) Cell Mean | B1 Marginal Mean |
| B2 | (A1B2) Cell Mean | (A2B2) Cell Mean | B2 Marginal Mean |
| | A1 Marginal Mean | A2 Marginal Mean | |
- Note that since this is a completely
between-groups design, the experiment would be conducted on four independent
groups of participants, i.e., each participant will be randomly assigned
to one of the four cells shown above. Thus, if we had 40 total participants,
10 participants would be assigned to each of the four "conditions"
of a particular combination of the levels of the two factors.
- The appropriate analysis of
the DV for such a design would be a 2 x 2 factorial ANOVA, which
will test for significant differences between the "Marginal Means,"
that is, the two "column" means (the difference between levels
A1 vs. A2 of Factor A) and the two "row" means (the difference
between levels B1 vs. B2 of Factor B). These are called the
"Main Effects" of each factor, because they reflect the overall
differences between the levels of each factor, averaged across the
levels of the other factor.
- This ANOVA will also test for
significant differences between the "Cell Means," that
is, the four means relevant to the "A x B Interaction Effect." As
noted above, this test will determine whether the effects of the IV's are
independent of each other, or whether the effect of one IV
depends on the level of the other factor.
- For the above 2 x 2 factorial
design with a total of 40 participants (n = 10 participants per cell),
a "Source Table" will be produced containing three F-ratios
("Fob"), one to test for the significance of each
Main Effect, and one to test for the Interaction Effect (where df total
= Ntot - 1; df for each main effect = # of levels - 1; df for the interaction
= dfa x dfb; and df-error = Ntot - # of cells):
| Source | df | SS | MS | F | p |
|---|---|---|---|---|---|
| Between-Groups | | | | | |
| -Factor A Main Effect | 1 | | | Fob | |
| -Factor B Main Effect | 1 | | | Fob | |
| -A x B Interaction | 1 | | | Fob | |
| Within-Groups | | | | | |
| -Error (Residual) | 36 | | | | |
| Total | 39 | | | | |
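As a quick check on the df rules given above, here is a minimal Python sketch. It is not from your textbook: the function name `factorial_df` is made up for illustration, and it assumes a two-factor between-groups design with equal cell sizes.

```python
def factorial_df(levels_a, levels_b, n_per_cell):
    """Degrees of freedom for a two-factor between-groups ANOVA.

    Hypothetical helper; assumes the same number of participants in every cell.
    """
    n_cells = levels_a * levels_b
    n_total = n_cells * n_per_cell
    df_a = levels_a - 1            # main effect of A: # of levels - 1
    df_b = levels_b - 1            # main effect of B: # of levels - 1
    df_axb = df_a * df_b           # interaction: dfa x dfb
    df_error = n_total - n_cells   # error: Ntot - # of cells
    df_total = n_total - 1         # total: Ntot - 1
    return {"A": df_a, "B": df_b, "AxB": df_axb,
            "error": df_error, "total": df_total}


df = factorial_df(levels_a=2, levels_b=2, n_per_cell=10)
print(df)

# The four sources should account for every degree of freedom in the total.
assert df["A"] + df["B"] + df["AxB"] + df["error"] == df["total"]
```

Changing the arguments to `factorial_df(3, 3, 10)` applies the same rules to a larger design.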
- If p < .05, then
the effect is "statistically significant," and the
researcher would then interpret the differences between the marginal means
and/or cell means. There are a variety of possible outcomes
of such an analysis, ranging from all three effects being significant, through
one or both main effects being significant, only the interaction effect
being significant, to none of the three effects being significant.
We turn next to a consideration of some possible outcomes. In all
hypothetical examples below, assume that Factor A is Sex of participant
(Female vs. Male), Factor B is "Provocation" (Not Angry, written "~Angry," vs. Angry),
and the DV is amount of aggression exhibited by the participant.
- Example 1: Both main
effects are significant, and the A x B interaction is not significant:
Figure 10-6. Additive Main Effects & No Interaction Effect

| Provocation (Factor B) | Sex (Factor A): Females (A1) | Sex (Factor A): Males (A2) | Marginal Mean |
|---|---|---|---|
| ~Angry (B1) | 20 (A1B1) | 40 (A2B1) | 30 |
| Angry (B2) | 80 (A1B2) | 100 (A2B2) | 90 |
| Marginal Mean | 50 | 70 | |
- The main effects above are
"additive," because the interaction is not significant. That
is, their effects on the DV are independent of each other. This can be
seen by starting with the above cell mean of "20." The other
three cell means can be computed simply by adding the amount of each main
effect (a 20-unit difference for Factor A and a 60-unit difference for
Factor B). Thus, the cell mean for A2B1 is 20 + 20 = 40, the cell mean
for A1B2 is 20 + 60 = 80; the cell mean for A2B2 is 80 + 20 = 100 (or,
alternatively, 40 + 60 = 100).
- The above four cell means
are graphed in the figure shown below. Note that the lines are parallel,
indicating the absence of an interaction. Both lines show the same 20-unit
main effect of Factor A (i.e., they show a 20-unit increase between A1
and A2 at both levels of Factor B), and the distance between the lines
shows the same 60-unit main effect for Factor B (i.e., a 60-unit
increase between B1 and B2 at both levels of Factor A). Note that the
lower-case letters (called "subscripts") next to each cell mean
indicate which means are significantly different (means with
different subscripts are significantly different; means with the same
subscript are not). In this example, all
four cell means are significantly different from each other.
- The interpretation of this
would be as follows: Both female and male participants were significantly
more aggressive when angry than when not angry, and men were significantly
more aggressive than were women both when they were angry and when they
were not angry. The highest level of aggression was shown by angry men,
and the least aggression was shown by non-angry women. Finally, angry
women were more aggressive than non-angry men (this last comparison shows
that the anger variable is more important a determinant of aggression
than is the sex variable).
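The additivity check described for Example 1 can be sketched in a few lines of Python. This is only an illustration using the hypothetical cell means from the table; the dictionary layout and the helper name `marginal` are my own assumptions, and the check describes the pattern of means only (it says nothing about statistical significance, which requires the ANOVA on the raw scores).

```python
from statistics import mean

# Example 1 cell means, keyed by (sex, provocation).
cells = {
    ("female", "not_angry"): 20, ("male", "not_angry"): 40,
    ("female", "angry"): 80, ("male", "angry"): 100,
}


def marginal(cells, position, level):
    """Mean of every cell whose key has `level` at index `position`."""
    return mean(v for k, v in cells.items() if k[position] == level)


# Main effects are differences between marginal means.
effect_sex = marginal(cells, 0, "male") - marginal(cells, 0, "female")
effect_anger = marginal(cells, 1, "angry") - marginal(cells, 1, "not_angry")

# Rebuild every cell from the A1B1 baseline by adding the relevant main effects.
baseline = cells[("female", "not_angry")]
predicted = {
    (sex, anger): baseline
    + (effect_sex if sex == "male" else 0)
    + (effect_anger if anger == "angry" else 0)
    for (sex, anger) in cells
}

# If the rebuilt cells match the observed cells, the main effects are additive.
is_additive = predicted == cells
print(effect_sex, effect_anger, is_additive)
```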
- Example 2: Both main
effects are significant and the A x B interaction is also significant. In
this case, the main effects are "nonadditive," that is, the effect
of Factor A (Sex Differences) depends on the level of Factor B (Anger
Condition):
Figure 10-8. Nonadditive Main Effects & an A x B Interaction Effect

| Provocation (Factor B) | Sex (Factor A): Females (A1) | Sex (Factor A): Males (A2) | Marginal Mean |
|---|---|---|---|
| ~Angry (B1) | 30 (A1B1) | 30 (A2B1) | 30 |
| Angry (B2) | 70 (A1B2) | 110 (A2B2) | 90 |
| Marginal Mean | 50 | 70 | |
- Note that the main effects
above are identical to those shown in Example 1, but one cannot
compute the other three cells by simply beginning with the cell mean of
"30" for A1B1 and adding the main effects of the two factors
(e.g., for A2B1, 30 + 20 does not equal the value of "30" shown
in cell A2B1, nor does 30 + 60 equal the value of "70" shown
in cell A1B2, etc.).
- The above cell means are graphed
in the figure shown below. Note that the lines are not parallel,
indicating the presence of a significant interaction effect. It can be
seen that the effect of Factor A (Sex Differences) is different
depending on the level of Factor B (Anger Condition). Specifically,
the effect of Factor A occurs only at Level 2 of Factor
B (i.e., when participants were angry). At Level 1 of Factor B (when they
were not angry), there is no difference between Level 1 and Level 2 of
Factor A.
- The interpretation of the
above interaction is as follows: When angry, men were significantly more
aggressive than women. However, when not angry, there was no difference
in aggression between men and women. Thus, the sex differences in aggression
are only observed when participants are angry, and they disappear when
participants are not angry. Note that in this example, the main effect
of Anger still does occur for both men and women (i.e., both men
and women were significantly more aggressive when angry than when not
angry). However, the effects of anger appear to be greater in men than
in women.
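To make the Example 2 interpretation concrete, here is a small Python sketch that computes the effect of each factor separately at each level of the other, plus a single number summarizing the interaction. The variable names are assumptions of mine, and the numbers are the hypothetical cell means from the table; whether any of these differences is significant still has to come from the ANOVA and follow-up comparisons.

```python
# Example 2 cell means, keyed by (sex, provocation).
cells = {
    ("female", "not_angry"): 30, ("male", "not_angry"): 30,
    ("female", "angry"): 70, ("male", "angry"): 110,
}

# Sex difference (male minus female) at each anger level.
sex_diff = {
    anger: cells[("male", anger)] - cells[("female", anger)]
    for anger in ("not_angry", "angry")
}

# Anger effect (angry minus not angry) for each sex.
anger_diff = {
    sex: cells[(sex, "angry")] - cells[(sex, "not_angry")]
    for sex in ("female", "male")
}

# Interaction contrast: how much the sex difference changes across anger levels.
# A value of zero would mean parallel lines (no interaction in the means).
interaction_contrast = sex_diff["angry"] - sex_diff["not_angry"]

print(sex_diff, anger_diff, interaction_contrast)
```

The sex difference is zero when participants are not angry and 40 units when they are angry, while the anger effect is present for both sexes but twice as large for men, which is the pattern described in the interpretation above.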
- Example 3: Both main
effects are significant, but again they are "nonadditive," because
there is also a significant "Crossover" A x B Interaction Effect:
Figure 10-10. Nonadditive Main Effects & a "Crossover" A x B Interaction Effect

| Provocation (Factor B) | Sex (Factor A): Females (A1) | Sex (Factor A): Males (A2) | Marginal Mean |
|---|---|---|---|
| ~Angry (B1) | 60 (A1B1) | 0 (A2B1) | 30 |
| Angry (B2) | 40 (A1B2) | 140 (A2B2) | 90 |
| Marginal Mean | 50 | 70 | |
- Note that the main effects
are identical to those in the first two examples, and they are nonadditive
in the same way as shown in Example 2 (i.e., one cannot compute the other
three cell means by starting with "60" and adding the relevant
main effects).
- The above four cell means
are graphed in the figure shown below. Note here that not only are the
lines non-parallel, they actually cross over each other. As your
authors point out, crossover interactions are particularly compelling:
not only do the effects of Sex depend on the Anger level, but the sex
differences are in opposite directions when participants are angry
compared to when they are not angry.

- The interpretation of the
above interaction is as follows: The effects of anger depend on the
sex of participants. That is, while men were significantly more aggressive
when angry than when not angry, women were not more aggressive
when angry than when not angry. Thus, the effect of anger occurs only
for male participants (it disappears for female participants). Further,
sex differences depend on the anger condition. When angry, men were significantly
more aggressive than women. However, just the opposite occurred when participants
were not angry: here women were significantly more aggressive than were
men.
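One way to see what makes Example 3 a "crossover" is to check whether the sex difference changes sign between the two anger conditions. The Python sketch below does this for the hypothetical cell means of Examples 2 and 3; the function names are made up for illustration, and the check looks only at the direction of the differences in the means, not at their significance.

```python
# Hypothetical cell means, keyed by (sex, provocation).
example_2 = {
    ("female", "not_angry"): 30, ("male", "not_angry"): 30,
    ("female", "angry"): 70, ("male", "angry"): 110,
}
example_3 = {
    ("female", "not_angry"): 60, ("male", "not_angry"): 0,
    ("female", "angry"): 40, ("male", "angry"): 140,
}


def sex_differences(cells):
    """Male minus female mean at each anger level."""
    return {
        anger: cells[("male", anger)] - cells[("female", anger)]
        for anger in ("not_angry", "angry")
    }


def is_crossover(cells):
    """True when the sex difference reverses direction across anger levels."""
    diffs = sex_differences(cells)
    return diffs["not_angry"] * diffs["angry"] < 0


print(sex_differences(example_2), is_crossover(example_2))
print(sex_differences(example_3), is_crossover(example_3))
```

In Example 2 the sex difference shrinks to zero but never reverses, so it is an interaction without a crossover; in Example 3 the difference reverses direction.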
- Example 4: Neither
main effect is significant (so obviously, this example differs from
the first three), but there is a significant "Classic" crossover
A x B interaction effect:
Figure 10-10. No Main Effects & a "Classic" Crossover A x B Interaction Effect

| Provocation (Factor B) | Sex (Factor A): Females (A1) | Sex (Factor A): Males (A2) | Marginal Mean |
|---|---|---|---|
| ~Angry (B1) | 20 (A1B1) | 60 (A2B1) | 40 |
| Angry (B2) | 60 (A1B2) | 20 (A2B2) | 40 |
| Marginal Mean | 40 | 40 | |
- This is a "classic"
crossover interaction, because the "effects" of each factor
"cancel each other out," thereby making the main effects nonsignificant.
This type of interaction is the most compelling of all, and really shows
the value of complex factorial experimental designs. Without the factorial
combination of A and B, one might erroneously conclude that there are
no sex differences in aggression and that anger level has no effect on
aggression.
- Of course, as the figure below
shows, there are significant sex differences, but the direction
of the difference depends on the anger condition (indeed, here also they
are opposite differences at the two levels of anger, statistically cancelling
each other out in the computation of the main effect). Also, there are
significant anger differences, but here also, the effects of anger depend
on the sex of the participant (again, they are opposite at the two levels
of sex, statistically cancelling each other out in the computation of
the main effect).

- The interpretation of the
above interaction is as follows: The direction of the sex differences
depended on the anger level. When angry, women were significantly more
aggressive than men. However, when not angry, just the opposite occurred:
men were significantly more aggressive than women. Similarly, the effects
of anger depended on the sex of the participant. Women were significantly
more aggressive when angry than when not angry. However, men showed the
opposite effect of anger. Men were significantly more aggressive when
not angry than when they were angry (of course, these results are hypothetical,
so this last difference doesn't make a lot of sense in this example!).
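The "cancelling out" in Example 4 is easy to verify by hand, but a short Python sketch may help. It uses the hypothetical cell means from the table, and the variable names are my own; it shows that the marginal means are identical even though the sex difference is large (and opposite) within each anger condition.

```python
from statistics import mean

# Example 4 cell means, keyed by (sex, provocation).
cells = {
    ("female", "not_angry"): 20, ("male", "not_angry"): 60,
    ("female", "angry"): 60, ("male", "angry"): 20,
}
sexes = ("female", "male")
anger_levels = ("not_angry", "angry")

# Marginal means: average each factor's levels over the other factor.
sex_marginals = {s: mean(cells[(s, a)] for a in anger_levels) for s in sexes}
anger_marginals = {a: mean(cells[(s, a)] for s in sexes) for a in anger_levels}

# Sex difference (male minus female) within each anger condition.
sex_diff_by_anger = {
    a: cells[("male", a)] - cells[("female", a)] for a in anger_levels
}

print(sex_marginals)       # identical marginals, so no main effect of sex
print(anger_marginals)     # identical marginals, so no main effect of anger
print(sex_diff_by_anger)   # equal and opposite differences that average to zero
```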
- Adding Treatment Levels: A 3 x 3 Factorial Design
- This design is just like the
2 x 2 Design discussed above, except that now we have three levels of
each factor. An example of the nine cells in a 3 x 3 design is shown below.
Assume a total of 90 participants (n = 10 participants per cell),
that Factor A is "Number of Roommates," Factor B is "Size
of Room," and the DV is satisfaction with residence hall life.
| Room Size (Factor B) | Number of Roomies (Factor A): One (A1) | Number of Roomies (Factor A): Two (A2) | Number of Roomies (Factor A): Three (A3) | |
|---|---|---|---|---|
| Small (B1) | (A1B1) | (A2B1) | (A3B1) | B1 Marginal Mean |
| Medium (B2) | (A1B2) | (A2B2) | (A3B2) | B2 Marginal Mean |
| Large (B3) | (A1B3) | (A2B3) | (A3B3) | B3 Marginal Mean |
| | A1 Marginal Mean | A2 Marginal Mean | A3 Marginal Mean | |
- The source table for the 3 x
3 ANOVA would look the same as in the 2 x 2, except that the dfs would change:
| Source | df | SS | MS | F | p |
|---|---|---|---|---|---|
| Between-Groups | | | | | |
| -Factor A Main Effect | 2 | | | Fob | |
| -Factor B Main Effect | 2 | | | Fob | |
| -A x B Interaction | 4 | | | Fob | |
| Within-Groups | | | | | |
| -Error (Residual) | 81 | | | | |
| Total | 89 | | | | |
- Also note that, were we to obtain
significant main effects for each of these factors, we would have to conduct
comparisons among the three marginal means to determine which ones are significantly
different (i.e., we would use subscripts, just like we use for comparing
cell means for interaction effects).
- Adding More Factors: A 2 x 2 x 2 Factorial Design
| | High Self Esteem: High Competence | High Self Esteem: Low Competence | Low Self Esteem: High Competence | Low Self Esteem: Low Competence |
|---|---|---|---|---|
| Males | 29.9 | 31.1 | 27.4 | 44.7 |
| Females | 22.7 | 48.7 | 35.0 | 41.5 |
Main Effects Marginal Means--These are computed by "collapsing" across the other two factors:
- Self Esteem Marginal Means (p > .05):
High = (29.9 + 22.7 + 31.1 + 48.7) / 4 = 132.4 / 4 = 33.10
Low = (27.4 + 35.0 + 44.7 + 41.5) / 4 = 148.6 / 4 = 37.15
- Sex of Participant Marginal Means (p > .05):
Males = (29.9 + 31.1 + 27.4 + 44.7) / 4 = 133.1 / 4 = 33.28
Females = (22.7 + 48.7 + 35.0 + 41.5) / 4 = 147.9 / 4 = 36.98
- Perceived Competence Level Marginal Means (p < .05):
Low = (31.1 + 48.7 + 44.7 + 41.5) / 4 = 166.0 / 4 = 41.50
High = (29.9 + 22.7 + 27.4 + 35.0) / 4 = 115.0 / 4 = 28.75
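The "collapsing" described above can be sketched in Python as follows. This is just an illustration with the eight cell means from the 2 x 2 x 2 table; the key layout (self esteem, competence, sex) and the helper name `marginal` are assumptions of mine.

```python
from statistics import mean

# Cell means keyed by (self_esteem, competence, sex).
cells = {
    ("high", "high", "male"): 29.9, ("high", "low", "male"): 31.1,
    ("low", "high", "male"): 27.4, ("low", "low", "male"): 44.7,
    ("high", "high", "female"): 22.7, ("high", "low", "female"): 48.7,
    ("low", "high", "female"): 35.0, ("low", "low", "female"): 41.5,
}


def marginal(position, level):
    """Average the four cells that share `level` on the factor at `position`,
    collapsing across the other two factors."""
    return mean(v for k, v in cells.items() if k[position] == level)


factors = {
    "Self Esteem": (0, ("high", "low")),
    "Competence": (1, ("high", "low")),
    "Sex": (2, ("male", "female")),
}
for name, (position, levels) in factors.items():
    for level in levels:
        print(f"{name} = {level}: {marginal(position, level):.2f}")
```

Each marginal mean averages four cells, which is why every sum in the hand calculations above is divided by 4.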
Self Esteem x Competence x Sex Three-way Interaction (p < .01):

- Control in Between-Subjects Factorial Designs
- Random Assignment to Conditions
- Matching on Participant Variables
Both of the above control methods
maintain equivalence between groups. Both were discussed in Chapter 6, so I
won't repeat that material.
COMPLEX WITHIN-SUBJECTS DESIGNS
- A Complex Within-Subjects Experiment: A 2 x 2 Within-Groups Design
- Treatment x Treatment x Participant Designs:
- Both Factors manipulated within-groups
- All participants receive all levels of both IV's
| Source | df | SS | MS | F | p |
|---|---|---|---|---|---|
| Factor A Main Effect | | | | Fob | |
| Error A | | | | | |
| Factor B Main Effect | | | | Fob | |
| Error B | | | | | |
| A x B Interaction | | | | Fob | |
| Error A, B | | | | | |
| Total | | | | | |
- Control in Complex Within-Subjects Designs
- These control methods
were also discussed in Chapter 6 as a means of avoiding carry-over effects.
- Counterbalancing and Latin Squares
are used to distribute order effects evenly across conditions.
- Block Randomization: Test conditions
in a random order; complete all conditions, then use a new random order.
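Both ordering methods above can be sketched in a few lines of Python. This is a minimal illustration, not a procedure from your textbook: the function names `block_randomize` and `cyclic_latin_square` and the condition labels are assumptions of mine. Note that the simple cyclic square used here puts each condition in each ordinal position exactly once, but it does not balance which condition immediately precedes which.

```python
import random


def block_randomize(conditions, n_blocks, seed=None):
    """Return n_blocks orderings; each block contains every condition once,
    in a freshly shuffled order."""
    rng = random.Random(seed)  # a seed makes the schedule reproducible
    schedule = []
    for _ in range(n_blocks):
        block = list(conditions)
        rng.shuffle(block)
        schedule.append(block)
    return schedule


def cyclic_latin_square(conditions):
    """Each row is one presentation order; every condition appears once
    in each row and once in each ordinal position (column)."""
    k = len(conditions)
    return [[conditions[(row + col) % k] for col in range(k)]
            for row in range(k)]


# The four cells of a 2 x 2 within-groups design.
conditions = ["A1B1", "A1B2", "A2B1", "A2B2"]

schedule = block_randomize(conditions, n_blocks=3, seed=1)
for block in schedule:
    print(block)

square = cyclic_latin_square(conditions)
for order in square:
    print(order)
```

With four conditions, the Latin square gives four presentation orders, so participants would be assigned to orders in multiples of four.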
MIXED DESIGNS
- Example of a Mixed Design: A 2 x 2 Mixed Model Design
| Source | df | SS | MS | F | p |
|---|---|---|---|---|---|
| Factor A Main Effect | | | | Fob | |
| Error A | | | | | |
| Factor B Main Effect | | | | Fob | |
| Error B | | | | | |
| A x B Interaction | | | | Fob | |
| Error B | | | | | |
| Total | | | | | |