== The Composition Effect - What is the Source of Difficulty in Problems which Require Application of Several Skills?  ==
Ido Roll, Yvonne Kao, Kenneth E. Koedinger
 
=== Abstract ===
 
 
This study found that the presence of distracters creates
significant difficulty for students solving geometry area
problems, but that practice on composite area problems
improves students’ ability to ignore distracters.  In addition,
this study found some support for the hypothesis that the
increased spatial processing demands of a complex diagram
can negatively impact performance and could be a source of a
composition effect in geometry.
 
=== Glossary ===
 
- Composite problems: Problems which require the application of several skills, such as solving 3x+6=0 for x.
 
- Single-step problems: Problems which require the application of a single skill, such as y+6=0 or 3x=-6
 
- DFA (Difficulty Factor Analysis): A test that includes pairs of items that vary along only one dimension. It allows one to evaluate the difficulty contributed by each individual dimension along which the problems differ.
 
- The Composition Effect: The effect according to which composite problems are harder than a set of single-step problems using the same skills.
 
 
=== Research question ===
 
What is the main source of difficulty in composite problems?
 
 
=== Background and Significance ===
 
Although much work has been done to improve students’
math achievement in the United States, geometry
achievement appears to be stagnant.  While the 2003 TIMSS
found significant gains in U.S. eighth-graders’ algebra
performance between 1999 and 2003, it did not find a
significant improvement on geometry items between 1999
and 2003 (Gonzales et al., 2004).  Furthermore, of the five
mathematics content areas assessed by TIMSS, geometry
was the weakest for U.S. eighth-graders (Mullis, Martin,
Gonzales, & Chrostowski, 2004).  While students have
often demonstrated reasonable skill in “basic, one-step
problems,” (Wilson & Blank, 1999, p. 41) they often
struggle with extended, multi-step problems in which they
have to construct a free response, rather than selecting a
multiple-choice item.  Thus it is our goal to examine the
sources of difficulty in multi-step geometry problems and to
determine how to address these difficulties during
instruction.

Heffernan and Koedinger (1997) found a composition
effect in multi-step algebraic story problems—the
probability of correctly completing the multi-step problem
was less than the product of the probability of correctly
completing each of the subproblems, P(Composite) <
P(Subproblem A) × P(Subproblem B).  They suggested that
this difference could be due to an exponential increase in the
number of possible problem-solving actions as a problem
became more complex, or it could be due to a missing or
over-specialized knowledge component, such as students
not understanding that whole subexpressions could be
manipulated like single numbers or variables.  Our research
questions are: is there a composition effect in multi-step
geometry area problems, e.g., a problem in which the
student must subtract the area of an inner shape from the
area of an outer shape to find the area of a shaded region,
and if so, what might be the source of the composition
effect?  Would it be a missing or over-specialized
knowledge component, as concluded by Heffernan and
Koedinger, or would it be a combinatorial search?

In order to answer these questions, we first needed to
assess the difficulty of a single-step area problem. 
Koedinger and Cross (2000) found that the presence of
distracter numbers on parallelogram problems—the length
of a side was given in addition to the lengths of the height
and base—significantly increased the difficulty of the
problems due to students’ shallow application of area
knowledge.  In particular, students seemed to have over-
generalized a procedure for finding the area of rectangles—
multiplying the lengths of adjacent sides—to
parallelograms.  In addition, Koedinger and Cross
conjectured that non-standard orientations for shapes—non-
horizontal bases and non-vertical heights—would also
expose students’ shallow knowledge.  Given that a multi-
step area problem inherently contains distracters and often
features shapes that are rotated from their standard
orientations, it will be important for us to follow Koedinger
and Cross’s lead and get a baseline measure of how
distracters and orientation affect performance on single-step
area problems.  Then we will study how combining single-
step area problems into a typical, “find the shaded area”
composite area problem affects performance.  In these types
of problems, students are required to perform three steps:
calculate the area of the outer shape, calculate the area of
the inner shape, and then subtract the values of the two
areas.

We believe that we will find a composition effect in these
types of geometry area problems.  One possible source of
the effect is the additional spatial-processing demands
placed by a more complex diagram.  Koedinger and
Anderson (1990) found that a hallmark of geometry
expertise was the ability to parse a complex diagram into
perceptual chunks that could be used to guide a search of
problem-solving schemas.  Geometry novices most likely
are not able to parse complex diagrams into meaningful
perceptual chunks quickly or efficiently and thus increasing
diagram complexity could result in increased problem
difficulty.  This explanation would be more consistent with
the combinatorial search explanation for the composition
effect than the missing-skill explanation favored by
Heffernan and Koedinger.  This conjecture leads to an
interesting prediction: in contrast to the composition effect
found by Heffernan and Koedinger, the probability of
correctly completing a composite problem should be greater
than the product of the probability of correctly completing
each of its three subproblems.  This is because in
completing a single composite problem, the act of parsing
the complex diagram need only be performed once whereas
it needs to be performed at least twice when completing the
three subproblems separately.  This prediction has two
corollaries: performance on the Outer Shape subproblem
should be lower than performance on a mathematically
equivalent problem using a simple diagram, and that the
probability of correctly completing a composite problem
should be equal to the product of the probabilities of
correctly completing the Subtraction subproblem, the Inner
Shape subproblem, and a simple-diagram equivalent of the
Outer Shape subproblem.
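Stated compactly, the two accounts sketched above make opposite predictions about composite performance:

* Composition effect (Heffernan & Koedinger): <math>P(\mathrm{Composite}) < P(A) \times P(B)</math>
* Spatial-parsing account (this study): <math>P(\mathrm{Composite}) > P(\mathrm{Outer}) \times P(\mathrm{Inner}) \times P(\mathrm{Subtract})</math>, while <math>P(\mathrm{Composite}) \approx P(\mathrm{SimpleOuterEq}) \times P(\mathrm{Inner}) \times P(\mathrm{Subtract})</math>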
 
=== Independent Variables ===
 
Instruction in the form of a solved example, targeting a common misconception: identifying the base and height in a cluttered diagram.
 
=== Dependent variables ===
 
Three tests are used in the study:
- Pre-test: given before all instruction
- Mid-test: given after students have learned single-step problems and before composite problems
- Post-test: given after students have learned and practiced all material.
 
The tests include the following items, some of which are [[transfer]] items that evaluate robust learning, since they require an adaptive application of the knowledge learned and practiced in class.
 
* Simple diagram:
*# no distractors, canonical orientation
*# distractors,    canonical orientation
*# no distractors, tilted orientation
*# distractors,    tilted orientation
* Complex diagram:
*# Given complex diagram, ask for skill A
*# Given complex diagram, ask for skill B
*# Given steps A and B,  ask for skill C (which requires A and B)
*# Given complex diagram, ask for C (which requires A and B)
 
=== Hypothesis ===
 
# Adding distracters to a basic area problem and rotating the figure from its standard orientation will make the problem more difficult.
# We will find a composition effect in area problems, in that the probability of correctly completing a composite problem is not equal to the product of the probabilities of correctly completing the three subskills: P(Composite) ≠ P(Outer) × P(Inner) × P(Subtract).
# P(Composite) > P(Outer) × P(Inner) × P(Subtract), P(Outer) < P(Simple Outer Eq.), and P(Composite) = P(Simple Outer Eq.) × P(Inner) × P(Subtract), due to the demands of spatially parsing the diagram (see the sketch below).
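A minimal sketch of how these relations can be checked, using hypothetical per-skill success probabilities (the numbers below are illustrative only, not data from this study):

<syntaxhighlight lang="python">
# Hypothetical illustration of hypotheses 2 and 3 (numbers are not study data).
p_outer, p_inner, p_subtract = 0.60, 0.80, 0.90   # single-step subskill success rates
p_simple_outer_eq = 0.70                          # simple-diagram equivalent of Outer
p_composite = 0.55                                # observed composite success rate

product = p_outer * p_inner * p_subtract                   # 0.432
product_simple = p_simple_outer_eq * p_inner * p_subtract  # 0.504

# Hypothesis 2: a composition effect exists if composite performance differs
# from the product of the subskill probabilities.
print("composition effect:", p_composite != product)

# Hypothesis 3 (spatial parsing): composite exceeds the subskill product but is
# close to the product that uses the simple-diagram equivalent of Outer.
print("composite > product:", p_composite > product)
print("composite close to simple product:", abs(p_composite - product_simple) < 0.1)
</syntaxhighlight>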
 
=== Findings ===
 
 
An alpha value of .05 was used for all statistical tests.
==== Comparison of Mid-test and Post-test Performance ====
Scores on the pretest were at floor, ranging from 0 to 50% correct (M = 14.94%, SD = 13.61%).  Pre-test scores did not correlate significantly with either mid-test or post-test scores, so we did not analyze the pretest further.
Performance on the mid-test and the post-test was significantly correlated (r² = .239, p < .001).  A paired t-test found significant gains in overall performance from mid-test to post-test, t(65) = 3.115, p = .003, 2-tailed, with highly significant gains on Simple problems, t(65) = 3.104, p = .003, 2-tailed, and significant gains on Complex problems, t(65) = 2.308, p = .024, 2-tailed.  Participants performed better on the Simple problems than on the Complex problems.  This difference was significant at mid-test, t(65) = 2.214, p = .030, 2-tailed, and at post-test, t(65) = 2.355, p = .022, 2-tailed.  These results are presented in Table 1.
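A sketch of the paired comparison used here, assuming each student's percent-correct scores are available as arrays; the arrays below are placeholders, and the use of scipy is an assumption for illustration (the paper does not name its statistics software):

<syntaxhighlight lang="python">
# Sketch of the paired t-test for mid-test vs. post-test gains.
# The score arrays are placeholders; the study had n = 66 students.
import numpy as np
from scipy import stats

mid_test  = np.array([50.0, 62.5, 75.0, 37.5, 87.5, 62.5])   # percent correct per student
post_test = np.array([62.5, 75.0, 75.0, 50.0, 100.0, 62.5])

t, p = stats.ttest_rel(post_test, mid_test)  # paired, 2-tailed by default
print(f"t({len(mid_test) - 1}) = {t:.3f}, p = {p:.3f}")
</syntaxhighlight>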
==== Effects of Distracters, Orientation, and Shape on Simple Problem Performance ====
We used a binary logistic regression analysis to predict the
probability that students would answer a Simple problem
correctly on the mid-test and the post-test.  Distracters,
Rotation, and three dummy variables coding diagram
shape—parallelogram, pentagon, trapezoid, or triangle—
were entered into the equation as predictors.
{| class="wikitable"
|+ Table 1: Mean performance on mid-test and post-test by diagram type (*p < .05, **p < .01)
! rowspan="2" | Type
! colspan="2" | Mid-test (%)
! colspan="2" | Post-test (%)
! colspan="2" | Gain (%)
|-
! M !! SD !! M !! SD !! M !! SD
|-
| Overall || 65.34 || 27.79 || 75.41 || 23.66 || 10.07** || 26.26
|-
| Simple || 68.56 || 30.79 || 79.17 || 23.03 || 10.61** || 27.76
|-
| Complex || 61.74 || 30.13 || 71.21 || 31.08 || 9.47* || 33.33
|-
| Simple − Complex || 6.82* || 25.03 || 7.96* || 27.44 || ||
|}
At mid-test, the full model differed significantly from a model with intercept only, χ²(5, N = 264) = 17.884, Nagelkerke R² = .092, p = .003.  Overall, this model had a success rate of 69.3%, correctly classifying 90.1% of correct responses and 24.1% of incorrect responses.  The presence of distracters was a significant predictor of problem accuracy at mid-test: students were only .541 times as likely to respond correctly when the problem contained distracters.  Shape was also a significant predictor: students were only .484 times as likely to respond correctly when the problem involved a pentagon rather than a triangle.  Rotation was not a significant predictor of problem accuracy.

At post-test, the full model differed significantly from a model with intercept only, χ²(5, N = 264) = 15.533, Nagelkerke R² = .089, p = .008.  Overall, this model had a success rate of 79.2%, correctly classifying 100% of the correct responses but 0% of incorrect responses.  Distracters were no longer a significant predictor at post-test, and Rotation remained a non-significant predictor.  Shape remained a significant predictor of problem accuracy: students were only .335 times as likely to respond correctly when the problem involved a pentagon rather than a triangle.
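A sketch of the kind of binary logistic regression described above. The data frame, column names, and the use of the statsmodels formula interface are assumptions for illustration; the paper does not specify its software. Shape is dummy-coded with triangle as the reference category, and exponentiated coefficients give odds ratios like those reported in the text.

<syntaxhighlight lang="python">
# Illustrative logistic regression for Simple-problem accuracy (not the authors' code).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Hypothetical item-level data: one row per student-item response.
df = pd.DataFrame({
    "correct":     rng.binomial(1, 0.7, size=264),
    "distracters": rng.binomial(1, 0.5, size=264),
    "rotation":    rng.binomial(1, 0.5, size=264),
    "shape":       rng.choice(["triangle", "parallelogram", "pentagon", "trapezoid"], size=264),
})

# Dummy-code shape with triangle as the reference category.
model = smf.logit(
    "correct ~ distracters + rotation + C(shape, Treatment(reference='triangle'))",
    data=df,
).fit()

print(model.summary())
print(np.exp(model.params))  # odds ratios; the paper reports e.g. .541 for distracters at mid-test
</syntaxhighlight>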
==== Effects of Skill Composition and Shape on Complex Problem Performance ====
We used a binary logistic regression analysis to predict the probability that students would answer a Complex problem correctly on the mid-test and the post-test.  Indicators for whether the problem required an Outer calculation, an Inner calculation, or Subtraction were entered into the equation as predictors.  As before, three dummy variables coding diagram shape were entered as well.  At mid-test, the full model differed significantly from a model with intercept only, χ²(6, N = 264) = 12.862, Nagelkerke R² = .065, p = .045.  Overall, this model had a success rate of 64.0%, correctly classifying 85.9% of correct responses and 28.7% of incorrect responses.  At post-test, the full model also differed significantly from a model with intercept only, χ²(6, N = 264) = 25.019, Nagelkerke R² = .129, p < .001.  Overall, this model had a success rate of 70.5%, correctly classifying 95.7% of correct responses and 7.9% of incorrect responses.  In both models, the Outer calculation was the only significant predictor of a correct response.  When the problem required an Outer calculation—in the Outer and Composite conditions—students were only .439 and .258 times as likely to respond correctly at mid-test and post-test, respectively.
==== Predictors of Performance on Composite Problems ====
We used a binary logistic regression to predict the probability that a student would answer a Composite Complex problem correctly on the mid-test and the post-test, given his/her success on Inner, Outer, and Subtract; the Distracters+Rotation (DR) Simple problem that is mathematically equivalent to Outer; and the shape of the diagram—parallelogram, pentagon, trapezoid, or triangle—coded using three dummy variables.  This model differed significantly from a model with intercept only for both mid-test, χ²(7, N = 66) = 26.567, Nagelkerke R² = .442, p < .001, and post-test, χ²(7, N = 66) = 30.466, Nagelkerke R² = .495, p < .001.  The mid-test model had an overall success rate of 75.8%, correctly classifying 75.0% of correct responses and 76.5% of incorrect responses.  The post-test model had an overall success rate of 77.3%, correctly classifying 83.8% of correct responses and 69.0% of incorrect responses.
The significant predictor variables in the model changed
from mid-test to post-test.  At mid-test, Subtract and DR
success were significant predictors of a correct response on
Composite problems.  Students who answered Subtract
correctly were 9.168 times more likely to answer Composite
correctly.  Students who answered DR correctly were 5.891
times more likely to answer Composite correctly.  At post-
test, Subtract and DR success remained significant
predictors with odds ratios of 20.532 and 9.277,
respectively.  In addition, Outer success became a
significant predictor.  Students who answered Outer
correctly were 6.366 times more likely to answer Composite
correctly.  Shape was not a significant predictor at either
mid-test or post-test. These results are presented in Table 4.
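The figures above are odds ratios, not probability ratios. A small sketch of how an odds ratio translates into a change in predicted probability; the 30% baseline is an assumption for illustration, not a value reported in the study:

<syntaxhighlight lang="python">
# Converting an odds ratio into a change in predicted probability.
# The 30% baseline is assumed for illustration; it is not reported in the paper.
def apply_odds_ratio(p_baseline, odds_ratio):
    odds = p_baseline / (1 - p_baseline)
    new_odds = odds * odds_ratio
    return new_odds / (1 + new_odds)

print(round(apply_odds_ratio(0.30, 9.168), 2))   # ~0.80: correct Subtract at mid-test
print(round(apply_odds_ratio(0.30, 20.532), 2))  # ~0.90: correct Subtract at post-test
</syntaxhighlight>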
==== Assessing the Difficulty of Spatial Parsing ====
We took Accuracy(Outer) × Accuracy(Inner) × Accuracy(Subtract) and compared this to Accuracy(Composite) for each student using a paired, 2-tailed t-test.  This difference was significant at mid-test, t(65) = 2.193, p = .032.  Students performed better on Composite (M = 48.48%, SD = 50.36%) than was predicted by the product (M = 33.33%, SD = 47.52%).  This difference was no longer significant at post-test.  In contrast, Accuracy(DR) × Accuracy(Inner) × Accuracy(Subtract) did not differ significantly from Accuracy(Composite) at either mid-test or post-test.  Paired, 2-tailed t-tests found that performance did not differ significantly between Outer and DR at either mid-test, t(65) = -.851, p = .398, or post-test, t(65) = -1.356, p = .180.
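A sketch of the comparison just described: for each student, the product of the three subproblem accuracies is compared to Composite accuracy with a paired, 2-tailed t-test. The arrays are placeholders for the per-student 0/1 accuracy scores, not study data.

<syntaxhighlight lang="python">
# Sketch of the Accuracy(Outer) x Accuracy(Inner) x Accuracy(Subtract) vs.
# Accuracy(Composite) comparison. Arrays are placeholders, not study data.
import numpy as np
from scipy import stats

outer     = np.array([1, 0, 1, 1, 0, 1])
inner     = np.array([1, 1, 1, 0, 1, 1])
subtract  = np.array([1, 1, 0, 1, 1, 1])
composite = np.array([1, 0, 1, 1, 1, 1])

predicted = outer * inner * subtract          # per-student product of subskill accuracies
t, p = stats.ttest_rel(composite, predicted)  # paired, 2-tailed
print(f"t({len(composite) - 1}) = {t:.3f}, p = {p:.3f}")
</syntaxhighlight>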
=== Explanation ===
 
 
We will return to our original hypotheses to begin the discussion.  It was clear that distracters had a negative impact on Simple performance at mid-test, although this effect had largely disappeared by post-test.  Although we did not find significant effects of Rotation on Simple performance, we did find evidence that many students simply rotated the paper until they were viewing the figure in a standard orientation, effectively negating our Rotation manipulation.  Thus, our first hypothesis is partially supported.

We did find a composition effect in area problems, and the probability of success on Composite problems could not be predicted by simply multiplying the probabilities of success for the three subproblems.  Thus, our second hypothesis is supported.

We only found partial support for our hypothesis that the source of the composition effect is the diagram-parsing load.  We did find that the probability of success on Composite problems was greater than the product of probabilities for the three subproblems, but only at mid-test.  In addition, there were no significant differences between Outer and DR at either mid-test or post-test, although we feel it is worth noting that the data are trending in the predicted direction, with Outer being more difficult than DR on both mid-test and post-test.  Finally, our P(Composite) = P(Simple Outer Eq.) × P(Inner) × P(Subtract) model did a good job of predicting actual performance on Composite problems.

To conclude our discussion, we would like to address the differences between performance at mid-test and performance at post-test and the implications for designing instructional interventions.  First, we would like to note that instruction in the Area Composition unit of the Geometry Cognitive Tutor was able to improve performance on all skills, not just skills new to composite problems.  This suggests that students may not have fully mastered the single-step area skills prior to beginning Area Composition, but that Area Composition continues to provide practice on these skills.  Furthermore, the single-step skill practice in the Area Composition unit seems particularly effective at removing the effects of distracters on performance.  This makes a great deal of intuitive sense if one considers that composite area problems inherently contain distracters.

Second, although we did not find strong support for our contention that spatial parsing is difficult for students, we feel that training students to quickly identify important perceptual chunks can still have a positive impact on performance.  If, for example, students were trained to look for a “T” or “L” structure and map the segments onto the base and height of a shape, students might be less prone to developing shallow knowledge about area.
 
 
=== Descendents ===
 
 
 
=== Annotated bibliography ===
 
* Bransford, J. D., Brown, A. L., & Cocking, R. R. (Eds.) (2000). How People Learn: Brain, Mind, Experience, and School. Washington, DC: National Academy Press.
* Gonzales, P., Guzman, J. C., Partelow, L., Pahlke, E., Jocelyn, L., Kastberg, D., & Williams, T. (2004). Highlights from the Trends in International Mathematics and Science Study: TIMSS 2003. Washington, DC: National Center for Education Statistics.
* Heffernan, N. T., & Koedinger, K. R. (1997). The composition effect in symbolizing: The role of symbol production vs. text comprehension. In Proceedings of the Nineteenth Annual Conference of the Cognitive Science Society, 307-312. Hillsdale, NJ: Erlbaum.
* Koedinger, K. R., & Anderson, J. R. (1990). Abstract planning and perceptual chunks: Elements of expertise in geometry. Cognitive Science, 14, 511-550.
* Koedinger, K. R., Anderson, J. R., Hadley, W. H., & Mark, M. (1997). Intelligent tutoring goes to school in the big city. International Journal of Artificial Intelligence in Education, 8, 30-43.
* Koedinger, K. R., & Cross, K. (2000). Making informed decisions in educational technology design: Toward meta-cognitive support in a cognitive tutor for geometry. Paper presented at the Annual Meeting of the American Educational Research Association (AERA), New Orleans, LA.
* Mullis, I. V. S., Martin, M. O., Gonzalez, E. J., & Chrostowski, S. J. (2004). Findings from IEA’s Trends in International Mathematics and Science Study at the Fourth and Eighth Grades. Chestnut Hill, MA: TIMSS & PIRLS International Study Center, Boston College.
* Owen, E., & Sweller, J. (1985). What do students learn while solving mathematics problems? Journal of Educational Psychology, 77, 272-284.
* Simon, H. A., & Lea, G. (1974). Problem solving and rule induction: A unified view. In L. W. Gregg (Ed.), Knowledge and Cognition. Hillsdale, NJ: Erlbaum.
* Wilson, L. D., & Blank, R. K. (1999). Improving Mathematics Education Using Results from NAEP and TIMSS. Washington, DC: Council of Chief State School Officers, State Education Assessment Center.
 
 
[[Category:Empirical Study]]
